Houston, we (may) have a problem ...


I hear from many of my Houston area friends how excited they are that lockdowns have been lifted, and they can return to some sense of normalcy -- people can get their hair done, visit their favorite restaurants (and Houston has some great ones we really miss!), and most importantly many people who have been out of work can go back to work. I know a lot of people here in Pennsylvania who are a bit jealous and wish our lockdowns would be lifted.

If you have followed my blog, you know my viewpoint: total lockdowns/stay-at-home orders were the right strategy early on, before we knew much and as the disease started spreading out of control from New York. But as we have learned more, I have come to believe that they were overkill -- a blunt instrument -- unnecessarily strict and causing great collateral damage to other elements of society: the economy, education, psychosocial well-being, and even public health, to the degree that some people were not seeking care for potentially serious issues or not receiving preventative care because of fear. Based on what we've learned about this virus, I believe that we can mitigate its spread with more targeted steps, retaining the key aspects of the lockdown that are most important based on emerging empirical evidence and scientific papers, while relaxing others that might not be so important. I think this balanced approach is the best management solution we have as a nation right now.

As we look around the country, many places in which lockdowns have been lifted have not yet experienced large upticks in infections, and many haven't experienced any upticks yet. Given the lag between infection, testing, and confirmation, we need to keep watching the data, but this is encouraging. I have seen reports from some of these places that although the government has lifted lockdowns, people understand that the virus is a serious risk and are personally taking steps to reduce their risk of contracting or spreading it -- avoiding crowded settings, setting up social distancing guidelines at work, practicing good hygiene to prevent spread on solid surfaces, staying home when sick, and wearing masks in public settings. If these steps are taken, it may be possible to mitigate the spread and manage the disease without requiring lockdowns, thus decreasing the collateral damage they produce.

However, from what I am hearing, this is not so much the case in Texas, and in Houston in particular. I hear reports of more people going to crowded places and practically no mask wearing, and many seem to have a sense that the danger of the virus was contrived and manufactured by the media, and that it is in fact no more dangerous than a common flu or cold, or is only a matter of concern to older people or people with pre-existing conditions. While it clearly poses a greater risk of death for certain groups, it is still a nasty virus with characteristics that make it much worse than cold and flu -- it spreads at a faster rate, spreads from asymptomatic infected people much more and likely for longer periods of time, and can do severe damage all over a person's body if it takes root -- and this horrendous damage doesn't just occur in older people.

There is a research group in the PolicyLab at Children's Hospital of Philadelphia (CHOP), affiliated with the University of Pennsylvania, that has built a county-by-county model of COVID-19 incidence and spread. The model learns across counties how important factors like population density, temperature and humidity, social distancing, and demographics affect the spread rate of the virus. It also adjusts for the testing rate and for the high percentage of infections that are not counted as cases (about 90% of them by best estimates), and learns from county-specific time trends using a time series model. The model thus learns general principles across counties from all over the country, yet has the flexibility to capture county-specific trends. I have been contributing to this effort in the past few weeks, and I believe it is one of the most complete such efforts out there. The work is led by David Rubin, an epidemiologist at CHOP; the statistical modeling is led by my colleague Assistant Professor Jing Huang; and the effort includes an exceptional multidisciplinary team of collaborators meeting daily to improve the model.

The model put out new 4-week projections yesterday, and you can go through and check them out. It projects "R", the effective reproductive rate of the virus, which is the average number of people infected by each case (if R is below 1, incidence decreases; if it is greater than 1, incidence starts growing exponentially), and then derives case projections from this estimate. R0, the basic reproductive rate for this virus, can be thought of as the average number of people infected per case if spread is unconstrained. For SARS-CoV-2, R0 is estimated to be somewhere between 2.5 and 3.0. The effective R value is an estimate of the average spread given current practices, i.e. effective viral mitigation practices like social distancing or mask wearing reduce the effective R of the virus, and if R is kept low enough the viral spread could be suppressed and eventually eliminated.
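To make the R threshold concrete, here is a minimal Python sketch of how the number of new infections per generation changes under a fixed effective R. This is not the PolicyLab model, which is far richer. The function name and the starting count of 100 are hypothetical, chosen only for illustration, and the sketch assumes R stays constant and the pool of susceptible people is not depleted.

```python
def project_generations(initial_cases, r_effective, generations):
    """Return expected new infections in each successive generation.

    Illustrative assumptions (not from the PolicyLab model): every case
    infects r_effective others on average, R stays constant over time,
    and the susceptible population is never depleted.
    """
    if initial_cases < 0 or r_effective < 0 or generations < 0:
        raise ValueError("inputs must be non-negative")
    counts = [float(initial_cases)]
    for _ in range(generations):
        # Each generation is the previous one multiplied by R.
        counts.append(counts[-1] * r_effective)
    return counts


if __name__ == "__main__":
    # Hypothetical starting point of 100 cases, followed for 5 generations.
    for r in (0.8, 1.0, 2.0):
        trajectory = project_generations(100, r, 5)
        print(f"R = {r}: " + ", ".join(f"{c:.0f}" for c in trajectory))
```

With R below 1 each generation is smaller than the last, with R equal to 1 the count stays flat, and with R at 2 it doubles every generation, which is why a move from 1.5 toward 2.0 matters so much.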

Looking at the results, we see that for a vast majority of the counties in the USA, the virus is currently more or less under control, and even with lifting of lockdowns and decreased social distancing, there is not much projected growth in cases. There are a few hot spots around the country -- southeastern Florida, the Virginia area near DC, two cities in Alabama, and a few other isolated counties (one in Iowa, one in North Carolina), and Baton Rouge looks bad. The northeast, including Pennsylvania, New York and New Jersey, looks pretty well under control. However, looking through all the counties, what sticks out like a sore thumb are the projections in Texas, in particular in the Dallas and Houston metro areas. The model projects these two cities to become the national epicenters of the disease in the coming weeks. Let's look at the projections for Harris, Fort Bend, Montgomery, and Brazoria Counties:

Here is the estimate of the effective reproductive rate R for Harris County. See how it greatly decreased by early April and has steadily increased in the past month to about 1.5. The model projects it to reach 2.0 in the coming days and remain at that level, assuming current levels of social distancing, which are 39% below pre-covid levels, compared with about 70% below pre-covid levels in the heart of the lockdown. Note that with no interventions, we'd expect R to be 2.5 to 3.0, so current practice is not restraining the virus much at all. This R would be predicted to increase even further if, as expected, the effective social distancing levels continue to decrease.

Below are the projected cases. Given more than 250 current cases (and probably another 2000+ unmeasured infections, since it is likely that only about 10% of infections are officially counted as cases) and an R of around 2, exponential growth would kick in. This is by far the most dramatic projected increase in the USA. The model predicts little to no growth for most counties.
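The undercount arithmetic here is simple enough to sketch in a few lines of Python. The function names are hypothetical, and the sketch takes the 250 case count and the roughly 10% counted fraction from this post at face value; it is a back-of-envelope calculation, not part of the PolicyLab model.

```python
def estimate_total_infections(confirmed_cases, fraction_counted):
    """Scale confirmed cases up by the fraction of infections that get counted."""
    if not 0 < fraction_counted <= 1:
        raise ValueError("fraction_counted must be in (0, 1]")
    return confirmed_cases / fraction_counted


def implied_fraction_counted(confirmed_cases, total_infections):
    """Inverse calculation: what counted fraction would give this total?"""
    if total_infections <= 0:
        raise ValueError("total_infections must be positive")
    return confirmed_cases / total_infections


if __name__ == "__main__":
    confirmed = 250  # approximate active case count quoted in the post
    total = estimate_total_infections(confirmed, 0.10)
    print(f"Estimated total infections: {total:.0f}")
    print(f"Of which uncounted: {total - confirmed:.0f}")
    # A range of 2000 to 4000 infections corresponds to these counted fractions.
    for assumed_total in (2000, 4000):
        frac = implied_fraction_counted(confirmed, assumed_total)
        print(f"{assumed_total} infections -> {frac:.2%} counted")
```

If only about 1 in 10 infections is counted, 250 cases corresponds to roughly 2500 infections in total, which is where the "another 2000+" figure comes from.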

If you check on the website at Montgomery, Fort Bend and Brazoria counties, the counts are much lower, but the growth rates and R are similar.

Why does this model predict such an outbreak in Houston, and not other places that have opened up? There are several factors I can think of:

  1. People's behavior -- majorly relaxed social distancing.  Most places were at a 70% reduction from pre-covid levels during the lockdown – Harris County is now down to 39% as of Tuesday, Brazoria County to 19%, and Montgomery to 23%.  These are some of the most relaxed numbers in the country. These measures are based on cell phone data measuring travel to "non-essential businesses", but this was the variable the group found to be most strongly associated with the observed R levels. Part of this extra relaxing of social distancing may involve high-risk activities, like large indoor gatherings without masks, that have great potential for seeding the super-spread events that characterize this virus.

  2. There are a relatively large number of cases still active in the county – an estimated 250 or so cases last week – and given that about 90%+ of infections are not counted as cases, that means there are likely 2000-4000 infectious individuals in Harris County.  That is enough critical mass that with too much social interaction, especially risky types involving crowds, the virus could start spreading exponentially again.

  3. The testing in Texas is some of the worst per capita in the country – this also raises the spectre that there might be more cases than those being counted. Testing has improved, but the lack of testing in the past suggests there may be a higher proportion of cases that have not been counted. Also, I have heard reports that antibody tests are being conflated with viral tests in some counties -- this needs to be cleaned up to ensure the reported data are as accurate and complete as possible.

  4. If you look at the estimation of “R”, the time-varying effective reproduction rate of the virus in Harris County, you see it has steadily increased over the last month.  The model is an autoregressive model with county-specific random effect lags, so this trend tells the model that the spread is starting to increase, and it is this clear trend that is troubling. This part of the model is customized to the county-specific data, and may pick up on specific individual practices of residents producing increased spread -- this may be driven by the individual behavior of large numbers of people who are showing no real sense of caution.

Based on my understanding of the model, these are some of the major factors contributing to this projection.

Will this actually happen? I don't know. This model, like all models, is limited by the data put into it and its underlying assumptions. I can vouch that a smart, hard-working group of people has worked very hard to build a model that captures the factors the data tell us are important, and as a statistical data scientist I believe the overall modeling approach is sound and state-of-the-art.

One limitation is that the temperature effect cannot properly account for what will happen with 95-degree temperatures and high humidity, as will soon be regular in Texas, since these temperatures have not been seen in the USA during the covid crisis yet. It is possible that this extra temperature effect will overcome the factors above and the spread will not strongly increase. But it is also possible that it will make things worse, since studies have shown the virus is 19x more likely to spread indoors than outdoors, and as temperatures increase Houstonians will undoubtedly be spending more time inside in air conditioning, in cool and humid enclosed environs.

I encourage people to keep an eye on the local data on incidence, but I think the most reliable harbinger of an emerging surge is COVID-19-related hospitalizations. If there is indeed a surge brewing, we should see hospitalizations start to grow steeply. Hospitalizations lag infection by about 1-2 weeks, but assuming a relatively stable proportion of infections requires hospitalization over time and space, they are pretty reliable indicators and less dependent on variable testing practices than incidence counts. Look at the current COVID-19-related hospitalization trends for the TMC in Houston:

This looks pretty good right now, suggesting that there hasn't been much of a surge in infections as of 5/7-5/8. But given that R has sharply increased since then, there is a chance that hospitalizations will soon start growing, and if so, this could indicate a surge is happening. (BTW, my friend David Hong, MD from Houston told me on 5/21 that they have seen a sudden increase in the past 3 days in Houston, and another friend, John Stroh, MD, told me that he has seen an uptick in ICU cases.) It may be instructive to consider Dallas.

Let's look at Dallas County's plots. Here are the R and incidence plots:

You see that Dallas had an R value near 2.0 for a few weeks in late April and early May, and look at the corresponding increase in cases. This had leveled off for a few days before 5/14, but the model projects it to grow steeply. We see very similar curves in surrounding counties in the Dallas metro area.

Let's look at COVID-19 related hospitalizations in Dallas.

These have been growing steadily since late April -- doubling in the two weeks from 5/4 to 5/19 and growing 10-fold in a month (even after subtracting off discharged patients). This indicates the rising incidence seen above is real and is showing up in hospitals, and if it continues the medical system in Dallas could become strained.
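For readers who want to turn figures like these into a doubling time, here is a minimal Python sketch. It assumes constant exponential growth over each window, treats "a month" as 30 days, and takes the year to be 2020; the function names are hypothetical and the calculation is a back-of-envelope one, not output from the model.

```python
import math
from datetime import date


def doubling_time_days(growth_factor, elapsed_days):
    """Days needed to double, assuming constant exponential growth."""
    if growth_factor <= 1:
        raise ValueError("growth_factor must exceed 1")
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    return elapsed_days * math.log(2) / math.log(growth_factor)


def daily_growth_rate(growth_factor, elapsed_days):
    """Average day-over-day fractional growth implied by the overall factor."""
    if growth_factor <= 0 or elapsed_days <= 0:
        raise ValueError("inputs must be positive")
    return growth_factor ** (1.0 / elapsed_days) - 1.0


if __name__ == "__main__":
    # Window quoted in the post: 5/4 to 5/19 (year 2020 assumed).
    recent_days = (date(2020, 5, 19) - date(2020, 5, 4)).days
    print(f"Doubling over {recent_days} days: "
          f"{daily_growth_rate(2, recent_days):.1%} per day")
    # "10-fold in a month", with a month assumed to be 30 days.
    print(f"10-fold in 30 days: doubling every "
          f"{doubling_time_days(10, 30):.1f} days, "
          f"{daily_growth_rate(10, 30):.1%} per day")
```

Under these assumptions, a 10-fold rise in 30 days works out to a doubling roughly every 9 days on average, compared with the 15-day doubling over the most recent window.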

Again, it may not happen like this in Houston, but it needs to be watched carefully.  Houston is a bit behind Dallas in this trend but could be on the same road.

So what do I think people in Houston should do? I don't think broad stay-at-home lockdowns are the answer, and I am not saying everyone should shut themselves into their house and not go out at all. I recommend that individuals take reasonable precautions. Super-spread events have been the major problem with this crisis -- one or two of these can lead to a major outbreak in an area. So avoid big crowds, especially indoors, as the virus seems to spread less efficiently outdoors. Especially avoid indoor crowds where people are singing or talking loudly, yelling, screaming (or booing!), or taking deep breaths because of intense exercise -- these all have been shown to expel large volumes of respiratory and aerosol particles, which seem to be the key modes of viral spread. At work or in stores, practice basic social distancing to keep from being too close to too many people, try not to touch public surfaces that many people touch, and keep your hands clean. Wear masks.

Wearing masks is not a sign of weakness -- asymptomatic spread is one of the major problems with this outbreak -- if people wear masks, especially when in enclosed, crowded situations, it could significantly limit spread. From what has been learned, it appears respiratory and aerosol droplets are the primary mode of viral spread for SARS-CoV-2. The key in getting infected is exposure to a certain load of virus -- even homemade masks, properly worn, will block a percentage of virus and could make a difference. The primary benefit of the mask is not for you but for others, just in case you are infected but asymptomatic -- if everyone were to play along and do this, it would be very hard for the virus to spread.

With these basic steps, I think the viral spread can be significantly limited and outbreaks prevented, even without lockdowns. But if too many people are careless and don't show any caution, it increases the chance for super-spread events that would lead to a major outbreak and make Houston (and Dallas) the epicenter of the crisis -- which the model thinks could easily happen, and then there will be major lockdowns again. And not just in Texas -- in response, it is likely that lockdowns would again spread all over the country out of fear.

If people use their liberty to show a degree of caution, smartly putting into practice what we have learned to date, then this crisis can be managed without top-down government intervention. However, if people do not show enough caution, municipalities may see no other option but to impose external restrictions, and likely extreme ones.

Be safe and be smart! We love you guys!

©2020 by Covid Data Science.