Given this article is somewhat long and involved, I first summarize its key points:
A new SARS-CoV-2 variant has emerged and taken over the UK in the past two months.
This variant has numerous mutations that suggest it might be more transmissible.
Its rapid displacement of other variants in the UK suggests evolutionary advantages.
Numerous modeling efforts have suggested this variant results in ~50% increased transmission, and several of them carefully adjust for potential regional or temporal confounding effects, providing solid statistical evidence that this is indeed characteristic of the variant.
This level of increased transmissibility would be a game changer, leading to a much higher number of hospitalizations and deaths if no changes are made in societal behavior or mitigation strategies, and counterintuitively it is far more dangerous than a variant producing a 50% increase in lethality but no increased transmission.
On a positive note, there is no evidence that this variant leads to more serious disease, greater chance of reinfection, or a reduction of PCR test accuracy or vaccine efficacy.
Given that the variant has already been documented around the USA, we should prepare ourselves for the possibility that this variant will also come to dominate SARS-CoV-2 in this country, and its increased transmissibility requires more care in mitigation and more urgency in our vaccine distribution.
In the past few weeks, there have been alarming media reports about a new variant that has taken over the UK, leading to new strong restrictions in the UK and acceleration of the emergency approval and distribution of vaccines, motivating untested alteration of vaccination schedules, and producing talk that this variant may be transmitted 50-70% more efficiently than most existing variants. This is being talked about as a potential pandemic game-changer that could make things much worse. Is there cause for alarm? I was initially skeptical, but after evaluating the available data and reports, I believe there is.
In this post, I will summarize some of the key data from reports on this variant and use a statistical perspective to assess the level of evidence for the claims made by the media and UK officials, and discuss what this means for our pandemic response in 2021.
Background: what is this variant?
In November, researchers from the Covid-19 Genomics UK Consortium (COG-UK), a group doing systematic genomic sequencing of many individual UK cases of SARS-CoV-2, discovered a new genetic variant in a sample taken in late September 2020 from an infected patient in Kent, in southeastern England. This variant has been dubbed the B.1.1.7 lineage by cov-lineages and 20B/501Y.V1 by nextstrain.org, and Public Health England has designated it "Variant of Concern 202012/01", or VOC202012/01. In this post, I will simply refer to this variant as VOC for short.
In less than two months, this variant has become the predominant variant in the UK, leading to concern that it may be far more transmissible than previous variants and potentially not controllable by previously used mitigation strategies. This has prompted the UK government to impose stricter lockdowns, closing schools and universities, and also rapidly accelerate their vaccine approval and distribution process, approving the AstraZeneca/Oxford vaccine based on limited data that was deemed insufficient by regulators in the USA and EU and implementing vaccine distribution procedures such as cutting vaccine doses in half or extending time between first and second dose that have little to no support from existing scientific studies. Are these steps overreactions, or warranted precautionary steps?
Aggregating data and results from reports that have been publicly released to date, in this post I will try to answer this and other important questions about this variant, including:
What is unusual about this variant, and why is it so concerning?
Is it really inherently more transmissible or are they misinterpreting the data?
Does it lead to more serious disease and/or greater chance of reinfection?
Is it still detectable by current PCR tests?
Are current vaccines likely to be effective against it?
Why is increased transmission such a big deal?
What does this mean for what we should expect from the pandemic in the coming months?
Why is this variant so unusual?
Viruses are constantly mutating and producing new variants, and literally thousands of different genetic variants of SARS-CoV-2 have arisen during the pandemic so far. These have been revealed by cooperative efforts to sequence the genomes of many SARS-CoV-2 samples for research purposes, including COG-UK sequencing >130,000 cases to date and Nextstrain.org sequencing almost 4000, and to use phylogenetic analysis to build a tree that shows how the different variants relate to each other based on their time of emergence and similarity of genetic code.
Here is the nextstrain tree for European variants as of 1/5/2021, with each dot a sample and the tree showing relationships based on genetic similarity and time of sample collection. The VOC (called 20B/501Y.V1 in nextstrain) is in orange, which emerged in late fall 2020.
This branch really stands out from the rest of the data, and comprises a substantial proportion of sequenced samples since early December. Here is a plot representing genetic distance of individual cases from the initial sequenced case of SARS-CoV-2, for a set of VOC cases (lineage B.1.1.7) and the other cases closest to it on the phylogenetic tree (bottom):
It is clear that this variant is an outlier, and tends to have far more mutations than the rest of the cases. Typically, SARS-CoV-2 has a mutation rate of 1-2/month, so the 17 mutations in this variant make it unusual and suggest some accelerated evolution. It appears that this variant has arisen via some unusual evolutionary events, possibly within an immunodeficient long-hauler whose condition allowed unusual accumulation of mutations in their virus, which subsequently spread into the community.
Why does this variant raise concern?
It is perfectly natural for viruses to mutate and produce new variants over time, but new variants rarely raise any problems unless they lead to greater transmission, more severe disease, evade testing, or are resistant to existing treatments or vaccines. So why all the concern about this one?
One concern about this variant is the large number of mutations that occur in the spike protein, the defining protein in the SARS-CoV-2 virus that enables it to enter host human cells. Eight of the 17 mutations characterizing VOC occur in the spike protein, with three of them known mutations with plausible links to increased transmissibility:
N501Y mutation: this mutation is known to increase binding with the human ACE2 receptor, which suggests more efficient entry into human host cells.
69-70del: this mutation involves the loss of 2 specific amino acids in the spike protein, which changes the shape of the spike protein in a way that might lead to transmission advantages. By the way, this deletion causes one of the 3 main gene targets used in PCR tests to fail, an event called "Spike Gene Target Failure" (SGTF). This does NOT lead to false negative PCR tests since the 2 other main gene targets are still accurately detected, but it provides an interesting surrogate measure to estimate the prevalence of this variant based on PCR data without having to sequence the virus in places where most 69-70del cases are VOC.
P681H mutation: the precise functional effects of this mutation are not well characterized, but it changes a site at which the spike protein cleaves before entering human cells, so it is plausibly important to viral replication and spread.
These provide support for the notion that VOC might be more transmissible, but only theoretically. It is important to see whether the epidemiological data suggest higher transmission of this variant is occurring.
Shortly after its discovery, UK researchers noticed that this variant was dramatically increasing in prevalence (i.e. proportion of cases) in certain regions of England, especially Kent in the Southeast of England where it was first discovered, London, and the East of England, and they also noted that the virus also seemed to be spreading faster in these regions.
On 12/23, a technical report was published online by Davies, et al. (2020) from the Centre for Mathematical Modelling of Infectious Diseases that provided data to support these claims, plotting the prevalence of this variant in London/Southeast/East regions of England vs. all other regions of England through 12/1 in their Figure 1A:
While comprising only 5-10% of cases in these three regions in early November, it rapidly overtook other variants to make up nearly 50% of total cases there by early December. Figure 1B in the paper plotted the reproduction number, Rt, representing the average number of others infected by each confirmed case, against the proportion of samples with S gene dropout (SGTF), a surrogate for prevalence of VOC, for local authority areas in England, with the color indicating the broad region of England in which the local authority area resides.
The linear trend in this plot suggests that the areas with higher proportions of VOC also had faster transmission. However, this does not necessarily mean that there is a causal link between VOC and faster transmission, for several reasons:
If you look at the trends in Figure 1B within each color group, there is not a strong association between Rt and SGTF within the regions (i.e. within each color group) -- the apparent linear trend in the plot is driven by differences between the regions. There is a statistical phenomenon known as Simpson's paradox that suggests that plots like this aggregating across regions can be misleading when the results are driven by regional effects.
It is possible that the increase of Rt and the increase in prevalence of the VOC may have been coincidental -- that there were other environmental factors at work in these regions that led to increased transmission but had nothing to do with the VOC. The concentration of this variant within these three regions of the UK also allowed for the potential that it was simply a regional effect. Statistical adjustments for regional effects would be necessary to rule out these potential explanations.
Viral transmission as measured by Rt was increasing in November and December in many places all over the northern hemisphere, with most having no evidence for the new variant, as a result of various environmental and societal factors including colder weather, holiday gatherings, people spending more time indoors, and pandemic fatigue causing people to let their guards down. Thus, any variant that happened to arise in November or December somewhere in the world in the midst of such a surge would appear to be driving transmission whether it indeed was or not. Statistical adjustments for potentially confounding time effects would be necessary to rule out these potential explanations.
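The first of these concerns, Simpson's paradox, is easy to reproduce with a toy example. The sketch below uses entirely synthetic numbers (three hypothetical regions with made-up SGTF proportions and Rt values, not the Davies et al. data), arranged so that SGTF and Rt are uncorrelated within each region even though the pooled points show a strong linear trend.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# SYNTHETIC region centers: (mean SGTF proportion, mean Rt). Purely illustrative.
REGION_CENTERS = {
    "region A": (0.10, 0.90),
    "region B": (0.30, 1.10),
    "region C": (0.60, 1.30),
}
# Offsets chosen so SGTF and Rt are exactly uncorrelated inside a region.
X_OFFSETS = (-0.05, 0.0, 0.05, 0.0)
Y_OFFSETS = (0.0, 0.05, 0.0, -0.05)

within = {}
pooled_x, pooled_y = [], []
for name, (cx, cy) in REGION_CENTERS.items():
    xs = [cx + dx for dx in X_OFFSETS]
    ys = [cy + dy for dy in Y_OFFSETS]
    within[name] = pearson(xs, ys)
    pooled_x.extend(xs)
    pooled_y.extend(ys)

pooled_r = pearson(pooled_x, pooled_y)

for name, r in within.items():
    print(f"{name}: within-region correlation = {r:.3f}")
print(f"all regions pooled: correlation = {pooled_r:.3f}")
```

Here the pooled trend comes entirely from differences between the regions, which is why an analysis that stratifies by region is needed before reading anything causal into such a plot.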
If I had written this blog post two weeks ago and only had access to these data, I would have expressed extreme skepticism about the claims that this VOC was inherently more transmissible because of the lack of adjustment for regional and time effects, and my perspective would have been that the media reports were possibly alarmism driven by improper analysis of the data. However, more data and reports have come out in the past two weeks that provide much stronger support for the increased transmissibility of VOC.
On 12/28, Tom Wenseleers did an analysis that he reported on Twitter that plotted the change in VOC prevalence in different UK regions from September 1 through December 15:
While London, East of England, and the Southeast show the highest VOC prevalence and fastest growth rate, these plots show that the VOC is present and increasing exponentially in all regions of the UK, with some regions further behind but on the same trajectory. If the increasing prevalence and transmission of this variant was simply a regional effect, we would not see the variant increasing all over the UK, and the fact that its prevalence is growing exponentially everywhere suggests it has some type of evolutionary advantage over other strains.
A plot (Figure S1) from a technical report released on 12/31 by a group led by Erik Volz at the MRC Centre for Global Infectious Disease Analysis in London visually illustrates how the VOC has quickly overtaken the UK since its origin in Southeast England in early November through the middle of December (with VOC prevalence estimated by prevalence of SGTF, which has been shown to be a reliable surrogate for VOC in the UK for this time period).
This demonstrates that the VOC has rapidly displaced all others and came to dominate the UK over a period of just 6 weeks. It is hard to imagine how such a rapid displacement could possibly occur on this scale without the variant having a substantial fitness advantage, for which increased transmissibility is the most likely explanation.
The report by Volz et al. also contained an Rt-based analysis to assess whether the VOC had increased transmission, but unlike the Davies analysis, this one stratified by region and time and so adjusted for the potential confounding factors mentioned above. For each region and week, they estimated the VOC prevalence based on SGTF data, and used the corresponding case counts to estimate the reproduction number Rt for the VOC and non-VOC samples for each region and week. Figure 6 of their paper plots the paired Rt for VOC and non-VOC samples for all STP subregions within the regions of the UK.
The scatterplot on the right shows that even when controlling for region and time, the reproductive number Rt is systematically greater for VOC samples than non-VOC samples. The median Rt for VOC regions is 1.45 and for non-VOC regions is 0.92, for a ratio of 1.58, suggesting that the VOC has a 58% transmission advantage over non-VOC variants. Note that Rt~1.00 suggests a stable situation in which the number of new cases remains constant from day to day, meaning that the level 4 lockdown in place in the UK during that time was sufficient to suppress viral surge for non-VOC samples. However, during this same lockdown, the Rt~1.50 for VOC samples suggests substantial exponential growth in number of daily cases over time that is of the magnitude seen during local surges, with this level of transmission leading to doubling of daily case counts every two weeks or so, which is an enormous rate of spread.
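To get a feel for what an Rt around 1.45-1.50 means in calendar time, here is a rough sketch that converts Rt into a doubling time. It assumes simple exponential growth with a fixed generation interval, so that cases multiply by Rt once per interval; the 6.5-day interval is a hypothetical value chosen for illustration and does not come from the reports discussed here.

```python
from math import log

def doubling_time_days(rt, generation_interval_days):
    """Days for daily cases to double, assuming cases multiply by rt
    once per generation interval (simple exponential growth)."""
    if rt <= 1.0:
        raise ValueError("cases do not grow when Rt <= 1, so there is no doubling time")
    return generation_interval_days * log(2) / log(rt)

# ASSUMPTION: 6.5 days is an illustrative generation interval, not a reported value.
ASSUMED_GENERATION_INTERVAL = 6.5

for rt in (1.25, 1.45, 1.50):
    days = doubling_time_days(rt, ASSUMED_GENERATION_INTERVAL)
    print(f"Rt = {rt:.2f}: cases double about every {days:.1f} days")
```

A shorter assumed generation interval gives a proportionally shorter doubling time, so the result is best read as an order of magnitude rather than a precise figure.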
Another report publicly released on 12/31, led by Harald Vohringer at the European Molecular Biology Laboratory, also performed an Rt-based analysis that stratified by region and time and so adjusted for these potential confounding factors. For each of 382 lower tier local authorities (LTLA, subregions) in the UK and each week, they estimated the prevalence of VOC from sequencing data, and estimated the reproductive number Rt for each LTLA and week for VOC (called B.1.1.7) and non-VOC samples. These results are presented in box plots (Figure 2B) and a scatterplot (Figure 2D) of their report:
They found that during the lockdown, the median Rt was 0.85 for non-VOC samples, suggesting the lockdown was sufficient to prevent viral surge, while the median Rt was 1.25 for VOC samples, yielding a ratio of 1.47 and suggesting that the VOC was 47% more transmissible. For 83% of the LTLA sub-regions, they found Rt>1 for VOC and Rt<1 for non-VOC samples, suggesting the increased transmission of VOC was being seen throughout the UK.
These stratified analyses provide strong evidence that the VOC is indeed more transmissible, and that it is not simply an artifact of specific environmental, regional, or temporal effects. While it is difficult to precisely infer the magnitude of increase in transmissibility in general from the specific circumstances of one country in one specific month, the consistency of these numbers suggests the media reports of 50% greater transmissibility seem about right and are strongly supported by the data.
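The transmission advantages quoted from the two stratified reports are simply ratios of the median Rt values, which can be checked directly. The labels in this small sketch are my own shorthand for the two reports; the numbers are the medians quoted above.

```python
# Median Rt values quoted above: (VOC, non-VOC). Labels are shorthand, not official names.
median_rt = {
    "MRC Centre report": (1.45, 0.92),
    "EMBL report": (1.25, 0.85),
}

def transmission_ratio(rt_voc, rt_non_voc):
    """Ratio of reproduction numbers; 1.5 would mean a 50% transmission advantage."""
    return rt_voc / rt_non_voc

ratios = {name: transmission_ratio(*pair) for name, pair in median_rt.items()}

for name, ratio in ratios.items():
    print(f"{name}: ratio = {ratio:.2f}, advantage = {100 * (ratio - 1):.0f}%")
```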
There are two other analyses that have provided support for the increased transmissibility, that seem interesting but have lower levels of statistical evidence.
First, a 12/28 report by Public Health England presented results from an analysis demonstrating that the secondary attack rate is 15.1% for VOC samples and 9.8% for non-VOC samples, for a ratio of 1.54. The secondary attack rate is the test positivity percentage among contacts who are tested because of exposure to a known confirmed positive case as part of a nationwide testing and tracing program in the UK. These results are congruent with a VOC that is 50% more transmissible, but this result does not adjust for potential regional or time confounding, so should be taken with a grain of salt.
Additionally, a technical report published online on 12/24 by a group led by Michael Kidd of Public Health England and University Hospitals Birmingham NHS Foundation Trust UK compared viral loads between a sample of VOC and non-VOC individuals (estimated based on the SGTF surrogate, called S-neg and S-pos, respectively). The viral loads were measured by Ct values from PCR, with lower Ct values indicating greater viral loads. Following is the Figure 3 from their paper plotting the values for the ORF and N gene targets, the two targets in PCR tests that are measurable in both VOC and non-VOC samples.
These results suggest that the median viral loads for VOC samples were considerably higher than non-VOC samples, indicating the median viral levels are 10 to 100 fold higher in VOC than non-VOC samples given these measurements are on a logarithmic scale. This provides a potential mechanistic explanation for the increased transmissibility hypothesis, since it is generally expected that cases with higher viral loads tend to be more transmissible. However, again this analysis did not adjust for regional or time confounding effects, so it is possible that at least some of this effect is not causally related to the mutations in the VOC but artifacts of the massive December viral surge and/or regional or environmental effects in London and the other areas in which the VOC took over in December.
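The link between Ct values and fold differences in viral load comes from the logarithmic nature of PCR. As a rough sketch, assuming ideal amplification in which the target doubles every cycle (real assays are somewhat less efficient), a gap in Ct converts to a fold difference as a power of 2. The function names here are illustrative and not taken from the report.

```python
from math import log2

def fold_change_from_ct_gap(ct_gap):
    """Fold difference in starting viral load implied by a Ct gap,
    ASSUMING ideal PCR efficiency (the target doubles every cycle)."""
    return 2.0 ** ct_gap

def ct_gap_for_fold_change(fold_change):
    """Ct gap (in cycles) that corresponds to a given fold difference."""
    return log2(fold_change)

for fold in (10, 100):
    print(f"{fold}-fold higher viral load ~ {ct_gap_for_fold_change(fold):.1f} cycles lower Ct")
```

Under this idealized assumption, a 10 to 100 fold difference in viral load corresponds to a Ct gap of a little over 3 to a little under 7 cycles.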
Does this variant cause more severe disease, lead to greater risk of reinfection, or present any problems for PCR testing or vaccine efficacy?
For reasons I will outline below, a variant with 50% greater transmissibility, if spreading all over the world, will cause major problems for our pandemic management, and this is bad news indeed. However, there are several bits of good news coming out of the data that should temper some of our trepidation over the new variant:
There is no evidence that the VOC results in more severe disease, hospitalization, or deaths. The 12/28 Public Health England report presented data from a matched analysis of VOC and non-VOC samples, with ~1700 samples each matched by region, age, and time of sample, and found that there was virtually no difference in hospitalization and mortality rates between the VOC and non-VOC samples. The paired nature of this analysis appropriately adjusts for any potential confounding effects of region, age, or time.
There is no evidence that VOC results in higher probability of reinfection. The same paired analysis also looked at reinfection rate, simply defined as positive PCR tests 90 days apart, and also found virtually no difference between VOC and non-VOC samples.
There is no difficulty detecting VOC samples with existing PCR tests. While as mentioned before, the VOC samples will fail for one of the 3 main gene targets -- the S target, they are accurately detected on the other two main gene targets -- ORF and N gene targets. This means the existing testing paradigm can continue to be used, and if this VOC rapidly displaces other variants other places in the world as it has done in the UK, then the S Gene Target Failure (SGTF) can be a useful surrogate for tracking the VOC variant cases from PCR data.
There is no reason to expect the vaccine efficacy will be affected by this variant. While the VOC is characterized by 8 mutations in the spike protein, the target of all vaccines approved or under development, experts to date have all affirmed that they expect the current vaccines should remain efficacious for this variant. As mentioned in a 1/1/21 article in Science, BioNTech CEO Ugur Sahin has emphasized that the VOC differs in only 9 of 1270 amino acids from the spike protein encoded in the BioNTech/Pfizer vaccine (and also underlying the Moderna and other vaccines), so the immune response it produces should still be effective for this variant.
If VOC doesn't lead to more lethal disease, then why should I be concerned about increased transmissibility?
Counterintuitively, a variant that increases transmission by 50% but has the same case mortality rate (CMR) is much more dangerous than a variant with the same transmission rate but a 50% higher CMR. The reasoning is that while a 50% higher CMR will linearly increase deaths by 50%, a 50% higher transmission rate will cause the cases to grow exponentially, which leads to far more deaths assuming a constant CMR (and data consistently show that in the USA the CMR has remained a relatively constant 2% since the summertime). On top of that, exponential growth in cases can overwhelm medical systems, which, as we saw in the spring, itself increases the CMR.
To illustrate this point, I used the nice epidemic calculator developed by Gabriel Goh, using parameters relevant for Philadelphia, with a population of approximately 1.5 million and starting with 5000 active cases, which is roughly the number of new confirmed cases in the past 7-10 days. Currently, the Rt for Philadelphia is roughly 1.24, and the CMR is roughly 2.1%. Holding these assumptions constant, following is the growth of cases, hospitalizations, and deaths over time:
Under this scenario, COVID-19 would cause 232 deaths by 30 days, 937 by 60 days, and 2314 by 90 days. If this were a variant with the same transmission but a death rate of 2.9% (38% higher), then here are the projected results:
In this case, deaths increase to 320 by 30 days, 1293 by 60 days, and 3196 by 90 days, a linear 38% increase in deaths. Suppose, however, that the death rate remained 2.1% but the transmission was increased by roughly 50%, increasing the Rt from 1.24 to 1.88:
We can see this is drastically worse, leading to 549 deaths by day 30, 5,546 by day 60, and 14,747 by day 90, with a 50% increase in transmission producing a more than 6-fold increase in deaths in 3 months. Here is a plot of the deaths over time for these three scenarios:
Not only would increased transmission lead to increased deaths, but also to far greater hospitalizations. From these scenarios, the peak number hospitalized is more than 4-fold greater in the scenario with 50% higher transmission, increasing the chance of hospitals being overwhelmed, which would further increase the CMR and lead to even more deaths.
This illustration is not meant to be a prediction for what is expected to happen in Philadelphia -- it assumes the Rt increases by 50% and remains constant over time, but illustrates how much increasing transmission results in worse outcomes. It does not just affect case counts, but deaths, hospitalizations, and the entire ability of our society to manage the pandemic and keep it from growing out of control.
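For readers who want to experiment with the linear-versus-exponential argument themselves, here is a minimal sketch of the same three scenarios using a bare-bones daily-step SIR model. This is not the epidemic calculator used above: it has no incubation period, no hospitalization stage, and no delay between infection and death, and the 5-day infectious period is a hypothetical value. Its absolute numbers will therefore not match the figures quoted above; only the qualitative comparison between scenarios is meaningful.

```python
def cumulative_deaths(rt, cmr, days, population=1_500_000,
                      initial_cases=5_000, infectious_days=5.0):
    """Deaths after `days` in a daily-step SIR model.

    ASSUMPTIONS (illustrative only): deaths are cmr times everyone ever
    infected, with no lag, and the infectious period is 5 days.
    """
    susceptible = float(population - initial_cases)
    infectious = float(initial_cases)
    ever_infected = float(initial_cases)
    for _ in range(days):
        # New infections scale with rt and with the remaining susceptible share.
        new_infections = (rt / infectious_days) * infectious * susceptible / population
        recoveries = infectious / infectious_days
        susceptible -= new_infections
        infectious += new_infections - recoveries
        ever_infected += new_infections
    return cmr * ever_infected

# (Rt, CMR) for the three scenarios described in the text.
scenarios = {
    "baseline": (1.24, 0.021),
    "38% higher death rate": (1.24, 0.029),
    "~50% higher transmission": (1.88, 0.021),
}

for day in (30, 60, 90):
    for name, (rt, cmr) in scenarios.items():
        deaths = cumulative_deaths(rt, cmr, day)
        print(f"day {day:2d} | {name:26s} | {deaths:10,.0f} deaths")
```

In this sketch the higher death rate scales deaths by exactly 2.9/2.1 at every time point, whereas the higher Rt changes the growth rate itself, so its effect compounds until the pool of susceptible people starts to run out.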
My purpose in this very long blog post is to look at some of the key data that has arisen about this new UK variant, and assess whether all the alarm is warranted. While the pandemic is hard to predict and there is considerable uncertainty, there is solid statistical evidence for greater transmissibility even after adjusting for key confounding factors, and a number of different analyses support the widely cited estimate of 50% more transmissibility. Also, the speed with which the VOC is displacing the other variants all around the UK is remarkable, strongly suggesting an evolutionary advantage, and providing more support that it is indeed more transmissible. Combined with its characteristic mutations that are known to affect biological processes related to replication and spread of the virus, this together provides strong evidence that this strain is inherently more transmissible.
Will this variant similarly displace all other variants around the world? While this is not clear, it is not unreasonable to expect it to do so. Early in the pandemic, a European strain that appeared to have a mild transmission advantage, D614G, quickly became the dominant strain and ended up displacing all others around the world, with 99.3% prevalence by November 2020.
There are already reports of the UK VOC in various parts of the USA, first reported in Colorado on 12/29, with cases now also documented in California, Florida, Georgia, New York, and Texas, and many experts think it may have been spreading in the USA since mid-November. It is even possible that the disparate viral surges currently seen in New York and Southern California may be at least partially fueled by this variant -- the lack of broad sequencing of cases in the USA means we currently have limited data to investigate this possibility. It would be wise to prepare ourselves for the possibility that this variant may become the predominant one in the USA in the coming months.
If this happens, and indeed does prove to be 50% more transmissible, then we would expect the increased transmission we have seen throughout the late fall/early winter to continue and maybe even increase in magnitude. As can be seen in the UK data, an increase of 50% in transmissibility can transform a stable situation to one with out-of-control exponential spread, and make policies and practices that were keeping spread under control to no longer be sufficient.
This would have numerous implications for our personal behavior as well as policy decisions:
On a personal level, we would need to be extra vigilant. In spite of any pandemic fatigue we might feel, we need to keep up the precautionary practices that we know best protect us from infection: avoiding crowded indoor places, especially if poorly ventilated, gathering outdoors, and practicing distancing and mask-wearing when together with others. But with increased transmissibility, these might not be enough to prevent surges, so if possible it might be a good idea to hunker down even more during times of high transmission, and be extra careful not to expose our vulnerable family and friends just in case we are infected.
Policy makers would have difficult decisions to make about whether stricter mitigation strategies including lockdowns are necessary to get transmission under control. Much of the country has little appetite for that, and with the pending inauguration of a new president and the fractured political climate in the country, it might be especially difficult to get broad buy-in for these strong mitigation strategies. They may be put in a difficult position of having to choose between two unattractive alternatives: (1) allowing much greater transmission, with its accompanying deaths and potentially overwhelming the healthcare systems, or (2) imposing stronger societal restrictions that would be damaging and extremely unpopular.
This would further raise the urgency to optimize the vaccine distribution process. The best chance for ending the pandemic has been and remains achieving herd immunity through vaccination, and as higher proportions of the society are vaccinated we will see reductions in transmission, hospitalizations and deaths long before any ultimate herd immunity threshold is reached. The good news is that we expect the vaccines to still be effective against this variant, so these plans would not change. However, if we are in the midst of a nationwide surge, the speed and efficiency with which we overcome the supply and logistical challenges of distribution and the vaccine hesitancy in many pockets of our society will have even greater implications on the ultimate consequences of the pandemic for many.
In conclusion, while it is not certain that this variant will overtake the USA or that it will produce a 50% increase in transmission everywhere, the data and analyses presented to date strongly support this possibility, so we should prepare ourselves for that eventuality. I am hopeful that our scientists will find ways to track the spread of this variant around the country through increased sequencing and/or collection and aggregation of regional SGTF data to provide data on whether this is indeed occurring.
We should be encouraged that there is no reason to believe this variant leads to more severe disease, greater chance of reinfection, or interferes with the operation of the testing and vaccination procedures we have in place. But as demonstrated above, increased transmissibility leads to many serious consequences, and may require stronger mitigation as we work towards herd immunity via vaccination to ensure we minimize the number of preventable deaths.
Image from NPR article