
Evaluating USA state-level excess death levels vs. total vaccination rates during the pandemic

There is a great deal of interest in assessing the individual and population-level effects of vaccination on various outcomes during the pandemic, including all-cause deaths.


One important question is whether vaccination programs have led to a reduction of all-cause deaths relative to what would have occurred had there been no vaccination.


This question is very difficult to answer analytically. It requires causal inference techniques applied to data sets containing all-cause death outcomes and vaccination status at the individual level, modeled in a way that adjusts for key confounding factors related to both vaccination status and risk of death, in order to obtain valid estimates of the causal effect of vaccination.


While numerous studies of this type have been done to estimate vaccine effectiveness against outcomes such as infection, severe disease, hospitalization, and fatal disease, to my knowledge there are no such studies to date looking at this question for all-cause deaths.


One reason is that it is a challenge to assemble individual-level all-cause death data over time for a large population that includes complete information on vaccination as well as key confounders including age, health status, socioeconomic status, previous SARS-CoV-2 infection status, and healthcare usage history. I am hoping that some researchers can assemble such data and perform a rigorous analysis to answer this question. It seems more plausibly done in countries with centralized healthcare systems like Canada, UK, or Israel than in a country with a fragmented healthcare system like the USA.


In the absence of such studies, social media is rife with attempts to address this question by correlating aggregated all-cause death data for various jurisdictions (countries/states/counties) with aggregated vaccination rate data for the same jurisdictions over a given period of time, for example using scatterplots.


These scatterplots are easily misused when people try to infer causal principles about vaccines from them.


Hazards of trying to make causal claims about vaccines using scatterplots

In the setting of trying to relate COVID-19 infection to vaccination, a correspondence (i.e. letter to the editor) published in the European Journal of Epidemiology in Fall 2021 used this approach and received an inordinate amount of attention, with >2m accesses and an Altmetric score of 26,412, garnering more attention than almost any other paper published in 2021.


Basically, what they did was download from a public repository the PCR-confirmed COVID infection rates for 68 countries for the 7-day period of August 28, 2021 - September 3, 2021 and plot them against the vaccination rates for those countries.

The lack of a negative relationship between vaccination rate and infection in this scatterplot was used to imply that vaccines were not working to reduce risk of infection, and this narrative is what led to this published correspondence going viral on social media.


This claim is contradicted by a large literature of vaccine effectiveness studies that modeled individual-level infection and vaccination data while adjusting for confounders, demonstrating that vaccines had high effectiveness in reducing risk of infection (at least for the Delta variant prevalent at this time).


Trying to draw inference on individual vaccine effects based on country-level associations like this is an example of the well-known ecological fallacy, trying to make inference on individual-level associations from group-level associations. The problem is that country-level confounders that are related both to vaccination proportion and infection rate in that specific week can drive the associations in this plot, not the effect of an individual's vaccination on their risk of infection. In the case of this paper, potential factors confounded with vaccination rate and infection rate for that one specific week include country-wide COVID-19 testing and reporting rate, population density, and the timing of their Delta COVID-19 surge -- whether they happened to be in the midst of it during the arbitrarily chosen week of August 28, 2021 - September 3, 2021.
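The ecological fallacy can be made concrete with a small worked example. The sketch below uses two made-up countries (all numbers are hypothetical and chosen only for illustration): in both, vaccination halves an individual's infection risk, but the higher-vaccination country happens to be in the midst of its surge during the chosen week.

```python
# Toy illustration of the ecological fallacy.
# All numbers are hypothetical; they are not taken from any data set.

def overall_rate(vax_share, rate_unvax, rate_vax):
    """Population infection rate as a weighted mix of the
    unvaccinated and vaccinated infection rates."""
    return (1 - vax_share) * rate_unvax + vax_share * rate_vax

countries = {
    # Country A: high vaccination, mid-surge during the chosen week.
    "A": {"vax_share": 0.80, "rate_unvax": 0.10, "rate_vax": 0.05},
    # Country B: low vaccination, surge has not arrived yet.
    "B": {"vax_share": 0.40, "rate_unvax": 0.02, "rate_vax": 0.01},
}

for name, c in countries.items():
    total = overall_rate(c["vax_share"], c["rate_unvax"], c["rate_vax"])
    risk_ratio = c["rate_vax"] / c["rate_unvax"]
    print(f"Country {name}: vaccinated {c['vax_share']:.0%}, "
          f"overall infection rate {total:.1%}, "
          f"vaccinated/unvaccinated risk ratio {risk_ratio:.2f}")
```

In this toy setup the country-level comparison shows the higher-vaccination country with the higher overall infection rate (6% vs. 1.6%), even though vaccination cuts individual risk in half in both countries. The group-level association is driven entirely by surge timing.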


The final point highlights how time can be the largest confounder and country-level associations can strongly vary over time, so any investigation of correlations like these must cover a large time frame or better yet be considered dynamically over time to avoid potential Simpson's paradox effects.


In spite of these concerns and limitations, many people are interested in looking at these types of scatterplots, and they continue to appear prominently in public discourse. For this reason, I will consider them in this blog post -- generating such plots exploring the USA state-level excess death data posted by the CDC, and specifically evaluating correlation with state-specific vaccination rates. I present plots illustrating this correlation, and discuss the caveats and severe limitations of what we can learn from such aggregated state-level associations.


CDC Data on Excess Deaths

For this blog post, I downloaded the state-level weekly percent excess death data posted on the CDC website. These data use several years of state-level death rates for each week of the year to estimate a background death rate for each week of the year, from which the percent deaths above baseline is computed for each state for each week of the pandemic.
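As a rough sketch of the idea behind a percent excess figure, the snippet below computes the percent by which an observed weekly death count exceeds a baseline. It assumes the baseline is simply the mean of the same week in prior years, which is a simplification and not the CDC's actual estimation method; the function name and the counts are hypothetical.

```python
from statistics import mean

def percent_excess(observed, baseline_counts):
    """Percent by which observed deaths exceed the expected count.

    Assumption: the expected count is the plain mean of the same week
    in prior years. The CDC's published method is more involved.
    """
    expected = mean(baseline_counts)
    return 100.0 * (observed - expected) / expected

# Hypothetical state-week: 1,200 deaths observed, with the same week
# in three pre-pandemic years averaging 1,000 deaths.
print(percent_excess(1200, [1000, 980, 1020]))
```

Because each state is compared to its own baseline, a state with a chronically higher death rate does not automatically show higher percent excess deaths.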


Technical notes on how they calculate percent excess deaths are available through this link. These are based on provisional deaths based on aggregating reported deaths from various USA municipalities, so the most recent months' data will tend to be incomplete as there are some lags in reporting from some municipalities. Official death counts are not finalized until at least 12 months after the close of a given year. Although the CDC also releases weighted data that attempts to adjust for the estimated incompleteness for this reporting lag, here we model the raw values.


I relate these values to the state-level percent fully vaccinated as of July 31, 2022, which are contained in the following file:

State fully vaccinated rate 2022-07-31.xlsx (XLSX, 10KB)

I call this the state-level "propensity to vaccinate", summarizing the vaccine uptake for the given state.

I plot scatterplots of state-level percent excess deaths vs. propensity to vaccinate, overlay a smooth nonlinear fit to the data along with 95% error bars, and compute the Spearman rank correlation as a nonparametric measure of association between the variables along with the associated p-value testing whether the Spearman correlation is nonzero.
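For readers unfamiliar with the Spearman rank correlation, here is a minimal standard-library sketch of how the coefficient can be computed: rank each variable (averaging ranks for ties) and take the Pearson correlation of the ranks. This is not the linked R script; the function names and the five state values are hypothetical, and the p-value is omitted (in practice one would use something like R's `cor.test(..., method = "spearman")` or `scipy.stats.spearmanr`).

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values for five states (not real data).
pct_fully_vaccinated = [52.0, 58.5, 63.0, 70.2, 78.9]
pct_excess_deaths = [24.0, 21.5, 22.0, 15.0, 12.5]
print(round(spearman(pct_fully_vaccinated, pct_excess_deaths), 3))
```

Working on ranks rather than raw values makes the measure insensitive to outlying states and to any monotone but nonlinear relationship.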


Following is an R script to reproduce the plots contained in this blog post:


USA_state_excess_analysis.R (R, 6KB)


Looking at the Data

To start, here is a plot of total estimated percent excess deaths since the beginning of the pandemic until 7/31/22 plotted vs. state-level full vaccination rates as of 7/31/22.

Over the full period of time, we see a negative correlation, in that states with higher propensity to vaccinate tended to have lower excess death rates during the pandemic.


As I have repeatedly emphasized, this should not be taken as a measure of "vaccine effectiveness vs. excess deaths" because of all of the potential state-level confounding factors. Causal inference from observational data is notoriously difficult, especially in complex dynamic times like the COVID-19 pandemic, but any reasonable causal estimate must come from studies modeling individual outcomes by their vaccine status using analytical methods and/or study designs that can at a minimum adjust for key confounding factors.


Note that "percent excess death" already adjusts for systematic differences in death rates in the different states, since it computes change from a state-specific baseline death rate at the same time of year in pre-pandemic years. But there still may be remaining confounding from numerous state-level factors.


We know that states with a "high propensity to vaccinate", i.e. high vaccination states, also tend to have:

  1. more affluence

  2. better baseline health

  3. more developed healthcare systems

  4. more dense urban populations

  5. stronger mitigation strategies (e.g. business/school closures and stay at home orders) for longer periods of time in 2020-2021.

  6. populations tending to show higher levels of concern about COVID

  7. populations with better attitudes towards mitigation/healthcare

  8. geographic location in Northeast, Middle-Atlantic or West Coast

These factors potentially affect percent excess deaths so it is not possible to tell whether differences in state-level percent excess deaths are due to vaccination or these factors, or other time-varying factors such as infection rates that vary by region.


For example, although the "percent excess deaths" calculations already adjust for the higher death rate inherent to a population with older demographics, they do not adjust for the extra risk of COVID-19 death if infected that comes with older age, so age is a potential confounder if states with older demographics tended to have higher/lower vaccination.


Given that different regions have different climates and had COVID-19 waves at different times, regional effects are heavily confounded with vaccination propensity:

  1. High vaccination rates: Northeast, Middle Atlantic, West Coast

  2. Low vaccination rates: Southeast, Appalachia, Southwest, Northern Plains/Mountain

  3. Moderate vaccination rates: Midwest, Mountain West

Thus, we need to consider the regional COVID-19 infection patterns over time when interpreting these data.


As a result of these confounding factors, one would expect that this correlation of state level excess deaths and vaccination rates might vary substantially over time. Indeed, this variability over time would allow individuals with certain agendas to cherry-pick certain time periods that have correlations that match their preferred narrative, be it pro-vaccine or anti-vaccine.


To show this variability, here is a movie file plotting state-level excess deaths vs. vaccination propensity for 1-month long windows from the beginning of the pandemic to the present time.
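The windowed calculation can be sketched as follows. This is a simplified stand-in, not the linked R script: it assumes calendar-month windows, averages the weekly percent excess values within each month, and uses hypothetical vaccination rates and excess death values for three states.

```python
from collections import defaultdict
from statistics import mean

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Pearson correlation of the ranks; assumes neither input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5

def monthly_correlations(weekly, vax):
    """weekly: (state, 'YYYY-MM-DD' week-ending date, percent excess) tuples.
    vax: {state: percent fully vaccinated}.
    Returns {'YYYY-MM': Spearman correlation across states}."""
    by_month = defaultdict(lambda: defaultdict(list))
    for state, week_end, pct in weekly:
        by_month[week_end[:7]][state].append(pct)
    result = {}
    for month in sorted(by_month):
        states = sorted(by_month[month])
        x = [vax[s] for s in states]
        y = [mean(by_month[month][s]) for s in states]
        result[month] = spearman(x, y)
    return result

# Hypothetical inputs (not real data).
vax = {"NY": 75.0, "OH": 60.0, "MS": 50.0}
weekly = [
    ("NY", "2020-04-11", 150.0), ("OH", "2020-04-11", 10.0), ("MS", "2020-04-11", 5.0),
    ("NY", "2020-04-18", 130.0), ("OH", "2020-04-18", 12.0), ("MS", "2020-04-18", 8.0),
    ("NY", "2020-07-18", 2.0), ("OH", "2020-07-18", 10.0), ("MS", "2020-07-18", 40.0),
]
for month, rho in monthly_correlations(weekly, vax).items():
    print(month, round(rho, 2))
```

In this toy input the sign of the correlation flips between the two months purely because the surge moves from one region to another, which is the kind of time variation the movie is meant to show.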

You can explore the entire video above, but I will summarize the % excess death/vaccination propensity association for key time periods and mention what was going on in terms of COVID-19 surges/mitigation/vaccination during those time periods, highlighting the variation of these associations over time based on these factors.


In the month immediately preceding the pandemic, most states had little to no evidence of excess deaths above baseline relative to pre-pandemic years.

In Spring 2020, the pandemic first hit the Northeast in NY/NJ/New England with massive surges and excess deaths. Given the initial surge was localized in the Northeast there is a weak positive association between % excess deaths and vaccination propensity:



In Summer 2020, as people moved indoors to A/C in hot southern climates, a massive COVID surge produced high levels of excess deaths (TX/FL/AZ/MS/AL/LA/GA), leading to a moderate negative correlation between excess deaths and vaccination propensity.



In late Fall 2020, we see the COVID surge hit the Midwest (IA, NE, KS, MO, MN, IL, WI) and Mountain West (CO/UT/NV/MT/AK), regions with moderate vaccination propensity, while remaining low in the Northeast, producing weak to moderate negative associations between excess deaths and vaccination propensity.



In early Winter 2020-21, the Northeastern/Middle Atlantic states had a surge and caught up. A weak negative association with high excess deaths gave way to a flat excess death profile with little association and low excess deaths as vaccine rollouts were just starting.




In Spring 2021, most of the adult vaccination occurred in the USA, and the excess death rates were the lowest of the pandemic. The COVID-19 levels came down very low from the winter surge, and there was a weak negative association with vaccination propensity.



In late Summer 2021, the Delta surge hit, especially in the hot southern climates, leading to very high excess death rates and moderate negative associations with vaccination propensity. BTW, as is clear in other CDC data as well as Aggregated National Group Life Insurance Data, young/middle age adults had their highest excess deaths of the pandemic at this time, with these deaths focused primarily in the Southeastern region of the USA.


As in 2020, the COVID wave hit the Midwest/Mountain West in Fall 2021 and produced moderately high excess deaths in the late fall, while the Northeast/Middle Atlantic stayed low, leading to weak-to-no association of excess deaths with vaccination propensity.



In Winter 2021-22, a massive wave of the immune-escape BA.1 Omicron variant hit the entire country. While the infection fatality rate was lower than for previous variants, the high infection rates produced high excess deaths, and little to no association with vaccination propensity.


In late winter, excess death rates declined more rapidly in the Midwest, Mountain West, Northeast, Middle Atlantic, & West Coast, regions with moderate to high vaccination propensity, leading to a moderately negative association between excess deaths and vaccination propensity.


In Spring/Early Summer 2022, the BA.2 wave hit the USA, and starting in May we see weakly positive associations with vaccination propensity, with some Northeastern states (DE/CT/VT) having high death rates that flatten out by summer, but with excess death rates very low. Given this is within the most recent 3 months of data, these time points are more subject to change than the earlier ones as lagged death reports continue to roll in.



Conclusions from these Plots

One observation we can make from these plots is that it appears much of the state-level excess death-vaccination propensity associations can be explained by regional variations in COVID-19 surges. This makes sense given that COVID-19 accounts for the lion's share of excess deaths during the pandemic (as also found by the national group life insurance excess deaths report).

There may have been vaccine effects in reducing excess deaths relative to a counterfactual scenario of what would have occurred sans vaccines, especially in 2021 during the Delta surge, but we can't know from these data alone. Again, we can only make such inferences from careful studies done on data sets with individual vaccination and event levels and adjusting for confounders. We have such studies looking at severe and fatal COVID-19, but not on excess deaths.


However, these data demonstrate the implausibility of persistent claims in some circles of high numbers of vaccine-induced deaths. These claims are largely fueled by the passive reporting system VAERS, for which the vast majority of reported post-vaccination deaths are within a week or two of the shots, yet the lowest excess death rates of the pandemic occurred at the time most adults were vaccinated (late winter/spring 2021).


Any vaccine-related deaths must be needles in the haystack of excess deaths dominated by COVID-19 deaths, given that the spikes of excess deaths track with COVID-19 surges and deaths. This is also seen in other countries, as shown by the plots of excess deaths vs. COVID-19 deaths and vaccination rates across many countries of the world assembled by @hmatejx, including for example:


While we might try to glean some insights from these state-level associations, the key principle is that we cannot draw any conclusions about individual-level associations or causal factors from these summaries.


Poverty level is one of the key state-level factors that illustrates the impossibility of disentangling causal effects of any individual factor from state-level summaries.


Poverty is correlated with all kinds of relevant factors, including robustness of health care system, attitudes and practices towards healthcare, rigor of Covid mitigation strategies and testing practices, preexisting medical conditions, and vaccination uptake.


The excess death calculation already adjusts for the inherently higher death rate from preexisting conditions, healthcare system, and lack of use of preventative care for a given state, but it does not adjust for COVID-19-related or other pandemic-related factors including rigor of state-level testing and mitigation practices, personal attitudes and behaviors towards the virus and mitigation, or vaccination uptake.


It also doesn’t adjust for interaction effects: if the effects of preexisting conditions or weaknesses of state healthcare systems are exacerbated by the pandemic conditions, then these could still be confounding factors for excess death rate during the pandemic and obscure any attempt to use state-level summaries to extract effects of mitigation or vaccination strategies.


Poverty appears associated with higher COVID-19 mortality, but we don't know the causal factors. We know poverty is correlated with preexisting conditions like obesity, high blood pressure, etc. that have been linked to poor COVID-19 outcomes, but it is also linked with lower healthcare usage, living in areas with less access to quality healthcare, perhaps higher viral load exposure based on vocation, less adherence to mitigation practices, and higher population density. Disentangling complex associated factors like these from observational data is exceedingly difficult, and impossible from aggregated state-level summaries like the ones considered in this blog post.










