Israeli data: How can efficacy vs. severe disease be strong when 60% of hospitalized are vaccinated?
Updated: Oct 20, 2021
A surge involving the rapidly transmitting Delta variant in heavily vaccinated countries has led to much hand-wringing that the vaccines are not effective against Delta, or that vaccine effectiveness wanes after 4-6 months. This has fueled anti-vaccine sentiment suggesting the vaccines are not working, and has caused much stress among vaccinated people who worry that they are not as protected as they thought they would be.
In this post, I will focus on vaccine effectiveness vs. severe disease/hospitalization, which is the key factor for public health. I will not deal with vaccine effectiveness vs. symptomatic or asymptomatic disease here -- that has its own set of nuances that I will save for a future post.
One disturbing result that has been reported from several locations is that a high proportion of patients hospitalized for COVID-19 are vaccinated. For example, data from the Israeli government data dashboard show that nearly 60% of all patients currently hospitalized for COVID-19 (as of August 15, 2021) are vaccinated (the downloaded data set and details are found at the bottom of this post). Out of 515 patients currently hospitalized with severe cases in Israel, 301 (58.4%) were fully vaccinated, meaning two doses of the Pfizer vaccine.
I have seen this statistic of "nearly 60% of Israeli hospitalized COVID-19 patients are fully vaccinated" mentioned in numerous media reports and social media posts, for example see here.
In many of these, the statistic is used as evidence to support a narrative suggesting vaccines don't work or have lost their effectiveness vs. severe disease. I have also seen other articles quote this type of figure as further evidence of reduced vaccine effectiveness when trying to justify 3rd shot boosters.
However, while these numbers are true, to quote them as evidence for low vaccine effectiveness is wrong and misleading. Sometimes, with observational data there is confounding of multiple factors that can make it easy to misinterpret simple percentages like this, and the current vaccination situation in Israel brings a perfect storm of confounding factors that lead to confusion if not thought through carefully.
In particular, the key factors here that contribute to this confusion are:
High vaccination rates in the country (nearly 80% of all residents >12yr)
Age disparity in vaccinations, including
Nearly all older people being vaccinated (>90% of residents >50yr) and
The vast majority of unvaccinated being younger people (>85% of unvaccinated <50yr)
Older people are orders of magnitude more likely to be hospitalized with a respiratory virus than young people (residents >50yr are >20x more likely to have hospitalized serious infections than residents <50yr, and residents 90+ are >1600x more likely to have hospitalized serious infections than residents 12-15yr)
After accounting for the vaccination rates and stratifying by age groups, from these same data we can see that the vaccines retain high effectiveness (85-95%) vs. severe disease, showing that when it comes to preventing severe disease, the Pfizer vaccine is still performing very well vs. Delta, even in Israel from whence the most concerning data have arisen. I will present the raw data in tables and step through these results. I will focus on fully vaccinated vs. unvaccinated to streamline the presentation, but the same data show that partial vaccination also provides a decent level of protection vs. severe disease (75-85%).
Adjusting for Vaccination Rate
It is true that nearly 60% of active serious cases are vaccinated, but such an analysis based on raw counts can be misleading since it is heavily influenced by the vaccination rates.
When vaccination rates are low, use of raw counts can exaggerate the vaccine effectiveness, and when vaccination rates are high, use of raw counts like this can attenuate the vaccine effectiveness, making it seem lower than it in fact is.
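To make this dependence on vaccination rate concrete, here is a minimal Python sketch. It assumes only two groups (fully vaccinated and unvaccinated) with the same baseline risk, and a hypothetical true effectiveness of 90% that is held fixed. The function name and all input values are illustrative assumptions, not figures from the Israeli data.

```python
def vaccinated_share_of_cases(coverage, effectiveness):
    """Expected fraction of severe cases that occur in vaccinated people.

    Assumes two groups only (vaccinated / unvaccinated) with identical
    baseline risk, so the vaccine is the only thing that differs.
    """
    vaccinated_cases = coverage * (1 - effectiveness)
    unvaccinated_cases = 1 - coverage
    return vaccinated_cases / (vaccinated_cases + unvaccinated_cases)


# Hypothetical true effectiveness, held fixed while coverage varies.
TRUE_EFFECTIVENESS = 0.90

for coverage in (0.10, 0.50, 0.80, 0.95):
    share = vaccinated_share_of_cases(coverage, TRUE_EFFECTIVENESS)
    print(f"coverage {coverage:.0%}: {share:.1%} of severe cases are vaccinated")
```

The effectiveness never changes in this sketch, yet the vaccinated share of severe cases climbs from roughly 1% at 10% coverage to roughly two thirds at 95% coverage. The raw share by itself therefore says little about how well the vaccine works.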
Note that a high proportion (nearly 80%) of all Israeli residents >=12yr have been vaccinated. To adjust for vaccination rates, one should normalize the counts (of severe cases, in our setting), for example by computing the number of cases per 100,000 people in each group.
After this adjustment, we see that the rate of severe cases is 16.4/5.3=3.1x higher in unvaccinated individuals than fully vaccinated individuals. This suggests the vaccines are suppressing severe disease.
Here I will define effectiveness vs. severe disease as 1 - V/N, where V = rate of severe cases per 100k for fully vaccinated and N = rate of severe cases per 100k for unvaccinated. This represents the percent reduction in the serious infection rate in the vaccinated group relative to the unvaccinated group.
The effectiveness of vaccine vs. severe disease can be computed from this ratio by:
Vaccine Effectiveness vs. Severe disease = 1 - 5.3/16.4 = 67.5%.
The interpretation of this number is that the vaccines are preventing >2/3 of the serious infections leading to hospitalization that would have occurred sans vaccination.
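The calculation above can be sketched in a few lines of Python. The function name is my own, and the inputs are the rounded per-100k rates quoted in the text, so the result can differ from the 67.5% figure in the first decimal place (presumably that figure was computed from unrounded rates).

```python
def vaccine_effectiveness(vaccinated_rate, unvaccinated_rate):
    """Effectiveness as defined in the post: 1 - V/N."""
    return 1 - vaccinated_rate / unvaccinated_rate


# Rounded per-100k rates of active severe cases quoted in the text.
V = 5.3    # fully vaccinated
N = 16.4   # unvaccinated

rate_ratio = N / V
effectiveness = vaccine_effectiveness(V, N)

print(f"rate ratio (unvaccinated / vaccinated): {rate_ratio:.1f}x")
print(f"effectiveness vs. severe disease: {effectiveness:.1%}")
```

Note that the rate ratio and the effectiveness carry the same information: effectiveness is simply 1 minus the inverse of the rate ratio.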
Note that this is considerably lower than the >95% efficacy vs. severe disease that has been previously touted. This number makes it seem like the vaccine effectiveness vs. severe disease has substantially waned over time with this Delta variant.
However, this number is also misleading because of previously mentioned confounding of age with vaccination status and risk of disease, i.e. that older people are more likely to be vaccinated and inherently at higher risk of severe disease. We have to be careful, as I will now explain.
Imbalanced vaccination rates by age
If we split out the data by younger (<50yr) and older (>50yr), we can see that there is a sharp disparity in vaccination rates by age.
The vast majority of older people (>90%) have been vaccinated, while only 73% of younger people have been vaccinated. Looking at it another way, we see that 1,116,834/1,302,912 = 85.7% of unvaccinated individuals are younger (<50yr).
Disparity in severe disease risk by age
This vaccination disparity by age matters because there is also a major disparity in risk of severe disease by age, with older people having an inherently much higher probability of severe disease requiring hospitalization than younger people.
If we look at just the unvaccinated population, we see the risk of severe cases is 91.9/3.9=23.6x higher in older (>50yr) than younger (<50yr) people. Looking at fully vaccinated individuals, we see the risk of severe cases is 13.6/0.3=43.2x higher in older (>50) than younger (<50) people.
Vaccine effectiveness vs. severe disease by age cohort
However, since we have the data split out by age groups, we can easily compute the vaccine effectiveness vs. severe disease for each age group:
Vaccine effectiveness vs. severe disease for younger (<50yr) = 1 - 0.3/3.9 = 91.8%
Vaccine effectiveness vs. severe disease for older (>50yr) = 1 - 13.6/91.9 = 85.2%
These effectiveness measures are quite high and suggest the vaccines are doing a very good job of preventing severe disease in both older and younger cohorts. These levels of effectiveness are much higher than the 67.5% estimate we get if the analysis is not stratified by age. How can there be such a discrepancy between the age-stratified and overall effectiveness numbers?
This is an example of Simpson's Paradox, a well-known phenomenon in which misleading results can sometimes be obtained from observational data in the presence of confounding factors.
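One way to see where the discrepancy comes from is to note that each overall rate is an average of the age-specific rates, weighted by the age mix of that group. The sketch below uses the rounded per-100k rates and the unvaccinated counts quoted in this post. The age mix of the fully vaccinated is not quoted directly, so it is back-solved from the overall vaccinated rate, which is an approximation on my part. The last step, applying the vaccinated rates to the unvaccinated age mix, is direct age standardization, a standard epidemiological technique that is swapped in here for illustration and is not the calculation used in this post.

```python
# Rounded per-100k severe-case rates quoted in the post.
RATES = {
    "unvaccinated": {"younger": 3.9, "older": 91.9},
    "vaccinated": {"younger": 0.3, "older": 13.6},
}


def pooled_rate(rates, share_older):
    """Overall rate = age-mix-weighted average of the stratum rates."""
    return share_older * rates["older"] + (1 - share_older) * rates["younger"]


# Age mix of the unvaccinated, from the counts quoted in the post:
# 1,116,834 of the 1,302,912 unvaccinated residents are younger than 50.
share_older_unvax = 1 - 1_116_834 / 1_302_912

# Assumption: the age mix of the fully vaccinated is back-solved from the
# overall vaccinated rate of 5.3 per 100k, so it inherits rounding error.
OVERALL_VACCINATED_RATE = 5.3
vax = RATES["vaccinated"]
share_older_vax = (OVERALL_VACCINATED_RATE - vax["younger"]) / (
    vax["older"] - vax["younger"]
)

unvax_pooled = pooled_rate(RATES["unvaccinated"], share_older_unvax)
vax_pooled = pooled_rate(vax, share_older_vax)
# Counterfactual: vaccinated rates applied to the unvaccinated age mix.
vax_standardized = pooled_rate(vax, share_older_unvax)

print(f"older share: unvaccinated {share_older_unvax:.1%}, "
      f"vaccinated {share_older_vax:.1%}")
print(f"crude effectiveness:            {1 - vax_pooled / unvax_pooled:.1%}")
print(f"age-standardized effectiveness: {1 - vax_standardized / unvax_pooled:.1%}")
```

Under these assumptions the vaccinated group is much older than the unvaccinated group, which drags the crude estimate down. Once both groups are given the same age mix, the overall estimate falls between the two age-specific estimates, as it must.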
Simpson's paradox explained
There are various nice explanations of Simpson's paradox online, including here and here.
I will borrow a plot from the latter reference and give a simple illustration:
Suppose the horizontal axis is the dosage of a particular drug and the vertical axis is a measure of recovery probability, and that the red dots are older people and the blue dots are younger people. From the plot on the right, we see that in both younger and older people, higher doses are associated with lower recovery probabilities, so the drug clearly does not work for either age group and thus is a big bust overall. However, if we do not stratify our analysis by age, the plot on the left shows a positive relationship between dosage and recovery probability, which could lead to the erroneous conclusion that the drug was in fact working, with those receiving higher doses having higher recovery probabilities.
The reason for this paradoxical result is that both dosage and recovery probability were systematically higher in one group (younger) and lower in the other group (older). This creates a specific type of confounding that can produce such a paradox. Thus, if we do not stratify by the confounding factor (age), the overall analysis gives a blatantly misleading result.
In the case of vaccine effectiveness vs. severe disease, it is the fact that both vaccination rates and risk of severe disease are systematically higher in the older age group that makes overall effectiveness numbers misleading when estimated without stratifying by age, producing the paradoxical result that the overall effectiveness (67.5%) is much lower than the effectiveness for either of the age groups (91.8% and 85.2%).
Since the <50yr and >50yr groups are quite heterogeneous in terms of vaccination rates and risk of severe disease, it is instructive to stratify by even finer age groups:
We see quite high effectiveness in all age groups, with the 80-89 group having the lowest effectiveness (81.1%) and all others between 88.7% and 100%. We see that the current Israeli data provide strong evidence that the Pfizer vaccine is still strongly protecting vs. severe disease, even for the Delta variant, when analyzed properly to stratify by age.
In conclusion, as long as there is a major age disparity in vaccination rates, with older individuals being more highly vaccinated, the fact that older people have an inherently higher risk of hospitalization when infected with a respiratory virus means that it is always important to stratify results by age. If not, the overall effectiveness will be biased downwards and will poorly represent how well the vaccine is working in preventing serious disease (the same holds for effectiveness vs. death).
Even more fundamentally, it is important to use infection and disease rates (e.g. per 100k) rather than raw counts when comparing unvaccinated and vaccinated groups, in order to adjust for the proportion vaccinated. Use of raw counts exaggerates the vaccine effectiveness when the vaccinated proportion is low and attenuates it when, as in Israel, the vaccinated proportion is high. To rely on raw counts is to fall for the base rate fallacy.
This is not just an issue of making vaccines look worse than they are. Any summary computing the "proportion of hospitalized that are unvaccinated" over a period of time in which the proportion vaccinated was low can be similarly misleading, especially if there was a massive COVID-19 surge during that time period. For example, computing the total proportion of hospitalized COVID-19 infections in the USA that come from unvaccinated individuals while aggregating over all of 2021 (January to present), a time period that includes the early months in which virtually all USA residents were unvaccinated and there was a massive winter surge, will be similarly misleading. Thus, these artifacts can be used by some to make the vaccines look better than they in fact are, e.g. any report suggesting things like 99.9% of hospitalizations are from unvaccinated when covering a long period of time like this.
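The time-aggregation artifact described above can be illustrated with a small Python sketch. Every number in it is hypothetical (population size, coverage, baseline risk, and a constant 90% effectiveness), and it assumes the same baseline risk for vaccinated and unvaccinated people within each period.

```python
def hospitalizations(population, coverage, baseline_rate, effectiveness):
    """Expected (unvaccinated, vaccinated) hospitalizations in one period.

    baseline_rate is the per-person hospitalization risk without vaccination.
    """
    unvaccinated = population * (1 - coverage) * baseline_rate
    vaccinated = population * coverage * baseline_rate * (1 - effectiveness)
    return unvaccinated, vaccinated


POPULATION = 1_000_000      # hypothetical
EFFECTIVENESS = 0.90        # hypothetical, constant over time

# (label, vaccination coverage, baseline risk per person): all hypothetical
periods = [
    ("winter surge, rollout just starting", 0.02, 0.005),
    ("later wave, majority vaccinated", 0.60, 0.001),
]

total_unvax = total_vax = 0.0
for label, coverage, baseline_rate in periods:
    unvax, vax = hospitalizations(POPULATION, coverage, baseline_rate, EFFECTIVENESS)
    total_unvax += unvax
    total_vax += vax
    print(f"{label}: {unvax / (unvax + vax):.1%} of hospitalized are unvaccinated")

aggregate_share = total_unvax / (total_unvax + total_vax)
print(f"aggregated over both periods: {aggregate_share:.1%}")
```

In this sketch the aggregated share of hospitalizations from unvaccinated people is dominated by the early surge and is higher than the share in the later period, even though the effectiveness is identical throughout.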
The bottom line is there is very strong evidence that the vaccines have high effectiveness protecting against severe disease, even for Delta, and even in these Israeli data that on the surface appear to suggest the Pfizer vaccine might have waning effectiveness. This is clearly evident if the data are analyzed carefully, and agrees with all other published results to date from other countries.
While this is just a snapshot of currently active infections on August 15, 2021, the principles apply to other analyses done on Israeli data, as well as others.
One caveat with any effectiveness analyses with the Israeli dashboard data is that the previously infected are not separated out. Note that:
Israel did not allow previously infected to be vaccinated until 3 months into the vaccination campaign (in March)
It then made vaccination only optional for them (given that it awarded immunity passports to the previously infected even if unvaccinated) and limited them to one shot.
Given the high vaccination rate, it is plausible that a substantial proportion of the unvaccinated were previously infected. Given the overwhelming evidence from dozens of published papers that previous infection confers strong and lasting immune protection, this means those unvaccinated but previously infected have strong immune protection (possibly comparable to the vaccinated). This would serve to attenuate the effectiveness estimates, and may be one reason why the effectiveness vs. severe disease is not higher than 85-92%. Also, this might make Israel's single-dose effectiveness appear much higher than in other places, since the single-dose group also includes those previously infected who were eventually vaccinated.
More caveats to keep in mind ...
By the way, earlier reports on vaccinated cases at Israeli hospitals, when there were 152 hospitalized breakthrough infections, showed that a full 40% of these cases were immunocompromised, and 96% had co-morbidities including hypertension (71%), diabetes (48%), congestive heart failure (27%), chronic kidney and lung diseases (24% each), dementia (19%) and cancer (24%). At that time point, virtually none of the active serious breakthrough infections in Israel were in individuals without significant pre-existing conditions.
Similar effects could be lurking in other variables and settings, e.g. if people who have particular jobs like health care workers (HCW) have both (1) higher vaccination rates and (2) higher probability of exposure to SARS-CoV-2, then this phenomenon could similarly bias the overall effectiveness vs. infection numbers if results are not stratified by these factors that might differentially affect the probability of exposure. This comes into play especially when assessing whether vaccine effectiveness vs. infection wanes over time, given that in most countries the subset of young people who were vaccinated early are nearly all HCW, who also have disproportionately high exposures to SARS-CoV-2 and thus higher probabilities of infection than the younger people vaccinated later who are not HCW or other "essential personnel" prioritized for early vaccination. Similarly, we can expect that immunocompromised people were in the earliest priority vaccination group, and thus it is possible that the reduced effectiveness in people vaccinated earlier could be in part due to these factors if they are not taken into account in the analysis.
With real-world observational data, we always need to think carefully about factors like these when trying to assess vaccine effectiveness against infection, severe disease, or death.
As a result, we should be wary of any claims that simply report raw counts or overall effectiveness figures without stratification, and we need to look to careful data analyses from published papers that take these factors into account using available statistical methods for causal inference, transparently described in detail, if we want an accurate sense of the potential causal effect of vaccines. Many of the papers I have seen published from Israel, the UK, Canada, the USA and elsewhere have used rigorous methodology to adjust for these factors, which can include stratification, re-weighting, matching by confounding factors or propensity scores, or covariate adjustment, but the details of how they adjust for such factors always must be carefully evaluated when trying to interpret the implications of results from any observational study.
A few details to point out about the data and analysis:
The data used in this blog post were downloaded from the Israeli Ministry of Health Dashboard. The box on the far left, second from the top, has a down-arrow that can be clicked to obtain the data on currently active serious COVID-19 cases by age and vaccination status. These data include only Israeli residents age 12 and older. This is the data set I downloaded on August 15, 2021, for this illustration and analysis. Here is the data set just as I downloaded it (the only change is that I used Google Translate to get English headers since I don't read Hebrew).
Given they had both raw counts of cases for unvaccinated, partially vaccinated, and fully vaccinated as well as counts per 100k, I back calculated the number of fully/partially/unvaccinated in each age group. I focused here on severe infections, but the table also has numbers for total infections in each age group. For simplicity of presentation, I focused on fully vaccinated vs. unvaccinated, although the data is there for partially vaccinated as well. I also aggregated data into "young (<50)" and "old (>50)" groups to simplify the presentation, but present effectiveness estimates for each age group at the end. Here is the data set after these columns and rows were added that I used for the analyses presented:
For brevity, I focused the tables on fully vaccinated and unvaccinated only, and didn't include partially vaccinated (1 dose Pfizer). This is why the percentages don't add up to 100%, but if you take 100% - %unvax - %fullvax you get %partialvax. For example, overall it is 100% - 18.2% - 78.7% = 3.1% partially vaccinated.
BTW, my original table had two typos: the 91.9 was 90.9 and 2,133,516 was 2,170,563. These were PowerPoint cut-and-paste typos, and did not affect the percentages, cases per 100k, or effectiveness numbers. These are all correct.
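The back-calculation of group sizes mentioned in the details above can be sketched as follows. The function name is my own, and the example uses the rounded overall rate quoted in the post (5.3 per 100k) rather than the dashboard's unrounded value, so the implied population is only approximate.

```python
def back_calculated_population(cases, rate_per_100k):
    """Group size implied by a case count and its per-100k rate."""
    return cases / rate_per_100k * 100_000


# 301 fully vaccinated severe cases at a (rounded) rate of 5.3 per 100k.
implied_fully_vaccinated = back_calculated_population(301, 5.3)
print(f"implied fully vaccinated population: {implied_fully_vaccinated:,.0f}")
```

Because the rate is rounded to one decimal place, the implied group size is uncertain by about one percent in either direction. The unrounded rates from the dashboard would be needed to recover the exact figures.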
Update, September 3, 2021
I have rerun these same analyses on the MOH data for active cases as of September 2, 2021.
Here are the updated numbers overall and split out by older (>50) and younger (<50):
We see that the number of currently active severe cases has increased a great deal for the unvaccinated, while for the fully vaccinated it has increased less for the <50yr group and actually decreased for the >50yr group. Note that the "effectiveness vs. severe disease" numbers as defined have improved. We still see a mild attenuation in the overall estimate from a Simpson's paradox-like effect, but it is not as severe as in the August 15 data.
Here are the data split out by age decade groups:
We see that the estimates of "effectiveness vs. severe disease" as defined are all between 93.2% and 100% based on the current active cases, with the exception being the 80-89 age group, which is at 90.7%.
These estimates come from the September 2, 2021 data, to which I applied the same basic calculations described above for the August 15, 2021 data set.
Some important caveats to keep in mind:
Caveats about these data include the fact that a high proportion of the >50yr population was given 3rd shot boosters starting August 1st, so some of the improvement in numbers from the older population could be due to any extra effectiveness from the third shot.
Israel initially did not vaccinate previously infected individuals, and then starting in March only offered them a single shot. Thus, the "fully vaccinated" group does not generally include anyone previously infected, while the "unvaccinated" group contains many previously infected. Given the extensive data showing that the previously infected have strong immune protection even if unvaccinated, this may serve to attenuate any "effectiveness" measurements from Israel MoH data unless the previously infected are removed from the analysis or the analysis is stratified by previous infection. The information necessary to do that is not present here.
This analysis is based on a snapshot of currently active cases according to Israel's Ministry of Health website. Thus, effectiveness vs. severe disease is defined as "% reduction in currently active severe cases in fully vaccinated vs. unvaccinated." This number is different than computing effectiveness over a longer period of time, or following a cohort of people, but does provide some information about how well the vaccines are preventing severe disease.
This analysis does not adjust for any confounding factors other than age. Other important factors include co-morbidities, occupation, testing frequency, time of infection, date of vaccination, and whether the person received a 3rd booster shot yet. These variables were not available in this database.
I do not claim that the simple effectiveness calculation done here is state of the art. Given data over time for all of Israel, with other information including the factors mentioned above as well as test date, I would recommend using a matched case/control or test-negative design, with an advanced statistical model accounting for the confounders mentioned above as well as follow-up time for each person. I did not do that here because I did not have the data, and because my primary purpose was to illustrate some of the key naive misinterpretations that can come from observational data like these and how something as basic as stratification by age can greatly improve the estimation. With complex observational data, advanced statistical modeling is often needed to elucidate causal quantities such as "effectiveness of vaccination".
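To illustrate the caveat about previously infected individuals in the unvaccinated group, here is a minimal Python sketch of how their presence attenuates the measured effectiveness. It assumes the vaccinated group contains no previously infected people and that everyone shares the same baseline risk. All input values are hypothetical and chosen only for illustration.

```python
def observed_effectiveness(true_effectiveness, prior_infected_share,
                           prior_infection_protection):
    """Effectiveness measured against an unvaccinated group that is partly immune.

    Assumes the vaccinated group contains no previously infected people and
    that everyone shares the same baseline risk.
    """
    vaccinated_rate = 1 - true_effectiveness
    unvaccinated_rate = 1 - prior_infected_share * prior_infection_protection
    return 1 - vaccinated_rate / unvaccinated_rate


# Hypothetical values chosen only for illustration.
TRUE_EFFECTIVENESS = 0.95
PROTECTION_FROM_PRIOR_INFECTION = 0.90

for share in (0.0, 0.2, 0.4):
    observed = observed_effectiveness(
        TRUE_EFFECTIVENESS, share, PROTECTION_FROM_PRIOR_INFECTION
    )
    print(f"{share:.0%} of unvaccinated previously infected -> observed {observed:.1%}")
```

The larger the previously infected share of the unvaccinated group, the lower the observed effectiveness falls below the true value, even though the vaccine itself performs identically in every scenario.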
Following are the updated MoH data with active serious infections as of September 20, 2021, split out by boosted/not boosted:
Note that the effectiveness vs. severe disease in the older (>70yr) unboosted group is lower, but the booster restores high effectiveness. For the younger groups, while the boosted effectiveness is very high, the effectiveness of vaccination is still very high (>85%) even without boosting.