A Probabilistic Analysis of Recent Findings

On August 29, 2020, for the first time ever, a randomized controlled trial showed a significant and dramatic reduction in COVID-19 severity. 

Study participants who received a large dose of vitamin D (as calcifediol) experienced a 50-fold reduction in the odds of admission to intensive care, which likely translates to a similar reduction in death rates (see further analysis). If these findings are accurate, the end of the pandemic is near. The study has multiple limitations that would normally warrant waiting for further studies, but given the circumstances, it is important to dig deeper and accurately assess its implications. We will estimate the probability that the finding is true, and analyze the risks of adopting the treatment now vs. waiting for further studies.

Is the finding true?

A randomized controlled trial, where patients are randomly selected to receive the treatment or serve as control, makes it possible to isolate whether some clinical finding results from the treatment or from another factor. So far, there have been many studies on vitamin D and COVID-19, which demonstrated a strong link between the two, but causality had not been established. For example, people with poor health may have low vitamin D levels due to low sun exposure, creating a correlation with COVID-19 severity that is not causal.

We now have the results of the first randomized controlled trial on the effect of vitamin D on COVID-19 patients. If it was properly conducted, causality has finally been established, and an effective treatment has been found. Unfortunately, the study has several limitations that may distort its results.

Let’s review these possible problems, and their significance. A more mathematically rigorous analysis may be found in the appendix below.

1. The sample size is small, so the findings may be due to chance 

It is always possible that the patients who were randomly assigned to the treatment group suffered less deterioration by mere chance. This possibility is quantified by the p-value, which measures the probability of obtaining the study result (or a stronger one) by chance. The authors report it only as less than 1 in 1,000, but the actual number is less than 1 in 1,000,000 (this can be verified from the study results: 13 of 26 control patients were admitted to intensive care, versus 1 of 50 treated patients).
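As a rough check on the 1-in-1,000,000 figure, the sketch below computes a one-sided Fisher exact p-value directly from the hypergeometric distribution, together with the raw odds ratio. The choice of test and the function name `fisher_one_sided` are assumptions made for this illustration; the study's own statistical method may differ.

```python
from math import comb

def fisher_one_sided(icu_treat, no_icu_treat, icu_ctrl, no_icu_ctrl):
    """One-sided Fisher exact p-value: the chance of seeing icu_treat or fewer
    ICU admissions in the treatment group if the treatment made no difference
    (all row and column totals held fixed)."""
    n_treat = icu_treat + no_icu_treat
    n_ctrl = icu_ctrl + no_icu_ctrl
    total_icu = icu_treat + icu_ctrl
    ways_total = comb(n_treat + n_ctrl, total_icu)
    # Smallest possible number of ICU cases in the treatment group.
    k_min = max(0, total_icu - n_ctrl)
    ways_extreme = sum(
        comb(n_treat, k) * comb(n_ctrl, total_icu - k)
        for k in range(k_min, icu_treat + 1)
    )
    return ways_extreme / ways_total

# Reported results: 1 of 50 treated and 13 of 26 control patients went to ICU.
p_value = fisher_one_sided(1, 49, 13, 13)

# Raw odds ratio: control odds (13:13) divided by treatment odds (1:49).
odds_ratio = (13 * 49) / (13 * 1)

print(f"one-sided p-value: {p_value:.2e}")
print(f"raw odds ratio: {odds_ratio:.0f}")
```

Under these assumptions the p-value comes out below 1 in 1,000,000, and the raw odds ratio of 49 corresponds to the roughly 50-fold reduction quoted above.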

It is important to understand that once a p-value has been obtained, the sample size no longer matters for judging whether the result is due to chance. The goal of a large sample is to reduce the random differences between the two groups, thus making the difference in treatment a larger factor in the final result. The p-value improves both with study size and with the effectiveness of the treatment.

In this case, the effect was so strong that the relatively small sample (76 people) turned out to be much larger than required.

It can be said with certainty that if the experimental results are incorrect, it is not because of the sample size or chance.

2. The control group included more people with risk factors

The control group happened to have significantly more people with hypertension, so it is expected they would have more admissions to intensive care. The researchers identified this issue and performed another analysis (logistic regression) that accounted for it, and the findings were only mildly weakened, from a 50-fold to a 30-fold reduction, with 95% confidence that the result is between 4-fold and 300-fold. We will use 12-fold as a conservative estimate.

We performed another analysis, which assumed that only those with high blood pressure could deteriorate (i.e. removing patients without hypertension from the sample), and the findings still remained very significant, with a p-value of 1 in 5,000, far better than the standard threshold of 1 in 20, or 0.05.

Another issue to evaluate is whether this imbalance indicates a deeper problem with randomization or reporting. The reported p-value of the difference in hypertension is 0.0023, meaning a 1:435 chance that it would happen in a random assignment. However, this is just one of at least 10 parameters that could affect the study, and the p-value also accounts for an opposite effect (2-sided instead of 1-sided), so the probability that at least one of them would show such an imbalance is around 1:21. In other words, 1 in 21 such studies would have such an imbalance by mere chance, which is hardly remarkable. Given that randomization was done electronically upon patient admission, a mistake is unlikely, and fraud would make little sense (especially as the imbalance is later reported and corrected for).
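The text does not spell out the arithmetic behind the 1:21 figure. The sketch below shows one reading that reproduces it, assuming the 1-sided/2-sided remark enters as a factor of 2 and the 10 parameters are treated as independent; both are assumptions of this illustration, as are the variable names.

```python
p_hypertension = 0.0023   # reported p-value for the hypertension imbalance
n_parameters = 10         # "at least 10 parameters" in the text
sidedness_factor = 2      # assumption: how the 1-sided/2-sided remark is applied

p_single = p_hypertension * sidedness_factor

# Simple sum over parameters (an upper bound on the true probability).
p_any_simple = n_parameters * p_single

# Probability that at least one parameter is imbalanced, if independent.
p_any_independent = 1 - (1 - p_single) ** n_parameters

print(f"single parameter:      1 in {1 / p_hypertension:.0f}")
print(f"any of 10 (sum):       1 in {1 / p_any_simple:.1f}")
print(f"any of 10 (indep.):    1 in {1 / p_any_independent:.1f}")
```

Both variants land between 1 in 21 and 1 in 23, in line with the estimate above.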

This clearly seems like a chance occurrence, and we see no reason to reduce the estimate beyond what the investigators already did.

3. Patients in both groups were also treated with hydroxychloroquine and azithromycin

Patients in both groups received the standard treatment, which at the time was hydroxychloroquine and azithromycin, a treatment that has since fallen out of favor. Could it be that the findings result from vitamin D neutralizing negative effects of those drugs? This option is unlikely. Trials have shown differing results for hydroxychloroquine and azithromycin, with a few pointing only to a mild risk.

It is also possible that vitamin D only works in combination with the other treatments. Given the mechanisms of action involved, this is highly unlikely.

We estimate that, at most, this possibility reduces the effect from 12-fold to 8-fold (i.e. vitamin D may have neutralized a 50% increase in severity caused by the other drugs).

4. The experiment was not double-blind placebo-controlled

To prevent distortion of the results by the trial participants or the researchers (even unconsciously), it is preferable that neither know which patients were randomized to the control group and which to the treatment group. This was not the case in this experiment.

This is certainly a weakness of this study. It was mitigated by delegating the decision regarding transfer to intensive care to a committee of experts that included members of the hospital’s ethics committee, who were not aware which group the patient was assigned to, and reached decisions based on a structured protocol.

We reached out to the investigators to learn more about the procedures, and learned that the lack of a placebo resulted from logistical problems in placebo manufacturing. We got the impression that an honest effort was made to mask the data as much as possible, and that the two groups were not otherwise treated differently.

We still need to account for the possibility of outright fraud enabled by this weakness, in which case the findings are false. Since there are no commercial interests around vitamin D, and the fraud would be exposed in later studies, we assign this a probability of 10% at most.

5. There may be another, yet unidentified, factor 

Of course, there may be another source for the dramatic difference between the two groups, which has not yet been identified. This would usually be the responsibility of the publishing journal to expose. In this case, the publication has been peer-reviewed and published in a small journal specializing in vitamin D. The publisher is Elsevier, which also publishes the Lancet and Cell.

Such a major finding should ideally be published in a world-leading journal, but given the limitations above, and the likely urgency to publish, it is not unreasonable to choose a smaller journal.

Given the relative simplicity of the trial, we do not see unknown factors as a major risk, at most accounting for a further reduction from 8-fold to 6-fold, and a 10% probability of it invalidating the results.

6. Is the prior probability of the study findings low?

Equally important is the likelihood that vitamin D could cure COVID-19, based on the information known before the article was published. For example, if a study finds that five minutes of neck massage cures lung cancer, it is very likely that there is some error in the study, even if its statistical significance is high.

In this case, the opposite is true: as discussed above, many earlier studies had already demonstrated a strong link between vitamin D and COVID-19.

However, so far the indication has been for a weaker effect – about a 50% reduction in severity, not 30-fold, so the new finding indicating a near cure is initially surprising. But on further examination, there may not be any contradiction between the studies. A re-examination of a study that published detailed data shows that the rate of infection drops to nearly zero with high levels of vitamin D in the blood (above 50 ng/ml). That is, it is possible that the effectiveness increases with dose, and that at the very high doses used in the study, a near cure is achieved.

It should also be noted that the earlier observational studies used vitamin D levels that were measured a significant time before infection. By the time patients got sick their levels may have changed, which would cause a possible strong correlation to appear weaker.

Another possibility is that the use of short term high dose calcifediol is more effective than the long term supplementation of vitamin D3.

Additionally, the investigators have informed us that the protocol has been used on patients after the trial completed, with similar results.

Overall, according to the prior knowledge, a mild effect is more likely than a strong or no effect, reducing our conservative estimate from 6-fold to 5-fold.

Summary – The findings are true

None of the possibilities mentioned invalidates the significant finding that emerges from the experiment. It is very possible that some minor biases occurred that exaggerated the effect, but it is unlikely that vitamin D had no positive effect.

Summarizing the numbers above, we estimate:

  • 20% probability that vitamin D has no significant effect on COVID-19 severity
  • 80% probability that it reduces severity and death, probably around 5-fold, and possibly much more.
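One way to arrive at the 20% figure is to combine the two invalidation risks estimated earlier: 10% for fraud and 10% for an unidentified factor. The sketch below shows both a simple sum and the combination under independence; treating the 20% as derived this way is an assumption of this illustration.

```python
p_fraud = 0.10            # upper estimate from the blinding discussion
p_unknown_factor = 0.10   # estimate for an unidentified invalidating factor

# Simple sum, which matches the 20% used in the summary.
p_invalid_sum = p_fraud + p_unknown_factor

# If the two risks are independent, the combined risk is slightly lower.
p_invalid_independent = 1 - (1 - p_fraud) * (1 - p_unknown_factor)

p_effective = 1 - p_invalid_sum

print(f"invalid (sum):         {p_invalid_sum:.0%}")
print(f"invalid (independent): {p_invalid_independent:.0%}")
print(f"effective:             {p_effective:.0%}")
```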

Risk Management

In order to make a treatment policy decision, one must consider not only the likelihood that the finding is true, but also the potential harm and benefits of each possible course of action.

Alternative 1 – Wait

The easy decision is to wait for further studies to verify the new finding. This is what medical experts would normally decide after a first publication of a successful trial.

If the treatment is ineffective, there are no costs and risks to this decision.

If the treatment is effective, then based on the analysis above, the results in the study’s control group, and typical outcomes for hospitalized patients, the harm to a typical hospitalized patient can be estimated as:

  • Additional 20% chance of suffering severe disease, with likely long-term implications.
  • Additional 5% chance of death

Alternative 2 – Adopt treatment

The second alternative is to immediately adopt the protocol for hospitalized patients. In this case the harm to patients is from the large vitamin D dose (whether or not the treatment is effective).

As vitamin D is already a popular treatment, there is abundant information on its risks.

  • The dose used in the study is about 10 times the maximum recommended dose for prolonged use.

However, the treatment protocol in the study is relatively short – until release of the patient or transfer to intensive care. Previous studies on short treatments at similar doses found them to be safe.

  • The risk in vitamin D is with increased use that maintains very high levels in the blood over a long period of time.

Even then, the risks are relatively limited, and can be corrected by a low-calcium diet and steroids. For hospitalized patients that can be monitored closely the risk is likely further reduced.

  • COVID-19-specific risk: vitamin D increases the expression of ACE2 in cells, which acts as a receptor for the coronavirus. Therefore, until now, there has been apprehension about its use. Since the new trial focused on COVID-19 patients and doesn’t show such negative effects, the concern seems to have been alleviated. There is still some low likelihood that the study results were completely wrong, either intentionally or due to a catastrophic mistake that hid a worse outcome in the treatment group.

Based on existing knowledge, the risks in the proposed protocol appear to be low.

The risk can be further reduced by monitoring vitamin D levels in the patients’ blood, and keeping them in a high yet safe range, for example 80 ng/ml.

It is safe to assume the risks of the protocol are much lower than:

  • 5% chance of severe complications.
  • 1% chance of death

Conclusion

Given that both:

  • The likelihood that the treatment is very effective is greater than 50%;
  • The benefit of the treatment, if effective, is far higher than twice the risk of the treatment;

it is obvious that the right decision is immediate adoption of the treatment protocol.
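The comparison behind this conclusion can be sketched as an expected-harm calculation per hospitalized patient, using the estimates given above. The harm of waiting is incurred only if the treatment is effective, while the treatment risk is incurred either way. The variable names and the per-outcome comparison are assumptions of this illustration.

```python
p_effective = 0.80  # estimated probability that the treatment works

# Additional harm from waiting, incurred only if the treatment is effective.
wait_harm = {"severe": 0.20, "death": 0.05}

# Upper bounds on harm from the treatment itself, incurred in every case.
treat_harm = {"severe": 0.05, "death": 0.01}

# Expected harm of waiting = P(effective) * harm avoided by treating.
expected_wait = {k: p_effective * v for k, v in wait_harm.items()}

for outcome in wait_harm:
    print(f"{outcome}: wait {expected_wait[outcome]:.0%} "
          f"vs. adopt at most {treat_harm[outcome]:.0%}")

adopt_is_better = all(expected_wait[k] > treat_harm[k] for k in wait_harm)
print("adopt is better:", adopt_is_better)
```

The two conditions in the conclusion are a shortcut for the same comparison: if the probability exceeds 50% and the benefit exceeds twice the risk, then probability times benefit exceeds the risk.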

Hospitals deciding to wait for further studies should have very strong reasoning that outweighs the apparent harm to patients by delaying treatment.

Global Implications on the COVID-19 pandemic

This analysis shows that if the protocol is widely adopted, COVID-19 severity can likely be reduced to that of the seasonal flu, allowing certain restrictions to be eased, which could bring a major improvement in the economy and social health.

A further conclusion, although with lower confidence, is that vitamin D could be effective at earlier stages of the disease. Previous studies have shown a correlation between high vitamin D levels and lower infection rates. The new study establishes a causal connection at late stages, increasing the likelihood that the correlation at earlier stages is also causal. This would mean that widespread vitamin D therapy (e.g. bringing blood levels to a healthy 30-40 ng/ml) could reduce R0. If that reduction is as significant as indicated by the studies, R0 could drop below 1, and stop the pandemic. 
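To illustrate the R0 argument, the sketch below uses the common simplification that the reproduction number scales linearly with the reduction in transmission. The R0 value and the reduction percentages are hypothetical and are not taken from the study or from this analysis.

```python
def effective_r(r0, transmission_reduction):
    """Reproduction number after cutting transmission by the given fraction
    (simplifying assumption: linear scaling)."""
    return r0 * (1 - transmission_reduction)

def reduction_needed(r0):
    """Fraction by which transmission must fall for R to drop to 1."""
    return 1 - 1 / r0

r0_example = 2.5  # hypothetical value, for illustration only

print(f"reduction needed: {reduction_needed(r0_example):.0%}")
for reduction in (0.5, 0.7):  # hypothetical reductions
    print(f"reduction {reduction:.0%}: R = {effective_r(r0_example, reduction):.2f}")
```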

Since vitamin D deficiency is already common, and risks are negligible at this dose, governments should immediately encourage and subsidize vitamin D tests and supplementation for the general population.

UPDATE: Following this analysis, Rootclaim is offering a $100,000 bet that vitamin D cures COVID-19 in order to show that the reluctance to immediately implement a vitamin D protocol is irrational.

Appendix – Bayesian Analysis

For those with a background in probability, following is a more rigorous analysis using Bayesian inference. By explicitly stating prior probabilities of hypotheses, and calculating the conditional probabilities of the study results under each hypothesis, a more accurate and robust result is achieved, removing the need to analyze sample sizes, p-values, or confidence intervals.

Hypotheses

We will define five hypotheses to be considered:

  • Damage – Vitamin D worsens COVID-19.
  • Nothing – No effect
  • 2-fold – Vitamin D reduces the odds for severe COVID-19 by around 2.
  • 5-fold – Vitamin D reduces the odds for severe COVID-19 by around 5.
  • 20-fold – Vitamin D reduces the odds for severe COVID-19 by around 20.

Prior

First we shall estimate the probability of each hypothesis based on what was known before the new study. As a baseline, few drugs are effective for any specific disease, but as described above, there are multiple studies showing a correlation between vitamin D and COVID-19, and several proposed mechanisms of action. On the flip side, there is the aforementioned risk that vitamin D could actually exacerbate COVID-19 by increasing ACE2.

We will represent these facts with the following prior probabilities:

  • Damage – 10%
  • Nothing – 67%
  • 2-fold – 15%
  • 5-fold – 5%
  • 20-fold – 3%

Adjustments to Study

Given the limitations discussed above, we will make the following adjustments to the study results:

  • Move 2 cases from ICU to non-ICU in the control group, which we attribute to the higher hypertension cases there.
  • Move 2 cases from non-ICU to ICU within the treatment group, and do the opposite in the control group, due to unknown weaknesses not yet identified.
  • Estimate at 20%, as above, the probability that the study was grossly mismanaged, and should be ignored.

So instead of the reported matrix of:

                      Vitamin D   Control
Admitted to ICU       1           13
Not admitted to ICU   49          13

We will use:

                      Vitamin D   Control
Admitted to ICU       3           9
Not admitted to ICU   47          17
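The adjustments listed above can be traced step by step from the reported counts to the adjusted ones. The helper `move` and the dictionary layout are hypothetical names introduced for this sketch.

```python
reported = {
    "vitamin_d": {"icu": 1, "no_icu": 49},
    "control": {"icu": 13, "no_icu": 13},
}

def move(table, group, n, to_icu):
    """Return a copy of the table with n patients of one group moved
    into the ICU row (to_icu=True) or out of it (to_icu=False)."""
    new = {g: dict(counts) for g, counts in table.items()}
    delta = n if to_icu else -n
    new[group]["icu"] += delta
    new[group]["no_icu"] -= delta
    return new

# 1. Hypertension imbalance: 2 control cases from ICU to non-ICU.
step1 = move(reported, "control", 2, to_icu=False)
# 2. Unknown weaknesses: 2 treated cases from non-ICU to ICU ...
step2 = move(step1, "vitamin_d", 2, to_icu=True)
# ... and the opposite move in the control group.
adjusted = move(step2, "control", 2, to_icu=False)

print(adjusted)
```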

Conditional Probabilities

Next we estimate the probability of getting the adjusted study results under each of the five hypotheses. To do that, we take the odds of ICU admission in the control group (9:17, i.e. a probability of 9/26 = 34.6%) and adjust them by the hypothesis factor to obtain the expected odds in the treatment group. For example, the expected odds under the 2-fold hypothesis would be 9:(17*2) = 9:34, or a probability of 9/43 = 20.9%. We then use the binomial distribution formula to calculate the conditional probability of getting the exact study result (3 ICU admissions out of 50 patients) given those expected odds. Each conditional probability is multiplied by the prior probability of its hypothesis, and the products are normalized to sum to 100%. Lastly, we average this result with the prior probabilities, giving the prior a weight of 20% and the normalized result a weight of 80%, to account for the 20% possibility that the study is meaningless.
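The conversion from odds to probability and the binomial step can be sketched as follows. The odds ratio of 0.7 for the "Damage" hypothesis is taken from the calculation table; the function and constant names are assumptions of this illustration.

```python
from math import comb

CONTROL_ICU, CONTROL_NO_ICU = 9, 17   # adjusted control group
TREATED_ICU, TREATED_TOTAL = 3, 50    # adjusted treatment group

ODDS_RATIOS = {"Damage": 0.7, "Nothing": 1, "2-fold": 2, "5-fold": 5, "20-fold": 20}

def expected_probability(odds_ratio):
    """ICU probability in the treatment group if control odds of 9:17
    are reduced by the given factor: 9 / (9 + 17 * OR)."""
    return CONTROL_ICU / (CONTROL_ICU + CONTROL_NO_ICU * odds_ratio)

def binomial_pmf(k, n, p):
    """Probability of exactly k events in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

likelihood = {
    name: binomial_pmf(TREATED_ICU, TREATED_TOTAL, expected_probability(ratio))
    for name, ratio in ODDS_RATIOS.items()
}

for name, ratio in ODDS_RATIOS.items():
    print(f"{name:8s} p={expected_probability(ratio):.3f} "
          f"likelihood={likelihood[name]:.4f}")
```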

The full calculation:

Hypothesis                                          Damage   Nothing   2-fold   5-fold   20-fold
Prior probability                                   10%      67%       15%      5%       3%
Odds ratio (OR)                                     0.7      1         2        5        20
Convert odds to probability = 9/(9+17*OR)           0.431    0.346     0.209    0.096    0.026
Conditional probability from binomial formula       0        0         0.0029   0.1518   0.0985
Posterior = Prior * Conditional                     0        0         0.0004   0.0076   0.003
Posterior, normalized to 100%                       0.0%     0.0%      4.0%     69.1%    26.9%
Account for 20% failed study possibility (final)    2.0%     13.4%     6.2%     56.3%    22.1%
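The last three rows of the calculation can be reproduced from the priors and the conditional probabilities listed in the table. This is a minimal sketch that takes the rounded table values as inputs; the variable names are assumptions of this illustration.

```python
hypotheses = ["Damage", "Nothing", "2-fold", "5-fold", "20-fold"]
prior = [0.10, 0.67, 0.15, 0.05, 0.03]
conditional = [0.0, 0.0, 0.0029, 0.1518, 0.0985]  # rounded values from the table
P_STUDY_FAILED = 0.20  # probability that the study should be ignored

# Bayes' rule: multiply prior by conditional, then normalize.
joint = [p * c for p, c in zip(prior, conditional)]
total = sum(joint)
posterior = [j / total for j in joint]

# Mix with the prior to account for a possibly meaningless study.
final = [
    P_STUDY_FAILED * p + (1 - P_STUDY_FAILED) * q
    for p, q in zip(prior, posterior)
]

for name, value in zip(hypotheses, final):
    print(f"{name:8s} {value:.1%}")

# Total probability of the three "effective" hypotheses.
p_effective = sum(final[2:])
print(f"effective: {p_effective:.1%}")
```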

Summary

This more rigorous analysis reaches a very similar conclusion of around 80% likelihood that vitamin D is effective against COVID-19, with a 5-fold reduction being the most probable.