A recent study, the first randomized controlled trial of its kind, showed that high doses of vitamin D (calcifediol) reduce the severity of Covid-19 in hospitalized patients. The researchers reported a 30-fold(!) reduction in intensive care admissions of Covid-19 patients. We at Rootclaim analyzed these findings and concluded that even under conservative assumptions accounting for limitations in the study, the effect is still significant and likely around 5-fold. We further demonstrated that since the risks of treatment are low, this treatment protocol should be immediately implemented.
Many health professionals, government officials, and other decision makers worldwide have seen the study results, but they have yet to update treatment guidelines. This may happen despite their best intentions and their vast medical knowledge:
Not many have the background in statistics and probability required to assess the validity of this study, and distinguish it from the dozens of previous invalidated claims.
They’re affected by omission bias – they default to the “safe” alternative of inaction, waiting for more data, rather than choosing action. It’s easier to later defend inaction rather than be criticized for acting too soon.
In this particular case the decision to act now is clear:
Similar treatments have been performed for decades, and the risks are known to be low, especially in this setting when patients can be monitored at the hospital.
The benefits of the treatment, on the other hand, are potentially enormous, effectively reducing Covid-19 severity to that of the seasonal flu.
While caution is often the correct path when dealing with public health, this is a case where decisions should be made swiftly, using the best available models. At Rootclaim we develop such models, so when our analysis exposed the implications of the new findings, we decided to promote the adoption of the proposed treatment. We hope that this unique challenge will allow the information to reach more decision makers, and save the more than 100,000 lives that will likely be lost while waiting for further studies.
Rootclaim is willing to bet $100,000 that vitamin D (e.g. calcifediol) is effective in reducing the severity of Covid-19.
Our claim: By April 1st, 2022, it will be accepted by health professionals that a vitamin D treatment protocol similar to that used in the study is better than existing treatments (remdesivir and corticosteroids) in reducing the odds of severe outcomes. These existing treatments are estimated to effect a 1.5-fold reduction in the odds of admission to the ICU, which will be used as the threshold.
This challenge is intended to show that the reluctance to implement a vitamin D protocol today is irrational. A decision maker who is not pushing to adopt the proposed protocol is effectively claiming that the probability that this protocol is better than existing treatments is low. But that would also imply that taking the bet is very profitable. Therefore, any professional not accepting this challenge is implicitly admitting that their decision not to promote the treatment is wrong.
The challenger needs to show that they can commit $100,000. We are open to discussing lower or higher amounts, and the funds can be pooled from multiple sources.
Both sides will agree in advance on the specifics of how a winner is determined, and what arbitration mechanism to use, if need be.
The challenger needs to declare that they do not have access to any relevant non-public information. This is to protect from abuse in case of unpublished research (there is still a small chance that further research will discover the treatment is ineffective).
For the same reason, we may update these terms or withdraw the offer, as new information emerges. Of course, once a bet is made it is final and cannot be withdrawn.
If you’re not willing to risk your own money betting against vitamin D, why are you willing to risk someone else’s life?
Study participants who received a large dose of vitamin D (as calcifediol) experienced a 50-fold reduction in the odds of admission to intensive care, which likely translates to a similar reduction in death rates (see further analysis). If these findings are accurate, the end of the pandemic is near. The study has multiple limitations that would normally warrant waiting for further studies, but given the circumstances, it is important to dig deeper and accurately assess its implications. We will estimate the probability that the finding is true, and analyze the risks of adopting the treatment now vs. waiting for further studies.
Is the finding true?
A randomized controlled trial, where patients are randomly selected to receive the treatment or serve as control, makes it possible to isolate whether some clinical finding results from the treatment or from another factor. So far, there have been many studies on vitamin D and COVID-19, which demonstrated a strong link between the two, but causality had not been established. For example, people with poor health may have low vitamin D levels due to low sun exposure, creating a correlation with COVID-19 severity that is not causal.
We now have the results of the first randomized controlled trial on the effect of vitamin D on Covid-19 patients. If it was properly conducted, causality has finally been established, and an effective treatment has been found. Unfortunately, the study has several limitations that may distort its result.
Let’s review these possible problems, and their significance. A more mathematically rigorous analysis may be found in the appendix below.
1. The sample size is small, so the findings may be due to chance
It is always possible that the patients who were randomly assigned to the treatment group suffered less deterioration by mere chance. This possibility is calculated using the p-value, which measures the probability of obtaining the study result (or stronger) by chance. The authors disclose it only as less than 1 in 1,000, but the actual number is less than 1 in 1,000,000 (can be verified here, using the study results: 13 of 26 control patients admitted to intensive care vs. 1 of 50 treated patients).
It is important to understand that once a p-value has been obtained, the sample size no longer matters. The goal of a large sample is to reduce the random differences between the two groups, thus making the difference in treatment a larger factor in the final result. The p-value improves both with study size, and with effectiveness of the treatment.
In this case, the effect was so strong that the relatively small sample (76 people) turned out to be much larger than required.
It can be said with certainty that if the experimental results are incorrect, it is not because of the sample size or chance.
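The p-value claim can be checked with a short script. The sketch below assumes a two-sided Fisher exact test, which is one common choice for a 2x2 table; the text does not say which test was used, and the function name is illustrative. It also computes the raw odds ratio behind the roughly 50-fold figure.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that k of the col1 cases fall in row 1.
        return comb(row1, k) * comb(row2, col1 - k) / total

    observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= observed * (1 + 1e-9))

# Study results. Rows: control, treatment. Columns: ICU, no ICU.
control_icu, control_no_icu = 13, 13
treated_icu, treated_no_icu = 1, 49

odds_ratio = (control_icu / control_no_icu) / (treated_icu / treated_no_icu)
p_value = fisher_exact_two_sided(control_icu, control_no_icu,
                                 treated_icu, treated_no_icu)

print(f"odds ratio: {odds_ratio:.0f}")
print(f"two-sided p-value: {p_value:.2e}")
```

Under these assumptions the p-value comes out below 1 in 1,000,000, in line with the figure quoted above.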
2. The control group included more people with risk factors
The control group happened to have significantly more people with hypertension, so it is expected they would have more admissions to intensive care. The researchers identified this issue and performed another analysis (logistic regression) that accounted for it, and the findings were only mildly weakened, from a 50-fold to a 30-fold reduction, with 95% confidence that the result is between 4-fold and 300-fold. We will use 12-fold as a conservative estimate.
We performed another analysis, which assumed that only those with high blood pressure could deteriorate (i.e. removing patients without hypertension from the sample), and the findings still remained very significant, with a p-value of 1 in 5,000, far better than the standard threshold of 1 in 20, or 0.05.
Another issue to evaluate is whether this imbalance indicates a deeper problem with randomization or reporting. The reported p-value of the difference in hypertension is 0.0023, meaning a 1:435 chance it would happen in a random assignment. However, this is just one of at least 10 parameters that could affect the study, and the p-value also accounts for an opposite effect (2-sided instead of 1-sided), so the probability that one of them would happen is only around 1:21, meaning 1 in 21 such studies would have such an imbalance by mere chance – hardly remarkable. Given that randomization was done electronically upon patient admission, such a mistake is unlikely, and it would make little sense as fraud (especially as the imbalance is later reported and corrected for).
This clearly seems like a chance occurrence, and we see no reason to reduce the estimate beyond what the investigators already did.
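The step from 1:435 to roughly 1:21 can be reproduced with a few lines of arithmetic. This is a rough sketch: the factor of 2 for sidedness and the count of 10 parameters are read literally from the text, and treating the parameters as independent is a simplifying assumption.

```python
# Reported p-value for the hypertension imbalance.
p_reported = 0.0023

# Assumptions read literally from the text: about 10 baseline parameters
# could have shown an imbalance, and a factor of 2 is applied for sidedness.
n_parameters = 10
sidedness_factor = 2

p_single = p_reported * sidedness_factor
# Simple sum over parameters (a Bonferroni-style upper bound).
p_any_bound = n_parameters * p_single
# Value if the parameters were independent (a simplifying assumption).
p_any_independent = 1 - (1 - p_single) ** n_parameters

print(f"single parameter: 1 in {1 / p_reported:.0f}")
print(f"any of {n_parameters}, simple sum: 1 in {1 / p_any_bound:.1f}")
print(f"any of {n_parameters}, independent: 1 in {1 / p_any_independent:.1f}")
```

Both variants land close to the 1:21 figure, so the conclusion does not depend on which one is used.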
3. Patients in both groups were also treated with hydroxychloroquine and azithromycin
Patients in both groups received the standard treatment, which at the time was hydroxychloroquine and azithromycin, a treatment that has since fallen out of favor. Could it be that the findings result from vitamin D neutralizing negative effects of those drugs? This option is unlikely. Trials have shown differing results for hydroxychloroquine and azithromycin, with a few pointing only to a mild risk.
It is also possible that vitamin D only works in combination with the other treatments. Given the mechanisms of action involved, this is highly unlikely.
We estimate that, at most, this possibility reduces the effect from 12-fold to 8-fold (i.e. vitamin D may have neutralized a 50% increase in severity caused by the other drugs).
4. The experiment was not double-blind placebo-controlled
To prevent distortion of the results by the trial participants or the researchers (even unconsciously), it is preferable that neither know which patients were randomized to the control group and which to the treatment group. This was not the case in this experiment.
This is certainly a weakness of this study. It was mitigated by delegating the decision regarding transfer to intensive care to a committee of experts that included members of the hospital’s ethics committee, who were not aware which group the patient was assigned to, and reached decisions based on a structured protocol.
We have reached out to the investigators to learn more about the procedures, and learned that the lack of a placebo was a result of logistical problems in placebo manufacturing. We got the impression that an honest effort was made to mask the data as much as possible, and the two groups were not otherwise treated differently.
We still need to account for the possibility of outright fraud enabled by this weakness, in which case the findings are false. Since there are no commercial interests around vitamin D, and the fraud would be exposed in later studies, we assign this a probability of 10% at most.
5. There may be another, yet unidentified, factor
Of course, there may be another source for the dramatic difference between the two groups, which has not yet been identified. This would usually be the responsibility of the publishing journal to expose. In this case, the publication has been peer-reviewed and published in a small journal specializing in vitamin D. The publisher is Elsevier, which also publishes the Lancet and Cell.
Such a major finding should ideally be published in a world-leading journal, but given the limitations above, and the likely urgency to publish, it is not unreasonable to choose a smaller journal.
Given the relative simplicity of the trial, we do not see unknown factors as a major risk, at most accounting for a further reduction from 8-fold to 6-fold, and a 10% probability of it invalidating the results.
6. Is the prior probability of the study findings low?
Equally important is the likelihood that vitamin D could cure Covid-19, based on the information known before the article was published. For example, if a study finds that five minutes of neck massage cures lung cancer, it is very likely that there is some error in the study, even if its statistical significance was high.
A recent analysis associates COVID-19 severity with a “bradykinin storm”, and offers vitamin D as possible treatment.
Other effects of vitamin D on COVID-19 quoted in the study include: regulating the renin‑angiotensin system, modulating neutrophil activity, maintaining the integrity of the pulmonary epithelial barrier, stimulating epithelial repair, and tapering down the blood’s increased coagulability.
Many previous studies (such as here, here, and here) have already shown a correlative (but not causal) connection between low vitamin D and COVID-19 severity. The new publication only verifies that the connection is causal.
See additional discussions including potential mechanisms here and here.
However, so far the indication has been for a weaker effect – about a 50% reduction in severity, not 30-fold, so the new finding indicating a near cure is initially surprising. But on further examination, there may not be any contradiction between the studies. A re-examination of a study that published detailed data shows that the rate of infection drops to nearly zero with high levels of vitamin D in the blood (above 50 ng/ml). That is, it is possible that the effectiveness increases with dose, and at the very high doses used in the study, a near cure is achieved.
It should also be noted that the earlier observational studies used vitamin D levels that were measured a significant time before infection. By the time patients got sick their levels may have changed, which would cause a possible strong correlation to appear weaker.
Another possibility is that the use of short term high dose calcifediol is more effective than the long term supplementation of vitamin D3.
Additionally, the investigators have informed us that the protocol has been used on patients after the trial completed, with similar results.
Overall, according to the prior knowledge, a mild effect is more likely than a strong or no effect, reducing our conservative estimate from 6-fold to 5-fold.
Summary – The findings are true
None of the possibilities mentioned invalidates the significant finding that emerges from the experiment. It is very possible that some minor biases occurred that exaggerated the effect, but it is unlikely that vitamin D had no positive effect.
Summarizing the numbers above, we estimate:
20% probability that vitamin D has no significant effect on COVID-19 severity
80% probability that it reduces severity and death, probably around 5-fold, and possibly much more.
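The bookkeeping behind this summary can be laid out explicitly. The sketch below only restates figures from the sections above; adding the two 10% probabilities to reach 20% is an assumption about how the summary number was formed, and the labels are illustrative.

```python
# Conservative effect estimate after each limitation, as stated in the text.
steps = [
    ("hypertension imbalance, conservative pick", 12),
    ("other drugs given to both groups", 8),
    ("unidentified factors", 6),
    ("prior knowledge favors a milder effect", 5),
]

# Probabilities that the finding is invalid, as stated in the text.
p_fraud = 0.10
p_unknown_factor = 0.10

# Assumption: the 20% in the summary is the simple sum of the two.
p_no_effect = p_fraud + p_unknown_factor
p_effective = 1 - p_no_effect

for reason, fold in steps:
    print(f"{fold:>2}-fold after accounting for: {reason}")
print(f"P(no significant effect) = {p_no_effect:.0%}")
print(f"P(effective)             = {p_effective:.0%}")
```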
In order to make a treatment policy decision, one must consider not only the likelihood that the finding is true, but also the potential harm and benefits of each possible course of action.
Alternative 1 – Wait
The easy decision is to wait for further studies to verify the new finding. This is what medical experts would normally decide after a first publication of a successful trial.
If the treatment is ineffective, there are no costs and risks to this decision.
If the treatment is effective, then based on the analysis above, the results in the study’s control group, and typical outcomes for hospitalized patients, the harm to a typical hospitalized patient can be estimated as:
Additional 20% chance of suffering severe disease, with likely long-term implications.
Additional 5% chance of death
Alternative 2 – Adopt treatment
The second alternative is to immediately adopt the protocol for hospitalized patients. In this case the harm to patients is from the large vitamin D dose (whether or not the treatment is effective).
As vitamin D is already a popular treatment, there is abundant information on its risks.
The dose used in the study is about 10 times the maximum recommended dose for prolonged use.
However, the treatment protocol in the study is relatively short – until release of the patient or transfer to intensive care. Previous studies on short treatments at similar doses found them to be safe.
The risk of vitamin D lies in prolonged use that maintains very high levels in the blood over a long period of time.
Even then, the risks are relatively limited, and can be corrected by a low-calcium diet and steroids. For hospitalized patients that can be monitored closely the risk is likely further reduced.
Covid-19 specific risk: vitamin D increases the expression of ACE2 in cells, which acts as a receptor for the coronavirus. Therefore, until now, there has been apprehension about its use. Since the new trial focused on COVID-19 patients and doesn’t show such negative effects, the concern seems to have been alleviated. There is still some low likelihood that the study results were completely wrong, either intentionally or due to a catastrophic mistake that hid a worse outcome in the treatment group.
Based on existing knowledge, the risks in the proposed protocol appear to be low.
The risk can be further reduced by monitoring vitamin D levels in the patients’ blood, and keeping them in a high yet safe range, for example 80 ng/ml.
It is safe to assume the risks of the protocol are much lower than:
5% chance of severe complications.
1% chance of death
Given that both:
The likelihood that the treatment is very effective is greater than 50%;
The benefit of the treatment, if effective, is far higher than twice the risk of the treatment;
it is obvious that the right decision is immediate adoption of the treatment protocol.
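The comparison between the two alternatives can be written as a small expected-harm calculation. This is a sketch using the figures quoted above; the variable names are illustrative, and the protocol risks are the upper bounds stated in the text rather than measured rates.

```python
# Probability that the treatment is effective, from the summary above.
p_effective = 0.80

# Additional harm per hospitalized patient from waiting, if it is effective.
wait_severe, wait_death = 0.20, 0.05
# Upper bounds on harm from the protocol itself, effective or not.
treat_severe, treat_death = 0.05, 0.01

# Expected harm per patient under each alternative.
expected_wait = {
    "severe": p_effective * wait_severe,
    "death": p_effective * wait_death,
}
expected_adopt = {"severe": treat_severe, "death": treat_death}

for outcome in ("severe", "death"):
    ratio = expected_wait[outcome] / expected_adopt[outcome]
    print(f"{outcome}: wait {expected_wait[outcome]:.1%}, "
          f"adopt at most {expected_adopt[outcome]:.1%}, ratio {ratio:.1f}x")
```

Even with the protocol risks set at their upper bounds, the expected harm of waiting is several times larger for both outcomes.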
Hospitals deciding to wait for further studies should have very strong reasoning that outweighs the apparent harm to patients by delaying treatment.
Global Implications on the COVID-19 pandemic
This analysis shows that if the protocol is widely adopted, COVID-19 severity can likely be reduced to that of the seasonal flu, allowing certain restrictions to be eased, which could bring a major improvement in the economy and social health.
A further conclusion, although with lower confidence, is that vitamin D could be effective at earlier stages of the disease. Previous studies have shown a correlation between high vitamin D levels and lower infection rates. The new study establishes a causal connection at late stages, increasing the likelihood that the correlation at earlier stages is also causal. This would mean that widespread vitamin D therapy (e.g. bringing blood levels to a healthy 30-40 ng/ml) could reduce R0. If that reduction is as significant as indicated by the studies, R0 could drop below 1, and stop the pandemic.
Since vitamin D deficiency is already common, and risks are negligible at this dose, governments should immediately encourage and subsidize vitamin D tests and supplementation for the general population.
For those with a background in probability, the following is a more rigorous analysis using Bayesian inference. By explicitly stating prior probabilities of hypotheses, and calculating the conditional probabilities of the study results under each hypothesis, a more accurate and robust result is achieved, removing the need to analyze sample sizes, p-values, or confidence intervals.
We will define five hypotheses to be considered:
Damage – Vitamin D worsens COVID-19.
Nothing – No effect
2-fold – Vitamin D reduces the odds for severe COVID-19 by around 2.
5-fold – Vitamin D reduces the odds for severe COVID-19 by around 5.
20-fold – Vitamin D reduces the odds for severe COVID-19 by around 20.
First we shall estimate the probability of each hypothesis based on what was known before the new study. As a baseline, few drugs are effective for any specific disease, but as described above, there are multiple studies showing correlation between vitamin D and COVID-19, and several proposed mechanisms of action. On the flip side, there is the aforementioned risk that vitamin D could actually exacerbate COVID-19 by increasing ACE2.
We will represent these facts with the following prior probabilities:
Damage – 10%
Nothing – 67%
2-fold – 15%
5-fold – 5%
20-fold – 3%
Adjustments to Study
Given the limitations discussed above, we will make the following adjustments to the study results:
Move 2 cases from ICU to non-ICU in the control group, which we attribute to the higher hypertension cases there.
Move 2 cases from non-ICU to ICU within the treatment group, and do the opposite in the control group, due to unknown weaknesses not yet identified.
Estimate at 20%, as above, the probability that the study was grossly mismanaged, and should be ignored.
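So instead of the reported matrix of:
Admitted to ICU: 1 of 50 in the treatment group, 13 of 26 in the control group
Not admitted to ICU: 49 of 50 in the treatment group, 13 of 26 in the control group
We will use:
Admitted to ICU: 3 of 50 in the treatment group, 9 of 26 in the control group
Not admitted to ICU: 47 of 50 in the treatment group, 17 of 26 in the control group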
Next we estimate the probability of getting the adjusted study results, under each of the five hypotheses. To do that, we will use the odds of the control group (9:17 = 9/26 = 34.6%), and adjust by the hypothesis factor, to receive the expected odds in the treatment group. For example, the expected odds in the 2-fold hypothesis would be 9:17*2 = 9:34, or a probability of 20.9%. We then use a binomial distribution formula to estimate the conditional probability of getting the exact study result (3 out of 50 trials) given those expected odds. This is then normalized to sum to 100%. Lastly we average with the prior probabilities at a weight of 20%:80%, accounting for the 20% possibility that the study is meaningless.
The full calculation:
For each hypothesis, the calculation has the following columns: odds ratio (OR); odds converted to probability = 9/(9+17*OR); conditional probability from the binomial formula; posterior = prior * conditional; posterior, normalized to 100%; and the final result, accounting for the 20% failed study possibility.
This more rigorous analysis reaches a very similar conclusion of around 80% likelihood that vitamin D is effective against COVID-19, with a 5-fold reduction being the most probable.
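The calculation described in this appendix can be sketched in a few lines. The priors, the adjusted counts, and the 20% weight come from the text; the odds factor of 0.5 for the Damage hypothesis is an assumption, since the text gives no number for it, and the variable names are illustrative.

```python
from math import comb

# Priors from the text. The odds factor for "Damage" (0.5, i.e. the odds of
# severe disease double) is an assumption; the text gives no number for it.
hypotheses = {
    "Damage":  {"prior": 0.10, "odds_ratio": 0.5},
    "Nothing": {"prior": 0.67, "odds_ratio": 1},
    "2-fold":  {"prior": 0.15, "odds_ratio": 2},
    "5-fold":  {"prior": 0.05, "odds_ratio": 5},
    "20-fold": {"prior": 0.03, "odds_ratio": 20},
}

CONTROL_ICU, CONTROL_NO_ICU = 9, 17   # adjusted control group
TREATED_ICU, TREATED_TOTAL = 3, 50    # adjusted treatment group
P_STUDY_MEANINGLESS = 0.20

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Posterior = prior * conditional, before normalization.
unnormalized = {}
for name, h in hypotheses.items():
    # Convert the expected treatment-group odds 9 : 17*OR to a probability.
    p_icu = CONTROL_ICU / (CONTROL_ICU + CONTROL_NO_ICU * h["odds_ratio"])
    conditional = binomial_pmf(TREATED_ICU, TREATED_TOTAL, p_icu)
    unnormalized[name] = h["prior"] * conditional

total = sum(unnormalized.values())
final = {}
for name, h in hypotheses.items():
    posterior = unnormalized[name] / total
    # Weight 80% on the posterior and 20% on the prior (study meaningless).
    final[name] = ((1 - P_STUDY_MEANINGLESS) * posterior
                   + P_STUDY_MEANINGLESS * h["prior"])

for name, value in final.items():
    print(f"{name:8s} {value:6.1%}")
p_effective = final["2-fold"] + final["5-fold"] + final["20-fold"]
print(f"effective (2-fold or better): {p_effective:.1%}")
```

With these inputs the 5-fold hypothesis ends up the most probable, and the combined probability of the three effective hypotheses falls in the 80% to 90% range. The Damage hypothesis receives almost no support from the data, so the assumed factor of 0.5 has little influence on the result.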
There have been many hypotheses surrounding Stonehenge and its long-lost origins. Can a mathematical assessment by Rootclaim help shed light on its original purpose?
The large prehistoric monument in rural England is comprised of a circle of upright stones. It was constructed around 5,000 years ago, under circumstances that have long been lost in the annals of history. Some believe that Stonehenge served a religious function, while others say it was used as a burial site or a place of mystical healing. Yet others argue that it served as a giant calendar. There are also conflicting claims about how the bluestones used to build Stonehenge were moved from their place of origin 150 miles away. One explanation is that a glacier flow moved the stones, while another view says that people transported them from quarries in Preseli.
How can probability theory help?
Probability theory helps researchers measure uncertainty. And there’s plenty of uncertainty surrounding Stonehenge. By using a probabilistic framework, we can model the likelihood of each piece of evidence relating to Stonehenge.
The human mind doesn’t deal well with complexity. It seeks shortcuts, often being fooled by one of many cognitive biases. One of the goals at Rootclaim is to reduce uncertainty by breaking down complex questions into more manageable pieces. This whole system is strengthened by the open crowd-sourced approach, which increases the breadth, depth, and creativity of the analysis. A good case in point is the contrast between the recent UN Joint Investigative Mechanism (JIM) report on the Khan Sheikhoun chemical attack, and the Rootclaim analysis of the same incident.
On September 6, 2017, the UN Human Rights Council (HRC) published a report addressing the April 4 Khan Shaykhun attack. The report found “reasonable grounds to believe Syrian forces dropped an aerial bomb dispersing sarin in Khan Shaykhun.” This finding seems to bolster the hypothesis that the Syrian Army was responsible for the attack. That would justify inclusion in the related Rootclaim analysis. However, a closer look reveals that this is not the case.
For almost a decade, Osama Bin Laden eluded capture, despite a $25 million bounty on his head. That ended in May 2011, when two American helicopters touched down outside a walled compound in Abbottabad, Pakistan. American soldiers stormed Bin Laden’s safehouse, shooting and killing him. Bin Laden’s death raised more questions than it answered. The most glaring question: did the Pakistani government realize that Bin Laden was hiding under their noses?
The Rootclaim analysis of this question looked at extensive evidence. This included information reported about the Bin Laden compound, leaked communications, US behavior following the raid, statements by Pakistani leaders, and the findings of the Abbottabad Commission Report.
What Caused the Disappearance of Malaysia Airlines Flight 370?
Malaysia Airlines flight 370 disappeared on March 8, 2014. The location of the plane and the reason it is missing remain unknown. In order to find the most probable solution to this mystery, the Rootclaim analysis of this story currently considers nine hypotheses: the pilot committed suicide; the co-pilot committed suicide; passengers hijacked the plane; the pilot hijacked the plane; in-flight fire; turbulence; the flight was shot down; fuselage crack; and that an improperly repaired wing-tip caused the crash. At the moment, the evidence suggests that the pilot crashed the plane while committing suicide.
One common trap of human intuition is failing to take into account the plausibility of an event before considering the context-specific evidence of the case at hand. Without knowing how plausible a hypothesis is in general, it is easy to fall into the trap (test yourself!) of overestimating the initial probability for inherently unlikely theories. This is known as the Prosecutor’s fallacy, one of the main flaws of human reasoning.
As discussed in a previous post, logic fails in the real world. Real-world problems do not lend themselves to logical reasoning since any non-trivial issue involves some uncertainty (Was the report accurate? Is the test result a false-positive?). When the problem is also complex, involving large amounts of information with intricate dependencies, then uncertainty can render the logical argument meaningless.
How Does Anyone Make Decisions?
Given these hurdles, how has humanity managed to make any progress at all? How do we deal with uncertain information in cases of high complexity?
What are you more afraid of: boarding a plane, or getting into a car? For many people, flying comes with nervousness or trepidation. Such fears don’t reflect the actual risks. Car crashes claim more lives each year by orders of magnitude. But they do reflect something else: a cognitive trap to which almost everyone is susceptible.
Making decisions can be difficult. To help us along, our minds use a number of cognitive shortcuts. These shortcuts, called heuristics, allow us to make more rapid decisions with minimal calculations. Unfortunately, while generally efficient, heuristics can also lead us astray.
Rootclaim is a collaborative analysis platform that transforms how people understand complex issues by combining the power and reach of crowdsourced information with the mathematical validity of Bayesian statistics.