Author: Ben

Rootclaim’s COVID-19 Origins debate results

And the winner of Rootclaim’s COVID-19 origins debate and the $100,000 prize is…

Unfortunately, not us :). We would like to explain this result, but first, we would like to congratulate our opponent and Rootclaim’s first challenger, Peter Miller. Miller showcased an impressive understanding of the details during the debate, which was hard to match.

While Peter’s victory was well earned within the parameters of the debate, we believe it was also due to our failure to structure an effective debate.

Obviously, one can simply conclude that the correct decision was reached and that zoonosis is the likelier hypothesis. Without resorting to sore losing, and given the importance of this issue regardless of the debate, we would like to explain why we still believe the lab leak hypothesis is the most likely explanation for the origin of COVID-19, and why, as our new and updated analysis shows, its likelihood only increased following the deeper analysis we did for the debate.

First, we’d like to clarify that the judges did an amazing job, putting immense effort, thought, and talent into their decisions:

Will Van Treuren is a microbiologist and immunologist with a PhD from Stanford. He works as Chief Science Officer at a biotech company developing new drugs to treat inflammatory diseases. Will’s written decision can be found here and here and a video summary is available here.
Eric Stansifer is an applied mathematician with a PhD in the Earth sciences from MIT. He has previously done research in a mathematical virology research group, doing simulations of MS2 capsid assembly. Eric’s written decision can be found here and a video summary is available here (you can also read his blog here).

What went wrong?

So, if the judges did their job well and our opponent played by the rules, what went wrong? We believe two things tilted the debate in favor of our opponent and we will correct them in future debates:

First, the debate structure provided a major advantage to the debater with more memorized knowledge of the issue. The debate was live (via video), and Miller displayed extensive knowledge and a superb memory for many details, which we could not compete with in real time. This was not an issue in the second session about genetics, where we were represented by Yuri Deigin, but our second mistake (below) made his good efforts irrelevant. While such superiority is worthy of victory in normal debates, Rootclaim strives to create a model for reasoning and inference that minimizes the problems with human reasoning. Unfortunately, we structured a debate that rewards real-time recall instead. To fix this, future debates will be held in an offline text format, with only a short video presentation at the end.

The second issue we identified was that we failed to incorporate a process of ongoing feedback from the judges, spending most of our time on issues that had little impact on the final decision. In their ruling, we found major mistakes in their understanding of our analysis, which could have been easily corrected had we built the debate with more direct ongoing feedback from the judges.

For example, we know from years of dealing with probabilistic inference that it is highly unintuitive, and it is a challenge to translate to human language. We therefore focused more on an intuitive understanding of the evidence, with probabilistic inference used only as a background framework.

In practice, we were surprised to see both judges found probabilistic inference to be the best way to reach a decision. We of course agree, but had we known this to be the case, we would’ve focused our efforts on explaining how to do probabilistic inference correctly, describing the major pitfalls we discovered over the years, and how to avoid them. As we failed to do so, errors in the judges’ probabilistic inference resulted in unrealistic numbers assigned to the evidence.

The mistakes were heavily skewed toward zoonosis, since our methodology involves steelmanning and maximizing the likelihoods of both hypotheses, while Miller used figures heavily biased toward zoonosis, in some cases using extreme estimates that are impossible to reach in a robust probabilistic analysis, as we explain below.

The Risks of Strawmanning

This mistake of assigning extreme numbers is similar to strawmanning in human debate, and can demolish an otherwise valid probabilistic analysis. Following is a semi-formal definition of the problem and how to avoid it:

Our goal in a probabilistic analysis is to estimate Bayes factors.
A Bayes factor is the ratio of the conditional probabilities of the evidence under two competing hypotheses: p(E|H1)/p(E|H2).
A conditional probability p(E|H) is the probability the evidence E will occur, assuming H is true.
In real-world situations, there are many ways E can occur, so p(E|H) should integrate over all those ways (using “1−∏(1−p_i)”, where p_i is the probability of the i-th way, assuming the ways are independent).
In practice, focusing only on the most common way is usually accurate enough, and dramatically reduces the required work, as real world data tends to have extreme distributions, such as a power law distribution.
This is the “best explanation” – the explanation that maximizes the likelihood of the hypothesis – and making a serious effort to find it is steelmanning.
A mistake in this step, even just choosing the 2nd best explanation, could easily result in orders-of-magnitude errors.
To reduce such mistakes, it is crucial to seriously meet the requirement above of “assuming H is true”. That is a very unintuitive process, as humans tend to feel only one hypothesis is true at any time. Rational thinkers are open to replacing their hypothesis in the face of evidence, but constantly switching between hypotheses is difficult.
The example we like to give for choosing a best explanation is in DNA evidence. A prosecutor shows the court a statistical analysis of which DNA markers matched the defendant and their prevalence, arriving at a 1E-9 probability they would all match a random person, implying a Bayes factor near 1E9 for guilty.
But if we try to estimate p(DNA|~guilty) by truly assuming innocence, it is immediately evident how ridiculous it is to claim only 1 out of a billion innocent suspects will have a DNA match to the crime scene. There are obviously far better explanations like a lab mistake, framing, an object of the suspect being brought by someone to the scene, etc. The goal is to truly seek which explanation is most likely for each hypothesis, using the specifics of each case.
Furthermore, it’s important to not only find the best explanation but honestly think about how well we understand the issue and estimate how likely it is there is some best explanation that still evades us (i.e. that we are currently estimating the 2nd best explanation or worse). This too is obvious to researchers who know not to go publish immediately upon finding something, but rather go through rigorous verification that their finding doesn’t have some other mundane explanation.
So, the more complex the issue is, and the weaker our understanding of it, the less justified we are in claiming a low conditional probability. In frequentist terms, the question we should ask ourselves is: How often did I face a similar issue only to later find there was a much more mundane explanation? Suppose it’s 1 in 10; then the lower bound on our p is 0.1 times however often that mundane explanation occurs (say 0.2, for a total of 0.02).
Claiming something like p=0.0001 in a situation where we don’t have a perfect understanding of the situation is a catastrophic mistake.
For well-designed replicated physics experiments p could reach very low values (allowing for the five sigma standard), but when dealing with noisy complex systems involving biology, human behavior, exponential growth, etc. it is extremely hard to confidently claim that all confounders (i.e. better explanations for the finding) were eliminated, so claiming a very low p is an obvious mistake.
The last guideline is to also examine our confidence in our process. As we examine best explanations, we also need to account for the possibility that we made mistakes in that process itself.
Suppose the explanations for the DNA match are only “by chance” and “lab mix-up”, and suppose we examined the lab procedures, talked to staff, and determined a mix-up was very unlikely. That still doesn’t make “by chance” the most likely explanation, since it is still possible our analysis was wrong, and the combined probability of our mistake and a mix-up (say 0.01*0.01) is still much higher than that of a chance match (1E-9).
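The DNA example can be put into numbers. The following is a minimal Python sketch, not Rootclaim's actual model: the function and variable names are ours, the figures are the illustrative ones from the text (a 1E-9 chance match, and 0.01*0.01 for a mix-up that our review wrongly ruled out), the explanations are assumed to be independent, and p(DNA|guilty) is assumed to be 1 for simplicity.

```python
import math

def prob_any(explanation_probs):
    """p(E|H) when any one of several independent explanations can
    produce the evidence: 1 - prod(1 - p_i)."""
    return 1.0 - math.prod(1.0 - p for p in explanation_probs)

def bayes_factor(p_e_given_h1, p_e_given_h2):
    """Ratio of the conditional probabilities of the evidence."""
    return p_e_given_h1 / p_e_given_h2

# Illustrative figures taken from the text.
p_chance_match = 1e-9          # markers match a random person
p_mixup_missed = 0.01 * 0.01   # our review was wrong AND a mix-up occurred

# Assumption for this sketch: a guilty suspect always matches.
p_match_if_guilty = 1.0

# Integrate over all the ways an innocent suspect could match.
p_match_if_innocent = prob_any([p_chance_match, p_mixup_missed])

# Shortcut: keep only the single best explanation.
best_explanation = max(p_chance_match, p_mixup_missed)

naive_bf = bayes_factor(p_match_if_guilty, p_chance_match)
steelmanned_bf = bayes_factor(p_match_if_guilty, p_match_if_innocent)

print(f"naive Bayes factor:       {naive_bf:.3g}")
print(f"steelmanned Bayes factor: {steelmanned_bf:.3g}")
print(f"best explanation alone:   {best_explanation:.3g}")
```

With these inputs the steelmanned factor is on the order of 1E4 rather than 1E9, and the best explanation alone accounts for nearly all of p(DNA|~guilty), which is why focusing on it is usually accurate enough.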

To summarize: Estimating the Bayes factor requires estimating conditional probabilities, which requires finding the best explanation under each hypothesis, which can easily succumb to several pitfalls that cause catastrophic errors. To avoid those: a) Seek and honestly evaluate best explanations under the assumption the hypothesis is true, b) Estimate the likelihood that there is some better explanation that is yet to be found – the more complex the issue is, the higher the likelihood, and c) Estimate the likelihood of mistakes in the estimates themselves.
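Guideline (b) amounts to putting a floor under any conditional probability we claim. Here is a small Python sketch of that arithmetic, using the example figures from the text (a mundane explanation turns up 1 time in 10, and produces the evidence with probability 0.2). The function names are hypothetical and this is only one simple way to express the idea.

```python
def probability_floor(p_better_explanation_exists, p_evidence_given_it):
    """Lower bound on p(E|H) implied by explanations we may have missed."""
    return p_better_explanation_exists * p_evidence_given_it

def apply_floor(claimed_p, floor):
    """Never report a conditional probability below the floor."""
    return max(claimed_p, floor)

# Example figures from the text: 1 in 10, times 0.2.
floor = probability_floor(0.1, 0.2)

# An extreme claim of the kind the text warns against.
claimed_p = 0.0001
adjusted_p = apply_floor(claimed_p, floor)

# How far the extreme claim overstates the evidence, as a ratio.
overstatement = adjusted_p / claimed_p

print(f"floor = {floor:.3g}, adjusted p = {adjusted_p:.3g}, "
      f"overstatement = {overstatement:.3g}x")
```

In this example the claim of p=0.0001 is off by a factor of about 200 relative to the floor of 0.02.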

The Main Mistakes

We therefore never provided extremely low conditional probabilities under zoonosis, and as a result didn’t have any extreme factors in our analysis. Unfortunately, the result of our steelmanning was that when our hypothesis’ explanation was favored, the effect on the final likelihood was much smaller than when Miller’s was. When the judges did not have the tools to decide between the sides, their result was some average of the two, which of course, given the extreme, strawmanned numbers offered by Peter, favored zoonosis.

Again, to clarify, this is no fault of the judges and is fully our responsibility for structuring the debate incorrectly. We found many such mistakes throughout both judges’ decisions, but in the interest of time would like to focus on the three most important ones that are enough to make lab-leak far more likely, once corrected.

Mistake #1: p=0.0001 for an HSM early cluster

The first mistake in the judges’ decision was accepting an extremely low likelihood for the Huanan Seafood Market (HSM) to form an early cluster of infected patients if Covid originated in a lab. Now that we’ve demonstrated the importance of steelmanning, it’s obvious that it is a mistake to consider HSM to be a random location in Wuhan (i.e. will form an early cluster only once every 10,000 hypothetical SARS2 lab leaks in Wuhan).

Even though we were not able to provide a perfect model for why HSM is a likely early cluster location, the complexity of a virus spreading in an urban area, and especially the huge difference that a small exponential advantage at HSM will have on the final numbers, means there is no way to reach anywhere close to the level of confidence required to claim a number as extreme as p=0.0001.

COVID origins debate insights – Part 2: Does Huanan Seafood Market really point to zoonosis?

While it's linked to many early cases, our debate shows significant disagreement on its role as evidence, questioning how much this coincidence truly favors the zoonosis theory.
— Rootclaim (@Rootclaim) February 2, 2024

Mistake #2: p(Lab leak)<0.01 in priors

The second major mistake in the judges’ decision again involves using an extremely low likelihood instead of steelmanning, this time in the prior likelihood for a lab leak. Each judge made different mistakes, but both reached numbers that, unbeknownst to them, imply gain-of-function research is extremely safe, and all the expert warnings and government moratoriums on it were wrong – a level of confidence that is of course impossible to reach without making some outstanding breakthrough in the understanding of the field. See more details here:

Over the next weeks we’ll release a few threads describing interesting insights on COVID origins we had during the debate. Today we will discuss priors. https://t.co/L5uPNIgoHn
— Rootclaim (@Rootclaim) January 20, 2024

Stansifer’s mistakes:

Severe underestimate (0.02) of the probability that at least one researcher in WIV will undertake a project that WIV clearly expressed interest in. The mistake here seems to come from wrongly thinking SARS2 has features that are not covered by DEFUSE. Interestingly, after Stansifer reached his decision, it was discovered that WIV had planned to do a lot more than was officially written in DEFUSE.
Severe underestimate (0.02) of the probability that a researcher working on a SARS2-like virus for weeks or months under BSL-2 would get infected. There is good reason to claim this could be an over 50% probability, and we gave it a conservative 15%, but 2% is highly overconfident.
These two mistakes imply that any work at the Wuhan Institute of Virology (WIV) would cause a leak only once in 17,000 years. Given that WIV was planning to do coronavirus GoF experiments under BSL-2 (meaning they would be dealing with a respiratory virus without even a face mask), this could easily be a 100x mistake.

Van Treuren’s mistake:

A redundant 0.01 factor was added for requiring WIV to have an unpublished backbone with 98% nucleotide similarity to SARS2. There is no such need. Since our prior was defined as a novel coronavirus pandemic, then all we need to estimate is the probability that a virus capable of that existed in WIV. Specifically, since DEFUSE describes searching for hACE2 matches and adding FCS, then the only question is whether WIV held a virus with a good hACE2 match.

We know BANAL-52 is identical in the RBD to SARS2, so if a relative of it was collected then they have a backbone and we’re done. But we should expand that to any virus with an hACE2 match, even one with 80% similarity to SARS2, so it’s very reasonable that at least one will be found. We gave this 50%.

Another way to look at this mistake: If we arbitrarily limit the engineered backbone to have 98% similarity to SARS2, we should apply the same limitation to the zoonotic progenitor, meaning we should discard from the prior any pandemic that is caused by viruses that don’t use hACE2, or by those with a good hACE2 match but a different genetic sequence.
If we place this requirement on both hypotheses, the effect cancels out.
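The cancellation argument can be checked with a few lines of arithmetic. In this Python sketch the 0.01 restriction factor and the 50% lab-side estimate come from the text, while the zoonosis-side weight is a placeholder chosen only for illustration. It also assumes, as the argument above does, that the restriction is equally demanding on both sides.

```python
import math

def ratio(weight_lab, weight_zoonosis):
    """Relative weight of the two hypotheses."""
    return weight_lab / weight_zoonosis

restriction = 0.01     # the extra factor discussed in the text
weight_lab = 0.5       # the 50% estimate mentioned in the text
weight_zoonosis = 0.5  # placeholder, NOT an estimate from the debate

baseline = ratio(weight_lab, weight_zoonosis)

# Requirement applied to the lab hypothesis only.
one_sided = ratio(weight_lab * restriction, weight_zoonosis)

# Same requirement applied to both hypotheses.
both_sided = ratio(weight_lab * restriction, weight_zoonosis * restriction)

print(f"baseline:   {baseline:.3g}")
print(f"one-sided:  {one_sided:.3g}")
print(f"both-sided: {both_sided:.3g}")
```

Applied to one side only, the factor shifts the ratio 100-fold; applied to both, the ratio returns to its baseline value.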

Mistake #3: Missing that the FCS estimate is heavily steelmanned

The third major mistake in the judges’ decision was using a low estimate for the likelihood of the Furin Cleavage Site (FCS) occurring naturally. A naive analysis of the combination of the rare occurrences behind the FCS insertion (which you can read about in our thread here) places us comfortably in a Bayes factor of millions. Ironically, had we just submitted this strawmanned calculation, we could have won the debate. However, since our goal was to actually determine which hypothesis is most likely, we steelmanned this estimate as well, thinking of the most likely way this could happen, truly assuming zoonosis is true.

Today’s insight from our Covid Origins debate is about the Furin Cleavage Site (FCS) – an amino acid sequence that facilitates Covid’s entry into cells. It is generally agreed the existence of an FCS supports the lab leak hypothesis, but to what extent?
— Rootclaim (@Rootclaim) February 15, 2024

Conclusion

As explained, we have updated our debate structure to avoid these problems in the future. Rootclaim’s $100,000 challenge is still open to anyone, including on the COVID-19 origins issue, as we’re still standing behind our analysis and willing to put our money where our mouth is.

We have invited Peter to reapply, using the updated textual debate format with ongoing judge feedback, allowing the sides to fully convey their hypothesis in exactly the problematic areas. Miller has declined a rematch; we respect his decision to move on and invite others to take his place.

The idea behind our challenge and risking money is to provide a strong incentive for deep research and analysis. This was successful beyond our expectations with Miller now probably one of the people with the deepest and most encompassing knowledge about the origins of COVID-19.

In ‘A Journey to the Center of the Earth’, Jules Verne wrote that “Science is made up of mistakes, but they are mistakes which it is useful to make because they lead little by little to the truth”. You don’t go into the probabilistic inference business expecting certainty. In this spirit, we appreciate this loss as our compass to future success.

New evidence resolves the controversy surrounding the 2013 Syria sarin attack

June 18, 2021 / Ben / 5 Comments

Since its founding, Rootclaim has tried to bring clarity to areas of uncertainty surrounding world events. Today we are one step closer to that goal, with new discoveries that effectively resolve the major controversy of who was behind the 2013 chemical attack near Damascus.

Responsibility for the 2013 chemical attack has been a hotly contested, politically divisive issue, with a wide consensus in the West that the Syrian government was to blame, while Syria and its allies claimed that it was a “false flag” opposition attack, intended to bring about US intervention.

Rootclaim’s 2017 analysis went against this Western consensus, calculating an 87% likelihood that the Syrian opposition carried out the attack. Following the discoveries discussed below, this has now been updated to 96%, one of our most certain conclusions. Read our updated probabilistic analysis here, including a summary of the main claims of each side.
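To give a feel for the size of this update, the move from 87% to 96% can be expressed as a single equivalent Bayes factor. This Python sketch only converts the two published figures into odds. It does not reproduce the actual analysis, which may have re-evaluated several pieces of evidence rather than applying one factor, and the function names are ours.

```python
import math

def odds(probability):
    """Convert a probability to odds in favor."""
    return probability / (1.0 - probability)

def probability(odds_in_favor):
    """Convert odds in favor back to a probability."""
    return odds_in_favor / (1.0 + odds_in_favor)

before = 0.87  # 2017 conclusion, from the text
after = 0.96   # updated conclusion, from the text

# Net Bayes factor that would move 87% to 96% in one step.
implied_factor = odds(after) / odds(before)

print(f"odds before: {odds(before):.2f}")
print(f"odds after:  {odds(after):.2f}")
print(f"implied net Bayes factor: {implied_factor:.2f}")
```

The implied net factor is about 3.6: the odds rise from roughly 6.7:1 to 24:1.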

The new findings are a result of what we believe to be the most impressive independent open-source investigation in history. It was initiated nearly a year ago by several volunteers who reviewed all the evidence from the attack and managed to uncover incontrovertible evidence implicating an opposition faction, confirming Rootclaim’s conclusion. The full report is available here, and following is a summary of its findings.

Rocket trajectories

The investigation began by examining the many videos of rocket impact sites that were uploaded following the attack. Each video was examined for clues pointing to its exact location and the trajectory of the incoming rocket.

For example, in this video, the chemical rocket penetrated a wall on a roof and continued to the floor below. Several landmarks in the background can be matched to satellite photos, identifying the exact location (33.519130°, 36.354841°).

Stitching together a few shots from the video shows that the rocket first hit the far wall and then the floor below. Connecting the two impacts provides an estimated trajectory for the rocket, with a launch location to the northwest.

This location is especially interesting as it singlehandedly invalidates the current hypothesis for government involvement in the attack.

Originally, the common claim was that the attack originated in a Syrian army base. But when the rockets were discovered to have a short range of around 2 km, this claim had to be retracted, as no bases were within that range.

This prompted Eliot Higgins of Bellingcat, an investigative journalism website that specializes in open-source intelligence, to look for new possible launch locations. In Higgins’ diagram below, the green area is under government control, and the red circles are 2 km from the impact sites (2.1-2.3 km is considered the maximum range). Consequently, he suggested that the Syrian army launched the chemical rockets from the area south of the Air Force Intelligence Branch (yellow rectangle).

However, this entire area lies east of the blue line we added to the diagram, which shows it could not have been the source, as a rocket shot from there would have penetrated the northern wall of the building rather than the western wall, as seen in the video.
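The range-and-direction reasoning used here can be sketched in Python with two standard great-circle formulas on a spherical Earth. The impact coordinates and the 2 km range are from the text. The 315° back-azimuth is only a placeholder for "northwest", since the report's measured azimuth is not given here, so the resulting point is illustrative and not a claimed launch location.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def destination(lat, lon, bearing_deg, distance_km):
    """Point reached from (lat, lon) along a bearing for a given distance."""
    phi1, lmb1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM
    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lmb2 = lmb1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    return math.degrees(phi2), math.degrees(lmb2)

impact = (33.519130, 36.354841)  # impact site identified in the text
back_azimuth_deg = 315.0         # placeholder for "northwest"
range_km = 2.0                   # approximate rocket range from the text

# Walk back along the assumed trajectory to the rocket's range.
candidate = destination(*impact, back_azimuth_deg, range_km)

# Any proposed launch area can be tested the same way: is it within range?
distance_back = haversine_km(*impact, *candidate)
print(f"candidate point: {candidate[0]:.6f}, {candidate[1]:.6f}")
print(f"distance to impact site: {distance_back:.3f} km")
```

Repeating this for each impact site with its measured azimuth, and checking where the projected lines converge, is the kind of cross-check the investigation describes.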

This next video was also reexamined. Its location was identified in 2013 (33.520415°, 36.356117°), and the rocket’s trajectory is clearly evident, since it lodged in the ground without bending, pointing to its source.

In two shots from the video the camera is almost directly behind the rocket, and it is seen to align with a tree and buildings in the background.

Connecting these features in satellite imagery provides the trajectory’s azimuth (towards the yellow building).

Interestingly, the UN misreported this angle by 30(!) degrees (towards the purple building below).

The UN also misreported another trajectory, both of which conveniently intersected at a Syrian army base (which we now know is 5 times beyond the rockets’ range). Subsequently, the New York Times printed these mistaken findings on their front page as “forensic” evidence for Syrian army culpability.

Side note: Misuse of power

A similar failure occurred in the OPCW investigation of the 2018 Douma chemical attack, where OPCW personnel who pointed out evidence that the attack was staged were silenced.

This hijacking of international bodies by political and financial interests is becoming a major world threat, hurting the lives of millions. Additional examples exposed in other Rootclaim analyses include the failure of health organizations to realize the efficacy of vitamin D (and other unpatented treatments) in treating Covid-19, and the suppression by the scientific community and the WHO of evidence supporting the hypothesis that Covid-19 resulted from a lab leak (see our analysis).

Impact sites

The open-source investigation repeated the process above for seven impact sites, producing this map of all trajectories (triangles represent uncertainty of a trajectory).

The agreement between the trajectories is remarkable, with all of them converging on a small area that also happens to be at the expected ~2 km range from the impact sites.

It is widely recognized that this location was under opposition control at the time (the significance of this spot was not known until now, so both political sides had no problem agreeing who controls it…).

Right in the middle of the identified area is this small field with enough space from which to launch rockets, whose importance will soon become evident.

Video of the chemical attack

A month after the attack, when the US threat to attack Syria had already been removed, a video surfaced, which was claimed to have been found on the bodies of “Syrian terrorists”. The video shows Islamist fighters in gas masks launching the exact same rockets, identifying themselves as Liwa al-Islam (the dominant opposition faction in the area), and announcing the date as August 21st 2013 (the day of the attack).

The existence of video evidence of opposition fighters carrying out the chemical attack is a remarkable story all by itself. What would normally be considered the highest level of evidence was here dismissed out of hand as fake and wasn’t even mentioned in mainstream media, while overconfident, unfounded accusations by the US government and false evidence reported by the UN made headlines.

The Rootclaim method prevents this bias by requiring a thorough investigation of all evidence, without filtering. We carefully examined those videos years ago, and also researched video fabrications in general. We found staged videos to be very rare, and that this video has multiple features that are highly uncharacteristic of a fabrication. This finding was a major factor in our initial conclusion.

Thanks to this new investigation, we now have a much deeper understanding of these videos.

The videos are fairly dark with little detail, but a frame-by-frame examination managed to uncover many features of the launch spot, and they perfectly match that same field where all the rocket trajectories intersect.

For example, in several frames the rocket illuminates the area, exposing details such as trees in the background, a field with low vegetation, and a paved platform where the cameraman stands.

In another shot, we see a ditch or edge, while other shots show a few scattered trees and brush.

Other such shots in the video provided further features, which were all modeled in 3D, creating a unified view of the area:

This is a perfect match to our field:

Conclusions

We have a video showing opposition fighters with gas masks launching the rockets used in the chemical attack on the night of the attack. This video strongly matches the characteristics of a small field that lies right at the intersection of seven trajectories calculated from the impact sites, within rocket range of all of them.

Continuing to support the government attack hypothesis in light of this new evidence would require constructing a very unusual scenario. Nevertheless, given the political interests surrounding this issue, we will likely witness such attempts soon.

This breakthrough demonstrates the superiority of Rootclaim’s method, which was able to reach this conclusion years ago, without using the new findings, and with much less information and resources than the Western intelligence agencies who confidently claimed the opposite. That is the strength of probabilistic inference: its ability to extract better insights from less evidence.

Of course, many others also took this position, and have now been proven right, but their position was often politically influenced, causing them to reach the wrong conclusion in cases where the West’s claims happened to be true, such as the downing of flight MH17 over Ukraine (we at Rootclaim concluded that it was downed by the pro-Russian DNR, agreeing with the prevailing narrative in the West).

It is very rare to be consistently correct on contentious issues, when each time the truth supports a different political side. We believe Rootclaim is unique in its consistent success in that aspect.

To summarize the key takeaways from these new discoveries:

Having superior inference methods is far more important than gathering more evidence.
Sometimes, a “smoking gun” is already available, and there is no need to collect more evidence (the information here was all available in 2013). This is especially true for videos and photos, which are so rich in information that there is nearly always another discovery to be made.
The current crisis regarding the public’s trust in authorities and experts is not just about ‘fake news’. Experts are repeatedly failing to serve public interests, due to failures of human reasoning and heavy politicization.
Our society needs to quickly improve its inference methods and especially how our intelligence agencies, courts, international bodies, NGOs, and media operate. The current state of affairs is dramatically increasing the probability of a global catastrophe.

Rootclaim will continue to contribute its part in furthering these goals, by continuing to improve our methodology and by disseminating our analyses to a wider audience.

Promoting Rootclaim is quite a difficult task, when practically every person finds at least one of our findings deeply offensive. But we’re in it for the long run, and will continue to work to consistently provide highly accurate, unbiased analysis of major world events.

Many thanks to the researchers who uncovered these new findings: Michael Kobs, Chris Kabusk, Adam Larson, and many others.

Rootclaim shifts to agile, simplified analyses

December 9, 2020 / Ben / 0 Comments

We’re proud to announce some exciting updates to Rootclaim. Since its inception, Rootclaim has focused on exposing the truth on many issues in public discourse using probabilistic inference. Rootclaim has established an outstanding track record, using proven mathematical models and publicly available information to overcome the flaws of human reasoning. However, this time-consuming approach limited us to responding to relatively few events.

We are glad to announce upgrades to Rootclaim that will deliver more agile, simplified, and timely analyses.

Previously, much of the analysis work had gone into the tricky effort of identifying and dealing with evidence dependencies. Going forward, we will group dependent pieces of evidence and analyze them jointly. This will allow Rootclaim and the readers to examine the evidence in the context of the investigation that exposed it, easily evaluate interdependencies, and assess the significance of missing evidence. Likewise, a large portion of the effort had previously gone into the analysis of minor pieces of evidence, which have little effect on the calculated results. When analyzed as a group, minor details often lose their relevance.

These new efficiencies will allow us to publish analyses and provide the most reliable estimates about developing stories within a few days instead of the weeks of work that have been previously required.

The new analyses will be more readable while still maintaining Rootclaim’s high level of accuracy, and they can be read from top to bottom without knowledge of probability theory. Each group of evidence and its effects on the likelihood of the hypotheses will be clearly explained.

We have republished the following two analyses using the new model:

We are also working hard analyzing the 2020 election results for indications of fraud, and will soon publish our analysis on how the COVID-19 pandemic started, with some surprising results…

We invite you to keep following us on social media (and invite your friends!) for updates on new stories, and ask for your help reviewing our analyses and contributing your inputs.

$100,000 Debate Challenge: Chemical Attack in Syria

November 25, 2020 / Ben / 2 Comments

Among the many atrocities of the Syrian Civil War, the one that stood out was the use of chemical weapons, and particularly the nerve agent sarin.

While there is general agreement that there were multiple sarin attacks, most of the Western population has accepted that the attacks were carried out by the Syrian government. This assumption is so entrenched that objections to it are widely considered to be “conspiracy theories”.

Rootclaim, however, examined the evidence using a probabilistic analysis, and the calculated conclusion revealed that it is much more likely that opposition forces were at fault.

Most of Rootclaim’s conclusions on other issues later became the consensus opinion, despite some initial pushback. Since this has yet to happen with regard to the sarin attacks in Syria, we decided to issue an open $20,000 challenge to debate anyone on this matter. This challenge has gone unanswered since April 2018, and we are now presenting it here in more detail, and increasing the bounty to $100,000. By doing so we hope to demonstrate the superiority of reasoning methods that integrate honest consideration of multiple hypotheses, unbiased analysis of evidence, and probabilistic inference.

Update: In June 2021, a video of opposition fighters launching rockets was matched to a field within opposition-controlled territory, and that field has been shown to be at the intersection of seven rocket trajectories calculated from images of the impact sites. With this additional evidence we now consider the issue closed, demonstrating again the superiority of Rootclaim’s methods. While the $100,000 challenge is still available, we don’t expect anyone to apply.

The challenge

Win a debate with a Rootclaim team member about the sarin attacks in Syria and take home $100,000. See our Rootclaim Challenge page for additional topics.

The debate

Who carried out the sarin attacks during the Syrian civil war?

Rootclaim will argue that opposition forces were responsible.

Will anyone defend the commonly accepted hypothesis that the Syrian government was responsible?

This is the conclusion reached by the US government, Britain, France, and the joint investigation by the United Nations and OPCW (Organization for the Prohibition of Chemical Weapons).

Do you have another hypothesis (e.g. Russia did it, or that some attacks were by the Syrian government while others were by the opposition)? Write to us and we’ll consider it.

The stakes: $100,000 each

This is the first in a series of Rootclaim Challenges, modeled after projects such as James Randi’s million dollar challenge, offered to anyone who can demonstrate paranormal powers in a lab setting (all attempts failed). To deter repeated submissions with the intention of winning by luck, we require the challenger to risk the same amount. Applicants who can’t afford to risk $100,000 are encouraged to pool funds together or even crowdfund it. We are willing to reduce the stakes as low as $10,000 for applicants already involved in public debate on the issue.

The motivation here is not to make money, but to elevate the level of public discourse (read about how challenges like this may help people reevaluate their positions, something that never happens in a heated online exchange).

Format

Both sides will first agree on two judges with strong analytical skills, relevant experience, no previous endorsement of either side, no relevant political biases, and who declare they will examine both hypotheses equally.

Choosing judges will be done publicly on Twitter, so evasion attempts by either side, such as offering biased judges, are exposed. As an example of our honest approach to this process, in a past discussion, when Nassim Taleb offered Glenn Greenwald as a judge, we agreed to bend the rules and accept him, even though he previously said there is “overwhelming” evidence the government is responsible (contrary to Rootclaim’s conclusion) – because we think he is capable of changing his mind when presented with evidence.

Each side will have 8 hours in total to present its case, including time to respond to the other side’s claims, as part of a two-day event.

The debate will be based on all currently available evidence. The goal here is not to trip up or trap the opponent, but to determine which hypothesis is better supported by the evidence. If you have new evidence, or evidence we overlooked, it should first be shared, so we can update the analysis, and if it doesn’t significantly change the conclusion, the challenge can be accepted. We are not claiming to have better evidence, but rather aim to demonstrate the superiority of probabilistic reasoning over human reasoning, when evaluating the same evidence.

Each judge has to declare which of the two hypotheses is more likely. If both agree, the prize pool, minus the debate expenses, is paid to the winner. Otherwise, it is split.
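The payout rule above can be sketched in a few lines of Python. This is only an illustration, not Rootclaim's actual settlement procedure: the function name `settle` and the side labels are hypothetical, and the sketch assumes that the prize pool is the sum of both stakes and that "split" means an even split of the pool after expenses.

```python
def settle(stake_each: float, expenses: float, verdicts: tuple) -> dict:
    """Return the payout per side under the two-judge rule described above.

    Assumptions (not stated in the post): the pool is both stakes combined,
    and a split verdict divides the net pool evenly.
    """
    sides = ("rootclaim", "challenger")
    if len(verdicts) != 2 or any(v not in sides for v in verdicts):
        raise ValueError("each of the two judges must name one of the two sides")

    net_pool = 2 * stake_each - expenses  # prize pool minus debate expenses

    if verdicts[0] == verdicts[1]:
        # Both judges agree: the whole net pool goes to the winner.
        winner = verdicts[0]
        return {side: (net_pool if side == winner else 0.0) for side in sides}

    # Judges disagree: the net pool is split.
    return {side: net_pool / 2 for side in sides}


if __name__ == "__main__":
    # Illustrative numbers only; the expense figure is made up.
    print(settle(100_000, 10_000, ("challenger", "challenger")))
    print(settle(100_000, 10_000, ("rootclaim", "challenger")))
```

Under these assumptions, a split verdict returns each side its stake minus half of the expenses.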

We are flexible – feel free to contact us with offers.

Who declined so far?

The following people have been sent a tweet offering to participate in the challenge but declined or failed to respond. All of them have publicly expressed very high confidence that the Syrian government is responsible.

Eliot Higgins – Founder of Bellingcat.
Brian Whitaker – Journalist and former Middle East editor of The Guardian.
Chris York – Senior editor of Huffington Post UK.
Josie Ensor – Middle East correspondent for The Telegraph.
Scott Lucas – Editor of EA WorldView and Professor at University of Birmingham.
Richard Hall – Middle East correspondent for The Independent.
Julie Lenarz – Senior adviser at The Israel Project and a director at The Human Security Centre.
Kristyan Benedict – Amnesty International UK Campaigns Manager.
Dan Kaszeta – Security and CBRN specialist and writer for Bellingcat.
Tobias Schneider – Research fellow at Global Public Policy Institute (GPPi).
Gregory Koblentz – Director of Biodefense Graduate Program at George Mason University.
Numerous other individuals who were very active on social media discussing this issue.

Have you notified anyone of the challenge and they declined it? Let us know and we’ll add them to the list.

Treating Covid-19 with Vitamin D $100,000 Challenge

September 23, 2020 / Ben / 16 Comments

A study from October 2020, the first randomized controlled trial of its kind, showed that high doses of vitamin D (in the form of calcifediol) reduce the severity of Covid-19 in hospitalized patients. The researchers reported a 30-fold(!) reduction in intensive care admissions of Covid-19 patients. At Rootclaim, we analyzed these findings and concluded that even under conservative assumptions accounting for limitations in the study, the effect is still significant and likely around 5-fold. We further demonstrated that since the risks of treatment are low, this treatment protocol should be immediately implemented. Since we published our analysis, additional studies have supported this conclusion.

See Rootclaim’s complete analysis.

Many health professionals, government officials, and other decision makers worldwide have seen the studies, but they have yet to update treatment guidelines. This delay may be due to the following:

Not many have the background in statistics and probability required to assess the data, and distinguish it from the dozens of false claims about COVID treatments.
They’re affected by omission bias – they default to the “safe” alternative of inaction, waiting for more data, rather than choosing action. It’s easier to later defend inaction than face criticism for acting too soon.
Their incentives are completely misaligned with the public’s. The damage to the public from using vitamin D when it isn’t effective is negligible, but the damage caused by inaction when vitamin D is effective is enormous. For the decision maker personally, the damage is similar in either case: one wrong decision on their record.
When it comes to low-risk, low-cost treatments, decision-makers hedging their bets on inaction leads to avoidable deaths.

In this particular case the reasons to act now are clear:

Similar treatments have been performed for decades, and the risks are known to be low, especially in this setting, when patients can be monitored at the hospital.
The benefits of the treatment, on the other hand, are potentially enormous, effectively reducing Covid-19 severity to that of the seasonal flu.

While caution is often the correct path when dealing with public health, this is a case where decisions should be made swiftly, using the best available models. At Rootclaim, we develop such models, so when our analysis exposed the implications of the new findings, we decided to promote the adoption of the proposed treatment. We hope that this unique challenge will allow the information to reach more decision makers, and save the millions of lives that will likely be lost while waiting for further studies.

The Challenge

Rootclaim is willing to bet $100,000 that vitamin D is effective in reducing the severity of Covid-19.

This is the second in a series of Rootclaim Challenges and is intended to show that the reluctance to implement a vitamin D protocol today is irrational. A decision maker who is not pushing to adopt the proposed protocol is effectively claiming that the probability that this protocol is better than existing treatments is low. But that also implies that taking the bet would be very profitable. Therefore, any professional not accepting this challenge is implicitly admitting that their decision not to promote the treatment is wrong.
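The betting argument above is an expected-value calculation, which can be made explicit with a short sketch. It assumes an even-odds bet with equal stakes on both sides, and it ignores arbitration costs and risk aversion; the function name and the probabilities in the usage lines are hypothetical, not figures from Rootclaim's analysis.

```python
def challenger_expected_value(p_effective: float, stake: float = 100_000) -> float:
    """Expected profit for someone betting AGAINST vitamin D being effective.

    p_effective is the bettor's own probability that the treatment works.
    Assumes even odds: the bettor wins `stake` if the treatment is ineffective
    and loses `stake` if it is effective.
    """
    if not 0.0 <= p_effective <= 1.0:
        raise ValueError("p_effective must be a probability between 0 and 1")
    return (1.0 - p_effective) * stake - p_effective * stake


if __name__ == "__main__":
    # Hypothetical beliefs, for illustration only.
    for p in (0.1, 0.3, 0.5, 0.8):
        print(f"P(effective) = {p:.1f}: expected profit = {challenger_expected_value(p):,.0f}")
```

Under these assumptions the expected profit is positive whenever the bettor's probability is below one half, which is the sense in which a decision maker who considers the treatment unlikely to work should find the bet attractive.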

UPDATE (Nov 2022): Since this challenge was published, in the early stages of the pandemic, multiple studies have been done on the subject, with the overwhelming majority finding vitamin D effective.
Over time, the virus has changed significantly, and the population has changed due to vaccines and immunity. Therefore the analysis of vitamin D’s efficacy on the original virus and population is no longer relevant today and should be updated.

Since the pandemic is no longer a major risk, this update is not a priority for Rootclaim. Nevertheless, if you wish to debate the original analysis, please contact us to set the exact criteria.

Procedure

The challenger needs to show that they can commit $100,000. We are open to discussing lower or higher amounts, and the funds can be pooled from multiple sources.
Both sides will agree on an arbitrator who will review the evidence.
The challenger needs to declare that they do not have access to any relevant non-public information. This is to protect against abuse in case of unpublished research (there is still a small chance that further research will discover the treatment is ineffective).
For the same reason, we may update these terms or withdraw the offer, as new information emerges. Of course, once a bet is made it is final and cannot be withdrawn.

If you’re not willing to risk your own money betting against vitamin D, why are you willing to risk someone else’s life?

Stonehenge by the numbers

December 20, 2017 / Ben / 0 Comments

There have been many hypotheses surrounding Stonehenge and its long-lost origins. Can a mathematical assessment by Rootclaim help shed light on its original purpose?

The large prehistoric monument in rural England consists of a circle of upright stones. It was constructed around 5,000 years ago, under circumstances that have long been lost to history. Some believe that Stonehenge served a religious function, while others say it was used as a burial site or a place of mystical healing. Still others argue that it served as a giant calendar. There are also conflicting claims about how the bluestones used to build Stonehenge were moved from their place of origin 150 miles away. One explanation is that a glacier moved the stones, while another holds that people transported them from quarries in Preseli.

How can probability theory help?

Probability theory helps researchers measure uncertainty. And there’s plenty of uncertainty surrounding Stonehenge. By using a probabilistic framework, we can model the likelihood of each piece of evidence relating to Stonehenge.

When Logic Goes Wrong: Survival Guide

May 15, 2017 / Ben / 1 Comment

Logic Is Dead

As discussed in a previous post, logic fails in the real world. Real-world problems do not lend themselves to logical reasoning since any non-trivial issue involves some uncertainty (Was the report accurate? Is the test result a false-positive?). When the problem is also complex, involving large amounts of information with intricate dependencies, then uncertainty can render the logical argument meaningless.

How Does Anyone Make Decisions?

Given these hurdles, how has humanity managed to make any progress at all? How do we deal with uncertain information in cases of high complexity?

Every Logical Argument You Ever Made Was Wrong

April 15, 2017 / Ben / 8 Comments

Isn’t Logic Great?

Who doesn’t like logic? We idolize Sherlock Holmes’ ability to solve mysteries by “eliminating the impossible.” In arguments with friends, we try to prove we’re right using logic, rather than intuition or emotions. And we especially enjoy pointing out others’ logical fallacies, preferably using Latin terms.

Don’t pat yourself on the back just yet. Finding logical fallacies is actually much less impressive than you might think. That’s because in the real world, all arguments violate the principles of formal logic.

Yes, perhaps every logical argument you have ever encountered was flawed.

Calculating Accurate Inputs

January 17, 2017 / Ben / 0 Comments

Serial: A Game of Numbers

Every analyst has their methods: take a poll, measure social media buzz, weigh various key factors. Rootclaim’s analysis of Serial: Who killed Hae Min Lee provides a good example of how the Rootclaim system uses hard numbers in order to “calculate reality” – to determine mathematically which hypothesis is the most likely.

Rootclaim recently took on one of the most controversial criminal convictions: that of Adnan Syed, sentenced to jail for murdering his ex-girlfriend Hae Min Lee. Syed’s conviction has been featured in the podcasts Serial and Undisclosed, and followers of the case have debated the minutiae on forums such as Reddit. Until now, discussion forums have focused on how a few particular pieces of evidence proved one hypothesis or another. Rootclaim has put together the first concerted effort to gather all the relevant information into one cohesive analysis.