On Monday, John went to the mall. He left Green’s Grocery at 7:47 AM and drove home in his 2007 Honda Pilot, forgetting his blue umbrella in the store. That night, John’s neighbor Martha, a retired engineer who lives down the block, called him. She had found the umbrella and was coming to return it. Is that a miracle, or a largely predictable outcome? The answer depends on which of the many coincidences we consider.

What Is Relevant?

Every real-life event is unique, filled with specific details and unlikely coincidences. However, much of that is “noise”: facts that are not relevant to the question being asked. For example, in John’s story, Monday is one of seven days in the week. There are many makes and models of cars that he could have owned. And Martha could have worked in any of a great number of professions. Since every situation has many specific details, in a sense every event is highly unlikely. But are any of these details essential for understanding the main event here?

Determining the Essence

Rootclaim’s model provides a probabilistic analysis of any question by listing the most likely hypotheses, collecting all of the relevant evidence, and evaluating the relationship between each hypothesis and each piece of evidence. Key to the process is understanding the essence of a piece of evidence. Properly evaluating the relationship between the evidence and the hypothesis requires finding the main point of that evidence. We thus try to distill each piece of evidence to its “essence,” i.e., the simplest common denominator of what needs to be analyzed.
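
To make the hypothesis-and-evidence structure concrete, here is a minimal sketch of a plain Bayesian update. It is not Rootclaim's actual implementation. The function name `posterior`, the hypothesis labels, and every number are hypothetical, and the sketch assumes the pieces of evidence are independent of one another given each hypothesis.

```python
def posterior(priors, evidence_likelihoods):
    """Return the normalized probability of each hypothesis.

    priors: dict mapping hypothesis -> prior probability
    evidence_likelihoods: list of dicts, one per piece of evidence,
        each mapping hypothesis -> P(evidence | hypothesis)
    Assumes the evidence items are conditionally independent.
    """
    weights = dict(priors)
    for likelihoods in evidence_likelihoods:
        for hypothesis in weights:
            weights[hypothesis] *= likelihoods[hypothesis]
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


# Hypothetical numbers, chosen only to illustrate the mechanics.
priors = {"H1": 0.5, "H2": 0.5}
evidence = [
    {"H1": 0.8, "H2": 0.2},  # four times likelier under H1
    {"H1": 0.3, "H2": 0.6},  # twice as likely under H2
]
result = posterior(priors, evidence)
print(result)
```

With these made-up inputs, H1 ends at about 0.67 and H2 at about 0.33. Only the ratio between the likelihoods of a piece of evidence matters, which is why distilling the evidence to its essence is so important.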

The Essence Balancing Act

When identifying the essence, there is a dual challenge. We want to generalize as much as possible, because broader categories allow for the use of larger and more reliable data pools. At the same time, we don’t want to overgeneralize. If we do, then instead of focusing on what distinguishes between the likelihoods of the hypotheses, we wind up focusing on something that is equally likely (or unlikely) under each hypothesis, and such evidence cannot shift the balance between them. The first step is therefore identifying the essence. The next step is analyzing the relationship between the evidence and the hypotheses using relevant data and figures.
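
The point about evidence that is equally likely under each hypothesis can be checked with a small two-hypothesis sketch. The helper `update` and all of the probabilities are hypothetical illustrations, not figures from any Rootclaim analysis.

```python
def update(prior_h1, p_evidence_given_h1, p_evidence_given_h2):
    """Posterior probability of H1 after one piece of evidence.

    Assumes exactly two hypotheses, so the prior of H2 is 1 - prior_h1.
    """
    w1 = prior_h1 * p_evidence_given_h1
    w2 = (1 - prior_h1) * p_evidence_given_h2
    return w1 / (w1 + w2)


prior = 0.5

# A common detail that is equally likely under both hypotheses.
common_detail = update(prior, 0.9, 0.9)

# A very specific detail that is equally UNLIKELY under both hypotheses.
rare_detail = update(prior, 0.001, 0.001)

# A detail whose likelihood differs between the hypotheses.
distinguishing_detail = update(prior, 0.1, 0.3)

print(common_detail, rare_detail, distinguishing_detail)
```

In the first two cases the posterior stays at the prior, however common or rare the detail is. Only the third case, where the likelihoods differ, moves the estimate (here down to 0.25 for H1).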

What Matters with John

In John’s case, we can eliminate details about the color of the umbrella, the specific store, and Martha’s profession. Instead, we can use figures on demographics in John’s neighborhood, and the location of the mall, to estimate the likelihood of someone who knows John finding an item that he had left in a store.

The specific car that John drives doesn’t contribute to the calculation, but knowing that John drove home would make it more likely that John wouldn’t realize that he had forgotten his umbrella. So the essence is that John didn’t walk home in the rain (a similar essence would exist if John had walked home after it had stopped raining).

If we were analyzing a different question, then other details might be more relevant. For example, if this were a question about the likelihood of John’s umbrella being blue, then we might look at data on umbrella production and sales, perhaps incorporating John’s age and gender as factors in what color umbrella he’d be likely to own.

Essence of Bolt

The analysis of whether Usain Bolt used performance-enhancing drugs (PEDs) shows how this principle is applied in a Rootclaim analysis. For example, the specific details of the medals Bolt won and the records he set as a teenager are not as important as the information that Bolt was very fast at a very young age. Likewise, the specific times of Bolt’s races in 2007 and 2008 aren’t the critical points. What matters is the size of the improvement (0.34 seconds) and the period over which it took place (a year). And Bolt’s specific height is less relevant than the information that he is taller than most other successful sprinters, which is less likely under the hypotheses in which Bolt has taken PEDs, especially if he has taken them since his teens (in which case we could expect Bolt to have suffered from stunted growth).

https://commons.wikimedia.org/wiki/File:Usain_Bolt_2012_Olympics_2.jpg

Jamaican sprinter Usain Bolt in the 2012 Olympics

On the other hand, we don’t want to over-generalize. With Bolt’s criticism of athletes who use drugs, the context of the statements matters. If we analyzed only the likelihood of Bolt ever making a statement against drug use, ignoring the context, the likelihoods might be very different. Many athletes have made public statements critical of doping, but usually in response to specific allegations or other circumstances that prompt the criticism. Because Bolt’s criticism was unprompted, it is less likely under the hypotheses in which he himself is a PED user.

Essence in the Starting Point

When evaluating the starting point, identifying the essence is also key. In the Bolt analysis, the initial likelihood is assessed by looking at data on athletes at the top of their sports, not just the general public. This information is then further refined to include data specific to Jamaican sports. The question thus becomes: what is the likelihood that a Jamaican athlete at the top of his sport takes steroids? Jamaican athletes are subject to testing under the Jamaican Anti-Doping Commission, which is distinct from the testing in other countries. We may not be able to get enough information on one individual (Bolt) to reliably calculate the prior likelihood. But we can use Jamaican athletes, a subgroup of top athletes for which there is information. This specificity regarding the essence of the prior helps refine the calculations.
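
A rough sketch shows why a reference class gives a steadier starting point than a single individual. The counts below are invented for illustration and are not real doping data. The helper `smoothed_rate` and the add-one (Laplace) smoothing are assumptions made for this sketch, not a method described in the Bolt analysis.

```python
def smoothed_rate(positives, total):
    """Add-one (Laplace) estimate of a rate: (positives + 1) / (total + 2)."""
    return (positives + 1) / (total + 2)


# Hypothetical counts, not real data.
# One individual with a single clean record.
single_athlete = smoothed_rate(0, 1)

# A reference class: 10 violations among 200 comparable athletes.
reference_class = smoothed_rate(10, 200)

print(f"single athlete:  {single_athlete:.3f}")
print(f"reference class: {reference_class:.3f}")
```

With one observation, the estimate (one in three) comes almost entirely from the smoothing assumption rather than from data. With 200 observations, the data dominates, which is the practical reason for starting from a well-chosen subgroup.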

Challenge the Essence

Understanding the essence is critical to properly interpreting claims and evaluating the relationships between evidence and hypotheses. If you disagree with any aspect of the essence, that can affect the weighted likelihoods, and ultimately, the entire analysis. As with the other inputs on Rootclaim, you can challenge the essence and thereby contribute to the analysis. We welcome your input. So visit Rootclaim and add your thoughts to the Bolt analysis, or any of our other analyses.