Here’s a thought-provoking article by David J. Spiegelhalter and Hauke Riesch on how to deal with the unknown in a rational analysis—not just things we know are unknown, but deeper uncertainties.

Take, for example, the following case study:


In February 2010, a story was reported in the newspapers and on prime-time radio about a woman who had bought a box of half-dozen extra-large eggs and found they all had double-yolks. A representative from the Egg Council declared that only 1 in 1000 eggs were double-yolked, and so getting six such eggs in a box was assumed to have a chance of 1 in 1 000 000 000 000 000 000 (1/1000 multiplied together six times). This is an old English trillion.

We can deconstruct this ‘risk analysis’ using the five levels.

— Is this a plausible probability for this event? There are around 2 000 000 000 half-dozens of eggs sold in the UK each year, a huge number, but even so we would only expect such a rare event to occur once in every 500 000 000 years. So this suggests the probability is incorrect.

— The basic parameter of the model is the 1/1000 chance of a double-yolked egg. But this risk is much higher for extra-large eggs, and so the parameter is thrown into doubt.

— The model assumes that eggs in a box are independent. But a little research shows that eggs in the same box tend to come from the same flock, and hence getting one double-yolked egg increases the odds of getting another in the same box. Hence, we should consider elaborating our model to allow for correlated eggs.

— But what about other inadequacies of any such model? A little reflection suggests that we might want to know about where the eggs came from, how they are screened, and many other possibly influential factors, before we could say whether this was really such a surprising event.

— Finally, what did we not even think of, what were we entirely ignorant about? When we use this example in lectures, we say how the next box of eggs we bought had six eggs that were all double-yolked! Extraordinary? And then we show the picture shown [below], revealing that this was not a difficult feat since they can be bought in a supermarket. This was a total surprise to us and has been to all audiences (although it was pointed out in the comments to the newspaper articles); it had not crossed our minds that double-yolked eggs can be easily detected by holding them to a light and selected for inclusion in a box, and so the six eggs in the original story were most likely deliberately selected or the box wrongly labelled.
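The arithmetic behind the first level is easy to check. Here is a minimal sketch that uses only the figures quoted above (1 in 1000 eggs, six eggs per box, about 2 000 000 000 half-dozen boxes a year) and keeps the naive independence assumption; the variable names are my own.

```python
from fractions import Fraction

# Figures quoted in the case study.
p_double_yolk = Fraction(1, 1000)   # claimed chance that one egg is double-yolked
eggs_per_box = 6
boxes_per_year = 2_000_000_000      # half-dozen boxes sold in the UK per year

# Naive model: eggs in a box are independent, so the chances multiply.
p_box = p_double_yolk ** eggs_per_box

# Expected number of all-double-yolk boxes per year, and its reciprocal.
expected_boxes_per_year = p_box * boxes_per_year
years_per_event = 1 / expected_boxes_per_year

print(p_box)            # 1/1000000000000000000
print(years_per_event)  # 500000000
```

Exact fractions avoid floating-point rounding, so the result matches the quoted "once in every 500 000 000 years" exactly.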

Keep in mind that every analysis, even the most “scientific,” rests on a series of assumptions and judgments. The trick is to examine those assumptions critically and assess how much each one affects the conclusions.
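One rough way to see how much the assumptions matter is to change them one at a time and recompute. The sketch below is illustrative only: the 1-in-100 rate for extra-large eggs, and the idea that 1% of boxes come from flocks where 1 egg in 10 is double-yolked, are numbers I made up to show the mechanics, not estimates from the article. The function name `expected_boxes_per_year` is likewise hypothetical.

```python
from fractions import Fraction

BOXES_PER_YEAR = 2_000_000_000  # figure quoted in the case study
EGGS_PER_BOX = 6


def expected_boxes_per_year(p_low, p_high=None, share_high=Fraction(0)):
    """Expected all-double-yolk boxes per year under a two-group mixture.

    A fraction `share_high` of boxes comes from flocks with per-egg rate
    `p_high`; the rest have rate `p_low`. Within a box, eggs are treated
    as independent given the flock. With share_high = 0 this reduces to
    the naive independent model.
    """
    if p_high is None:
        p_high = p_low
    p_box = (share_high * p_high ** EGGS_PER_BOX
             + (1 - share_high) * p_low ** EGGS_PER_BOX)
    return p_box * BOXES_PER_YEAR


# Scenario 1: the original claim (1 in 1000, independent eggs).
naive = expected_boxes_per_year(Fraction(1, 1000))

# Scenario 2 (hypothetical): extra-large eggs are double-yolked 1 time in 100.
higher_rate = expected_boxes_per_year(Fraction(1, 100))

# Scenario 3 (hypothetical): 1% of boxes come from flocks with a 1-in-10 rate.
correlated = expected_boxes_per_year(
    Fraction(1, 1000), p_high=Fraction(1, 10), share_high=Fraction(1, 100)
)

for name, value in [("naive", naive),
                    ("higher rate", higher_rate),
                    ("correlated", correlated)]:
    print(f"{name}: about {float(value):.3g} boxes per year")
```

Under these made-up inputs the answer moves from one box every 500 000 000 years, to one every 500 years, to roughly 20 boxes a year. The point is not the specific numbers but how sharply the conclusion depends on the parameter and the independence assumption, and none of these scenarios covers deliberate selection of the eggs.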

What do you think? What’s the best way to deal with unknowns in a rational analysis? How can we avoid falling into the traps of the above example when we don’t have enough information, and don’t even know enough to ask the right questions?