Probabilistic Models on Trial

There are many modes of evidence accepted in courts of law. Each mode has its strengths and weaknesses, which will usually be highlighted to suit either side of the case. For example, if a witness places the defendant at the scene of the crime, the defense lawyer will attack her credibility. If fingerprint evidence is lacking, the prosecution will say it's because the defendant was careful. Will inferences from probabilistic models ever become a mode of evidence?

It's a natural idea. Courts deal with uncertainty as a matter of institutional routine. They use phrases like "beyond reasonable doubt", they talk about the balance of evidence, and they weigh precision-recall trade-offs (it is "better that ten guilty persons escape than that one innocent suffer", according to English jurist William Blackstone). And the closest thing we have to a science of uncertainty is Bayesian modelling.

In a limited sense we already have probabilistic models as a mode of evidence. For example, the most damning piece of evidence in the Sally Clark cot death case was the testimony of a paediatrician who said that the chance of two cot deaths happening in one household, without malicious action, is 1 in 73 million. This figure was wrong because the model assumptions were wrong: the model treated the two deaths as independent events, whereas there could be a genetic or non-malicious environmental component to cot death that makes a second death in the same household more likely than the first. As the Sally Clark case vividly illustrates, inferential evidence is currently gatekept by experts. In that sense, the expert is the witness and the court rarely interacts with the model itself. The law has a long history of attacking witness testimony. But what will happen when we have truly democratized Bayesian inference?
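To make the criticism concrete, here is a minimal sketch of the arithmetic behind the 1 in 73 million figure. It assumes the widely reported single-death rate of 1 in 8,543 that the testimony is said to have squared; that rate is not stated in this post. The `dependence_factor` parameter is purely hypothetical: it stands for how many times more likely a second cot death is once a first has occurred in the same household, and the value 10 below is illustrative, not an estimate.

```python
from fractions import Fraction

# Assumption: the widely reported per-family rate used in the testimony.
P_SINGLE = Fraction(1, 8543)


def p_two_deaths(p_single, dependence_factor=1):
    """P(two deaths) = P(first) * P(second | first).

    dependence_factor is a hypothetical multiplier on the second death's
    probability; 1 reproduces the independence assumption.
    """
    p_second_given_first = min(Fraction(1), dependence_factor * p_single)
    return p_single * p_second_given_first


independent = p_two_deaths(P_SINGLE)
dependent = p_two_deaths(P_SINGLE, dependence_factor=10)  # illustrative only

print(f"independent: 1 in {int(1 / independent):,}")  # 1 in 72,982,849
print(f"dependent:   1 in {int(1 / dependent):,}")
```

Changing a single assumption moves the headline number by an order of magnitude, which is exactly the kind of attack a court could mount if it interacted with the model rather than with the expert.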

Perhaps one day, in a courtroom near you, the defense and prosecution will negotiate an inference method, then present alternative models for explaining data relevant to the case. The lawyers will use their own model to make their side of the case while attacking the opposing side's model.

In what scenarios would probabilistic models be an important mode of evidence?

When there are large amounts of ambiguous data: too much for people to hold in their heads, and too large or complex even to visualize without making significant assumptions.

Consider a trove of emails between employees of a large corporation. The prosecution might propose a network model to support the accusation that management was active or complicit in criminal activities. The defense might counter-propose an alternative model that shows that several key players outside of the management team were the most responsible and took steps to hide the malfeasance from management.

In these types of cases, one would not hope for, or expect, a definitive answer. Inferences are witnesses and they can be validly attacked from both sides on the grounds of model assumptions (and the inference method).
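As a toy illustration of two sides presenting rival models of the same data, the sketch below compares two Beta-Binomial models with a Bayes factor. Everything in it is an assumption made for illustration: the data (`k` of `n` flagged emails sent by management), the two priors, and the function names are invented, and a real network model of an email trove would be far richer.

```python
from math import comb, exp, lgamma, log


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal_likelihood(k, n, a, b):
    """Log probability of k successes in n trials under a Beta(a, b) prior."""
    return log(comb(n, k)) + log_beta(k + a, n - k + b) - log_beta(a, b)


def bayes_factor(k, n, prior_1, prior_2):
    """Evidence for model 1 over model 2 (values above 1 favour model 1)."""
    return exp(
        log_marginal_likelihood(k, n, *prior_1)
        - log_marginal_likelihood(k, n, *prior_2)
    )


# Invented toy data: 14 of 20 flagged emails were sent by management.
k, n = 14, 20
prosecution = (8, 2)  # hypothetical prior: management sent most flagged emails
defense = (2, 8)      # hypothetical prior: management sent few of them

print(bayes_factor(k, n, prosecution, defense))
# Same data, slightly different prosecution prior: the conclusion shifts.
print(bayes_factor(k, n, (4, 1), defense))
```

The second call shows why neither side should expect a definitive answer: the strength of the evidence depends on prior choices that the opposing lawyer is free to challenge.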

If this were to happen, lawyers would quickly become model criticism ninjas, because they would need model criticism skills to argue their cases. Who knows, maybe those proceedings will make their way onto courtroom drama TV. In that case, probabilistic model criticism will enter the public psyche the same way jury selection, unreliable witnesses, and reasonable doubt have. The expertise will come from machines, not humans, and the world will want to develop ever richer language and concepts that enable it to attack the conclusions of that expertise.