A hot legal topic is predictive coding in e-discovery. In the automated approach to reviewing discovery documents for responsiveness, computers do much of the work instead of armies of humans reviewing each document. Those who doubt the reliability – and I would argue inevitability – of predictive coding could learn a lesson from how Wall Street automates trading. 

Computers That Trade on the News (New York Times, 23 Dec 2010) reports that the “number-crunchers on Wall Street are starting to crunch something else: the news.” It describes “robo-readers,” software that digests large volumes of text reports about companies and markets, assessing the meaning of news and social media. The analysis triggers real-time trades, without human intervention.

Predictive (automated) coding is conceptually similar. Just as a trader’s computer analyzes news to execute trades, a lawyer’s computer can analyze documents to determine responsiveness and privilege. The challenge with predictive coding is judicial defensibility. For more on predictive coding, see the October 2010 eDiscovery Institute Survey on Predictive Coding (PDF) and my colleague Foster Gibbon’s July 2010 Integreon blog post, The Future of Automated Document Review.

If the stewards of large sums of money trust computers to decide what to do with their money, why don’t courts and litigators trust computers to make decisions about documents?

Today, lawyers and courts typically presume that predictive coding is unreliable. For example, in the January 2011 Inside Counsel reader poll, 53% of respondents answered no to “Do you think computer review of documents for e-discovery purposes is as accurate as human review?” Perhaps that percentage will change after respondents read, in the same issue, Computerized E-Discovery Document Review is Accurate and Defensible.

Why do predictive coding proponents have the burden of proof to rebut the presumption that predictive coding is unreliable? And that burden of proof is high.

That Wall Street traders willingly stake large sums on computerized text analysis suggests that the courts should reverse the presumption. They should presume that predictive coding is reliable. The burden of proof should shift to predictive coding opponents to show that it is not reliable.

Objection, Your Honor! Wall Street is not a courtroom. Your Honor, traders can test. They can run the robo-readers against historical data, assess what trades the software would have made, and prove the systems work properly.

Your Honor, lawyers can do the same. Some studies support the reliability of predictive coding. I am unaware of any published study that finds predictive coding is unreliable or that finds that human reviewers are reliable.
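What would such a test look like in practice? A minimal sketch, with entirely hypothetical labels and numbers (no real review data): compare the software’s responsiveness calls against a human-coded “gold standard” sample and compute precision and recall, the metrics most validation studies report.

```python
# Hypothetical validation sketch -- labels below are illustrative only.
# Gold-standard labels from attorney review of a sample
# (True = responsive, False = not responsive).
gold =      [True, True, False, True, False, False, True, False]
# Labels the predictive-coding software assigned to the same documents.
predicted = [True, False, False, True, False, True, True, False]

true_pos  = sum(1 for g, p in zip(gold, predicted) if g and p)
false_pos = sum(1 for g, p in zip(gold, predicted) if not g and p)
false_neg = sum(1 for g, p in zip(gold, predicted) if g and not p)

# Precision: of documents the software called responsive, how many were.
precision = true_pos / (true_pos + false_pos)
# Recall: of truly responsive documents, how many the software found.
recall = true_pos / (true_pos + false_neg)

print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

The same two numbers can be computed for a team of human reviewers against the same gold standard, which is what makes a side-by-side comparison possible.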

Dozens of large US companies face a future of expensive document reviews. Wouldn’t it pay for them to “run the tests” to prove predictive coding works? Yes, testing is expensive. But so is repeatedly paying for human document review.

In November, in The Case for General Counsels to Invest in R&D, I argued for investment to “develop defensible automatic / predictive coding for litigation document review.” Wall Street traders have put much money on the line based on automated text analysis. If that does not inspire GCs to make the investment to rebut the predictive coding unreliability presumption, what will?