One of the most interesting ongoing electronic discovery questions is the role and importance of search tools in document review. I recently spoke with Jonathan Nystrom and Dick Oehrle of Cataphora about this and related issues. 

Cataphora develops powerful language analysis software. Its pattern recognition engine is designed to detect deviations from the norm, which makes it useful in investigations and compliance. The company is also an established EDD provider. I first saw Cataphora technology in 2003 and was impressed from the start.

We agreed that while the choice of tool is important, the bigger considerations for effective e-discovery are (1) the fit of the tool to the document set and (2) the overall process a litigation team follows, which includes the training and skill of the people using the tool.

We left unresolved the tension between standardizing on a platform (the goal of many law firms and departments) and choosing the best tool for the matter. Optimizing tool choice case by case is a challenge: unless an organization has an entire process and infrastructure built around a tool, that tool is unlikely to be the “best choice” for the case. So, as a practical matter, it seems likely that all but the largest legal organizations and EDD vendors will standardize around one tool or, if they need a specific tool for a specific matter, find a vendor that has optimized around it.

As for tool selection itself, we agreed that the answer to the question of “best tool” is empirical, not theoretical. Comparing tools on the basis of features has limited value because (1) different features work better for some collections than for others and (2) most lawyers simply cannot evaluate competing algorithms (much less how those algorithms are instantiated in a specific application).
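To make "empirical" concrete, here is a minimal sketch of how two tools might be compared on a sample of documents that reviewers have already coded. It assumes precision and recall as the yardsticks, which is a common choice but not one we discussed, and the function name, tool names, and document IDs are all hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one tool on a coded sample.

    retrieved: IDs the tool flagged as responsive (hypothetical input)
    relevant:  IDs reviewers coded as responsive in the same sample
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: share of flagged documents that really are responsive.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of responsive documents that the tool found.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data only: four responsive documents in the sample.
responsive = {1, 2, 3, 4}
tool_a_hits = {1, 2, 5}                    # narrow search, misses half
tool_b_hits = {1, 2, 3, 4, 5, 6, 7, 8}     # broad search, finds all, noisy

for name, hits in [("tool A", tool_a_hits), ("tool B", tool_b_hits)]:
    p, r = precision_recall(hits, responsive)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Neither tool "wins" in this toy example: one is more precise, the other more complete. Which matters more depends on the collection and the matter, which is the point about fit.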

Empirical testing requires sampling and statistics. We bemoaned the fact that most lawyers are not comfortable with either. I speculated that the aversion to sampling and stats reflects more than a lack of training or even familiarity. Rather, I suggested that sampling – and probability more generally – is fundamentally anathema to most lawyers.

Lawyers dislike uncertainty; they hate being wrong. In their eyes, sampling can never be certain – it is thus suspect. This honestly held but mistaken belief leads to skewed (I am being charitable) outcomes. For example, we agreed lawyers’ faith in manually reviewing documents is misplaced. Those of us with experience in the trenches know humans make many mistakes. And sampling tests would undoubtedly prove that. Whoops, lawyers don’t do sampling, so they continue to have faith in human accuracy. The challenge is that e-discovery is largely science and math and some art; but it definitely is not religion. So blind faith is necessarily misplaced.
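For readers who want to see what such a sampling test might look like, here is a rough sketch. It draws a random sample of coding decisions, counts how many a second reviewer overturns, and reports the error rate with a 95% Wilson score interval. The function names, the `is_wrong` check, and the synthetic data are assumptions made for illustration; no real review results are shown.

```python
import math
import random


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = errors / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def estimate_error_rate(decisions, is_wrong, sample_size, seed=0):
    """Estimate the share of wrong decisions from a simple random sample.

    decisions: sequence of coding decisions (hypothetical records)
    is_wrong:  callable returning True when a second look overturns one
    """
    rng = random.Random(seed)          # fixed seed keeps the draw repeatable
    sample = rng.sample(decisions, sample_size)
    errors = sum(1 for d in sample if is_wrong(d))
    low, high = wilson_interval(errors, sample_size)
    return errors / sample_size, low, high


# Synthetic illustration: 10,000 decisions, 10% of them marked wrong.
decisions = [{"id": i, "wrong": i < 1000} for i in range(10_000)]
rate, low, high = estimate_error_rate(decisions, lambda d: d["wrong"], 500)
print(f"estimated error rate {rate:.1%} (95% interval {low:.1%} to {high:.1%})")
```

The interval is the honest part: the sample does not deliver certainty, but it does say how far off the estimate is likely to be, which is more than untested faith in manual review can offer.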

My conversation with Cataphora management was stimulating; they clearly have a sophisticated view of the market and the tools. And I was intrigued to learn that this summer, the company will release a consumer tool that will tell Outlook users what Outlook says about them. For those who are open-minded, those data could well give rise to another instance of having to question honestly held views. So, with apologies to the Talking Heads, “Goes to show what a little DATA can do”.