E-Discovery Concept Searching Clarified
What’s the best way to review documents in discovery? I’ve suggested that an army of US-based contract lawyers is not the answer, that using offshore lawyers or software is a better approach. Nicolas Economou, CEO of H5, offers useful insights in response to a prior post about software v. lawyers and empirical testing.
In The Gold Standard for E-Discovery Document Review (3/18/07), I referred to H5’s document review process as concept search. Economou sent a helpful clarification about the appropriate role of concept search in document review and the H5 approach. He explains below that achieving the best results requires a combination of technology and human effort – lawyers and other domain experts.
Disclosure and Caveat: I met Economou several years ago and, from time to time, have talked to him about the possibility of consulting work. Vendors Speak postings are neither a product endorsement nor an independent vetting of the author’s facts or analysis.
From Nicolas Economou:
I was interested to see your recent blog entry discussing the Webinar we hosted and your reference to a recent evaluation of H5. You referred to our document review process as a concept search tool. Though this characterization is practical and helps communicate the overall message about the value of technology, it might be confusing. I thought it would be helpful to share a more nuanced explanation about how H5 combines technology and process.
Two reasons for the potential confusion: First, we do not use a concept search tool, nor do we provide our clients with one. And second, readers may wrongly conclude that search tools as a class, or in the hands of the traditional “roomful of attorneys”, achieve the high accuracy documented in the evaluation, or higher accuracy than pure manual review.
All academic evidence supports the conclusion that search tools (keyword, Boolean, “concept search” or otherwise) are only modestly accurate. You can expect to miss, in practice, roughly half the relevant documents, as has been established by TREC (the Text REtrieval Conference sponsored by NIST and the US Department of Defense) year after year now for over a decade. In fact, in the recent TREC “legal track”, the best result was achieved by a manual searcher. (See http://trec-legal.umiacs.umd.edu/)
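In information retrieval terms, "missing roughly half the relevant documents" is a statement about recall: the share of all relevant documents that a search actually returns. A minimal sketch of the arithmetic follows. The counts are invented for illustration only and are not taken from TREC or from any H5 evaluation.

```python
def recall(relevant_retrieved: int, relevant_total: int) -> float:
    """Fraction of all relevant documents that the search returned."""
    if relevant_total <= 0:
        raise ValueError("relevant_total must be positive")
    return relevant_retrieved / relevant_total


def precision(relevant_retrieved: int, retrieved_total: int) -> float:
    """Fraction of the returned documents that are actually relevant."""
    if retrieved_total <= 0:
        raise ValueError("retrieved_total must be positive")
    return relevant_retrieved / retrieved_total


# Hypothetical collection: 1,000 relevant documents exist; a search returns
# 800 documents, of which 500 are relevant.
example_recall = recall(500, 1000)      # half of the relevant documents found
example_precision = precision(500, 800)
```

In this made-up case the search looks reasonably precise (most of what it returns is relevant) yet still leaves half of the relevant documents unfound, which is the gap the paragraph above describes.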
What H5 offers, and what has been evaluated in independent studies, is a technology-enabled process that includes search technologies, but only as part of a larger process. H5’s own patented search technology is premised on the belief that humans determine relevancy based on the communities of practice to which they belong, and that modeling practices is a necessary complement to mining semantic relationships.
Where other information retrieval technologies focus on words – terms that occur in text – we focus on worlds, worlds of professional practice. We think that our combination of technology, process, and practice is superior to other approaches. We also believe, however, that packaged as a software tool and placed in the hands of attorneys, our search technology would not perform meaningfully better than any other such advanced technology does. What makes the difference is expertise and process.
Our process brings together multiple competencies: law for obvious reasons; subject matter experts to help define with precision complex responsiveness and relevancy criteria that often delve into technical domains (accounting, engineering, scientific research); technology experts to identify appropriate information retrieval solutions, including of course H5’s; linguists to make optimal use of such information retrieval solutions; process engineers to architect the overall workflow; statisticians to measure the accuracy of the output; and project managers to ensure the proper functioning and supervision of the process and systems. (Law firms typically have only one or two of these necessary competencies among their lawyers or staff.)
We combine these multiple competencies to solve three key non-technological challenges that search tools simply cannot solve alone:
(1) defining in detail and with exactitude what is being sought (a knowledge transfer challenge);
(2) the considerable inconsistency with which humans make judgments of relevancy – even when presented with documents served up by a search tool (a human nature challenge); and
(3) the statistical measurement of the accuracy of the output (a measurement challenge).
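The measurement challenge in item (3) can be illustrated with a generic sampling approach. This is a sketch under stated assumptions, not a description of H5's actual method: draw a random sample of documents, have expert reviewers judge each one, count how many of the relevant sample documents the review process had flagged, and put a confidence interval around the resulting recall estimate. The function name and the sample counts are hypothetical; the Wilson score interval is one standard choice among several.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half_width, center + half_width


# Hypothetical sample: reviewers judged 100 sampled documents to be relevant,
# and the review process had flagged 50 of them.
flagged_relevant = 50
relevant_in_sample = 100

recall_estimate = flagged_relevant / relevant_in_sample
low, high = wilson_interval(flagged_relevant, relevant_in_sample)
print(f"estimated recall {recall_estimate:.2f}, 95% interval {low:.3f} to {high:.3f}")
```

The width of the interval depends on how many relevant documents end up in the sample, so a small sample can only support a rough statement about accuracy.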
Search tools, in and of themselves, or in the hands of individual end-users, simply are incapable of addressing these challenges and therefore of achieving high levels of accuracy.
In short, the reason our document review process proved so successful in independent studies is that we do not use a “concept search” tool per se; if we did, we would not have achieved better results than the alternatives or exceeded the performance ceiling documented by TREC. (That is, we too would have missed half or more of all the relevant documents, as humans equipped with search tools typically do.)
It would be unfortunate if your readers took away from your entry on this study that concept search tools (as conventionally and broadly understood) enable humans to do meaningfully better than pure human review: they do not.
That said, I hasten to add that my comments here are not intended to mean that search tools are not valuable. I use such tools every day myself to find documents on my hard drive. Every one of our clients uses them, and it is stating the obvious that litigators are better off availing themselves of the speed and convenience that search tools offer. But it is important to understand what such tools can and can’t do. They are good at helping you find some relevant documents quickly. However, absent well-calibrated processes and a range of competencies that precede, accompany and follow their use, they are really poor at the much more difficult challenge of finding all, or nearly all, the relevant documents. That is what a well-designed document review process should be able to measurably achieve, in order to mitigate exposure to, or maximize expected value from, litigation.