Will Uppington of Clearwell has a good post called Concept Search Versus Keyword Search in Electronic Discovery.

It offers a helpful explanation and comparison of concept search versus categorization. He concludes that for concept searching “use to become widespread, it will need to become more transparent. But that’s a topic for another day.” I am eager to see him develop this theme.

Concept search software that I evaluated in the early 1990s relied on approaches such as statistical co-occurrence of words (e.g., the SIRE algorithm), vector analysis, state space analysis, tuples, n-grams, and thesaurus-based matching. I have more math training than most, but I generally did not understand the computational linguistics behind these approaches.
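To give a feel for the simplest of these ideas, statistical co-occurrence, here is a minimal Python sketch. It is a toy illustration only, not the SIRE algorithm or any vendor's method: the corpus, the function names, and the choice to count co-occurrence at the whole-document level are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how often each pair of distinct words shares a document."""
    counts = defaultdict(Counter)
    for text in documents:
        words = set(text.lower().split())
        for a, b in combinations(sorted(words), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def expand_query(counts, term, top_n=3):
    """Return the query term plus the words that most often co-occur with it."""
    term = term.lower()
    related = [word for word, _ in counts[term].most_common(top_n)]
    return {term, *related}


def search(documents, terms):
    """Return indexes of documents containing at least one of the terms."""
    return [i for i, text in enumerate(documents)
            if terms & set(text.lower().split())]


# Hypothetical toy corpus, invented for illustration.
docs = [
    "merger agreement signed",
    "merger acquisition talks",
    "acquisition agreement draft",
    "lunch menu",
]

counts = cooccurrence_counts(docs)
keyword_hits = search(docs, {"merger"})
concept_hits = search(docs, expand_query(counts, "merger", top_n=4))
```

In this sketch the keyword search for "merger" finds only the two documents that contain the word, while the expanded query also pulls in the third document, which never mentions "merger" but shares vocabulary with the documents that do. Real products layer far more on top of this, which is part of why they are hard to explain.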

Given the sophistication and the complexity of concept search software, I am not sure how transparent it can be. I think we need to rely on empirically testing and comparing various tools and approaches.
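One way such an empirical comparison could be set up is sketched below in Python. It assumes a set of documents that reviewers have already judged relevant, and it uses precision and recall as the yardsticks; those metrics, the function name, and the document IDs are my assumptions for illustration, not results from any real tool.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: share of retrieved documents that are actually relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant documents that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical reviewer judgments and tool outputs (document IDs).
relevant_docs = {1, 2, 3, 4}
keyword_results = {1, 2}
concept_results = {1, 2, 3, 7}

keyword_scores = precision_recall(keyword_results, relevant_docs)
concept_scores = precision_recall(concept_results, relevant_docs)

print("keyword:", keyword_scores)
print("concept:", concept_scores)
```

With these made-up numbers the keyword tool returns nothing irrelevant but misses half of the relevant documents, while the concept tool finds more of them at the cost of one false hit. A test like this says nothing about how a tool works internally, but it does let two opaque tools be compared on the same footing.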