The Key to Keyword Searches? If You Use Them, Be Smart about It
POSTED ON November 20

Ineffective keyword searches are like mosquitoes in the summer or ice dams in the winter: a familiar, recurring nuisance. Yet many of us think of keyword searches as the gold standard for eDiscovery searching.

Keyword searches can be an ineffective way of retrieving information. The evidence: tests by the Text Retrieval Conference (TREC) have shown that keyword searches located only 22 percent to 57 percent of the total number of relevant documents. You also end up with a lot of false positives, that is, matches that aren’t relevant to your case.

Initial searches are almost always too broad; they are search overkill. A keyword may also match other meanings of the same word. For example, a search on “apple” may retrieve documents about the fruit when you want information on Apple, the computer company.

There’s another way.

Here are a couple of tips. First, learn the strategies for picking up variant spellings, or quiz your vendor on how they do it, so as to enhance retrieval. Second, think about conducting searches, or having your vendor conduct searches, that rely on more than simple keywords to capture more relevant data.
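To make the two tips concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any vendor's tooling: the function names, the sample text, the list of context terms, and the 0.8 similarity cutoff are all hypothetical. The first function uses the standard library's difflib to surface words that closely resemble a search term; the second keeps only documents that contain the keyword together with at least one context term.

```python
import difflib
import re


def words_in(text):
    """Lowercase the text and return its distinct word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def find_variants(term, text, cutoff=0.8):
    """Tip 1: list words in the text that closely resemble the search term."""
    candidates = sorted(words_in(text))
    if not candidates:
        return []
    matches = difflib.get_close_matches(
        term.lower(), candidates, n=len(candidates), cutoff=cutoff
    )
    return sorted(matches)


def contextual_hits(docs, keyword, context_terms):
    """Tip 2: keep documents that contain the keyword plus a context term."""
    hits = []
    for doc_id, text in docs.items():
        words = words_in(text)
        if keyword in words and words & context_terms:
            hits.append(doc_id)
    return sorted(hits)


# Hypothetical sample text and documents, for illustration only.
memo = "The judgement was entered Monday; counsel called the judgmnt final."
documents = {
    "doc1": "Please bring an apple pie to the potluck on Friday.",
    "doc2": "Apple shipped the new laptop firmware to our IT group.",
    "doc3": "Our license with Apple covers the software on each computer.",
}
# Assumed terms that suggest the company rather than the fruit.
context = {"computer", "laptop", "software", "firmware", "license"}

print(find_variants("judgment", memo))
print(contextual_hits(documents, "apple", context))
```

In this toy data, a search for “judgment” also surfaces the alternate spelling and the typo, and the context filter drops the potluck email while keeping the two documents about the company. The cutoff and the context list would need testing against your own data before you relied on them.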

Trust us; we’re not alone in our cautious use of keyword searches. U.S. Magistrate Judge Andrew Peck, in his endorsement of computer-assisted review in Da Silva Moore v. Publicis Groupe SA, wrote, “key words, certainly unless they are well done and tested, are not overly useful.” The operative word is “tested.” Finding the right search protocol is an iterative process. The words “if at first you don’t succeed, try, try again” apply.

A search strategy needs to strike a balance between recall (retrieving enough of the relevant documents) and precision (keeping irrelevant documents out of the results). That balance will not be struck on the first try.
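The trade-off can be put in numbers. The sketch below uses the standard definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The document IDs and the two example searches are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical review set: documents 1 through 5 are the relevant ones.
relevant_docs = {1, 2, 3, 4, 5}

# A broad search finds most relevant documents but drags in many others.
broad_search = {1, 2, 3, 4, 6, 7, 8, 9, 10, 11}
# A narrow search returns only relevant documents but misses most of them.
narrow_search = {1, 2}

print(precision_recall(broad_search, relevant_docs))
print(precision_recall(narrow_search, relevant_docs))
```

Here the broad search scores 0.4 on precision and 0.8 on recall, while the narrow search scores 1.0 on precision and 0.4 on recall. Neither is the balance you want, which is why the terms get revised and the search run again.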

Make sure searches are intentional. Sample, test and analyze your searches and their results, and revise the search terms based on what you learn. If you do, you’ll get better results.

But before the search step, you need to set the boundaries on the documents that you will review. Many people skip this step and just do a blanket search. That’s seldom a good idea.

“We really like to work with data custodians to determine where they store things, and then we pinpoint the location of the data. We are then prepared to conduct targeted collections,” says my colleague Benjamin Legatt, the Director of Client Consulting here at Shepherd Data Services. “We don’t necessarily need to image someone’s entire hard drive or take everything from their network. If we’re focused in how we go about collecting data from the outset, the client can save a lot of money.”

With a savvy retrieval process, keyword searches can give you useful information not only in what they find but also in what they don’t find. Here’s Legatt again: “We can use statistical sampling and re-sampling, validation and revision techniques to make sure that we are striking the correct vein of material.”
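One way to learn from what a search did not find is to sample the documents it left behind and have a reviewer check them. The Python sketch below is a simplified illustration of that idea, not Shepherd's actual procedure: the function name, the seed, and the is_relevant callback (a stand-in for a human reviewer's judgment) are all assumptions.

```python
import random


def estimate_missed(null_set_ids, is_relevant, sample_size, seed=0):
    """Estimate how many relevant documents a search left behind.

    null_set_ids: a sized collection of ids the search did NOT retrieve.
    is_relevant:  callable standing in for a reviewer's yes/no judgment.
    Returns (share of the sample found relevant, projected count missed).
    """
    population = sorted(null_set_ids)
    if not population or sample_size <= 0:
        return 0.0, 0
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample = rng.sample(population, min(sample_size, len(population)))
    relevant_found = sum(1 for doc_id in sample if is_relevant(doc_id))
    rate = relevant_found / len(sample)
    # Project the sample rate onto the whole set of unretrieved documents.
    return rate, round(rate * len(population))


# Hypothetical null set of 1,000 documents in which every tenth one is
# relevant; in practice a reviewer would make this call, not a formula.
rate, projected = estimate_missed(range(1000), lambda d: d % 10 == 0, 200)
print(rate, projected)
```

If the projected number of missed documents is high, that is a signal to revise the terms and sample again. A real protocol would also choose the sample size to match a target confidence level and margin of error, which this sketch leaves out.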

There’s so much you can do that’s more strategic and effective than a simple blanket search. And with the advent of more advanced technologies and methodologies such as concept searching and predictive coding, we can show clients how to do things better, smarter, and for less money. We’ll have more to say on that topic in an upcoming blog post.

About the Author

Christine Chalstrom is the Founder, CEO, and President of Shepherd Data Services, a Trustee of Mitchell Hamline School of Law, and an Adviser to the Center for Law and Business. She has spoken widely on the Amendments to the Federal Rules of Civil Procedure, Digital Forensics, and eDiscovery best practices. Her credits include presentations to the American Bar Association, Association of Certified e-Discovery Specialists (ACEDS), Corporate Counsel Institute, MN Association of Corporate Counsel, MN Association of Litigation Support Professionals, MN CLE, Mitchell Hamline School of Law, and Upper Midwest Employment Law Institute. She is an attorney, programmer, and forensic examiner.