Indiana District Court Approves Multimodal Computer Assisted Review
Last week, Judge Robert L. Miller, Jr. of Indiana brought us the latest opinion on computer assisted review. In this products liability case, the Defendant used keywords to winnow 19,500,000 documents down to 3,900,000. Then, the Defendant used automated deduplication to reduce those 3,900,000 documents to 2,500,000. Finally, the Defendant used computer assisted review to identify responsive documents within the final set of 2,500,000. The Plaintiffs weren’t fans of this multimodal approach, arguing that adding keyword search into the mix warranted a redo.
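For readers who want to see the size of each cut, here is a minimal sketch of the culling arithmetic. The three document counts come from the opinion; the variable names and the share helper are just illustrative.

```python
# Document counts as reported in the opinion.
TOTAL = 19_500_000           # original collection
AFTER_KEYWORDS = 3_900_000   # documents left after keyword culling
AFTER_DEDUP = 2_500_000      # documents left after deduplication


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Documents set aside at each stage.
excluded_by_keywords = TOTAL - AFTER_KEYWORDS
duplicates_removed = AFTER_KEYWORDS - AFTER_DEDUP

print(f"Excluded by keywords: {excluded_by_keywords:,}")
print(f"Duplicates removed:   {duplicates_removed:,}")
print(f"Kept after keywords:  {share(AFTER_KEYWORDS, TOTAL):.1f}% of the original")
print(f"Sent to review:       {share(AFTER_DEDUP, TOTAL):.1f}% of the original")
```

On these counts, the keyword stage alone set aside about 15.6 million documents, which is the pile the Plaintiffs wanted put back into the mix.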
The Court began by parsing this history, and noting some of the calculated statistics:
[Defendant] used a combination of electronic search functions to identify relevant documents. Keyword culling was used first, reducing the universe of documents and attachments from 19.5 million documents to 3.9 million documents, comprising 1.5 terabytes of data. Removal of duplicates left 2.5 million documents and attachments. Statistical sampling tests of a random sample projected, with a 99 percent confidence rate, that between .55 and 1.33 percent of the unselected documents would be responsive and (with the same confidence level) that between 1.37 and 2.47 percent of the original 19.5 million documents were responsive. In comparison, [Defendant’s] keyword/deduplication approach had identified 16 percent of the original 19.5 million.
[Defendant] then employed technology-assisted review, or predictive coding, to identify the relevant documents to be produced from the 2.5 million that emerged from the keyword and deduplication processes. Predictive coding has found many uses on the Internet. Under predictive coding, the software “learns” a user’s preferences or goals; as it learns, the software identifies with greater accuracy just which items the user wants, whether it be a song, a product, or a search topic.
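The sampling ranges in the quoted passage are confidence intervals for a proportion. The opinion excerpt does not say how large the sample was or which statistical method was used, so the sketch below is only an illustration: it assumes the common normal-approximation (Wald) interval, and the sample of 4,000 documents with 38 responsive hits is hypothetical.

```python
import math
from statistics import NormalDist


def proportion_interval(hits, sample_size, confidence=0.99):
    """Normal-approximation (Wald) confidence interval for a proportion.

    This is an assumed method; the opinion does not name the one actually used.
    """
    if sample_size <= 0 or not 0 <= hits <= sample_size:
        raise ValueError("need 0 <= hits <= sample_size and sample_size > 0")
    p = hits / sample_size
    # Two-sided critical value, e.g. about 2.576 for 99 percent confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    # Clamp to the valid range for a proportion.
    return max(0.0, p - margin), min(1.0, p + margin)


# Hypothetical sample: 4,000 unselected documents reviewed, 38 found responsive.
low, high = proportion_interval(38, 4000)
print(f"Estimated responsive rate: {low:.2%} to {high:.2%}")
```

The practical point is that a sample of a few thousand documents can bound the responsive rate of millions of unselected documents fairly tightly, which is why this kind of testing is much cheaper than re-reviewing the full set.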
The Plaintiffs argued that the “initial use of the keyword approach has tainted the process,” on the premise that keyword searching was not a statistically sound way to cull down the initial 19,500,000. Thus, the Plaintiffs wanted the Defendant to go back to the original 19,500,000 documents and employ computer assisted review on the full set: a redo that would bring back the roughly 15,600,000 documents that keyword searches had earlier booted out of the mix.
Judge Miller held that “[t]he issue before me today isn’t whether predictive coding is a better way of doing things than keyword searching prior to predictive coding. I must decide whether [Defendant]’s procedure satisfies its discovery obligations and, if so, whether it must also do what the Steering Committee seeks. What [Defendant] has done complies fully with the requirements of Federal Rules of Civil Procedure 26(b) and 34(b)(2).”
The Court noted that “[i]t might well be that predictive coding, instead of a keyword search, at Stage Two of the process would unearth additional relevant documents. But it would cost [Defendant] a million, or millions, of dollars to test the Steering Committee’s theory that predictive coding would produce a significantly greater number of relevant documents.” This redo request (as the Court correctly noted) “sits uneasily with the proportionality standard in Rule 26(b)(2)(C).”
Finally, the Court noted that this redo would be fine, provided the Plaintiffs were willing to bear the expense of the search. This case continues the trend of courts focusing (correctly) less on what might have been a better way to search, and more on whether the search satisfied a party’s discovery obligations under the rules. It is a rare case where any party’s search efforts are perfect, and this case emphasizes the importance of discussing eDiscovery protocols at the initial 26(f) conference. If you have an issue with how the opposition will handle eDiscovery, you have a better shot at opposing the plan if you lodge your objection before the opposition spends millions implementing it.
You can check out the full opinion here.