Thursday, June 1, 2017
The eDiscovery world has been atwitter over the decision in FCA US, LLC v. Cummins, Inc., No. 16-12883 (E.D. Mich. Mar. 28, 2017), in which Michigan District Judge Avern Cohn held, “rather reluctantly,” that “[a]pplying TAR to the universe of electronic material before any keyword search reduces the universe of electronic material is the preferred method.”
While the underlying briefs and pleadings may contain a more detailed explanation, the resulting commentary has focused on the need for a more precise definition of what is actually meant by the broad term TAR (Technology Assisted Review).
The Need to Define TAR
For example, consider the words of Maura Grossman, the aptly dubbed Queen of Search, in a recent interview with Artificial Lawyer. Maura commented, “It is difficult to know how often TAR is used given the confusion over what ‘TAR’ is (and is not), and inconsistencies in the results of published surveys.”
eDiscovery guru Ralph Losey covered this same point in a recent blog post, saying, “The passive type of machine learning that some vendors use under the name Analytics is NOT the same thing as Predictive Coding. These passive Analytics have been around for years and are far less powerful than active machine learning.”
Why the Decision Requires Further TAR Definition
In the Cummins matter mentioned earlier, the parties couldn’t agree on whether the universe of electronic material subject to TAR review should first be culled using search terms. The defendant took the position that pre-TAR culling by keywords was appropriate; the plaintiff disagreed. Judge Cohn stated that “…having reviewed the letters and proposed orders together with some in-house technical assistance including a read of The Sedona Conference TAR Case Law Primer, 18 Sedona Con. J. ___ (forthcoming 2017), the Court is satisfied that … applying TAR to the universe of electronic material before any keyword search reduces the universe of electronic material is the preferred method. The TAR results can then be culled by the use of search terms or other methods.”
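To make the ordering dispute concrete, here is a toy sketch (invented documents and keywords, not any vendor’s actual TAR algorithm) in which “TAR” is a stand-in scorer that ranks documents by word overlap with reviewer-marked relevant examples. It illustrates why the sequence matters: keyword culling first can exclude a relevant document before the TAR tool ever scores it, while TAR first lets that document surface, with culling applied afterward.

```python
# Toy illustration of pipeline order (hypothetical data, simplified scoring).

def keyword_cull(docs, keywords):
    """Keep only documents containing at least one agreed keyword."""
    return {doc_id: text for doc_id, text in docs.items()
            if any(kw in text.lower() for kw in keywords)}

def tar_rank(docs, relevant_examples):
    """Rank documents by word overlap with reviewer-labeled relevant text."""
    seed_vocab = set()
    for example in relevant_examples:
        seed_vocab.update(example.lower().split())
    scores = {doc_id: len(seed_vocab & set(text.lower().split()))
              for doc_id, text in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)

docs = {
    "D1": "Engine defect report citing the warranty claim",
    "D2": "Quarterly cafeteria menu and parking notice",
    "D3": "Supplier letter about the faulty powertrain component defect",
}
keywords = ["engine", "warranty"]          # the negotiated search terms
seeds = ["defect in the engine powertrain component"]  # reviewer-coded example

# Keywords first: D3 lacks both terms, so it is culled before TAR sees it.
print(sorted(keyword_cull(docs, keywords)))   # ['D1']

# TAR first: D3 ranks highest and can still be culled by terms afterward.
print(tar_rank(docs, seeds))                  # ['D3', 'D1', 'D2']
```

The point is not that either ordering is always right, only that the two workflows can produce different review populations from the same collection.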
So, should we then always use TAR first and THEN use keywords? Two other authorities make relevant counterpoints.
First, the EDI-Oracle Study found that in some instances (and I emphasize the word “some”) humans did better than computers in retrieving documents.
Second, in the case of Bridgestone Americas, Inc. v. International Business Machines Corporation, No. 3:13-1196 (M.D. Tenn. July 22, 2014), Magistrate Judge Brown considered whether allowing predictive coding was an acceptable modification of the case management order and said that “in the final analysis, the use of predictive coding is a judgment call …”
Bridgestone also cited Progressive Casualty Insurance Company v. Delaney (D. Nev. May 20, 2014), where Magistrate Judge Peggy A. Leen turned down a request to order the use of predictive coding. Despite her view that predictive coding could be the more useful tool, Judge Leen noted that the use of keyword culling prior to predictive coding could be appropriate under Rule 26, depending on many factors, including “the type of data, the value of the case juxtaposed to the cost of using advanced analytics, and other factors that are matter specific.”
In all fairness, many commentators think that the position set forth in Bridgestone and Progressive has changed since the 2015 FRCP Amendments. This position is best laid out in an article by Sean Livesay, in which he argues that “Employing predictive coding at the outset provides significantly more accurate results in identifying relevant documents than keyword culling. Predictive coding utilizes sophisticated technology which can more accurately predict relevant documents, beyond the simplistic search terms used in keyword culling.”
Experts’ Definition of TAR
But what exactly is this “sophisticated technology”? Once again, it depends on the tool. Some software depends on seed sets of key documents prepared by attorneys working on the litigation. And how do they gather those seed sets? In many cases, by employing keyword searches!
Other tools work with concepts (sentences and paragraphs) rather than single words or short phrases, using mathematical algorithms to find patterns in the text. You will recall that Ralph Losey said, “…concept search methods…are powerful search tools…But they are not predictive coding. They do not rank documents according to your external input, your supervision. They do not rely on human feedback. They group documents according to passive analytics of the data.”
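Losey’s distinction between passive analytics and active machine learning can be sketched in a few lines. The following is a deliberately simplified illustration (invented documents and a naive word-weight scorer, not any product’s algorithm): the defining feature of the “active” approach is that each round of reviewer coding re-trains the model and re-ranks the whole collection.

```python
# Minimal sketch of active-learning-style ranking driven by human feedback.
from collections import defaultdict

def retrain(labeled):
    """Learn per-word weights from reviewer-coded (text, is_relevant) pairs."""
    weights = defaultdict(int)
    for text, is_relevant in labeled:
        for word in text.lower().split():
            weights[word] += 1 if is_relevant else -1
    return weights

def rank(docs, weights):
    """Order documents by summed word weights, likeliest-relevant first."""
    score = lambda text: sum(weights[w] for w in text.lower().split())
    return sorted(docs, key=lambda d: score(docs[d]), reverse=True)

docs = {
    "A": "memo on the brake defect recall",
    "B": "invoice for office chairs",
    "C": "email about the brake recall schedule",
}

# Round 1: the reviewer has coded a single document relevant.
weights = retrain([("brake defect recall", True)])
print(rank(docs, weights))            # ['A', 'C', 'B']

# Round 2: more reviewer feedback arrives and the ranking is re-trained --
# this supervision loop is what a passive concept-grouping tool lacks.
weights = retrain([("brake defect recall", True),
                   ("invoice for office chairs", False),
                   ("recall schedule email", True)])
print(rank(docs, weights))            # ['C', 'A', 'B']
```

A passive analytics tool, by contrast, would group A and C together on text similarity alone and stop there; no amount of reviewer coding would change its output.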
If these experts can’t agree on what constitutes TAR, or even when to use the term, then how can we expect a judicial opinion using such an ill-defined term to set a technological standard?
In my opinion, the best comment is still that by Magistrate Judge Andrew Peck in Hyles v. City of New York, No. 10 Civ. 3119 (S.D.N.Y. Aug. 1, 2016). In an Order in which he referred to himself as a “judicial advocate” of using TAR in appropriate cases (my emphasis added), Judge Peck cited Sedona Principle 6: “Responding [producing] parties are best situated to evaluate the procedures, methodologies, and technologies appropriate for preserving and producing their own electronically stored information.”
Although the judge acknowledged the 2015 Amendments to the Federal Rules of Civil Procedure, he didn’t feel they dictated the use of TAR in Hyles. As he put it, “There may come a time when TAR is so widely used that it might be unreasonable for a party to decline to use TAR. We are not there yet.”
As a final note, I would point out that some of the authorities cited above use the term “predictive coding” rather than TAR. As both Maura Grossman and Ralph Losey observe in their articles, these terms are often used interchangeably. To my mind, this is just one more example of why we need to be specific about what technology is being discussed.
Until we collectively agree on our definitions, how can we agree on what is the best technology to use?