
Blog
Chris Kruse Blog - Insights regarding eDiscovery Best Practices
Concept Search & eDiscovery - a confused market - July 22, 2008
The market is flooded with advanced technology "tools" that provide insight into the content at issue in legal and compliance matters. These tools are useful and interesting. However, without a unified evidence management strategy, they are just tactical point solutions that lack strategic importance.
I continue to hear that concept search is now a critical component of any eDiscovery strategy. I agree that advanced technology that discovers additional meaning in large bodies of information is helpful in the litigation and compliance context, but I am bemused by the definitions being thrown around in the industry.
To further complicate the conversation, the term "analytics" has emerged as either a parallel or competing definition. Document analytics, content analytics, etc., are again all important and interesting issues, but our market lacks a common framework within which to appropriately discuss these themes. In order to help the people who buy and use all of our solutions, the eDiscovery industry should try to eliminate overlapping, redundant and vague definitions.
Further, many of these tools are poorly defined in the market. Here is my take on the appropriate framework to define and discuss these offerings.
I believe concept search can be broken down into four buckets: (1) advanced search, (2) categorization, (3) classification, and (4) visualization. Let's see if this framework is meaningful:
Advanced Search (aka Concept Search)
Technology that allows for the search and retrieval of content is pretty vanilla. All products have keyword, wildcard, Boolean and phrase searching. These are key in the legal market to balance precision and recall requirements. The newer technology that is now available provides for the extension of vanilla technology into more conceptual or fuzzy algorithms that provide additional search and retrieval capabilities. A quick example of this would be to do a search for George Bush, and retrieve "rose garden" and "white house" in documents, even if the term "George Bush" was not mentioned.
Can we all agree that this is the primary benefit that concept search technology, as defined above, brings to the legal market?
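To make the distinction concrete, here is a minimal sketch in Python of vanilla phrase search next to a concept-style search. It assumes a hand-written table of related terms (RELATED_TERMS) and three made-up documents; both are purely illustrative. Real concept search products derive these associations from the content itself rather than from a dictionary like this one.

```python
# Hypothetical related-term table. Real concept search engines learn such
# associations statistically; this hand-written mapping is an assumption.
RELATED_TERMS = {
    "george bush": ["rose garden", "white house"],
}

# Made-up sample documents for illustration only.
DOCUMENTS = {
    "doc1": "The announcement was made in the Rose Garden this morning.",
    "doc2": "George Bush signed the bill on Tuesday.",
    "doc3": "Quarterly sales figures for the chemical division.",
}


def keyword_search(query, documents):
    """Return ids of documents that contain the query phrase verbatim."""
    q = query.lower()
    return sorted(doc_id for doc_id, text in documents.items() if q in text.lower())


def concept_search(query, documents, related=RELATED_TERMS):
    """Return ids of documents that contain the query or any related term."""
    terms = [query.lower()] + related.get(query.lower(), [])
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if any(term in text.lower() for term in terms)
    )


print(keyword_search("George Bush", DOCUMENTS))  # ['doc2']
print(concept_search("George Bush", DOCUMENTS))  # ['doc1', 'doc2']
```

The phrase search misses doc1 because the name never appears in it, while the concept-style search retrieves it through the related term "rose garden".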
Categorization
I see three types of categorization: The first is "clustering" - defined as a fully automated function (read - an algorithm) that automatically puts content into a series of categories to help speed review and give users a clue to what they have. Sometimes these clusters are organized in hierarchies to further aid in the business use case.
Second is "trained" or "supervised" categorization. In this definition, the user provides the algorithm with a set of content that represents the core issues and context of what the user is trying to understand or retrieve. This content forms the basis for the creation of the taxonomy (or information hierarchy). Once this taxonomy is completed using this highly relevant set of content, the remainder of the corpus is provided to the algorithm, with instructions to put it all into the highly-customized taxonomy.
Third is "manual" categorization, where the taxonomy is provided 100% by the user and the body of data is run against this pre-built taxonomy, so that the content is placed into predefined buckets. A good example of this would be taking the American Medical Association medical taxonomy (or other vertical market standardized information organizational structure) and pushing your data against it to see where it goes.
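As a rough illustration of the "trained" variety, the sketch below builds a term profile for each category from a handful of user-supplied example documents and then assigns new documents to the best-matching category. The category names, sample texts, stop-word list and scoring rule are all assumptions made up for this example; commercial products use far more sophisticated models.

```python
from collections import Counter

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "and", "for", "on", "of", "a", "to"}


def tokenize(text):
    """Lowercase the text, strip punctuation, and drop stop words."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w and w not in STOP_WORDS]


def train(examples):
    """Build a term-frequency profile per category from example texts."""
    return {
        category: Counter(t for text in texts for t in tokenize(text))
        for category, texts in examples.items()
    }


def categorize(text, profiles):
    """Assign the text to the category whose profile overlaps it most."""
    tokens = tokenize(text)
    scores = {cat: sum(profile[t] for t in tokens) for cat, profile in profiles.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "uncategorized"


# Hypothetical training set supplied by the user.
examples = {
    "pricing": [
        "Discount schedule and price list for distributors",
        "Updated price quote for the contract",
    ],
    "safety": [
        "Plant safety inspection report",
        "Incident report on the chemical spill",
    ],
}
profiles = train(examples)

print(categorize("Please review the revised price list", profiles))   # pricing
print(categorize("Spill inspection scheduled for Monday", profiles))  # safety
print(categorize("Lunch menu", profiles))                             # uncategorized
```

Manual categorization would look the same from the categorize step onward, except that the profiles would come from a pre-built taxonomy rather than from training examples.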
Classification
Basically this is document "tagging" based on metadata attributes. Some classification comes from system-type information (file type, file name, location, etc.) and some might actually be generated from policy information provided inside of a content management system. This is generally very limited additional data, but can be helpful when trying to group documents into meaningful categories.
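A minimal sketch of this kind of tagging, using only system-type information derived from a file path. The extension-to-type mapping, the tag names and the sample paths are assumptions for illustration; a real system would read richer metadata and policy information from the content management system.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Illustrative extension-to-type mapping (an assumption, not a standard).
FILE_TYPE_TAGS = {
    ".msg": "email",
    ".pst": "email",
    ".doc": "word-processing",
    ".xls": "spreadsheet",
    ".pdf": "pdf",
}


def classify(path):
    """Tag a document using only metadata derived from its path."""
    p = PurePosixPath(path)
    return {
        "file_name": p.name,
        "file_type": FILE_TYPE_TAGS.get(p.suffix.lower(), "other"),
        "location": str(p.parent),
    }


def group_by_tag(paths, tag):
    """Group document paths by the value of one metadata tag."""
    groups = defaultdict(list)
    for path in paths:
        groups[classify(path)[tag]].append(path)
    return dict(groups)


# Hypothetical file paths.
paths = [
    "/shares/finance/Q2_forecast.XLS",
    "/shares/legal/hold_notice.msg",
    "/shares/finance/budget.xls",
]

print(classify(paths[0]))
print(group_by_tag(paths, "file_type"))
```

Grouping by "location" instead of "file_type" would gather the two finance files together, which shows how even limited metadata can produce useful groupings.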
Visualization
Finally, there is growing interest in tools that provide the end user some sort of visual representation of the content before them. If a picture is worth a thousand words, then why not apply a picture to the thousands - or millions - of documents in your database? We all look at stock charts, for example, on Google or Yahoo that dynamically and visually analyze a stock price according to a user-defined set of criteria. This is much easier than evaluating a table of numbers. This same theory holds true for content. Date ranges, e-mail traffic, timelines, etc., all are interesting and useful ways of reviewing content.
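Even a crude text chart shows the idea. The sketch below counts e-mails per day and renders the counts as bars; the dates are invented for illustration, and a real tool would of course draw an interactive timeline rather than rows of characters.

```python
from collections import Counter
from datetime import date


def volume_by_day(sent_dates):
    """Count how many e-mails were sent on each day."""
    return Counter(sent_dates)


def render_bars(counts, width=40):
    """Render one text bar per day, scaled so the busiest day fills `width`."""
    peak = max(counts.values())
    lines = []
    for day in sorted(counts):
        bar = "#" * max(1, round(counts[day] / peak * width))
        lines.append(f"{day.isoformat()} {bar} {counts[day]}")
    return "\n".join(lines)


# Hypothetical sent dates for six e-mails.
sent = [date(2008, 6, 1)] * 2 + [date(2008, 6, 2)] * 4

print(render_bars(volume_by_day(sent), width=4))
# 2008-06-01 ## 2
# 2008-06-02 #### 4
```

A spike in traffic around a key date is far easier to spot in a chart like this than in a table of counts.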
Missing Link
These four categories of tools are in use today, but despite all of the hype and noise in the market, the full value of these technologies remains untapped unless they are integrated into a unified evidence management platform. As point solutions they represent the hammer or the nail or the screwdriver - but corporate counsel need an overall system that, at the appropriate time, can seamlessly employ these various tools to get a job done AND manage all other aspects of eDiscovery.
Sure, CaseCentral provides all sorts of great tools that deliver advanced functionality, but their value would be very limited without the secure multi-party architecture, configurable litigation workflow engine, process management and performance analytics we deliver. It's about the strategy.
Merits v. Mechanics - June 27, 2008
Since the introduction of the revised FRCP in December 2006, the process mechanics of how a Corporation responds to litigation and regulatory inquiries have been as important as, if not more important than, the actual merits of a case. Companies can no longer focus solely on justice and holding the guilty party responsible, but must also ensure that both litigation readiness and the overall process of eDiscovery are handled correctly and defensibly, as well as in a timely and consistent manner, in order not to jeopardize the outcome of a case.
A case in point: It was reported today, on the morning of the second day of LegalTech West, that Celanese, a large Dallas-based chemical company, and its law firm, Kaye Scholer, are in a dispute regarding discovery production problems. The allegation is that Kaye Scholer screwed up and didn't produce a large number of documents in the case, and as a result, Celanese had to settle the matter for an "inflated" dollar amount - $107 million.
The real issue here is the effect of poor discovery management on the end result: the actual settlement and the damages amount. While this could simply be posturing by Celanese to gain some leverage in the legal activity, the point I am making is that the "mechanics" of discovery management can distract from the actual "merits" of a case.
John W. Woods, an accomplished lawyer and a partner at Hunton & Williams whom I respect a great deal, once told me that litigation is about one thing. Leverage. Litigators are trained experts in establishing leverage in legal activity all day - every day. This practice is a highly evolved game of chess, with very real winners and losers. All the new modern weaponry of war is deployed in this quest for leverage.
Generally speaking, the legal teams on both sides are focused on the merits of a particular case - the truth - what really happened. And, the quest to achieve this is a well-oiled, multi-tiered process that has rules and guidelines to get to the end result. However, during this process, if the rules are not followed, then the case stops being about the "merits", and starts being about the rules of the game, i.e. the "mechanics."
Plaintiff lawyers are famous for their command of the rules of engagement in litigation, and often will exploit them to gain leverage. A case might actually be completely frivolous, but if you can exploit the challenges of email discovery, for example, for a company with 250,000 employees who send 10 million emails per day, then the merits of the case don't matter - really.
Why am I talking about this? Because the better a Company can define, build, deploy and audit its process for managing litigation discovery, the lower the risk that a meritless case will produce an opportunity for the opponent to gain unwarranted leverage in a legal matter.
Further, as law firms are at risk of both disappointing their clients, and causing them real harm in an inflated settlement, it makes sense for a Corporation to control their own destiny in managing their own discovery. That is the mission here at CaseCentral - to enable Corporations to effectively manage their discovery and reduce the risks associated with the mechanics of the process.
How much is it worth to a corporation to reduce this risk? Celanese thinks it is a big number. How much did the alleged discovery production problems inflate the settlement? $5 million? $10 million?
Corporations and their outside counsel are taking advantage of modern eDiscovery warfare strategy and tools to fight back against the reality of "mechanics" process issues in the daily operation of legal activity. Why not own a unique multi-matter platform that enables a repeatable, defensible process for all matters, reducing time, cost and risk while increasing quality?
Please send me comments to ckruse@casecentral.com
Welcome to the Chris Kruse blog on the CaseCentral site:
I founded CaseCentral in 1994 with a vision of how to drive efficiency into a labor-intensive, manual process. I am pleased to say that 14 years later, we continue to innovate and automate in the legal technology marketplace.
The market has changed dramatically in 14 years, with the arrival of both the internet and email during the history of CaseCentral. What was once a large project of 500,000 "pages" of information is now eclipsed by document and email collections of more than 100 million "documents". In addition to the quantity of content that shows up as evidence, the complexity of the information is also different. An email thread, for example, with its attachments and forwarding history, makes the analysis of the information more challenging.
There is a growing sense in the eDiscovery market that the fundamental economics of the business are at risk. The common practice of charging per-gigabyte fees for electronic discovery processing and subsequent value-added services is now being scrutinized. If you believe, like Intel's Gordon Moore, that a major generation in technical advancement occurs every 18 months, then technology should be driving the costs down at each stage. The market should pay for technology that enables the business process, rather than for bloated services that utilize available technology.
More on this in future postings. I leave you with the following provocative question. How many dollars are spent each day for redundant and ineffective discovery management? We believe the annual number is measured in billions.
Please send me comments to ckruse@casecentral.com







