Hadoop/HBase vs RDBMS

March 27, 2012

A good presentation comparing Hadoop/HBase with traditional RDBMSs.


Typical Objections to Using Predictive Coding

March 23, 2012

Consistent Review Criteria

Users worry that any errors made while coding the sample set will be amplified across the full document population.

True predictive coding solutions (e.g. the Servient eDiscovery solution) include an integrated refinement process that continuously learns and adjusts, drastically improving document coding accuracy.


Plaintiffs’ attorneys have an incentive to cast a wide net when looking for potential documents.

Does anybody have thoughts on how to address this issue?


Defensibility in court.

For Predictive Coding to be effective, its use must be defensible in court.

In February 2012, Judge Peck issued what many described as the first judicial opinion in which a court expressly approved the use of computer-assisted review. He wrote: “This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.”

Benefits of Predictive Coding

March 15, 2012

1 – Cost Savings

By automatically identifying relevant documents, eDiscovery projects utilizing this technology result in substantially fewer billable hours. A small team of more senior lawyers is still required to “train” the software on what is relevant, but traditional eDiscovery techniques require a larger group of less experienced lawyers reviewing far more documents at a far higher cost.


2 – Faster Document Review

Because predictive coding accelerates the review process, decision makers gain earlier access to pertinent information, which can reduce litigation costs compared with identifying the same issues later in the process. A more recent trend is that eDiscovery timelines are being compressed by the exponential growth in data, so the ability to quickly process large volumes of documents is critical.


3 – Holistic Pattern Recognition

With manual reviews, individual reviewers see their documents in relative isolation from what their fellow reviewers are discovering, so underlying connections between separate documents may go unnoticed. With predictive coding, the initial review is performed through a single lens, and patterns or consistencies may be identified that would not have been observable to an individual reviewer looking at a relatively small set of documents.

4 – Ability to respond to changes

If the criteria for responsive documents change, predictive coding can quickly be “trained” on the new criteria and re-applied to the document set, rather than requiring manual reviewers to begin the process anew.


5 – Greater Cost Transparency

With predictive coding, clients can determine early on how many documents will require further review, enabling them to estimate the cost of engagements more accurately.
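The “train on a coded sample, apply to the full population, retrain when criteria change” workflow behind these benefits can be sketched with a toy text classifier. This is only an illustration in plain Python — the sample documents, labels, and the Naive Bayes model below are my own stand-ins, not how any particular eDiscovery product (Servient’s included) actually works.

```python
import math
from collections import Counter, defaultdict


def tokenize(doc):
    return doc.lower().split()


class NaiveBayes:
    """Minimal multinomial Naive Bayes text classifier with Laplace smoothing."""

    def fit(self, docs, labels):
        self.labels = set(labels)
        self.prior = Counter(labels)                       # documents per code
        self.word_counts = {l: Counter() for l in self.labels}
        self.totals = defaultdict(int)                     # tokens per code
        self.vocab = set()
        for doc, label in zip(docs, labels):
            toks = tokenize(doc)
            self.word_counts[label].update(toks)
            self.totals[label] += len(toks)
            self.vocab.update(toks)
        return self

    def predict(self, doc):
        best, best_lp = None, float("-inf")
        n, v = sum(self.prior.values()), len(self.vocab)
        for label in self.labels:
            # log prior + smoothed log likelihood of each token
            lp = math.log(self.prior[label] / n)
            for tok in tokenize(doc):
                lp += math.log(
                    (self.word_counts[label][tok] + 1) / (self.totals[label] + v)
                )
            if lp > best_lp:
                best, best_lp = label, lp
        return best


# Senior reviewers hand-code a small sample set (hypothetical documents)
sample = [
    "merger agreement draft",
    "quarterly earnings review",
    "lunch menu friday",
    "office party invite",
]
codes = ["responsive", "responsive", "not_responsive", "not_responsive"]

clf = NaiveBayes().fit(sample, codes)

# The trained model is then applied to the wider document population
print(clf.predict("draft merger earnings"))   # → responsive
```

Benefit 4 falls out of the same structure: if the review criteria change, you re-code the sample and call `fit` again, rather than re-reviewing the whole population by hand.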

Who should pay for eDiscovery costs?

March 9, 2012

This is a significant ruling because it provides guidance to litigants on the scope of their e-discovery obligations and on the factors required for cost shifting to be ordered.



Patents related to Predictive Coding

March 2, 2012

System and method for establishing relevance of objects in an enterprise system  –   http://goo.gl/uGRU8

Systems and methods for predictive coding   –   http://goo.gl/nlEhT

System and method for providing information navigation and filtration – http://goo.gl/uGRU8

What is Big Data?

March 2, 2012

It is hard to read a technology article today without encountering “Big Data”. A number of technologies and terms get mentioned in the context of Big Data, with Hadoop chief among them, “data scientist” often not far behind, and sometimes NoSQL thrown in for good measure.

But what really is Big Data?

Big Data is about the technologies and practice of handling data sets so large that conventional database management systems cannot handle them efficiently, and sometimes cannot handle them at all.

Is this a good definition?  Is there a better definition?
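One way to make the definition concrete is the MapReduce pattern that Hadoop popularized: split the data into pieces, process each piece independently (potentially on a different machine), then combine the partial results. The sketch below is a toy single-process word count in Python — the shards are hypothetical stand-ins for data blocks spread across a cluster, not real Hadoop code.

```python
from collections import defaultdict

# Hypothetical corpus shards, standing in for blocks stored across a cluster
shards = [
    "big data big hype",
    "data sets too big for one machine",
]

# Map: each shard independently emits (word, 1) pairs. Because no shard
# depends on another, this step can run in parallel across machines --
# which is what lets the approach scale past a single database server.
mapped = [(word, 1) for shard in shards for word in shard.split()]

# Shuffle: group the pairs by key (word)
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce: combine each group's partial counts into a total
totals = {word: sum(counts) for word, counts in groups.items()}

print(totals["big"])    # → 3
print(totals["data"])   # → 2
```

Hadoop's contribution is running the map and reduce steps reliably over thousands of machines, with the framework handling the distribution and failure recovery.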

1.2 zettabytes this year

March 1, 2012

The Digital Universe will grow to 1.2 million petabytes, or 1.2 zettabytes, according to IDC’s annual report.

This is 1,200,000,000,000,000,000,000 bytes.
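The unit conversion checks out: a petabyte is 10^15 bytes and a zettabyte is 10^21 bytes (decimal/SI units), so 1.2 million petabytes is exactly 1.2 zettabytes. Verified in exact integer arithmetic:

```python
PETABYTE = 10**15   # bytes, SI (decimal) units
ZETTABYTE = 10**21
MILLION = 10**6

# 1.2 million petabytes, kept as 12/10 to stay in exact integers
digital_universe = 12 * MILLION * PETABYTE // 10

print(digital_universe == 12 * ZETTABYTE // 10)   # → True
print(f"{digital_universe:,}")                    # → 1,200,000,000,000,000,000,000
```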


First judicial decision approving use of computer-assisted review

March 1, 2012

On February 24, US Magistrate Judge Andrew Peck approved the use of computer-assisted review.

See his opinion at  http://goo.gl/7d3DV