Good presentation comparing Hadoop/HBase with RDBMS
Consistent Review Criteria
Users point out that any errors introduced during the coding of the sample set will be amplified across the document population.
True Predictive Coding solutions (e.g., the Servient eDiscovery solution) include an integrated refinement process that continuously learns and adjusts to drastically improve document coding accuracy.
Plaintiffs' attorneys have an incentive to cast a wide net when looking for potentially relevant documents.
Does anybody have any thoughts on how to address this issue?
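One simple way to read the concern about sample-set errors being "amplified" is as a straight projection: if the sample is coded with a certain error rate and the software carries that rate across the whole population, the number of mis-coded documents scales with the population size. The sketch below assumes exactly that naive model; the function name and the figures in the example are hypothetical and are not taken from any particular product.

```python
def projected_miscoded(population_size: int, sample_size: int, sample_errors: int) -> int:
    """Project the sample's coding error rate onto the whole document population.

    Assumption (illustrative only): errors propagate at the same rate as in the sample.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= sample_errors <= sample_size:
        raise ValueError("sample_errors must be between 0 and sample_size")
    if population_size < 0:
        raise ValueError("population_size must not be negative")
    # Same error rate, much larger document set.
    return round(population_size * sample_errors / sample_size)


# Hypothetical figures: 40 mis-coded documents in a 2,000-document sample,
# projected onto a population of 1,000,000 documents.
print(projected_miscoded(1_000_000, 2_000, 40))  # 20000
```

Under this reading, a handful of mistakes in the sample becomes thousands of mis-coded documents, which is why a refinement step that catches and corrects them early matters.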
Defensibility in court.
For Predictive Coding to be effective, its use must be defensible in court.
In February 2012, Judge Peck provided what many people described as the first judicial opinion in which a court has expressly approved the use of computer-assisted review. He specifically wrote, “This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.”
1 – Cost Savings
By automatically identifying relevant documents, eDiscovery projects that use this technology require substantially fewer billable hours. A small team of more senior lawyers is still required to “train” the software on what is relevant, but traditional eDiscovery techniques require a larger group of less experienced lawyers to review far more documents, which costs more.
2 – Faster Document Review
Because predictive coding accelerates the review process, decision makers get earlier access to pertinent information, which may reduce litigation costs compared with identifying the same issues later in the process. A more recent trend is that eDiscovery engagement timelines are being compressed due to the exponential rise in data usage, so the ability to process large amounts of data quickly is critical.
3 – Holistic Pattern Recognition
With manual reviews, individual reviewers see their documents in relative isolation from what their fellow reviewers are discovering. Underlying connections between separate documents may not be observable. With predictive coding, the initial review is performed through a single lens, so certain patterns or consistencies may be identified that would not have been observable to an individual reviewer looking at a relatively small set of documents.
4 – Ability to Respond to Changes
If the criteria for responsive documents change, the predictive coding software can be quickly retrained and re-applied to the document set under the new criteria, rather than manual reviewers having to begin the process anew.
5 – Greater Cost Transparency
With predictive coding, clients are able to determine the number of documents that will require further review. This could enable them to estimate the cost of engagements more accurately.
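The estimate described above comes down to simple arithmetic once the number of documents needing further review is known. The sketch below is a minimal illustration under assumed inputs: the review speed and hourly rate are hypothetical parameters, not figures from any real engagement.

```python
def estimate_review_cost(docs_flagged: int, docs_per_hour: float, hourly_rate: float) -> float:
    """Estimate the cost of manually reviewing the documents flagged for further review.

    All three inputs are assumptions supplied by the caller.
    """
    if docs_flagged < 0:
        raise ValueError("docs_flagged must not be negative")
    if docs_per_hour <= 0:
        raise ValueError("docs_per_hour must be positive")
    if hourly_rate < 0:
        raise ValueError("hourly_rate must not be negative")
    review_hours = docs_flagged / docs_per_hour
    return review_hours * hourly_rate


# Hypothetical figures: 50,000 flagged documents, 50 documents per reviewer-hour,
# and a rate of 200 per hour.
print(estimate_review_cost(50_000, 50, 200))  # 200000.0
```

Because the flagged-document count is known up front, the same calculation can be rerun with different staffing or rate assumptions before the engagement starts.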
Significant ruling because it provides guidance to litigants on the scope of their e-discovery obligations and the factors required for cost shifting to be ordered.
With all the hype around Big Data, it is hard to read a technology article today without encountering the term. A number of technologies and terms get mentioned in the context of Big Data, with Hadoop chief among them, “data scientist” often not far behind, and sometimes NoSQL thrown in for good measure.
But what really is Big Data?
Big Data is about the technologies and practice of handling data sets so large that conventional database management systems cannot handle them efficiently, and sometimes cannot handle them at all.
Is this a good definition? Is there a better definition?
The Digital Universe will grow to 1.2 million petabytes, or 1.2 zettabytes, according to IDC’s annual report.
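The two figures quoted above are the same quantity in different units, assuming decimal (SI) prefixes: 1 petabyte = 10^15 bytes and 1 zettabyte = 10^21 bytes, so one zettabyte is one million petabytes. A quick check, with a hypothetical helper name:

```python
from fractions import Fraction

BYTES_PER_PETABYTE = 10**15   # SI (decimal) prefix
BYTES_PER_ZETTABYTE = 10**21  # SI (decimal) prefix


def petabytes_to_zettabytes(petabytes: int) -> Fraction:
    """Convert petabytes to zettabytes exactly, using rational arithmetic."""
    return Fraction(petabytes * BYTES_PER_PETABYTE, BYTES_PER_ZETTABYTE)


print(float(petabytes_to_zettabytes(1_200_000)))  # 1.2
```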
On Feb 24, U.S. Magistrate Judge Andrew Peck approved the use of computer-assisted review.
See his opinion at http://goo.gl/7d3DV