May 5, 2014
The Ingestion box in the reference architecture is drawn as the smallest box. However, it is the component that integrates with all the available data sources. This integration tends to be among the most complex and time-consuming tasks, yet it is often relegated to a lower priority, which is a big mistake.
We need to prioritize the data sources that generate the most value and ensure we can ingest their data into the Big Data platform for subsequent “cool” analytics.
In my experience, it is also extremely important to have a robust user interface for the ingestion section. Otherwise, a series of manual steps can lead to errors and the ingestion of “bad” data, which diminishes the impact of subsequent analytics.
December 11, 2012
Which errors does your litigation support vendor commit?
I have found that automated QC can address most of these issues.
December 6, 2012
Is it hard to justify ROI for defensible deletion?
Here is a tip: try to calculate how much money was spent last year processing and reprocessing useless data for eDiscovery purposes, rejecting it time after time at considerable expense. There’s a big chunk of ROI there.
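To make that calculation concrete, here is a rough sketch of the arithmetic. The function name, the flat per-GB rate, and the sample figures are all hypothetical assumptions for illustration; real vendor pricing is rarely this simple, so treat the result as a back-of-the-envelope estimate only.

```python
def wasted_processing_spend(volumes_gb, cost_per_gb, useless_fraction):
    """Estimate money spent processing data that was ultimately rejected.

    volumes_gb: GB handled in each pass (first pass plus every reprocess).
    cost_per_gb: assumed flat processing cost per GB (hypothetical).
    useless_fraction: assumed share of each pass that was rejected, 0 to 1.
    """
    if not 0 <= useless_fraction <= 1:
        raise ValueError("useless_fraction must be between 0 and 1")
    total_gb = sum(volumes_gb)
    return total_gb * cost_per_gb * useless_fraction


# Made-up example: the same 500 GB processed three times at 100 per GB,
# with half of it rejected on every pass.
estimate = wasted_processing_spend([500, 500, 500], 100, 0.5)
```

Whatever number comes out is the spend that defensible deletion could have avoided, which is the portion of ROI the tip points at.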
November 7, 2012
A lot of focus has been placed on delivering automated solutions with no human interaction. However, we need to invest in more than just making machines smarter. We need to train our employees to become more sophisticated consumers of the outputs of their machines. Then the network effect will begin to bring more value out of data than ever before.
The biggest victories in the man-machine framework come when machine learning is appropriately delivered to respect the role of humans.
In the E-Discovery space, the better machine learning products actively learn from documents marked by human reviewers to produce continuously improved results, expediting the review process.
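The idea of learning from reviewer markings can be illustrated with a toy sketch. This is not how any particular product works; the class `FeedbackRanker`, its term-counting scheme, and the sample documents are assumptions made up for illustration. Each reviewer decision nudges term weights up or down, and the unreviewed documents are re-ranked so the likeliest relevant ones surface first.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


class FeedbackRanker:
    """Toy ranker: terms gain weight from documents a reviewer marked
    relevant and lose weight from documents marked not relevant."""

    def __init__(self):
        self.weights = Counter()

    def learn(self, text, relevant):
        """Update term weights from one human review decision."""
        delta = 1 if relevant else -1
        for term in set(tokenize(text)):
            self.weights[term] += delta

    def score(self, text):
        """Sum the learned weights of the distinct terms in a document."""
        return sum(self.weights[term] for term in set(tokenize(text)))

    def rank(self, docs):
        """Return documents ordered from most to least likely relevant."""
        return sorted(docs, key=self.score, reverse=True)


ranker = FeedbackRanker()
ranker.learn("merger pricing agreement", relevant=True)
ranker.learn("lunch menu friday", relevant=False)

unreviewed = ["friday lunch plans", "pricing agreement draft"]
ranked = ranker.rank(unreviewed)  # the pricing document now sorts first
```

In a real review workflow the loop repeats: reviewers mark the top-ranked documents, the model learns from those markings, and the remaining set is ranked again.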
November 3, 2012
We can talk all day long about measuring, researching, and evaluating and still get nowhere. It is time to start using Predictive Coding technology and reap its benefit of lower review costs.
September 9, 2012
Cloud computing is a boon to big data. Paying by consumption destroys the barriers to entry that prohibit many organizations from playing with large datasets, because there’s no up-front investment. In many ways, big data gives clouds something to do.
September 5, 2012
A more comprehensive comparison of Solr and Elasticsearch.
September 4, 2012
A good article comparing the two open-source search server solutions built on top of Lucene.