Data Ingestion for Enterprise Data Platforms

http://hortonworks.com/wp-content/uploads/2014/03/Oil-and-Gas-Ref-Arch.png

The Ingestion Box in the reference architecture is displayed as the smallest box.  However, this is the component that integrates with all the available data sources.  This tends to be among the most complex and time consuming task, but tends to be relegated to a lower priority which is a big mistake.

One needs to prioritize the data sources that generate maximum value and ensure we can ingest the data into the Big Data platform for subsequent “cool” analytics.

In my experience, it is also extremely important to have a robust User Interface for the ingestion section.  Otherwise, there could be a series of manual steps leading to errors and ingestion of “bad” data that will minimize impact of subsequent analytics.

 

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: