Greenplum and the Power of One
Exist Software Labs
“Did I disappoint you? Or leave a bad taste in your mouth? You act like you never had love And you want me to go without”
So goes a passage in the chart-topping song “One”, from U2’s very successful Achtung Baby album.
It could also pass for the sentiment of many organizations who jumped in on a Hadoop Big Data romance, only to be met with heart-breaking disappointment. But why was it such a heart-breaker?
1. Hadoop’s performance with small datasets leaves much to be desired. Its file-level processing is good for batch handling of large datasets but constrained for smaller, interactive querying.
2. Hadoop is weak on real-time analytics. As mentioned in point 1, Hadoop is built for batch processing, and its response times are often unacceptable for interactive use. It’s true that Spark and Kafka on top can offer some remedy to this predicament, but the complexity of configuration and maintenance can be quite harrowing.
3. Hadoop is limited in its deployment options. Most Hadoop implementations are on-premises and have not fully embraced the cloud renaissance.
4. Administration-wise, Hadoop’s infrastructure can be a pain in the bottom. Core tools for replication, adding nodes, directory and partition creation, performance tuning, workload management, data distribution, etc., are minimal and often require add-ons. Not to mention the headache that is disaster recovery.
All for One, One for All
It could also be argued that the need to piece together so many disparate software components just to assemble a data analytics platform was a key factor in the decline of Hadoop-based implementations.
Imagine needing to learn a big chunk of these technologies just to get the ball rolling! What if what once looked like this: