What's holding back Hadoop?

A survey of data management experts finds that a shortage of qualified staff and the lack of a clearly defined business case are the top obstacles to deploying the open source big data solution.
Hadoop -- the open-source, distributed programming framework that relies on parallel processing to store and analyze both structured and unstructured data -- has been the talk of big data for several years now. And while a recent survey of IT, business intelligence and data warehousing leaders found that 60 percent will have Hadoop in production by 2016, deployment remains a daunting task.
TDWI -- which, like GCN, is owned by 1105 Media -- polled data management professionals in both the public and private sectors, who reported that staff expertise and the lack of a clear business case topped their list of barriers to implementation:
| Barriers to implementation | Respondents who checked each category |
| Inadequate skills or difficulty of finding skilled staff | |
| Lack of compelling business case | |
| Lack of business sponsorship | |
| Lack of data governance | |
| Security for Hadoop data | |
| Lack of metadata management | |
| Excessive hand coding required of Hadoop | |
| Cost of staffing Hadoop admin/development | |
| Cost of implementing a new technology | |
| Difficulty of architecting big data analytic system | |
| Immature support for ANSI-standard SQL | |
| Interoperability with existing systems or tools | |
| Software tools are few and immature | |
| Enterprise-class manageability | |
| Not enough information on how to get started | |
| Slow pace of hand-coded development | |
| Cannot make big data usable for end users | |
| Handling data in real time | |
| Existing user-defined DW architecture | |
| Poor quality of Hadoop data | |
| Software tools need higher-level language support | |
| Hadoop's high operational expenses | |
| Enterprise-class availability | |
| Other | |
The respondents did, however, see a wide range of uses to justify the deployment efforts, including:
| HDFS applications | Respondents who checked each category |
| Complementary extension of a data warehouse | |
| Data exploration and discovery | |
| Data staging for data warehousing and data integration | |
| Data lake | |
| Queryable archive for non-traditional data | |
| Computational platform and sandbox for analytics | |
| Enterprise data hub (for both new and traditional data) | |
| Business intelligence (reporting, dashboards) | |
| Queryable archive for traditional enterprise data | |
| Operational data store (ODS) | |
| Repository for content, records management | |
| Operational application support (apps on Hadoop data) | |
| Don't know | |
| Other | |
And just 6 percent said Hadoop deployments were not in their organization's plans at all:
[Chart: When do you expect to have HDFS in production?]
The full report, which also includes best practices and implementation trends, is available here.




