Why do Big Data projects fail, and how do you make them succeed?

Bram Neijt
Jul 28, 2017 · 4 min read

The goal of Big Data is to automate the delivery of actionable business insights from data. To get there, you often end up wanting diverse data sources, large data sets and a vast amount of computational power. However, most of these are symptoms of an approach, not prerequisites of the goal.

This often leads higher management to focus on the tools used by the competition, instead of on why the competition is using those tools in the first place and what steps are required to end up in the same league.

To emphasize this, I thought it would be easy to go through the points from the 2013 post Why does SCRUM fail? and lazily substitute SCRUM with Big Data to highlight my point.

Big Data, no doubt, is one of the most viable and potentially applicable approaches to managing insight development projects. Being a strongly automated approach, Big Data has multiple advantages, which is why an increasing number of companies have recently implemented it or intend to; however, implementation does have its baggage. Although the facts and figures suggest that Big Data has been successful in developing and delivering high-quality, business-valued insight, there have been instances where it has failed to bear fruit. Data Scientists as well as Data Strategists across the globe have been racking their brains to find out how and why Big Data fails to deliver.

An analysis of stories about some failed Big Data projects revealed the following:

  1. Big Data is neither a Magic Wand nor a Silver Bullet: The hype that comes with the promotion of any new approach or technology has led higher management to harbor the assumption that Big Data is a silver bullet. However, to be precise and practical, Big Data is not a magic wand or potion that puts an end to all types of problems. Big Data is just a methodology which delineates the processes and practices that help in managing large data processing. No process, technique, or methodology will solve all your data analysis problems. Though tempting, expecting a single silver-bullet solution that will kill all insight problems is unrealistic.
  2. Inappropriate application of Big Data can lead to its doom: Big Data is not a prescriptive method, but a suggestive approach to data analysis. So, the way it is implemented makes all the difference. The team practicing Data Science should be well aware of Big Data principles as well as of Big Data’s suitability to the problem at hand. People implementing Big Data should be aware of the strengths and weaknesses of the approach they are applying. Ambiguity and vagueness about the approach, the capabilities, or both can result in confusion and rework, which in turn means increased production cost and delayed delivery on commitments.
  3. The problems highlighted by Big Data need to be solved: One of the significant advantages of Big Data is that it reveals problems at quite an early stage in the development process; however, “knowing is [only] half the battle.” The more important task is to solve the problems and remove the impediments that have surfaced. Some companies, however, make no effort to deal with these problems; they are either ignored or hidden until the embedding phase of the project. It is this procrastination that leads to delays in delivery or failures to meet commitments.
  4. Lack of a skilled and efficient project team: A skilled and efficient project team is required to implement Big Data effectively and successfully. All the roles (Engineer, Scientist, Architect and IT support) need to be aware of the Big Data principles and facilitate them as effectively as possible. Also, the team members should be technically sound and experts in their fields; that is, the scientists should have expertise in the technologies to be used, while the engineer should possess all the relevant and valuable business information about the systems required to embed the result. Part of a successful implementation of Big Data relies on the team members being “generalists/specialists”; they need to be cross-trained enough to allow for a smooth transition from the domain-specific concept towards automated embedded insights.
  5. Lack of an experienced and visionary Data Scientist: The Data Scientist is responsible for introducing new ways of looking at the data and proving its value. He/she is responsible for maximizing that value and will often push the requirements of production to their limits. So, it is essential that the Data Scientist is efficient, experienced and visionary. In other words, he/she should be well versed in data analysis techniques and well aware of their computational complexity.

If you have read Why does SCRUM fail?, you should see the resemblance. The ease with which SCRUM can be exchanged for Big Data shows that new and innovative approaches face similar obstacles. Embrace change, collaborate, push for embedding and focus on team interactions over processes and tools.

BigData Republic provides these types of Big Data solutions. We are experienced in advanced predictive modeling and deploying large-scale big data pipelines. Interested in what we can do for you? Feel free to contact us.


