Kyle Feiner
Sep 18, 2022

STOP! Don’t Use Big Data

Big Data has garnered a lot of attention in recent years, and rightfully so. Big Data, the tools around it, and advanced analytics can turn historical data into valuable insights that feel almost magical. Simply put, you can use descriptive and predictive analytics to supercharge your organization. However, there are many instances where Big Data does not apply and should not be considered.

The first item for consideration is your budget. If you have a small budget, you may not be able to afford to set up a Big Data architecture. For example, a Hadoop cluster may use many computers to supply the processing power required for a Big Data hub. You will also need to account for specialists who are comfortable using a command line interface to manipulate files in the Hadoop Distributed File System (HDFS). Furthermore, the data lake could hold many different formats, such as JSON, XML and CSV, each of which will need to be handled by highly paid professionals. Typically, a systems administrator will be on the payroll just to maintain the Hadoop ecosystem. None of this accounts for the systems downstream of the Hadoop platform that handle data science, machine learning and any other OLAP workloads used to store data and gather insight.
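To give a feel for the format-handling work mentioned above, here is a minimal Python sketch that reads the same kind of record from JSON, XML and CSV and normalizes it into one shape. The sample payloads, the `id` and `amount` fields, and all function names are hypothetical and exist only for illustration; real data lake files are usually far messier.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample payloads; a real data lake would hold files, not strings.
JSON_DATA = '[{"id": "1", "amount": "25.00"}]'
XML_DATA = (
    "<donations><donation><id>2</id><amount>40.00</amount></donation></donations>"
)
CSV_DATA = "id,amount\n3,15.50\n"


def from_json(text):
    """Parse a JSON array of records into normalized dicts."""
    return [
        {"id": str(r["id"]), "amount": float(r["amount"])}
        for r in json.loads(text)
    ]


def from_xml(text):
    """Parse <donation> elements into normalized dicts."""
    root = ET.fromstring(text)
    return [
        {"id": node.findtext("id"), "amount": float(node.findtext("amount"))}
        for node in root.findall("donation")
    ]


def from_csv(text):
    """Parse CSV rows (with a header line) into normalized dicts."""
    return [
        {"id": row["id"], "amount": float(row["amount"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def load_all():
    """Combine all three sources into one list with a single schema."""
    return from_json(JSON_DATA) + from_xml(XML_DATA) + from_csv(CSV_DATA)
```

Even in this toy case, each format needs its own parser and its own type conversions, which hints at why this work tends to land on specialists once the volume and variety grow.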

Another reason Big Data may not be right for your organization is the type of goal you have in mind. If a non-profit organization needs a back end for its transactional systems with ACID compliance, Big Data will not be its top priority. In fact, tools like Hadoop will not give you the level of transactional consistency you would need to support your front end, because the file system approach is built for storing and scanning large files rather than for many small, concurrent updates. Even though Hadoop is highly scalable for analyzing large datasets, it does not provide the transactional guarantees that a transactional database such as Oracle or SQL Server excels at.
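As a rough illustration of what transactional accountability means in practice, the sketch below uses Python's built-in `sqlite3` module as a stand-in for a relational database such as Oracle or SQL Server. The `accounts` table, the account names, and the `transfer` helper are all hypothetical. The point is atomicity: either both updates in a transfer are applied, or neither is.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    with conn:
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)",
            [("general", 100), ("programs", 0)],
        )
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return True on success."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back every statement in it if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint on an overdraft,
            # which also undoes the credit above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If a transfer would overdraw the source account, the credit that already ran is rolled back along with the failed debit, so the books never show money appearing from nowhere. That guarantee is what a transactional front end depends on.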

Lastly, Big Data is great for exactly what the name describes. However, if your organization has a low volume of data, there is no need for a Big Data environment to analyze large datasets, because those datasets simply do not exist. In some cases, a simple and well-thought-out spreadsheet is exactly the data tool that is called for. It would be overkill for an organization to set up a data model and pay skilled engineers, database administrators, data architects, business analysts, business intelligence developers, and data analysts to tackle something that can easily be put into a spreadsheet. In most industries, spreadsheets are a classic tool for capturing and sharing information. That data can also be exported in comma-separated values (CSV) format so that reporting interfaces can provide descriptive analytics if necessary. This assumes that the spreadsheet itself has some validation checks and macros.
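To show how little tooling a small dataset needs, here is a minimal Python sketch that takes a CSV export from a spreadsheet and produces simple descriptive statistics using only the standard library. The column names (`region`, `amount`), the sample rows, and the `summarize` function are made up for illustration.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical CSV export from a spreadsheet.
CSV_EXPORT = """region,amount
North,120.0
South,80.0
North,100.0
South,60.0
"""


def summarize(csv_text):
    """Group rows by region and compute count, total and mean per group."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["region"]].append(float(row["amount"]))
    return {
        region: {
            "count": len(values),
            "total": sum(values),
            "mean": statistics.mean(values),
        }
        for region, values in groups.items()
    }
```

Calling `summarize(CSV_EXPORT)` returns a per-region rollup that a dashboard or report could display directly, with no cluster, data lake, or dedicated administrator involved.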

To sum up, Big Data and the tools that orbit it can be a powerful way to turn your industry data into insight about the present and future of the organization. Descriptive analytics can be useful in dashboards to give a broken-out or rolled-up view of useful metrics. Prescriptive analytics can use machine learning to recommend specific decisions. However, sometimes the insights to be gained don’t justify the cost of the tools and people needed to run a modern Big Data architecture.