How to Make Big Data Cost Effective for Startups

John Murray
Published in Primalbase
May 7, 2019

Tech giants like Amazon and Facebook are famed for their collection and use of data. But big data isn’t just for big companies. Even the smallest startups should be looking for ways to exploit the vast opportunities for data collection and analysis open to them.

We’ve taken a look at how startups can use data in their operations, from targeting the data streams that matter most to making use of the numerous cloud services that have lowered the barrier to entry into big data analytics.

Collect the Right Data

Today, more than ever, companies of every size have no shortage of data streams available to them. However, as with any other area of a business, there needs to be prioritisation. Capturing every available data stream is not necessary to yield the greatest insights, and attempting to do so will be costly.

Startups need to home in on the data streams that will help them achieve their specific business objectives. By limiting the scope of data projects, startups can focus their resources on collecting pertinent data, acting on it, and growing the business.

Google Trends is an easy first step for a startup to develop a more focused idea of the keywords and region-specific metrics applicable to the business.
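One way to make the prioritisation described in this section concrete is to score each candidate stream by how much it supports a business objective relative to what it costs to collect. The following is a minimal sketch only: the stream names, relevance scores, costs and budget are all hypothetical, and the greedy ranking is just one simple heuristic a startup might choose.

```python
from dataclasses import dataclass


@dataclass
class DataStream:
    name: str
    relevance: int       # 1-5: how directly it supports a business objective (assumed scale)
    monthly_cost: float  # estimated cost to collect and store (hypothetical figures)


def prioritise(streams, budget):
    """Greedily pick streams with the best relevance per unit cost that fit the budget."""
    ranked = sorted(streams, key=lambda s: s.relevance / s.monthly_cost, reverse=True)
    chosen, spent = [], 0.0
    for stream in ranked:
        if spent + stream.monthly_cost <= budget:
            chosen.append(stream)
            spent += stream.monthly_cost
    return chosen


# Hypothetical candidate streams for an online retailer.
streams = [
    DataStream("checkout_events", relevance=5, monthly_cost=200.0),
    DataStream("social_mentions", relevance=2, monthly_cost=400.0),
    DataStream("site_search_terms", relevance=4, monthly_cost=100.0),
    DataStream("raw_server_logs", relevance=1, monthly_cost=500.0),
]

selected = prioritise(streams, budget=500.0)
print([s.name for s in selected])
```

With these made-up numbers, the two cheapest, most relevant streams fit the budget and the rest are left out, which is the point of limiting scope: the exercise forces a decision about which data is actually worth paying for.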

Build an Appropriate Data Science Team

Collecting data is only half the job. Startups also need a complementary strategy for making effective use of their existing infrastructure and for bringing in data professionals to analyse what they collect. Larger companies have the capital to assemble large teams of data scientists, but this isn’t necessary or feasible for many startups.

The crucial thing is to be realistic about what smaller data science teams can achieve. If a data-driven company wants to achieve a turnover of $1 billion in a year, then its data science investment will need to be in the tens of millions. Startups with less access to funds, however, are better served by hiring one or two data scientists (who can command an average salary of $120,000) to carry out data science experiments and research, which can help shape and develop the startup’s overall data strategy. The needs of a startup are also likely to be smaller, so a part-time data scientist or even an intern may be a far cheaper option that still does an effective job.
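To put the staffing options above side by side, here is a rough back-of-the-envelope sketch. It uses only the $120,000 average salary quoted above; the `part_time_fraction` parameter is an assumption for illustration, and the figures ignore benefits, taxes, tooling and recruitment costs.

```python
AVERAGE_SALARY = 120_000  # average data scientist salary quoted in the article (USD)


def annual_team_cost(full_time=0, part_time=0, part_time_fraction=0.5):
    """Rough annual salary cost of a team.

    part_time_fraction is an assumed share of a full salary paid to each
    part-time hire; it is not a figure from the article.
    """
    return AVERAGE_SALARY * (full_time + part_time * part_time_fraction)


two_full_time = annual_team_cost(full_time=2)
one_full_time = annual_team_cost(full_time=1)
one_part_time = annual_team_cost(part_time=1)

print(f"Two full-time data scientists: ${two_full_time:,.0f}")
print(f"One full-time data scientist:  ${one_full_time:,.0f}")
print(f"One part-time data scientist:  ${one_part_time:,.0f}")
```

Even the largest of these options is orders of magnitude below the tens of millions a company chasing $1 billion would need to invest.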

Open-Source Software and the Cloud

With a small data science team, or even a single data scientist, startups have a diverse and cost-effective selection of software and infrastructure solutions for making use of big data streams. Open-source Apache Hadoop is one of the most popular: it provides a framework for distributed storage and processing of big data on lower-cost, off-the-shelf hardware.

Of course, using free software often involves trade-offs. Platforms that have not been tailor-made for your organisation or needs may take longer to set up and demand a high level of technical skill. Many of these open-source projects are also available in customised versions, tailored for different industries, which help cut down on this initial effort and expenditure.
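For readers unfamiliar with how a framework like Hadoop processes data, its processing layer is built around the MapReduce model: work is split into a map step that runs on each chunk of data, a shuffle step that groups results by key, and a reduce step that combines them. The sketch below illustrates that model with a word count in plain Python on a single machine. It is not Hadoop's API, and the input chunks are hypothetical stand-ins for data that would normally live on separate nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one line of text."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


# Hypothetical input, split into chunks that separate nodes would process.
chunks = [
    ["big data for startups", "data streams"],
    ["startups collect data"],
]

# In a real cluster the map calls run in parallel; here they run in sequence.
mapped = [pair for chunk in chunks for line in chunk for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because each map call only sees its own chunk and each reduce call only sees one key, the same logic can be spread across many inexpensive machines, which is what makes the approach attractive on off-the-shelf hardware.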

Big Data as a Service (BDaaS)

The barriers to entry for cloud services and infrastructure have dropped immensely over the years, to the point where startups can use these services to shape and scale their big data operations.

BDaaS is a broad term, much like the scope of the data in question, and can cover a range of functions. These include the supply of data itself and analytical tools for interrogating it, often in the form of online dashboards. There are also reporting tools, and some BDaaS providers offer advisory services as part of customisable packages.

In terms of providers, startups and small businesses have a growing number of options. Amazon Web Services (AWS) is one of the best-known examples of this decentralised infrastructure, providing cloud storage that is more affordable, scalable and secure than many on-site physical options. AWS is made up of a diverse catalogue of individual services, including Amazon Elastic File System, whose storage capacity grows and shrinks automatically as you add and remove files, so that applications have the storage they need when they need it.

Amazon isn’t the only large tech company to have created BDaaS products. HP’s big data analytics platform, Haven, is now available exclusively in the cloud. These offerings have made big data a much more viable proposition for startups, especially with the established infrastructure of recognisable industry giants behind them. Competition between these companies helps keep prices down, which puts big data projects within the reach of startups with limited capital.

Examine Data Collected by Other Startups

It can be difficult for startups to know which metrics align with progress towards specific goals, especially when they are completely new to an industry. Fortunately, other companies in the same field will often have open data (or at least open findings from their data) that can be applied to other businesses. It is not uncommon for startups and more established companies to share their industry data in the form of visualisations. The data behind these visualisations is often distilled, but there can still be valuable analysis to draw from it, particularly if your own data supply is limited.

Tech startups are spoilt for choice when it comes to data sources they can draw on for their operations. How could more be done to ensure that the data used by startups is pertinent to the requirements of the business? Let us know in the comments.

John Murray
Primalbase

Senior Editor at Binary District, focusing on machine learning, AI, quantum computing, cybersecurity, IoT