Is self-service big data possible?

SnapLogic
Sep 6, 2018 · 2 min read

By Mark Gibbs

By now, we all know about, and are experiencing, the rise in the volume of data generated and available to an organization, and the problems it can cause. There is little end in sight to this data tsunami, driven largely by the increased variety of data from mobile, social media, and IoT sources.

So, it’s no surprise that organizations find themselves drowning in data. In a recent survey, independent market research firm Vanson Bourne found that up to 80 percent of respondents believe legacy technology is holding their organization back from taking advantage of data-driven opportunities. The same survey found that only 50 percent of the data collected is ever analyzed for business insight. Couple all of this with the pressure on organizations to extract insights from their data faster and faster, and you have a recipe for disaster or, at best, potentially lost revenue.

To collect and analyze data for hidden business insight, and to truly embrace a data-driven culture, organizations need tools that empower data engineers and business users with domain knowledge to operate more efficiently in this environment, without requiring them to have deep technical skills.

The rise of big data technologies

To collect and analyze all types of data, organizations continue to adopt big data technologies and are increasingly building cloud-based data lakes. This allows them to take advantage of the benefits the cloud offers, such as significantly lower capital expenditure (CapEx).

However, operating a big data environment is often challenging because specialized skills are required to install, configure, and maintain it. Once the environment is up and teams are ready to embark on a project, they need Spark developers to write the code that performs the data engineering tasks. Next, they must build a mechanism for submitting Spark pipeline jobs to the processing cluster. Finally, to take advantage of the transient clusters that Big Data as a Service (BDaaS) offers, they must develop their own cluster lifecycle management.
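To make that last step concrete, the "lifecycle management" a team ends up hand-rolling usually amounts to create-cluster → submit-job → poll → tear-down orchestration. Here is a minimal sketch of that loop in Python. The provider API shown (`FakeClusterAPI` and its method names) is entirely hypothetical — a simplified, in-memory stand-in for a real BDaaS SDK such as a cloud vendor's cluster service — so the names and behavior are assumptions for illustration only.

```python
import time


class FakeClusterAPI:
    """Hypothetical, in-memory stand-in for a BDaaS provider's cluster API.

    Real SDKs expose analogous calls; these names are illustrative only.
    """

    def __init__(self):
        self.clusters = {}
        self._next_id = 0

    def create_cluster(self):
        # Provision a (fake) transient cluster and return its ID.
        self._next_id += 1
        cid = f"cluster-{self._next_id}"
        self.clusters[cid] = {"state": "RUNNING", "steps": []}
        return cid

    def submit_step(self, cid, jar, args):
        # Submit a Spark job (a "step") to the cluster. In this fake,
        # the step completes instantly; a real one would run asynchronously.
        step = {"jar": jar, "args": args, "state": "COMPLETED"}
        self.clusters[cid]["steps"].append(step)
        return len(self.clusters[cid]["steps"]) - 1

    def step_state(self, cid, step_id):
        return self.clusters[cid]["steps"][step_id]["state"]

    def terminate_cluster(self, cid):
        self.clusters[cid]["state"] = "TERMINATED"


def run_transient_job(api, jar, args, poll_seconds=0):
    """Create a cluster, submit one Spark step, poll until it finishes,
    then always tear the cluster down — the 'transient cluster' pattern."""
    cid = api.create_cluster()
    try:
        step_id = api.submit_step(cid, jar, args)
        while api.step_state(cid, step_id) not in ("COMPLETED", "FAILED"):
            time.sleep(poll_seconds)
        return api.step_state(cid, step_id)
    finally:
        # Transient semantics: the cluster must never outlive the job,
        # even if submission or polling raises an exception.
        api.terminate_cluster(cid)
```

Even this toy version has to handle teardown-on-failure; the production equivalent adds retries, timeouts, log collection, and cost controls, which is where the "hundreds of lines of code" below come from.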

All of this takes many hundreds, if not thousands, of lines of code, which is complex, error-prone, and typically takes months to write. At the same time, Spark developers are in high demand and in tremendously short supply.

To finish reading this post, visit snaplogic.com.

Written by SnapLogic

SnapLogic is the leader in enterprise cloud and big data integration, helping companies connect their data, apps, and APIs. http://ow.ly/lUOv30jGgo6
