New Survey Finds 85% of Companies Now Working on Machine Learning, Stretching Data Teams to the Limit

Learn about the new DataOps approach needed to support machine learning efforts

Jun 28, 2018

Welcome to the second installment of our Definitive Data Operations Report 2018 mini-series! Every year we break the report down into a blog mini-series to provide deeper insight into each section. This week, we’re discussing data team structures and the need to hire in DataOps to fuel machine learning and artificial intelligence.

About the survey

Every year, Nexla surveys hundreds of data professionals to assess the current state of DataOps. This year’s survey, conducted by executive research platform Pulse Q&A, polled 266 IT and data professionals, including analysts, data scientists, data engineers, and executives. The respondents came from more than 25 industries, including tech, e-commerce, advertising, finance, and more. They had a wide range of experience, from new grads to data pros with 10+ years, and came from companies of all sizes, from fewer than 50 employees to more than 10,000. The survey was fielded from May 3 to May 21, 2018.

A quick snapshot:

85% of companies are working on machine learning or artificial intelligence. That’s up 15 percentage points from last year’s 70%.
50% of data pros say there are not enough backend data resources to support this growth. On average, there is only 1 backend data engineer for every 5 frontend data users. We define “frontend” users as those who need to derive value from the data: analysts, data scientists, and the like.
18% of an engineer’s time goes to troubleshooting (that’s 9.3 weeks a year).
Companies realize the need to invest in the human side of data: 73% plan to hire in DataOps within the next year to continue to fuel machine learning and AI.

Machine Learning and AI Takeover

The large majority of data pros (85%) report that their company is working on machine learning or artificial intelligence. This is up significantly (a whopping 15 percentage points) from 2017, when “only” 70% of respondents reported their companies were working on machine learning or artificial intelligence.

We wanted to understand what effect this focus on machine learning and AI has had on data teams. With 85% of companies now working on machine learning or AI, only the remaining 15%, the laggards, have yet to adopt it.

Fueling Machine Learning and AI

Building and scaling machine learning or AI models is no easy feat. It can require weeks or months of a data professional’s time. Couple that with the continuous work of maintaining data pipelines, troubleshooting problems that arise, and handling traditional ETL tasks to support frontend data users, and a data pro’s to-do list is endless.

We asked data pros if they thought there were enough backend resources and data engineers to support data needs and frontend users. 50% said no — there are not enough backend resources to support the company’s data needs. Not surprisingly, respondents with more folks on the frontend team were more likely to believe they didn’t have enough resources.

The team behind the machine learning magic

To get a better understanding of the current state of data resource allocation, we asked data pros about the size of their data teams. They told us how many “frontend” data users, such as analysts, data scientists, and business users, they worked with, and how many “backend” or data engineering users the team has.

The survey found that 37% of companies have more than 50 frontend users while almost half of all companies (44%) have anywhere from 5–49 frontend users. When asked about backend data producers, only 26% of companies reported they had more than 20 backend users, and 43% of companies have anywhere from 2–10 backend data producers.

After some further calculation, we found that the average data team has 1 backend engineer for every 5 frontend professionals who need to use the data. There are outliers, of course: the minimum ratio is 0.5 (two backend engineers for every frontend pro) and the maximum is 29. That’s 29 frontend data users for every backend engineer, a ratio that is unlikely to be sustainable.
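For readers who want to run the same calculation on their own team, here is a minimal Python sketch of the frontend-to-backend ratio described above. The team sizes in `teams` are hypothetical, chosen only so that the minimum and maximum match the 0.5 and 29 quoted in the report; they are not survey data, and the function name is our own.

```python
def frontend_per_backend(frontend: int, backend: int) -> float:
    """Return how many frontend data users there are per backend engineer."""
    if backend <= 0:
        raise ValueError("a team needs at least one backend engineer")
    return frontend / backend

# Hypothetical (frontend, backend) head counts, NOT taken from the survey.
teams = [(1, 2), (10, 2), (29, 1)]

ratios = [frontend_per_backend(f, b) for f, b in teams]

print("lowest ratio: ", min(ratios))   # most favorable team
print("highest ratio:", max(ratios))   # most stretched team
```

A ratio below 1 means backend engineers outnumber frontend users; the higher the number, the more people each engineer has to support.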

The more favorable ratios are found in smaller teams with fewer than 10 frontend users. Even the smallest teams appear to have a minimum of 1–5 data engineers. But as the number of frontend users grows, the ratios get larger because data engineering headcount does not scale as quickly.

Having enough backend resources is critical to keeping the engineering queue manageable. As you can imagine, the more frontend users requesting data to perform their jobs, the more work piles up for backend data engineers. With an average of only 1 backend data producer for every 5 frontend data users, sustaining workloads becomes increasingly difficult. Using DataOps to automate and control certain tasks can immediately ease a backend data producer’s workload.

How to add leverage for data teams: Hire in DataOps

On average, a data engineer spends 18% of their time troubleshooting and fixing data-related problems. That’s nearly one whole day every week, or 9.3 weeks every year. Multiply that by however many engineers you have on your team, and it adds up quickly. What else could an engineer be spending this time on?
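To see how the 18% figure scales with team size, here is a rough Python sketch. It assumes a 5-day work week and a 52-week year, which are our assumptions rather than figures from the report. Under those assumptions, straight multiplication gives about 0.9 days a week and about 9.4 weeks a year per engineer, so the report’s 9.3 presumably reflects rounding or a slightly different base.

```python
WORKDAYS_PER_WEEK = 5   # assumption, not stated in the report
WEEKS_PER_YEAR = 52     # assumption, not stated in the report

def troubleshooting_time(share: float, engineers: int = 1) -> tuple[float, float]:
    """Return (days per week, weeks per year) lost to troubleshooting across a team."""
    days_per_week = share * WORKDAYS_PER_WEEK * engineers
    weeks_per_year = share * WEEKS_PER_YEAR * engineers
    return days_per_week, weeks_per_year

# One engineer, then a hypothetical team of ten.
for team_size in (1, 10):
    days, weeks = troubleshooting_time(0.18, team_size)
    print(f"{team_size} engineer(s): {days:.1f} days/week, {weeks:.1f} weeks/year")
```

Swap in your own team size, or your own measured troubleshooting share, to estimate how much engineering time is at stake.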

DataOps manages data from source to value — creating scalable, repeatable, and predictable data flows for data engineers, data scientists, and business users. With this predictable flow comes more time for data pros to focus on what they really should be spending time on, including machine learning and AI objectives. So it should come as no surprise that the majority of survey respondents reported their companies have plans to hire in DataOps in the next 12 months.

Of the 73% of respondents who said they are planning to hire, two-thirds reported that they did not think there were enough backend resources. A perceived lack of backend resources seems to be a trigger for DataOps investment, which makes intuitive sense.

Maximizing time and resources is critical to ensuring success with machine learning and AI. But tools and technology can only get you so far if your team is under-resourced and overwhelmed with work. Creating sustainable teams with scalable and repeatable processes supports the long-term success of any business objective, and matters especially as machine learning and AI efforts grow. Employers are beginning to see the importance of investing in the human side of data. DataOps is as much about people as it is about tools and processes.

Stay tuned for more DataOps insights from the 2018 survey findings! You can download the Definitive Data Operations Report 2018 here.

Want a quick recap of the report? Check it out here.
