Who needs a Data Engineer?
Today, who needs a Data Engineer when everyone wants to hire a Data Scientist?
Let me start with a real-life situation: a new, enthusiastic data scientist joins a firm. He knows how to analyse data, build models around it, and create data stories. The business wants him to work on a use-case, so he studies it and starts looking for data to work with. And he keeps waiting, because no ready-made data is available; it is scattered across various data stores. The data scientist needs help, and this is where the data engineer comes to the rescue.
“A Data Engineer is responsible for the creation, processing and maintenance of data pipelines that deliver processed data, enabling data scientists to work on their use-cases.”
So I would like to call ‘data science’ ‘data science & engineering’ instead, which gives a better idea of the engineering skills required in this field.
But not all organizations realize that they need both roles, and data scientists often end up spending most of their time on data engineering tasks.
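To make the idea of a data pipeline a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two sources (a CSV export and a list of order records), the field names, and the function names are hypothetical, and a real pipeline would read from actual data stores and handle far more edge cases. It only shows the basic shape: pull data from separate places, combine it, and hand over something a data scientist can use directly.

```python
import csv
import io

# Hypothetical source 1: a CSV export of customers (illustrative data only).
CUSTOMERS_CSV = """customer_id,name
1,Asha
2,Ben
"""

# Hypothetical source 2: order records from a separate store (illustrative data only).
ORDERS = [
    {"customer_id": "1", "amount": "120.50"},
    {"customer_id": "2", "amount": "80.00"},
    {"customer_id": "1", "amount": "19.50"},
    {"customer_id": "3", "amount": "5.00"},  # no matching customer
]


def extract_customers(raw_csv):
    """Parse the CSV text into a dict of customer_id -> name."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return {row["customer_id"]: row["name"] for row in reader}


def transform(customers, orders):
    """Join orders to customers and total the spend per customer."""
    totals = {}
    for order in orders:
        name = customers.get(order["customer_id"])
        if name is None:
            continue  # skip orders that have no known customer
        totals[name] = totals.get(name, 0.0) + float(order["amount"])
    return totals


def load(totals):
    """Return rows sorted by name, ready for analysis."""
    return [
        {"name": name, "total_spend": round(total, 2)}
        for name, total in sorted(totals.items())
    ]


def run_pipeline():
    """Run the extract, transform and load steps in order."""
    customers = extract_customers(CUSTOMERS_CSV)
    return load(transform(customers, ORDERS))


if __name__ == "__main__":
    for row in run_pipeline():
        print(row)
```

The data scientist in the story above would only ever see the output of `run_pipeline()`; finding the sources, joining them, and dealing with the order that has no customer is the data engineering part.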
Skills of a Data Engineer
An article from Dataquest lists the following skills that a data engineer should have:
- Architecting distributed systems
- Creating reliable pipelines
- Combining data sources
- Architecting data stores
- Collaborating with data science teams and building the right solutions for them
This is the first in a series of posts on Data Engineering. If you like this and want to know when the next post in the…www.dataquest.io
Panoply has published a decent article on ‘How to Become A Data Engineer’ which also highlights the skills required for the role:
The demand for skilled Data Engineers (or Big Data Engineers) is projected to rapidly grow. No wonder that's the case…blog.panoply.io
Data Scientists vs Data Engineers
In general, data scientists are strong at advanced analytics, while data engineers are strong on the programming side.
The following article by DataCamp explains the differences between data engineers and data scientists in terms of responsibilities, tools, languages, job outlook, salary, and so on.
The discussion about the data science roles is not new (remember the Data Science Industry infographic that DataCamp…www.datacamp.com
The following article on O’Reilly coins the term ‘Machine Learning Engineer’ for a role that fills the gap between a Data Scientist and a Data Engineer.
Check out the session "Deploying ML Models in the Enterprise" at the Strata Data Conference in New York City, September…www.oreilly.com
Ratios of data engineers to data scientists
Even when an organization or department realizes that it needs both roles, a common problem is working out the right ratio of data engineers to data scientists. Since building data pipelines takes more effort, a common starting point is 2–3 data engineers for every data scientist.
Data Engineering: The Close Cousin of Data Science (medium.com)