What’s the difference between data science and machine learning?

You may have heard that three overlapping disciplines are emerging in the world of AI: machine learning, deep learning, and data science. One reason they overlap is that they all, in one way or another, deal with data, and massive amounts of it.

Roger Huang
Published in HackerNoon.com
4 min read · Jun 5, 2019

This is an excerpt from the Springboard guide to AI/ML jobs, prepared as part of our research for the Springboard AI/Machine Learning Career Track. Use it to understand the difference between the two fields, whether you are looking for a job in one of them or hiring for these positions.

Let’s begin with a quick look at a company that’s making lawyers’ lives easier and helping them do their jobs far better. Everlaw developed technology that searches through documents during discovery to find the ones that are relevant and important to a case. It helps law firms, government agencies, and corporations sift through millions of documents of evidence in big lawsuits and investigations to find the proverbial needle in the haystack.

How are they able to do that? By creating and maintaining data pipelines for data analytics, storage, and reporting, and deriving insights from various data sources using statistical methods and machine learning models. In the case of Everlaw, data scientists working there are tasked with all of the above. They help machine learning engineers design and build better ML algorithms, and use ML techniques to assist developers in implementing new AI features.

The role of a data scientist at Everlaw is a great example of what the job entails on a fundamental level. In essence, a data scientist looks for new data sources, creates pipelines for that data, designs dashboards that make sense of that data, and helps ML engineers with building better algorithms.

Another example is what data scientists at Airbnb do. Their role is heavily focused on analytics and building data pipelines that help inform business decisions. They figure out what metrics are most important for the organization today, and analyze them in the right way.

Machine learning engineers, on the other hand, build and maintain scalable ML algorithms grounded in core computer science concepts (like data structures, algorithms, profiling, and optimization). Machine learning engineers code more than data scientists do, while data scientists make sense of the data that drives the business forward.

Mansha Mahtani, a data scientist at Instagram, said this when we asked for her take on the key differences between the two professions:

“Given both professions are relatively new, there tends to be a little bit of fluidity on how you define what a machine learning engineer is and what a data scientist is. My experience has been that machine learning engineers tend to write production-level code. For example, if you were a machine learning engineer creating a product to give recommendations to the user, you’d be actually writing live code that would eventually reach your user. The data scientist would probably be a part of that process — maybe helping the machine learning engineer determine what are the features that go into that model — but usually data scientists tend to be a little bit more ad hoc to drive a business decision as opposed to writing production-level code.”
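To make the distinction in the quote a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the toy data, `adhoc_feature_check`, and `recommend` are illustrative names and are not taken from Instagram or any real system. The first function stands in for a data scientist's one-off check of whether a candidate feature looks useful; the second stands in for the kind of validated, deterministic code an ML engineer might ship.

```python
from statistics import mean

# Hypothetical toy data: (minutes_in_app, clicked_recommendation)
sample_sessions = [(5, 0), (12, 1), (3, 0), (20, 1), (15, 1), (2, 0)]


def adhoc_feature_check(sessions):
    """One-off analysis: how much longer do clickers spend in the app on average?"""
    clicked = [minutes for minutes, did_click in sessions if did_click == 1]
    skipped = [minutes for minutes, did_click in sessions if did_click == 0]
    return mean(clicked) - mean(skipped)


def recommend(scores, k=3):
    """Production-style helper: validate inputs, then return the top-k item ids."""
    if not isinstance(k, int) or k <= 0:
        raise ValueError("k must be a positive integer")
    if not scores:
        return []  # no candidates: return an empty list rather than fail the request
    # Sort by score (descending); break ties by item id so output is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [item_id for item_id, _ in ranked[:k]]
```

The ad hoc function answers a single question and can be thrown away afterwards; the production-style function has to handle bad input and empty cases because it would run on every request.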

Why does it matter?

In practice, both data science and machine learning roles are good ways to work with data. However, they require slightly different skill sets and different training. A data scientist is more of a generalist who understands algorithms and statistics at scale but may only be tasked with implementing models on smaller datasets. A machine learning engineer is likely somebody with production engineering skills who will be asked to use their understanding of machine learning algorithms (generally expected to be higher-level and more limited than a data scientist's) to create a pipeline at scale that data scientists can experiment with.
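One way to picture "a pipeline at scale that data scientists can experiment with" is code that streams records through a model in fixed-size batches instead of loading everything into memory at once. The sketch below is only an illustration under that assumption: `batched`, `run_pipeline`, and `toy_model` are hypothetical names, and the model is a trivial placeholder.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(records: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most `size` records from any iterable."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def run_pipeline(
    records: Iterable[float],
    model: Callable[[List[float]], List[float]],
    batch_size: int = 1000,
) -> Iterator[float]:
    """Lazily apply `model` to each batch and yield the individual results."""
    for batch in batched(records, batch_size):
        yield from model(batch)


def toy_model(batch: List[float]) -> List[float]:
    """Placeholder for whatever model a data scientist is experimenting with."""
    return [x * 2 for x in batch]
```

In this sketch the engineer owns `batched` and `run_pipeline`, while the data scientist can swap in a different `model` function without touching the plumbing, for example `list(run_pipeline(range(5), toy_model, batch_size=2))`.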

The two roles are complementary, but they are not exactly the same, and hiring the wrong profile for one role or trying to get a job in one area where your skills don’t fit can be a disaster.

Data Scientist Job Requirements

Consider a junior posting for recent graduates to become data scientists at Microsoft. It emphasizes communication skills and a background in a quantitative field, without setting a very high bar for engineering experience at scale.

Machine Learning Engineer Job Requirements

A job posting from a startup (indus.ai) shows a slightly different profile: it asks for strong knowledge of machine learning frameworks and for software development experience. In contrast to data science roles, statistics and communication are mentioned far less.

Machine Learning vs. Data Science

In summary, machine learning and data science are complementary fields, but individual roles in each have quite different requirements. Data science focuses on the theory behind a predictive model, on managing that model, and on communicating its results to stakeholders. Machine learning engineers support that process by building the production engineering pipelines that process the data those models need.

This is an excerpt written by @ThatAlexPalmer, published here with permission.
