Data-based job roles: Data Analyst, Data Engineer, Data Scientist, and Machine Learning Engineer

Anjali Kumari
7 min read · Sep 1, 2019

Data Analyst

Who is a Data Analyst?

Data analytics is the art of exploring the facts in data to answer a specific question, which helps to identify trends within industries. A data analyst takes information about a topic, cleans and organizes it, analyzes and interprets it, and uses it to help companies make better decisions.

What does a Data Analyst do?

Data analysts research data and produce insights. They look for answers to the questions posed by data scientists.

Responsibilities

  • Mining data from sources, then reorganizing it into a format that can be easily understood by humans or machines.
  • Representing data through reporting and visualization.
  • Designing and maintaining data systems and databases; this includes fixing coding errors and other data-related problems.
  • Preparing reports for executive leadership that effectively communicate trends, patterns, and predictions using relevant data.
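
To make the first and last responsibilities concrete, here is a minimal sketch of turning raw records into a small report with SQL. It uses Python's built-in sqlite3 module; the table name, columns, and sales figures are hypothetical and only serve as an illustration.

```python
import sqlite3


def monthly_revenue(rows):
    """Load raw (month, region, revenue) rows and summarize revenue per month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (month TEXT, region TEXT, revenue REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    # Reorganize the raw rows into one total per month.
    summary = conn.execute(
        "SELECT month, SUM(revenue) FROM sales GROUP BY month ORDER BY month"
    ).fetchall()
    conn.close()
    return summary


# Hypothetical sales records standing in for data mined from a real source.
raw_rows = [
    ("2019-07", "North", 1200.0),
    ("2019-07", "South", 800.0),
    ("2019-08", "North", 1500.0),
    ("2019-08", "South", 700.0),
]

for month, total in monthly_revenue(raw_rows):
    print(f"{month}: {total:,.2f}")
```

In practice the same query would usually run against a company database, and the result would feed a chart or dashboard in a tool such as Tableau or Excel.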

Skills and Tools used by Data Analyst

A candidate should be well versed in programming, scripting, and statistics, and should also know data visualization, SQL and databases, and data warehousing.

These are some common tools in a data analyst’s tool belt:

  • Microsoft Excel
  • SQL
  • SAS software
  • Google Analytics
  • Google Tag Manager
  • Tableau
  • Google AdWords

Application Areas

Data analysis has many applications; two are listed below:

  1. Policing/Security: Several cities around the world have employed predictive analysis, using geographical and historical data, to predict areas that are likely to witness a surge in crime. Data has made it possible for police officers to see at which times and in which areas crime is more likely to happen, which has led to a drop in the crime rate.
  2. Transportation: At the 2012 London Olympics, there was a need to handle over 18 million journeys made by fans in the city of London, and fortunately, it was handled well. How was this feat achieved? TfL (Transport for London) and the train operators used data analytics to ensure that the large number of journeys went smoothly. They fed in data from the events taking place and forecast the number of people who would travel, so that transport ran efficiently and athletes and spectators could be transported to and from their respective stadiums.

Salary: $59,000/year

Data Engineer

Who is a Data Engineer?

Let's understand it through a quote:

“A scientist can discover a new star, but he cannot make one. He would have to ask an engineer to do it for him.” –Gordon Lindsay Glegg

So, the role of the Data Engineer is really valuable.

Data engineers build and optimize the systems that allow data scientists and data analysts to perform their work. Every company depends on its data being accurate and accessible to the individuals who need to work with it, and ensuring this is the data engineer's job.

What does a Data Engineer do?

The data engineer ensures that any data is properly received, transformed, stored, and made accessible to other users.

For example, if a company starts generating a large amount of data from different sources, your task as a data engineer is to organize the collection of that information, its processing, and its storage.
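
The receive, transform, store, and make-accessible flow can be sketched as a tiny pipeline. This is only an illustration using Python's standard library: the CSV content, the payments table, and the function names extract, transform, and load are all hypothetical, and a production pipeline would typically rely on dedicated tooling.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from one source system (note the messy spacing and casing).
RAW_CSV = """user_id,country,amount
1, in ,10.5
2,US,3
3,us,
"""


def extract(text):
    """Receive: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: normalize country codes and drop rows with no amount."""
    cleaned = []
    for rec in records:
        amount = rec["amount"].strip()
        if not amount:
            continue  # skip incomplete rows
        cleaned.append(
            (int(rec["user_id"]), rec["country"].strip().upper(), float(amount))
        )
    return cleaned


def load(conn, rows):
    """Store: write the cleaned rows to a table that other users can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(user_id INTEGER, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
# Make accessible: analysts and data scientists can now query the table.
print(conn.execute("SELECT country, amount FROM payments ORDER BY user_id").fetchall())
```

Keeping the three steps as separate functions makes each one easy to test and replace, which matters once more data sources are added.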

The following are examples of tasks that a data engineer might be working on:

  • Building APIs for data consumption.
  • Integrating external or new datasets into existing data pipelines.
  • Applying feature transformations for machine learning models on new data.
  • Continuously monitoring and testing the system to ensure optimized performance.
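
As an illustration of applying feature transformations to new data, the sketch below learns min-max scaling parameters from training data and reuses them on records that arrive later. The function names and the age values are hypothetical, and clipping out-of-range values is one possible design choice among several.

```python
def fit_min_max(values):
    """Learn the scaling parameters from the training data only."""
    return min(values), max(values)


def apply_min_max(values, lo, hi):
    """Scale values to [0, 1] with previously learned parameters, clipping outliers."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [min(1.0, max(0.0, (v - lo) / span)) for v in values]


# Hypothetical feature: customer ages seen when the model was trained.
train_ages = [20, 30, 40, 60]
lo, hi = fit_min_max(train_ages)

# New data arriving in the pipeline is scaled with the same parameters.
new_ages = [25, 60, 70]
print(apply_min_max(new_ages, lo, hi))
```

The important point is that the parameters come from the training data, so the model sees new records on the same scale it was trained on.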

Responsibilities

  • Developing, testing, and maintaining data architecture
  • Understanding programming and its complexity
  • Deploying ML and statistical models
  • Building pipelines for various ETL (extract, transform, load) operations
  • Ensuring data accuracy and flexibility

Skills and tools used by Data Engineer

A Data Engineer needs to have a strong technical background with the ability to create and integrate APIs. They also need to understand data pipelining and performance optimization. A data engineer needs to be good at:

  • Architecting distributed systems
  • Creating reliable pipelines
  • Combining data sources
  • Architecting data stores
  • Collaborating with data science teams and building the right solutions for them

From this list, we can infer that data engineers are specialists from the fields of software engineering and backend development.

Data engineers and data scientists will both likely use programming languages such as Python, Java, or C++, or a query language such as SQL. Furthermore, both must know how to use distributed storage and computation software such as Hadoop, along with additional software such as Spark, Hive, and Pig, or NoSQL systems such as MongoDB. For cloud-based storage and computation, many enterprises use Amazon Web Services or Google Cloud Platform, and data engineers need to understand how each architecture functions, i.e., how data is ingested, stored, retrieved, and processed.

Salary: $90,839/year

Data Scientist

Who is a Data Scientist?

A data scientist is a specialist who applies their expertise in statistics and building machine learning models to make predictions and answer key business questions. A data scientist still needs to be able to clean, analyze, and visualize data, just like a data analyst. However, a data scientist will have more depth and expertise in these skills, and will also be able to train and optimize machine learning models.

What does a Data Scientist do?

A data scientist’s daily tasks revolve around data, which is no surprise given the job title. Data scientists spend much of their time gathering, examining, and shaping data, but they do so in many different ways and for many different reasons. For example, a data scientist cleans a dataset with the intent of feeding it into a statistical model for predictive and inferential purposes.
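
A minimal sketch of that clean-then-model workflow is shown below. It drops incomplete observations and fits a straight line by ordinary least squares; the data (advertising spend versus units sold) and the function names are hypothetical, and real projects would normally use a statistics or machine learning library.

```python
def clean(pairs):
    """Drop observations with a missing x or y value."""
    return [(x, y) for x, y in pairs if x is not None and y is not None]


def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical raw data: advertising spend (x) versus units sold (y).
raw = [(1, 3), (2, 5), (None, 4), (3, 7), (4, None), (4, 9)]

slope, intercept = fit_line(clean(raw))
print("slope:", slope, "intercept:", intercept)
print("predicted y at x = 5:", slope * 5 + intercept)
```

The fitted slope and intercept serve the inferential purpose (how y changes with x), while the last line serves the predictive one.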

Responsibilities

Data-related tasks that a data scientist might tackle include:

  • Pulling data
  • Merging data
  • Analyzing data
  • Looking for patterns or trends
  • Developing and testing new algorithms
  • Developing predictive models
  • Building data visualizations
  • Carrying out data analytics and optimization using machine learning and deep learning
  • Developing operational models

Skills and tools used by Data Scientist

The skills required to be a data scientist are:

  • Statistical & Analytical skills
  • Machine Learning & Deep learning principles
  • Data Mining
  • In-depth programming knowledge (SAS/R/Python coding)
  • Hadoop-based analytics
  • Data optimization
  • Decision making and soft skills

Salary: $91,470 /year

Machine Learning Engineer

Who is a Machine Learning Engineer?

Machine learning engineers are computer programmers, but their focus goes beyond programming machines to perform specific tasks. They create programs that enable machines to take actions without being explicitly directed to perform those tasks. An example of a system a machine learning engineer would work on is a self-driving car.
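
The idea of acting without explicit instructions can be shown with a toy example. Instead of hand-writing a rule such as "brake if the obstacle is closer than 20 m", the sketch below picks an action by copying the label of the most similar past example (a 1-nearest-neighbor classifier). The examples and labels are made up, and a real self-driving system is vastly more complex than this.

```python
import math


def predict(examples, point):
    """Return the label of the training example closest to `point`."""
    nearest = min(examples, key=lambda ex: math.dist(ex[0], point))
    return nearest[1]


# Hypothetical labeled examples:
# (distance to obstacle in meters, speed in km/h) -> action taken.
examples = [
    ((5.0, 50.0), "brake"),
    ((8.0, 60.0), "brake"),
    ((80.0, 40.0), "keep going"),
    ((100.0, 60.0), "keep going"),
]

print(predict(examples, (10.0, 55.0)))
print(predict(examples, (90.0, 50.0)))
```

No braking rule appears anywhere in the code; the behavior comes entirely from the examples, so changing the data changes the decisions.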

Responsibilities

A machine learning engineer has many responsibilities, such as:

  • Designing and developing Machine Learning Systems
  • Running machine learning tests and experiments
  • Implementing appropriate ML algorithms
  • Performing statistical analysis and fine-tuning using test results
  • Extending existing ML libraries and frameworks
  • Researching and implementing appropriate ML algorithms and tools
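
Running experiments and fine-tuning from their results can look like the following sketch: a k-nearest-neighbors classifier is evaluated on held-out data for several values of k, and the best-scoring value is kept. The dataset, the labels, and the candidate values of k are hypothetical.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Majority label among the k training points closest to x (one numeric feature)."""
    neighbors = sorted(train, key=lambda ex: abs(ex[0] - x))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


def accuracy(train, held_out, k):
    """Fraction of held-out examples that are classified correctly."""
    hits = sum(knn_predict(train, x, k) == label for x, label in held_out)
    return hits / len(held_out)


# Hypothetical data: a single numeric feature with a binary label.
train = [(1.0, "low"), (2.0, "low"), (3.0, "low"), (3.4, "high"),
         (6.0, "high"), (7.0, "high"), (8.0, "high")]
held_out = [(2.5, "low"), (3.3, "low"), (6.5, "high"), (7.5, "high")]

# The experiment: try several values of the hyperparameter k and compare.
results = {k: accuracy(train, held_out, k) for k in (1, 3, 5)}
best_k = max(results, key=results.get)
print(results, "best k:", best_k)
```

With k = 1 the single unusual training point at 3.4 causes a mistake, while larger values of k smooth it out. A final, separate test set would normally be kept aside to confirm the chosen setting.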

Skills and tools used by Machine Learning Engineer

  1. Computer Science Fundamentals and Programming
  2. Probability and Statistics
  3. Data Modeling and Evaluation
  4. Applying Machine Learning Algorithms and Libraries
  5. Software Engineering and System Design

Application Areas

  • Social Media (Facebook): Automatic friend-tagging suggestions on Facebook or other social media platforms. Facebook uses face detection and image recognition to find faces that match those in its database and then suggests that we tag that person. On Facebook this is based on DeepFace.
  • Product Recommendations: Suppose you check an item on Amazon but do not buy it then and there. The next day, you’re watching videos on YouTube and suddenly see an ad for the same item. You switch to Facebook, and you see the same ad there too.

This happens because advertising platforms such as Google track your search and browsing history and recommend ads based on it.

  • Traffic Alerts (Maps): These rely on a combination of the people currently using the service, historical data for the route collected over time, and a few techniques acquired from other companies.

Everyone using Maps provides their location, average speed, and the route they are traveling. This helps Google collect massive amounts of data about traffic, which lets it predict upcoming traffic and adjust your route accordingly.
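
As a rough illustration of combining live reports with historical data, the sketch below blends the average speed reported by current users with a historical average to estimate travel time. This is not how any real maps service works; the function name, the weighting scheme, and all numbers are assumptions made for the example.

```python
def estimate_minutes(route_km, live_speeds_kmh, historic_speed_kmh, live_weight=0.7):
    """Estimate travel time by blending live and historical average speeds."""
    if live_speeds_kmh:
        live_avg = sum(live_speeds_kmh) / len(live_speeds_kmh)
        speed = live_weight * live_avg + (1 - live_weight) * historic_speed_kmh
    else:
        # No users currently on the route: fall back to history alone.
        speed = historic_speed_kmh
    return route_km / speed * 60


# Hypothetical 12 km route: current users report slow traffic,
# while the historical average for this hour is 60 km/h.
print(estimate_minutes(12, [20, 30, 40], 60, live_weight=0.5))
print(estimate_minutes(12, [], 60))
```

A production system would learn the weighting from data rather than fixing it by hand, which is where machine learning comes in.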

Salary: On average, an ML engineer can expect a salary of ₹719,646 (India) or $111,490 (US) per year.

Conclusion

Knowing the differences among the four roles makes it easier for engineering students and IT professionals who are interested in data science to assess themselves and decide which path fits them best. Jobs in data science are growing every year and pay some of the highest salaries as both the public and private sectors continue to adopt big data.

Anjali Kumari

I am pursuing computer science and engineering and am a machine learning enthusiast.