Want to be a Data Analyst?

Jiamei Wang
5 min read · Aug 12, 2020


Introduction

Believe it or not, we live in a digital world where we generate 2.5 quintillion bytes of data every day! With data this widely available, the need for data-related jobs will surely be on the rise.

This post is an analysis of data analyst jobs based on a data set found on Kaggle. The data contains 2253 job listings for data analyst positions scraped from Glassdoor by picklesueat. I hope this analysis helps you hone your job search and land your dream job faster!

In this post, we will be answering the following questions:

1. Which industries/sectors/states have higher demand for data analysts?

2. How much salary can a data analyst expect to earn?

3. What words appear most frequently in data analyst job descriptions?

4. Which features have more impact on the average salary of a data analyst?

1. Which industries/sectors/states have higher demand for data analysts?

Here we list the top 10 industries and the top 10 sectors by number of job postings. We see that there are more job opportunities in industries like IT, Staffing & Outsourcing and Health Care Services & Hospitals, and in sectors like IT, Business Services and Finance.

Below is a regional heat map of job posting counts. A darker color indicates more postings. The scraped data set apparently does not cover all of the states, but within the data we have, California, Texas and New York are the top 3 states by number of data analyst postings.
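The counts behind charts like these can be sketched in a few lines. This is only an illustration, not the code used for this post: the rows are made up, and the column names ("Industry", "Sector", "Location"), the "City, ST" location format and the "-1" marker for missing values are assumptions about how the data set is laid out.

```python
from collections import Counter

# Hypothetical sample rows. The column names and the "-1" missing-value
# marker are assumptions about the data set, not taken from this post.
listings = [
    {"Industry": "IT Services", "Sector": "Information Technology", "Location": "New York, NY"},
    {"Industry": "Staffing & Outsourcing", "Sector": "Business Services", "Location": "San Francisco, CA"},
    {"Industry": "IT Services", "Sector": "Information Technology", "Location": "Los Angeles, CA"},
    {"Industry": "-1", "Sector": "-1", "Location": "Austin, TX"},
    {"Industry": "Health Care Services & Hospitals", "Sector": "Health Care", "Location": "San Diego, CA"},
]

MISSING = {None, "", "-1"}


def top_counts(rows, column, n=10):
    """Return the n most common values of `column`, skipping missing entries."""
    values = [row.get(column) for row in rows if row.get(column) not in MISSING]
    return Counter(values).most_common(n)


def state_of(location):
    """Extract the state abbreviation from a 'City, ST' string."""
    return location.rsplit(",", 1)[-1].strip()


top_sectors = top_counts(listings, "Sector")
top_states = Counter(state_of(row["Location"]) for row in listings).most_common(3)
print(top_sectors)
print(top_states)
```

Counting by state works the same way as counting by industry or sector once the state has been split off from the location string.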

2. How much salary can a data analyst expect to earn?

The histogram shows the distribution of the estimated minimum and maximum salaries. If you are planning to be a data analyst, expect the minimum of your estimated salary range to fall between 24K and 113K, and the maximum between 38K and 190K.
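Getting minimum and maximum salaries out of scraped text usually takes a small parsing step. The sketch below assumes the estimate is stored as a string such as "$37K-$66K (Glassdoor est.)" and that missing values appear as "-1"; both are assumptions about the data set. Taking the midpoint as the "average salary" is also an assumption made for illustration.

```python
import re

# Assumed format: "$<min>K-$<max>K", optionally followed by a note in parentheses.
SALARY_PATTERN = re.compile(r"\$(\d+)K\s*-\s*\$(\d+)K")


def parse_salary_range(text):
    """Return (min, max) in thousands of USD, or None if no range is found."""
    match = SALARY_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def average_salary(text):
    """Midpoint of the estimated range, or None if it cannot be parsed."""
    parsed = parse_salary_range(text)
    if parsed is None:
        return None
    low, high = parsed
    return (low + high) / 2


print(parse_salary_range("$37K-$66K (Glassdoor est.)"))
print(average_salary("$37K-$66K (Glassdoor est.)"))
print(parse_salary_range("-1"))
```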

The boxplot below shows the average salary distribution by state. The two neighboring states, New York and New Jersey, have very similar distributions. If you are in California, expect a wider spread of salaries than in any other state. (Click this to find out more about how to read a boxplot)
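To make reading a boxplot more concrete, here is a minimal sketch of the numbers a boxplot summarizes for each group: the quartiles, the median and the interquartile range (the height of the box). The salary figures are made up for illustration. Note that plotting libraries usually draw the whiskers at 1.5 times the interquartile range rather than at the raw minimum and maximum.

```python
from collections import defaultdict
from statistics import quantiles


def box_stats(values):
    """Summary statistics behind a boxplot for one group of numbers."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,  # height of the box
    }


def box_stats_by_group(pairs):
    """Group (label, value) pairs and summarize each group with 2+ values."""
    groups = defaultdict(list)
    for label, value in pairs:
        groups[label].append(value)
    return {label: box_stats(vals) for label, vals in groups.items() if len(vals) >= 2}


# Hypothetical average salaries in thousands of USD.
salaries = [
    ("CA", 40), ("CA", 50), ("CA", 60), ("CA", 70), ("CA", 80),
    ("NY", 55), ("NY", 60), ("NY", 65),
]

stats = box_stats_by_group(salaries)
for state, summary in stats.items():
    print(state, summary)
```

A larger "iqr" means a taller box, which is what a wider salary spread looks like in the plot.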

The boxplot below shows the average salary distribution by the type of ownership of a company. “Hospital”, “College/University” and “Private Practice/Firm” have similar medians, while the first two types have a relatively wider distribution of average salary than the rest of the ownership types. In comparison, “Government” and “Self-employed” have relatively lower salaries. The variation within “Contract”, “School/School District” and “College/University” is smaller in the data, so your salary will likely stay close to that range if you choose to work at an organization of one of these three types.


Surprisingly, there is not much difference in the salary distribution across company sizes, except for the “Unknown” category.

3. What words appear most frequently in data analyst job descriptions?

From the word cloud, we see that companies are looking for candidates who can provide solutions and support. Many data analysts need to work with clients, deal with products, and create reports. Companies also look for people who have data analytics, communication and teamwork skills. A data analyst job may require a certain number of years of experience, and most positions require a Bachelor’s degree. Among all of the majors, students who graduate with a Computer Science degree would probably have a higher chance of employment as a data analyst.

Word Cloud of Job Descriptions
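A word cloud is essentially a picture of word counts, so the underlying numbers can be sketched with a simple frequency count. The two descriptions and the short stop-word list below are made up for illustration; a real analysis would use the full job descriptions and a much longer stop-word list.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list, for illustration only.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "with", "or", "is", "are", "on", "as", "be"}


def word_frequencies(descriptions, top_n=20):
    """Count lowercase alphabetic words across all descriptions, skipping stop words."""
    counts = Counter()
    for text in descriptions:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts.most_common(top_n)


# Hypothetical job description snippets.
descriptions = [
    "Analyze data and create reports for clients.",
    "Work with clients to deliver data solutions.",
]

print(word_frequencies(descriptions, top_n=5))
```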

4. Which features have more impact on the average salary?

The features used in the linear model to predict the average salary of a data analyst are: “Job Title,” “Rating,” “Location,” “Headquarters,” “Size,” “Type of ownership,” “Sector,” “State” and “Revenue.” Preprocessing included removing rows with missing values or filling missing values with the mean, and creating dummy variables for the categorical columns. Even so, the r-squared is still pretty low: 0.37 on the training set and 0.26 on the test set. This means that these features together explain only a small share of the variation in average salary, so none of them is a strong predictor. The model could be improved by adding more informative features, if a related data set that contains them can be found.
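For readers new to these terms, here is a minimal sketch of two of the steps mentioned above: turning a categorical column into dummy variables and computing r-squared from actual and predicted values. It uses plain Python and made-up numbers, and it is not the code behind the results in this post.

```python
def one_hot(values):
    """Turn a list of category labels into dummy (0/1) columns.

    Returns the encoded rows and the category order used for the columns.
    """
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories] for value in values]
    return rows, categories


def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot, the share of variance the predictions explain."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical sector labels and salaries (thousands of USD).
encoded, columns = one_hot(["Finance", "IT", "Finance"])
print(columns, encoded)

actual = [50, 60, 70, 80]
predicted = [55, 60, 70, 75]
print(round(r_squared(actual, predicted), 2))
```

An r-squared of 1 means the predictions match the actual values exactly, while a model that always predicts the mean scores 0. A training score clearly above the test score, as seen here with 0.37 versus 0.26, suggests the model fits the training data better than it generalizes.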

Conclusion

Who has higher demand for data analysts?

  • Top 3 Industries: IT, Staffing & Outsourcing and Health Care Services & Hospitals
  • Top 3 Sectors: IT, Business Services and Finance.
  • Top 3 States: California, Texas and New York.

How much salary can a data analyst expect?

  • Minimum Salary Range (USD): 24K-113K (Mean: 54K)
  • Maximum Salary Range (USD): 38K-190K (Mean: 90K)
  • If you are in California, your salary could be really low or really high
  • Company size has almost no impact on your salary amount

What should you include in your resume or interview? What skills are required?

Demonstrate your data analytics, problem-solving, communication and teamwork skills. Pay attention to how many years of experience each posting requires. A bachelor’s degree will be required for most data analyst positions.

What major should I choose if I want to be a data analyst?

There are a lot of majors that could give you a good knowledge of data analysis. However, based on this study, choosing a Computer Science major would likely increase your chances of getting a job because it matches more job descriptions.

What would the data analyst job be like?

You will be working on a lot of data analytics projects, most likely in a team. You may also need to work with clients, deal with products, and create reports.

I hope this analysis of data analyst job postings helps you hone your job search. Please check out my GitHub link for the detailed analysis.
