Exploring the Latest Salary Trends in Data Science and AI: An In-Depth Analysis

SHAP values for country, experience, job title, year, and more

Dmytro Iakubovskyi
Published in Data And Beyond
4 min read, Jul 10, 2023

Photo by Giorgio Trovato on Unsplash

In this article, I use the latest public dataset from the ai-jobs.net website, which contains (as of November 2023) 4,858 gross salary records for 2022–2023 from professionals in the Data domain, including Data Scientists, Data Engineers, Data Analysts, Data Managers, and many more. The dataset is also publicly available on Kaggle. Full details of the analysis can be found in this public Kaggle notebook.

Step 1 — data preprocessing

Here, data preprocessing consists of the following steps:

  • converting the label (yearly gross salaries) to kUSD/year;
  • combining the Experience and Expertise Level columns, as well as the Employee Residence and Company Location countries;
  • encoding rare categories in the employee_residence, job_title, and experience_level columns, keeping no more than 50 distinct categories per column and at least 15 data samples per category;
  • finally, dropping unused columns.
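
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the notebook: the input column names (`salary_in_usd`, `expertise_level`, `company_location`, `work_year`), the `" / "` separator used to combine columns, the `"Other"` label for rare categories, and the helper names `encode_rare` and `preprocess` are all hypothetical.

```python
from collections import Counter


def encode_rare(values, min_count=15, max_categories=50, other="Other"):
    """Replace rare categories with a single placeholder label.

    A category is kept only if it has at least `min_count` samples, and at
    most `max_categories` distinct labels remain (placeholder included).
    """
    counts = Counter(values)
    # Categories with enough samples, most frequent first.
    keep = [cat for cat, n in counts.most_common() if n >= min_count]
    keep = keep[:max_categories]
    if len(keep) < len(counts):
        # Something will be grouped, so leave room for the placeholder.
        keep = keep[:max_categories - 1]
    keep = set(keep)
    return [v if v in keep else other for v in values]


def preprocess(rows):
    """Apply the four steps to a list of dict records (assumed column names)."""
    out = []
    for r in rows:
        out.append({
            # Step 1: yearly gross salary in kUSD/year.
            "salary_kusd": r["salary_in_usd"] / 1000.0,
            # Step 2: combine paired columns (concatenation is an assumption).
            "experience_level": f'{r["experience_level"]} / {r["expertise_level"]}',
            "employee_residence": f'{r["employee_residence"]} / {r["company_location"]}',
            "job_title": r["job_title"],
            "work_year": r["work_year"],
        })  # Step 4: any column not copied above is dropped.
    # Step 3: group rare categories in the three categorical columns.
    for col in ("employee_residence", "job_title", "experience_level"):
        encoded = encode_rare([r[col] for r in out])
        for r, v in zip(out, encoded):
            r[col] = v
    return out


# Usage: two frequent job titles are kept, the rare one is grouped.
titles = ["Data Scientist"] * 20 + ["Data Engineer"] * 16 + ["Data Poet"] * 2
encoded = encode_rare(titles)
print(Counter(encoded))
```

The thresholds default to the values quoted in the list (15 samples per category, 50 categories per column); how ties and the placeholder label count toward the 50-category cap is a design choice of this sketch.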

Note that, unlike the previous analysis,

Dmytro Iakubovskyi
Top writer in AI, Movies | Senior data scientist | Editor in Data And Beyond | https://www.linkedin.com/in/dima806/