Data science in 2017 — Part I

Naren Santhanam
Dec 22, 2017


Kaggle is the most popular platform for data scientists to connect, learn, find and explore data, and compete in machine learning challenges. Since its launch in 2010, Kaggle’s platform has attracted a diverse set of data scientists and machine learning engineers. There are more than 1 million users of the platform today, making it the world’s largest online data science community.

Earlier this year, Kaggle conducted an online survey to collect a wide variety of information about people who work with data on a daily basis. There were more than 16,000 responses from all over the world, with respondents providing information about their demographics, their jobs, and their opinions on what’s next in data science. Because Kaggle is the most popular platform for data scientists in the world, the survey responses can reasonably be assumed to be representative of data scientists across the globe.

As 2017 comes to a close, let us analyze the data from this survey to assess the state of data science this year and what’s coming in 2018. I extracted this data, analyzed it in Tableau, and will share my findings in this series of posts, starting with the demographics of data professionals (data professional = anyone who works with data in some capacity in their daily work).

How old are data professionals?

One of my first questions about demographics was age. The distribution below shows that most data professionals are in their late twenties to early thirties, with a median age of around 30. The distribution differs slightly between men and women and, as expected, shifts to the right when you filter for respondents with more experience. Overall, the data indicates that data professionals tend to be on the younger side, and leadership positions in data science may be in high demand going forward.

Click the image for interactive visual
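For readers who want to reproduce this kind of summary outside Tableau, here is a minimal Python sketch. The records are toy values made up for illustration, not survey data, and the field names (`age`, `gender`, `years_experience`) are hypothetical, so map them to whatever columns your copy of the survey export uses.

```python
from statistics import median

# Toy records for illustration only; these are NOT values from the survey.
# The field names ("age", "gender", "years_experience") are hypothetical.
respondents = [
    {"age": 24, "gender": "Female", "years_experience": 1},
    {"age": 27, "gender": "Male", "years_experience": 2},
    {"age": 29, "gender": "Male", "years_experience": 3},
    {"age": 30, "gender": "Female", "years_experience": 4},
    {"age": 31, "gender": "Male", "years_experience": 6},
    {"age": 35, "gender": "Female", "years_experience": 8},
    {"age": 42, "gender": "Male", "years_experience": 15},
    {"age": None, "gender": "Male", "years_experience": 2},  # age not reported
]


def median_age(rows):
    """Return the median of the 'age' field, skipping rows with no age."""
    ages = [r["age"] for r in rows if r.get("age") is not None]
    return median(ages)


def median_age_by(rows, key):
    """Group rows by the given field and return the median age per group."""
    groups = {}
    for r in rows:
        groups.setdefault(r[key], []).append(r)
    return {name: median_age(members) for name, members in groups.items()}


overall = median_age(respondents)
by_gender = median_age_by(respondents, "gender")

# Keep only respondents with at least 5 years of experience (an arbitrary cutoff).
experienced = median_age(
    [r for r in respondents if r["years_experience"] >= 5]
)

print("Median age, all respondents:", overall)
print("Median age by gender:", by_gender)
print("Median age, 5+ years of experience:", experienced)
```

With the toy records, the median for the experienced subset comes out higher than the overall median, which is the rightward shift described above.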

Where are data professionals from?

A large portion of data-related work is concentrated in the US, especially in Silicon Valley. The survey data confirms this: the highest number of responses came from US-based data professionals.

Click the image for interactive visual

What does the gender distribution look like?

Data science can be considered a STEM field, given the amount of coding and math knowledge required to be successful. Does the gender imbalance in STEM fields apply to data science too? Unfortunately, it does.

Click the image for interactive visual

What is the educational background of data professionals?

A majority of data professionals seem to have at least a Bachelor’s degree, with little difference in distribution between men and women. Computer Science, Math and Engineering can be considered “gateway degrees” to data science.

Click the image for interactive visual

What skills do they have and how did they learn?

One of the most frequently asked questions in data science is: “What skills does one need to be a data professional, and how does one acquire them?”

Encouragingly, a majority of the data professionals in this survey said that they either taught themselves their skills or learned them through online courses (Coursera, edX, Udacity and so on).

Click the image for interactive visual

In the next part, we will analyze the work profile of data professionals — what kind of employer they work for, what kind of work they do, and what challenges they face in their daily job.
