Data science in 2017 — Part II

Naren Santhanam
4 min read · Dec 23, 2017


In my earlier post, we looked at the demographics of data professionals. We saw that they are a relatively young group, mostly male, with a good educational background and a quantitative bent of mind. In this post, let us examine their work profile. Specifically, we will look at the types of employers they work for, the kind of work they do, and the challenges they face.

What type of employers do they work for?

A large portion of the data professionals who answered the survey said they work at large technology companies with more than 10,000 employees. A majority of them have less than 5 years of experience, which suggests that this is still a growing field.

Click the image for interactive visual

How much do they earn?

Data science is considered a skill in high demand, and it shows in the above-average median compensation levels. The US, Australia and Western Europe lead in terms of median compensation. All figures have been converted to USD so that they can be compared.
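To make the conversion step concrete, here is a minimal sketch of how salaries reported in local currencies could be converted to USD before taking a median per country. The responses, the exchange rates and the function name median_usd_by_country are all made up for illustration; they are not the survey data or the rates behind the chart.

```python
from collections import defaultdict
from statistics import median

# Hypothetical exchange rates: USD per one unit of local currency.
RATES_TO_USD = {"USD": 1.0, "AUD": 0.8, "EUR": 1.2}

# Hypothetical responses: (country, currency, annual compensation).
RESPONSES = [
    ("United States", "USD", 100000),
    ("United States", "USD", 120000),
    ("United States", "USD", 90000),
    ("Australia", "AUD", 100000),
    ("Australia", "AUD", 150000),
    ("Germany", "EUR", 50000),
    ("Germany", "EUR", 70000),
    ("Germany", "EUR", 60000),
]


def median_usd_by_country(responses, rates):
    """Convert each salary to USD, then return the median per country."""
    by_country = defaultdict(list)
    for country, currency, amount in responses:
        by_country[country].append(amount * rates[currency])
    return {country: median(values) for country, values in by_country.items()}


medians = median_usd_by_country(RESPONSES, RATES_TO_USD)
# Print countries from highest to lowest median compensation.
for country, value in sorted(medians.items(), key=lambda kv: -kv[1]):
    print(f"{country}: {value:,.0f} USD")
```

The median is used rather than the mean because a handful of very high salaries would otherwise pull the figure up.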

Click the image for interactive visual

What do they actually do?

It’s true that the key job function of data professionals is ultimately to analyze data and derive insights, which in turn inform product and strategic decisions. The day-to-day work, however, varies with the job title. For example, data engineers build the data infrastructure, whereas data scientists analyze and interpret data to support product or business decisions. Yet if we look at how data professionals divide their time, we can see that a large portion of it goes to collecting and gathering data, irrespective of years of experience or job title.

Click the image for interactive visual

How do they do what they do?

What tools do data professionals use for these tasks? We know that Python and R are popular in the data science community. In fact, Python is the dominant tool in data science today. With machine learning frameworks such as Keras, TensorFlow and PyTorch available in Python, it is hard to see a successor emerging, at least in the short run.

Data professionals have a wide range of algorithms to choose from. Some of the most popular are logistic regression, decision trees, time-series analysis and text analysis. I suspect these may be displaced by neural-network-based algorithms in the next year.

Click the image for interactive visual

What challenges do data professionals face?

Anything worth doing is challenging, and data science is no exception. So what are some of the toughest challenges that data professionals face? Earlier, we saw that a large portion of their time is spent collecting and gathering data. This is because data is not always available in a format that is easy to analyze, and a lot of preprocessing is needed before any model can be run on it. That’s why dirty data is the top challenge in this field. A lack of data science talent is also cited as one of the top challenges. This should be an encouraging sign for anyone wanting to get into data science.
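To give a feel for what that preprocessing looks like, here is a small sketch that cleans a free-text salary field. The sample values and the clean_salary helper are hypothetical and only illustrate the kind of inconsistencies (stray whitespace, currency symbols, shorthand, missing answers) that make raw data hard to analyze.

```python
def clean_salary(raw):
    """Return a salary as a float, or None if the value is unusable."""
    if raw is None:
        return None
    text = str(raw).strip().lower()
    # Treat common "no answer" placeholders as missing.
    if text in {"", "n/a", "na", "none", "-"}:
        return None
    # Drop currency markers and thousands separators.
    for junk in ("$", ",", "usd"):
        text = text.replace(junk, "")
    text = text.strip()
    # Expand a trailing "k" (e.g. "85k") into thousands.
    multiplier = 1.0
    if text.endswith("k"):
        multiplier = 1000.0
        text = text[:-1]
    try:
        return float(text) * multiplier
    except ValueError:
        # Anything that still is not a number is treated as missing.
        return None


raw_values = ["$120,000", " 85k ", "N/A", "", None, "95000 USD", "unknown"]
cleaned = [clean_salary(v) for v in raw_values]
usable = [v for v in cleaned if v is not None]
print(f"{len(usable)} of {len(raw_values)} values usable: {usable}")
```

Even in this toy case, more than half of the entries are lost or need repair before a single model can be run, which is a big part of why data preparation takes up so much time.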

Click the image for interactive visual

In the next part, we will look at what is required to get into the data science field.
