Predicting the future of data science

Nikki van Ommeren
ING Blog
Jul 7, 2020

We have seen massive growth in the amount of data created in the past decade, going from 1.2 zettabytes per year in 2010 to an estimated 53 zettabytes per year in 2020 [1][2]. This is a whopping 44-fold increase in only a decade! According to IDC, a research firm focused on IT, the COVID-19 pandemic will only accelerate this process in 2020 and 2021 [3].

It might not come as a surprise that this increase has led to the rise of a multitude of data-related jobs. I’m seeing the same within ING. Nowadays, we have positions like data engineer, data analyst, machine learning engineer and model validator. One of them, data scientist, has even been named ‘the sexiest job of the 21st century’ [4]. It also earned the title of ‘best job’ in the US from 2016 to 2019 [5].

Although it is exciting to see that our skill set is in demand, I have been wondering whether my hard-earned expertise will still be relevant ten years from now. I actually do not expect my job to exist in the same form as it does today. To be best prepared for the future and to figure out how I can successfully advance my career, I went to one of the top conferences on machine learning (NeurIPS) and asked PhD researchers, professors and engineers the following tricky question:

What does the future job of a Data Scientist look like?

Recruiter at OpenAI

“How far in the future are we talking about: 5, 10, 20 or 30 years?” In the far future, the job will hopefully not exist anymore, as machines will do all the work for us. I expect all technical work to be fully automated. The focus will be more on the human aspect of data science, for example deciding what the goal of a project is: we will concentrate on the things that are harder for machines to automate. I think the job will be less technical and more human focused.

ML Engineer at HyperScience

In my daily job, I’m constantly reminded of the automation of work, as I work for a company that digitises and classifies (hand-written) documents. As with many jobs, I believe the job of a data scientist will be partly automated, but you will always need people to make decisions on things like how to handle missing data and how to clean your data. Data scientists will also still be needed to explain to management what is and is not possible with a model. And even if, for example, the architecture is automated, we still need to navigate through examples and make choices about which model and technology to use in the end. So, although the job will become more focused on decision-making and explaining, I do not think that our job will become obsolete in the foreseeable future.

Director of data science Master Program at Northeastern University

I see three different types of data scientists, namely those:

  • interested in machine learning;
  • particularly interested in one sector, for example healthcare or finance; and
  • who just like to move data around.

But no matter what someone’s motivation for becoming a data scientist is, I want to stress that communication is the most important factor. Yes, parts of the job will become more automated, but we will always need a human to validate the model. We need a thorough understanding of why a model fails — it can even be critically dangerous to use a model without any checking.

Research ML in Hedge Funds at Voleon

In the financial sector, I expect an increasing need for data scientists. Companies use an increasing amount of information for their predictions, for example the weather and the news, and I believe there is still more to gain. More and more firms, such as Bloomberg, sell data that could help financial institutions. You can read about how the quant revolution started in ‘The Man Who Solved the Market’, a book by Gregory Zuckerman that I highly recommend.

Research Staff Member at IBM

It is hard to predict the exact changes to the scope of activities related to the job, but I believe the fundamentals will not change. The maths, programming and communication will stay important factors. Furthermore, the job will always require critical thinking and a willingness to learn. These skills and values will not lose their importance, regardless of the degree of automation in the future.

Manager of Cisco data science Lab

I believe it depends on the kind of firm you work for: the role at large companies will become more specialised, while medium and small companies will need all-round data scientists. Besides implementing the model, it is also important to communicate and explain how it works, no matter what kind of company you’re working for.

What can we say about the future of data science?

As you will have read above, there was no full agreement and answers varied widely from person to person. Furthermore, people used quite different definitions of a data scientist, and they also differed on when ‘the future’ actually starts. Despite these differences in interpretation, there are some points we can deduce from these interviews:

  • the job will become more human focused;
  • the technical work will be more and more automated;
  • understanding how a model works and explaining it to our stakeholders will gain in importance; and
  • model validation can never be fully automated.

One trend was clearly apparent from the answers I received: almost all respondents agreed that the human aspect will become increasingly important.

I personally agree with the research staff member who works for IBM that the fundamentals of programming, maths and communication will stay essential, at least for the coming ten years. I do believe that the composition will change and is in fact already changing. We can already see a trend towards more models included in packages and less complex programming languages such as Python. Is Python our final destination or are we moving to even simpler plug & play models?

We need strong communication skills and a solid foundation in mathematics to understand and be able to explain a model to its users. We are currently witnessing two developments significantly affecting the job of a data scientist: more regulation and more automation. I believe both of them will shift the focus of the job onto the need to explain how the model works and what the consequences are. Effectively explaining these will be key to success in the decade to come.

What about ING?

ING has also not been sitting on its hands. Recently, it (i) started a collaboration with the University of Delft on machine learning, (ii) set up an international Analytics traineeship to attract new talent and (iii) created a Global Analytics community, covering different domains of expertise (e.g. fraud detection, retail banking, HR, and wholesale banking) [6].

The four points which I deduced from the interviews (please see above) are also being incorporated at ING. ING is increasingly focusing on the explainability and automation of its models. This is evidenced by a growing model validation department, which is tasked with gaining a thorough understanding of each model and its underlying assumptions and limitations. Furthermore, ING is automating its machine learning models, enabling data analysts with limited programming knowledge to create and apply these models in their work. With automation taking away some of the technical processes, the human aspect of the job is poised to gain more importance.

What do you think?

What do you believe the future will hold for data scientists: will the job become less technical, demand stronger communication skills or be fully automated? And, perhaps more importantly, how should we prepare for such a future? I would greatly appreciate it if you could share your views.

[1] https://www.forbes.com/sites/gilpress/2020/01/06/6-predictions-about-data-in-2020-and-the-coming-decade/#46e76e54fc36

[2] https://www.seagate.com/files/www-content/our-story/trends/files/idc-seagate-dataage-whitepaper.pdf

[3] https://www.idc.com/getdoc.jsp?containerId=US44797920

[4] https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century

[5] https://www.glassdoor.com/List/Best-Jobs-in-America-LST_KQ0,20.htm

[6] https://www.ing.com/Newsroom/News/Data-driven-from-bytes-to-business.htm
