Hiring a Data Scientist? Focus on the Math!
Data science is a field at the intersection of mathematics, technology and business. It draws from many skill sets including statistics and programming, as well as from deep domain expertise. But that doesn’t mean that individual data scientists have to be multidisciplinary. In fact, searching for a polymath data scientist who can do it all is likely to lead you to failure. That’s because even if you find a person who has it all, you’re unlikely to get the depth of skill in any one area (math, technology, business) or the scale that your organization needs.
Chasing the polymath also distracts data scientists from continuing to develop the core skill that underpins data science: mathematics.
Data Science Is Multidisciplinary, Data Scientists Aren’t
There are numerous articles online that espouse the view that data scientists should have a multitude of skills. Such articles call for data scientists to be programmers, statisticians, data whisperers, domain gurus, communicators, problem solvers, Spark / Hadoop experts and product geniuses. In my view, it is a mistake to look for unicorns like this, because they just don’t exist. A better approach is to look at data science as a team sport, where you potentially have multiple specialists in each of the key skill areas.
So what might that look like in practice?
Take a look at Figure 1 below. In this model, the magic of data science happens in a team that includes the math skills of a data scientist, the technology know-how of an analytics architect and the business / domain focus of a product manager.
Figure 1: Data science happens at the intersection of Math, Tech and Business
Importantly, the product manager should probably be hired first, since it’s critical to start out by clearly defining your target, or desired outcome, in any data science undertaking. By identifying and prioritizing the most impactful business insights, the product manager sets the stage for the data scientist, who in turn can help define the data sandbox for the analytics architect. The analytics architect then leads the programmers, infrastructure admins and ETL engineers, among others, needed to deliver the software infrastructure to support your data science effort.
Does all of this mean that your data scientist only needs to be skilled in mathematics? No, that person definitely needs to have a mix of skills, but their overwhelming emphasis should be on mathematics. Figure 2 shows what the relative emphasis of a typical data scientist might be.
Figure 2: Relative priority of skills for a data scientist
The point here is to make sure your candidate has deep mathematics skills first and foremost. Then look for skills in SAS, SPSS or R, exposure to programming languages, strong research skills and the ability to communicate concepts.
Taking this approach to finding data scientists and building a data science team will mean that you find many more suitably qualified candidates than if you try to go after the mythical polymath unicorn. It will also mean that your data science project is more likely to succeed and scale to deliver the insights your organization needs.