What Makes A Good Data Scientist At A Small Company
Data science is one of those fields that everyone’s talking about but few people know how to do “properly.” Schools are just now starting to pick up on how to teach it. Each company has its own recruiting practices when it comes to data science (heck, some don’t have any at all).
I’ve been working in this field for a bit of time now, have done lots and lots of data science interviews, and have recruited a team of people that work on data science problems close to my heart.
I’ve noticed that the things that I value and the things that work for me aren’t always the same things that big companies tend to recruit for. So in this post, I wanted to list a few things that I think make a really good data scientist, in hopes that larger companies take note, and in hopes that candidates get a good feel for what recruiters look for.
For the sake of this post, I’ll primarily be discussing data science in the context of building and deploying predictive analytics solutions, rather than in the context of surfacing insights (though much of what I talk about will be applicable to both contexts). With that, let’s begin.
The Most Important Aspects
There are a few traits and skills that I’d consider to be the most important aspects of a good data scientist. Lacking any of them is a deal breaker: if a candidate doesn’t have these, then I already know it won’t work out. Fortunately, these are fairly basic and are easy enough for anyone to master.
Communication skills (and the ability to work well with others)
By far, the most important skill I look for in a data scientist (and really, anyone that I work with) is the ability to communicate well. Things get accomplished within teams. One person’s work will frequently depend on progress from another person. In order to create a functioning work environment, and to progress forward quickly, a team needs to be able to disseminate ideas effectively. Communication is the most important aspect here.
Frequently, I’ve found that people who are good communicators are also able to work well with others, which is extremely important in a small team. In my opinion, the two are so closely tied together that I briefly considered “works well with others” as the next important skill I look for, but decided against separating it out from communication skills because I can’t remember a time when I’ve had one and not the other.
Time management skills
I have yet to find an organization that isn’t a slave to time. Technical problems require a good amount of time, and are frequently fraught with hurdles, but it’s still important for a data scientist to understand how to get things done in an effective manner.
Note that sometimes getting things done may require surfacing issues and recruiting some help, or it may be saying flat out “we shouldn’t be doing this now since it’s going to take too much time.” But understanding how to get things done effectively is crucial.
Problem solving skills
Technical work is just a series of solving one problem after another. You could even argue that this is the nature of all work. In general, it’s important for someone to know how to go about solving tough problems. This goes hand-in-hand with the ability to learn new concepts quickly. When it comes down to it, I’ll always go with someone who can pick up and run with new concepts over the person that’s mastered an important skill but can’t learn new things.
Working hard should go without saying, which is why I didn’t list it as the number one priority, but it is essential. You can’t really get anything done without being willing to work at it. This is obvious, but I figured it’s better to over-communicate than under-communicate (going back to the first point, I suppose!).
The things mentioned above are far and away the most important things I look for. I’m a big proponent of hiring for the ability to learn and grow and work well.
With that said, it’s important for people to have some solid skills that they can put to use. These skills are important and are generally what I’ll test for when recruiting a new data scientist. It’s important to note, however, that I’m not testing for the ability to solve brainteasers, or trying to trip up someone in an interview. I’m testing that the person can do what they say they can do, or that they can think through a problem in technical terms in a way that’s going to reflect their real-world, day-to-day job.
Really, there’s no two ways around it. In order to be a good data scientist (for the purposes of building up a predictive analytics solution), you’ll have to be a decent coder.
I can teach you good software development skills, but you’re going to need to put in the effort to adopt them. Coding is essential, and it’s important to be able to do it at least in a way that can be iterated upon and that can add value to the company’s product or business.
Good grasp of data distributions, data cleaning, and learning algorithms
As a data scientist, it’s important to know how to proceed towards your goal given the resources you have at your disposal. You need to have a basic understanding of how to take raw data and prepare it to be fed into a learning algorithm, and you should also know a bit about learning algorithms themselves.
Note, I don’t personally care if you can give all the details about how an algorithm works under the hood. What I do need to know is that you have a high-level understanding of what’s going on in the algorithm, and where it may work, and where it may fail.
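To make that concrete, here is a minimal sketch of the kind of high-level understanding I mean, in plain Python. Everything in it is made up for illustration: the toy data, the `clean`, `fit_line`, and `r_squared` helper names, and the choice of a straight-line fit as the “learning algorithm.” It cleans some raw records, fits a line, and then shows one case where that model works and one where it fails.

```python
from statistics import mean

def clean(rows):
    """Drop records with a missing value and cast the rest to float."""
    return [(float(x), float(y)) for x, y in rows
            if x is not None and y is not None]

def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def r_squared(points, slope, intercept):
    """Fraction of the variance in y that the fitted line explains."""
    y_bar = mean(y for _, y in points)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    ss_tot = sum((y - y_bar) ** 2 for _, y in points)
    return 1 - ss_res / ss_tot

# Hypothetical raw data: string values and one missing target.
raw_linear = [(0, "1"), (1, "3"), (2, None), (3, "7"), (4, "9")]
# Hypothetical data with a symmetric, curved relationship (y = x^2).
raw_curved = [(x, x * x) for x in range(-3, 4)]

linear_points = clean(raw_linear)
curved_points = clean(raw_curved)

linear_r2 = r_squared(linear_points, *fit_line(linear_points))
curved_r2 = r_squared(curved_points, *fit_line(curved_points))

print(f"linear data R^2: {linear_r2:.2f}")
print(f"curved data R^2: {curved_r2:.2f}")
```

The line explains the first data set well and the second one not at all, even though the second has an obvious pattern. A candidate doesn’t need to derive least squares on a whiteboard, but they should be able to say why that second fit fails and what they would try instead.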
I love talking to folks who are passionate about data science. It’s true that there can be something quite “magical” about seeing a prediction materialize from a set of previously meaningless data. Passionate people are driven to learn and understand more about their field, and that’s the type of person I want on my team.
If I can find someone that has all of the traits I’ve listed above, I’ll be fairly happy. But if I can find someone that has those traits, plus a few nice-to-have traits that I’ll mention below, I’ll be ecstatic.
Data science is founded on a good amount of math. Knowing that math allows a data scientist to modify the inner workings of algorithms (though such a task isn’t usually needed in applied/practical machine learning, the need for it does arise every now and then). Knowing the underpinnings of a learning algorithm or the statistical and probabilistic theory behind a concept can prove invaluable at certain points, and it would be fantastic to have someone on the team who can apply their theoretical knowledge to real world problems.
Ability to grok white papers
Data science is a quickly evolving field, with new techniques and methodologies posted daily. It’s possible that a new technique that could greatly improve your predictive analytics application may be buried in a hard-to-read academic white paper.
Having the ability to read and understand these white papers could mean the difference between being at the forefront of implementing the “next” convolutional neural network, or being ignorant about it until it’s implemented in a popular framework.
If you’re a data scientist who has interviewed at larger companies, this ordering of priorities may seem counterintuitive to you.
In fact, when I was interviewing for data scientist positions at a large, well-known Silicon Valley firm, the focus was frequently on statistical knowledge, understanding of the inner workings of algorithms, and giving silly coding tests.
The funny thing is that some candidates who did well in those sorts of interviews didn’t do so hot when working in their day-to-day roles. The reason I emphasize the softer skills first is that all throughout my career, from consulting to software engineering to startups to data science, the people who provide the most value are the people who can help and work well with those around them, regardless of their functional roles.
With that said, it’s important to have a decent foundation in the technical skills that are needed for data science. The good part, though, is that those skills can be acquired over time with practice and diligence, as long as you’re willing to put in the effort to learn and do some hard work.
So I’ll end with some questions. What about you? How do you approach data science recruiting? What are some of the most important areas you look for and how do you ensure that candidates have them?