5 Tips for Conducting Data Science Interviews
Step 1: Prepare for them
#tldr Technical interviewing is a broken process for everyone. While preparing to conduct an interview, you may want to keep in mind these 5 tips: 1) Keep an open mind; 2) Don’t get sucked into discussing a single thing; 3) Introduce time constraints to problem-solving questions; 4) Discuss issues of scalability; 5) Don’t forget data munging, retrieval and cold starts.
Technical interviewing is a broken process for everyone.
It usually takes a few attempts to internalize the fact that the stuff you do in technical interviews isn’t actually the stuff you do at work every day. For this reason technical interviewing is very much a numbers game; the more you go through the better you get at it.
Having spent some time on both sides of the table, I recently admitted to myself that conducting interviews was a piece of cake; but conducting GOOD interviews is HARD. What do I mean by good? I mean a one hour conversation that:
- Limits the weight of your personal biases in the decision process.
- Distills some truth about the person’s skills alignment for the role.
- Figures out their ability to work with passion and kindness in your team.
Truth is, it’s hard to know WHAT exactly you should be evaluating in a candidate and HOW. Do they know what L1 regularization is? Unsupervised versus supervised learning? You can cover a lot of basic knowledge, but what does that tell you? When I look at all the interviews I’ve experienced, I’m led to think there are probably as many different interviews as there are interviewers. So how do you make sense of it all as a candidate and as an interviewer?
There’s obviously one thing above all that will increase your chances of success on either side, and that’s to prepare for the interview. Don’t assume that because you’re on the safe side you can take it easy. Sure, your job is not on the line, but you risk wasting everyone’s time by not giving this process sufficient thought. In this post I list a few things I try to remind myself of while I prepare for an interview, in an attempt to tease out the more subtle aspects of how a candidate’s mind works.
1. There’s No Single Solution
We all come with baggage of ideas and biases when entering an interview: the last things you’ve worked on, the problems you’ll be tackling next week, your ideas about how those issues should be tackled, what you’d do if you had more time, your favorite technique, etc. Holding onto those thoughts is a mistake. When I fail to let go of them, I sometimes find myself looking for a mirror in the candidate. I unconsciously want the person to tell me what I’m thinking, expecting them to find their way to the solution I pre-baked, rather than truly considering the answer they give and using it to engage in a discussion around the topic.
There are many solutions to a single problem. The brainstorming that arises when I don’t expect a specific solution yields much more interesting discussions than when I’m readily satisfied by an answer (because it turns out to be the one I expected) or reject a proposal because it wasn’t what I had in mind.
If there are 5 people in a room with the same idea, there are maybe 4 people too many
2. Aim Wide and Deep
Imagine the following scenario: the conversation flows well between you and the candidate. There seems to be a mutual understanding about the usefulness of L2 regularization for a given scenario. In fact, the candidate had to deal with a similar problem on a previous project. You then spend the next 20 minutes discussing this idea in what turns out to be an interesting back and forth with a potential future colleague. You leave that interview room pleased with the candidate thinking they know their stuff.
But what have you figured out about what they don’t know?
It’s easy to get trapped discussing a single approach. The problem is that it offers a very limited perspective on the candidate. It may be that the candidate knows a lot about the topic, or that the question is highly relevant to your own work. In both cases, it’s better to remain cognizant of the time spent so you can steer the conversation toward the various aspects of the role to be filled. You want to give the candidate a chance to shine where they feel most comfortable while poking at various topics to get a sense of the breadth and depth of their knowledge and experience. The goal isn’t to make the candidate feel like they’re under interrogation but to get a sense of their strengths and weaknesses in an honest and constructive way. Figuring out which is which is why you need to poke at different places. At the end of the day, it’s perfectly okay not to know the difference between two optimization algorithms; everyone has their blind spots.
3. Introduce Time in Their Thinking
Time is critical in most businesses. There’s always much more to be done than the time allocated for it. How does the candidate prioritize their work? Can they come up with a quick ‘n dirty solution that’ll work as a proof of concept? Can they expand on that idea and provide a mid-term improvement? What would they do if they had an infinite amount of time? This tells you a lot about how adaptable the candidate is and whether they can come up with pragmatic approaches to whatever problem they’re facing.
4. Scalability
Most of the things built these days need to scale. Probing candidates about how their approach would work on large amounts of data tells you a lot about the breadth of their thinking. Can their approach easily be parallelized? Efficiently distributed? Beyond the challenges related to modeling, prediction, cross-validation, and accuracy, how do they see their work eventually being pushed into production? Have they had any experience working with distributed computing? Are they familiar with the basic concepts of mapping and reducing?
It’s interesting to see a candidate come up with a MapReduce-based solution when appropriate. Anything that takes parallelism into account (e.g. queuing or multi-core processing) is a good sign. Nevertheless, not being able to convincingly discuss any of these points is not a show stopper for me. You can easily bring them up to speed on some of the core concepts and see where they take it from there. It will also help reveal certain facets of their personality, such as the ability to think on their feet or to put their experience in context when facing new constraints and environments.
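For interviewers who want something concrete to point at when bringing a candidate up to speed, here is a minimal sketch of the mapping and reducing idea in Python. It is only an illustration under my own assumptions: the word-count task, the function names (map_chunk, merge_counts, word_count) and the sample data are all made up for this example, and it runs on a single machine rather than on an actual MapReduce cluster.

```python
from collections import Counter
from functools import reduce


def map_chunk(lines):
    """Map step: count words in one chunk of lines, independently of other chunks."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def word_count(chunks, map_fn=map):
    """Run the map step over every chunk, then fold the partial results together.

    map_fn defaults to the built-in map. A parallel map that takes the same
    (function, iterable) arguments could be passed in instead.
    """
    partials = map_fn(map_chunk, chunks)
    return reduce(merge_counts, partials, Counter())


# Hypothetical sample data: each inner list stands in for one shard of a larger dataset.
chunks = [
    ["the cat sat", "the dog ran"],
    ["the cat ran"],
]
totals = word_count(chunks)
print(totals["the"], totals["cat"], totals["ran"])  # 3 2 2
```

The point worth discussing with a candidate is that each call to map_chunk depends on nothing but its own chunk, so the chunks could be processed on separate cores or machines, and that merge_counts is associative, so the partial results can be combined in any grouping.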
5. Data Retrieval
In my Starting at a Startup post I discuss how often you’ll be asked to come up with solutions and produce results without any readily available dataset. This fundamental problem of searching, listing, acquiring, aggregating, manipulating, filtering, and merging data is central to any Data Science position.
Most applicants have only worked with toy datasets and labelled data. This prepares them for only a small portion of their daily tasks, and it’s worth the effort to dig deeper on that front. My interviews will typically involve at least one real-world scenario where I give the candidate next to nothing to latch onto. Most of them will at first feel some level of discomfort and uncertainty (one unfortunately did not easily grasp how you could do anything with nothing). Fortunately, some will rise above it and begin considering the small steps they can take towards having some initial data to work with, then make incremental improvements from there.
Got a few more tips to help others conduct rewarding and useful interviews? Would love to hear about them in the comment section!