How to Crack a Data Science Interview — Part 1: Technical & Business Aspects

Zacharias Voulgaris
Published in Bootrain Blog
7 min read · May 22, 2020

Data scientist has established itself as one of the most promising job titles of our time. Working with data full of insights, discovering useful patterns and relationships, and implementing cutting-edge smart systems are all fascinating achievements brought about by data scientists. In this article and the next, we provide some useful insights on how to become a data scientist by succeeding in a data science interview.

This is the first of a two-article series about how to nail data science interviews. Here we talk about the technical and (to a lesser extent) the business aspects of a typical data science interview process. In the next article, we’ll look at how you can handle the non-technical aspects of this process (e.g. soft skills, your general presentation as a candidate, etc.). Our goal is to provide a guide and a framework for those who are preparing for data science interviews to achieve what they aspire to. We appreciate feedback and hearing about your experience with interviews for data science roles, so feel free to comment on this article afterward.

Let’s start by sketching the holistic view of a typical data science interview.

Data Science interviews in general

Since interviews remain the primary method for recruiting data scientists, it is important to know how to prepare for them. In its simplest form, we can differentiate between two different but intertwined aspects of a data science interview:

  1. The first one is the technical aspect of the interview process. During this process, you need to confront the technical challenges coming from programming, machine learning, etc. Data science is an interdisciplinary field and hence its technical stack is quite vast.
  2. The second aspect covers business-related matters, such as how to communicate your technical results to a non-technical audience and how to show your domain knowledge of the area the company operates in.

We dedicate this article mainly to the technical aspects of data science interviews, touching on the business side more briefly. For the non-technical aspects, you can turn to the second article when it becomes available. It doesn’t matter if you’re an aspiring data scientist or an experienced one who wants to change jobs; you need to be prepared for the technical aspects, since you must pass the technical interviews to make it to the later stages. Now, let’s talk about what kinds of topics await you during a technical data science interview.

Outline of a typical technical interview

As we said, data science demands a wide range of technical capabilities. This is because it is an interdisciplinary field that combines math, statistics, programming, and algorithmic thinking, among others. In this respect, you need to be prepared for at least the following topics:

  • Programming and algorithm development (usually in Python)
  • Analytical thinking and problem solving
  • Machine Learning algorithms
  • Data science processes and workflows
  • Business-related matters

Now, let’s talk about these topics one by one and discuss how you can prepare yourself for each one of them.

Programming and algorithm development

This part of the interview entails the assessment of your programming skills and/or how you can tackle a problem algorithmically. It can be both theoretical and practical.

Typical questions that you might be asked are:

  • What are the sorting algorithms you know of?
  • What are the computational complexities of those sorting algorithms? (big O notation)
  • Given an array of numbers, how do you get the largest k elements?
  • What are some efficient dimensionality reduction algorithms?
  • How can you estimate the probability of an event if you don’t have enough information to do so analytically using a formula?
  • Etc.
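
As an illustration of the "largest k elements" question above, here is a minimal sketch in Python. The function names are made up for this example, and the heap-based version is only one reasonable answer among several. It relies on the standard library's heapq.nlargest, which runs in roughly O(n log k) time, compared with O(n log n) for sorting the whole array first.

```python
import heapq


def largest_k(numbers, k):
    """Return the k largest values, in descending order."""
    if k <= 0:
        return []
    # nlargest maintains a heap of at most k items while scanning the input once.
    return heapq.nlargest(k, numbers)


def largest_k_by_sorting(numbers, k):
    """Baseline for comparison: sort everything, then take the first k."""
    if k <= 0:
        return []
    return sorted(numbers, reverse=True)[:k]
```

For instance, largest_k([5, 1, 9, 3, 7], 2) returns [9, 7]. In an interview, being able to explain why the heap version does less work when k is much smaller than n is probably worth more than the code itself.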

To cope with these questions, learning the basic algorithms in computer science is helpful, though you may want to go beyond that (after all, an interview is not the same as an academic exam). To practice on your own, there are numerous online platforms that you can make use of. Sites like HackerRank (www.hackerrank.com), CodeAbbey (www.codeabbey.com), and Programming Praxis (www.programmingpraxis.com) are great places to start. At the very least, you can discover your limits and get a better sense of perspective. To refine your coding, however, you’ll need to do more, something we’ll get to towards the end of this article.
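
One of the questions listed earlier asks how to estimate a probability when no formula is at hand. A common answer is simulation (the Monte Carlo approach). The sketch below is a toy exercise rather than a prescribed solution: it estimates the chance that at least two people in a group share a birthday, assuming 365 equally likely birthdays. This particular problem does have a known answer (about 0.507 for 23 people), which makes it a handy sanity check for the simulation. The function name, the trial count, and the seed are all choices made for this example.

```python
import random


def estimate_shared_birthday(group_size, trials=20_000, seed=0):
    """Estimate P(at least two people share a birthday) by simulation."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(365) for _ in range(group_size)]
        # A set drops repeated values, so a shorter set means a shared birthday.
        if len(set(birthdays)) < group_size:
            hits += 1
    return hits / trials
```

Calling estimate_shared_birthday(23) should give a value close to 0.5. The same pattern (simulate many times, count how often the event occurs, divide by the number of trials) carries over to events that are too messy to handle with a formula.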

Data science processes and workflows

This part of the data science interview involves specific methods, techniques, and models related to the craft. It may feel like an oral exam of a data science course and in a way, it is a kind of evaluation of your knowledge. Of course, it’s not feasible (or even possible) to assess everything in a few minutes, but if you are knowledgeable in data science, you should be able to convey that through the explanation of certain data science concepts. Most likely the interviewer will focus on the material that is related to the role they are trying to fill, but they may also go beyond that, to ensure that you are not overly specialized.

For example, you may be asked questions like the following:

  • How would you explore the different kinds of customers we have, given their transaction data?
  • What kind of models would you use to predict sales and stock demands for the next month?
  • What kind of data would you use to create a comprehensible profile of our most valued customers?
  • How would you evaluate the text description of product X that we are launching next quarter?
  • How can we make use of the reviews we have accumulated for our top-selling products?
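
To make the first question above a bit more concrete, here is one possible starting point for exploring customers from transaction data: a recency, frequency, and monetary (RFM) summary per customer. This is a sketch under assumptions, not the answer an interviewer necessarily expects. The transaction records, field layout, and function name are all hypothetical, and in practice you would likely follow such a summary with clustering or further segmentation.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount)
transactions = [
    ("c1", date(2020, 5, 1), 40.0),
    ("c1", date(2020, 5, 15), 60.0),
    ("c2", date(2020, 3, 2), 15.0),
    ("c3", date(2020, 5, 20), 250.0),
]


def rfm_summary(rows, today):
    """Per customer: days since last purchase, number of purchases, total spend."""
    last_seen = {}
    count = defaultdict(int)
    total = defaultdict(float)
    for customer, day, amount in rows:
        # Keep the most recent purchase date seen so far for this customer.
        if customer not in last_seen or day > last_seen[customer]:
            last_seen[customer] = day
        count[customer] += 1
        total[customer] += amount
    return {
        c: {
            "recency_days": (today - last_seen[c]).days,
            "frequency": count[c],
            "monetary": total[c],
        }
        for c in last_seen
    }


summary = rfm_summary(transactions, today=date(2020, 5, 22))
```

With this made-up data, customer c1 ends up with a recency of 7 days, 2 purchases, and a total spend of 100.0. Walking an interviewer through what each of the three numbers says about a customer is a good way to show that you can connect the data to the business question.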

To prepare for this part of the interview, you need to revise all the data science material you’ve studied and perhaps take a few quizzes. Talking to some data scientists can be very useful too, so that you learn how to use the corresponding vocabulary properly. For example, what are data engineering, feature extraction, dimensionality reduction, model fitting, and model deployment?

Business-related matters

Business matters are very important in data science work and they come into play in most interviews, particularly those related to companies. As a data scientist, you need to know how to liaise with project stakeholders, how to translate business questions into data science tasks, how to interpret your findings, and how to use data science to solve business problems. Understanding how an organization works and the domain it is in is crucial too.

To prepare for all this, you need to do some research on the organization, have a high-level understanding of data science work, and most importantly be able to use data science as a tool for solving real-world problems. You can demonstrate that in the interview by avoiding being overly technical, asking good questions, and stating your assumptions when offering a solution for a hypothetical problem you are asked to tackle. Also, standard questions like the following are something you ought to be prepared for:

  • How would you contribute to our current business using data science?
  • What do you think we need to improve in our services or products?
  • If you were to start a project in this company, what would that be?

Analytical thinking and problem solving

Problem-solving is essential in data science and it is often tested as a skill in data science interviews. Even though this technically is a soft skill, it is often evaluated alongside algorithm development, something that’s highly technical. Of course, you won’t be asked to solve very complex problems, but you may be asked to solve common ETL-related problems on a whiteboard. More often than not these problems will require some programming, but you won’t need to write whole libraries to pass the interview. Pseudo-code is usually enough, though you may be asked about particular Python commands or even to write a short program. However, you ought to be able to explain at least some of the most commonly used machine learning algorithms, as this is something you are bound to be asked about.

Typical questions of this category include:

  • What would you do if the dataset you have has formatting issues making it impossible to access it through the standard data import functions?
  • How would you handle over-fitting issues?
  • One of the data streams used in an existing model is no longer available. How would you modify that process so that it can still yield something useful?

To prepare for this sort of interview task, you need to be comfortable with programming, particularly scripts that handle data. Practicing a bit in a data science language like Python or Julia would be very useful, and so is being able to explain your reasoning. After all, the point is to test your analytical skills and evaluate your ability to handle situations that require logical reasoning. Also, being able to create a programming-based solution that’s efficient (i.e. of low computational complexity, measured by the big O notation) is important.
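
To illustrate what "low computational complexity" can mean in practice, consider checking whether a list contains duplicate values. The two functions below are a sketch made up for this article. The first compares every pair of values, which is O(n²). The second remembers the values it has already seen in a set, which is O(n) on average, assuming the values are hashable.

```python
def has_duplicates_quadratic(values):
    """Compare every pair of values: O(n^2) comparisons in the worst case."""
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            if values[i] == values[j]:
                return True
    return False


def has_duplicates_linear(values):
    """Track seen values in a set: one pass, average O(1) work per lookup."""
    seen = set()
    for value in values:
        if value in seen:
            return True
        seen.add(value)
    return False
```

Both functions give the same answers, but on a list of a million records the difference between the two approaches is the difference between a script that finishes almost instantly and one that may not finish in a reasonable time. Being able to point out that trade-off, including the extra memory the set uses, is the kind of reasoning interviewers tend to look for.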

Next steps

Naturally, although all the aforementioned tips are useful on their own, they’re particularly useful when coupled with practice. Among the various services we offer at Bootrain (www.bootrain.com), preparing our students for data science roles is a major focus. All of us participating in this service have been on both sides of the interview table and are comfortable with the whole process. As a result, we can provide you with useful advice and feedback on interviews, drilling into their technical aspects. We invite you to reach out to us for more details and get this quite challenging aspect of your data science career out of the way.
