The Ultimate Data Interview Checklist

Lauren Glass
Published in DataSeries
6 min read · Apr 22, 2019

Nervous for your Data Science / Data Engineering interview? Start here.

Data Science, Data Engineering, Business Intelligence, Data Analysis, and other related positions fall at an intersection of coding, databases, statistics, and business/product. This blend of subjects results in an engaging and challenging career. The interviews are similarly engaging and challenging. :)

When I was studying for my data interviews, I noticed there was no single “holy grail” resource. I was endlessly googling “review SQL fast”, “data science interview questions”, “statistics questions”, “data model interview questions”, and more. I found a lot of resources and also a lot of gaps, which is why I wrote this list.

I’m putting together this list to help others who are taking the next step in their data career. While the title says ‘ultimate’, I’m hoping to keep collecting resources from others who have also been through the process. I want to get feedback on what worked and what didn’t, and I want to keep growing this list.

How can you help? Share this with your friends! And also reach out to me with resources that worked for you!

Data Structures & Algorithms

Interview Cake: My subscription to this service was the best money I ever spent. Thank goodness I bought it, because it prepared me for every data structure and algorithm question that came my way. They take you through the theory and how to code it like no other resource I have seen. Be prepared, though: it’s pricey, but worth it.

Cracking the Coding Interview: This book is the best place to start learning and reviewing the software engineering aspects of your data interviews. The reality is that software engineering fundamentals are generally expected of those of us in this field, and this book really helps cover any gaps you may have.

Introduction to Algorithms: Be prepared to invest time in this one, but it is worth it. This textbook covers algorithms in depth. I re-coded the pseudo-code in Python and it was the perfect supplement to Interview Cake.
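For a sense of what re-coding textbook pseudo-code looks like, here is a minimal sketch of insertion sort, one of the classic introductory algorithms, written in Python. The function name and the sample list are illustrative choices, not taken from any of the resources above.

```python
def insertion_sort(items):
    """Sort a list in place in ascending order and return it."""
    for j in range(1, len(items)):
        key = items[j]  # the element to place into the sorted prefix items[0..j-1]
        i = j - 1
        # Shift larger elements one slot to the right to make room for key.
        while i >= 0 and items[i] > key:
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = key
    return items


if __name__ == "__main__":
    print(insertion_sort([5, 2, 4, 6, 1, 3]))  # prints [1, 2, 3, 4, 5, 6]
```

Translating zero- versus one-based indexing by hand like this is a good way to catch the off-by-one mistakes that tend to show up on whiteboards.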

SQL

W3Schools SQL: This is the place to go if you have never written any SQL before. It lets you run and experiment with queries while learning SQL syntax. The most important thing here is to learn to visualize the data underneath each query.

Data Mastery — SQL: I wrote this one because I noticed a lot of the SQL resources are badly formatted or missing tricks of the trade. I designed it so technologists could review SQL syntax quickly, practice, and continue to use this book as a reference.

Select Star SQL: This one was recommended to me by a software engineer. It has interactive practice questions that go as far as JOINs, which is great. This resource will also be very helpful for experimenting with queries and learning to visualize data.

Mode SQL Tutorial: I came across this on Quora somewhere as a great place to practice beginner to advanced SQL queries. This resource will be great for anyone who needs to ramp up to advanced topics and learn to visualize data transformations.
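To make "visualizing the data underneath each query" concrete, here is a small sketch that runs the same two tables through an INNER JOIN and a LEFT JOIN using Python's built-in sqlite3 module. The `customers` and `orders` tables, their rows, and the `run_join_demo` function are made up for illustration and do not come from any of the tutorials above.

```python
import sqlite3


def run_join_demo():
    """Return (inner_rows, left_rows) for two tiny hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
        """
    )
    # INNER JOIN keeps only customers that have at least one matching order.
    inner_rows = conn.execute(
        """
        SELECT c.name, o.amount
        FROM customers AS c
        INNER JOIN orders AS o ON o.customer_id = c.id
        ORDER BY c.id, o.id
        """
    ).fetchall()
    # LEFT JOIN keeps every customer; missing orders show up as NULL (None).
    left_rows = conn.execute(
        """
        SELECT c.name, o.amount
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        ORDER BY c.id, o.id
        """
    ).fetchall()
    conn.close()
    return inner_rows, left_rows


if __name__ == "__main__":
    inner_rows, left_rows = run_join_demo()
    print("INNER:", inner_rows)
    print("LEFT: ", left_rows)
```

The customer with no orders disappears from the INNER JOIN result but survives the LEFT JOIN with an empty amount, which is the kind of row-level picture worth having in your head before writing a query in an interview.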

Machine Learning Algorithms

Hands-On Machine Learning with Scikit-Learn and TensorFlow: This book is the #1 bestseller in multiple Machine Learning categories on Amazon and came to me through a recommendation. I haven’t read it yet, but from skimming the table of contents it looks very thorough.

Machine Learning A-Z: Hands-On Python & R in Data Science: This is a course on Udemy, which I like because it has exercises and is more interactive than books. It’s great for starting out and doesn’t require you to download an obscure language.

Data Warehousing

Agile Data Warehouse Design: This book covers the fundamentals of data model design. It’s a good book to read if you are headed into a BI or data engineering interview.

The Data Warehouse Toolkit: This book is the industry standard and a must-have for any job that includes designing database schemas or data warehousing.

Product Analytics

Cracking the PM Interview: There’s a lot of soft stuff in here. For those of us who are not aiming to be a Product Manager, a lot of it may not be useful. But check out the Company Research, Estimation Questions, Product Questions, and Case Questions chapters.

Lean Analytics: I put a post out asking for product analytics recommendations and this book came highly recommended from a few people. Disclosure: I have not gotten through all 440 pages. From what I have read I like that it dives into different kinds of metrics, how to think, and what to avoid. So far it seems like an important read to prepare for conversations that involve Product Thinking. And the book is recommended for Data Scientists by the authors themselves!

A/B Testing: One huge concern of mine is that a lot of product analytics books gloss over the tough stuff: math. To get to the math needed for data work, you usually have to dive into a statistics or machine learning book, which can be very broad. I was glad to read about this Udacity course, which focuses specifically on A/B tests, an essential part of any data-related position. Also, it’s great because it is called A/B Testing by Google!
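As a taste of the math behind an A/B test, here is a minimal sketch of a pooled two-proportion z-test using only the Python standard library. This is a common textbook approach and not necessarily the one the course teaches; the function name, parameter names, and the sample numbers are assumptions for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two_sided_p_value) comparing conversion rates of A and B.

    conv_a / conv_b: number of conversions in each group (hypothetical inputs)
    n_a / n_b: number of users in each group
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value for a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up example: 100 of 1000 users convert on A, 130 of 1000 on B.
    z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
    print(f"z = {z:.3f}, p = {p_value:.4f}")
```

Being able to explain each line (why the rate is pooled, what the p-value does and does not tell you) matters more in an interview than memorizing the formula.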

Statistics

Heard in Data Science Interviews: This one was recommended to me by an expert data scientist. This book covers a lot of topics but I was told the statistics section is worthwhile.

Practical Statistics for Data Scientists: This is a solid refresher. If your interview is predominantly statistics, you should invest in reading this book.

This list is incomplete! I’m looking for recommendations for more Machine Learning, Product Analytics, Big Data, and Coding (R & Python) resources that help prepare people for interviews.

Am I missing your favorite resource from this list? Did something else come up in your interviews that is not a subject covered here? Let me know.

Connect with me on Instagram (@lauren__glass), LinkedIn, and Facebook.
