Here’s a generic picture of “interns” working. Thanks Unsplash.

5 Things To Look For in a Data Science Internship

Ryan Harrington
CompassRed Data Blog
5 min read · Feb 5, 2019


Any trip to your job board of choice will quickly tell you one thing: a lot of companies are hiring data scientists of all skill levels¹. These roles share many common themes, which have been covered in plenty of places. From a technical perspective, you’ll need skills in the most popular open-source languages: R or Python, plus SQL, at a minimum. If you have experience with a variety of big data toolsets and scripting languages, you’re even better off. That’s to say nothing of the wide variety of other technical skills a data scientist might need. From the soft skills perspective, every company wants data scientists to be storytellers and business experts.

Let’s assume that you’ve built an understanding of some of those key skills. You’re proficient in R or Python. You’ve mastered SQL. You understand how to tell a story with data. You’ve meticulously worked to build your skillset and now you feel ready for your first data science internship.

As you prepare for the internship application process and begin interviewing, it’s worth reflecting on some of the experiences that you should get out of the opportunity. There’s one key point to remember: internships should be just as valuable for you as they are for your employer.

Having a fantastic internship experience can help you better understand what you’re looking for in a full-time role while helping you grow as a professional. With that in mind, there are many qualities to consider when evaluating data science internships that might not show up directly in a job description. Here are five of them.

1. You should have exposure to every part of the data pipeline.

If you’re going to be a data scientist, then you need a strong understanding of the data pipeline. Data needs to be ingested, validated, stored, extracted, transferred, loaded, cleaned, modeled, visualized, evaluated, and deployed (among a whole variety of other equally important verbs).

Most people who apply to data science internships do so while they are at a university or completing an alternative form of study. As such, they often have exposure to only a few parts of the data pipeline, usually the “cleaned, modeled, visualized, evaluated” portion of it. These are extremely important parts of a data pipeline, but understanding the rest of the pipeline is just as valuable.

A great data science internship will give you exposure to all of the parts of a data pipeline. That is not to say that you should directly work on each of those parts of the pipeline, but rather that you have the chance to interact with the people who own those portions of the pipeline and get a better understanding of what their roles entail.
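To make a few of those verbs concrete, here is a minimal sketch of a toy pipeline that ingests, validates, models, and evaluates a handful of records. Everything in it is an assumption made for illustration: the `RAW_CSV` data, the column names, and the function names are hypothetical, and a real pipeline would spread these stages across separate systems and teams.

```python
import csv
import io

# Hypothetical raw input standing in for an external data source.
RAW_CSV = """day,visits
1,120
2,135
3,
4,160
5,not_a_number
6,180
"""


def ingest(text):
    """Ingest: parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def validate(rows):
    """Validate and clean: keep only rows whose fields parse as integers."""
    clean = []
    for row in rows:
        try:
            clean.append((int(row["day"]), int(row["visits"])))
        except (TypeError, ValueError):
            continue  # a real pipeline would log or quarantine bad rows
    return clean


def fit_line(points):
    """Model: ordinary least squares fit of visits = slope * day + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def evaluate(points, slope, intercept):
    """Evaluate: mean absolute error of the fitted line on the given points."""
    errors = [abs(y - (slope * x + intercept)) for x, y in points]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    points = validate(ingest(RAW_CSV))
    slope, intercept = fit_line(points)
    print(f"kept {len(points)} rows, slope={slope:.2f}, "
          f"MAE={evaluate(points, slope, intercept):.2f}")
```

Even in a toy like this, the validation step quietly discards two of the six rows. Coursework rarely surfaces that kind of decision, which is one reason exposure to the people who own each stage is so valuable.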

2. You should have the opportunity to solve a real, big problem.

At its core, data science is about problem solving. To build your skillset as a data scientist, you should prioritize internships that let you solve real problems.

Data science curriculums are typically filled with projects, but those projects tend to be intentionally contrived. They usually come with clearly defined objectives, relatively clean datasets, and clear expectations for how to solve the problem, and they do not address a meaningful, real-world challenge. Your internship should provide you with exactly the opposite experience: the chance to grapple with loosely defined objectives, messy datasets, and open-ended expectations about how to solve the problem, all while addressing a meaningful, real-world challenge.

Of course, every internship that you apply for will tell you that this is exactly what you’ll be doing, but that isn’t necessarily the case. To find out, ask questions about what you’ll be working on during your time with the organization.

3. You should have exposure to different types of problems.

Data science can address a wide range of problems, from association to classification to forecasting. In an academic setting, you’ve likely had exposure to many of these types of problems and have probably found that you prefer some over others.

To get a better understanding of the types of problems you enjoy solving, look for an internship where you might need to use multiple types of techniques. You might find that you prefer classification techniques to association techniques, or that you excel at the art of clustering. All of these experiences will give you a better sense of the type of role to search for in the future and will make you a better data scientist.

4. You should have exposure to managing stakeholder relationships.

If data science is about problem solving, then the problems have to come from somewhere. Those problems come from stakeholders. Data scientists are experts in data. Stakeholders are experts in their domain. To be successful in solving any data science challenge that you’re presented with, you have to work successfully with stakeholders.

A strong data scientist understands how to manage stakeholder relationships. This requires a wide range of skills: identifying appropriate business problems, managing expectations, managing scope creep, learning from subject matter experts, and communicating results. As an intern, you should not be the person actively managing the stakeholder relationship, but you should absolutely have exposure to that process.

5. You should have the freedom to fail.

Data science. What’s a key tenet of science? Testing hypotheses. Frequently, hypotheses fail. If we extend that to the data science process, we should expect that we will sometimes fail when solving problems. Our initial approach to a problem may be incorrect. Even if our general approach is correct, the techniques that we use might not prove successful. More frustratingly, sometimes our dataset just doesn’t lend itself to a solution, even when we’ve matched the perfect approach with the perfect technique. All of these issues are part of the process.

The only way to move beyond them is to have the freedom to fail within your organization. Experimentation should be encouraged. Getting to the “correct” answer should be less important than willingness to grapple with the data science process as a whole. This process of experimentation is how you learn, how you get better, and how, ultimately, you find success.

¹ We’re one of those companies. You should drop us a line at careers@compassred.com if you’re interested.
