How to Create an Outstanding Data Science Portfolio
Two things are very clear about the data science job market today:
- There is a lot of demand for qualified data scientists out there. (And most of these jobs come with handsome compensation and other perks.)
- This demand has resulted in a flood of job seekers applying for available positions.
Suppose you are aiming for an entry-level position in this market. You’ll be competing primarily with recent university graduates, software engineers transitioning to data science, and graduates of data science bootcamps and online programs.
How do you make yourself stand out from this crowd?
There’s a straightforward, but by no means easy, solution: build an impressive portfolio showcasing your skills.
How to get started
Identify your interests and goals
Are you interested in working in a specific industry such as banking, retail, or information security? Do you enjoy telling stories with data and creating compelling data visualizations? Or maybe you are more into building state-of-the-art machine learning models?
Answering these questions will require some self-reflection, but the process will help you decide where to direct your portfolio-building efforts.
Research job descriptions
Once you know what area of data science you want to specialize in, you might want to look through a few job postings in this area.
Do your research and pay attention to the following:
- What does the company do? Specifically, how does the company leverage data science to make money?
- What technical skills are they looking for in addition to what you already know? It’s possible you may need to become more comfortable with the Linux command line, cloud technologies (like AWS), or even NoSQL databases.
- Will a data scientist contribute to a customer-facing product or build solutions for other teams within the company?
- What about soft skills? Will this role require educating other people from inside or outside of the company on the capabilities and limitations of data science?
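One way to make the technical-skills question concrete is to tally how often each skill shows up across the postings you have collected. The snippet below is only a minimal sketch: the `SKILLS` list, the `count_skills` helper, and the sample postings are all hypothetical and made up for illustration, so swap in the skills and the posting text that match your own search.

```python
import re
from collections import Counter

# Hypothetical skill list for illustration; adjust it to your target area.
SKILLS = ["python", "sql", "aws", "linux", "nosql"]


def count_skills(postings, skills=SKILLS):
    """Count how many postings mention each skill.

    Matching is case-insensitive and on whole words only,
    so "NoSQL" is not also counted as "SQL".
    """
    counts = Counter()
    for text in postings:
        lowered = text.lower()
        for skill in skills:
            if re.search(r"\b" + re.escape(skill) + r"\b", lowered):
                counts[skill] += 1
    return counts


if __name__ == "__main__":
    # Made-up posting excerpts, not real job ads.
    sample_postings = [
        "Experience with Python and SQL; AWS a plus.",
        "Strong SQL, comfortable on the Linux command line.",
        "Python, NoSQL databases (MongoDB).",
    ]
    for skill, n in count_skills(sample_postings).most_common():
        print(f"{skill}: mentioned in {n} posting(s)")
```

Skills that appear in many postings but not yet in your own toolbox are good candidates for your next portfolio project.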
What will your portfolio consist of?
There are plenty of options when it comes to selecting the area(s) that will help you stand out, for example:
- Creating personal projects on GitHub
- Contributing to open-source software
- Writing blog posts
- Presenting at meetups and conferences
- Building a reputation on Stack Overflow by helping others
- Participating in data science competitions through the Kaggle platform
Principles to follow when building your portfolio
Regardless of what portfolio area you want to focus on, there are a few principles that you should keep in mind whenever you start brainstorming new project ideas:
- Finished is better than perfect. You might be overflowing with ideas and tempted to start on several at once. That is rarely a good strategy: you risk spreading yourself too thin and running out of energy or motivation before you finish any single project.
- Quality over quantity. Remember that the goal is not to produce as many artifacts of your work as you possibly can, but to create a few quality pieces that will get you hired.
- Interesting data over advanced analysis. Avoid the datasets that appear in many “Intro to Data Science” courses; they have already been analyzed from every possible angle, and you are unlikely to find anything intriguing in them. Instead, look for “hidden gems” in public dataset repositories such as Google Dataset Search or the Registry of Open Data on AWS. Many governments also run Open Data portals that are treasure troves of interesting datasets for those who care to look.
- Storytelling with data. Whether you are writing a blog post, building a web application, or presenting live in front of an audience, your primary role is that of a storyteller. That means your story should 1) draw people in with an intriguing introduction and attractive data visualizations; 2) provide sufficient context for the dataset and your approach; and 3) teach the audience something they didn’t already know.
- Amplify your work by sharing it on social media. This might be intimidating, but these days it is one of the most effective ways to get your work in front of a lot of people. A few of them might turn out to be your future hiring managers; you never know.
- Iterate based on feedback. Be open to editing your work based on the feedback you receive from others. Not only will this make your project better, but it will also be a learning opportunity for you to identify and address your knowledge gaps.
Ultimately, a portfolio is evidence, if not proof, that you can perform the kind of work you claim on your resume. It can open doors that would otherwise stay out of your reach.
Hopefully, by now, you have a good understanding of how to approach the process of building a portfolio that will eventually get you hired.
In the following posts, we’ll elaborate on a few of these portfolio areas: what makes a great blog post, where to start when contributing to an open-source project, and what makes a good personal project on GitHub.
Learn more about the data science bootcamp and other courses by visiting Practicum and signing up for your free introductory class.