How to Construct a Data Science Portfolio from Scratch

Manali Shinde
One Datum At A Time
7 min read · Apr 17, 2018


I feel like this is very meta because I’m writing about how to construct a portfolio on my portfolio…

Hello Readers!

Welcome back! I hope that after reading through some of my case studies you’ve realized a few of the things I’m trying to do here:

A. Showcase my work and my Python/analytics skills
B. Showcase my journey in learning Python/analytics skills!

Full disclosure before I start this article: I am quite new to the Data Science field. In fact, at the moment, my job does not entail a lot of data science, but more so basic analysis and reporting. That being said, in the next 5 years, I am aiming to fully plunge myself into data science as a career and into academia.

During that “prep” time, I want to make sure I put my skills into practice as I learn them. They say practice makes perfect, and I don’t have to tell you that they aren’t wrong! In this field, as in any other, the only thing that takes you from beginner to intermediate to expert is practice, practice, practice. And remember not to be discouraged, because:

Every expert was once a beginner — Helen Hayes

In my case, I want to feel as if the work I’m doing is for something: some goal that I am trying to achieve, or something that I can display, because that is what motivates me. That may not motivate every new data scientist. Still, whenever I talk to industry experts (i.e. consultants and people with PhDs in this field, i.e. super smart people), one piece of advice always comes up: make a portfolio. Showcase your work somewhere, because you never know who will read it and want to reach out to you.

My Academic Background — Building the Base

I may not know much about constructing a house, or any type of building, but I do know that before starting any construction project, it is very important to scope the area and build a base. That is what I did with my education. I graduated with an Honours BSc in Biology and Psychology. You may be wondering: that has nothing to do with data science, so how is that useful? Well, it really is. Getting that initial undergrad degree has helped me build strong critical thinking skills, research skills, and most importantly a drive to stick with and love science. I have also developed a love for healthcare and public health, and I aim to be a public health advocate.

After getting my undergrad degree, I enrolled in a graduate certificate in Data Science, which I am currently completing. This is where the exoskeleton of the house is being built. I’m learning Python, statistics, how to construct machine learning models, and how to deal with any and all kinds of data.

So if you’re currently working through an undergrad degree or post-secondary diploma and feel, “man, this is useless, I could be doing so much more with my life,” don’t listen to what social media or social media “celebrities” say, my friends. An education is important! We build skills that we don’t even know we have until we put them into application (most likely at your place of work or in your career).

Creating the Portfolio — Your Tool Kit

1. Medium and/or Other Blogging Platforms

This site is beginning to feel more and more like home. When I initially decided to make a portfolio, I thought of creating my own website on a platform such as WordPress or Squarespace. While those platforms are amazing for hosting your own portfolio, I wanted a place where I would get some visibility, with a pretty good tagging system to reach greater audiences. Luckily Medium, as we know, has those options (and it’s also free). First, I made sure that data science and analytics were topics that people talked and wrote about on this site, and I was obviously not disappointed.

Once my profile was set up, I started to post the successful assignments that I had done in my certificate class. Slowly, I’m hoping that this blog will host a good chunk of interesting case studies for people to read, as well as articles like this one that may help and intrigue people navigating this career choice.

The only drawback I find with Medium is that I haven’t found an efficient way to import my Jupyter Notebooks into Medium in order to post my code. Posting code from an IDE is out of the question! At the moment, I just snip the selection and post it into the article as a picture. I’m sure that there is a better way to post the notebooks, though. I just have to find it (and when I do, I’ll be sure to make a post about it).
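One possible workaround is sketched below, with a few assumptions. A notebook file saved in the current format (nbformat 4) is plain JSON with a top-level "cells" list, so the code cells can be pulled out as text and pasted into a post instead of a screenshot. The function name extract_code_cells and the sample notebook content are made up for illustration, and this is only one way to do it, not an official Medium or Jupyter feature.

```python
import json


def extract_code_cells(notebook_text):
    """Return the source of every non-empty code cell in a notebook, in order.

    Assumes the nbformat 4 layout: a top-level "cells" list whose items
    carry "cell_type" and "source" keys.
    """
    notebook = json.loads(notebook_text)
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # "source" may be stored as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            sources.append(source)
    return sources


# A tiny in-memory notebook (hypothetical content) so the sketch runs on its own.
sample_notebook = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Case study\n"]},
        {"cell_type": "code",
         "source": ["import pandas as pd\n", "df = pd.DataFrame()"]},
        {"cell_type": "code", "source": "print('hello')"},
    ],
    "nbformat": 4,
    "nbformat_minor": 2,
})

for number, code in enumerate(extract_code_cells(sample_notebook), start=1):
    print(f"--- code cell {number} ---")
    print(code)
```

To try it on a real notebook, read the file's text first (for example with open(path, encoding="utf-8").read(), where path points to your own .ipynb file) and pass that text to the function.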

Other blogging platforms you can use: the aforementioned WordPress and Squarespace, or Tumblr (I haven’t explored DS on Tumblr, but it also has blogging abilities and a large user base). And if you’re a web developer, you can of course… build your own.

2. Kaggle

Kaggle was one of the first sites that really got me into Data Science. If you want to really showcase your skill set, and have people revere your work, the tool you can use is Kaggle.

Participating in competitions (and of course winning them), creating Kernels from data sets, and contributing to discussions can really get you noticed. In order to really put your work at the forefront, I think Kaggle is a must-have in your online toolkit.

Your Kaggle profile can be something that you put on your resume, and employers can double-check your skills just by searching your name. They can see how many competitions you have participated in, the types of models you have built, and the analytics you have done with the data sets available.

Another reason to use Kaggle is the community itself. I found the Kaggle community to be quite judgment-free and receptive to new learners. You can ask questions and have anyone from students to experts give you a well-thought-out and earnest answer. If you need some more practice, you can go to the Kaggle Learn section and review machine learning or SQL, or if you would like to venture into R programming territory, you can do that as well. It’s a fantastic tool/website, and I honestly urge those who haven’t tried it yet to do so.

3. GitHub

Oh GitHub. I think GitHub is THE coding site. You want to make sure you have at least something on there. Although, at least for me personally, it’s also one of the most confusing sites! I honestly think that there should be a separate module in every data science or coding course that deals with how to post on GitHub. Because GitHub is the primary place where almost everyone in computer science, data science, etc. shares their work, I think new learners need to become really familiar with it. That being said, I’m almost 98% sure that there are tutorials for GitHub available on YouTube (I’m iffy with the 2%).

I haven’t been posting too much on that site, but as soon as I figure out how it works, I know for a fact that I will be using it. Maybe you’re a bit unsure about Kaggle, or maybe posting on Medium or making your own blog isn’t your cup of tea, but GitHub at least should be a part of your portfolio. Industry experts or people who are hiring for Data Scientist positions often check GitHub to see who is posting what, and how accurate their code and models are. GitHub is like the hammer in your toolbox: you need to have it!

4. Twitter/Social Media

Finally, I think one of the most popular social media sites for Data Scientists is Twitter. Of course there’s Facebook (but as a data scientist, do you really want to use FB after its debacle?), and LinkedIn as well. LinkedIn is a great site for sharing your articles and insights with people who may be future employers, or for reading articles to expand your own knowledge.

If you want to post your insights, share articles you wrote, or read up on the latest and greatest in this industry, I do think Twitter is the site for you. It’s a great way to get in touch with and follow experts in the field, who often have blogs of their own you can read, and the data science audience is big enough that you can reach quite a few people. So tweet away! Share your insights in 280 characters or fewer.

The Construction

How you build your data science portfolio is up to you. Maybe you want to incorporate all 4 tools, maybe none at all! I think the important thing is to practice. Data science, analytics, and coding are all like learning an instrument. The more you practice, the better you get at your craft, and there is never a moment when you are not learning. Whether you are a Junior Analyst or a CEO, practice is everything. If people can see your work along the way and provide praise or feedback, it just brings you that much closer to that “expert” title!

I hope these tips helped you out. Let me know if there are any other tools you use to build your data science portfolio. That’s it for now, so tune in for more articles, case studies, and tutorials. Thanks for reading!

Happy Building :)


Manali Shinde
One Datum At A Time

A health informatician and aspiring health data analyst. I am a photographer, writer, dancer, and public health advocate. Join me on my journey!