Metis Data Science Bootcamp Blog Series #1

Using Matrix Factorization (SVD) to compress an image (this one is too compressed!)
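
For anyone curious how this kind of compression works, here is a minimal sketch. It is not the code behind the image above. It assumes a grayscale image stored as a 2-D NumPy array, and the function names (compress_svd, relative_error) and the synthetic test image are made up for illustration. The idea is to keep only the k largest singular values: a small k stores far fewer numbers but loses detail, which is roughly what "too compressed" looks like.

```python
import numpy as np


def compress_svd(image, k):
    """Return the rank-k approximation of a 2-D (grayscale) image array."""
    # Thin SVD: image = U @ diag(s) @ Vt, singular values sorted descending.
    U, s, Vt = np.linalg.svd(image, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


def relative_error(original, approx):
    """Frobenius-norm error of the approximation, relative to the original."""
    return np.linalg.norm(original - approx) / np.linalg.norm(original)


# Hypothetical stand-in for a real photo: a smooth pattern plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 64)
image = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * x))
image = image + 0.05 * rng.standard_normal((64, 64))

for k in (1, 5, 20):
    approx = compress_svd(image, k)
    # A rank-k approximation needs k * (rows + cols + 1) stored numbers.
    stored = k * (image.shape[0] + image.shape[1] + 1)
    print(f"k={k:2d}  stored values={stored:5d}  "
          f"relative error={relative_error(image, approx):.3f}")
```

With a real photo you would load the pixels into an array first (and handle each color channel separately); the compression step itself stays the same.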

After a year off from full-time work, during which I explored passion projects and startup ideas, I’ve decided to enroll in Metis. They run a 3-month bootcamp that takes people with a background in programming or stats and teaches them data science and machine learning skills.

This is the third time I’ve considered doing one of these programs. I first got in during the summer of 2017 but decided not to go, was interested again last summer but deferred enrolling, and finally committed in December 2018. My hesitation came from the program cost and from the fact that I enjoy working with people, so I wondered whether a heavily technical role would suit me.

What swayed me was that I kept finding jobs I was interested in that asked for data science skills, and my curiosity about understanding stats better kept resurfacing (I’m more on the hobbyist programming side). It sounds very exciting to be able to better test and understand claims made in the world, and to help causes I care about make better decisions.

For example, I want to understand research studies well enough to tell which ones are bogus. Mathematician Nassim Taleb claims that Better Angels of Our Nature is bad science, and there is a never-ending litany of health studies that each claim some fad diet is good for you. It is also cool that you can use ML to help reduce energy consumption in factories or data centers. The challenge for me is finding a place where I can contribute that I also find interesting, given how many options there are!

The Prep

So far I’ve spent a month on the Metis prep work, plus a lot of extra material, and I’ve learned a ton. I’m very happy to have committed to the program while still having plenty of free time to prepare. I’ve used this time to review lots of data science tools (pandas, Matplotlib, NumPy, scikit-learn) and a good amount of stats. Two of my favorite free resources were Think Python and Think Stats, which walk through many examples in each topic with lots of good code. I’ve done several mini-projects, including looking into non-profit funding by city, scraping weather data, and predicting housing prices. The largest project I took on was building a book recommendation tool using Goodreads book data (it isn’t completely user friendly, but you can see it here).

The two highlights from my prep were combining several approaches to predicting housing prices in a Kaggle challenge, which got me into the top 6% of entries, and seeing the book recommender system come to some basic form of life. It is so fun when you work and debug for a while and then things produce good results that can help make better decisions!

Oh yeah, I also had some fun playing around with AI art and creating fractals with Python. See below:

A Mandelbrot fractal created in Python and run through a neural network
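
If you want to try the fractal part yourself, here is a minimal sketch of the standard escape-time approach in NumPy. It is not the code behind the image above, and it leaves out the neural network step entirely. The function name, grid size, and iteration limit are assumptions picked so the result prints as rough ASCII art in a terminal.

```python
import numpy as np


def mandelbrot(width=80, height=40, max_iter=50):
    """Count, for each grid point c, how many steps z -> z**2 + c stays bounded."""
    re = np.linspace(-2.0, 1.0, width)
    im = np.linspace(-1.2, 1.2, height)
    # Grid of complex numbers c, one per output cell.
    c = re[np.newaxis, :] + 1j * im[:, np.newaxis]
    z = np.zeros_like(c)
    counts = np.zeros(c.shape, dtype=int)
    for _ in range(max_iter):
        # Only update points that have not escaped the radius-2 disk yet.
        active = np.abs(z) <= 2.0
        z[active] = z[active] ** 2 + c[active]
        counts[active] += 1
    return counts


max_iter = 50
counts = mandelbrot(max_iter=max_iter)

# Points that never escaped within max_iter steps are treated as in the set.
for row in counts:
    print("".join("#" if n == max_iter else " " for n in row))
```

For a picture instead of text, the counts array can be passed to Matplotlib's imshow, with a larger grid and a higher iteration limit for more detail.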

What’s next

I start my program on Monday and plan to keep writing about what I’m learning and making, for anyone who is thinking about data science and resonates with my process. Feel free to comment and let me know what you’re interested in.