Monthly ML: Data Science & Machine Learning Wrap-Up — April 2020

Amy Peniston
6 min read · May 2, 2020


Thanks for checking out my Data Science and Machine Learning blog series. Every month, I’ll be highlighting new topics that I’ve explored, projects that I’ve completed and helpful resources that I’ve used along the way. I’ll also be talking about pitfalls, including diversions, wrong turns and struggles. By sharing my experiences, I hope to inspire and help you on your path to becoming a better data scientist.

Jump to: January | February | March | April | May

Greetings! I am typing this on the very first day of “Phase 1” of the new normal here in Bermuda. As the island emerges from lockdown, things feel different, a lingering uncertainty pervading every interaction, every conversation.

As is probably no surprise, I’ve kept the gloomy thoughts at bay by busying myself with projects. In fact, I think I was even more focused in April, simply because I was able to devote my energy to learning. No excuses not to code when you can’t go farther than half a mile from your house!

Anyway, the overarching theme for April was data structures and algorithms. On a whim, I decided to participate in the 30-Day LeetCode Challenge, starting on the 1st of the month. I definitely did not appreciate what I was getting myself into, especially from a time perspective: the countless hours I would devote to understanding each problem.

All aboard the Struggle Bus!

At first, it seemed like I was having to learn a different topic every day. On the rare occasions when I managed to implement a brute-force solution, I was miles away from anything close to optimal. I spent at least an hour every evening watching related YouTube videos and puzzling over increasingly complex “interview-style” coding problems.

I thought about giving up.

But then, around mid-April, things started to click. Daily challenges built upon earlier questions and the topics started to converge.

On one particular day, I even managed to implement a dynamic programming solution that would have boggled my mind only weeks before.

My solution to Day 23 of the 30-Day LeetCode Challenge.
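
To give a flavor of the gap between a brute-force answer and a dynamic programming one, here is a minimal sketch in Python. It is not the Day 23 solution pictured above; the problem (largest sum of a contiguous subarray) and the function names are chosen purely for illustration.

```python
from typing import List


def max_subarray_brute_force(nums: List[int]) -> int:
    """Try every contiguous subarray: O(n^2) time."""
    best = nums[0]
    for start in range(len(nums)):
        running = 0
        for end in range(start, len(nums)):
            running += nums[end]          # sum of nums[start..end]
            best = max(best, running)
    return best


def max_subarray_dp(nums: List[int]) -> int:
    """Dynamic programming (Kadane's algorithm): O(n) time.

    best_ending_here is the largest sum of a subarray that ends at the
    current index. It is either the current value alone or the current
    value appended to the best subarray ending one index earlier.
    """
    best_ending_here = best = nums[0]
    for value in nums[1:]:
        best_ending_here = max(value, best_ending_here + value)
        best = max(best, best_ending_here)
    return best


if __name__ == "__main__":
    sample = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
    # Both approaches agree; the DP version just gets there in one pass.
    print(max_subarray_brute_force(sample), max_subarray_dp(sample))
```

Both functions assume a non-empty list. The dynamic programming version reuses the answer for the previous index instead of re-adding the same numbers, which is the kind of insight that took me weeks to start spotting.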

In the end, I completed the challenge. It was a great learning experience. While I’m not jumping straight into another one, I will certainly revisit LeetCode soon, lest I forget everything that I’ve learned!

Me, after finishing the 30-Day LeetCode challenge.

By the way, if you’d like to check out my LeetCode solutions, please visit my GitHub.

Shifting focus, I made good progress on Frank Kane’s Big Data Udemy course. The main reason I’ve embarked on a quest to learn about the Hadoop ecosystem is that I am tired of being bombarded by terminology that I don’t understand. I know you know what I’m talking about: HDFS, Hive, Pig, Flume… the list goes on and on. It’s exhausting and, before this course, it was completely bewildering.

If you’re interested in this stuff, Frank Kane gives a fantastic overview of everything “big data”. He introduces each tool using slides and diagrams before jumping into one or two short practical examples. The course is based on the Hortonworks Sandbox virtual environment and I appreciated getting some experience using this stack.

A couple of words of warning: as with many online follow-along resources, there were a few installations that just didn’t work. It also would have been nice to use the most recent Hortonworks Sandbox image, rather than the antiquated version featured in the videos. And most importantly, this course imparts breadth, not depth. Go in expecting a whirlwind tour and you’ll most certainly be satisfied.

In addition to big data, I also explored the wonderful world of Docker.

As I’ve started to build more complex models, particularly neural networks, I’ve struggled with configuring the correct programming environment and hardware setup. I know that containerization is the answer to most of these problems, but because I have been developing on a Windows machine, I’ve stubbornly resisted learning Docker.

I finally decided to bite the bullet and wipe my horribly bloated 2014 Apple MacBook Pro, giving myself a blank slate to install and play around with Docker.

Fortuitously, I happened upon a promo ad for a free month’s subscription to Pluralsight and decided to check out Nigel Poulton’s Docker series. Nigel came highly recommended and I thoroughly enjoyed his teaching style. I ended up bouncing around, selecting various pieces of his different courses to fill the gaps in my knowledge. (I’m a completionist at heart, so this hurts my soul, but deep down, I know it’s the best way to learn.)

My #DotNotes curated from Nigel Poulton’s Docker courses on Pluralsight.

Now, onto the secret elephant in the room, which I can’t talk about yet, so I will vaguely refer to it as “The Project”.

The Project has occupied a significant amount of my attention for several months. It is exciting and new and I am having a blast working on it. I am also bursting at the seams to share, but it is not quite ready. (Soon, I promise.)

This endeavor has been a joint effort, which makes it different from most of my freelance work. I am a notorious solopreneur. I thrive on doing things myself. But I can honestly say that The Project has changed my perspective on working in a team, particularly a team in which the individuals have different, complementary backgrounds.

Just a few of the things that I’ve appreciated learning during this process:

  • Version control (Git) in the real world
  • Agile methodology and story point estimation
  • Integration of front-end and back-end
  • Parsing user feedback
  • Prioritization and communication

Keep an eye on my Twitter for news about The Project’s official launch!

Finally, on a not-so-technical level:

I decided to give a little bit more attention to my social media accounts. After posting a photo of my notebook to Twitter, I discovered that people really like my #DotNotes! Ultimately, I am passionate about inspiring others to learn and if I can do so by creating content that I love, it is a win-win.

If you’re interested in study notes or study inspiration, check out my Instagram or follow me on Twitter. (#DotNotes FTW!)

Follow me on Instagram @amy.dot.notes

And, speaking of Twitter, we’re still going strong with #100DaysOfCode! Today marks my 77th day and it has flown by. So many people have jumped on the bandwagon and it’s incredible to see the different sorts of projects that everyone is working on. I am amazed by how much progress can be made by doing a little bit every single day.

On a related note, I was a bit miffed by the lack of customizable, free options for “link in bio” landing pages and decided to create my own. After posting my final product to Twitter, I was so excited to see that someone was inspired by my creation and built a similar page for themselves!

My “Link In Bio” page: https://amypeniston.com/hello

Lastly, I want to briefly mention a few of the events that I enjoyed in April. Although I was very disappointed not to be able to attend the ODSC East conference in person, I took full advantage of the virtual sessions. There was a LOT of content, which, thankfully, continues to be available online. I do hope to be able to experience ODSC West in San Francisco in October.

Other highlights included several Data Science 4 All — Women’s Summit webinars and Lex Fridman’s weekly AI Reading group. Lex is someone I have always admired and it is very cool to interact with him and other members of a community that is passionate about artificial intelligence.

With that, April is a wrap! Thanks again for reading and, until next time, happy coding.
