Open Source & Machine Learning — Joining Two Worlds

Andrei Batomunkuev
Sep 10, 2021

Photo by Arnold Francisca on Unsplash

Machine Learning and everything related to big data manipulation and analysis have become popular these days. Open Source plays a big role in Data Science: it pushes the field forward by allowing people to experiment and make technological advances. In this article, we are going to talk about Open Source Development and popular open-source Machine Learning projects that you can explore and contribute to.

Background

Let me introduce myself first. My name is Andrei. I am a student at Seneca College of Applied Arts and Technology, and I live in Toronto, Canada. Currently, I am in the 3rd year of the Computer Programming and Analysis diploma program. I specialize in Data Science and have recently graduated from the Data Science program at Practicum by Yandex. Data Science has become my passion; my biggest interests are Machine Learning, Deep Learning, and Data Analysis.

Open Source Development

In the current college term, I am taking an Open Source Development course (OSD600) at Seneca College. Somebody may ask me: “Andrei, why did you take an Open Source Development course?” In this section, I am going to answer this question.

Open Source is more than just contributing to Open Source projects. First of all, it is a worldwide community of developers: you can grow your network through Open Source projects, share your knowledge with the community, and learn from it in return. All of this helps you grow as a developer.

I am going to define three components of growing as a developer through Open Source:

  • Understanding code written by the people who contribute to a particular project. The project you are contributing to can contain large amounts of code as well as documentation. Pick a good project that matches your interests, and try to understand how some of its features work under the hood by going through the documentation and source code. This will strengthen your debugging skills, and you will solve problems in your own code faster.
  • Exchanging information with the community. By asking and answering questions and by discussing issues and improvements, you will deepen your technical knowledge. The Open Source community is really big, so you can communicate with professionals around the world.
  • Being part of something more than just a project. Open Source is about building software and tools that people can use, share, and modify. It is a great opportunity to share your work with other people and push the industry forward. Participating in discussions, connecting with people around the world, giving recommendations, and being open to someone’s suggestions will all help you grow personally.

Throughout this term, I am going to build these components for myself; this will be my goal for the current academic term. I hope to discover more components of Open Source Development along the way.

Machine Learning: Goals & Open-Source Projects

As I said before, Machine Learning has become one of my passions. I am constantly learning something new in Machine Learning by researching and experimenting with open-source Machine Learning projects.

I have set several goals that I want to achieve in Machine Learning by the end of the year:

  • Understand how some Machine Learning algorithms work under the hood by going through their source code on GitHub. In addition, I will pick up best practices for coding, researching, and experimenting that will be helpful in my career.
  • Contribute to one or more open-source projects related to Machine Learning. The projects can vary, from algorithms to frameworks and tools.
  • Experiment with deployment by using open-source projects like Docker, as well as tools that help track Machine Learning model performance. By doing this, I will gain a good understanding of the Machine Learning work cycle (starting with research and experimentation, and finishing with deploying a model).

Let’s move on to Open Source projects. I have recently completed a couple of real-world projects at Practicum by Yandex: the first was an NLP (Natural Language Processing) project, and the second was related to Computer Vision. While working on these projects, I researched tools, libraries, and frameworks to apply in my solutions.

One of the great resources was Facebook AI. This semester, I am planning to work on the open-source projects that Facebook provides: Facebook AI Tools.

Specifically, I am interested in working on the following projects:

Conclusion

In the following weeks, I am going to provide updates on my progress in Open Source Development. This is going to be a great journey!

Feel free to connect with me via LinkedIn and GitHub.

More information about me is available on my portfolio website.
