Project of the Month: scikit-learn

Henry Badgery
OpenTeams
3 min read · Jan 6, 2020

Picking the project of the month is always difficult. Many factors go into the decision: documentation, code quality, licensing, known vulnerabilities, community activity, and more.

To kick off the new year, we are pleased to announce scikit-learn as January 2020’s project of the month.

scikit-learn logo from Wikipedia

What is scikit-learn?

scikit-learn (AKA sklearn) is a free, open source machine learning library for Python. It offers a range of classification, regression, and clustering algorithms, including Support Vector Machines, Random Forests, k-means, and DBSCAN. A great benefit of scikit-learn is that it is designed to interoperate with the Python scientific and numerical libraries SciPy and NumPy.
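To give a feel for what that looks like in practice, here is a minimal sketch that feeds a plain NumPy array to two of the algorithms named above, k-means and a Random Forest. The toy data, the variable names, and the parameter values are illustrative assumptions, not something taken from the scikit-learn project itself.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

# Toy data (an assumption for illustration): two well-separated blobs
# of 2-D points, built directly as a NumPy array.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])
y = np.array([0] * 50 + [1] * 50)

# Clustering: k-means groups the points without ever seeing the labels.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_ids = kmeans.labels_

# Classification: a random forest learns the labels, then predicts
# the class of two new points given as another NumPy array.
forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)
predictions = forest.predict(np.array([[0.1, -0.2], [5.2, 4.9]]))
```

Both estimators take the NumPy array as-is and hand NumPy arrays back, which is the interoperability the paragraph above describes.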

Why scikit-learn?

For access to easy-to-use, high-quality implementations of popular algorithms, scikit-learn is a fantastic place to start.

Commitment to usability and documentation

scikit-learn has great documentation. Period. Contributors are required to include narrative examples along with sample scripts that are run on small data sets.

The community is also committed to quality and usability: the global API is safeguarded, all of the public APIs are well documented, and, when appropriate, contributors are encouraged to expand unit test coverage.
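One way a user experiences that API care is the shared estimator interface: different models expose the same fit, predict, and score methods. The sketch below is our reading of that idea, not an official example. It trains two unrelated classifiers on the iris data set bundled with scikit-learn, and the split ratio and model parameters are arbitrary choices made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Small data set that ships with scikit-learn, so nothing is downloaded.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Two very different algorithms, driven by exactly the same calls.
models = {
    "svm": SVC(),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                    # same training call
    scores[name] = model.score(X_test, y_test)     # same evaluation call
```

Swapping one algorithm for another means changing a single line, which is a large part of why the library feels easy to use.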

Models are chosen by industry experts

scikit-learn has a stable group of contributors that includes machine learning and software development experts. Some of them are able to devote a portion of their professional working hours to the project.

scikit-learn on GitHub

Incorporates most machine learning tasks

Scan through the list of machine learning algorithms scikit-learn covers and you’ll quickly realize that it covers most of them. With a large contributor base made up of machine learning experts, new and promising techniques are quickly added to scikit-learn’s arsenal of algorithms.

Other reasons:

  • scikit-learn scales to most data problems.
  • scikit-learn stays focused, including only what fits well with the project.

Fun facts about scikit-learn

  • scikit-learn was initially created by David Cournapeau in 2007 as a Google Summer of Code project.
  • scikit-learn is primarily written in Python, but some core algorithms are written in Cython for performance. Cython is a programming language that gives C-like performance and is a superset of the Python language.
  • scikit-learn is built on SciPy (Scientific Python), so SciPy must be installed before you can use scikit-learn.
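The SciPy connection shows up in everyday use too. As a small sketch (the data and variable names here are made up for illustration), many scikit-learn estimators, LogisticRegression among them, accept SciPy sparse matrices directly as input:

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import LogisticRegression

# Toy data (an assumption for illustration): class 0 only uses the first
# feature, class 1 only uses the third one.
dense = np.array([
    [1.0, 0.0, 0.0],
    [3.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 3.0],
])
y = np.array([0, 0, 1, 1])

# Store the data in SciPy's compressed sparse row format.
X_sparse = sparse.csr_matrix(dense)

# The estimator is trained on the sparse matrix without converting it back.
clf = LogisticRegression().fit(X_sparse, y)

# New samples can be passed as a sparse matrix as well.
X_new = sparse.csr_matrix([[2.0, 0.0, 0.0], [0.0, 0.0, 2.0]])
preds = clf.predict(X_new)
```

Sparse input matters most for data such as text features, where most entries are zero and a dense array would waste memory.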

List of scikit-learn’s core developers

Thanks to all the maintainers and core developers of scikit-learn. Without you, none of this would be possible and people wouldn’t be able to implement machine learning algorithms as easily as they can today!

Here are the top 10 maintainers and core developers. Check out their great work and reach out to thank them personally:

scikit-learn on OpenTeams

Claim your contribution to scikit-learn on OpenTeams

Have you ever contributed to scikit-learn? What about any other project? Made a PR? Helped with content? Written documentation? However you contributed, go to scikit-learn’s OpenTeams page (or another project’s page) and claim your contribution. In doing so, you will get recognized for the great work you’ve done!

If you liked this, click the 💚 below so other people will see this here on Medium.

Thanks for reading!

Henry Badgery
OpenTeams

I’m the Growth Marketer for OpenTeams. My role is to help grow the user base, in addition to managing all marketing and sales efforts.