🎂 Happy 1st Birthday Libraries.io 🎉

On the 16th of March 2015, one year ago today, I launched Libraries.io to the public to help “solve open source discovery”. Since then it has quickly grown into the biggest project that I’ve ever started.

Now that the project is one year old, I’m taking it to the next level to ensure it continues to help developers find good, dependable open source software and keep track of all of the open source libraries that they already depend upon.

But first, some numbers

I’m a sucker for open source metrics, and over the past year Libraries.io has collected a boatload of them. Here are some of my favourite high-level statistics:

  • 1.2 million packages indexed from 33 package managers
  • A new release of a library is detected and indexed on average every 10 seconds
  • Counted 1.5 billion open source commits from 1.9 million GitHub users
  • Analysed 5.6 million GitHub repositories and found 23 million dependencies across 2.1 million manifest files (package.json, Gemfile etc)
  • More than 100 million rows in PostgreSQL, almost 40GB of data
  • 2.2 million visitors over the past year made nearly 4 million page views
  • Googlebot has scraped over 300GB of HTML from the site since launch

Libraries.io has connected the dots between all of the libraries and the open source projects that depend upon them, mapping out a huge network graph of the whole open source ecosystem.

Libraries.io uses this network to improve the quality of search results with a method similar to Google’s PageRank, one that also takes into account quality factors specific to open source software, such as licensing, semantic versioning and the bus factor of contributors.
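To give a feel for how a PageRank-style score behaves on a dependency graph, here is a minimal sketch in Python. It is not the ranking code Libraries.io actually runs: the function name, the damping value, the iteration count and the tiny example graph are all assumptions made for illustration, and it leaves out the extra quality factors (licensing, semantic versioning, bus factor) entirely.

```python
def pagerank(depends_on, damping=0.85, iterations=50):
    """Toy PageRank over a dependency graph (illustrative assumption only).

    depends_on maps a project name to the list of libraries it depends on.
    A dependency acts like a "link" from the dependent to the library.
    """
    # Collect every project, including ones that only appear as dependencies.
    nodes = set(depends_on)
    for deps in depends_on.values():
        nodes.update(deps)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node starts each round with the same small baseline score.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            deps = depends_on.get(node, [])
            if deps:
                # Split this project's score between the libraries it uses.
                share = damping * rank[node] / len(deps)
                for dep in deps:
                    new_rank[dep] += share
            else:
                # A project with no dependencies spreads its score evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: two apps and one library all depend on "lib-core".
graph = {
    "app-a": ["lib-core", "lib-http"],
    "app-b": ["lib-core"],
    "lib-http": ["lib-core"],
}
ranks = pagerank(graph)
for name, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
```

In this toy graph the library that everything depends on ends up with the highest score, which is the basic intuition behind ranking search results by usage.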

Since launching, I’ve been busy improving the quality of the data collected. Amongst other benefits, Libraries.io discovers unlicensed, deprecated, unmaintained or removed projects and warns against depending on them.

Along with indexing many package managers, Libraries.io has been trawling through millions of open source projects on GitHub, finding and indexing all of the dependency manifest files for the package managers it supports.

The huge amount of data collected is used to build a picture of how every library is used across the whole open source ecosystem. That picture is fed back into the search engine to improve relevance, and it also provides high level information about the dependencies of every GitHub project, including any out of date or conflictingly licensed dependencies.
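As a rough illustration of turning manifest files into usage data, the sketch below counts how many projects depend on each package across a handful of package.json files. The function name and the sample manifests are invented for this example, and the real pipeline (the Rails app plus the Node.js manifest parser) covers many more manifest formats than this.

```python
import json
from collections import Counter


def count_dependents(manifests):
    """Count how many manifests list each package as a dependency.

    manifests is an iterable of package.json contents as strings.
    This is a hypothetical helper, not part of Libraries.io.
    """
    usage = Counter()
    for text in manifests:
        data = json.loads(text)
        # Use a set so a package listed under both keys is counted once.
        names = set(data.get("dependencies", {}))
        names |= set(data.get("devDependencies", {}))
        usage.update(names)
    return usage


# Made-up sample manifests, for illustration only.
sample_manifests = [
    '{"name": "app-one", "dependencies": {"express": "^4.0.0", "lodash": "^4.0.0"}}',
    '{"name": "app-two", "dependencies": {"express": "^4.0.0"}, '
    '"devDependencies": {"mocha": "^2.0.0"}}',
    '{"name": "app-three"}',
]

usage = count_dependents(sample_manifests)
print(usage.most_common(1))  # [('express', 2)]
```

Counts like these are the kind of signal that can be fed into search ranking, alongside the dependency graph itself.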

All of this data combined has made Libraries.io into an incredible resource for finding open source libraries, frameworks and tools.

I believe Libraries.io is a really valuable tool for the open source community and want to ensure it doesn’t end up going on an Incredible Journey. So today, on the one year anniversary of shipping it, Libraries.io is going open source!

Going Open Source

A lot of small services and modules of Libraries.io are already open source on GitHub: https://github.com/librariesio but so far the main application has been closed off.

From today the source code behind the Rails app is available on GitHub: https://github.com/librariesio/libraries.io and the Node.js app for parsing dependency information from manifest files (package.json, Gemfile etc) is available too: https://github.com/librariesio/librarian, both under the AGPL-3.0 license.

This doesn’t mean that development will be slowing down. If anything, having the project out in the open will encourage me to write more about my ideas and goals for the project in public, rather than just keeping it all in my head.

If you’d like to contribute to the project, check out the open issues here: https://github.com/librariesio/libraries.io/issues and here: https://github.com/librariesio/librarian/issues or add your own ideas and feature requests to the support repo: https://github.com/librariesio/support/issues


Follow me on Twitter at @teabass and @librariesio for updates. Discussion on Hacker News: https://news.ycombinator.com/item?id=11298694