Libraries.io June Progress Update

Andrew Nesbitt
Published in Libraries.io
Jul 10, 2017

Continuing from last month with the regular progress updates on Libraries.io since securing a grant from both the Sloan Foundation and the Ford Foundation, here’s what we got up to in June 2017.

Notable Features

Open Data Release

The big feature that we worked on in June was our first ever open data release.

The release contains 25GB of metadata on all of the packages, versions and dependencies from the 33 package managers we support, plus repositories, tags and dependency information from over 25 million open source repositories found across GitHub, GitLab and Bitbucket.

All this data is available in CSV format, split into a number of different files, with the total number of rows reaching 200 million!
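With files this size, it is usually easier to stream a CSV row by row than to load it all into memory. Here is a minimal sketch of that approach in Python, tallying rows by the value of one column. The column names (`Name`, `Platform`), the sample rows and the helper names are made up for illustration; check the header row of the file you download for the real column names.

```python
import csv
import io
from collections import Counter


def count_by_column(rows, column):
    """Tally how often each value of `column` appears, one row at a time."""
    counts = Counter()
    for row in rows:
        # Rows that lack the column (or have it empty) are grouped together.
        counts[row.get(column) or "(missing)"] += 1
    return counts


def count_csv_file(path, column):
    """Stream a CSV file from disk without reading it all into memory."""
    with open(path, newline="", encoding="utf-8") as handle:
        return count_by_column(csv.DictReader(handle), column)


# Tiny in-memory stand-in for one of the release files.
# The column names and rows here are hypothetical, not the real schema.
sample = io.StringIO(
    "Name,Platform\n"
    "rails,Rubygems\n"
    "react,NPM\n"
    "lodash,NPM\n"
)
sample_counts = count_by_column(csv.DictReader(sample), "Platform")
print(sample_counts.most_common())
```

For a real file you would call something like `count_csv_file("path/to/file.csv", "Platform")`, swapping in the actual path and column name.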

You can download and reference the data from Zenodo under a Creative Commons Attribution-ShareAlike 4.0 International Licence. More information and updates are available at https://libraries.io/data.

Ben also wrote up a great blogpost about the release: https://medium.com/@BenJam/libraries-io-releases-data-on-over-25m-software-repositories-ab1db665826e

Updated homepage

New, simplified homepage for the site

Along with the new data page, Ben updated the homepage of the site to better communicate the wide range of things you can do with Libraries.io, which aligns nicely with our documented set of user personas.

Sustain OSS

Towards the end of the month both Ben and I travelled over to the US for the Sustain OSS conference in San Francisco, as described on the website:

“Sustain” will be a one day conversation for open source software sustainers. There will be no keynotes, expo halls or talks. Only a guided discussion and concrete ideas about getting and distributing money or in-kind services to the Open Source community.

We got to meet a lot of great people who are also focused on helping solve long term sustainability for important Digital Infrastructure projects.

One of the key things I took away from the discussions we had there is the need for more data on changes over time, allowing for analysis of trends and generation of feedback based on real data.

We then flew over to New York City and took part in a joint Ford/Sloan meeting to help them craft their strategy on supporting digital infrastructure.

Both of these meetings were really positive for us, both in terms of what we’ve already achieved and in informing our work for next year.

Ben has started a small working group focussed on collaboratively developing measures for classifying and measuring aspects of projects, which should fill in a few holes in our own knowledge and our data.

Brighton Ruby Talk

At the end of the month I put together a data-driven talk for the Brighton Ruby Conference about some of the risks involved in depending on open source software.

Other Notable Changes

Statistics

41,155 new libraries and 298,570 new versions found, bringing the total to 2,339,193 libraries indexed.

134 commits, 13 pull requests and 323 issues opened across all Libraries.io repositories on GitHub: https://github.com/librariesio

Contributors

This month we had quite a few patches from outside contributors whom we would like to thank:

Plans for July

After working full time on Libraries.io for 6 months, we’re halfway through our grants from Sloan and Ford. For the next 6 months we’ll be focusing on the commitments we’ve made to the Ford Foundation, as outlined in our proposal, which include the following:

  • Establishing a baseline for open source metrics
  • Collecting and recording time series data around those metrics
  • Exploring routes to long term sustainability for the project

As always, follow us on Twitter at @teabass, @benjam and @librariesio for more updates.
