Libraries.io April/May Progress Update

Andrew Nesbitt
Published in Libraries.io
Jun 2, 2017

We missed our April update last month due to holidays, so this is a combined April/May progress report on Libraries.io. Here’s what we’ve been up to since securing a grant from both the Sloan Foundation and the Ford Foundation.

Notable New Features

Importing GitLab and Bitbucket users and organisations

After much refactoring and some painful migration of large sections of the database, our concept of repository owners is no longer tied to the specifics of the GitHub API. We now have support for Bitbucket Users and Teams and GitLab Users and Groups.

It’s also going to be much easier for us to add support for more kinds of repository owners in the future.
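As a rough illustration of what a host-agnostic owner model can look like, here is a minimal Python sketch. It is not the Libraries.io implementation: the `RepositoryOwner` class, the `normalise_owner` helper, the `OWNER_KINDS` table and the two shared kinds are all hypothetical names chosen for this example. Only the host terms (Users, Organisations, Teams, Groups) come from the hosts themselves.

```python
from dataclasses import dataclass

# Hypothetical mapping from each host's own terminology to two shared kinds.
OWNER_KINDS = {
    ("GitHub", "User"): "user",
    ("GitHub", "Organization"): "organisation",
    ("Bitbucket", "User"): "user",
    ("Bitbucket", "Team"): "organisation",
    ("GitLab", "User"): "user",
    ("GitLab", "Group"): "organisation",
}


@dataclass(frozen=True)
class RepositoryOwner:
    """A repository owner described without reference to any one host's API."""
    host_type: str  # "GitHub", "GitLab" or "Bitbucket"
    login: str
    kind: str       # "user" or "organisation"


def normalise_owner(host_type: str, host_kind: str, login: str) -> RepositoryOwner:
    """Translate a host-specific owner type into the shared model."""
    try:
        kind = OWNER_KINDS[(host_type, host_kind)]
    except KeyError:
        raise ValueError(f"unsupported owner type: {host_type} {host_kind}") from None
    return RepositoryOwner(host_type=host_type, login=login, kind=kind)
```

Under this kind of design, supporting another host mostly means adding rows to the mapping rather than changing the code that consumes owners.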

Importing GitLab and Bitbucket Issues and Pull Requests

Along similar lines, the support for storing Issue and Pull Request metadata for repositories is no longer tied to the GitHub API, and we have support for indexing both GitLab and Bitbucket Issues and Pull Requests/Merge Requests, which will be useful in the future for calculating community engagement, responsiveness and activity metrics.

Redesigned Project pages

After Ben’s comprehensive UX review of the search experience, we’ve made a number of simplifications and moved some elements around to make searching and browsing the site much more enjoyable.

We’re also loading some of the more intensive elements as you scroll down the page, which has reduced the load time by approximately 50%.

Search page refinements

You can also now search both libraries and repositories directly from the navigation header from any page of the site.

We also took the time to enable selecting multiple filters from the interface. This allows you to do more comprehensive searches, for example, searching for Testing Libraries written in CoffeeScript or TypeScript, or any repository available under some kind of GPL license.

Yes, that’s 18 libraries called “covfefe”!

The search pages are also completely driven by elasticsearch now, skipping the need to load records from postgres after querying elasticsearch as we did before, making the search pages considerably quicker to load.
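To make the two search changes concrete, here is a small Python sketch of how multi-select filters and index-only rendering could fit together. It builds a standard Elasticsearch `bool` query as a plain dictionary and reads results straight from each hit's `_source`. The function names and the field names (`name`, `language`, `licenses`, `keywords`) are assumptions for illustration, not the actual Libraries.io index mapping or code.

```python
def build_search_query(text, languages=(), licenses=(), keywords=()):
    """Build an Elasticsearch bool query from several multi-select filters.

    Values inside one `terms` clause are alternatives (CoffeeScript OR
    TypeScript); separate filter clauses must all match.
    """
    filters = []
    for field, values in (
        ("language", languages),
        ("licenses", licenses),
        ("keywords", keywords),
    ):
        if values:
            filters.append({"terms": {field: list(values)}})
    # Fall back to matching everything when no search text is given.
    must = {"match": {"name": text}} if text else {"match_all": {}}
    return {"query": {"bool": {"must": must, "filter": filters}}}


def rows_from_response(response):
    """Render results from the indexed documents alone, with no database lookup."""
    return [hit["_source"] for hit in response["hits"]["hits"]]
```

For example, `build_search_query("", languages=["CoffeeScript", "TypeScript"], keywords=["testing"])` produces a query with two filter clauses, and `rows_from_response` turns the search response into rows that a page could display directly, provided the index stores every field the page needs.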

Civic Tech Research

Contribution to Digital Infrastructure by the Civic Tech Community

One other large piece of work we did in April was researching The Impact of Civic Tech on Open Source, which Ben presented at TICTeC.

We compiled a list of the top one thousand most depended-upon projects amongst the 93m declared dependencies tracked by Libraries.io, which we’re calling “Digital Infrastructure”, then looked into how the Civic Tech community uses and contributes to it.
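One simple way to produce a ranking like this is to count how often each project appears as the target of a declared dependency. The Python sketch below assumes the dependencies are available as `(dependent, dependency)` pairs and that every declaration counts once. The post does not say exactly how the ranking was computed, so treat both the data shape and the `most_depended_upon` helper as assumptions.

```python
from collections import Counter


def most_depended_upon(dependencies, limit=1000):
    """Rank projects by the number of declared dependencies pointing at them.

    `dependencies` is an iterable of (dependent, dependency) name pairs.
    Returns a list of (project, count) tuples, most depended-upon first.
    """
    counts = Counter(target for _dependent, target in dependencies)
    return counts.most_common(limit)
```

Calling `most_depended_upon(edges, limit=1000)` on the full set of pairs would give a top one thousand list in the spirit of the one described above.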

Some highlights include:

We found that amongst the 5,034 projects we call Civic Tech, 6,815 users have contributed 1,135,846 commits. This compares with 82,514 people who have contributed 2,731,564 commits to any open source dependency of a Civic Tech project.

The Civic Tech Community contributes ~15% of the work needed to support its own foundations.

The Civic Tech Community contributes ~6% of the work needed to support our shared digital infrastructure.

Other Notable Changes

A full list of all changes right across the Libraries.io org on GitHub is available in two gists, April and May, generated by: https://github.com/librariesio/org-pulse

Statistics

April: 230 commits, 11 pull requests and 129 issues opened across all Libraries.io repositories on GitHub: https://github.com/librariesio

May: 279 commits, 15 pull requests and 228 issues opened across all Libraries.io repositories on GitHub: https://github.com/librariesio

We’ve now indexed 9,375,765 published versions of 2,291,604 libraries, 23,470,130 open source repositories, 23,644,149 issues/pull requests and 93,965,312 dependencies from GitHub, GitLab and Bitbucket.

Contributors

We had a few patches from outside contributors in April and May whom we would like to thank:

Plans for June

The main aim for June is to release our first public data dump of dependency graph and metric data. This will include information about the 2.3 million libraries we have indexed, their versions and dependencies, and the 25 million open source repositories and all of their dependencies.

It will be available under a Creative Commons Attribution-ShareAlike license. We’ve already had some amazing researchers start to play around with the data and find some interesting facts around dependency complexity, and we expect to see many more fascinating insights once people start to drill down into it.

Keep track of our progress via this GitHub issue: https://github.com/librariesio/supporters/issues/9

We’ll also be heading over to America for the Sustain OSS conference and a Digital Infrastructure workshop with the Ford and Sloan Foundations to share what we’ve been working on.

As always, follow us on Twitter at @teabass, @benjam and @librariesio for more updates.

