When software goes missing.

A few years ago, a team of researchers at NICTA developed some bioinformatics software. This software allowed researchers to easily analyze certain types of data from a very specific type of cancer model (tumors growing in mice). We have downloaded and used this software, and while it had some issues, it has been useful to us. Others have also sought to use it.

The authors wrote a paper about the software, which they published in the journal Bioinformatics. The manuscript states that the software is available for non-commercial use from their institution, NICTA.

Screenshot of the abstract for the paper describing Xenome. [pubmed]

But NICTA has changed and no longer exists in this form. The URL doesn’t work, and the software can’t be downloaded anymore. According to personal correspondence, the authors have been cut off from the source code and apparently cannot distribute the software. I have a copy of this software, but I cannot distribute it because its license prohibits redistribution.

A Twitter conversation raised an interesting question about this situation: should the paper be retracted?

I’ve thought about this a bit, and I do not think that the paper should be retracted. First, the software does still exist. I have a copy of the software, and if I use it I would like to be able to cite the paper. Second, the lack of availability is not the fault of the authors. I doubt that they expected us to end up in this situation.

While I don’t think the paper should be retracted, I do think we should learn some lessons from this.

  1. We should expect to place software and resources used in the construction of a paper into a resource designed to archive digital artifacts (e.g. Zenodo, Figshare). Journals should develop policies around this and enforce them.
  2. As scientists, we should carefully consider the implications of the license that we apply to research artifacts. In this case, a permissive open source license would have allowed anyone holding a copy to redistribute the software, regardless of the changes at NICTA.
  3. As reviewers, we should carefully consider the implications of the license that authors have applied to these artifacts and the manner of archiving. If there are elements of this that would affect our enthusiasm for a manuscript, we should clearly comment on this aspect of the contribution.

While we can’t foresee all contingencies, these are real issues that have an impact on our field right now. We can and should modify our behavior and expectations to address them using existing resources.

I expect that this story will have a happy ending (see responses). The authors now have access to the source code and will be releasing it under an open source license.

Because you made it all the way through this post, here’s a GIF of a cute dog dancing.

About this post: This was thrown together during quick breaks while I was working on a grant. It’s probably too terse and unclear, and it likely has other issues. Please provide feedback, and I’ll try to edit it to address concerns. Thanks! Casey.