DevOps and Reproducible Science
Recently I have been enjoying reading about the movement towards a more unified development and operations culture within computer science and the tech industry as a whole.
Traditionally, development teams (i.e. programmers and testers) and operations teams (i.e. system administrators) have been largely isolated from one another. This has caused a great deal of discordance between development, testing, and production environments, or at the very least a great deal of work in keeping things “in sync”.
By bringing these teams together under a DevOps culture, there is more focus on the use of automation, configuration management, and orchestration/provisioning tools across all stages of the product lifecycle. This helps to create more controlled and uniform environments, overcoming many of the problems caused by differences in, for example, runtime dependencies, system libraries, and operating system configurations.
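To make the "differences between environments" problem a little more concrete, here is a minimal sketch in Python that records what a runtime looks like and reduces it to a single fingerprint, so that two machines can be compared. It is only an illustration under my own assumptions: the function names (environment_manifest, manifest_fingerprint, manifest_diff) are hypothetical, it only sees Python packages and a few platform details, and it does not stand in for what tools such as Docker or Ansible actually do.

```python
import hashlib
import json
import platform
from importlib import metadata


def environment_manifest():
    """Describe the current runtime: interpreter, platform, installed packages."""
    pins = set()
    for dist in metadata.distributions():
        # Fall back to "unknown" if a distribution has incomplete metadata.
        name = dist.metadata.get("Name", "unknown")
        version = dist.metadata.get("Version", "unknown")
        pins.add(f"{name}=={version}")
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),
        "packages": sorted(pins),
    }


def manifest_fingerprint(manifest):
    """Hash the manifest so two environments can be compared at a glance."""
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def manifest_diff(a, b):
    """Return the keys whose values differ, as (value_in_a, value_in_b) pairs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    manifest = environment_manifest()
    print(manifest_fingerprint(manifest))
```

Saving the manifest alongside an analysis and committing it to version control gives a collaborator something to diff against their own machine. If the fingerprints differ, manifest_diff shows which parts of the environment have drifted.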
Like many others, I have been drawn towards some of the more popular, recently developed tools that have been gaining traction throughout DevOps, such as Ansible, Docker, Packer, and Vagrant. What strikes me most about these tools is how they could provide an excellent resource for reproducible science, by allowing for uniform, static analysis environments that can be installed, configured, and shared openly and easily via the Internet.
One thing I realised throughout my PhD is that there is little communication across disciplines. I believe that we (biologists) need to adopt more quickly the lessons learned by other disciplines such as computer science, and in particular the developments made in industry as opposed to just academia. If we can focus on improving and optimising our workflows, and on automating the configuration and provisioning of our analysis environments, all within an open source and version-controlled framework, then we stand to greatly improve the ease with which our research can be distributed and, more importantly, reproduced.
There is some interesting progress being made on this already (see BioDocker and BioDevOps, as well as the great work being undertaken by Brad Chapman and Bruno Vieira), but I believe more can still be done.