Software Development in Scientific Research

Nikola Luburić
Published in Clean CaDET · Dec 2, 2021 · 7 min read

A case study of a familiar process in an atypical context (Part II)

  • How do we engineer useful software for an academic research project?
  • How do the familiar software development practices differ in this context?
  • What can research labs learn from the software engineering community?

This article presents our initial answers to these questions, derived after a year of dedicated R&D on our Clean Code and Design Educational Tool. Through this open process, our goal is to share our insights, gather advice for improvement, and contribute to enhancing the field’s maturity.

This article is the second part of a two-part series. In the first part, we examined the inception of a research idea, discussed how we planned the R&D activities, and reviewed the requirements engineering practice in our context.

Now we explore techniques that improve the design and implementation practices for the software solutions that arise from academic research. While these techniques are familiar to the software engineering community, they are scarcely applied in academia, limiting further work and reproducibility studies [1]. By practicing these techniques, we believe that research groups can significantly improve the usefulness and visibility of their results.

While researchers might limit their efforts to proof-of-concept solutions, we can still follow sound software engineering practices to increase the usefulness of our work.

Design

As most research is concerned with novelty and researchers spend much of their time learning about a problem domain, it’s often hard to make many upfront design decisions that will persist throughout the solution’s lifecycle. We found that evolutionary design, achieved by respecting clean code [2] principles and performing frequent refactoring [3], is necessary when developing research-driven software.

During the first year of our project, we substantially refactored and redesigned our platform three times (one of these redesigns led to a spinoff solution). Each redesign was triggered by discoveries we made while studying the literature and performing empirical evaluations. For example, discovering the KLI framework [4] and its related body of literature drove us to completely change the bedrock of our Tutor’s design. In traditional software engineering, this would be akin to discovering a new major stakeholder or end user whose expectations for the software significantly differ from the current state.

Many design sessions preceded each major refactoring. We questioned our design choices and examined how the knowledge we had discovered through research, along with the planned features, mapped to the current design. We often had to postpone a major refactoring to perform additional research and write more code with the old design before we were confident that we needed a new one.

Whiteboards are a precious tool for any kind of collaborative design. Physical whiteboards are more engaging to work with, while virtual whiteboards produce artefacts that can be more easily reused.

Once the design has stabilized, it’s helpful to clean up the sketches describing the software solution’s structure and behavior and include them in the documentation to help future researchers use and expand upon the developed software. We found that simple wiki pages are more than sufficient. They are also a great mechanism to define the ubiquitous language [5] used by the group, which helps alleviate the problem of inconsistent terminology that is present in most research fields (see [6] for an example in our domain).

It’s worth talking about the pain and cost that come with a redesign. A redesign is much more aggressive than refactoring (defined as restructuring the code without altering its visible behavior [3]). It entails changing the entities and relationships, as well as the business logic built on top of them. Refactoring a module rarely requires changes to the documentation, nor does it impact many tests (high-quality tests possess a certain resistance to refactoring [8]). A redesign, by contrast, requires significant updates to the code, the tests, and the documentation. In our case, roughly two engineers spent an entire week reworking the Tutor’s code and associated test suite, reorganizing approximately 150 class files and 2.5k lines of code. We still need to update the documentation and most of the seed data to reflect the new design.
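To make the distinction concrete, below is a minimal Python sketch (the names and logic are invented for illustration, not taken from our Tutor). The refactoring restructures the code into focused helpers while the visible behavior, and therefore the tests, stay the same; a redesign would instead change the entities and contracts themselves.

```python
# A hypothetical, behavior-preserving refactoring.
# Before: one method mixes input validation and scoring logic.
def grade_submission(answers, key):
    if len(answers) != len(key):
        raise ValueError("Answer count does not match the key.")
    correct = sum(1 for a, k in zip(answers, key) if a == k)
    return correct / len(key)


# After: the same visible behavior, split into focused helpers.
# Tests that exercise the public method keep passing unchanged.
def grade_submission_v2(answers, key):
    _validate(answers, key)
    return _score(answers, key)


def _validate(answers, key):
    if len(answers) != len(key):
        raise ValueError("Answer count does not match the key.")


def _score(answers, key):
    correct = sum(1 for a, k in zip(answers, key) if a == k)
    return correct / len(key)
```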

Nevertheless, such effort pays off in the long run, as the figure below illustrates. As software becomes complex, development productivity degrades as more time is spent on bug fixing, understanding old code, and fighting the old design. Adequately designed software reduces this slowdown. In such software, bugs are easier to isolate and fix, code is traversed and understood quickly, and powerful features can emerge by coordinating existing components in a new way.

The more the design and refactoring practices are neglected as complexity rises, the greater the loss of productivity [9]. Bugs become difficult to resolve, the code becomes difficult to work with, and in the worst case it might be cheaper to scrap the project and start again.

Implementation

Apart from continuous refactoring and documenting our solution, we found three practices that made our work easier to use and maintain. We reiterate that these practices are standard in the SE community and have been recommended by other researchers [7]. They are:

  • Code reviews, conducted through GitHub’s pull request mechanism.
  • Automated tests, written following best practices from leading authors on the subject [8].
  • A basic build pipeline, including packaging the solution into a Docker container for easier deployment and use.

Code reviews follow a similar process to research paper reviews and offer the same benefit: a fresh look at a solution that mitigates the original author’s tunnel vision. At a high level, we wish to examine whether new code is focused, fits properly into the broader software, and is reasonably comprehensible before introducing it to the main codebase. A method should focus on a single task, much like a paragraph should explore a single new piece of information. Methods inside a class should cohesively work towards a single goal, similar to paragraphs structured inside a section. Names should be consistent, simple, and mapped to the team’s ubiquitous language, both in code and in our papers. For additional guidelines, we recommend [2] and [3].
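For example, consistent naming might look like the sketch below, where identifiers reuse glossary terms from the group’s wiki rather than ad-hoc synonyms (the classes here are invented for illustration, not taken from our codebase):

```python
# Hypothetical sketch: identifiers mirror the group's ubiquitous language,
# so a term reads the same in the wiki, the paper, and the code.
from dataclasses import dataclass, field
from typing import List


@dataclass
class LearningObject:
    """A single piece of instructional material, as named in the glossary."""
    title: str


@dataclass
class KnowledgeComponent:
    """A unit of knowledge to be mastered, a term borrowed from KLI [4]."""
    name: str
    learning_objects: List[LearningObject] = field(default_factory=list)

    def add_learning_object(self, learning_object: LearningObject) -> None:
        # The method name reuses the glossary term instead of a vague
        # synonym such as add_item or append_material.
        self.learning_objects.append(learning_object)
```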

Next, we found much value in automated tests, especially tests constructed following best practices from the field [8]. Admittedly, they introduce additional work when developing new functionality (and sometimes when refactoring old functionality). However, they save significant time in the long run, as we can detect bugs as soon as they are introduced. We strove to cover every use case with several integration tests and every complex module with additional unit tests. By maintaining a clean design and following this heuristic, we spent most of the test-writing effort on preparing test data: defining the input for a method and its expected output.
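As an illustration of where that effort goes, a test in this style might look like the sketch below (written with pytest against the hypothetical grader from the earlier refactoring example); most of the work sits in preparing the data, not in the mechanics of the test itself.

```python
# Hypothetical unit tests in the arrange-act-assert style. Most of the
# effort sits in the "arrange" step: preparing input data and the
# expected output.
import pytest


def test_grade_submission_scores_partially_correct_answers():
    # Arrange: define the input and the expected output.
    answers = ["a", "b", "c", "d"]
    key = ["a", "b", "x", "d"]
    expected_score = 0.75

    # Act: invoke the behavior under test.
    score = grade_submission(answers, key)

    # Assert: compare the observable output against the expectation.
    assert score == expected_score


def test_grade_submission_rejects_mismatched_answer_count():
    # The grader should fail loudly when the input is malformed.
    with pytest.raises(ValueError):
        grade_submission(["a"], ["a", "b"])
```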

We also sought to make our solution easy to use and experiment with. On the code’s side, this entails maintaining the aforementioned wiki pages. On the application’s side, it includes packaging the solution in a Docker container and defining setup guidelines so that other researchers can start up the application without needing to go through the code.

Finally, we’ve recently dabbled with setting up an automated pipeline that runs tests and performs code analysis with every pull request. Using a quality gate, we can prevent code that degrades readability or causes tests to fail from becoming part of our codebase.

It is not appropriate for our Clean Code Tutor to suffer from code smells, even the mild style issues identified by SonarCloud.

What we discussed in this short series of articles might seem like a lot of unneeded hassle to a research group. Indeed, many journals do not require well-built, well-tested, and well-documented tools to be submitted alongside the paper. While this is changing as papers with code become more valued, we are still far from this being mandatory. Still, most researchers are familiar with the frustration of working with tools created by other researchers that have little documentation and are difficult even to start, let alone test or extend. We must do better to reduce this collective frustration and enable the research community to focus on the research, rather than spending days or weeks setting up a tool only to give up because it is too hard. Fortunately, the software engineering community has produced many good practices that we can adopt with just a little extra effort.

[1] Crick, T., Hall, B. and Ishtiaq, S., 2017. Reproducibility in Research: Systems, Infrastructure, Culture. Journal of Open Research Software, 5(1).
[2] Martin, R.C., 2009. Clean code: a handbook of agile software craftsmanship. Pearson Education.
[3] Fowler, M., 2018. Refactoring: improving the design of existing code. Addison-Wesley Professional.
[4] Koedinger, K.R., Corbett, A.T. and Perfetti, C., 2012. The Knowledge‐Learning‐Instruction framework: Bridging the science‐practice chasm to enhance robust student learning. Cognitive science, 36(5), pp.757–798.
[5] Evans, E., 2004. Domain-driven design: tackling complexity in the heart of software. Addison-Wesley Professional.
[6] Pelánek, R., 2021. Adaptive, Intelligent, and Personalized: Navigating the Terminological Maze Behind Educational Technology. International Journal of Artificial Intelligence in Education, pp.1–23.
[7] Hunter-Zinck, H., de Siqueira, A.F., Vásquez, V.N., Barnes, R. and Martinez, C.C., 2021. Ten simple rules on writing clean and reliable open-source scientific software. PLOS Computational Biology, 17(11).
[8] Khorikov, V., 2020. Unit Testing Principles, Practices, and Patterns. Simon and Schuster.
[9] Heymann, J., 2018. Introducing Agile Software Engineering in Development. SAP Community Blog. https://blogs.sap.com/2018/05/02/introducing-agile-software-engineering-in-development/
