Breaking up the monolith
and reaping the benefits
This article was originally published on 26 September 2016
Background
The RES software team has been writing code for a long time now. Our applications help us support the development and operation of wind farms and other renewable energy projects. They are used from early-stage investigations all the way through to the operational phase of our power plants. As a result, we maintain a lot of domain-specific code related to the management of wind data, wind and solar resource analysis, energy yield predictions, economic modelling and more.
To help with all this, we built a home-grown code generation tool. It allowed us to keep the structure of our applications very consistent across projects, and to keep them all up to date while minimising the amount of manual boilerplate code we had to write. This tool was very helpful for a while, as it facilitated easy interaction between distinct but related pieces of software. For example:
- Wind farm turbine layouts created in our system immediately became available in a whole range of applications and were seamlessly displayed in the user interface
- The post construction curtailment analysis tool could use the pre-construction code to determine whether curtailment was expected during operation
All our code was maintained in a single (very large) repository, which made this integration easy.
The problem
Over time, we found that this approach had a few issues and created a maintenance burden for a variety of reasons:
- The team got bigger and created more projects, which became more and more tightly integrated. This meant we had to work with very large Visual Studio solutions (180 projects on average). We knew about NuGet and similar tools, but we continued to use Visual Studio project references as our sole way of referencing external code.
- We switched from SVN to git, where very large repositories (3.5 GB for a bare repository, and counting!) are cumbersome and slow to use.
- Business needs evolved over time. We found that the post-construction and pre-construction teams needed to do different things, which required our software architecture to be more flexible.
- Our template-based code generation system started to show more and more weaknesses. For example, it became very hard to make changes to the software, as those changes would automatically impact a large range of applications. In addition, each software release had to be thoroughly tested, because unknown changes made by other teams could have broken some functionality.
In short, to speak in trendy terms, we ended up with a big and scary monolith. We knew this was bad and that the situation would only worsen over time, but the magnitude of the changes we needed to make to fix the issue scared us. Eventually, the problem got big enough for us to decide to bite the bullet and do something about it.

Our app’s initial dependency graph. I told you it was big and scary! Graphs like this really live up to the name spaghetti code!
The plan
My colleagues from the post-construction team and I (the “SMART” team) decided to take our suite of applications out of our main repository and to aggressively drop dependencies where possible. To achieve this, we used a combination of various techniques.
Dependency tree analysis
First and foremost, we had a good look at the current situation and immediately spotted easy improvements: for example, places where old code dependencies were no longer needed.
Package up low level utility dependencies
Since we were all using the same repository, almost all references to other projects were direct code references. Breaking free from the main repository did not mean that we could not use the same low-level libraries, but we had to start packaging them up instead. Since our applications are primarily written in C# on the .NET platform, the choice was obvious here: we generated a lot of NuGet packages and started using them in our projects.
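As a rough sketch, packaging one of these low-level libraries comes down to describing it in a `.nuspec` file and running `nuget pack` on it. The package id, version, and file paths below are made up for illustration; they are not the names we actually used:

```xml
<?xml version="1.0"?>
<!-- Hypothetical RES.WindUtils.nuspec; all names and paths are illustrative -->
<package>
  <metadata>
    <id>RES.WindUtils</id>
    <version>1.0.0</version>
    <authors>RES Software Team</authors>
    <description>Low-level wind data utilities, packaged for reuse.</description>
  </metadata>
  <files>
    <!-- Ship the compiled assembly so consumers reference the package, not the project -->
    <file src="bin\Release\RES.WindUtils.dll" target="lib\net45" />
  </files>
</package>
```

Running `nuget pack RES.WindUtils.nuspec` then produces a `.nupkg` that other solutions can pull in from a feed, instead of carrying a direct Visual Studio project reference.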
Replace high level dependencies with new and lightweight interfaces
While the packaging approach was appropriate for libraries, it was not the answer for larger cross-application dependencies: it would have hidden the mess behind pre-compiled packages without dealing with the underlying problem.
We realised that the amount of information we needed to access from other systems was relatively small in most cases. The solution was to completely remove those dependencies from our code base, and then to write new and lightweight interfaces to these systems, leveraging web APIs or direct database access via views instead of being coupled at the code level.
Make improvements to our Code Generation system
Ironically, to make all of the above changes and reduce our dependency on our code generation software, we needed to… make a lot of changes to our code generation software! We made the program more flexible and, for example, added support for using NuGet packages throughout our applications.
Migrate the separated code base AND the source control history to a new repo
Git’s history rewriting tools are very powerful, but we were in a slightly unusual situation: we needed to keep the history of projects living in different folders at the root of our very large repo (around 55,000 commits). Our initial approach using git filter-branch gave promising results on a small repo, but took more than 5 days to run on our real one, and crashed more than once during test runs. As you can imagine, stopping work for 5 days was simply not an option… Luckily, we were able to resolve this issue using the awesome BFG repo cleaner. We used it to do a first-pass cleanup of the git history, which significantly reduced the number of commits we then needed to rewrite with git filter-branch.
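The filter-branch step can be sketched on a toy repository. The folder names (`SmartApp/`, `SharedLib/`, `Legacy/`) are invented for illustration; the index filter empties each rewritten commit and restores only the folders being extracted, while `--prune-empty` drops commits that end up touching nothing:

```shell
#!/bin/sh
# Toy demonstration of extracting selected root folders with their history.
# Folder names are hypothetical; the real repo had ~55,000 commits.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1  # silence the deprecation notice on recent git

repo=$(mktemp -d)/mono
git init -q "$repo" && cd "$repo"
git config user.email demo@example.com
git config user.name  Demo

mkdir SmartApp SharedLib Legacy
echo app > SmartApp/app.cs  && git add . && git commit -qm "add smart app"
echo lib > SharedLib/lib.cs && git add . && git commit -qm "add shared lib"
echo old > Legacy/old.cs    && git add . && git commit -qm "add legacy code"

# Rewrite every commit: clear the index, then restore only the folders we
# want to keep from the commit being rewritten ($GIT_COMMIT).
git filter-branch -f --prune-empty --index-filter '
  git rm -r --cached -q --ignore-unmatch -- . &&
  git reset -q "$GIT_COMMIT" -- SmartApp SharedLib
' -- --all

git ls-files  # Legacy/old.cs is gone; SmartApp and SharedLib survive with history
```

On a repository our size this kind of rewrite was far too slow on its own, which is why the BFG first pass mattered so much in practice.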
Results
It took us a few epic weeks, some hair pulling, and a lot of broken or unstable builds, but in late July we finally had a fully separated code base! One git push later we were on GitHub, with full history preserved!

In a previous post we explained how we use “A3 thinking” to drive continuous improvement and define the current and target state of our processes. You can see what we wanted the target state to look like, and the actual results, below!

The new dependency graph fits on one (A3!) sheet of paper
Key takeaways
- Our main solution now contains “only” 61 projects (down from 320). As discussed in previous posts, this leads to faster builds (17 seconds on average, instead of roughly 1 minute) and a more robust application
- The size of our repository went from 3.5 GB to… 35 MB! This makes working with git a lot nicer and allows for very fast checkouts on our build server.
- By moving to GitHub, we will be able to establish a more formal code review process via pull requests, and potentially to drop our old homegrown bug tracking system in favour of GitHub Issues. We also get seamless integration with a lot of cool online services and with Jenkins, our build system.
- Most importantly, we are now in full control of all the code we ship. We will be able to release more frequently as a result — probably a topic for a future post :)
Lessons Learned
Like many other software teams, we reached a point where working on a large monolithic code base was no longer maintainable. This migration is the largest step we have taken yet towards adopting a more flexible “microservice-like” architecture. For all our new development, we are extremely careful to define clear boundaries across services and avoid dependencies that are not absolutely necessary.
We also realised the perils of excessive code generation. To be clear, we still like code generation. We even open-sourced Nimrod recently, a tool that helps .NET developers automatically generate TypeScript models for their APIs. But code generation can be overused, and our template-based system led to too much code duplication and coupling.
Enforcing consistency across projects is a very noble idea and has many benefits, but at scale it can become counterproductive and frustrating. Our new development model lets us choose whichever technology is most appropriate for each problem we have to solve, which is super exciting as a developer.
We were worried that debugging the code brought in via NuGet would be complicated and time-consuming, and were planning to set up a symbol server to make this easier. In practice, however, our low-level code is stable and well tested, and it hasn’t been a problem so far. Worrying about things like this is one of the reasons why we didn’t do this earlier, and there is a lesson here about being paralysed by all the things that might happen.
Finally, it took us too long to act. We knew about NuGet and the other things we could do to fix our ballooning solutions problem, but for a long time we struggled on with the problems anyway. This ultimately made the repository split more complicated and time-consuming. In the future we will try our best to be more proactive when it comes to managing technical debt.
About me

This post was written by me, Clément. I’m a Software Developer at RES. I think I have been a programmer all my life, but only realised it a year and a half ago. Before that, I held various positions in the Technical team, doing some flow modelling and number crunching. Python is my weapon of choice, but I’ll use anything if it means learning new things.

