My year of refactoring a large scale AngularJS project…

Kristof Konings
Published in The Startup
Sep 24, 2019 · 15 min read

In this article, I will share my experiences from migrating a large front-end project from Grunt, Bower and AngularJS to a state-of-the-art technology stack and architecture.

When I joined the project, it had reached a point that may be familiar to many who have worked on fast-growing, long-lived, complex products: developing new functionality was getting slower and release candidates became increasingly difficult to stabilise.

The reasons for these issues could be traced back to a number of key causes:

  • Application state was distributed across different services.
  • Data objects were passed as function arguments from domain to domain. As a consequence, data could be changed by reference in ways that were hard to trace.
  • Circular dependencies existed between multiple domains in the code base.
  • Templates and styling were duplicated.
  • Dependencies were outdated.
  • The company was growing fast and multiple teams started working on the same codebase.
Photo by Greg Rakozy on Unsplash

On the positive side, we had trustworthy unit test coverage in our project, so every element did what it was supposed to do. When bugs arose, they typically lived in the complex coupling between all of those elements.

It was clear that there was work to do, but we could not migrate the whole legacy application to a modern project at once. That would:

1) cost too much work,

2) put feature delivery on hold for too long,

3) risk “forgotten” requirements.

To summarise: we started from a large application running on Grunt, Bower and AngularJS. After a year we had progressed to a mono-repo with a package per domain, using Webpack and a hybrid AngularJS-Angular setup. And, importantly, while doing so we still delivered new functionality that was in high demand by our customers. #customersfirst

Keep everything stable, all the time

We always push our code to the Git master branch. We don’t have feature branches, and we don’t do pull requests to master either. We just add code to master directly, in small commits. Continuous development, you could say.

Every small commit triggers a build, which also runs a bunch of automated tests. Those automated tests guarantee a certain level of quality for each build.

This means that we have to do our refactoring work in small changes that keep the application stable, all the time. If not, the changes will not get propagated in the build pipeline.

To limit the risks, we put new code behind a feature toggle, so users would not be affected by a code change as long as the toggle was disabled. However, when refactoring at this high level, toggles are not always easy to use.

A feature toggle is nothing more or less than an if-else code block that switches your code execution depending on the state of the toggle. A feature toggle is set per environment and does not change during code execution.

This code block is not removed at build time; the if-else is executed at runtime. This gives us a safety net to disable a feature even when it is already deployed to an environment. Even at a client installation.
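In its simplest form, such a toggle is just this (a minimal sketch; all names are illustrative):

```typescript
// A minimal feature toggle sketch. The toggle values are loaded once per
// environment and never change while the application runs.
declare function renderNewAssetList(): void;    // the refactored code path
declare function renderLegacyAssetList(): void; // the old, proven code path

const featureToggles: Record<string, boolean> = { newAssetList: false };

export function renderAssetList(): void {
  if (featureToggles['newAssetList']) {
    renderNewAssetList();
  } else {
    // Disabling the toggle falls back to the old behaviour instantly,
    // even on a client installation.
    renderLegacyAssetList();
  }
}
```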

By the way, we only refactored parts of the application when we touched them to add a functional requirement anyway. This made it clearer for our QA team what to test.

Componentising UI elements

First, let’s stop the wild growth of legacy code!

In our own component library, we defined components that are used in multiple places in the application, with a division into smart and dumb components (container & presentational components).

The dumb components were, for example, buttons, input elements, a progress bar, etc. Those components have no business logic besides visualisation, user interaction and validation.

The smart components, on the other side, have some knowledge of the business. They have actions that are not directly coupled to the business logic of the application. Those components have configurable inputs and event handlers as outputs.

The container components, composed of these smart components, connect those event handlers to the business logic of the application.
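To make this concrete, here is a hedged sketch of what such a dumb component could look like in AngularJS 1.5+ (the names and template are illustrative, not our actual library):

```typescript
// A "dumb" presentational component: it only visualises data and emits
// events; it knows nothing about the business logic.
export const progressBarComponent = {
  bindings: {
    value: '<',    // one-way input: the percentage to display
    onCancel: '&', // output: the container decides what cancelling means
  },
  template: `
    <div class="progress-bar">
      <div class="progress-bar__fill"
           ng-style="{ width: $ctrl.value + '%' }"></div>
      <button ng-click="$ctrl.onCancel()">Cancel</button>
    </div>
  `,
};

// A container component wires the output to real business logic:
//   <progress-bar value="$ctrl.uploadProgress"
//                 on-cancel="$ctrl.abortUpload()"></progress-bar>
```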

We started by refactoring some heavily used domains of the application that were in much need of a user-experience redesign anyhow. That made it a win-win on both the technical and the functional side. This is important: it kept us focussed on delivering value to our clients.

Unidirectional data flow

Our front-end project is a large-scale project where the functional requirements can be very complex, with a huge amount of data, lists, assets, … used and manipulated across all the domains of the application.

In the legacy code flow, lists of assets were passed through the application as parameters from one method of some service to another service, to another service, … Each service has its own responsibility, and each service can add or change data on that list.

JavaScript passes objects by reference, so a list of objects is in fact a list of references. If you change an attribute on an object inside a list, code that consumes this list by reference, even in another domain of your application, gets magically updated.

To me this feels like quantum entanglement. Developers in the web community probably once saw this as a great feature for fast prototyping, but for us it was leading to more and more bugs that were difficult to investigate.

Why is this passing by reference such a pitfall? Isn’t it powerful?

Yes, it is indeed powerful: when you change an object, the whole application has the latest version of it. And when you persist the object, you always persist the version your whole application was working on.

True, and it makes for easy code writing. However, imagine seeing this code a year later (or a week, in my case ;-)) and you don’t remember a thing. Imagine then having to reverse engineer which code, somewhere in the application, manipulated a property of that object. You will feel lost… because it could be anywhere!

(Fun fact: even the data logged by a console.log gets updated if you don’t clone what you print. Which I seem to forget every Monday morning.)
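A small illustration of the pitfall (names made up):

```typescript
// Two "domains" holding the same list: both variables point to the
// same objects, not copies.
const assets = [{ id: 1, isHidden: false }];
const listInDomainA = assets;
const listInDomainB = assets;

// Clone before logging to get a snapshot of this exact moment; some
// consoles evaluate logged objects lazily and show the mutated version.
console.log(JSON.parse(JSON.stringify(listInDomainB[0])));

// Domain A mutates "its" list…
listInDomainA[0].isHidden = true;

// …and domain B magically sees the change: quantum entanglement.
console.log(listInDomainB[0].isHidden); // true
```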

What made it even harder: we did not have typings, interfaces, … so we also did not know which properties an object could have. We ended up with the properties isHidden, isVisible and shown on the same asset, which basically all did the same thing…

But again, we couldn’t refactor the full application. We couldn’t refactor the most crucial data flow of our application at once without causing everything to break.

Sometimes it happened that when we changed something in one domain, something else magically broke in another domain, caused by data being modified by reference.
We started calling this “the party tent” effect, referring to setting up a party tent: when you plug in the last piece, something on the other side comes loose, and a battle between you and the party tent begins.
https://www.google.com/search?q=party+tent

So we started to use a unidirectional data flow in the new domains of the application to counter the existing chaotic data flow. Like combing knots out of dreadlocks. But it gave us more control over the data flow.

We did this by creating a very small Redux-like library that could easily be integrated in all parts of the application.

  • It contained an event-manager, a state-manager and a process-manager.
  • The business logic was defined in processes; processes could use other AngularJS services, or even custom libraries. In the end, a process could manipulate the state in the state-manager (read: the store) and it would throw an event.
  • On user interaction, components could trigger a process and subscribe to events, through which they could read (immutable) data from the state-manager. Later we could also get an RxJS subject from the state-manager to subscribe to.

This was our custom library, very lightweight. It was not our goal to invent yet another Redux library, but we wanted all further development to move in the direction of reactive programming immediately.
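A rough sketch of the three parts, with illustrative names (the real library was more elaborate):

```typescript
// Event-manager: components subscribe, processes publish.
class EventManager {
  private listeners = new Map<string, Array<(payload?: unknown) => void>>();
  on(event: string, fn: (payload?: unknown) => void): void {
    this.listeners.set(event, [...(this.listeners.get(event) ?? []), fn]);
  }
  emit(event: string, payload?: unknown): void {
    (this.listeners.get(event) ?? []).forEach((fn) => fn(payload));
  }
}

// State-manager (read: the store): hands out immutable copies so that
// no consumer can change the state by reference.
class StateManager {
  private state: Record<string, unknown> = {};
  set(key: string, value: unknown): void {
    this.state[key] = value;
  }
  get<T>(key: string): T {
    return JSON.parse(JSON.stringify(this.state[key])) as T;
  }
}

// Process-manager: a process holds the business logic; it may call
// services, update the state, and finally throws an event.
class ProcessManager {
  constructor(private events: EventManager, private state: StateManager) {}
  run(name: string, process: (state: StateManager) => void): void {
    process(this.state);
    this.events.emit(`${name}:done`);
  }
}
```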

Migrating to Webpack

A year ago we still used Grunt as our build manager. Community support for Grunt was already dead at that moment, so Grunt was not a build tool you would want in any future scenario.

We started with Webpack for only one file: application.ts, which contained the AngularJS manual bootstrap script. Webpack compiled this file to a build.js, which was included in the index.html. From there, Grunt took over. This was the first, smallest and stable step we took. Webpack was initialised, and the build was green at the end of the day. 🎉
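That very first configuration could not get much smaller; something along these lines (paths are illustrative):

```typescript
// webpack.config.ts: bundle only the bootstrap file; Grunt still does
// everything else at this stage.
import * as path from 'path';
import type { Configuration } from 'webpack';

const config: Configuration = {
  entry: './src/application.ts',
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: 'build.js', // included by index.html; Grunt takes it from here
  },
  resolve: { extensions: ['.ts', '.js'] },
  module: {
    rules: [{ test: /\.ts$/, loader: 'ts-loader' }],
  },
};

export default config;
```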

Then we started moving dependencies, which were still in bower_components at that time. Some of them were easy to move: just add them to the vendors list of Webpack and done. Others were very difficult to move because the version we were using was not available on npm. Why not just upgrade those dependencies? No: we avoided risk and kept a green and releasable build at every moment.

Some libraries needed to be globally accessible because existing code addressed them globally. Again, to avoid risk, we did not want to change the code consuming those libraries; we worked around it by using the expose-loader in our build process to provide them. Later on, when migrating to TypeScript, we replaced this with imports. #smallcommits #keepthingsstable
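For example, exposing Lodash on the global scope looks roughly like this (rule syntax for expose-loader 1.x; earlier versions used a query-string style):

```typescript
// Extra webpack rule: make `window._` available again for legacy code
// that still addresses Lodash globally.
const exposeLodashRule = {
  test: require.resolve('lodash'),
  loader: 'expose-loader',
  options: { exposes: ['_'] },
};
```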

After all the bower_components dependencies were moved, we could finally remove Bower from our stack. All those dependencies were now included in the build.js file, which of course still got picked up by the Grunt build process.

Hold on, time for an eclair to celebrate ;-)

Photo by Aliona Gumeniuk on Unsplash

OK, the next step was to get rid of Grunt. We wanted to load the JS, HTML and SCSS files with Webpack. We started with the SCSS files: find the appropriate Webpack loader and remove the inclusion from the index.html file. Step by step, pushing to master all the time.

Before moving the JS files to Webpack, you must realise that soon there will be no global context available anymore. If you want to use a previously global function, you have to export it and import it where it is used. Keep adding imports and exports until everything works again.

After this, we could just load all the JS files with a require.context().keys().forEach kind of trick. #smallcommits #keepeverythingworking
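The trick, roughly (the path and pattern are illustrative; require.context is a Webpack-specific API):

```typescript
// Eagerly require every legacy JS file so that all angular.module(...)
// registrations still run, without listing the files one by one.
const legacyContext = require.context('./app', true, /\.js$/);
legacyContext.keys().forEach((key) => legacyContext(key));
```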

In the meantime: Goodbye AngularJS, Hello Angular

While migrating the legacy code to modern code, we decided that every new functionality, as in: not an extension of existing components, had to be written in Angular instead of AngularJS. Otherwise we could not stay on top of this: new legacy code would be introduced faster than the old code was migrated.

As you know, our strategy was to migrate step by step, not to rebuild our application in a greenfield project. However, a greenfield project has a lot of advantages.

Take a look at the ecosystem around Angular: you are encouraged to use the setup provided by Angular, which makes future upgrades less painful. And as painful as the upgrade from AngularJS to Angular is… that is something you would like to avoid.

So we started a greenfield Angular project, based on the conventions existing in the community. This new project was at first nothing more than a container that bootstrapped one component: the root of our legacy project.

The first new functional additions to the product were built on top of our existing project. It was a div, made in Angular, that just floated above our application.

Communication between the legacy AngularJS and Angular goes via our own redux-like library. We glued NgRx and the redux-like library together with an action mapper. Not so difficult, because our redux-like library was just a native JavaScript library.
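A hedged sketch of such an action mapper (the event names and action types are made up):

```typescript
import { Store } from '@ngrx/store';

// Minimal shape of the custom redux-like event-manager described earlier.
interface LegacyEvents {
  on(event: string, handler: (payload?: unknown) => void): void;
}

// Map every relevant legacy event onto an NgRx action, so the Angular
// side can react to what happens on the AngularJS side.
export function bridgeLegacyActions(events: LegacyEvents, store: Store): void {
  events.on('asset:updated', (payload) =>
    store.dispatch({ type: '[Legacy] Asset Updated', payload }),
  );
}
```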

From JavaScript to TypeScript

Because there was no strict contract between the domains of the application, with this almost neural network of method calls, a lot of bugs could have been avoided by having a strictly typed contract. This brought in the need for TypeScript.

Again, our migration was step by step. Every time we touched a JavaScript file we converted it to TypeScript:

  • Change the name of the file.
  • Add your imports.
  • Replace var with let/const.
  • Fix your typing errors by just adding the any type for the time being. #smallsteps
  • Replace your Lodash functions with native functions where possible (_.find, _.reduce, …).
  • Create local interfaces, without exporting them yet. Later, when defining the final contract of your interfaces, you just import that one.
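For a small helper, the before/after could look like this (a made-up example):

```typescript
// Before (helpers.js):
//   var findVisible = function (assets) {
//     return _.find(assets, function (a) { return !a.isHidden; });
//   };

// After (helpers.ts):
interface Asset {       // local interface, not exported yet
  id: number;
  isHidden?: boolean;
}

// const instead of var, native Array.prototype.find instead of _.find
const findVisible = (assets: Asset[]): Asset | undefined =>
  assets.find((a) => !a.isHidden);
```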

After some months of working on our stories to bring functionality to our clients, all the code we had touched was converted to TypeScript. For the last 12 files we just bit the bullet and converted them.

Yes, another milestone. Time for eclairs! 🎉

Splitting up our codebase

To be honest, our Webpack-based build at that point could be improved. The build time and the time for the unit tests to run were too long. Unit tests were running out of memory. Yes, it was holding us back at some moments. But we fixed it…

This was caused by the size of our project and the inefficient way our Webpack build was set up, but also because too many developers were pushing to the same code base. Our build pipeline got slow and unstable.

When you changed something small, all the unit tests needed to run. Those unit tests alone took about 5 minutes. (Do that 10 times and you lose an hour of your day.)

Therefore we started to split off parts of the application into separate packages. In a mono-repo style, we moved each domain to a package with its own build process and unit test process.

We used Lerna to be able to run things in parallel and publish small artefacts to our private repo. The container app just had to import them, and everything worked like before.

Because we were splitting up our code base by domain, we could also easily give the responsibility for certain packages to certain teams. Each team has its strict boundaries for supporting and maintaining a package. If a bug was logged in one of those domains, it was clearer which team had to fix it, and that team could get the fix in more easily, without any hard approval from the team maintaining the parent project.

Unit tests could now run in parallel: the 6000 unit tests that previously ran in one queue took up to 12 minutes to complete. Run them in parallel in chunks of 500 and they complete in 4 minutes, ⅓ of the time. Once this is in place, you get your return on investment very soon.

Oh yes, we have invested a lot in testing; quality is important to us.

Splitting your codebase into different Git projects, however, can create complex dependencies between projects. We had lost a lot of time on problems with version numbers of dependencies between codebases. In a mono-repo setup this is less of a concern.

Angular Hybrid

When more and more components were written in Angular, the need to reuse those components in AngularJS started to rise. Instead of only developing in Angular around the application, we also needed to use Angular inside AngularJS.

We had to refactor some pain points, but we were able to use the Angular-hybrid ui-router. From that moment on, it was possible to use Angular components in our legacy code.
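One common mechanism for this is downgrading an Angular component so AngularJS templates can render it; a sketch (the component and module names are illustrative, and our exact wiring via Angular-hybrid may differ):

```typescript
import { downgradeComponent } from '@angular/upgrade/static';
import { ProgressBarComponent } from './progress-bar.component';

declare const angular: any; // the global AngularJS, still present in hybrid mode

// After this, legacy templates can simply use <progress-bar></progress-bar>.
angular
  .module('legacyApp')
  .directive('progressBar', downgradeComponent({ component: ProgressBarComponent }));
```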

What we did not want to do was provide legacy AngularJS services to Angular via Angular-hybrid, because we did not want to create dependencies on legacy code.

Read more: https://github.com/ui-router/angular-hybrid

Again, time for eclairs!

Write components that are future proof

From the moment we touched an existing component to add new functionality, or when we added new components, we wrote those components to be future proof.

In the Angular migration path (https://angular.io/guide/upgrade) this component structure is well documented. It comes down to writing your controller as a class, with the structure of an Angular component in the back of your head.
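In practice that looks like an AngularJS 1.5+ component with a class controller, one-way inputs and output callbacks; a sketch with made-up names:

```typescript
// The controller is a plain class, just like an Angular component class.
class AssetCardController {
  asset!: { name: string };                        // input binding
  onSelect!: (locals: { asset: unknown }) => void; // output callback

  select(): void {
    this.onSelect({ asset: this.asset });
  }
}

export const assetCardComponent = {
  bindings: { asset: '<', onSelect: '&' },
  controller: AssetCardController,
  template: `
    <button ng-click="$ctrl.select()">{{ $ctrl.asset.name }}</button>
  `,
};
```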

Exit redux-like library

Our own redux-like library was good for what it was at the time: a way to start using a Redux flow in our application. In the meantime, our usage of NgRx in Angular was growing, and it did not make sense to have two different state managers.

We did not want to remove our existing library and refactor all usage of it. That would keep us busy for days and would probably carry a lot of risk, because you are hitting the heart of the application. We did not want that risk, because we deliver quality.

Instead, we turned our library into a facade around a Redux store. Every action on our library was now also available to our NgRx consumers. Because the library was well covered by unit tests, we could guarantee stability when switching to the new version.
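A hedged sketch of the facade idea (the method names are illustrative):

```typescript
import { Observable } from 'rxjs';
import { Store } from '@ngrx/store';

// The legacy library keeps its old API, but every call is forwarded to
// the NgRx store underneath, so both worlds see the same state.
export class LegacyStateManagerFacade {
  constructor(private store: Store<any>) {}

  // Legacy writes become NgRx actions handled by a reducer.
  set(key: string, value: unknown): void {
    this.store.dispatch({ type: `[Legacy] set ${key}`, value });
  }

  // Legacy reads become selections on the NgRx store.
  select<T>(key: string): Observable<T> {
    return this.store.select((state) => state[key] as T);
  }
}
```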

Write business logic in the new stack

Because the redux-like mechanism in the legacy code now interacts directly with the new NgRx way of working, it is from now on possible to write new business logic in the form of reducers and facades in the new Angular stack. The new and the old code are hooked together and work together transparently.

Remove dependencies between domains

We needed more isolation between the domains, but this was hard due to all the dependencies between services.

Those dependencies were mainly:

  • Services that we consumed application state from
  • Services providing methods that could trigger an action in another domain
  • Helper functions to avoid code repetition

For the latter, you have three options:

  • You move the code to a commons package that is imported in all of your packages.
  • You don’t include it via AngularJS dependency injection, but you import it as a TypeScript function: import { helperServiceWithoutState } from '@yourProject/yourPackage/helperServiceWithoutState'
  • Or you find peace with the fact that code repetition does not have to be a deadly sin.

For the services we used to consume state or to throw actions, you create a local facade. In AngularJS it was not possible to have two services with the same name in the application, but we solved this by adding a namespace to the service name.

Consider the example below: you want to call internalService.doSomething(). This method needs some data from an external service; external as in: from another domain.

If you namespace your service, you can keep the rest of the code unchanged, which limits the risk of bugs. We add the namespace 'thisDomain.externalService'.

In getSomeState() we at first simply call the external service, but later we can rewire that call to get the data from the store instead.
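A hedged reconstruction of that example (the original showed it as code images; the names follow the prose above):

```typescript
declare const angular: any;

// The facade is registered under a namespaced name, so it cannot clash
// with the real externalService that lives in the other domain.
angular.module('thisDomain').service('thisDomain.externalService', [
  'externalService',
  function (this: any, externalService: any) {
    this.getSomeState = function () {
      // First version: simply delegate. Later, this call can be rewired
      // to read the data from the store instead.
      return externalService.getSomeState();
    };
  },
]);

// internalService stays unchanged, apart from the injected name:
angular.module('thisDomain').service('internalService', [
  'thisDomain.externalService',
  function (this: any, externalService: any) {
    this.doSomething = function () {
      const someState = externalService.getSomeState();
      // ... business logic that uses someState ...
    };
  },
]);
```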

Greenfield

As you could read, we started with an in-place refactoring, from the inside outwards. We did not go for a complete greenfield refactoring, where you rebuild everything from scratch.

Am I against greenfield refactoring? No, on the contrary: the refactoring path we took now gives you more opportunities to rebuild parts of the application from scratch. Small greenfield refactorings that are easier to estimate than saying: we will just rebuild everything and be finished in a year!

Once the dependencies between your domains are cut, you can easily replace a package with a greenfield one. Put this package behind a feature toggle and you can work on it in parallel.

If your new package is not finished and you want to release from master, you ship a version of your application with the old package included. The new package, which you are still developing, stays hidden from your users, at little risk.

Risk of bugs is a big concern when refactoring at this scale. It was far from trivial to keep our application stable during this journey, even though we already had a lot of safety nets. Especially breaking down the by-reference data flow (the quantum entanglement) required lots of extra testing effort, because you could not see the effects of your changes in the code.

However, the more of this data-flow-by-reference you fix, the more stable your application gets. And the main benefit comes when you can reuse isolated parts of your application in a greenfield setup.

Good luck!

Special thanks to Anneleen, Tom, Karel and all the team members of team-eclair. We did an awesome job! ❤
