Introducing Astra: A Tool for Refactoring Java Programs at Scale

Joseph Hoare
Engineering at Alfa
5 min read · Apr 15, 2021

One of the most important features of good software is changeability. Software needs to change.

As a codebase grows, it can gradually become harder to change, and harder to avoid accumulating ageing, hard-to-maintain code. Projects with lots of features often use many dependencies. When a new version of a dependency makes backwards-incompatible changes to an API, an upgrade may be blocked until you update all of the places it’s used. In some cases this can be a huge task, more than can be done manually. Large, feature-rich codebases can be a huge asset, but also a liability.

Astra is a tool for analysing and refactoring Java source code. It’s open-source, and it’s on GitHub. We developed it at Alfa as part of our innovation process, to improve the velocity at which large-scale refactors can be performed. It’s been designed to be an extensible framework available for others to build on.

Astra aims to make changing Java programs easy. With Astra, you can write automated refactors that match on uses of the old API, and then upgrade them to be compatible with the new one. It’s not limited by what a developer can reasonably refactor manually, or the volume of code they can import into their IDE. When changes are easily automated, the size of a codebase stops limiting its ability to change.

When to use Astra

Whenever we make a code change, it’s good practice to try to fix any deprecation warnings we notice. Some of them appear again and again.

Some methods provided by Apache’s ObjectUtils are now deprecated, and the maintainers recommend switching to equivalents from Java’s Objects. Making the switch in one place doesn’t take long, but it gets repetitive quickly, and backlogs, priorities and deadlines might not allow you to spend the time updating every call.
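To make that concrete, here is a minimal sketch of the kind of change involved. The class and method names (EqualsMigration, sameValue) are ours, invented for illustration. The old call is shown only in a comment so that the snippet compiles without commons-lang3 on the classpath.

```java
import java.util.Objects;

// Hypothetical example class, not taken from any real codebase.
final class EqualsMigration {

    // Before (requires commons-lang3, and triggers a deprecation warning):
    //   return ObjectUtils.equals(a, b);

    // After: the JDK equivalent, which is also null-safe.
    static boolean sameValue(Object a, Object b) {
        return Objects.equals(a, b);
    }
}
```

One call like this is a thirty-second fix. The trouble starts when there are thousands of them.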

So why not just use regular expressions and String manipulation on the Java code itself? Nuances! Looking solely at the text of the source code, without any extra information, will not help identify nuances. It’s hard to match on the text “equals” without also matching other uses of those characters, such as other methods declared with that name, text in comments or Javadoc, and invocations of other “equals” methods (like other statically imported methods). This is why it’s so useful to take account not just of the text of the code but also of its semantics, so that we understand what the code means.
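A small experiment shows the problem. The source snippet below is made up for illustration, and NaiveEqualsSearch is a hypothetical helper of ours. It counts how often two plausible regular expressions match, when only one line is actually a call we want to change.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper showing why text-only matching over-matches.
final class NaiveEqualsSearch {

    // Four lines of made-up Java source; only the second is a real target.
    static final String SOURCE = String.join("\n",
        "// TODO: stop calling ObjectUtils.equals(a, b)",          // a comment
        "boolean same = ObjectUtils.equals(a, b);",                // the real target
        "boolean other = name.equals(\"x\");",                     // String.equals
        "public boolean equals(Object o) { return this == o; }");  // a declaration

    // Counts the non-overlapping matches of the regex in the text.
    static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }
}
```

Against this snippet, the pattern "equals\\(" matches all four lines, and even the more specific "ObjectUtils\\.equals\\(" still matches the comment as well as the real call. A tighter pattern would in turn miss statically imported calls, which never mention ObjectUtils at the call site.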

How does Astra work?

Abstract Syntax Trees (ASTs): incredibly useful structures for understanding code, provided to Astra by the Eclipse Java development tools framework. Astra stands on the shoulders of giants.

The rich information from the AST can be used to make informed changes, enabling matching based on features of code — like finding types based on their names, annotations, whether they are classes, interfaces, abstract, their access modifiers and more. But this isn’t just limited to types; Astra provides other matchers (like the MethodMatcher), with the option to define your own — used to match on any features of source code.

Astra uses matchers in combination with refactoring operations. To address our example of updating callers from one “equals” method to the other, we can provide Astra with some configuration (implement UseCase.getAdditionalClassPathEntries with an absolute path to a commons-lang3 jar), and a refactor that matches invocations of ObjectUtils.equals and changes them to call Objects.equals instead.
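The idea of pairing a matcher with a change can be sketched in plain Java. To be clear, this is a toy model and not Astra’s API: MatcherSketch, Call, matcher and retarget are all hypothetical names, and a “call” here is just a pair of strings standing in for what the AST would tell us about a method invocation. For the real thing, see the Astra documentation on GitHub.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Toy model of "matcher + refactoring operation". Not Astra's API.
final class MatcherSketch {

    // Stand-in for a method invocation whose declaring type is already known.
    static final class Call {
        final String declaringType;
        final String methodName;

        Call(String declaringType, String methodName) {
            this.declaringType = declaringType;
            this.methodName = methodName;
        }

        @Override
        public String toString() {
            return declaringType + "." + methodName;
        }
    }

    // Hypothetical matcher: selects calls by declaring type and method name,
    // so String.equals and ObjectUtils.equals are never confused.
    static Predicate<Call> matcher(String declaringType, String methodName) {
        return call -> call.declaringType.equals(declaringType)
            && call.methodName.equals(methodName);
    }

    // Hypothetical refactor: moves matching calls to a new declaring type
    // and leaves everything else untouched.
    static List<Call> retarget(List<Call> calls, Predicate<Call> matcher, String newType) {
        return calls.stream()
            .map(call -> matcher.test(call) ? new Call(newType, call.methodName) : call)
            .collect(Collectors.toList());
    }
}
```

Given one call on org.apache.commons.lang3.ObjectUtils and one on java.lang.String, both named “equals”, retargeting with a matcher for the first moves it to java.util.Objects and leaves the second alone. That distinction is exactly what text matching struggles to make.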

This is a very quick way to define simple refactors that can be run across codebases of any size. At Alfa, we have used Astra to refactor thousands of method invocations across thousands of files at a time. It also fits quite well with some definitions of the essence of refactoring:

A controlled technique for improving the design of an existing codebase. Its essence is applying a series of small behaviour-preserving transformations, each of which “too small to be worth doing”.
- Martin Fowler

What we like most about Astra

It’s powerful

It’s incredibly satisfying to set up a refactor, and watch it power through mountains of code, thundering through changes at a pace you’d never be able to manage yourself. I found watching the log files scroll by with thousands of updates extremely gratifying :)

At Alfa we’ve used it for some pretty huge things, like changing our date-time handling from one pattern to another, across our entire codebase.

It’s easy to get started

Astra has a neat and intuitive test framework. When you’re thinking about your refactor, you probably have an idea of what you have now (and want to change), and how you’d like it to look. In other words, you already have an expected input and an expected output. Astra can take real Java files as that input and output for unit testing refactors, using the AbstractRefactorTest. This framework makes it easy to TDD your refactors.

Why not have a look in the wiki and get started by unit testing a real refactor?

It makes it easier to do the right thing

Switching and consolidating to newer libraries allows us to:

  • Ship fewer libraries with our products
  • Deprecate and remove libraries, and switch to more fully featured, better maintained alternatives
  • Reduce the number of code paths — fewer methods means fewer opportunities for bugs
  • Improve performance — we found, for example, removing multiple representations of dates (and layers of conversion logic between them) meant we dropped the associated work of instantiating/calculating/parsing — win-win!

Some technical debt tends to stick around for a long time, cluttering code. There can be extra effort to understand and work around it, plus the potential to cause insidious bugs. Spending the time to remove debt on a case-by-case basis can feel less important than other work, so by lowering the effort level, Astra can change this balance.

How annoying is it to see the same warnings again and again?

Astra makes it worthwhile to get rid of them, all in one go. It can be the enabler for a whole class of large-scale change.

If you’re interested, why not take a look at the documentation on GitHub, and fix that linter warning in your codebase?

Joseph Hoare is a Principal Software Engineer at Alfa Financial Software.