Published in VirtusLab

Contributing to scalafmt

Working on open-source software can be a very rewarding experience, whether you contribute to a library that you use daily or just want to learn something new out of plain curiosity. scalafmt is an example of an open-source project that is very mature and well managed. While contributing, I could count on the review and guidance of two great Scala community members, @tgodzik and @kitbellew. Many thanks for your help!

Due to the upcoming release of Scala 3, much of the work listed here still needs to be done. I had a chance to do a part of it, and in this article, I will show you what implementing a feature in the Scala formatter looks like. If you want to contribute to scalafmt, hopefully you can find valuable information here.

Some Scala 3 features, like given imports, were pretty easy to introduce, as you can see in the related pull request. The implementation that I will analyze, the formatting of the using soft keyword, was a bit more complicated, which makes it a good example for a case study. If you are interested in what the final pull request looks like, you can check it out here. The whole pull request is too extensive to analyze at once. That’s why, in this part of the series, we are only going to focus on creating the test suite. The implementation will be described in a future post.

Let’s first take a quick look at the feature itself — the using soft keyword.

While former Scala versions supported only regular keywords, a new concept of a soft keyword has been introduced in Scala 3. It denotes a token that is a keyword but only under specific conditions. Otherwise, it can be used as an identifier.

One of the soft keywords is using. It is treated as a keyword if it appears at the start of a parameter list or an argument list, and as an identifier otherwise. The Scala 3 reference gives examples of all three scenarios: using in a parameter list, using in an argument list, and using as a plain identifier.
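As a minimal sketch of those three scenarios, the snippet below adapts the max example from the reference. The names Ord, intOrd, explicitMax and identifierMax are illustrative assumptions and are not taken from scalafmt or from the pull request.

```scala
// Scala 3. Ord and intOrd are illustrative names, not part of scalafmt.
trait Ord[T] {
  def compare(x: T, y: T): Int
}

given intOrd: Ord[Int] = new Ord[Int] {
  def compare(x: Int, y: Int): Int = Integer.compare(x, y)
}

// 1. Keyword: `using` opens a parameter list.
def max[T](x: T, y: T)(using ord: Ord[T]): T =
  if ord.compare(x, y) < 0 then y else x

// 2. Keyword: `using` opens an argument list.
val explicitMax: Int = max(2, 3)(using intOrd)

// 3. Identifier: anywhere else, `using` is an ordinary name.
val using: Int = 10
val identifierMax: Int = max(3, using)
```

Note that in the last line using is deliberately not the first argument, because at the start of an argument list it would be read as the keyword.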

Scala 3 introduces context parameters. Thanks to them, we don’t have to write repetitive arguments explicitly. The using soft keyword declares parameters whose arguments the compiler can synthesize for us. If you are familiar with Scala 2, these parameters work very similarly to implicit parameters. To learn more about this feature, see the reference.
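To make the comparison with Scala 2 concrete, here is a small sketch, assuming a hypothetical Show type class. It declares the same method once with using and once with implicit, and lets the compiler supply the last argument in both calls.

```scala
// Scala 3. Show, intShow, describe and describeImplicit are hypothetical names.
trait Show[T] {
  def show(x: T): String
}

given intShow: Show[Int] = new Show[Int] {
  def show(x: Int): String = s"Int($x)"
}

// Scala 3 style: a context parameter declared with `using`.
def describe[T](x: T)(using sh: Show[T]): String = sh.show(x)

// Scala 2 style, still accepted by Scala 3: an implicit parameter.
def describeImplicit[T](x: T)(implicit sh: Show[T]): String = sh.show(x)

// The compiler synthesizes the last argument in both calls.
val viaUsing: String = describe(42)
val viaImplicit: String = describeImplicit(42)

// The argument can still be passed by hand, marked with `using`.
val passedByHand: String = describe(42)(using intShow)
```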

First, we need to clone the scalafmt repository and open it with the IDE of our choice. For the sake of this example, we are going to use the VSCode editor alongside the Metals extension. If you would like to get more familiar with how this IDE helps you in your Scala development, read my article that explores Scala 3 support in Metals. Below I clone the repository from GitHub using VSCode.

We will need the Metals VSCode extension installed to proceed with further steps.

After opening the cloned workspace, the only thing left to do is to import the SBT build definition with Metals using the popup window.

The build definition import is going to take a while. Afterwards, the workspace will be ready for development.

Although the scalafmt codebase is very mature and its general structure doesn’t change much anymore, some elements mentioned in this article will differ from the most recent version. For reference, the exact file versions that I ended up with are tests, documentation and configuration. The resulting commit is 93c2d3afd22b1d8e1a747ecfc3cf90b96af669a7 and you can check it out locally.

For now, we know what we want scalafmt to do: format code containing the using soft keyword. What do we start with, then? The test-first approach turns out to be extraordinarily practical when working on support for a new feature, because it helps us twofold. First, it has the standard benefit of giving us feedback on whether the introduced code works and doesn’t break anything. Second, it helps us define what we want the resulting code to look like, which is not so obvious for some of the new features.

We are going to start the actual work by creating a unit test suite. To proceed, we first need to understand how the unit tests are organized within scalafmt. Their structure is unique in comparison with the usual way of testing Scala projects. Its purpose is to facilitate testing the core functionality of the tool. It does so by letting the contributor provide small code excerpts, in the form of statements, containing the specific feature that will be tested.

To run the format tests, we are going to use the tests/testOnly *FormatTests command in the sbt shell (from a terminal, quote the task so that it is passed as a single argument: sbt "tests/testOnly *FormatTests"). For now, it is going to run all the format tests. Format tests are located within the scalafmt-tests module under src/test/resources. Usually, a new test suite should be put under the test/resources/test directory. This is because the other directories not only categorize the test files; some of them also provide additional configuration that influences how the tests are run.

Dotty is the name of the language and compiler that became Scala 3. That’s why I tend to refer to the occurrences of Dotty in this pull request as Scala 3.

This is the case with the scala3 directory. It provides the configuration that is helpful while testing the formatting of Scala 3 features. To find out what exactly happens when the scala3 tests are run, search for the string scala3 using the Search: Find in Files VSCode command. The search should point you to the spec2style method in HasTests.scala. It returns a ScalafmtConfig that contains the Scala 3 Dialect, which Scalameta uses to understand this version of the language. The test cases containing Scala 3 snippets are not going to be parsed correctly without this configuration.
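The idea that a directory name selects a base configuration can be modeled with a toy sketch. Everything in it is hypothetical: Dialect, TestConfig, specOf and configFor are names made up for illustration and are not scalafmt's or Scalameta's API, so treat it as a picture of the mechanism rather than the actual spec2style implementation.

```scala
// Hypothetical model: none of these types are scalafmt's own.
enum Dialect {
  case Scala213, Scala3
}

final case class TestConfig(dialect: Dialect, maxColumn: Int = 80)

// The first path segment under src/test/resources acts as the "spec".
def specOf(relativePath: String): String =
  relativePath.takeWhile(_ != '/')

// Pick a base configuration from the spec name.
def configFor(spec: String): TestConfig = spec match {
  case "scala3" => TestConfig(Dialect.Scala3)
  case _        => TestConfig(Dialect.Scala213)
}

val usingSuite: TestConfig = configFor(specOf("scala3/Using.stat"))
```

In this model, a file placed outside scala3 would be parsed with the older dialect, which is why a Scala 3 snippet placed there would fail to parse.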

Let’s now create a new file named Using.stat under scala3 and put the first test case in there. How do we know what to start with? A good starting point is the Scala 3 reference, which describes all the new features alongside examples, including the previously mentioned chapter on using clauses. The rule of thumb is simply to take those examples and make test cases out of them. In practice, however, they require some tweaks. They are also usually not enough for a complete test suite, because we might want to test some more complex situations as well. Hopefully, after implementing them, we will have a better idea of what else should be tested for the test suite to be complete.

Let’s again take a look at the first example from the reference. It’s just a slightly more powerful maximum function that takes two arguments and returns the greater one based on the provided ordering.

The first test case, created from that example, looks like the one below. Additional comments preceded by # explain the syntax of scalafmt test cases.

Take a look at the ONLY keyword at the top. When we put it before the <<<, the usual command sbt tests/testOnly *FormatTests is going to run only this specific test suite instead of all the formatting test suites. For the complete instructions on formatting tests, refer to the guide.

Notice that I’ve got rid of the function body because we don’t care about it in the context of our test case. We expect two things from the code statement: first, that it contains the syntax we want to test, and second, that it parses. An empty body in the form of {} is sufficient for this purpose.

Next, pay attention to the maxColumn = 40 scalafmt rule — it is crucial for this test case because the whole to-be-formatted statement is a little bit longer than 40 characters. Hence, we expect the line break to be inserted by scalafmt in the correct place.

How do we know where exactly the line should break? I can think of at least three different possibilities. We don’t want to blindly guess which one of them is correct though.

The first thing to do before selecting the correct place for the line to break is to look for syntax that exists in Scala 2 and can be used as an analogy. Of course, such an analogy isn’t always present, and creating the expected formatting result can be much more complicated in that case. However, as I’ve already mentioned, the using soft keyword is analogous to the implicit keyword from Scala 2. The relationship between them is even explained in detail.

The part that interests us at this point is the one related to the using clauses. Based on that part, we can conclude that in parameter lists using behaves in the same way as implicit. Argument lists differ, though: there, using has to be written explicitly.

In case of any doubts about the correctness of our assumptions, we can always look at the standard context-free syntax of Scala 3. Each reference chapter usually contains the corresponding part, including the one on using clauses.

Based on the investigation, we already know that in parameter lists using behaves like implicit. How can we use this knowledge to work out the expected formatting of using clauses? Let’s simply replace using in our test case with implicit and see what the result of the test run is. Running tests/testOnly *FormatTests in the sbt console gives us the information that we are looking for: the test output shows how scalafmt formats the implicit variant.

Sometimes, if the related entry exists in the documentation, another approach would be to look at it and notice what the default behavior should look like.

Now, we finally know that the complete test case should look like the following one.
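As a sketch of such a test case, assuming the layout described above (configuration at the top, the test name after <<<, the input statement, and the expected output after >>>): the test name is made up, and the expected half simply mirrors the style scalafmt documents as the default for implicit, with the break after the keyword. Treat the exact indentation as an assumption rather than a copy of the file in the repository.

```
maxColumn = 40
<<< using clause in a parameter list
def max[T](x: T, y: T)(using ord: Ord[T]): T = {}
>>>
def max[T](x: T, y: T)(using
    ord: Ord[T]
): T = {}
```

The input statement is 49 characters long, so with maxColumn = 40 the formatter is forced to break the line somewhere, which is exactly what the test is meant to pin down.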

For now, the formatter produces a different result: the newline character is inserted before using instead of after it.

At this point, we have to decide if we want to complete the test suite with other relevant test cases at once or if we just want to make this one test case work for now. In this article, however, we focus on creating the test suite. A couple more test cases can be created in a similar way to the first one by taking examples from the reference chapter on using clauses. How do we know what other scenarios we should cover with the unit tests? The significant thing to keep in mind while working on scalafmt is that it comes with a lot of configuration options. To find out which of them are relevant for us at a specific moment, we can again use the observation that the formatting of using is very similar to that of implicit.

Let’s look for some existing unit tests containing implicit parameter lists and notice what configuration options they have. Right-click on the scalafmt-tests/src/test/resources directory in the explorer view and select Find in folder…. Next, type (implicit in the search bar. The ( prefix is helpful because it allows us to find only the occurrences of implicit that open a parameter list.

There are still plenty of them, but we can narrow down the search by looking only for those that contain some relevant configuration options. This leads us, for example, to the beforeImplicitKW.stat and afterImplicitKW.stat test suites. Among others, the newlines.implicitParamListModifierPrefer and newlines.implicitParamListModifierForce config options appear there. Looking at the documentation, we can see that they control the newline behavior around the implicit keyword.

To ensure that using is also provided with those configuration options, we have to do exactly three things. First, decide what to do with the option’s name since it indicates that it is only related to the implicit and parameter lists. Second, update the scalafmt documentation. Third, create test cases with all the possible configuration variants.

For the first matter, after a discussion with other contributors, we decided to create an alias for the existing option. It means that we are now able to use newlines.implicitParamListModifierXXX and newlines.usingParamListModifierXXX interchangeably. Both spellings apply to using and implicit alike. In scalafmt, an alias like this is pretty easy to introduce. The only thing you have to do is put the annotation from metaconfig over the existing option in the Newlines config file.

Moreover, in the context of using, these options work not only for parameter lists but also for argument lists (without the need to specify any different option called newlines.usingArgListModifierXXX).
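From the user's point of view, the alias means that the two spellings below should be interchangeable in a configuration file. This is a sketch based on the option names mentioned above, with before picked as an example value.

```
# Either spelling selects the same behavior for both `implicit` and `using`.
newlines.implicitParamListModifierPrefer = before

# Alias introduced for Scala 3 code bases (use one spelling or the other):
# newlines.usingParamListModifierPrefer = before
```

Keeping a single underlying option with two names avoids having two settings that could contradict each other.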

As for the second matter, whenever we introduce changes like the ones above, we have to keep the documentation up to date. In this case, it was enough to add a short note about the change, without rewriting any more extensive part of it.

The third thing is to create the tests that make use of all the variants of those options. Let’s consider one of them, in which the newlines.usingParamListModifierForce = [before] option is used. Based on common sense and the relevant documentation, we know that this option forces a new line to be inserted before using (or implicit). The maxColumn = 60 option is also crucial here, as the line has to break within a specific range of tokens.
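A test case for this variant could look roughly like the sketch below. The test name and the format signature are assumptions borrowed from the style of the documentation examples for implicit, and the expected half is what the documented force-before behavior suggests, not output copied from a test run.

```
maxColumn = 60
newlines.usingParamListModifierForce = [before]
<<< force a line break before using
def format(code: String, age: Int)(using ev: Parser, c: Context): String = {}
>>>
def format(code: String, age: Int)(
    using ev: Parser,
    c: Context
): String = {}
```

The signature up to the closing parenthesis of the using clause is 64 characters long, so it cannot stay on one line with maxColumn = 60, and the forced break has to land right after the opening parenthesis.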

With as many configuration options as scalafmt has, it is sometimes tedious to work out which ones are relevant. Fortunately, you can always count on more experienced reviewers, who can suggest what else needs to be tested. The remaining option that turned out to be related to using is verticalMultiline.atDefnSite. I won’t get into details because it can be approached in a similar way to the previously considered options.

One more thing to remember is that, unlike implicit, using is a soft keyword. We therefore need to test how the formatter behaves when using occurs in the role of an identifier. The example below is valid Scala 3 code that contains using both as an identifier and as a keyword.
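One way such code can arise, sketched here with hypothetical names (Prefix, defaultPrefix), is a method that is itself called using, as in the resource-handling helpers common in Scala 2 code, and that also takes a using clause.

```scala
// Scala 3. Prefix and defaultPrefix are hypothetical names.
final case class Prefix(value: String)

given defaultPrefix: Prefix = Prefix("demo")

// `using` as an identifier (the method name) and as a keyword (the clause).
def using[A](resource: A)(using prefix: Prefix): String =
  s"${prefix.value}: $resource"

// Identifier in function position, context argument synthesized.
val implicitCall: String = using(42)

// Identifier in function position, keyword at the start of the argument list.
val explicitCall: String = using(42)(using Prefix("custom"))
```

A formatter has to tell these two roles apart: only the second using in the last line should be subject to the newline options discussed earlier.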

That’s it for now! As I said at the very beginning, this pull request wasn’t the easiest one. The complete test suite contains as many as 25 test cases. Almost all of them were based on the three approaches described in this article. They are all relevant while implementing other features as well.

In scalafmt, creating the test suite for the specific feature that we want to implement is half of the success. The other half is to make those tests pass. Are you interested in what the process of implementing the actual functionality looked like? Stay tuned for the next part!