3 Tips For Maintaining Your Scala Projects
Here are my three tips for maintaining Scala projects:
- Use Scala version upgrade as your learning opportunity
- Manage your core libraries in a monorepo
- Find your Scala Ninja
Use Scala Version Upgrade As Your Learning Opportunity
Upgrading the Scala version is not an easy task, especially if you use a lot of third-party libraries. Although Scala’s syntax has not changed much, libraries built for Scala 2.11, 2.12, and 2.13 are binary incompatible with each other, so before upgrading your Scala projects you need to wait until the developers of all your dependencies release new versions for your target Scala version. For minor version upgrades, the Scala team at Lightbend has done a great job of maintaining binary compatibility, so you can use the same binaries within a minor version series such as 2.12.x. The Scala compiler version matters only for major version upgrades, such as 2.11 to 2.12 or 2.x to 3.x.
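This compatibility rule is visible in artifact names: libraries published for Scala 2.x conventionally carry the binary version as a suffix (for example `_2.12`), which is what sbt's `%%` operator appends for you. Here is a minimal sketch of that naming rule, assuming stable Scala 2.x release versions only (milestone builds such as 2.13.0-M3 and Scala 3 follow different rules). `ScalaBinaryVersion` and its methods are hypothetical names used for illustration.

```scala
object ScalaBinaryVersion {
  // "2.12.4" -> "2.12": stable Scala 2 releases that share the first two
  // version components are binary compatible with each other.
  def binaryVersion(fullVersion: String): String =
    fullVersion.split('.').take(2).mkString(".")

  // Artifact name as it appears in a Maven repository, e.g. "airframe-log_2.12"
  def artifactId(libraryName: String, fullVersion: String): String =
    s"${libraryName}_${binaryVersion(fullVersion)}"

  // Two compiler versions can share library binaries when their binary versions match
  def canShareBinaries(a: String, b: String): Boolean =
    binaryVersion(a) == binaryVersion(b)
}
```

With this rule, 2.12.3 and 2.12.4 resolve to the same artifact, while 2.11.12 and 2.12.4 need separately published binaries.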
My recommendation here is to turn the pain of a Scala version upgrade into a learning opportunity. Using a newer Scala version has always been a fun and exciting experience in itself. I’ve been using Scala since 2009. The first version I used was Scala 2.7, but it was almost like a joke; at that time Scala’s primary benefit was the ability to use Scala and Java together, but converting Scala collections for Java APIs was not straightforward, so I never thought it would be a replacement for Java. The turning point was Scala 2.8, which introduced a new collection library that greatly improved compatibility with Java collections. Scala 2.9 added parallel collections, which make it easy to use multiple threads for processing collections. Scala 2.10 added string interpolation, a handy syntax for embedding Scala expressions inside a string value. With each major release since Scala 2.9, programming in Scala has become easier and more enjoyable.
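As a small illustration of the string interpolation syntax mentioned above, here is a sketch using the standard `s` interpolator (available since Scala 2.10). The object and method names are made up for this example.

```scala
object InterpolationExample {
  // $name embeds a variable; ${...} embeds an arbitrary Scala expression
  def describe(name: String, items: Seq[Int]): String =
    s"$name has ${items.size} items, total = ${items.sum}"
}
```

For example, `InterpolationExample.describe("cart", Seq(1, 2, 3))` builds the string without any manual concatenation or `String.format` calls.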
Although there have been many version upgrades, the cost of learning Scala itself is actually not that high. The only book I used to learn the Scala language was Programming in Scala, co-written by Martin Odersky, a professor at EPFL and the creator of Scala. If you have some computer science background, this book is the one to read. The documentation on the web has improved a lot, so you may be able to learn Scala from web materials alone, but I still recommend reading this book to fully understand Scala’s beautiful language design. For example, type variance and mix-ins in Scala elegantly solve some of the shortcomings of Java generics and multiple inheritance.
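To make the last point concrete, here is a minimal sketch of declaration-site variance and trait mix-ins. All names (`Animal`, `Cat`, `Box`, `Greeter`, `Excited`) are invented for this example and do not come from the book.

```scala
trait Animal { def name: String }
case class Cat(name: String) extends Animal

// Covariance: the + annotation makes Box[Cat] a subtype of Box[Animal],
// declared once on the type instead of with wildcards at every use site.
final case class Box[+A](value: A)

// Mix-ins: traits can carry implementation and be stacked in a defined order.
trait Greeter { def greet(who: String): String = s"Hello, $who" }
trait Excited extends Greeter {
  override def greet(who: String): String = super.greet(who) + "!"
}
class ExcitedGreeter extends Greeter with Excited

object VarianceAndMixins {
  def nameOf(box: Box[Animal]): String = box.value.name

  val catBox: Box[Cat] = Box(Cat("Tama"))
  val catName: String  = nameOf(catBox) // accepted because Box is covariant
  val greeting: String = new ExcitedGreeter().greet("Scala")
}
```

In Java, the first part would need `Box<? extends Animal>` at each call site, and the second would need either delegation or default methods without a well-defined stacking order.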
Scala 2.11 improved the macro and reflection mechanisms and explored meta-programming in Scala: you can now generate Scala code at compile time to reduce the burden of writing boilerplate code. For example, airframe-log uses Scala macros to embed convenient logging code, and airframe-surface uses Scala reflection to inspect object shapes, so you can generate your own object serializer/deserializer for each object type without hand coding. Scala 2.12 added support for Java 8 and improved compilation performance. Scala 2.13 is on its way as the next major release, with further compiler performance improvements. Scala 3.x will use the new Dotty compiler, which will improve IDE integration, incremental compilation, whole-code optimization, etc.
Scala’s development is quite active, and the establishment of the Scala Center at EPFL, which has its own engineers and researchers, is also good news for the community. If you keep watching Scala, every year there will be at least a couple of things that are fun to learn.
Manage Your Core Libraries in A Monorepo
OK. I understand that Scala is evolving, but how can we try new Scala features in our projects?
My best practice is to create an open-source project as a monorepo, that is, a single GitHub project that contains several submodules. For example, twitter/util is a well-known monorepo in Scala.
What is the benefit of a monorepo? For open-source projects in general, focusing on a specific problem and providing a tiny solution is good practice. But as you gain experience and start creating several open-source projects, it becomes harder to manage all of them. Opening and checking the activity of multiple GitHub projects is time consuming. Releasing a new version of each project becomes painful, even if you started the project to help other people. The most annoying thing for me was that whenever I upgraded one project, I also needed to upgrade the other projects using that library, even though these were all my own projects.
So in 2016 I moved all of my essential core libraries into a monorepo structure. Airframe is such a collection of lightweight building blocks for Scala, and it includes code used in almost all of my product development in Scala. For example, Airframe has libraries for logging, object serialization, YAML-based configuration, JMX metric collection, JDBC connection pooling, command-line option parsing, object surface inspection, dependency injection (DI), etc. I’m maintaining Airframe to support Scala 2.11, 2.12, and Scala 2.13.0-M3 (a milestone release). I still need to keep the Scala 2.11 build because Apache Spark still requires Scala 2.11.
This repository is actually a history of my experiments with Scala’s new functionality (e.g., macros, reflection, string interpolation). If I find something useful in a new Scala version, I usually try it in a new module or apply it to the existing code in Airframe. One of the oldest pieces of code in Airframe is the StopWatch class for measuring the performance of code blocks. It was written in 2012, initially in Java, and ported to Scala when I was evaluating Scala as an alternative to Java. After finishing the migration from Java to Scala, I became totally comfortable using Scala. This was an early step in my learning of Scala.
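A monorepo of this kind is typically expressed in sbt as a multi-project build. The fragment below is a minimal sketch, not Airframe’s actual build: the module names (`log`, `config`), the `com.example` organization, and the version numbers are placeholders, and the slash syntax assumes sbt 1.1 or later.

```scala
// build.sbt (illustrative sketch; names and versions are placeholders)
ThisBuild / organization       := "com.example"
ThisBuild / scalaVersion       := "2.12.4"
ThisBuild / crossScalaVersions := Seq("2.11.12", "2.12.4")

// The root project only aggregates modules; it publishes nothing itself.
lazy val root = (project in file("."))
  .aggregate(log, config)
  .settings(publish / skip := true)

lazy val log = (project in file("log"))
  .settings(name := "example-log")

// A module that depends on another module in the same repository,
// so both are always compiled and tested together.
lazy val config = (project in file("config"))
  .settings(name := "example-config")
  .dependsOn(log)
```

With a layout like this, running `+test` from the root cross-builds and tests every module for each listed Scala version, so a change in one module is immediately checked against the modules that depend on it.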
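To give a flavor of what such a utility does, here is a minimal sketch of timing a code block with a by-name parameter. This is not Airframe’s StopWatch implementation; `StopWatchSketch` and `time` are hypothetical names.

```scala
object StopWatchSketch {
  // `body` is a by-name parameter, so it is evaluated inside this method
  // rather than at the call site. Returns the result and elapsed nanoseconds.
  def time[A](body: => A): (A, Long) = {
    val start   = System.nanoTime()
    val result  = body
    val elapsed = System.nanoTime() - start
    (result, elapsed)
  }
}
```

Usage looks like `val (sum, nanos) = StopWatchSketch.time { (1 to 1000).sum }`. A single measurement like this is noisy on the JVM, so a real utility would repeat the block and report statistics.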
Reduced Release Time: Managing core projects in a monorepo is also good for reducing deployment time. Airframe’s release process is fully automated. For each commit to the master branch, new binaries for Scala 2.11, 2.12, and 2.13 are deployed to the Sonatype snapshot repository, and for general releases, simply adding a new git tag is sufficient: Travis CI detects the tag and publishes a new release to Maven Central. Open sourcing is good in that you can use existing ecosystems such as GitHub, Travis CI, Sonatype, and Maven Central.
When we were using the sbt-release plugin, deployment took more than 2 hours because we needed to run the test and deploy processes sequentially for multiple Scala versions (2.11, 2.12, and 2.13), plus Scala.js for some projects used for web UI development. After giving up on sbt-release, we can run tests for multiple Scala versions in parallel using Travis CI, and we only need to attach a git tag to a commit that has already passed the CI tests. The current release step can be done within 10 minutes.
Key technologies to enable this quick release are as follows:
- sbt-sonatype: Automating synchronization from Sonatype to the Maven Central repository.
- sbt-dynver: Automating versioning of your project based on git revisions and tags.
- sbt-release-early: Releasing your project for each master branch commit. Airframe does not use this plugin directly, but basically follows the same practice.
- Automatic Releases to Maven Central with Travis and SBT: Automating the GPG signing step on Travis CI.
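Wiring these plugins into a build might look like the fragment below. This is a sketch under assumptions, not Airframe’s configuration: `x.y.z` is a placeholder for versions you have verified yourself, the group IDs are the ones these plugins were long published under and may have changed in newer releases, and sbt-pgp is my assumption for the GPG signing step since the list above does not name a plugin for it.

```scala
// project/plugins.sbt (illustrative sketch; replace x.y.z with verified versions)

// Sonatype -> Maven Central synchronization (provides the sonatypeRelease task)
addSbtPlugin("org.xerial.sbt" % "sbt-sonatype" % "x.y.z")

// Derives the project version from git tags and revisions,
// so no version number is hard-coded in the build
addSbtPlugin("com.dwijnand" % "sbt-dynver" % "x.y.z")

// Assumption: GPG signing of artifacts via sbt-pgp (provides publishSigned)
addSbtPlugin("com.jsuereth" % "sbt-pgp" % "x.y.z")
```

Because the version comes from the git tag, the CI job only has to notice a tagged commit and run the signing and release tasks; nobody edits a version file by hand.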
The libraries in Airframe are used in our production code, so maintaining them is crucial for our business. With this release automation, our engineers can release a new version or a hotfix without waiting for the project owner. Also, by aggregating all core libraries in one place, you can test the behavior of dependent libraries in the same place.
I don’t recommend managing all of your projects in a monorepo structure, but if you have a set of libraries that are commonly used across your projects, the monorepo approach is powerful and convenient.
Find Your Scala Ninja
Whenever I try to upgrade the Scala version and check the latest versions available for dependent libraries, I often see GitHub pull requests created by @xuwei_k. He actively helps many projects with Scala version upgrades, sbt upgrades from 0.13.x to 1.x, Java 9 support, etc.
And now he’s called the Scala Ninja.
I think one of the reasons he fixes so many projects is that he is actually a stakeholder as a user of these Scala libraries. Another reason is that he has learned a lot about existing Scala projects through his upgrade experiences, so once he finds a common pattern, he can contribute it to all the projects he knows. I highly appreciate his familiarity with Scala projects and his contributions.
Scala is a growing programming language, so making PRs to upgrade Scala versions is helpful because it accelerates the migration of the entire Scala community to support the latest Scala compiler version.
If the release automation process described above is commonly used, merging such a PR and getting the latest library version will be much easier.
Scala Community Build is another way to see the migration status, and I’m happy to see some activity to support upgrading commonly used Scala projects in their dependency order. It seems that there are several Scala Ninjas at Lightbend and EPFL as well. Your company also needs its own Scala Ninja to maintain the libraries used in your product code.
Again, here are my three tips:
- Enjoy using new Scala versions, and use this opportunity to learn new things. During this step, you will also learn about the dependencies of your projects and become more familiar with these third-party libraries.
- Aggregate core projects into a monorepo so that you can test and upgrade multiple libraries at once.
- Automate the release step, and help your team and your Scala Ninjas!