Code Generating Your Way to Happiness

e-Legion
Jul 3, 2018 · 15 min read

We all do routine work, and everyone writes boilerplate code. Everyone also agrees that it is better to automate that work and spend the time on interesting tasks instead. Here are some tips on how to make the computer do the routine work for you.

This article is based on the talk delivered by Zac Sweers, Android Developer at Uber mobile applications, at the MBLT DEV conference in 2017.

Uber has about 300 mobile application developers. I work on the team called the Mobile Platform. Its job is to simplify and improve the mobile development process as much as possible: we primarily work on internal frameworks, libraries, architecture, and so on. Because of our large staff, we take on large-scale projects that our engineers will need in the future, whether that is tomorrow, next month, or next year.

Codegen for automation

I would like to demonstrate the value of the code generation process and walk through several practical examples.

Let's start with an example of using KotlinPoet, a library with a good API for generating Kotlin code. So, what do we see here?
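
A sketch of what that snippet likely looked like (the package name and the Author annotation class are illustrative assumptions):

```kotlin
import com.squareup.kotlinpoet.AnnotationSpec
import com.squareup.kotlinpoet.AnnotationSpec.UseSiteTarget
import com.squareup.kotlinpoet.FileSpec

// Author is a hypothetical annotation class assumed to exist elsewhere.
val file = FileSpec.builder("com.example", "Presentation")
    .addComment("Generated, do not edit!")
    .addAnnotation(AnnotationSpec.builder(Author::class)
        .addMember("name = %S", "Zac Sweers") // %S emits the argument as a quoted string
        .useSiteTarget(UseSiteTarget.FILE)    // produces @file:Author(...)
        .build())
    .build()

file.writeTo(System.out)
```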

  1. FileSpec.builder creates a file named “Presentation”.
  2. .addComment() — adds a comment to the generated code.
  3. .addAnnotation() — adds an annotation of the Author type.
  4. .addMember() — adds the “name” member with a parameter, in our case “Zac Sweers”; %S emits the parameter as a quoted string.
  5. .useSiteTarget() — sets the annotation’s use-site target (here, the file).
  6. .build() — completes the description of the code to be generated.
After codegen we get the following:
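
Under the assumptions above, the generated Presentation.kt would look roughly like this:

```kotlin
// Generated, do not edit!
@file:Author(name = "Zac Sweers")

package com.example
```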

The result of code generation is a file with the name, comment, annotation, and author name. You might ask: “Why generate this code when I can write it in a couple of simple steps?” Fair enough, but what if I need a thousand such files with different configuration options? What happens when we start changing the values in this code? What if we have a lot of presentations? What if we have a lot of conferences?

As a result, we will come to the fact that supporting such a number of files manually becomes impossible — you need to automate. Therefore, the first advantage of codegen is getting rid of routine work.

Code building without errors

The second important advantage of automation is correctness. Humans are error-prone, especially when doing similar things over and over. Computers, on the contrary, perform such work perfectly.

Let’s look at a simple example. There is a Person class:
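
A minimal version of that class (the field names are assumed from the JSON example that follows):

```java
public class Person {
  final String firstName;
  final String lastName;

  Person(String firstName, String lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
  }
}
```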

Let’s say we want to add JSON serialization to this. We will do it with the Moshi library, as it is quite simple and great for demonstration. Create a PersonJsonAdapter that extends JsonAdapter with the <Person> type parameter.

Next, we implement the fromJson method: read the first and last name from the reader and construct a new Person value.

Next, we look at each field name in the JSON, check it, and put the value into the required field.

Is this going to work? Yes, but there is a nuance: we assume the JSON contains only the objects we read. We add one more line of code to filter out redundant data that may come from the server:
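
Putting the steps above together, the hand-written adapter ends up roughly like this (a sketch against Moshi's streaming JsonReader/JsonWriter API; the snake_case field names are an assumption). Note how much ceremony two fields already need:

```java
import com.squareup.moshi.JsonAdapter;
import com.squareup.moshi.JsonReader;
import com.squareup.moshi.JsonWriter;
import java.io.IOException;

public final class PersonJsonAdapter extends JsonAdapter<Person> {
  @Override public Person fromJson(JsonReader reader) throws IOException {
    String firstName = null;
    String lastName = null;
    reader.beginObject();                 // the JSON must be an object
    while (reader.hasNext()) {
      switch (reader.nextName()) {        // check each field name
        case "first_name":
          firstName = reader.nextString();
          break;
        case "last_name":
          lastName = reader.nextString();
          break;
        default:
          reader.skipValue();             // filter out redundant data
      }
    }
    reader.endObject();
    return new Person(firstName, lastName);
  }

  @Override public void toJson(JsonWriter writer, Person value) throws IOException {
    writer.beginObject();
    writer.name("first_name").value(value.firstName);
    writer.name("last_name").value(value.lastName);
    writer.endObject();
  }
}
```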

At this point we are deep in routine-code territory, and this example has only two fields. In real code there are a ton of places where accidental issues can slip in. What if we made a mistake?

Consider another example:
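
The example presumably showed a copy-paste slip of this kind. Here is a hypothetical, Moshi-free recreation of that class of mistake; it compiles fine and fails silently at runtime:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BugDemo {
  // Hand-mapping JSON-like fields by name; a one-line slip compiles fine.
  static String[] parse(Map<String, String> json) {
    String firstName = null;
    String lastName = null;
    for (Map.Entry<String, String> e : json.entrySet()) {
      switch (e.getKey()) {
        case "first_name": firstName = e.getValue(); break;
        case "last_name": firstName = e.getValue(); break; // bug: overwrites firstName
      }
    }
    return new String[] { firstName, lastName }; // lastName is silently null
  }

  public static void main(String[] args) {
    Map<String, String> json = new LinkedHashMap<>();
    json.put("first_name", "Zac");
    json.put("last_name", "Sweers");
    System.out.println(java.util.Arrays.toString(parse(json)));
    // prints [Sweers, null] and the compiler never complains
  }
}
```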

If you introduce even one issue per ten models or so, at least one problem is guaranteed somewhere in the codebase. This is where codegen can really help: with many classes, working without automation becomes impossible, because all people make errors, while generated code performs these tasks automatically and without mistakes.

Code generation has other advantages. For example, it can give you information about the code or tell you when something goes wrong. Codegen is also useful at the testing stage: if you use generated code, you can see how the production code will really look, and you can even generate code during testing to make your life easier.

Conclusion: it is worth considering code generation as a possible solution for getting rid of errors.

Now let’s look at the tools that help with codegen.

Tools

  1. JavaPoet and KotlinPoet libraries for Java and Kotlin respectively. These are the gold standards in codegen.
  2. Templating. A popular example of templating for Java is Apache Velocity, and for iOS — Handlebars.
  3. SPI — Service Provider Interface. It is built into Java: you define an interface, declare implementations of it in a JAR’s META-INF/services directory, and then look up all the available implementations at runtime.
  4. Compile Testing is a library from Google which helps to test compilation. Within the framework of codegen this means: “Here’s what I expected, and here’s what I finally got.” The compilation starts in memory, and then the system tells you whether the process has been completed or what errors have occurred. If the compilation is complete, you will be asked to compare the result to your expectations. The comparison is based on compiled code, so you shouldn’t worry about things like code formatting or something else.
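
For instance, the SPI lookup from item 3 boils down to java.util.ServiceLoader. A minimal sketch (the CodeGenerator interface is illustrative; real implementations would be declared in META-INF/services inside a JAR):

```java
import java.util.ServiceLoader;

public class SpiDemo {
  // Hypothetical service interface. Providers register by listing their
  // class name in META-INF/services/SpiDemo$CodeGenerator.
  public interface CodeGenerator {
    String generate();
  }

  public static void main(String[] args) {
    // Discovers every implementation declared on the classpath.
    ServiceLoader<CodeGenerator> loader = ServiceLoader.load(CodeGenerator.class);
    for (CodeGenerator gen : loader) {
      System.out.println(gen.generate());
    }
  }
}
```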

Code build tools

There are two main ways to hook code generation into the build:

1. Annotation Processing — you put annotations in the code, and during compilation the compiler hands your processor information about them, before it finishes working with the source code.

2. Gradle — an application build system with a lot of hooks (points in the build life cycle where you can plug in your own logic). It is widely used for Android development, and it also lets you generate source code that does not depend on the current sources.

Now let’s look at some examples.

Butter Knife

Butter Knife is a library developed by Jake Wharton, a fairly well-known figure in the developer community. The library is very popular among Android developers because it eliminates a lot of the routine work that almost everyone faces.

Usually we initialize view as follows:
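
Something like this (plain Android view lookup; the IDs are illustrative):

```java
// Inside an Activity's onCreate, after setContentView(...)
TextView title = (TextView) findViewById(R.id.title);
TextView subtitle = (TextView) findViewById(R.id.subtitle);
ImageView avatar = (ImageView) findViewById(R.id.avatar);
```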

With Butter Knife it looks like this:
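
A sketch of the annotated version (the activity and layout names are illustrative):

```java
public class FooActivity extends Activity {
  @BindView(R.id.title) TextView title;
  @BindView(R.id.subtitle) TextView subtitle;

  @Override protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_foo);
    ButterKnife.bind(this); // one call replaces all the findViewById lines
  }
}
```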

And we can easily add any number of views without onCreate growing with boilerplate.

Instead of manually binding view every time, you just mark these fields with @BindView annotations and pass the ID that they’re being bound to.

The nice thing about Butter Knife is that it analyzes the annotations and generates all of that repetitive code for you, and it scales nicely with new items: when new views appear, you do not need to touch onCreate or track anything manually. It also handles unbinding the views for you.

So what does this look like under the hood? findViewById is still there; it just lives in code generated during annotation processing.

We start from the @BindView-annotated field shown earlier.

This field is used in some FooActivity.

Its ID (R.id.title) acts as the target key. Note that during annotation processing this resolves to a constant int value.

That’s fine: this is exactly what Butter Knife needs access to anyway. The type is the TextView component, and the field itself is called “title”. Gathered into a holder, the data amounts to the owning class, the view type, the ID, and the field name.

So, all this data can be easily obtained during processing. It’s also very similar to what Butter Knife actually does under the hood.

As a result, this class is generated:
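
A sketch of the generated class, simplified from what Butter Knife actually emits:

```java
public class FooActivity_ViewBinding implements Unbinder {
  private FooActivity target;

  public FooActivity_ViewBinding(FooActivity target, View source) {
    this.target = target;
    // The same findViewById work you would have written by hand:
    target.title = Utils.findRequiredViewAsType(
        source, R.id.title, "field 'title'", TextView.class);
  }

  @Override public void unbind() {
    target.title = null;
    target = null;
  }
}
```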

Here we see all these pieces of data come together. The generated class is named after the target with a “_ViewBinding” suffix, for example FooActivity_ViewBinding. Every time you create an instance of this class, it immediately performs all of the binding for the target, and all of it is generated statically, up front, during annotation processing, so it is basically correct by construction.

Let’s go back to our software pipeline.

During annotation processing the system reads these annotations and generates the ViewBinding class. Then, at runtime, the bind method looks the generated class up by convention: it takes the target class name and appends “_ViewBinding”. The ViewBinding class itself is simply written out to the right location with JavaPoet during processing.

RxBinding

RxBinding itself does no codegen: it is not an annotation processor and not a Gradle plugin, just a regular library. It provides static factories that bridge the Android API into reactive streams. For example, where a view has setOnClickListener, RxBinding has a clicks method that returns an Observable of click events.

But in fact, there is codegen in RxBinding's own build.

Inside the buildSrc directory there’s a Gradle task called KotlinGenTask, which means all of the Kotlin artifacts are actually generated. RxBinding has Java implementations, plus Kotlin artifacts with extension functions for all of the target types, and the mapping between the two is extremely mechanical. So you can just generate all of the Kotlin extension functions and not have to maintain them separately.

So what does this actually look like?
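
Roughly like this on the Java side (a sketch of RxBinding's shape, with null checks and the helper Observable type trimmed):

```java
public final class RxView {
  // Emits an event every time the view is clicked.
  public static Observable<Object> clicks(View view) {
    return new ViewClickObservable(view); // wraps setOnClickListener under the hood
  }
}
```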

This is a pretty classic RxBinding method: “clicks” returns an Observable of click events, and the listener wiring happens under the hood. We will skip the extra code snippets for readability. In Kotlin it looks like this:

This extension function returns an Observable and internally calls straight through to the regular Java API. The only mapping needed is the event type, which in Kotlin becomes Unit:

That is, in Java it looks like this:

And here is the Kotlin code:
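
The generated Kotlin extension is roughly a one-liner (a sketch; RxBinding's Kotlin artifact maps the Java event object to Unit):

```kotlin
fun View.clicks(): Observable<Unit> = RxView.clicks(this).map(VoidToUnit)
```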

We have the RxView class that contains this method. We can substitute the relevant bits into a template: the target type, the method name, the type we are extending, and the return type. That is all the information we need to start writing out these methods:
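
In other words, this step is mechanical template substitution. A hypothetical, stripped-down version of it (the names are illustrative, not the real KotlinGenTask):

```java
public class KotlinTemplate {
  // Substitutes the target type, method name, and return type into a
  // fixed template for a Kotlin extension function.
  static String render(String target, String name, String returnType) {
    return String.format(
        "fun %s.%s(): Observable<%s> = Rx%s.%s(this)",
        target, name, returnType, target, name);
  }

  public static void main(String[] args) {
    System.out.println(render("View", "clicks", "Unit"));
    // prints: fun View.clicks(): Observable<Unit> = RxView.clicks(this)
  }
}
```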

Now we can just inline those values directly into the template, and the generated Kotlin code comes out the other end.

Service Gen

We do “Service Gen” at Uber. If you work in a company with shared specs or shared APIs between the back end and clients — Android, iOS, or Web — there is no reason to handwrite the models and services that make them work together.

We use the AutoValue library from Google for our model objects. It reads the annotated class and generates hashCode(), equals(), toString(), and the other boilerplate implementations. It also supports extensions.

We have a generic Rider object:
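
With AutoValue, the model is just an annotated abstract class. A sketch (the field set comes from the article; the accessor names are assumed):

```java
@AutoValue
public abstract class Rider {
  public abstract String id();
  public abstract String firstName();
  public abstract String lastName();
  public abstract String address();
}
```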

We have strings for the ID, first name, last name, and address. We use the Retrofit and OkHttp libraries for networking and JSON as the data format, with RxJava for reactive programming. This is what our generated API service looks like:
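
A hypothetical Retrofit service of that shape (the endpoint path and method name are invented for illustration):

```java
public interface RiderService {
  @GET("/v1/riders/{riderId}")
  Single<Rider> getRider(@Path("riderId") String riderId);
}
```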

We could hand-write this if we wanted to, and we did for a long time. But it is a huge waste of time, with a very real cost in both hours and money.

How Uber does it today

My team's recent task was to rebuild this tooling from scratch. We decided not to hand-write anything that goes over the network anymore, so we use Thrift, which is an interface definition language and a protocol at the same time. Uber uses Thrift to describe its APIs.

In Thrift, we define the API contracts between the backend and the client, and then simply generate the code that matches them. We use a library called Thrifty for parsing and JavaPoet for the codegen, and then we just generate the AutoValue implementations:
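
For example, a Thrift definition of the Rider model, from which the AutoValue class above would be generated (field numbering and requiredness are illustrative):

```thrift
struct Rider {
  1: required string id
  2: required string firstName
  3: required string lastName
  4: optional string address
}
```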

We do everything in JSON. There is an extension called AutoValue Moshi, which you plug into an AutoValue class with a static jsonAdapter method:
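 
The AutoValue Moshi extension keys off a static jsonAdapter method on the class; this is the extension's documented pattern, where AutoValue_Rider is the generated implementation:

```java
@AutoValue
public abstract class Rider {
  public static JsonAdapter<Rider> jsonAdapter(Moshi moshi) {
    return new AutoValue_Rider.MoshiJsonAdapter(moshi);
  }

  public abstract String id();
  public abstract String firstName();
  public abstract String lastName();
  public abstract String address();
}
```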

Thrift also helps in the development of services:

We also have to add some metadata here to tell it what end point we actually want to hit:
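
Thrift annotations can carry that routing metadata, e.g. as below (the annotation keys are hypothetical; Uber's actual scheme isn't public):

```thrift
service RiderService {
  Rider getRider(1: required string riderId) (
    path = "/v1/riders/{riderId}",
    method = "GET"
  )
}
```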

After codegen we get back our service: the same generated API interface we saw earlier.

But that is just one endpoint and one model, and as we saw earlier, no one ever has only one model. We have many models and many services to generate code for.

At this point we have about 5–6 applications and many services, all going through the same pipeline. Wiring this up manually would be crazy.

There is still one catch with JSON serialization: every generated adapter has to be registered with the Moshi instance, and asking engineers to hand-wire thousands of adapters through the DI graph is not realistic.

But since we work with Java, we can use the Factory pattern, with the factory generated by the Fractory library. We can generate it because we know about these types before compilation has occurred. Fractory generates an adapter factory like this:
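
The generated factory is essentially a big type switch. A simplified sketch (Trip is a second hypothetical model):

```java
public final class RiderJsonAdapterFactory implements JsonAdapter.Factory {
  @Override public JsonAdapter<?> create(
      Type type, Set<? extends Annotation> annotations, Moshi moshi) {
    if (type.equals(Rider.class)) return Rider.jsonAdapter(moshi);
    if (type.equals(Trip.class)) return Trip.jsonAdapter(moshi);
    return null; // not one of our models; let the next factory try
  }
}
```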

The generated code looks a bit ugly. If it offends the eye, you can rewrite it by hand.

Here you see the previously mentioned types and service names; the factory automatically figures out which adapter to choose and invokes it for you. But there is another problem: we have about 6,000 of these adapters. If we merge them all into one factory, we end up with “Eats” models going into the “Rider” app, or “Driver” models going into the “Rider” app, when they don’t need to. That bloats the code, and past a certain point it will not even fit in one .dex file. So we have to split the adapters somehow.

In the end, we analyze the spec in advance and generate a Gradle sub-project for each chunk of it:
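
The generated wiring is ordinary Gradle configuration, conceptually something like this (the project paths are invented):

```groovy
// settings.gradle: generated sub-projects, one per model domain
include ':models:rider'
include ':models:eats'

// apps/rider/build.gradle: the app depends only on the models it needs
dependencies {
    implementation project(':models:rider')
}
```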

And these dependencies just become Gradle dependencies under the hood: anything that uses the Rider models now depends on that sub-project and pulls in exactly the models it needs. The problem resolves itself, handled by the build system under the hood.

But now we have one more problem: N model factories, all compiled in different compilation units.

An annotation processor cannot simply read annotations out of external, already-compiled dependencies and run extra codegen on them.

Solution: the Fractory library helps us again with a clever trick, the same one Android’s data binding uses. Each compilation unit writes a small metadata file into its JAR, on the classpath, for later consumption:
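
Conceptually, each JAR ships a small metadata file describing what it contributes (the format here is invented for illustration):

```json
{
  "package": "com.uber.rider.model",
  "factory": "RiderJsonAdapterFactory",
  "models": ["Rider", "Trip"]
}
```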

Now, every time you build the application, you scan the classpath for these files and read the JSON out of them to learn which dependencies are available.

How it all fits together

We have Thrift. The spec goes to Thrifty, where it gets parsed, and then through a codegen tool that we call Jenga, which outputs Java files. All of this happens at the pre-build step, before compilation. During compilation, annotation processing takes over: AutoValue generates the implementations, calling out to AutoValue Moshi for the JSON support, with Fractory involved as well. And everything is preceded by the project-generation step, which creates the Gradle sub-projects in the first place.

Now that you see the broad picture, you can spot the tools mentioned earlier: Gradle, templating, AutoValue, JavaPoet for the codegen. These tools are not only useful on their own, but also perform well in combination with each other.

Cons of codegen

We need to talk about the pitfalls too. The most obvious disadvantage is code bloat and loss of control over the code. Dagger alone accounts for about 10% of the code in our app, and models account for a lot more — about 25%.

At Uber, we try to solve this by throwing out unnecessary code. We analyze the code to understand which areas are actually used; once we figure that out, we can apply transformations and see what happens.

We’re hoping this could cut the generated models down by about 40%, which would speed up app installation and operation, as well as save us money.

How codegen affects project build time

Code generation certainly accelerates development, but its effect on build time depends on the tools the team uses. With Gradle, for example, generation can run at a measured pace: models can be generated once a day rather than every time a developer builds.

Video

Learn more about the development process at Uber and other top companies

The 5th International Mobile Developers Conference MBLT DEV 2018 takes place in Moscow on September 28: 800 participants, experienced speakers, quizzes, and challenges for those who are into Android and iOS development.

The conference is organized by e-Legion and RAEC (Russian Association of Electronic Communications). Join the conference lineup: submit an abstract and receive a travel grant to share your expertise.

Not ready to give a talk? Visit the conference and learn from others. Grab your ticket here.
