Ditching Gson’s Field Naming Strategy

A quick reflection

Snippet of Gson’s FieldNamingPolicy file.

I recently started working on a codebase that was completely new to me, one that is almost 6 years old but that has, luckily, aged very well. Many early bets have proven to be right and have definitely paid off over time: the wide adoption of MVP, RxJava, Dagger… the whole gang.

The interaction with the backend is also very traditional, with JSON payloads passed back and forth. The tool used to de-serialize JSON into POJOs and vice-versa is Gson, which has been the de facto standard for quite some time now.

I was very familiar with Gson, having used it on multiple occasions, both for work projects and personal ones. However, when I started looking at the code, things weren’t as I expected them to be.

The lack of annotations

The @SerializedName annotations were nowhere to be seen. For context, this annotation tells Gson to use the value of the annotation as the key to serialize and de-serialize the respective class’ member. For example:

data class User(@SerializedName("first_name") val name: String)
"first_name": "David"

Gson normally uses the member name as the JSON key, but by using @SerializedName you can provide a different key that Gson will use.
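To make this concrete, here is a minimal round trip. It is only a sketch, assuming Gson is on the classpath and a default Gson instance with no naming policy configured; it is not code taken from the codebase discussed here.

```kotlin
import com.google.gson.Gson
import com.google.gson.annotations.SerializedName

// Same model as above: the Kotlin property is `name`, the JSON key is `first_name`.
data class User(@SerializedName("first_name") val name: String)

fun main() {
    val gson = Gson()

    // Serialization uses the annotation value as the key.
    println(gson.toJson(User("David"))) // {"first_name":"David"}

    // De-serialization reads the same key back into `name`.
    val user = gson.fromJson("""{"first_name":"David"}""", User::class.java)
    println(user.name) // David
}
```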

Why use @SerializedName in the first place then, rather than simply relying on the member names? The typical answer is: naming conventions. I’m pretty sure that the following JSON snippet looks weird to at least some of you:

"defaultUser": {
  "firstName": "David",
  "lastName": "Rossmann"
}

Whereas this looks way more familiar, doesn’t it?

"default_user": {
  "first_name": "David",
  "last_name": "Rossmann"
}

That’s because we’re used to a naming convention for JSON that relies on snake case. At the same time, applying this convention to Java/Kotlin code would look just as weird to at least some of you:

data class AnotherUser(
    val first_name: String,
    val last_name: String
)

Anyway, back to the codebase. The members of the model classes followed the standard camel case Java/Kotlin convention. There were no annotations, so I assumed Gson would be using the member names as the key. I then proceeded to check this theory by looking at the JSON payloads from the backend, but to my surprise, those were following the snake case convention that JSON typically comes with.

What the heck? 🤔

TIL: field naming policies

My TIL, or “Today I Learned”, for that day was discovering that you can instruct Gson to automatically apply a different naming convention to every serialization and de-serialization. This is done through a FieldNamingStrategy, and Gson provides a handful of ready-made ones by means of the FieldNamingPolicy enum.

The default naming strategy is FieldNamingPolicy.IDENTITY, which means that Gson will use the class member’s name as is, without altering it. The most relevant policies are the following (the enum defines a few more, such as UPPER_CAMEL_CASE_WITH_SPACES):

IDENTITY                       firstName > firstName
LOWER_CASE_WITH_DASHES         firstName > first-name
LOWER_CASE_WITH_UNDERSCORES    firstName > first_name
UPPER_CAMEL_CASE               firstName > FirstName
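If you want to see what each policy does in the Gson version you have installed, a small sketch like the following prints the translation for every constant. The Sample class is hypothetical and exists only to provide a field named firstName.

```kotlin
import com.google.gson.FieldNamingPolicy

// Hypothetical holder class; only the name of its field matters here.
class Sample {
    val firstName: String = ""
}

fun main() {
    val field = Sample::class.java.getDeclaredField("firstName")

    // FieldNamingPolicy implements FieldNamingStrategy, so every constant
    // can translate a java.lang.reflect.Field into its JSON key.
    for (policy in FieldNamingPolicy.values()) {
        println("${policy.name}: ${policy.translateName(field)}")
    }
}
```

Iterating over the enum avoids hardcoding the list, which differs between Gson releases.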

And, in fact, the Gson instance in charge of serializing and de-serializing the backend payloads was indeed instructed to use a custom field naming strategy, like so:

val gson = 
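A minimal sketch of what such a setup typically looks like. It assumes the policy in play was LOWER_CASE_WITH_UNDERSCORES, which is consistent with camel case members and snake case payloads; the Account class is hypothetical.

```kotlin
import com.google.gson.FieldNamingPolicy
import com.google.gson.GsonBuilder

// Hypothetical model with camel case members and no annotations.
data class Account(val firstName: String, val lastName: String)

fun main() {
    // Assumption: snake case policy, inferred from the payloads described above.
    val gson = GsonBuilder()
        .setFieldNamingPolicy(FieldNamingPolicy.LOWER_CASE_WITH_UNDERSCORES)
        .create()

    // Snake case keys are mapped onto the camel case members.
    val json = """{"first_name":"David","last_name":"Rossmann"}"""
    val account = gson.fromJson(json, Account::class.java)
    println(account) // Account(firstName=David, lastName=Rossmann)
}
```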

Good, that settles it. But it led me to reflect a little bit more on this choice, and to propose a change.

Being explicit

My first reaction to this choice was unease. To me, this looked like a ticking time bomb, hidden in the codebase, ready to go off at any given time. But before jumping to conclusions, I read up on the community’s current stance on field naming strategies: maybe I was a lone wolf who had simply never seen them in use, while they were in fact widely adopted.

As it turns out, the ability to customize naming policies is a pretty well-known feature of Gson, and it’s well received by a big chunk of the community. However, I’m not the first person to question whether it’s such an obvious choice, as Jesse has written in this wonderful article (which I recommend reading).

I settled on using the @SerializedName annotations, for a number of reasons.

First and foremost, it is explicit. The annotation doubles as a hint that the JSON key differs from the member name and should not be changed, while also signalling that the member itself can safely be renamed. This is a form of self-documenting code: it greatly reduces the need to maintain documentation around the naming, or to require anybody to read it, so onboarding becomes simpler and more effective.

Second of all, it is less error prone. By relying on explicitness rather than on assumptions, we reduce the risk of a team member renaming a member for the sake of consistency and accidentally changing the JSON key along with it. I’ve discussed it with a couple of friends (h/t Stefano), and their experiences suggest that the chances of making mistakes with naming conventions and with annotations are nearly identical. But even assuming that’s true, I’d still take explicitness over implicitness because of the first point.
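The renaming risk can be sketched as follows. Both classes are hypothetical, and the example assumes the LOWER_CASE_WITH_UNDERSCORES policy; imagine that each of them originally had a member called firstName.

```kotlin
import com.google.gson.FieldNamingPolicy
import com.google.gson.GsonBuilder
import com.google.gson.annotations.SerializedName

// Hypothetical models: both had a member called `firstName` that someone
// later renamed to `givenName` for the sake of consistency.
data class ImplicitUser(val givenName: String)
data class ExplicitUser(@SerializedName("first_name") val givenName: String)

fun main() {
    val gson = GsonBuilder()
        .setFieldNamingPolicy(FieldNamingPolicy.LOWER_CASE_WITH_UNDERSCORES)
        .create()

    // The policy derives the key from the member name, so the rename leaks out.
    println(gson.toJson(ImplicitUser("David"))) // {"given_name":"David"}

    // The annotation takes precedence over the policy, so the key is unchanged.
    println(gson.toJson(ExplicitUser("David"))) // {"first_name":"David"}
}
```

Nothing fails at compile time in the first case: the code still builds, and the payload simply stops matching what the backend expects.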

Lastly, more modern frameworks like Moshi are moving away from naming policies, and existing tools tend to support annotations but refuse to support naming policies (e.g., gsonvalue).

Bottom line

After all these reflections, we discussed within the team how to move forward, and we reached consensus on marking members with annotations. In the process, we even spotted silent errors that had gone unnoticed only because of the naming policies that were in place, which we were very happy to fix.

We are hopeful that this choice will prove to be the right one going forward. The annotations approach is no silver bullet, and it won’t prevent mistakes from happening. But it will reduce the scope of a potential investigation, and hopefully have a developer think twice before changing a name.