Never say never
aka the Thrift `required` keyword in the real world
Another day, another breaking Thrift change was inadvertently checked in and broke lots of stuff. It’s time to revisit the basics of good Thrift API design. There are several Thrift backward-compatibility gotchas; I’ll focus only on the `required` keyword. A good source of information on Thrift is the Missing Guide.
The `required` keyword is intended to mark a field as a prerequisite for a successful Thrift call. Its proponents claim that it clearly declares API dependencies. I’m all for a good API contract, but the `required` keyword is a blunt tool and fundamentally flawed: it rules out much of API evolution. That is, once a field is marked as `required`, it’s required for eternity. There is no backward-compatible way to remove it later on, because any reader still running the old definition will reject a message that omits the field. And compared to a field without any `required` or `optional` marking, a `required` field doesn’t provide any additional compile-time safety.
At first glance, this doesn’t seem too stringent a requirement. After all, there are certain concepts that supposedly shouldn’t change, and should thus be safe to mark as `required` in a good API design. If only the world were that simple, or our understanding of the world that complete; both are typically false in reality. Here I’ll give two concrete (though made-up) examples. They show that what’s really immutable is some abstract concept, rather than the current representation of that concept (a field in a Thrift struct).
Example one. Take some “user service” that takes a GetUserRequest and returns a User object. Now, you may say that for GetUserRequest to work, you must pass an ID. A user identifier is so essential to GetUserRequest that we mark it as `required`. Reasonable. Let’s assume that at the beginning, the user ID is an integer, so we declare something like `1: required i32 userId`. A couple of years later, our service is tremendously successful, we run out of the 32-bit ID range, and we decide to migrate to a 64-bit ID scheme. There goes your immutability. In this case, what is immutable is the need to identify a user, rather than the particular i32 representation of the ID that we started with.
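To make example one concrete, here is a minimal Python sketch of how a server might cope with the ID migration if `userId` had never been marked `required`. The `GetUserRequest` dataclass is only a stand-in for a Thrift-generated struct, and the `userId64` field and the `resolve_user_id` helper are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GetUserRequest:
    # Stand-in for a Thrift-generated struct; neither field is `required`.
    userId: Optional[int] = None    # the original i32 field
    userId64: Optional[int] = None  # hypothetical i64 field added later


def resolve_user_id(request: GetUserRequest) -> int:
    """Return the user ID, preferring the newer 64-bit field when it is set."""
    if request.userId64 is not None:
        return request.userId64
    if request.userId is not None:
        return request.userId
    # The abstract requirement ("identify a user") is enforced here,
    # not by the wire format.
    raise ValueError("GetUserRequest must carry userId or userId64")
```

Old clients keep sending `userId`, new clients send `userId64`, and once the old field is no longer in use it can simply stop being populated without breaking anyone.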
Example two. A search service takes a SearchRequest and returns a SearchResponse. Within the SearchRequest, there is a field called numResults, which is an i32. That seems like a good candidate for the `required` marking; after all, you need to tell the service how many items to return. Now let’s assume that a couple of months down the road, you decide to switch to a scheme where, instead of letting the client explicitly specify how many items to return, the server infers the number of search results from the query type. That seems a plausible evolution, except that once you have declared numResults to be `required`, there’s no way to move to another scheme for deciding how many results to return. Again, while the concept of a number of items to return stays constant, how that’s expressed may change over time.
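Example two can be sketched the same way. With numResults left non-required, the server can honor an explicit value from older clients and fall back to inferring one otherwise. Everything below is an assumption for illustration: the `query` and `queryType` fields, the default table, and the `effective_num_results` helper are not part of any real API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical server-side defaults, keyed by a made-up query type.
DEFAULT_RESULTS_BY_QUERY_TYPE = {"navigational": 1, "informational": 10}
FALLBACK_NUM_RESULTS = 10


@dataclass
class SearchRequest:
    # Stand-in for a Thrift-generated struct.
    query: str
    queryType: str = "informational"
    numResults: Optional[int] = None  # not `required`, so clients may omit it


def effective_num_results(request: SearchRequest) -> int:
    """Use the client's explicit count if present, else infer it server-side."""
    if request.numResults is not None:
        return request.numResults
    return DEFAULT_RESULTS_BY_QUERY_TYPE.get(request.queryType, FALLBACK_NUM_RESULTS)
```

The concept of “how many items to return” survives, but clients are free to stop sending the field whenever the server is ready to infer it.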
In a service-oriented environment, geo-distributed teams own different services. It’s very hard and inefficient to guarantee that services are deployed in a particular dependency order. Because of this, it’s almost always a bad idea to use the `required` keyword. In the end, don’t try to save a couple of lines; do the parameter verification yourself.
- Make most fields optional, when this makes sense. Check your input and throw on invalid entries.
- For things that really, really, really look like they are ready for eternity, still don’t use the `required` mechanism for request validation. Just leave the field unmarked, with neither `required` nor `optional`, and handle validation yourself (essentially a null check).
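The “validate it yourself” advice amounts to very little code. Below is a minimal Python sketch of such a null check; `InvalidRequestError`, `require_fields`, and the `GetUserRequest` stand-in are hypothetical names, and the sketch assumes that unset fields show up as `None` on the request object.

```python
from dataclasses import dataclass
from typing import Optional


class InvalidRequestError(Exception):
    """Raised when a request is missing a field the handler needs."""


def require_fields(struct, *field_names: str) -> None:
    """Raise if any of the named fields is unset (None) on the struct."""
    missing = [name for name in field_names if getattr(struct, name, None) is None]
    if missing:
        raise InvalidRequestError(
            f"{type(struct).__name__} is missing: {', '.join(missing)}"
        )


@dataclass
class GetUserRequest:
    # Stand-in for a Thrift-generated struct with an unmarked field.
    userId: Optional[int] = None


def handle_get_user(request: GetUserRequest) -> str:
    # Validation lives in the handler, so it can change with the service.
    require_fields(request, "userId")
    return f"user-{request.userId}"
```

Because the check lives in the handler rather than in the wire format, relaxing it later is a one-line server change instead of a coordinated redeploy of every client.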