Part 1: Avoiding Null-Pointer Exceptions in a Modern Java Application
Null do’s and don’ts
In the talk Null References: The Billion Dollar Mistake, Sir Tony Hoare describes his implementation of null references in the ALGOL programming language as, well, a billion-dollar mistake. Authoritative programming books like Clean Code: A Handbook of Agile Software Craftsmanship advise you to use null as little as possible, while the book Bug Patterns In Java dedicates three whole chapters to problems stemming from null-values. The StackOverflow topic “What is a null pointer exception and how do I fix it” has 3 million views. It is easy to get anxious working with null-values!
I’m not one to tell Turing-award winners like Hoare how to design programming languages, but I will say that I don’t think null is inherently bad. I will discuss why that is, when you should and shouldn’t use null, and how to elegantly handle null-values over the following two blog posts. In the current post, I will present my musings on using null values, and in the next blog post (going online Friday 21st August), we’re going to look at the practical application of some techniques for working with null, including recent features in Java 8.
While the posts are Java-centric, the underlying principles and discussion should extend to object-oriented programming languages in general. The blog posts are primarily intended for less experienced programmers and anybody else who might be intimidated by null, but old dogs might learn some new tricks too.
The dangers of null
Null is a special value as it is not associated with any type (feel free to verify with instanceof against every other class in the JRE) and will happily take the place of any other object in variable assignments and method calls. This is what gives null its two dangerous properties:
- Every complex-typed return value may be null
- Every complex-typed parameter value may be null
As a result, any return value or parameter object is a NullPointerException (NPE) waiting to happen if not handled carefully.
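The two properties are easy to see in a few lines. The following is only a minimal sketch; the class NullBasicsDemo and its helper methods are made up for illustration.

```java
import java.util.List;

class NullBasicsDemo {

    // Hypothetical helper: nothing in the signature says that null is unwelcome.
    static int nameLength(String name) {
        return name.length(); // throws a NullPointerException if name is null
    }

    // Reports whether calling nameLength with the given argument ends in an NPE.
    static boolean throwsNpe(String name) {
        try {
            nameLength(name);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // null can be assigned to a variable of any reference type...
        String text = null;
        List<Integer> numbers = null;

        // ...yet it is not an instance of any of them.
        System.out.println(text instanceof String);  // false
        System.out.println(numbers instanceof List); // false

        // The compiler accepts null as an argument; the failure only shows at runtime.
        System.out.println(throwsNpe(null));    // true
        System.out.println(throwsNpe("Hoare")); // false
    }
}
```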
Is the solution then to check every return value and parameter for null values?
It should be obvious that this is a bad idea. First, it clutters the code with null-checks. Second, it forces the developer to spend precious time considering the right way to handle null-values that never occur, and the superfluous null-checks will mislead other developers into believing that the values can be null.
Is the solution to never assign null to any values? Well, no. As highlighted by the fact that most programming languages have the concept of an empty value (nil, undefined, None, void, etc), having a general value for nothing is tremendously useful.
An error in a while-loop condition can create an infinite loop in any program, but it doesn’t make while-loops inherently bad. It would be similarly misguided to think that null is always bad just because misuse can lead to errors. Null is the most natural value to express certain things, but also a very poor value to express others. The competent developer must know which is which. In the following section, I will show you where null fits and where it doesn’t.
When to use null and when not to
In this part, I will examine the scenarios in which null is returned from and passed to methods, discuss some of the traditional alternatives (i.e. pre-Java 8) and argue why using a built-in empty type like null is the best solution in some cases.
One of the core philosophies of object-oriented programming is to model the concepts of the business domain in which our software operates. We do this by defining classes that correspond to these real-life concepts and their attributes. I think it’s meaningful to consider these types of classes separately from other types, when it comes to using null values.
In the project I work on, the Columna electronic patient record system, we have classes that represent concepts in a hospital workflow, like patients, medications, physicians, hospital units, admissions, etc. As a part of modelling any domain, there are cases where we need to allow something to be nothing. For instance, assume that we have a class to represent an admission with attributes that describe the admission, like the hospital unit where the patient is admitted, the reason for admission, the time of admission, etc. Similarly, we might have a class that represents a patient and holds attributes like name and social security number. At any given point, a patient may or may not be admitted. More formally, we have a has-a relation with a cardinality of 0..1.
Assume a method that retrieves admission information in a database for a given patient:
public Admission getAdmission(Patient patient);
What should the method return for the patient who is not admitted if not null? Is there a more precise value to express this? I’d argue that there isn’t.
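As a rough sketch of what this looks like from the caller’s side, consider the following. Apart from the getAdmission() signature, everything here is an assumption made for illustration: the stripped-down Patient and Admission classes, the AdmissionRegistry class, and the in-memory map standing in for the database.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, stripped-down domain classes.
class Patient {
    final String name;

    Patient(String name) {
        this.name = name;
    }
}

class Admission {
    final String hospitalUnit;

    Admission(String hospitalUnit) {
        this.hospitalUnit = hospitalUnit;
    }
}

class AdmissionRegistry {
    // Stand-in for the database: only admitted patients have an entry.
    private final Map<Patient, Admission> admissions = new HashMap<>();

    void admit(Patient patient, String hospitalUnit) {
        admissions.put(patient, new Admission(hospitalUnit));
    }

    // Returns null for a patient who is not admitted (the "0" in 0..1).
    public Admission getAdmission(Patient patient) {
        return admissions.get(patient);
    }

    // The caller knows the domain, so a null check here is expected, not surprising.
    String describe(Patient patient) {
        Admission admission = getAdmission(patient);
        if (admission == null) {
            return patient.name + " is not admitted";
        }
        return patient.name + " is admitted to " + admission.hospitalUnit;
    }
}
```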
There are many other scenarios where an object from the domain is directly associated with optional values as attributes. An object representing a medication holds values such as the medication name, the form of the medication, the strength, the active drug, etc. But medications are incredibly diverse, ranging from antibacterial creams to cannabis teas, and not all of the attributes apply to each. Again, returning null for a missing attribute seems like an obvious choice to communicate that the value is allowed to be missing. Is there a better alternative?
Some advocate using the so-called Null Object design pattern instead of null. The basic idea is to implement a hollow class with little or no functionality (the null object) that can be used in place of the class that holds the actual functionality. The pattern is best exemplified with a scenario where you must recursively traverse a binary tree data structure to find the sum of the values of all nodes. To do so, you essentially run a depth-first search and, in every node, recursively sum the value of the left subtree with that of the right subtree.
A search tree is typically implemented such that every node has a left and a right child, each of which is either a node itself or an empty leaf. If the tree representation uses null for leaf nodes, you will have to explicitly check for null children to stop recursing at a leaf and to avoid trying to get the value of a node that isn’t there. Instead, you should define a Node interface with a simple getValue() method and implement it in the class representing a node, which computes its value by adding its own value to the getValue() values of its children (see figure below). Implement the same interface in a class representing a leaf and let the leaf class return 0 when called. Now, we no longer have to distinguish code-wise between a leaf and a node; the need for an explicit check for null goes away and so does the risk of an NPE.
To apply the pattern to the admission example above, we would need to create an Admission interface. We would then define an AdmissionImpl class for when we can return actual admission data and an AdmissionNullObjectImpl for when we can’t. This would allow the getAdmission()-method to return either the real AdmissionImpl or an AdmissionNullObjectImpl. As the calling code uses the shared Admission type, we can treat both objects the same with no risk of exceptions and without cluttering the code with null checks.
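The tree example could look something like this. It is a minimal sketch: the Node interface and getValue() come from the description above, while the class names TreeNode and Leaf and the shared Leaf.INSTANCE are my own assumptions.

```java
// Shared type for real nodes and empty leaves.
interface Node {
    int getValue();
}

// A real node: its own value plus the values of both subtrees.
class TreeNode implements Node {
    private final int value;
    private final Node left;
    private final Node right;

    TreeNode(int value, Node left, Node right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }

    @Override
    public int getValue() {
        // No null checks needed: a child is always some kind of Node.
        return value + left.getValue() + right.getValue();
    }
}

// The null object: an empty leaf that contributes nothing to the sum.
class Leaf implements Node {
    static final Leaf INSTANCE = new Leaf();

    private Leaf() {
    }

    @Override
    public int getValue() {
        return 0;
    }
}
```

Summing a tree is then a single call on the root, for example new TreeNode(1, Leaf.INSTANCE, Leaf.INSTANCE).getValue().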
However, I personally find it very hard to find usages for the pattern in typical production codebases, where the logic is often far more complex than a simple accumulation of numbers. Patterns exist to simplify solutions, but add complexity with no benefits when used inappropriately.
What is the AdmissionNullObjectImpl class supposed to return when it doesn’t hold any real data? What should it return instead of a location object when getLocation() is called? What is the natural start date to return?
In many cases, you’d have to write code that at some point needs to check for something to handle fake values, so why not just have a null check in the first place and avoid defining extra complexity-increasing classes? There are cases where the pattern works great, but I feel they are rare in the real world, as it can only be used for objects whose methods return void or where you can return something that naturally fits into the flow of the surrounding code.
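To make the objection concrete, here is a sketch of what the null object for an admission might end up looking like. The names Admission, AdmissionImpl, AdmissionNullObjectImpl and getLocation() come from the discussion above; the Location class, getStartDate(), isReal() and AdmissionReport are hypothetical additions for the sake of the example.

```java
import java.time.LocalDate;

// Hypothetical location type.
class Location {
    final String unitName;

    Location(String unitName) {
        this.unitName = unitName;
    }
}

interface Admission {
    Location getLocation();

    LocalDate getStartDate();

    // Hypothetical escape hatch: callers need some way to spot the fake object.
    boolean isReal();
}

class AdmissionImpl implements Admission {
    private final Location location;
    private final LocalDate startDate;

    AdmissionImpl(Location location, LocalDate startDate) {
        this.location = location;
        this.startDate = startDate;
    }

    @Override
    public Location getLocation() {
        return location;
    }

    @Override
    public LocalDate getStartDate() {
        return startDate;
    }

    @Override
    public boolean isReal() {
        return true;
    }
}

class AdmissionNullObjectImpl implements Admission {
    @Override
    public Location getLocation() {
        return new Location(""); // a fake location; no value is natural here
    }

    @Override
    public LocalDate getStartDate() {
        return LocalDate.MIN; // an arbitrary date; equally unnatural
    }

    @Override
    public boolean isReal() {
        return false;
    }
}

class AdmissionReport {
    // The check has not gone away; it has only changed shape.
    static String summarize(Admission admission) {
        if (!admission.isReal()) {
            return "Not admitted";
        }
        return "Admitted to " + admission.getLocation().unitName
                + " on " + admission.getStartDate();
    }
}
```

The isReal() check in summarize() does the same job as a null check would have done, only with two extra types to maintain.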
Another alternative, which is sometimes used, is to use the empty string instead of null when an attribute is represented as a simple string. This removes the risk of NPEs, but you’ll likely need just as many checks to handle the empty value correctly as with null. Additionally, the semantics are different: An empty string represents a string with an empty value — null represents nothing. This becomes relevant if an application needs to distinguish between whether a user has not provided information vs. the user has entered some information in the form of the empty string value. You remove the risk of an NPE, but at the cost of using a somewhat misleading value.
Now let’s consider using null in code that does not model real-life concepts. Most classes in an object-oriented code base have no counterpart in real life and only exist as an abstraction for infrastructure, processing and transformation and to group related functionality. Whether instances of these should be allowed to be represented with a null-value is less clear-cut. If we call a getter for a value with a very abstract type like TwoFactorComplexStrategyHandlerDelegateBean, would we expect it to be null?
Probably not. With a class that represents a concept from the real world, we can make an educated guess. But these types of classes leave us no chance. We can only know from reading the code. For that reason, returning null in place of these types should be avoided as people rarely have a reason to expect a null value. If you don’t expect null, why would you put in the effort to ensure your code is null-safe?
In practice, objects that shouldn’t be represented by null often are anyway, not only in our own codebase but also in the JRE, libraries and frameworks. It’s problematic, but how problematic? It is important to note that even though we should try to minimize the usage of null, we shouldn’t trade having fewer null values for codebases full of complex workarounds to prevent values from becoming null. As I’ve discussed above, the alternatives are not always so great.
There are, however, some cases of returning null that are easily avoidable, yet frequently seen. A returned null value for these classes often means that something is fishy. For instance, null is sometimes returned from getter-methods where an object cannot be created due to some particular circumstance, like a server call in the method responding with an error. The method handles the error by logging it in a catch-statement and, instead of instantiating some object, returns null. Or a method has certain preconditions for its parameters that are not met by the arguments, so again it just returns null. There’s an easy fix for these: an exception should be used to indicate that something is wrong and force the calling code to deal with it. Returning null in these scenarios is misleading, does not ensure that the problem is handled and might forward the problem to some other part of the code, where it would have been better to fail fast.
Another wrong usage of null is representing a has-a relation with a cardinality of 0..* (earlier I talked about 0..1 relations). Going back to our patient object, a patient may have one or more relatives registered. Or none. We would usually represent this with a list. However, I often see that people return null when there is no data to populate a list or other Collection types. Null is similarly used as an argument to methods in place of missing collections.
Using null for this is bad for the following reasons. It is misleading, as the cardinality is perfectly representable with a collection. By assigning null to a Collection type, you only introduce unnecessary risk in your code. A for-loop over a collection does nothing if the collection is empty; otherwise, it iterates over each element and does something. If you let a collection be null, you express the same thing as an empty list (no data to process here), but you now need to ensure that there’s a null-check in every downstream method where you want to use the collection to avoid an NPE. Call methods with empty collections when there is no data to provide. It’s as easy as calling Collections.emptyList(), Collections.emptyMap(), Collections.emptySet(), etc. That is not a lot more work than declaring null, but a lot better.
It follows from the preceding section that it would be acceptable to pass null arguments when calling methods whose parameters are domain-modelling types with optional values, provided that the methods ensure that it is safe to use these. In practice, null parameters are used for much more than that. Whenever we provide a null parameter to a method, we must ensure that all of the subsequent processing of the parameter is null-safe, which can be hard. Even though the code appears null-safe, your program might be in a state that hides unsafe behavior, and the problem only becomes apparent under other conditions. Providing null parameters also adds a risk of introducing errors whenever we modify any code in the downstream flow from where it is provided.
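The difference between the two styles can be sketched as follows. All names here (PatientServer, PatientNameLookup and their methods) are hypothetical, and which exception type fits best will depend on your codebase; I use the standard IllegalArgumentException and UncheckedIOException purely as examples.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical remote service that may fail.
interface PatientServer {
    String fetchPatientName(String patientId) throws IOException;
}

class PatientNameLookup {
    private final PatientServer server;

    PatientNameLookup(PatientServer server) {
        this.server = server;
    }

    // The pattern to avoid: log the error and hide it behind a null.
    String getNameOrNull(String patientId) {
        try {
            return server.fetchPatientName(patientId);
        } catch (IOException e) {
            System.err.println("Lookup failed: " + e.getMessage());
            return null; // the caller cannot tell "failed" from "no value"
        }
    }

    // Fail fast: unmet preconditions and server errors both surface as exceptions.
    String getName(String patientId) {
        if (patientId == null || patientId.isEmpty()) {
            throw new IllegalArgumentException("patientId must be a non-empty string");
        }
        try {
            return server.fetchPatientName(patientId);
        } catch (IOException e) {
            throw new UncheckedIOException("Could not fetch patient " + patientId, e);
        }
    }
}
```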
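A small sketch of the empty-collection approach is shown below. The RelativeRegistry class and its methods are invented for illustration; Collections.emptyList() is the standard JDK method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RelativeRegistry {
    private final Map<String, List<String>> relativesByPatient = new HashMap<>();

    void register(String patientName, String relativeName) {
        relativesByPatient
                .computeIfAbsent(patientName, key -> new ArrayList<>())
                .add(relativeName);
    }

    // Never returns null: a patient without relatives gets an empty list.
    List<String> getRelatives(String patientName) {
        return relativesByPatient.getOrDefault(patientName, Collections.emptyList());
    }

    // Downstream code can loop without any null check;
    // for an empty list the loop body simply never runs.
    static String joinNames(List<String> relatives) {
        StringBuilder joined = new StringBuilder();
        for (String relative : relatives) {
            if (joined.length() > 0) {
                joined.append(", ");
            }
            joined.append(relative);
        }
        return joined.toString();
    }
}
```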
What can be done to avoid null parameters in general?
Often we need to use some functionality in an existing method, but the calling context that we’re in is slightly different, and we can’t provide all the values the method calls for, or we need to provide more information than the method was initially designed for. We obviously don’t want to reimplement a virtually identical method. Code reuse is one of the pillars of maintainable code; the same functionality should not be implemented in multiple places as it introduces extra work to keep the code in sync and risk of errors. So, we modify the existing code for our purposes and use null for parameters that aren’t always provided. Some methods can by design tolerate at least some null-parameters, while others can’t. However, it can be hard to figure out which parameters are allowed to be null, and if null is even the correct value for a missing value.
In a language like Python, method signatures may contain default values for parameters that are used if an argument value is left out of a method call. However, this is not possible in Java. The closest thing is to use method overloading, where the same method name is defined multiple times in a class with different parameter lists. One method will contain the full functionality and take the full set of parameters, while the others are just “decorators” for calling that method, each taking some subset of the parameters. The decorator methods define what values should be used in place of the missing parameters, so the caller won’t have to provide these. By hard-coding which values should be provided when the caller doesn’t have all the values, we reduce the risk of errors and make the acceptable parameter values explicit.
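As a sketch of the overloading approach, consider a hypothetical PrescriptionPrinter. The class, its parameters and the default values are all assumptions made up for this example.

```java
class PrescriptionPrinter {
    // Hypothetical defaults, defined in one place instead of at every call site.
    static final String DEFAULT_UNIT = "mg";
    static final int DEFAULT_DOSES_PER_DAY = 1;

    // The full method: contains the actual functionality.
    static String format(String medication, int strength, String unit, int dosesPerDay) {
        return medication + " " + strength + " " + unit + ", " + dosesPerDay + "x daily";
    }

    // Overload for callers that do not know the number of daily doses.
    static String format(String medication, int strength, String unit) {
        return format(medication, strength, unit, DEFAULT_DOSES_PER_DAY);
    }

    // Overload for callers that only know the medication and its strength.
    static String format(String medication, int strength) {
        return format(medication, strength, DEFAULT_UNIT);
    }
}
```

A caller that only knows the name and strength writes PrescriptionPrinter.format("Ibuprofen", 400) and never has to decide what to pass for the rest.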
You can similarly decompose constructors, but you could also use the Builder design pattern. The Builder design pattern helps to minimize constructor parameters and removes the need for passing null-values to the constructor by using a Builder object for creating your class. The gist of the pattern is that you instantiate an object indirectly through an intermediary builder class. To provide the arguments that you would have passed to the constructor directly, you call a setter corresponding to each. If a value has not been set by you, the builder will provide it instead. You then call Create() on the builder, and it instantiates the object for you. It’s a bit too complex to go into detail with here, but Refactoring.guru has a very nice article on the pattern and the trade-offs of using it. As with most patterns, it introduces extra classes and complexity, so before using it, make sure that it makes sense; using it just to avoid calling a constructor with a few null values is probably a bit overkill.
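For a rough idea of the shape, here is a minimal builder sketch for a medication. The Medication class, its fields and the fallback values are hypothetical, and I have named the final method create() to match the description above, although build() is probably the more common name in Java code.

```java
class Medication {
    private final String name;
    private final String form;
    private final String strength;

    // Only the builder can instantiate a Medication.
    private Medication(Builder builder) {
        this.name = builder.name;
        this.form = builder.form;
        this.strength = builder.strength;
    }

    String describe() {
        return name + " (" + form + ", " + strength + ")";
    }

    static class Builder {
        private final String name; // required, so it goes in the constructor
        // Optional values start out with explicit fallbacks instead of null.
        private String form = "unspecified form";
        private String strength = "unspecified strength";

        Builder(String name) {
            this.name = name;
        }

        Builder setForm(String form) {
            this.form = form;
            return this; // returning the builder allows chained calls
        }

        Builder setStrength(String strength) {
            this.strength = strength;
            return this;
        }

        Medication create() {
            return new Medication(this);
        }
    }
}
```

Usage could then look like new Medication.Builder("Cannabis tea").setForm("tea").create(), with no null passed anywhere.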
In the solutions above, null-values are still being passed to methods internally in the object, but both the calling and the called methods are designed for it. Any method called with one of these values is expected to have proper null-handling in place. Outside callers should be prevented from providing null-values for parameters that were not considered in the design, as null-values might not be supported and proper handling is not guaranteed.
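One way to draw that line is sketched below, under the assumption that a hypothetical AdmissionLabel class exposes overloaded public methods on top of a private one. Objects.requireNonNull is a standard JDK method; using it at the public boundary is just one possible way of keeping outside callers from passing null.

```java
import java.util.Objects;

class AdmissionLabel {

    // Public entry point for patients without an admission.
    public static String label(String patientName) {
        Objects.requireNonNull(patientName, "patientName must not be null");
        // Passing null internally is fine: buildLabel is designed to handle it.
        return buildLabel(patientName, null);
    }

    // Public entry point for admitted patients: null is rejected up front.
    public static String label(String patientName, String hospitalUnit) {
        Objects.requireNonNull(patientName, "patientName must not be null");
        Objects.requireNonNull(hospitalUnit, "hospitalUnit must not be null");
        return buildLabel(patientName, hospitalUnit);
    }

    // Private worker: a null hospitalUnit is allowed here by design.
    private static String buildLabel(String patientName, String hospitalUnit) {
        if (hospitalUnit == null) {
            return patientName + " (not admitted)";
        }
        return patientName + " (" + hospitalUnit + ")";
    }
}
```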
How to think about null
An increasing number of programming languages are starting to use a “safety on” approach to certain features. In languages like Clojure, F# and Rust, variables are immutable by default; the compiler will only allow variables that are declared with a special modifier to change value. This approach to dangerous features forces programmers to override the default behavior, thereby indicating “I understand this is dangerous, I know what I’m doing and have a good reason for doing so”; doing so should not be the norm. We should think about null in the same way. We need to reserve the value for a few special cases where it is the right thing to use, but generally, we should refrain from using it, though not at the cost of introducing complexity with creative workarounds. Whenever considering using null in place of a value that crosses the boundary between methods, you should take a moment to consider if you have a good reason for doing so, if you can guarantee it doesn’t end up somewhere it can cause trouble and if other developers would expect the value to be null. If not, you should reconsider the alternatives.
Curious to know more?… “Part 2: There’s got to be a better way — modern null handling” is going live Friday 21st August 2020 on the Destination AARhus TechBlog.
By Jens Christian B. Madsen
Systems Developer, Systematic
Jens Christian B. Madsen is a developer on the Columna clinical information system, holds master’s degrees from Aarhus University in Molecular biology and Computer Science and is a certified ethical hacker. He has contributed to books on such topics as Python, computer science and Linux. He enjoys heavy books, heavy metal and heavy weights.