Invariants in Code Design

Published in code-design, Oct 24, 2016 (10 min read)

Invariant, quite literally, means something that does not change or vary.

In the context of computer programming, an invariant can be seen as a set of assumptions that a piece of code relies on before it can perform any meaningful computation. If those assumptions are not actually true, the result of the computation is meaningless, or more precisely, its correctness cannot be guaranteed.

Consider the following examples, which you have probably come across before.

Bubble Sort

The bubble sort algorithm maintains the invariant that after i passes over the array, the last i elements are in sorted order (in fact, they are the i largest elements, already in their final positions). Once n passes are complete, the array is guaranteed to be sorted, because the last n elements (that is, all the elements of the array) are sorted.

If the invariant were to fail at any intermediate step, the algorithm would fail to sort the array. Invariants hence relate directly to the correctness of a program.
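To make the invariant concrete, here is a minimal sketch of bubble sort that asserts it after every pass. The helper name tailIsFinal is made up for this illustration, and it checks the slightly stronger form of the invariant: the last i elements are sorted and nothing before them is larger.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: checks the invariant after `passes` passes.
// The last `passes` elements must be sorted, and no earlier element
// may be larger than the first of them.
bool tailIsFinal(const std::vector<int>& a, std::size_t passes) {
    const std::size_t n = a.size();
    if (passes == 0 || n == 0) return true;
    const std::size_t start = n - std::min(passes, n);
    if (!std::is_sorted(a.begin() + start, a.end())) return false;
    return std::all_of(a.begin(), a.begin() + start,
                       [&](int x) { return x <= a[start]; });
}

void bubbleSort(std::vector<int>& a) {
    const std::size_t n = a.size();
    for (std::size_t pass = 1; pass <= n; ++pass) {
        // Bubble the largest element of the unsorted prefix to its end.
        for (std::size_t j = 0; j + pass < n; ++j) {
            if (a[j] > a[j + 1]) std::swap(a[j], a[j + 1]);
        }
        // The invariant must hold after every pass, not just the last one.
        assert(tailIsFinal(a, pass));
    }
}
```

If a bug in the inner loop ever broke the invariant, the assertion would fail at the exact pass where things went wrong, rather than at some later point where the array merely looks unsorted.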

The Shared Pointer Class in C++

In C++, the std::shared_ptr class maintains the invariant that once all copies of the std::shared_ptr go out of scope, the underlying object is deleted. To do this, it keeps a reference count of the number of std::shared_ptr instances that share ownership. The invariant it holds is that the underlying object exists only as long as the reference count is greater than zero.

A corollary of the invariant stated above is that if a std::shared_ptr exists, it either points to nothing (holds a null pointer) or references a valid object (holds a dereferenceable pointer).

In the scenario where the underlying object is deleted from outside the std::shared_ptr due to ownership mismanagement, no guarantees can be placed on the std::shared_ptr object and on any systems relying on it.
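The reference-count invariant can be observed directly. The sketch below uses a made-up Tracked type that flips a flag in its destructor, and relies on use_count(), which is exact here only because the code is single-threaded.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper type: flips a caller-owned flag when it is destroyed.
struct Tracked {
    explicit Tracked(bool& destroyedFlag) : destroyed(destroyedFlag) {}
    ~Tracked() { destroyed = true; }
    bool& destroyed;
};

// Returns true if the object stayed alive while any owner existed
// and was deleted once the last owner went out of scope.
bool objectLivesUntilLastOwnerIsGone() {
    bool destroyed = false;
    bool aliveWhileShared = false;
    {
        std::shared_ptr<Tracked> first = std::make_shared<Tracked>(destroyed);
        {
            std::shared_ptr<Tracked> second = first;  // reference count is now 2
            assert(first.use_count() == 2);
        }
        // `second` is gone, but `first` still owns the object.
        aliveWhileShared = !destroyed && first.use_count() == 1;
    }
    // The last owner is gone, so the object has been deleted.
    return aliveWhileShared && destroyed;
}
```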

Invariants in Object Oriented Design

Why Invariants?

Once introduced, invariants make it easy, sometimes even trivial, to work out where exactly a bug lies or which component is at fault. Well-established invariants greatly help programmers reason about their code. Hence, probably the best feature of object-oriented design, and also the least mentioned one, is the ability to introduce invariants in code.

How to introduce invariants?

In languages such as C++, a constructor-destructor pair is the best tool for introducing invariants. The life cycle of an object begins when the class’s constructor returns and ends when the destructor is called. At every point in time between these two events, the object is said to exist.

void doSomething() {}

struct Socket {
    Socket() {}
    ~Socket() {}
};

int main()
{
    /* socket object does not exist */
    {
        /* socket object does not exist */
        Socket s;
        /* socket object exists */
        doSomething();
        /* socket object exists here as well */
    }
    /* socket object no longer exists */
}

It is common practice (and an excellent one at that) to couple an object’s invariants with its life cycle. In other words, with the help of a constructor, a programmer can ensure that an object is created only when its invariants hold. Furthermore, if the programmer is careful not to let any operation break those invariants, they can be assured that the invariants hold for as long as the object exists, and can only falter after the object is destroyed.

#include <stdexcept>

void doSomething() {}

/*
** Invariant: If a socket object exists it is open, i.e.,
** a socket object represents an open socket.
*/
struct Socket {
    Socket() {
        if (!open()) {
            throw std::runtime_error("Failed to open");
        }
    }
    ~Socket() {}
private:
    bool open() { // returns false if it fails to open
        // implementation elided; placeholder so the example compiles
        return true;
    }
};

int main() {
    {
        /* socket object does not exist */
        Socket s;
        /* socket object now exists and the socket is surely open as well:
        ** if the socket had failed to open, the constructor would have
        ** thrown an exception and the object would not have been constructed */
        doSomething();
        /* socket object still exists and is still open */
    }
    /* socket object no longer exists */
}

An object that does nothing other than get created and destroyed is not of much use. It will most likely need methods that help the client achieve what they need. For the writer of the class, the only rule these methods must follow is to not destroy the object’s invariants.

This is where encapsulation comes in handy. Once the programmer is sure that no method accessible from outside the class destroys the object’s invariants, they can be assured that the invariants will hold for as long as the object exists.

Moreover, the invariants do not have to hold at every single instant. It is okay for a method to break them while it is executing, as long as it restores them before it returns. This restriction also does not apply to private methods, which are inaccessible from outside the class. Private methods are free to break the object’s invariants, provided that the public method that called them restores the invariants before returning.

#include <stdexcept>

void doSomething() {}

/*
** Invariant: If a socket object exists it is open, i.e.,
** a socket object represents an open socket.
*/
struct Socket {
    Socket() {
        if (!open()) {
            throw std::runtime_error("Failed to open");
        }
    }
    ~Socket() {}

    void reconnect()
    {
        close();
        /* The invariant of the object has been destroyed as the object
        ** exists but it no longer holds an open socket */
        if (!open()) {
            /* Do not return normally with the invariant still broken */
            throw std::runtime_error("Failed to reopen");
        }
        /* The invariant of the object has been restored before the function returned */
    }
private:
    bool open() { // returns false if it fails to open
        // implementation elided; placeholder so the example compiles
        return true;
    }
    void close() {
        // implementation elided
    }
};

int main()
{
    {
        // At no point in this block could the client see the invariants destroyed
        Socket s;
        doSomething();
        s.reconnect();
        doSomething();
    }
}

Lastly, externally accessible object fields that take part in the object’s invariants are an absolute no-no. There should not be any second thought about this one.
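As an illustration of why such fields must stay private, here is a small sketch built around a made-up Range class whose invariant is lo <= hi. The class and its method names are assumptions for this example only.

```cpp
#include <cassert>
#include <stdexcept>

// Invariant: lo() <= hi() for every Range that exists.
class Range {
public:
    Range(int lo, int hi) : lo_(lo), hi_(hi) {
        if (lo > hi) throw std::invalid_argument("lo must not exceed hi");
    }

    int lo() const { return lo_; }
    int hi() const { return hi_; }

    // Relies on the invariant: the width can never be negative.
    int width() const {
        assert(lo_ <= hi_);
        return hi_ - lo_;
    }

    // The only way to change the upper bound; it rejects any update
    // that would break the invariant and leaves the object unchanged.
    void setHi(int hi) {
        if (hi < lo_) throw std::invalid_argument("hi must not be below lo");
        hi_ = hi;
    }

private:
    int lo_;  // private: no outside code can assign these directly
    int hi_;
};
```

Had lo_ and hi_ been public, any caller could write r.hi_ = r.lo_ - 1 and width() would silently start returning negative values, with nothing in the class able to prevent or detect it at the point of the mistake.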

Life Cycle of an Object

[Figure: Invariants coupled with an object’s life cycle]

The diagram summarizes how the invariants associated with an object are coupled with its life cycle.

Once the constructor returns, the invariants are established, and they do not falter no matter which method is called. The only time the invariants stop holding is when the object ceases to exist.

Invariants might be temporarily violated inside one of the object’s methods, but all is well as long as they are restored before the method returns. The client that uses the object will never know whether the invariants were ever broken.

Side Note

Object-oriented languages lend themselves well to this technique. However, the details differ from language to language.

In C++, the destruction of an object with automatic storage is eager, i.e. the object’s destructor is called and its memory is reclaimed as soon as the object goes out of scope.

Java, however, does not have an equivalent of C++ destructors. The finalize method comes close but is quite inadequate. First, it is not eager, and second, the language specification provides few guarantees as to when it will be called (read Effective Java by Joshua Bloch to know more).

The only problem with non-eager destruction is that the object’s life cycle cannot be used for RAII (Resource Acquisition is Initialization), which again is a fantastic technique I might cover later on in the series.

Common Pitfalls and FAQ

Whose responsibility is it to maintain the object’s invariants?

The onus of maintaining an object’s invariant lies with the writer of that class. At no point should one expect the client of the object to use the object in the right manner, or to not destroy its invariants.

It does not take an attacker to break the security of your object; any careless programmer will do. It is always better to assume that the user of your object will not maintain its invariants. Thus, it is your job to make your class airtight, which ultimately leads to more robust code.

Defensive Copying

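Consider the following piece of Java code:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrimeFactors {
    private List<Integer> primes;

    public PrimeFactors(List<Integer> primes) {
        this.primes = primes;
    }

    public List<Integer> get(Integer i) {
        // returns list of prime factors for the integer i
        // (assumed implementation: the stored primes that divide i)
        List<Integer> factors = new ArrayList<>();
        for (int p : primes) {
            if (i % p == 0) factors.add(p);
        }
        return factors;
    }

    public static void main(String[] args) {
        List<Integer> primes = new ArrayList<>(Arrays.asList(2, 3, 5, 7));
        PrimeFactors factors = new PrimeFactors(primes);
        factors.get(10); // returns 2 and 5
        primes.clear();
        factors.get(10); // result is no longer meaningful
    }
}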

In the above example, the correct functioning of the class depends on the list of primes it receives as an argument. It is therefore an invariant of the PrimeFactors class that it contains a list of primes.

However, the programmer has not done enough to maintain the object’s invariants. The list of primes is mutable and is shared with the caller, so as soon as someone updates the list’s contents, the invariants of the factors object no longer hold.

Here the programmer should have made a defensive copy of the list in the constructor as shown below.

public PrimeFactors(List<Integer> primes) {
    this.primes = new ArrayList<>(primes); // defensive copying
}

Argument Checking

Consider the case when the user of the PrimeFactors class creates an object by passing in a list of numbers that aren’t prime.

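import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrimeFactors {
    private List<Integer> primes;

    public PrimeFactors(List<Integer> primes) {
        this.primes = new ArrayList<>(primes);
    }

    public List<Integer> get(Integer i) {
        // returns list of prime factors for the integer i
        // (assumed implementation: the stored primes that divide i)
        List<Integer> factors = new ArrayList<>();
        for (int p : primes) {
            if (i % p == 0) factors.add(p);
        }
        return factors;
    }

    public static void main(String[] args) {
        List<Integer> primes = Arrays.asList(1, 2, 4);
        PrimeFactors factors = new PrimeFactors(primes);
        factors.get(10); // result is not meaningful
    }
}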

The object’s correct functioning depends on the list of primes, but because no argument checking was done, the constructor returns without an error even though the invariants do not hold. Here again the programmer has not done their due diligence and has left the class open to misuse.

Thus, if your object’s correct functioning depends on an object supplied by the user, always check that it satisfies your invariants. See the correction below.

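import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrimeFactors {
    private List<Integer> primes;

    private void checkPrimes() {
        // throws an exception if the list has a non-prime number
        for (int p : primes) {
            boolean prime = p >= 2;
            for (int d = 2; prime && d <= p / d; d++) {
                prime = p % d != 0;
            }
            if (!prime) throw new IllegalArgumentException(p + " is not prime");
        }
    }

    public PrimeFactors(List<Integer> primes) {
        this.primes = new ArrayList<>(primes);
        checkPrimes();
    }

    public List<Integer> get(Integer i) {
        // returns list of prime factors for the integer i
        // (assumed implementation: the stored primes that divide i)
        List<Integer> factors = new ArrayList<>();
        for (int p : primes) {
            if (i % p == 0) factors.add(p);
        }
        return factors;
    }

    public static void main(String[] args) {
        List<Integer> primes = Arrays.asList(1, 2, 4);
        /* following statement throws an exception */
        PrimeFactors factors = new PrimeFactors(primes);
        factors.get(10); // control never reaches here
    }
}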

Argument Checking in Multi Threaded Environment

If you looked closely, you may have noticed that the checkPrimes method in the previous code snippet is an instance method that does not take any arguments. So the check runs on the object’s fields rather than on the arguments that were passed in. This is not by accident.

In a multi-threaded environment, checking the arguments and only then assigning them to object fields opens the class up to a check-then-act exploit: another thread can modify the caller’s list after the check has passed but before the copy is made. We will discuss this more when we come around to thread safety. Until then, just remember to assign first and then check the fields instead of the arguments.
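For readers following along in C++, here is a rough equivalent of the copy-first, check-second ordering. It is a sketch only: the isPrime helper and the body of get are assumptions for illustration, and taking the vector by value is what plays the role of the defensive copy.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper: trial division, enough for small inputs.
bool isPrime(int n) {
    if (n < 2) return false;
    for (int d = 2; d <= n / d; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// Invariant: primes_ contains only prime numbers.
class PrimeFactors {
public:
    // Step 1: take the argument by value, so the object owns a private copy.
    explicit PrimeFactors(std::vector<int> primes) : primes_(std::move(primes)) {
        // Step 2: validate the private copy, not the caller's list.
        checkPrimes();
    }

    // Returns the stored primes that divide i (assumed behaviour).
    std::vector<int> get(int i) const {
        std::vector<int> factors;
        for (int p : primes_) {
            assert(p >= 2);  // guaranteed by the constructor check
            if (i % p == 0) factors.push_back(p);
        }
        return factors;
    }

private:
    void checkPrimes() const {
        for (int p : primes_) {
            if (!isPrime(p)) throw std::invalid_argument("list contains a non-prime");
        }
    }

    std::vector<int> primes_;
};
```

Because the check runs on primes_, nothing the caller does to their own vector afterwards, from this thread or any other, can affect what was validated.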

Here’s a cool-ass definition for thread safety:

Thread-Safety is just the ability to maintain an object’s invariants in a multi-threaded environment.

Argument Checking in Release Build

A lot of people have asked me this question: “Should argument checking and assertions go into the release build as well?” The answer is: it depends.

My personal opinion is to keep them in unless you know you have to get rid of them. Here is a line of questioning you can use to decide whether or not to disable them in release builds.

Is the argument checking actually input validation from the end user? If yes, definitely keep it in.
Is performance critical to your system, have you run profilers on your code, and have those profilers identified argument checking as the bottleneck? If no, keep it in.
Is your automated test suite actually good? Code coverage is a factor but not the only one. If no, keep it in.
Do you have balls of steel and weekends to spare for prod issues? If yes, do what you gotta do.

Inheritance

Another factor to consider while closing the holes in your class is inheritance. If you have protected members (fields and methods), and your class is not final (it can be sub-classed):

Protected fields that take part in the object’s invariants are again an absolute no-no.
Protected methods should not destroy the invariants of the object; if they do, the invariants must be restored before the method returns.

Clients that inherit your class can be no more trusted than the user of the class, hence, the protected access members should also count as externally accessible members.
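A minimal sketch of these two rules, using a made-up Account class whose invariant is a non-negative balance. The class names and the fee scenario are assumptions chosen purely for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Invariant: balance() >= 0 for every Account that exists.
class Account {
public:
    explicit Account(long initial) : balance_(initial) {
        if (initial < 0) throw std::invalid_argument("negative opening balance");
    }
    virtual ~Account() = default;

    long balance() const { return balance_; }

protected:
    // Subclasses can call this, so it guards the invariant
    // exactly as a public method would.
    void withdraw(long amount) {
        if (amount < 0 || amount > balance_) {
            throw std::invalid_argument("invalid withdrawal");
        }
        balance_ -= amount;
        assert(balance_ >= 0);
    }

private:
    long balance_;  // private, not protected: subclasses cannot assign it
};

// A subclass written by someone else; it can only go through withdraw().
class FeeAccount : public Account {
public:
    using Account::Account;
    void chargeFee(long fee) { withdraw(fee); }
};
```

Even a careless chargeFee cannot push the balance below zero, because the field that carries the invariant is out of the subclass’s reach and the protected method refuses to break it.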

Conclusion

Invariants are a great tool and make the life of a programmer easier, when done right. Furthermore, object oriented design lends itself well to invariants.

If your takeaway from the post is — “I get it, invariants are important.” or “I can’t wait to use them in my code.”, then my job here is done.

Leave me a comment if you found the post interesting or insightful, or if you have a different perspective than mine.

-Manik Jindal