Benefits of writing stateless code with 7 essential tips on how to write stateless C#
What is stateless code?
Almost every system has state. It’s a question of where the state is held: is it on the client? (e.g. RESTful scenarios) Is it in the service’s memory? Is it in external storages (ephemeral or durable)? Or maybe a combination of all of these…
It is hard to imagine creating a system that doesn’t have any state in any form whatsoever. “Stateless” code or services usually means that the services/middleware do not store any state in local memory: when an interaction with the program (e.g. a REST API call) finishes executing, the state of the program (its memory) has not changed. (The resource that the REST call acted on may well have changed, but that state is managed externally, in external storage.)
Let’s see a simple example of server side program state:
Simplified Example
This is a simple stateful program:
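A minimal sketch of what such a stateful program could look like. The `Counter` class and its field are hypothetical names chosen for illustration; only the `Increment()` method name comes from the discussion in this section.

```csharp
// Hypothetical example: class and field names are illustrative only.
public class Counter
{
    // Mutable state that lives in the program's memory between calls.
    private int _count;

    // Each call changes _count, so each call returns a different value.
    public int Increment()
    {
        _count++;
        return _count;
    }
}
```

Calling `Increment()` twice on the same instance returns 1 and then 2: the answer depends on how many calls came before.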
This program can be converted into a stateless program as follows:
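One way this could look, as a sketch: the caller owns the current value and passes it in. `StatelessCounter` is a hypothetical name used for illustration.

```csharp
// Hypothetical example: the type name is illustrative only.
public static class StatelessCounter
{
    // No fields: the result depends only on the argument.
    public static int Increment(int count) => count + 1;
}
```

`StatelessCounter.Increment(1)` returns 2 however many times it is called; nothing in the program's memory changes between calls.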
In the first example (stateful code), the program mutates its memory, so each subsequent time Increment() is called, the behavior and/or result will be different. In contrast, a call to the stateless program never mutates the state of the program, so the same input always gives you back the same answer, no matter how many times you call it.
It seems obvious. Granted, this example is quite contrived. So let’s look at something a little more complicated, and perhaps a bit more realistic:
A More Realistic Example
The following is an example of stateful code:
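A sketch of how a stateful `AccountProvisioner` might be written. Apart from the names `AccountProvisioner` and `CreateAccount()`, everything here (the `Account` shape, `IAccountRepository`, `SetDetails`, `Validate`, the in-memory repository) is an assumption made for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical types for illustration; none of these come from a real library.
public class Account
{
    public string Email { get; set; } = "";
    public bool IsActive { get; set; }
}

public interface IAccountRepository
{
    void Save(Account account);
}

// Simple in-memory stand-in for external storage.
public class InMemoryAccountRepository : IAccountRepository
{
    public List<Account> Saved { get; } = new List<Account>();
    public void Save(Account account) => Saved.Add(account);
}

public class AccountProvisioner
{
    private readonly IAccountRepository _repository;

    // State that survives between method calls on the same instance.
    private Account _pendingAccount = new Account();
    private readonly List<string> _errors = new List<string>();

    public AccountProvisioner(IAccountRepository repository)
    {
        _repository = repository;
    }

    public void SetDetails(string email)
    {
        _pendingAccount = new Account { Email = email };
    }

    public void Validate()
    {
        // Errors accumulate and are never cleared.
        if (string.IsNullOrWhiteSpace(_pendingAccount.Email))
            _errors.Add("Email is required.");
    }

    // The outcome depends on which methods were called earlier, and in what order.
    public bool CreateAccount()
    {
        if (_errors.Count > 0) return false;
        _pendingAccount.IsActive = true;
        _repository.Save(_pendingAccount);
        return true;
    }
}
```

In this sketch, once one validation has failed on an instance, every later `CreateAccount()` on that same instance fails too, even for valid details, because the old errors are still held in memory.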
A clean coder’s eye will see that this example is a bit horrid in general, since it has many code smells. This was partly done to put a spotlight on the stateful aspect of the code. E.g. why would AccountProvisioner take on so many responsibilities in the first place?! Anyway, back to the point: let’s see how this can be refactored to become stateless instead:
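A sketch of a stateless take on `AccountProvisioner`, under the assumption that everything the operation needs can be passed in as arguments. `AccountRequest`, `IAccountRepository` and the in-memory repository are hypothetical names introduced for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical immutable input and output types.
public sealed class AccountRequest
{
    public AccountRequest(string email) { Email = email; }
    public string Email { get; }
}

public sealed class Account
{
    public Account(string email, bool isActive) { Email = email; IsActive = isActive; }
    public string Email { get; }
    public bool IsActive { get; }
}

public interface IAccountRepository
{
    void Save(Account account);
}

// Simple in-memory stand-in for external storage.
public class InMemoryAccountRepository : IAccountRepository
{
    public List<Account> Saved { get; } = new List<Account>();
    public void Save(Account account) => Saved.Add(account);
}

public class AccountProvisioner
{
    // The only field is an injected dependency that is never reassigned.
    private readonly IAccountRepository _repository;

    public AccountProvisioner(IAccountRepository repository)
    {
        _repository = repository;
    }

    // Inputs arrive as arguments and results are returned; nothing is kept on the instance.
    // An empty list of errors means the account was created.
    public IReadOnlyList<string> CreateAccount(AccountRequest request)
    {
        var errors = Validate(request);
        if (errors.Count > 0) return errors;

        _repository.Save(new Account(request.Email, isActive: true));
        return errors;
    }

    private static IReadOnlyList<string> Validate(AccountRequest request)
    {
        var errors = new List<string>();
        if (string.IsNullOrWhiteSpace(request.Email))
            errors.Add("Email is required.");
        return errors;
    }
}
```

A failed request leaves nothing behind on the provisioner, so the next request is handled exactly as if it were the first.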
Notice that collaborators would interact with the two classes very differently: the public method signatures are very different, and so the contract of the method they have in common (CreateAccount()) differs too. Converting stateful code to stateless can sometimes be a viral exercise, where you need to ensure all upstream code is also refactored to make its calls in a stateless way.
So what approach is better? Stateless or stateful programs? …
Why you should aim to write stateless code
Here are some key benefits of writing stateless code:
- Adds a high degree of certainty.
- — No one likes non-deterministic code or bugs 🐛 Stateful code is just an incubator for bugs
- — Especially for concurrent systems
- — It’s easier to investigate and debug issues, since stateless code takes questions about program and component states and lifecycles out of the equation, keeping your sanity intact 😵
- Makes it easier to understand how complicated flows are handled in the codebase
- Enables services to scale out in order to maximize performance and handle load
- — Enables you to load balance many instances, since no program holds exclusive state in order to work correctly
- — Reduces the memory footprint — especially in large scale scenarios
Just because you built a Web API doesn’t mean it’s auto-magically stateless (aka RESTful)
People talk about RESTful APIs (REpresentational State Transfer) as being awesome for many reasons, including their approach to state management, where state is managed on the client side. A typical client navigates through a resource representation of a system, keeping track of its own state (HATEOAS). This avoids servers holding session state, so the server-side part of the architecture can easily scale without incurring high costs.
But you don’t magically inherit a stateless service by hosting an API; you have to intentionally design it to be RESTful, which you could do using guidelines such as the Richardson Maturity Model. In addition to that, it takes some awareness on the developer’s part to avoid the common pitfalls that lead to writing stateful code.
This is where functional languages shine
Why are functional languages (e.g. Haskell) so desirable? A key point of difference between imperative and functional programming is that functional code inherently avoids mutating state.
“Pure functional programming” is a subset of functional programming where it is guaranteed to be 100% stateless — where you cannot apply any tricks to manipulate state.
But functional languages are not among the most popular choices, and it would be hard to come by a large-scale enterprise project written in this style. However, modern imperative languages are incorporating a lot of functional features. Likewise, there are lessons we can take from the functional coding style in general, regardless of what language we use.
C# is becoming more functional by design (for example: LINQ, immutable types, monads etc…).
Leveraging the functional programming paradigm in your OO code to achieve statelessness
Why wouldn’t we want to use all the powerful stateful features of an OO language all the time (such as state encapsulation with heavy use of inheritance and polymorphism)? After all, many of us learned about these awesome, powerful features in the early days of learning OO languages, so shouldn’t we just use them whenever we see fit?
In my personal experience, it has never really been the case that I leverage everything OO has to offer, at least not often. And many, many others start to see this pattern emerge too over their years of experience as software developers:
David Farley explains this whole topic really well in his video on OO vs Functional programming:
Adopt programming habits that constrain you to help you limit your mistakes … any programming paradigm assist with that!
Personally, I see the 80–20 rule apply to how a programming language’s offerings are used in practice: we find ourselves using 20% of a language’s feature set 80% of the time.
One more example to prove the point: stop for a moment and consider the commonly followed mantra on “preferring composition over inheritance”, a self imposed limitation (in this case, more of a guideline) that a programmer takes upon themselves to avoid writing a messy solution.
Stateful aspects in a typical program that are fine to keep
There are many cases where you need to track state in a program! As long as you intentionally limit the amount of state complexity, you are on the right path. Common examples where you may want to track state:
- Local (in-process) memory caches (though these can be moved externally from the running process on the same host)
- Circuit breaker states (though these can be stored externally to the running process, which may be more desirable so that they are shared among scaled-out instances)
- Background processing for things like long running jobs (though maintaining this type of program state can be avoided by instead using external queues)
Tips to write stateless object-oriented code (some C# specific)
Tip 1: keep DI container state simple, and consider favoring transient lifetimes for (almost) everything
- Try to aim for all component registration lifetimes to be transient for dependency injection
- — If you are worried about performance, don’t be (unless you are writing serious ultra-high-performance CPU-bound code, which is very, very rare). Object creation and cleanup, and DI frameworks, are generally really, really fast in all modern OO languages. Favor simplicity over optimization!
- — Pro tip: you can make Autofac do this automatically for you, so anything “special” will have to be registered explicitly, which helps highlight which components are actually stateful.
- Don’t ever allow a component to be owned by another component with a longer lifetime in the dependency graph
- — Pro tip: some container frameworks (like the service collection from Microsoft) can automatically validate that your registrations don’t have some of these lifetime issues, for example a singleton that depends on a scoped service. If your container framework doesn’t support this, you can write a unit test that reads the registrations and fails if it finds ancestors with longer lifetimes
☝️ In a nutshell: keep your DI container state super duper simple. Why? In some cases you may need to write stateful code. Maybe there is some “temporary” statefulness in your system that is only meant to be held for a limited, controlled time, like a single transaction or web request. However, in doing so, you open yourself up to the risk of the state being unintentionally kept longer than you think. Maybe not now, but maybe later, when someone else refactors something without knowing.
Here is an example to illustrate the issue…
The AccountProvisioner class in the “realistic” example above may work fine; maybe all the components in the system were registered as “transient”. But then later down the line, a developer thought: why should AccountManager (an owner of AccountProvisioner) be transient? It holds no state, so let’s make it a singleton to keep IoC fast. This will lead to a state bug:
In the first request (A) to create an account, the AccountProvisioner, which is intended to be transient only, is constructed. All is well.
But then, in the second request (B), the AccountProvisioner from the first request gets re-used, because its parent (AccountManager) is a singleton and is still holding onto the old AccountProvisioner instance! So all sorts of bugs can start to happen. Even if they don’t right away, as the system evolves this erroneous state/lifecycle configuration may go unnoticed and bury the bug deeper until it strikes.
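The scenario can be simulated by hand, without any real container. This is only a sketch: the static field stands in for a singleton registration, and the members of `AccountProvisioner` and `AccountManager` shown here are assumptions made for illustration.

```csharp
using System.Collections.Generic;

// Intended to be transient: it accumulates per-request state.
public class AccountProvisioner
{
    private readonly List<string> _errors = new List<string>();

    public bool CreateAccount(string email)
    {
        if (string.IsNullOrWhiteSpace(email))
            _errors.Add("Email is required.");
        return _errors.Count == 0;
    }
}

// Holds no state of its own, but captures whatever provisioner it was built with.
public class AccountManager
{
    private readonly AccountProvisioner _provisioner;

    public AccountManager(AccountProvisioner provisioner)
    {
        _provisioner = provisioner;
    }

    public bool Register(string email) => _provisioner.CreateAccount(email);
}

public static class CaptiveDependencyDemo
{
    // Stand-in for a container treating AccountManager as a singleton:
    // it is built once, so its "transient" provisioner is built once too.
    private static readonly AccountManager Singleton =
        new AccountManager(new AccountProvisioner());

    public static bool HandleRequest(string email) => Singleton.Register(email);
}
```

Here request A with an empty email fails as expected, but request B with a perfectly valid email also fails, because it is served by the provisioner left over from request A.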
In addition to this, it’s very common for developers to accidentally encapsulate state in their components. So as an extra line of defense, keep all object lifetimes short.
Worried about performance? (A singleton avoids unnecessary object recreation, and so means less garbage collection effort.) Don’t be, unless it actually matters and you have measured it! Maybe it saves you $1 a year and a millisecond of latency, so don’t optimize up front. For services running at a large scale, though, this can be an issue and can affect auto-scaling through spikes in memory usage, so be mindful of this.
Tip 2: Separate data from logic in your class design
This may be a bit controversial. If you separate (immutable) data classes that have (virtually) no logic from the “business” classes that act upon them, it becomes easier to write stateless code. In practice, this is the kind of domain modelling I see everywhere, in my work experience as well as on the internet:
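A minimal sketch of that split, assuming a hypothetical `Account` data class and a hypothetical `AccountActivator` logic class:

```csharp
using System;

// Data class: immutable, holds values and nothing else.
public sealed class Account
{
    public Account(string email, bool isActive)
    {
        Email = email;
        IsActive = isActive;
    }

    public string Email { get; }
    public bool IsActive { get; }
}

// Logic class: has no fields of its own and acts only on the data it is given.
public class AccountActivator
{
    public Account Activate(Account account)
    {
        if (string.IsNullOrWhiteSpace(account.Email))
            throw new ArgumentException("An account needs an email before activation.", nameof(account));

        // Returns a new instance instead of modifying the one passed in.
        return new Account(account.Email, isActive: true);
    }
}
```

Because the activator keeps nothing between calls, one instance can safely serve any number of accounts or requests.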
Martin Fowler classifies the approach of separating data from logic as an Anemic Domain Model. This is because the classes with all the “logic” (e.g. AccountProvisioner) rarely end up mapping to the real-world domain model. So there are costs to separating the logic that you should be aware of. In my opinion, it’s usually a lot simpler and more pragmatic to separate the logic from the data.
Tip 3: Don’t cache objects all over the place: use a uniform mechanism to manage it
If you want to cache something, lean on a dedicated caching component that owns the data and uses the codebase’s standard caching framework.
- For caching, favor using a common caching layer to hold cache state.
- Consider creating cached versions of a component exclusively via the open/closed principle. For example, wrap a repo with a cached repo (see the decorator pattern).
- Reserve class member variables for holding injected dependencies only: they should generally always be readonly and assigned in the constructor (and hence never changed during the course of the owning object’s lifetime).
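The cached-repo idea from the list above can be sketched as a decorator. All names here are hypothetical, and the `ConcurrentDictionary` is only a stand-in for whatever shared caching layer the codebase actually uses.

```csharp
using System.Collections.Concurrent;

public interface IAccountRepository
{
    string GetDisplayName(int accountId);
}

// Fake "slow" repository that counts how often it is hit (for demonstration only).
public class CountingAccountRepository : IAccountRepository
{
    public int Calls { get; private set; }

    public string GetDisplayName(int accountId)
    {
        Calls++;
        return "Account " + accountId;
    }
}

// Decorator: adds caching without modifying the wrapped repository.
public class CachedAccountRepository : IAccountRepository
{
    private readonly IAccountRepository _inner;

    // Stand-in for the codebase's common caching layer.
    private readonly ConcurrentDictionary<int, string> _cache =
        new ConcurrentDictionary<int, string>();

    public CachedAccountRepository(IAccountRepository inner)
    {
        _inner = inner;
    }

    public string GetDisplayName(int accountId) =>
        _cache.GetOrAdd(accountId, id => _inner.GetDisplayName(id));
}
```

The cache state now lives in exactly one, clearly named place, and the undecorated repository stays stateless.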
*Bonus* Tip 3.5: And don’t try to cache connections yourself
Connections to things like databases, external services over HTTP etc. have a certain lifetime. You may decide to keep a connection alive and store a reference to it for the lifetime of the app, because you can re-use it and so it’s more efficient. But lots can go wrong when you do this. And often you will find there are dedicated components out of the box that manage connections efficiently for you (e.g. using some kind of connection pooling) and ensure things won’t go wrong.
Classic examples:
- Use HttpClientFactory (IHttpClientFactory, the standard mechanism in dotnet) instead of holding onto specific clients/HTTP connections. This will pool connections for you.
- For SQL: typically the SQL driver will natively pool connections for you by default; it’s usually an explicit opt-out option on the connection string itself.
If you try to manage your own connections:
- Not all connection components support communicating in parallel. Even those that do may simply queue your requests in the background, giving you poor performance.
- Connection strings can change. This doesn’t commonly happen at runtime, but rather at deploy time. However, for multi-tenant services this is a classic source of bugs, since the components can be servicing requests for tenants that use different storage/service infrastructure.
Tip 4: Use immutable data classes
By “data classes” I mean simple classes that do not contain business logic. As per tip 2: if you subscribe to separating business logic from data, then these are the classes that represent “entities” that typically map to real world “things” (borrowing from DDD). e.g. an “Account” class.
- Avoid mutable POCOs/POJOs (with public setters) in “domain” layers
- — except maybe for DTOs (passing through service boundaries), though serializer libs are getting much smarter and can match up constructor arguments just fine
- — If you need to construct a really complicated model in the program, consider the builder pattern
- — C# 9 made great strides in improving this experience and its readability with “record types” and “init-only setters”
- Favor IEnumerable<T> and other read-only or immutable collection types for your data classes. Please, please, please stop exposing List<T> or IList<T> on data classes. Exposing arrays isn’t perfect either: they can still be mutated by re-assigning elements by index
- — for more complicated data classes, consider creating read-only first-class collections (Object Calisthenics)
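As a sketch of the C# 9 features mentioned in the list above (the `Account` and `Address` types are hypothetical):

```csharp
// Positional record (C# 9): its properties are init-only and equality is value-based.
public sealed record Account(string Email, string DisplayName);

// Init-only setters (C# 9): assignable in an object initializer, read-only afterwards.
public sealed class Address
{
    public string City { get; init; } = "";
    public string Country { get; init; } = "";
}
```

A “change” is then expressed with a `with` expression, e.g. `var renamed = account with { DisplayName = "Ada L." };`, which copies the record and leaves `account` itself untouched.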
Tip 5: Favor passing immutable collection types between ALL components (on public/internal interfaces)
It is very common for people to simply use List<T> (or any mutable collection) everywhere as a way to pass around collections of data! Once again, this just exposes the possibility of mutating state throughout the program. You can easily protect against this in many ways. Some examples are to replace your mutable collections as follows:
- Use IEnumerable<T>
- If you need multiple enumeration and/or indexing, use the read-only interfaces: IReadOnlyCollection<T>, IReadOnlyList<T>, IReadOnlyDictionary<TKey, TValue> etc.
- Use immutable collections (ImmutableArray<T>, ImmutableList<T>, ImmutableDictionary<TKey, TValue> etc.)
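A small sketch combining these options. `ProductCatalog` is a hypothetical component; `ImmutableList<T>` comes from `System.Collections.Immutable`.

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;

public class ProductCatalog
{
    private readonly ImmutableList<string> _products;

    // Accepts the most general read-only input type.
    public ProductCatalog(IEnumerable<string> products)
    {
        _products = products.ToImmutableList();
    }

    // Callers can enumerate, count and index, but get no Add or Remove to call.
    public IReadOnlyList<string> Products => _products;

    // "Changing" the catalog produces a new one; this instance is untouched.
    public ProductCatalog With(string product) =>
        new ProductCatalog(_products.Add(product));
}
```

Nothing on the public surface lets a collaborator modify the catalog it was handed.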
Tip 6: Copy collections in constructors (and public setters)
I have been burned by this one, it was not easy to track down… consider this:
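A sketch of the kind of code in question. The `Basket` name comes from the discussion in this tip; its members and the demo class are assumptions made for illustration.

```csharp
using System.Collections.Generic;

public class Basket
{
    private readonly List<string> _products;

    // Keeps the caller's list itself, not a copy of it.
    public Basket(List<string> products)
    {
        _products = products;
    }

    public int Count => _products.Count;

    public void Add(string product) => _products.Add(product);
}

public static class BasketDemo
{
    public static (int First, int Second) Run()
    {
        var products = new List<string> { "apple" };
        var first = new Basket(products);
        var second = new Basket(products); // same list instance as the first basket

        second.Add("pear");

        // Both baskets now report 2 items, although only the second was changed.
        return (first.Count, second.Count);
    }
}
```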
Since the constructor on Basket takes the same reference to a mutable collection, it assumes that it owns it. But it cannot assume that! In the example usage, the first basket gets the list of products. Then a second basket is created re-using the same reference to the list. Now, this may seem contrived, but believe me, this can crop up in very subtle ways. Why leave your code open to risk when you can simply follow this guideline: copy all mutable types passed to constructors that are to be retained as members.
So a couple of common solutions to addressing the issues in the example above:
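Two possible fixes, sketched under the assumption that a basket only needs its own private view of the products. `CopyingBasket` and `ImmutableBasket` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;

// Option 1: take a defensive copy in the constructor.
public class CopyingBasket
{
    private readonly List<string> _products;

    public CopyingBasket(IEnumerable<string> products)
    {
        _products = new List<string>(products); // private copy
    }

    public int Count => _products.Count;

    public void Add(string product) => _products.Add(product);
}

// Option 2: store an immutable snapshot, so the basket itself cannot be mutated either.
public class ImmutableBasket
{
    private readonly ImmutableList<string> _products;

    public ImmutableBasket(IEnumerable<string> products)
    {
        _products = products.ToImmutableList();
    }

    public IReadOnlyList<string> Products => _products;

    public ImmutableBasket Add(string product) =>
        new ImmutableBasket(_products.Add(product));
}
```

With either option, later changes to the caller's list, or to another basket built from it, no longer leak into this basket.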
Note: if your component accepts data through a public/internal setter or function, the same principle applies (though if you are doing this, it becomes a stateful component in itself!)
Tip 7: Favor writing pure methods
Personally, I really like C++’s killer feature called const, which can be applied to method declarations and argument inputs. The compiler ensures that a const function cannot mutate the object it is called on or anything passed in as const, nor will it allow calls to non-const functions on them from within a const function. This guarantees statelessness through a strong contract enforced by the compiler.
In dotnet, there is a “shim” for this that JetBrains provide in their code annotation library: the [Pure] attribute for methods (see “Code Annotation Attributes” in the ReSharper documentation). Violating pure method checks can yield ReSharper (or JetBrains Rider) warnings, which you can configure as error-level checks and run in CI/CD.
However, there have been recent moves towards supporting const in C# as a first-class construct: C# 8 allows you to declare methods on structs as readonly, giving compile-time checks that they do not modify the struct’s state. Unfortunately, it currently only supports structs.
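A short sketch of a readonly struct member in C# 8. The `Money` struct is a hypothetical example.

```csharp
public struct Money
{
    public decimal Amount;

    // readonly member (C# 8+): the compiler rejects any write to this struct's
    // fields inside this method.
    public readonly Money Add(decimal delta) =>
        new Money { Amount = Amount + delta };

    // Not readonly: allowed to mutate the struct in place.
    public void AddInPlace(decimal delta) => Amount += delta;
}
```

Adding `Amount += delta;` to the body of `Add` would be a compile-time error, which is the closest built-in equivalent to a const method that C# offers here.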
If you have cases where you want to change a data class in the depths of your system’s “business” or “domain” logic: then I would consider making the data class immutable and use pure methods (like all string operations in C#):
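A sketch of such a data class, with a usage example. The members of `Account` and the `AccountUsage` helper are assumptions made for illustration.

```csharp
using System;

public sealed class Account
{
    public Account(string owner, decimal balance)
    {
        Owner = owner;
        Balance = balance;
    }

    public string Owner { get; }
    public decimal Balance { get; }

    // Like string.Replace or string.ToUpper: returns a new value, leaves this one untouched.
    public Account Deposit(decimal amount)
    {
        if (amount <= 0) throw new ArgumentOutOfRangeException(nameof(amount));
        return new Account(Owner, Balance + amount);
    }

    public Account Rename(string owner) => new Account(owner, Balance);
}

public static class AccountUsage
{
    public static decimal[] Run()
    {
        var original = new Account("Ada", 100m);
        var updated = original.Deposit(50m);

        // original still holds 100; only the new instance holds 150.
        return new[] { original.Balance, updated.Balance };
    }
}
```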
Note that in the usage, any reference to any Account will always be protected from its state being changed. At no point in its lifetime can its state change in any way, giving maximum certainty in your code.
Conclusion
System state is something that a developer should be acutely aware of, in all of its forms. It’s another aspect of development that ought to be lodged in the brain of a developer, a key element that builds up the aspirational “engineering mindset”.
If left unchecked, state can creep into the corners and cracks of your codebases, causing unnecessary complexity that your team will most certainly pay for.