Keep your source code SIMPLE

Kevin Goslar
Nov 9 · 9 min read

As software developers, we are fortunate to have many useful best practices for productive and fun coding, like the SOLID principles, GRASP patterns, or STUPID anti-patterns. These principles are timeless and apply to many forms of coding, no matter which programming paradigm or language you use.

Here are a few more that I have found useful for keeping the process of developing, maintaining, and evolving software over a long time simple and easy. They are aptly named the SIMPLE principles:

  • Strong data types: model your data precisely enough for your domain
  • Immutability: limit mutability where possible
  • Misuse-proof APIs: make it impossible to use your interfaces the wrong way
  • Pure code: separate business logic from side effects
  • Lean components: keep all parts of your architecture small and focused
  • Expressive errors: provide helpful error messages

Before we get started, let’s remember Uncle Bob’s wise words about such principles:

They are not rules, laws, nor perfect truths. They are statements on the order of “an apple a day keeps the doctor away”. They give a name to a concept so that we can talk and reason about it. They provide a place to hang the feelings we have about good and bad code. They attempt to categorize those feelings into concrete advice.

Also, notice that the focus of the SIMPLE principles is to make the overall development process — maintaining, debugging, refactoring, and adding features with limited knowledge of how the entire system works — simpler, at the expense of adding some complexity to your codebase. This is complexity well invested. Let’s check them out!

Strong data types

Even with static type checking, you can end up with lots of different meanings for basic built-in types, making it too easy to mix them up. Joel Spolsky provides convincing examples for this problem in his excellent piece Making Wrong Code Look Wrong: In the Excel codebase, there are at least a few dozen possible meanings for an integer. It could be:

  • a row or column number
  • a horizontal or vertical coordinate relative to the layout or window
  • the difference between two horizontal or vertical coordinates (a width or height)
  • a count of bytes (an offset)

Only certain kinds of integers should be combined, for example a horizontal offset relative to the layout with a width relative to the layout. Adding a horizontal offset to a vertical offset, or mixing offsets relative to the window with offsets relative to the layout, is most likely a bug. Having so many meanings for integers makes your type system weaker than you might think. The Excel team, being limited to what C++ can do, addresses this by encoding the domain-specific meaning of variables in their names via Hungarian Notation. If you use a more flexible type system, you can instead define domain-specific types. In Go, this would look like:

type LayoutHeight int

You want to define dedicated new types that you have to convert explicitly when needed, not type aliases or typedefs. In Go, type LayoutHeight int declares such a distinct type, whereas type LayoutHeight = int would only be an alias for int. But don’t over-engineer your type system: do this only when the benefit of the additional consistency checks exceeds the cost of the increased complexity.
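
TypeScript compares types structurally, so it needs a small trick to get a similar effect. The following is a minimal sketch using so-called branded types. All names in it (LayoutX, LayoutWidth, WindowX, rightEdge, windowToLayout) are made up for illustration and do not come from the Excel codebase.

```typescript
// Branded number types: the brand exists only at compile time,
// at runtime these values are plain numbers.
type LayoutX = number & { readonly __brand: "LayoutX" }
type LayoutWidth = number & { readonly __brand: "LayoutWidth" }
type WindowX = number & { readonly __brand: "WindowX" }

// Explicit conversions from plain numbers into the domain types.
const layoutX = (n: number) => n as LayoutX
const layoutWidth = (n: number) => n as LayoutWidth
const windowX = (n: number) => n as WindowX

// A layout offset plus a layout width is again a layout offset.
function rightEdge(left: LayoutX, width: LayoutWidth): LayoutX {
  return layoutX(left + width)
}

// Changing the coordinate system is an explicit, named operation.
function windowToLayout(x: WindowX, windowOrigin: LayoutX): LayoutX {
  return layoutX(x + windowOrigin)
}

const edge = rightEdge(layoutX(10), layoutWidth(25))
console.log(edge) // prints 35

// rightEdge(windowX(10), layoutWidth(25))
// is rejected by the compiler: WindowX is not assignable to LayoutX.
```

The cost is the extra conversion functions; the benefit is that mixing window and layout coordinates becomes a compile error rather than a runtime bug.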

Immutability

Shared mutable state is a hotspot for bugs and concurrency problems because its value can change in ways that are hard to reason about in complex code. Immutable data, which never changes once set, prevents many of these issues. Striving for immutability doesn’t mean banning all mutation, however. It means avoiding unnecessary mutability by controlling when and where data can change.

An example of unnecessary mutability is using a shared mutable variable to accumulate text in small steps. Having direct access to this variable, somebody (some piece of code) could accidentally change or remove already accumulated content in it. Doing so would be unexpected, violate assumptions, and increase the likelihood of problems. A StringBuilder class, which hides the fully mutable variable from the rest of the code base and exposes only methods to append text and get the accumulated content, prevents such issues. Here is an example implementation in TypeScript:

class StringBuilder {
  private content: Array<string> = []
  // methods to append text and to get the accumulated content go here
}
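
The listing above shows only the private field. A fuller sketch of such a class might look like the following. The method names append and toString are assumptions; the text only says that the class can append text and return the accumulated content.

```typescript
class StringBuilder {
  // Private: code outside this class cannot remove or overwrite parts.
  private readonly parts: string[] = []

  // Appending is the only way to change the accumulated content.
  append(text: string): this {
    this.parts.push(text)
    return this
  }

  // Returns a new string; the internal array is never handed out.
  toString(): string {
    return this.parts.join("")
  }
}

const builder = new StringBuilder()
builder.append("Hello, ").append("world")
console.log(builder.toString()) // prints "Hello, world"
```

The array inside is still mutable, but only through append, which is the point: mutability is controlled rather than forbidden.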

An easy first step towards more immutability is to make most variables and class members constant or read-only so that their value cannot be re-assigned. Try to return new values rather than mutating existing values in place: newFoo = oldFoo.Add(change) instead of oldFoo.Add(change) changing the value of oldFoo. If parts of an object have to be mutable for performance or simplicity, consider separating them from the immutable data.
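
As a small illustration of read-only members and of returning new values, here is a sketch in TypeScript. The Account type and the deposit function are hypothetical examples, not part of any library.

```typescript
// All fields are read-only, so they cannot be re-assigned after creation.
interface Account {
  readonly owner: string
  readonly balance: number
}

// Returns a new Account instead of changing the one passed in.
function deposit(account: Account, amount: number): Account {
  return { ...account, balance: account.balance + amount }
}

const before: Account = { owner: "Ada", balance: 100 }
const after = deposit(before, 50)

// before.balance = 0
// would be rejected by the compiler: balance is a read-only property.

console.log(before.balance, after.balance) // prints "100 150"
```

Code that still holds a reference to before keeps seeing the old balance, so nothing changes behind its back.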

You can also use languages or libraries that provide immutable data structures, but as the StringBuilder example has shown, this isn’t necessary to control mutability.

Misuse-proof APIs

Good interfaces are easy to use the right way and hard to use the wrong way. They expose only controls to perform valid operations. Here is a hypothetical API to download a file that is easy to misuse:

client = new http.Client()
client.setHeader("foo", "bar")
client.setCredentials(myCreds)
client.request("https://acme.corp/info.txt")
if (client.success()) {
  return client.receivedText("utf8")
}

Several things are unnecessarily easy to misuse here:

  1. This class requires calling its methods in a particular order. For example, before starting the download, the user must call setHeader and setCredentials to configure this client. You had better not forget this!
  2. It’s not clear what happens if you call setCredentials twice. Does it use both values? How would that even work? If not, which of the two credentials does it use?
  3. Some state, like the received text or whether the request was a success, is visible to the user right away but only meaningful after the download has completed. Examining this state earlier is invalid, but this API cannot prevent it.
  4. We could start a second download using the same client. What would happen then? Does it run both requests in parallel? Does it abort one of them? If so, which one? Does it reuse the credentials and header values from the first call? If yes, how does one unset one of them?
  5. What are other possible text encodings besides "utf8"?

Better than throwing errors or exceptions at users when they misuse an API is to avoid these issues with an interface that is impossible to use the wrong way. Here is an example of such an “idiot-proof” file download API:

response = http.downloadFile({
  url: "https://acme.com/info.txt",
  headers: { foo: "bar" },
  credentials: myCreds
})
if (response.success()) {
  return response.receivedText(encoding.UTF8)
}

This new API exposes just one function to perform a download. It is obvious which configuration it uses for this download, the one that it receives in the arguments. It is also obvious how to perform another download using this API: call the downloadFile function again with whatever configuration you want, and you get a new Response containing the outcome of that download. Users only see responses once they can do something with them: when the download is complete. All fields in the response are meaningful and populated. Possible text encodings are enums or constants.
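
To make these ideas concrete, here is one way such an API could be typed in TypeScript. This is a sketch under several assumptions: all names (Encoding, DownloadOptions, DownloadResult, Transport) are hypothetical, the network layer is passed in as a function so the example runs offline, the call is synchronous for brevity where a real download would be asynchronous, and the encoding moves into the options instead of being chosen when reading the text.

```typescript
// Possible encodings are an enum, not free-form strings.
enum Encoding {
  UTF8 = "utf8",
  Latin1 = "latin1",
}

// Everything a download needs arrives in one immutable options object.
interface DownloadOptions {
  readonly url: string
  readonly encoding: Encoding
  readonly headers?: Readonly<Record<string, string>>
  readonly credentials?: { readonly user: string; readonly token: string }
}

// Each variant carries only the fields that are meaningful for it.
type DownloadResult =
  | { readonly kind: "success"; readonly text: string }
  | { readonly kind: "failure"; readonly status: number }

// Stand-in for the network layer (an assumption of this sketch).
type Transport = (options: DownloadOptions) => { status: number; text: string }

function downloadFile(options: DownloadOptions, transport: Transport): DownloadResult {
  const reply = transport(options)
  if (reply.status !== 200) {
    return { kind: "failure", status: reply.status }
  }
  return { kind: "success", text: reply.text }
}

// A fake transport that knows exactly one URL.
const fakeTransport: Transport = (options) =>
  options.url === "https://acme.com/info.txt"
    ? { status: 200, text: "hello" }
    : { status: 404, text: "" }

const result = downloadFile(
  { url: "https://acme.com/info.txt", encoding: Encoding.UTF8, headers: { foo: "bar" } },
  fakeTransport
)
if (result.kind === "success") {
  console.log(result.text) // text exists only on the success variant
}
```

Because the result is a discriminated union, the compiler refuses access to text until the caller has checked that the download succeeded.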

Pure logic

Pure functions are functions that:

  • only use their arguments to determine their result
  • have no side effects: they don’t interact with external variables, databases, files, or the network

You want to make most of your business logic pure since that makes it easy to reason about, reuse, and test. Since pure logic always produces the same output for the same input, no matter in which state the rest of your system is, it requires fewer tests than impure, stateful code. Because it has no side effects, its tests need almost no setup: just call the pure function with test data and verify its results. You can aggressively cache/reuse these results and run pure code concurrently.
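
The difference is easy to see in a small example. The sketch below uses made-up names (vatPercent, grossCentsImpure, grossCents) and integer cents to keep the arithmetic exact.

```typescript
// Impure: the result depends on a variable outside the function.
let vatPercent = 20
function grossCentsImpure(netCents: number): number {
  return Math.round((netCents * (100 + vatPercent)) / 100)
}

// Pure: everything the result depends on arrives as an argument.
function grossCents(netCents: number, ratePercent: number): number {
  return Math.round((netCents * (100 + ratePercent)) / 100)
}

const first = grossCentsImpure(10000)
vatPercent = 7 // some other code changes the shared variable
const second = grossCentsImpure(10000)

console.log(first === second) // prints "false": same input, different output
console.log(grossCents(10000, 20) === grossCents(10000, 20)) // prints "true"
```

Testing grossCents needs no setup at all, while a test for grossCentsImpure first has to put vatPercent into a known state.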

While the business logic should be pure, code at the application’s boundaries, where it interacts with the stateful world around it, must be impure. Gary Bernhardt calls this architecture functional core, imperative shell.
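
A minimal sketch of this split might look as follows. The Order type and both functions are hypothetical; the shell receives its side-effecting dependencies as parameters only so that the example runs without a database or mail server.

```typescript
interface Order {
  readonly id: string
  readonly totalCents: number
  readonly paid: boolean
}

// Functional core: pure decision logic, no input or output.
function overdueReminders(orders: readonly Order[]): string[] {
  return orders
    .filter((order) => !order.paid)
    .map((order) => `Order ${order.id}: ${order.totalCents} cents outstanding`)
}

// Imperative shell: gathers input, calls the core, performs side effects.
function sendReminders(loadOrders: () => Order[], send: (message: string) => void): void {
  for (const message of overdueReminders(loadOrders())) {
    send(message)
  }
}

sendReminders(
  () => [
    { id: "A1", totalCents: 500, paid: false },
    { id: "B2", totalCents: 900, paid: true },
  ],
  (message) => console.log(message)
)
```

All decisions live in overdueReminders, which can be tested with plain arrays. The shell stays so thin that there is little left in it to get wrong.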

Lean components

When I worked at Google, the Picasa Web Albums codebase had a God class that drove everybody to drink. It was a monster of many thousands of lines of entangled JavaScript that implemented a large portion of the core functionality of our application: a thumbnail view showing all photos in the collection, and when you click on a thumbnail, this class also showed that photo enlarged with details. This God class was so hard to work on that the best developers on the team had to take turns dealing with it. It was easy to find the developers currently on God class duty: the ones whose status was something like “God class needs to die” instead of the usual upbeat socializing. Every change made to this class broke something else, causing the team’s velocity to approach zero.

All such abominations that I have encountered over the years in many places have one thing in common: they are so large and entangled that almost nobody can wrap their head around how they work in a reasonable amount of time, if at all. Because even trivial changes to them feel like open-heart surgery and affect so many places, it takes prohibitively large amounts of effort to make meaningful improvements to them. Hence, these monsters accrete more and more cruft and technical drift over time until they turn into black holes for developer productivity.

To prevent these issues, keep things simple and separate. Extract orthogonal concerns into their own components using the single-responsibility principle. Keep your codebase lean, intuitive, and smelling good by cleaning up technical debt, drift, and cruft every sprint to prevent it from accumulating and getting heavy. And trim the fat: when in doubt leave it out, you probably ain’t gonna need it.
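
As a rough illustration of extracting orthogonal concerns, the two jobs of a photo gallery, the thumbnail overview and the enlarged detail view, could live in separate small classes. This is a made-up sketch with hypothetical names that returns strings instead of rendering anything; it is not how Picasa Web Albums was structured.

```typescript
interface Photo {
  readonly id: string
  readonly title: string
}

// One concern: the thumbnail overview.
class ThumbnailView {
  render(photos: readonly Photo[]): string[] {
    return photos.map((photo) => `[thumb ${photo.id}]`)
  }
}

// Another concern: one enlarged photo with details.
class DetailView {
  render(photo: Photo): string {
    return `[large ${photo.id}] ${photo.title}`
  }
}

// A thin coordinator that only wires the two views together.
class Gallery {
  constructor(
    private readonly photos: readonly Photo[],
    private readonly thumbnails: ThumbnailView,
    private readonly detail: DetailView
  ) {}

  overview(): string[] {
    return this.thumbnails.render(this.photos)
  }

  open(id: string): string | undefined {
    const photo = this.photos.filter((candidate) => candidate.id === id)[0]
    return photo ? this.detail.render(photo) : undefined
  }
}

const gallery = new Gallery([{ id: "1", title: "Sunset" }], new ThumbnailView(), new DetailView())
console.log(gallery.open("1")) // prints "[large 1] Sunset"
```

Each class can now be understood, tested, and changed without reading the others.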

A rule of thumb for lean code is that files shouldn’t be larger than about 100 lines of actual code (ignoring comments and data), give or take depending on how chatty your programming language is. Bigger files likely do too much and become hard to work on, at least for some people. Functions shouldn’t exceed 20 lines of actual code. But don’t take it too far and make Ravioli code where each function contains only one or two lines, and the application logic is spread out too much.

Expressive error messages

Besides intuitive APIs, what makes applications or libraries easy to use and maintain are helpful error messages. When something goes wrong, it’s not enough to spill cryptic stack traces, roll over and die. The application knows best how to use it, and it has just experienced the problem. This gives it enough information to help investigate how and why things didn’t go as expected and what to do better in the situation. Here is an example of a poor error message:

Error: ENOENT, no such file or directory '~/foorc'

Googling reveals that this means “I couldn’t access the file ~/foorc”. To the user, it might not be clear why the application tries to find this file, why it couldn’t access it in case it’s there, what’s supposed to be inside it, or what to do to make this error go away. A stack trace would add the where inside the code, but that is neither particularly readable nor helpful to the users. They don’t know the source code of the application or library, nor do they want to change it. They want to use it as it is. A more helpful error message could look something like this:

Error: cannot determine the application configuration:
problem reading the global application configuration file:
found file ~/foorc but unable to open it.

Such an error message gives the user helpful insights into the nature of the problem. Ideally, it also proposes solutions and points to the respective parts of the documentation for more background.

Making error messages useful can require significant amounts of additional code. You enrich the error as it bubbles up the call stack. This extra complexity is a worthwhile investment since it makes your code easier and simpler to use, you have more and happier users, and spend less time and money on customer support.
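
One simple way to enrich an error on its way up is to catch it at each layer and re-throw it with that layer’s context prepended. The following sketch assumes hypothetical function names (withContext, readGlobalConfig, loadConfig) and takes the file-opening function as a parameter so that it runs without touching the file system.

```typescript
// Wraps a lower-level error with the context of the current layer.
function withContext(context: string, cause: unknown): Error {
  const detail = cause instanceof Error ? cause.message : String(cause)
  return new Error(`${context}:\n${detail}`)
}

// Stand-in for real file access (an assumption of this sketch).
type OpenFile = (path: string) => string

function readGlobalConfig(open: OpenFile): string {
  try {
    return open("~/foorc")
  } catch (e) {
    throw withContext("problem reading the global application configuration file", e)
  }
}

function loadConfig(open: OpenFile): string {
  try {
    return readGlobalConfig(open)
  } catch (e) {
    throw withContext("cannot determine the application configuration", e)
  }
}

// Simulates a file that exists but cannot be opened.
const unreadable: OpenFile = (path) => {
  throw new Error(`found file ${path} but unable to open it.`)
}

try {
  loadConfig(unreadable)
} catch (e) {
  console.log(e instanceof Error ? e.message : String(e))
}
```

Running this prints the three lines of the improved message shown earlier, without the leading “Error:”. Each layer contributes only what it knows, which keeps the additional code per function small.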

Wrapping up

So, should we go all the way and use a functional programming language where sophisticated typing, immutability, and pure logic are primary paradigms? You certainly can, but that’s not the point here. The SIMPLE principles aim to bring the best ideas from functional programming to the architecture of any codebase to make working on it simpler. The downloadFile function in the examples above returns an object, and that's okay. While an apple a day keeps the doctor away (or any other person if you throw it hard enough), it doesn't imply that one should eat only apples from now on. Hopefully, the SIMPLE principles help you write better code in any language.

Happy hacking!

