Software Engineering Discipline with the Help of Static Types

Published in That’s What I’m Talking About, Aug 2, 2016

Last week, I wrote about the many tradeoffs that we face when writing software. In that post I asserted how important it is to have some intrinsic architectural choices made ahead of time so that we have some anchors to hang on to as the push and tug of implementing real functionality attempts to thrash our code into oblivion. Now it’s time to start making some of those choices — choices that will close some doors behind us. If we choose Java, then we can’t choose Ruby. If we choose Play then we can’t choose Rails. We can’t choose Linux and Windows at the same time.

But how do we choose? After all, there are many implications here. These choices may affect staffing in the future: is it easy to find people who will be able to work with your technology? Back in 2011 when I joined Cisco we had begun down the path of building a Ruby on Rails application, but at that point in time Ruby/Rails was immensely popular and we found it hard to hire anyone but the most costly of contractors. These choices may also affect how you are perceived by customers, partners, and the ecosystem into which you are entering. You may, for example, find that your target customers have some impressions and biases about the types of technology they will buy.

In this conversation though, I prefer to play the role of Software Architect, and to go back to my core principles and then see where we land on staffing and sales and marketing implications. For me, the first principle that matters is that the server-side code must be implemented in a statically typed language. There will be additional nuances and corollaries to this first principle of course. And there are also times when I will use interpreted languages (such as shell scripts) to accomplish point solutions to very specific problems. But when it comes to building a large piece of back-end functionality that serves the problem domain laid out by the business, I refuse to build it in a language that doesn’t have a compiler.

Am I saying that everyone who chooses Ruby or PHP or JavaScript or any other dynamically typed language is wrong? Remember, this is religion and we all have reasons for the choices we make. I’ll do my best to explain my reasons here, but some require far more context and background than can be provided in a one-week installment.

A Quick Word on PHP

I mentioned our struggles finding Ruby developers to come and work for Cisco. For various reasons (including the skillset of the personnel that we did have) we ended up building a PHP layer instead of Ruby. I didn’t know PHP at the time, and thought that, as an architect on the team, I should at least have some knowledge of how to write code in that language. So I set a weekend and an O’Reilly book aside to learn PHP. I found the language very easy to pick up, and the internet full of resources to help you get things done quickly. Indeed, even still at the time of this writing in 2016, PHP stands at #6 on the TIOBE rankings of “popular” languages. So clearly there are a lot of people who find the language effective and there is much community around it.

To me, PHP is like shell scripting for the web. You can get lots of stuff done really quickly without the need for a lot of setup. However, what I found as my first PHP program grew to a couple hundred lines of code was that maintenance was quickly going to become an issue. In my case, I decided only a couple of hours into my development that I had named a variable poorly and I wanted to change it. Immediately I missed being able to lean on my compiler to help me through this very lightweight “refactor.” I think that it’s quite possible to write complex systems in PHP, and there are lots of PHP resources and frameworks online that will help, but if it only takes me a weekend to run up against a problem that does not exist with a statically typed language, then I can only imagine how often I will run into it in the coming months and years, and how much worse the problems will become.

The Importance of Discipline

Without much consideration, I have just spent two paragraphs completely dismissing PHP as an option. I don’t want that to be the main takeaway here though, and I am a strong believer that it’s possible to write an application in any language you like. It’s not so easy to write a large application in some languages, and no matter what language you choose it’s almost impossible to keep a code base “clean” over years of active development. But with discipline and a strong set of architectural principles, we at least have a chance.

Discipline takes energy though, and different types of languages and frameworks require different types of discipline in order to keep things clean. Different developers enjoy different types of work, and buy in to different development philosophies (e.g. Domain Driven, Test Driven Development, Behavior Driven Development, etc.) and thus, into the different types of diligence required to stay true to those philosophies. My bias towards static type checking with a compiler is an indication of the type of work that I enjoy. It is through this lens, the lens of programming discipline and the work it necessitates, that I will explain my own religion.

The Baseline Work

In the absence of architectural guide posts, it still takes effort to create code that accomplishes some task. Consider the example of user registration on Pirc.com, where we began modeling the User object. Strictly speaking, there are really only a handful of things we have to do in order to deliver this feature:

Receive an HTTP POST from the web browser that contains the information for the new user. This will be dealt with by the framework.
Validate the submitted information, most likely with tools provided by the framework.
Using the submitted data, create a new User object in the database.
Log that user in and redirect the browser to some sort of “Getting Started” page.

So, assuming that our framework is going to give us a place to put this code, how to name it, and how to receive input from and send a response back to the browser, all we really have to do is code some validation rules on the input variables, insert something into the database, and set a cookie to link subsequent browser requests to this newly created user record. This is a small amount of work, and could reasonably be done in a single method invoked by your web framework of choice.
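
To make that concrete, here is a minimal sketch of what such a single method might look like. It is not tied to any real framework: the form is a plain map, the two static maps are hypothetical stand-ins for a database table and a session store, and the returned string stands in for an HTTP response. The field names (email, displayName) are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class MonolithicSignUp {
    // Hypothetical stand-in for a database table, keyed by email.
    static final Map<String, Map<String, String>> USERS_TABLE = new HashMap<>();
    // Hypothetical stand-in for a session store: cookie value -> email.
    static final Map<String, String> SESSIONS = new HashMap<>();

    // Everything in one place: validation, persistence, login, and redirect.
    static String signUp(Map<String, String> form) {
        String email = form.get("email");
        String displayName = form.get("displayName");

        // 1. Validation, inlined.
        if (email == null || !email.contains("@")) {
            return "400 invalid email";
        }
        if (displayName == null || displayName.trim().isEmpty()) {
            return "400 display name required";
        }
        if (USERS_TABLE.containsKey(email)) {
            return "409 user already exists";
        }

        // 2. Persistence, inlined (a real version would issue SQL here).
        Map<String, String> row = new HashMap<>();
        row.put("email", email);
        row.put("displayName", displayName);
        USERS_TABLE.put(email, row);

        // 3. Login and redirect, inlined.
        String sessionId = UUID.randomUUID().toString();
        SESSIONS.put(sessionId, email);
        return "303 /getting-started; Set-Cookie: session=" + sessionId;
    }

    public static void main(String[] args) {
        Map<String, String> form = new HashMap<>();
        form.put("email", "ada@example.com");
        form.put("displayName", "Ada");
        System.out.println(signUp(form));
    }
}
```

It works, and it is short, which is exactly why this shape is tempting.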

This method, of course, would be a crime against cohesion and is really not how we want to do things. No one would really do things this way, but enumerating why is a good way to flesh out the type of design and coding discipline that I am talking about:

Making this a single method that is invoked by the framework inextricably ties the functionality to that framework. This is bad because it makes it more difficult to exercise this code from different execution contexts, whether they be unit tests or from other types of tools (e.g. a bulk import job not invoked via HTTP request) that may need to register users.
Inserting the user right from this method (either via direct SQL or some other type of object/relational mapping type solution) links the framework and the business logic directly to the persistence layer.

These are the immediate ramifications of putting all the registration work into a single method. There are also longer term ramifications that won’t really become clear until it’s too late:

As your application grows and changes, there will be more and more code that relies on the User object, and that object itself will probably change and become more sophisticated over time. This means that there will be more pieces of code that need to access and modify the User object. It will become increasingly difficult to keep track of all those pieces of code if there is not some organized way of managing the User objects.
It will become harder to teach new people how your application works as it grows without some abstraction layers that make it easier to reason about.
Without a layer between your controllers and your database, it will be virtually impossible to maintain a complete picture of what your application actually does. Perhaps the discipline that you have and the energy you want to spend is in carefully documenting new functionality you add to your application over the years. Even though the code may not impose structure, your documentation will. Good luck with that.

The Role of a “Type System”

But I digress. The purpose of this article was to explain why I am so adamant about having a compiler. In thinking more deeply on this topic, it’s not really the compiler that I like so much, but the type system. This is an intrinsic architectural choice that I’ve had for so long that I have a hard time explaining it. I did find an article that goes into great depth on exactly what a static type system is compared to a dynamic type system. In a nutshell, static type systems (and the compilers that know how to process them) give us a way to impose structure on our code at the time we write it rather than at the time we run it. As we make changes to the code, this structure must be maintained in order for the program to compile and run.
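
A small sketch of the difference, using Java for both halves. The Account class and the emailOf helper are hypothetical, and the map-based half only imitates how a dynamically typed language would treat the same data; it is not a claim about any particular language.

```java
import java.util.HashMap;
import java.util.Map;

public class StructureDemo {
    // Structure declared up front: the compiler knows every Account has an email.
    static final class Account {
        private final String email;

        Account(String email) {
            this.email = email;
        }

        String getEmail() {
            return email;
        }
    }

    // Structure discovered at run time: any key may or may not be present.
    static String emailOf(Map<String, Object> looseAccount) {
        return (String) looseAccount.get("email");
    }

    public static void main(String[] args) {
        Account typed = new Account("ada@example.com");
        System.out.println(typed.getEmail());
        // typed.getEmial();  // would not compile: the typo is caught before the program runs

        Map<String, Object> loose = new HashMap<>();
        loose.put("emial", "ada@example.com"); // the same typo compiles without complaint
        System.out.println(emailOf(loose));     // prints "null": the mistake surfaces only at run time
    }
}
```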

But in some languages, this notion of a static type can take on even more meaning and importance. Not only can we have the compiler test to make sure that we’ve named all of our methods and modules and variables correctly, but we can use constructs of the language to achieve higher order goals such as effective domain modeling. A couple of posts ago, I began to get into the importance of domain modeling when you are building a new application. Here we can dive a little deeper into that idea focusing on the discipline/energy function I was talking about before.

Beyond the Baseline: Design Before You Build

In most engineering disciplines, it is unthinkable to start building something before you have designed it. That’s because in most engineering disciplines, mistakes made in the build phase are very costly to fix, and potentially very dangerous.

Tacoma Narrows Bridge Collapse, 1940 — a hard-learned lesson on “aeroelastic flutter”

Software, on the other hand, is completely weightless and frictionless. It exerts no forces and it carries no load. It exists in a completely virtual space, and can take any form or no form at all. There isn’t even any regulatory or licensing body for those of us who claim to be Software Engineers. Indeed, if you give 10 different engineers an adequately complex problem to solve, you’ll almost certainly get back 10 different solutions to the problem. And even more befuddling is the fact that all 10 solutions may be acceptable (but some will most certainly be better than others). There isn’t even a commonly accepted practice for what it means to design your software. There is a visual component (what your end users are going to see), and there are many layers below on which you could focus. Or, you could choose not to “design” anything and just start in on creating pieces of code that accomplish certain tasks and then stitch them together later, which is the essence of how shell scripting is done.

This is where I want to impose some Engineering discipline that is at the same time lightweight but also very demonstrative of what the system does. Further, this discipline also has a validation tool of sorts, the compiler, to make sure that the design itself meets some base level syntax requirements. This is my domain model, and for me it serves as the blueprint for the application. The domain model is coded as pure Java interfaces. Interfaces have no code, no implementation, they define only the inputs and outputs, the attributes of objects as well as the relationships between them. Interfaces that have lots of JavaDoc comments also turn into very readable and usable design documentation. However, even if this were not the case, the interfaces provide an even more important bit of structure to the system: they tell the compiler all about my application, and they give the compiler the knowledge needed to check for the correctness of my implementation. This doesn’t guarantee that my design is “good,” but it does guarantee that the code I write will match the contracts that I laid out in my design.
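
As a rough illustration, a domain model of this kind might start out like the sketch below. The names User and UserDirectory and the signUp() operation come from this series; the specific attributes (email, display name), the findByEmail() method, and the exception choice are assumptions made up for the example.

```java
/** A registered person in the system. The attributes shown are illustrative assumptions. */
interface User {
    /** @return the unique email address that identifies this user */
    String getEmail();

    /** @return the name shown to other users */
    String getDisplayName();
}

/** The entry point for creating and finding users. */
interface UserDirectory {
    /**
     * Registers a new user.
     *
     * @param candidate the details submitted for registration
     * @return the user as stored by the system
     * @throws IllegalStateException if a user with the same email already exists
     */
    User signUp(User candidate);

    /**
     * @param email the address to look up
     * @return the matching user, or null if none is registered
     */
    User findByEmail(String email);
}
```

There is nothing to run here, and that is the point: the file is all contract and documentation, and every implementation class written later has to satisfy it before the build will pass.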

Domain-First Development at Work

So let’s revisit the baseline example above, which involved User registration in our system. Recall that the baseline work involved receiving an HTTP request to create a new user, a way to verify the input data, a way to create the new user in a database, and a way to return an HTTP response back to the caller. Building on the work from a couple posts back, we can now fill in the rest of the picture using the Play! Framework to communicate with the web tier and using our (yet to be implemented) domain classes to do the work.

First, let’s look at the HTTP endpoint. In Play, we just need to add a new route to our application “routes” file:

POST /signup controllers.User.signUp()

This is just about as minimal as can be, and it simply means that when the application receives a POST request on /signup, that it should send that request on to a controller method called signUp() which lives in the controllers.User class. Over in that controller class, we’ll add the following method:

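public static Result signUp() {
    // Expose the inbound HTTP request as a domain User via the binder.
    com.pirc.api.User user = new UserBinder(request());

    // Delegate registration to the domain and reply with 201 (CREATED).
    return created(
      Json.toJson(API.getUserDirectory().signUp(user))
    );
}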

What we are doing here is instantiating a “binder” object that knows how to expose the inbound request as an instance of User, a technique which will be covered in a future post. The binder merits its own post because it is also the place where the field validations are going to happen. With this User instance in hand, we are then able to invoke the signUp() method on the UserDirectory object, and then to return the resulting User object back to the caller with a 201 (CREATED) status. Error handling (e.g. what if this user already exists?) is also delegated to the implementation of either the binder or the signUp() method itself.
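
Until that post arrives, here is one way such a binder could look, offered purely as a guess at the shape rather than the actual implementation. It assumes the binder implements User directly and validates lazily as each attribute is read; a plain map of parameters stands in for the framework's request object, and the attribute names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class BinderSketch {
    /** Domain interface, reduced to two assumed attributes. */
    interface User {
        String getEmail();
        String getDisplayName();
    }

    /** Exposes raw request parameters as a User, validating as it reads them. */
    static final class UserBinder implements User {
        private final Map<String, String> params;

        UserBinder(Map<String, String> params) {
            this.params = params;
        }

        @Override
        public String getEmail() {
            String email = required("email");
            if (!email.contains("@")) {
                throw new IllegalArgumentException("email is not valid: " + email);
            }
            return email;
        }

        @Override
        public String getDisplayName() {
            return required("displayName");
        }

        // Returns the trimmed parameter, or fails if it is missing or blank.
        private String required(String name) {
            String value = params.get(name);
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalArgumentException(name + " is required");
            }
            return value.trim();
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("email", " ada@example.com ");
        params.put("displayName", "Ada");
        User user = new UserBinder(params);
        System.out.println(user.getEmail() + " / " + user.getDisplayName());
    }
}
```

Because the binder is itself a User, the controller can hand it straight to the domain without copying fields into another object.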

The controller code above does not actually implement the baseline work involved with registering a new user in the system. Rather, the code directly leverages the design (the domain model), and it introduces new layers to which we will delegate the various concerns of implementing this function. This code also redeems us for our crime against cohesion from before: the controller is no longer imperatively telling us how to achieve registration, it is declaratively stitching together objects and method calls to achieve the same.
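
One payoff of this layering is that the same domain call works with no web framework in sight. The sketch below drives signUp() from a plain main method, the way a bulk import job or a unit test might. SimpleUser, InMemoryUserDirectory, findByEmail(), and the sample rows are all hypothetical; the in-memory map stands in for whatever persistence the real implementation uses.

```java
import java.util.HashMap;
import java.util.Map;

public class DomainOutsideTheWeb {
    interface User {
        String getEmail();
        String getDisplayName();
    }

    interface UserDirectory {
        User signUp(User candidate);
        User findByEmail(String email);
    }

    /** Plain value implementation of User. */
    static final class SimpleUser implements User {
        private final String email;
        private final String displayName;

        SimpleUser(String email, String displayName) {
            this.email = email;
            this.displayName = displayName;
        }

        @Override public String getEmail() { return email; }
        @Override public String getDisplayName() { return displayName; }
    }

    /** In-memory implementation; a database-backed one could replace it without touching callers. */
    static final class InMemoryUserDirectory implements UserDirectory {
        private final Map<String, User> byEmail = new HashMap<>();

        @Override
        public User signUp(User candidate) {
            if (byEmail.containsKey(candidate.getEmail())) {
                throw new IllegalStateException("user already exists: " + candidate.getEmail());
            }
            User stored = new SimpleUser(candidate.getEmail(), candidate.getDisplayName());
            byEmail.put(stored.getEmail(), stored);
            return stored;
        }

        @Override
        public User findByEmail(String email) {
            return byEmail.get(email);
        }
    }

    public static void main(String[] args) {
        UserDirectory directory = new InMemoryUserDirectory();
        // A bulk import: no HTTP request, no controller, same domain call.
        String[][] rows = { {"ada@example.com", "Ada"}, {"alan@example.com", "Alan"} };
        for (String[] row : rows) {
            directory.signUp(new SimpleUser(row[0], row[1]));
        }
        System.out.println(directory.findByEmail("alan@example.com").getDisplayName());
    }
}
```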

Static Types are Necessary, but not Sufficient

I have been trying to make the case for statically typed languages, and specifically for Java interfaces as a great method for introducing an additional level of engineering discipline on your projects. But it’s also important to realize that just because your code is compiled doesn’t mean that it’s good. You need to also maximize your use of static types and to be especially mindful of which tasks belong to which layers in your architecture. In the above example, we see right away that some work is appropriate for the controller, some for the parameter binder, and some for the API implementation. In future posts we’ll also see delegation of field validation and error handling as well as data persistence.

As a Software Architect and Engineer, this statically typed, domain-first approach is the discipline and energy that I invest over and above the baseline requirements. It provides me with a design for the system and a roadmap of the functionality. It does not bind me to a certain framework, execution context (web, desktop, unit test, or REPL), or persistence mechanism. And since it is statically typed, it is constantly cross-referenced during the compile stage to make sure that all of the code conforms to the design put forth. The statically typed domain is by definition the low-level documentation for the system, and it therefore never falls out of date. When it’s time to add new features to the system, our approach is clear:

Figure out whether the existing domain model supports the new feature. If not,
Make thoughtful changes to the domain model first, not just directly in support of the new feature but also thinking more holistically about what the changes mean to the overall domain.
Compile your implementation classes which, if you have changed the domain model, will now no longer conform to the domain model. The compiler will tell you exactly where your problems are.
Add the implementation for the changes to the domain model.
Exercise the changes in the domain model in your controllers by either creating new methods or modifying existing ones.
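
The steps above can be sketched with a made-up feature: deactivating a user. Everything here is hypothetical, and the directory is reduced to email strings to keep the example short. The code shows the end state; the comments describe what the compiler reports along the way.

```java
import java.util.HashSet;
import java.util.Set;

public class EvolvingTheDomain {
    /** The domain model after the change: isActive() and deactivate() are the new operations. */
    interface UserDirectory {
        void signUp(String email);

        /** @return true if the user is registered and has not been deactivated */
        boolean isActive(String email);

        /** Marks the user as no longer active. */
        void deactivate(String email);
    }

    /**
     * The implementation class. Between changing the interface and adding the two
     * methods below, javac refuses to compile this class, with an error along the
     * lines of "InMemoryUserDirectory is not abstract and does not override
     * abstract method deactivate(String) in UserDirectory".
     */
    static final class InMemoryUserDirectory implements UserDirectory {
        private final Set<String> active = new HashSet<>();

        @Override
        public void signUp(String email) {
            active.add(email);
        }

        @Override
        public boolean isActive(String email) {
            return active.contains(email);
        }

        @Override
        public void deactivate(String email) {
            active.remove(email);
        }
    }

    public static void main(String[] args) {
        // Exercising the new operations from a caller, as a controller would.
        UserDirectory directory = new InMemoryUserDirectory();
        directory.signUp("ada@example.com");
        directory.deactivate("ada@example.com");
        System.out.println(directory.isActive("ada@example.com")); // prints "false"
    }
}
```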

By carefully sticking to this approach, I’ve been able to make substantial design changes to my server-side code, and then with the help of the compiler, change the implementation classes to remain conformant with the design. But more importantly, I’ve also been able to bring on new people and give them a great first place to look in order to figure out what the system does, and I’ve been able to provide them a sturdy framework for developing new features.