Software Engineering is the application of a systematic, disciplined, quantifiable approach to the design, development, operation, and maintenance of software.
We who make our living building and maintaining software are very much in demand, are paid well, and are expected to demonstrate the competence and skill of professionals. Sadly, not all of us are the professionals we could be. Learning programming languages does not, by itself, make one a software engineer. It is only a start. Writing code is a small part of creating, deploying, supporting, and maintaining software. As our profession transitions through the latest major paradigm shift — into a world of software systems whose data and components are distributed far and wide over high-speed networks — we must really be professional software engineers and designers, not just dabble at it. To accomplish this, we have to truly understand the choices and trade-offs necessary to create high-quality software and we must build upon the knowledge of the engineers and designers who came before us.
It is likely that by your standards, I’m an old man. After 5 years as an Army officer, I began my career developing commercial software in 1972, the same year Ritchie and Thompson began creating the C programming language at Bell Labs. At Brown I had studied to be a biologist, but the Vietnam War upset my plans. By a fortuitous accident, when I got out of the Army I was offered a job at a fledgling software company and soon discovered the joys of building things that run on computers. For me, that was as close to instant gratification as was legal in most states. Much to my amazement companies are still willing to pay me to do one of the things I enjoy most, thereby funding my other interests like wife and family, golf, and building and messing about in boats. Doing something for 47 years can teach you things that are hard to learn by any other means — things such as context and perspective.
Throughout that entire period, I’ve watched many of the same cycles in software technology repeat themselves over and over. One of the most common patterns is what Gartner calls the Hype Cycle for Emerging Technologies. The primary factor driving this cycle is that building good software requires highly skilled designers and developers — commodities that have always been in limited supply. Businesses are often eager to find the elusive silver bullet that will make building software easier.
The hype cycle postulates a period of marketing hyperbole and adoption of a great new thing — until reality asserts itself and makes it obvious that the new thing is not quite as magical as advertised. The “new thing” then falls out of favor and is either discarded entirely or languishes until a sufficient knowledge base for its successful use evolves. Many excellent methodologies and technologies have been abandoned because they were not as magical as claimed and required knowledge and skill to use effectively. The Hype Cycle isn’t really a cycle; it’s more of a curve, and it’s not particularly scientific or quantifiable — but it clearly describes the very human way that we and our employers often jump on a technical trend, inflated by marketing hype, and then have to learn the hard way how or whether it will work for us in the real world.
Most of us are eager to learn new and better ways of doing things. That is probably one of the reasons we are driven to build software in the first place. But not all new ideas stand the test of time and not all good ideas become widely understood or applied. I was motivated to write this piece after reading a number of online articles that were negative about methodologies, techniques, and programming languages that I believe are critical to meeting today’s software development challenges. I’m not opposed to criticism. Nothing is perfect and responding to criticism is one of the ways we can learn to make things better. What did concern me was that the criticisms were primarily anecdotal — displaying little or no engineering knowledge or rigor — and were supported only by logically flawed arguments.
There are probably many reasons why good software engineers can have wildly differing opinions about how to build software. I believe that one of the major factors is the context of our individual experience. Just as in the parable of the Blind Men and an Elephant, we humans have a tendency to claim absolute truth based upon our own limited, subjective experiences and ignore other people’s limited, subjective experiences which may be equally valid. In today’s software development world, it is not unusual for a software engineer or designer to spend months or even years touching only one small part of the elephant. So, it’s terribly easy to start thinking of the elephant as only a snake or spear. Computing and software technologies are changing so fast that this narrow perspective does not serve us well and the hard-earned lessons of yesterday are too often forgotten and disappear from our common knowledge base — leading to an almost continuous cycle of reinventing the wheel.
Software engineering is the application of a systematic, disciplined, quantifiable approach to the design, development, operation, and maintenance of software. It is a discipline like other disciplines such as mechanical, aeronautical, electrical, and civil engineering. It is built upon a body of knowledge and practices that extend back more than half a century and most of its fundamental principles are now well understood and quantifiable. The knowledge and application of those principles and practices are among the ways that software engineering differs from just programming.
Alan Kay is one of the truly great thinkers in our profession. Even today, his 1984 Scientific American article Computer Software is a must-read for serious software developers. This single article — painting a vivid picture of the power and potential of software — changed my whole professional perspective. It triggered my transition from programmer to software engineer and then to software architect. Inspired by Kay’s ideas, the development team I led in 1985 built a family of object-oriented applications that are still in commercial use today.
During my 47 years in software development there have been many paradigm shifts. When my career began we built programs, not software systems. All processing was essentially batch: a program read a sorted file of input transactions and a sequential master file, did its job, and output a new sequential master file. There were no databases, only sequential files. The first IBM mainframe computer I worked on had 32 KB (yes, kilobytes) of RAM. It took punched cards as input, read a sequential master file from one magnetic tape reel, and wrote a new, updated master file to another reel, all while printing a report on a noisy chain printer. More complex processing was achieved by stringing together a series of programs as a larger job. It’s pretty obvious that we’ve progressed through many paradigm shifts to get from there to the Internet, GUIs, relational and document databases, multi-threaded computing, local and wide area networks, mobile phones, and the myriad of other things that modern computing takes for granted.
We are currently in the midst of a new paradigm shift — the advent of the hybrid cloud — which Margaret Rouse describes as a cloud computing environment that uses a mix of on-premises, private cloud and one or more third-party, public cloud services with orchestration among platforms. The hybrid cloud enables us to build ultra-scalable software without sacrificing reliability, performance, and agility. To achieve this we need to move toward different patterns of software design and architecture.
The cloud can change everything. Most obviously, it can provide computing power, memory, and disk as a billable, network-accessible service, rather than solely a capital outlay. The lead time for physical data center provisioning, measured in months, can be reduced to hours, or even minutes for virtual data center provisioning. Not surprisingly, today, most corporate cloud implementations involve little more than moving existing data center applications (or new applications built like the old data center applications) and data out to places like AWS EC2 instances. To be fair, that can in itself be a daunting task, and it can provide the substantial benefit of better management of IT costs — but it also misses many of the competitive advantages cloud-native applications can provide over traditional application architectures.
Cloud-native applications are intended to exploit the automated deployment, scaling, reliability, and fail-over capabilities available with the hybrid cloud. Ultimately, the idea is that by increasing the isolation between software components, we can deliver discrete parts of a system both rapidly and independently, and by utilizing containers and container orchestration we can deliver a high degree of horizontal scalability and fault tolerance. The old patterns of application composition, deployable units, and methods of resource access do not promote the scalability and fault tolerance we need. This requires us to move from relatively monolithic applications to deployable and stateless containerized components. To create these new models, we need to agree on some acceptable constraints.
- Software components should be built as independent stateless services and deployed in containers.
- All business logic in a service should be encapsulated with the data upon which it acts.
- There should be no direct access to a database from outside a service. All access to a database should be accomplished by invoking a service implemented specifically for that purpose.
- Each service should publish an interface that enables access to its data and functionality by other services.
- To optimize performance, reliability, and scalability, services should be invoked through asynchronous (pub-sub) messaging — and when they cannot be, through synchronous (request-response) messaging.
- The models used to decompose system/application functionality into discrete services must help to manage, not increase, complexity.
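Constraints #2 through #4 can be sketched in a few lines of code. The example below is a hypothetical in-process illustration, not a real networked service: the `InventoryService` name, its methods, and the schema are all invented for the sketch. The point is the shape — the database belongs to the service, and everything else goes through the published interface.

```python
import sqlite3


class InventoryService:
    """Hypothetical service illustrating constraints #2-#4: business
    logic is encapsulated with its data, and the only path to the
    database is the service's published interface."""

    def __init__(self, db_path=":memory:"):
        # The database is private to this service (constraint #3);
        # no other component opens a connection to it.
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS stock (sku TEXT PRIMARY KEY, qty INTEGER)"
        )

    # Published interface (constraint #4): in a real system these
    # methods would be reached via messages or HTTP, never raw SQL.
    def add_stock(self, sku, qty):
        self._db.execute(
            "INSERT INTO stock (sku, qty) VALUES (?, ?) "
            "ON CONFLICT(sku) DO UPDATE SET qty = qty + excluded.qty",
            (sku, qty),
        )
        self._db.commit()

    def quantity(self, sku):
        row = self._db.execute(
            "SELECT qty FROM stock WHERE sku = ?", (sku,)
        ).fetchone()
        return row[0] if row else 0


svc = InventoryService()
svc.add_stock("widget-1", 5)
svc.add_stock("widget-1", 3)
```

Notice that a caller cannot bypass `add_stock` and write to the table directly — that isolation is what lets the service’s internals change without breaking its consumers.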
At first glance, these constraints might seem too limiting, keeping you from doing things in the easiest way — making things too difficult. In reality, all architectures require well-defined constraints if they are to result in practical, usable systems. Architecture refers to the fundamental structures of a system and the discipline of creating such structures and systems. Each structure is made of elements, relations among elements, and the properties of both elements and relations. Without effective constraints we end up with chaos. Constraints are simply a statement of the real-world boundaries within which a system must function.
As an example, let’s look at constraint #1. Why might we want a service to be stateless? That seems counter-intuitive. The simple answer is that things (programs and network connections) break, and container orchestration gives us a way to deal with that. You can specify that you want n instances of a particular service (container) running on a cluster. If the number of network-accessible instances falls below n, the container orchestrator (Kubernetes, for example) will start new instances until the number of active service instances is again n. This works seamlessly if containers start up in a second or less. If they do not, we start getting request timeouts and seamless goes out the window. Two container attributes help us achieve that one-second-or-less startup.
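The reconcile-to-n behavior described above can be sketched as a toy loop. This is a deliberate simplification — a real orchestrator like Kubernetes does this with controllers watching cluster state, health probes, and scheduling — but the core idea fits in a dozen lines. All names here are invented for the sketch.

```python
import random


def reconcile(running, desired_n):
    """One pass of a toy orchestration loop: discard unhealthy
    instances and start replacements until desired_n are running."""
    # Drop instances that have crashed or become unreachable.
    alive = [inst for inst in running if inst["healthy"]]
    # Start new instances until we are back at the desired count.
    while len(alive) < desired_n:
        alive.append({"id": f"svc-{random.randrange(1_000_000)}", "healthy": True})
    return alive


# Three instances desired; one has just failed.
instances = [
    {"id": "svc-1", "healthy": True},
    {"id": "svc-2", "healthy": False},
    {"id": "svc-3", "healthy": True},
]
instances = reconcile(instances, desired_n=3)
```

The loop never tries to repair or resume the failed instance — it simply replaces it, which is exactly why the replacement must be able to start fast and carry no state of its own.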
The first one is pretty easy. We need to keep a container small so that we minimize the computing resources required to start it. The second one is a little more difficult. If the container/service needs to duplicate the state of the container/service that it’s replacing, we have added a whole new layer of complexity. Where do we get access to the state of a failed container when we don’t even know where and what it was executing? We can’t and we don’t! When we say that a service must be stateless, we mean that everything it needs to do its job is either in the request message or in the database it accesses. Any state that persists between requests must be in a database. Processing a request on a service that’s been running for weeks and processing a request on a service just started in a new instance are identical.
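In code, statelessness looks like this: the handler’s result depends only on the request and the database, never on anything remembered from earlier calls. The sketch below is hypothetical — the handler name, the request fields, and the dict standing in for a database are all invented — but it makes the “weeks old or just started, identical behavior” point concrete.

```python
def handle_request(request, db):
    """A stateless handler: everything it needs is in the request or
    the database. Any instance, freshly started or long-running,
    computes the same answer for the same request and database."""
    user = db.get(request["user_id"], {"balance": 0})
    user["balance"] += request["amount"]
    db[request["user_id"]] = user  # persistent state lives only in the DB
    return {"user_id": request["user_id"], "balance": user["balance"]}


# Two different "instances" of the service share only the database.
db = {}
handle_request({"user_id": "u1", "amount": 10}, db)          # instance A
result = handle_request({"user_id": "u1", "amount": 5}, db)  # instance B
```

Because no state lives in the handler itself, the orchestrator can kill instance A and route the next request to a brand-new instance B with no hand-off at all.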
We don’t have enough space for an in-depth discussion of all the constraints mentioned above, but I’ve tried to expand on them in this subsequent article.
There are many new technologies on the market that are intended to help you exploit hybrid cloud technologies and a lot of money is being spent to hype them. Some are good, some are not, and most are a mix of both. Part of being a professional software engineer is the ability to analyze effectively and to think critically. That takes study and practice. Most of the engineering problems we face today have been, in some form, identified and solved over the last half century. If you know how to phrase the question, you can find almost anything on the Internet.
As we look at new technologies, one of the most important attributes to evaluate is how well they manage complexity. Most of the serious advances in software technology have been based, all or in part, on abstracting complexity while retaining the level of control necessary to do the job. Abstraction exists to hide the complexity we don’t need to see without impairing our ability to use what it hides effectively. The brave new world of containers and services means that we will be dealing with orders of magnitude more “things”. If we cannot manage them, they will eventually overwhelm us. Think of technical debt on steroids. It’s not just how fast we can build things, it’s also how well we can manage them.