
Creating reproducible snowflake servers

An introduction to Software Defined Infrastructure

Thomas Kraus
Sep 4, 2017 · 8 min read

I used to automate in order to create many identical servers. Now I automate to provision the exact same server again and again. Here’s what I learned on my journey towards using Software Defined Infrastructure.

Back in 2004 I co-founded a small company in software development and web hosting. For the web hosting, we started out with a few servers in a shared data center; in a shared 19" rack even. When it was time to purchase a 4th machine, we decided we should have our own 19" rack. We announced a maintenance window for the few websites and services we were running, and moved the machines to a dedicated rack nearby, still in a shared suite.

As our hosting business grew a little over time, we made some redundancy and connectivity improvements. For example, we added serial port console access to our servers to recover from firewall-related incidents, and we added some basic fail-over capabilities. Most of the configuration activities (operating system, networking, middleware, monitoring, backups) were performed manually, so over time all machines became unique snowflake servers. Only our customer-related operations were largely automated, i.e. scripted in Bash.

I used to prepare any new servers as much as possible at home before installing them in the datacenter. Some installation tasks were scripted, other tasks were fully manual. After sliding a machine into the 19" rack, I “only” needed to fine-tune the networking configuration via the console for 20 minutes or so.

How are things different now?

Nowadays, for many administrators, bare metal is no longer a concern at all. IT infrastructures, the composition of software, hardware, and network resources to run software applications, have evolved considerably over the last few decades. Not just from a hardware and performance perspective, but especially from a management (i.e. system administration) perspective.

Both the lead time and the number of mistakes in, for instance, setting up new servers or modifying the configuration of network equipment have greatly decreased thanks to automation. At the same time, the traceability and testability of your infrastructure changes have improved enormously.

Hold on, I don’t need all my servers to be identical…

True, but hand-crafted snowflake servers come with clear disadvantages:

  • Lack of automation, which impedes scaling
  • Poor testing abilities
  • Poor change control

To deal with these disadvantages, the classical infrastructure setup activities have become increasingly automated. As an intermediate step, some organizations (my own company as well) used virtualization technology for some years. Virtualization technology, e.g. VMware, Xen, or Microsoft Virtual Server, helped solve some challenges in resource utilization, but still left the management of these virtual resources a rather manual task.

Software Defined Infrastructure to the rescue

Software Defined Infrastructure (a.k.a. Infrastructure as Code) is a way of approaching network and system administration tasks from a software development perspective. Renting the resources instead of owning them (often referred to as Infrastructure as a Service, or IaaS), allows you to further optimize the activities that are more closely related to your business.

Benefits of Software Defined Infrastructure include time savings and improved reliability (reproducibility, traceability) due to, for instance, version control, code re-use, and automated testing. This enables system administrators to quickly define and launch (“provision”) new servers of various sizes. Provisioning is done either manually through web interfaces, or in an automated fashion by custom code or third-party tools that connect to the APIs of the infrastructure provider. Example (open source) provisioning tools include Vagrant, Terraform, and Ansible.
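To make the idea of API-driven provisioning concrete, here is a minimal Python sketch. The `FakeCloud` class and its methods are invented for illustration and stand in for a real provider API; tools like Terraform do essentially this (with far more sophistication) against actual cloud APIs:

```python
# Minimal sketch of API-driven provisioning. FakeCloud and its methods
# are invented for illustration; real tools talk to provider APIs.

class FakeCloud:
    """Stands in for a cloud provider's provisioning API."""
    def __init__(self):
        self.servers = {}

    def create_server(self, name, size):
        self.servers[name] = {"size": size, "state": "running"}
        return self.servers[name]

    def get_server(self, name):
        return self.servers.get(name)

def provision(cloud, spec):
    """Idempotently ensure each server in the spec exists."""
    for name, size in spec.items():
        if cloud.get_server(name) is None:
            cloud.create_server(name, size)

spec = {"web-1": "small", "db-1": "large"}
cloud = FakeCloud()
provision(cloud, spec)
provision(cloud, spec)  # a second run makes no further changes
```

The key property is idempotence: running the same provisioning code twice yields the same infrastructure, which is what makes the “exact same server again and again” reproducible.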

When the logical resources have been created and connected, something useful needs to happen with them, i.e. software applications need to be installed and configured. This is typically done by so-called configuration management tools. Popular open source examples include Puppet and Chef.
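The core idea behind these tools, declaring a desired state and converging towards it only when needed, can be sketched in a few lines of Python (the resource model here is invented for illustration; real tools manage files, packages, and services):

```python
# Sketch of the declarative, idempotent model behind configuration
# management tools: describe desired state, change only if needed.
# The "resource" here is a dict entry, invented for illustration.

def ensure_setting(config: dict, key: str, value: str) -> bool:
    """Ensure a config entry has the desired value.
    Returns True if a change was made, False if already converged."""
    if config.get(key) == value:
        return False  # desired state already reached, do nothing
    config[key] = value
    return True

server_config = {}
changed_first = ensure_setting(server_config, "ntp_server", "pool.ntp.org")
changed_second = ensure_setting(server_config, "ntp_server", "pool.ntp.org")
# changed_first is True, changed_second is False: re-running is safe
```

Because every run converges to the same declared state, configuration drift (the root cause of snowflake servers) is continuously corrected.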

[Figure: a conceptual view of a Software Defined Infrastructure, with example technologies]
What does a Software Defined Infrastructure look like?

Although implementations, tools, platforms, and providers may vary, let’s use the picture above as a shared conceptual view of what an SDI can look like. Note that the technologies mentioned are examples.

With the software-defined approach, your entire compute, storage, and networking infrastructure, plus the software applications running on top, is expressed as code and configuration. Stored in a repository, typically version-controlled, your infrastructure is testable with automated tooling. Executing this code multiple times (with some variation in parameter values) can quickly provision and configure several environments (let’s call them Test, Acceptance, and Production) that only differ in terms of the parameter variance you specified. For example: less CPU/memory/storage in Test, different IP ranges and firewall settings, different server login authorizations, but otherwise identical.
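As a sketch of this parameter variance, assuming hypothetical names and values, a single base definition plus per-environment overrides might look like this in Python:

```python
# Sketch: one parameterized definition, multiple environments that
# differ only in the values you choose to vary. All names and values
# below are illustrative.

BASE = {"cpu": 4, "memory_gb": 16, "ip_range": "10.0.0.0/24"}

OVERRIDES = {
    "test":       {"cpu": 1, "memory_gb": 2, "ip_range": "10.1.0.0/24"},
    "acceptance": {"ip_range": "10.2.0.0/24"},
    "production": {},  # uses the base values as-is
}

def environment(name: str) -> dict:
    """Merge per-environment overrides onto the shared base definition."""
    return {**BASE, **OVERRIDES[name]}
```

Everything not explicitly overridden is guaranteed identical across Test, Acceptance, and Production, which is exactly the reproducibility the approach promises.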

Great, but aren’t virtual machines as a platform a thing of the previous decade?

In the recent past additional services have emerged that all leverage the benefits provided by flexible infrastructure that is maintained as a software project. Platform as a Service (PaaS) allows for deploying software, configuration, or content at an abstraction level higher than the operating system. PaaS examples include AWS Elastic Beanstalk, Heroku, or the Facebook App Development platform. Software as a Service (SaaS), in turn, is about making complete, fully functional software packages available to customers as a service. SaaS examples include Google Docs, Salesforce CRM, or Dropbox. You may have heard the terms Functions as a Service or Backend as a Service as well, more on those in a bit.

Another example of increasing the level of abstraction is the rise of container technology, which was made easily accessible to many system administrators and even developers by the open source project (and company) Docker, in 2013. Although not the sole implementation of container technology, Docker allows you to package a software system with all its run-time dependencies in a so-called Docker image. This Docker image can be deployed (that is, a container with running processes gets instantiated) on a server or a platform prepared specifically for running containers. This server or platform does not need to have any prerequisites (e.g. software libraries, tools, middleware) installed, since containers are, well, self-contained.
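As a sketch of what such packaging looks like, here is a minimal, hypothetical Dockerfile (the file names and base image are illustrative) that bakes a Python application and its dependencies into one image:

```dockerfile
# Hypothetical example of packaging an app with its run-time dependencies.
# Start from a base image that already contains the Python runtime.
FROM python:3.11-slim
WORKDIR /app
# Bake the dependencies into the image itself.
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
# The process to start when a container is instantiated.
CMD ["python", "app.py"]
```

Anyone (or any platform) with this image can instantiate an identical container, without pre-installing libraries, tools, or middleware on the host.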

Furthermore, containers provide isolation (at process and filesystem level) within a shared operating system. This has clear benefits: whereas before you could run, say, 5 virtual servers on a physical server, now you can run dozens of containers on a single server. It has become irrelevant on which server they are actually running, because regardless of where a container gets instantiated it will run just fine and does not interfere with other containers. Obviously, when containers manage persistent data or state, this needs to be handled properly across the container’s lifecycle. That is, when a container gets renewed or deployed onto a different server, the data or state should remain accessible.

And I hear we can do computing even without servers nowadays?

With serverless computing, the setup consists of small functions that are invoked by a trigger (e.g. an HTTP request). They perform some computation or look something up in a database or remote service, and return a result. A typical use case for a function is a chat bot implementation: it receives an event (a message in your favorite chat application), looks up or computes some result, and posts this back as a chat message. The functionality may be generic, e.g. current traffic information for your route home, or the local weather forecast. It may also implement specific business functionality, e.g. compiling a report on demand with information from a database.
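A minimal function handler along these lines might look as follows in Python; the event shape and handler signature are invented for illustration, as each FaaS provider defines its own:

```python
# Sketch of a function-as-a-service handler: invoked once per event,
# stateless, returns a result. The event shape is invented for
# illustration; real providers define their own handler signatures.

def handler(event: dict) -> dict:
    """Respond to a chat message event with a small computed reply."""
    message = event.get("text", "")
    if message.startswith("weather"):
        # In a real function this would call out to a weather API.
        reply = "Sunny, 21 degrees"
    else:
        reply = "You said: " + message
    return {"statusCode": 200, "body": reply}

response = handler({"text": "weather in Amsterdam"})
```

Note that the function keeps no state between invocations; anything persistent has to live in external storage, as discussed below.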

For mobile apps that require processing or persistent state, (M)BaaS, short for (Mobile) Backend as a Service, is a popular way to implement lightweight backends. Although functions themselves don’t maintain state and only run briefly, they can communicate with persistent storage to store or look up data, provide social media integration, and facilitate push notifications. New versions of the backend are instantly available to all mobile app users after deployment, which can be an advantage over having to wait until all users have updated the app on their devices.

All major cloud providers offer Functions as a Service: AWS calls it Lambda, Microsoft calls it Azure Functions, and Google offers this as Cloud Functions. Beware of the pricing models, though. For small numbers of requests it’s either free or very cheap. When scaling up it can get expensive, and the total cost of ownership calculation becomes complex: besides the function itself, you typically need an API gateway (paid per request), some persistent storage, and perhaps a message queue; on the other hand, you save the effort of managing resources at the machine or container level. Some more thoughts on cost comparison, specific to AWS, can be found here.
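As a back-of-the-envelope sketch of such a calculation, the Python below adds up per-invocation, compute, and gateway charges. All rates are placeholders for illustration only, not current prices; check your provider’s pricing page:

```python
# Back-of-the-envelope FaaS cost sketch. The rates below are
# placeholders for illustration; real prices vary per provider
# and change over time.

PRICE_PER_MILLION_REQUESTS = 0.20   # placeholder rate, USD
PRICE_PER_GB_SECOND = 0.0000167     # placeholder rate, USD
API_GATEWAY_PER_MILLION = 3.50      # placeholder rate, USD

def monthly_cost(requests, avg_duration_s, memory_gb):
    """Sum the three main cost components of a FaaS deployment."""
    compute = requests * avg_duration_s * memory_gb * PRICE_PER_GB_SECOND
    invocations = requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    gateway = requests / 1_000_000 * API_GATEWAY_PER_MILLION
    return compute + invocations + gateway

# Example: 10M requests/month, 200 ms each, 128 MB of memory.
cost = monthly_cost(10_000_000, 0.2, 0.128)
```

Even in this toy model, the gateway charge can dominate the function’s own compute cost, which illustrates why comparing FaaS against servers or containers is not straightforward.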


Want to learn more?

The website SDxCentral (Software Defined Everything) contains news, research and white-papers on various infrastructure topics.

A fun and interesting introduction to serverless computing was presented by James Thomas at the Codemotion Amsterdam 2017 conference, find slides here and code examples here.

Along with the increase of abstraction and automation, security considerations have become more important. In a follow-up article, we’ll see how to address and improve your SDI security.

Thanks to Evelyn van Kelle and various other colleagues at the Software Improvement Group for providing valuable input and feedback.

Software Improvement Group

Getting Software Right for a healthier digital world…

Thanks to Jeroen Heijmans, Evelyn van Kelle, and Joost Visser

Thomas Kraus

Written by

Software Defined Infrastructure expert at Software Improvement Group.
