Containers are a discrete packaging format that helps us quickly and quietly move applications from one place to another. But what’s so clever about that?

We believe that combining containers and orchestrators with metadata will deliver the ultimate data centre score: more security, more resilience, more energy efficiency and, ultimately, much lower cost. We’ll also argue that convention-driven metadata is vital to pulling this off, yet it is often overlooked.

Firstly, let’s consider why containers, orchestrators and metadata are being used now.

Moving between environments without setting off alarms

Containers stash everything an application needs in a neat package that can be moved between dev, test & prod without setting off incompatibility errors.

This may be the most common Docker use-case, but it’s not the only thing containers can do.

Walking away with millions

AWS “achieves ~65% server utilization rates versus ~15% on-prem.”
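
To see what that gap means in hardware terms, here is a rough back-of-the-envelope sketch. The workload size and the per-server capacity are made-up numbers for illustration; only the two utilisation rates come from the quote above.

```python
import math

def servers_needed(workload_units: float, capacity_per_server: float, utilisation: float) -> int:
    """Servers required to carry a workload at a given average utilisation."""
    return math.ceil(workload_units / (capacity_per_server * utilisation))

# Hypothetical workload: 1000 units of compute, each server can deliver 1 unit.
on_prem = servers_needed(1000, 1.0, 0.15)   # ~15% utilisation
cloud = servers_needed(1000, 1.0, 0.65)     # ~65% utilisation

ratio = on_prem / cloud
print(on_prem, cloud, round(ratio, 1))
```

The ratio depends only on the two utilisation rates (0.65 / 0.15), not on the workload size picked here.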

In other words, a cloud provider like AWS could host the same workload as an on-prem data centre on roughly a quarter of the hardware (65 / 15 ≈ 4.3) and spend correspondingly less doing it. How do they get away with that?

Smart orchestration.

Aside: AWS are currently masters of VM orchestration, Google of container orchestration — but AWS want to persuade us to move our workloads to Lambda (container-style orchestration). Why are they so keen?

What do I mean by smart orchestration? For AWS it’s mostly clever over-subscription. If your non-dedicated VM isn’t using all the resources available to it then AWS’s orchestration will try to ensure there’s another VM on the same physical machine that will use those resources (potentially, your noisy neighbours). That’s because it’s very wasteful to power a machine up and then not use all the available CPU and memory. The more of the available resources AWS can use (higher utilisation or server density) the more $$$ they make and the more energy efficient they are.
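
A toy sketch of the idea, not a description of how AWS actually schedules: the same first-fit-decreasing packing routine is run twice, once against what each VM has reserved and once against what it typically uses. The `VM` fields, the host size and the usage figures are all assumptions made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class VM:
    name: str
    reserved_cpu: float  # vCPUs the customer reserved
    typical_cpu: float   # vCPUs the VM typically uses (hypothetical measurement)

def pack(vms, host_cpu, demand):
    """First-fit-decreasing packing. Returns a list of hosts, each a list of VMs."""
    hosts = []  # each entry is [free_cpu, [vms placed on this host]]
    for vm in sorted(vms, key=demand, reverse=True):
        for host in hosts:
            if host[0] >= demand(vm):
                host[0] -= demand(vm)
                host[1].append(vm)
                break
        else:
            # No existing host has room, so power up another one.
            hosts.append([host_cpu - demand(vm), [vm]])
    return [placed for _, placed in hosts]

vms = [VM(f"vm{i}", reserved_cpu=4.0, typical_cpu=1.0) for i in range(8)]

by_reservation = pack(vms, host_cpu=16.0, demand=lambda v: v.reserved_cpu)
oversubscribed = pack(vms, host_cpu=16.0, demand=lambda v: v.typical_cpu)

print(len(by_reservation), len(oversubscribed))
```

Packing by typical usage needs fewer powered-up hosts for the same VMs, which is the utilisation gain. The catch is the noisy-neighbour risk: if every VM on the shared host spikes to its reservation at once, there is not enough CPU to go round.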

Google do the same clever stuff as AWS but with containerised workloads. Containers can be orchestrated even more effectively than VMs because containers are generally smaller and start and stop several orders of magnitude more quickly.

However, both AWS and Google are limited in how clever they can make their orchestration without good information about what they can safely do with any given VM or container. Is it safe to just switch it off for a while or move it to another location? Could they reduce the resources available to it? Additional information leads to increased utilisation.

For example, for their own internal services Google achieve utilisation above 70% because they use containerisation AND orchestration AND they have lots of metadata (which applications are user-facing, which are time-sensitive and which have a high restart cost, for example). They then intelligently juggle the containerised applications running on their servers, in real time, to optimise both utilisation and behavioural targets.
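
As a very simplified sketch of how that kind of metadata could drive a decision (this is not Google’s scheduler; the field names, the units and the policy are assumptions for illustration), an orchestrator that needs to free up CPU might only consider containers that are neither user-facing nor time-sensitive, and pause the cheapest-to-restart ones first:

```python
from dataclasses import dataclass

@dataclass
class Container:
    name: str
    cpu: float
    user_facing: bool
    time_sensitive: bool
    restart_cost: float  # hypothetical unit: seconds of work lost on restart

def choose_to_pause(containers, cpu_needed):
    """Pick containers to pause until cpu_needed is freed. Returns (chosen, freed_cpu)."""
    # Metadata tells us which containers are safe to touch at all.
    candidates = [c for c in containers if not c.user_facing and not c.time_sensitive]
    # Among those, prefer the ones that are cheapest to restart later.
    candidates.sort(key=lambda c: c.restart_cost)
    chosen, freed = [], 0.0
    for c in candidates:
        if freed >= cpu_needed:
            break
        chosen.append(c)
        freed += c.cpu
    return chosen, freed

containers = [
    Container("web", cpu=2.0, user_facing=True, time_sensitive=True, restart_cost=5),
    Container("batch-report", cpu=3.0, user_facing=False, time_sensitive=False, restart_cost=1),
    Container("ml-training", cpu=4.0, user_facing=False, time_sensitive=False, restart_cost=600),
    Container("billing-run", cpu=2.0, user_facing=False, time_sensitive=True, restart_cost=30),
]

chosen, freed = choose_to_pause(containers, cpu_needed=3.0)
print([c.name for c in chosen], freed)
```

Without the three metadata fields the orchestrator would have no safe basis for choosing between these containers, which is the point of the paragraph above.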

Metadata is vital to this high utilisation, so cloud providers are now subtly persuading us to give them more behavioural metadata about our applications or to adopt their pre-defined behavioural models (i.e. implicit metadata).

For example, Azure Service Fabric asks for a lot of information about how your application should behave on the platform. Services like Lambda impose strong behavioural constraints (i.e. implicit metadata: your application is stateless and short-lived). Choosing an AWS throttled instance like a T2 also divulges a lot about your application’s behaviour (implicit metadata: low use with occasional spikes).

Every additional piece of behavioural metadata provided or adopted helps the cloud provider be smarter in their over-subscription, get better resource utilisation and make more money. That’s good for them (more profits, more competitive pricing) but it’s also good for you (cheaper, greener compute). The more granular, detailed and accurate the metadata we can provide about an application, the more sustainable and cheap our systems should become.

So, the big cloud providers generate huge hosting cost savings by combining defined image formats (containers or VMs), orchestrators and metadata.

Speculative aside — I suspect that AWS have realised they can get very high utilisation (low cost) out of the Lambda behavioural model, which is why they are pushing it so hard. I would guess that requires huge scale or very clever scheduling, or both, so there’s also a barrier to entry, making it especially attractive to them. Of course, that would give us very cheap, eco-friendly computation so I’m not going to complain.

Foiled again

Finally, what about those villains who want to exploit vulnerabilities on your production servers and steal your $$$?

Actually, that’s another example of where containers + orchestrators + metadata could help.

Immutable containers + orchestrators can reduce vulnerabilities by helping ensure software only gets deployed to prod in a controlled way.

Being secure could include:

  • Regular security scanning of your registry images (already supported by Docker Hub and Quay, for example).
  • Tools that enforce your auditable deployment processes (i.e. everything is deployed via the orchestrator from secure locations and logged).
  • Tools that combine run-time metadata from your orchestrator with security scan metadata to close security holes in production by identifying and replacing vulnerable containers (Twistlock works along these lines, I believe).

This security approach requires immutable containers, orchestrators and metadata produced by scanning tools and orchestrators, and consumed by enforcement tools.
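
A minimal sketch of the “consume” side, assuming the orchestrator can list running containers with their image digests and the scanner can report findings per digest. The data shapes, digests and finding IDs below are invented for illustration and do not reflect any particular tool’s API.

```python
def triage(running, scan_results):
    """Join run-time metadata with scan metadata on image digest."""
    report = {"vulnerable": [], "unscanned": [], "clean": []}
    for container in running:
        findings = scan_results.get(container["image"])
        if findings is None:
            # The scanner has never seen this image: treat as unknown, not safe.
            report["unscanned"].append(container["id"])
        elif findings:
            report["vulnerable"].append(container["id"])
        else:
            report["clean"].append(container["id"])
    return report

# Hypothetical run-time metadata from an orchestrator.
running = [
    {"id": "web-1", "image": "sha256:aaa", "host": "node-a"},
    {"id": "web-2", "image": "sha256:aaa", "host": "node-b"},
    {"id": "worker-1", "image": "sha256:bbb", "host": "node-a"},
    {"id": "cron-1", "image": "sha256:ccc", "host": "node-c"},
]

# Hypothetical scan metadata from a registry scanner, keyed by image digest.
scan_results = {"sha256:aaa": ["EXAMPLE-VULN-1"], "sha256:bbb": []}

report = triage(running, scan_results)
print(report)
```

An enforcement tool could then ask the orchestrator to replace everything in the vulnerable list with a patched image, and flag the unscanned list for review. This only works if containers are immutable, so that the digest really does identify what is running.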

So What’s the Plan?

Most of the example scenarios above need four things:

  1. A well-defined, self-contained packaging format for images.
  2. Orchestrators that deploy, control and track those packaged images in production.
  3. Tools that generate build and run-time metadata about your image (build-date, scan status, state, location, behavioural expectations, priorities etc. — where some of this data may be implicitly imposed by behavioural constraints).
  4. Other tools that consume this metadata and act on it via the orchestrator (starting, stopping, moving, checking, logging, alerting or assigning resource, for example).

With the rapid adoption of Docker, the new OCI container standard and the growing popularity of orchestrators like Kubernetes, we’re now making good progress on 1 and 2. However, it remains early days for metadata. To fully leverage an orchestrated world we need to promote metadata conventions like the label-schema project, so that tools, particularly open source ones, produce metadata that other tools can consume rather than metadata that is tool-specific and limited.
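
To make the value of a shared convention concrete, here is a small sketch of a consuming tool that reads label-schema style labels from an image’s label set. The `org.label-schema.` prefix and the key names follow the label-schema convention as I understand it; the choice of which keys to expect, and all the label values, are assumptions for illustration.

```python
PREFIX = "org.label-schema."

# An arbitrary set of keys this hypothetical tool cares about.
EXPECTED = ("schema-version", "build-date", "vcs-url", "vcs-ref")

def label_schema_metadata(labels):
    """Return only the label-schema labels, with the namespace prefix stripped."""
    return {k[len(PREFIX):]: v for k, v in labels.items() if k.startswith(PREFIX)}

def missing_keys(labels, expected=EXPECTED):
    """List the expected label-schema keys that the image does not provide."""
    found = label_schema_metadata(labels)
    return [key for key in expected if key not in found]

# Hypothetical labels, as they might be read from an image's configuration.
labels = {
    "org.label-schema.schema-version": "1.0",
    "org.label-schema.build-date": "2016-06-01T12:00:00Z",
    "org.label-schema.vcs-url": "https://example.com/acme/app.git",
    "maintainer": "someone@example.com",
}

metadata = label_schema_metadata(labels)
missing = missing_keys(labels)
print(metadata, missing)
```

Because the keys are agreed by convention rather than invented per tool, any other tool can read the same labels without knowing which tool wrote them.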
