Fully Embracing Your Cloud Platform

Ryan Brown
Iress

Apr 4, 2019 · 7 min read

I’ve been working at IRESS now for almost six months. In that time I’ve been lucky enough to play a part in laying the foundations for our future cloud solutions, including delivering one of our first cloud applications. As part of that, there have been the expected challenges and headaches: environment teething issues, unknown integration challenges, and a rate of change so rapid that by the time we have determined an answer, the questions have changed. During this journey, a particular statement, which I believe stems from a misconception, was repeated on a regular basis. I want to address it here and offer my perspective.

When talking about architecture options, I regularly hear something along the lines of:

“We want to be cloud platform agnostic, so let’s not use that feature.”

To me, this sets a dangerous precedent that could risk destabilising our journey to the cloud, or at the very least be sub-optimal. Let me explain.

We want to be cloud platform agnostic: This is, in and of itself, a totally reasonable and worthwhile aspiration. Given the ever-changing technical climate, it is important to choose the right platform for the job. However, it shouldn’t be a hard line. Expecting teams to architect and develop systems that will simply port from one cloud platform to another, with minimal rework or fuss, while also delivering on the promises of cloud solutions such as high availability and low maintenance costs, is an unfair expectation.

Why? Well, that leads me to the second part of the quote: let’s not use that feature. This is normally said in response to a recommendation to use a platform-specific service such as RDS Aurora, Kinesis or SQS. The problem with that response is that achieving high availability requires running more than one instance of a given service, and either sharing traffic between those instances (clustering or load balancing) or re-routing traffic to a backup instance in the event the primary is unavailable (failover).

If this is done in a non-platform-specific way, we are forced to build these individual services ourselves as virtual machines in the cloud and manage the clustering and failover manually, as well as perform all maintenance tasks: security patches, backups, monitoring, and reacting to disk and RAM usage. This approach might offer high availability, but at the cost of increased maintenance, diminishing the benefits of migrating to the cloud in the first place.

So, does that mean that all is lost? Is it possible to be cloud platform agnostic and still deliver on the benefits of cloud computing?

In short, yes, but you need to temper your expectations of cloud portability. There are steps you can take to keep the impact low, but there is always going to be non-trivial effort in making the move. To leverage the full potential of cloud computing, you really need to fully embrace your cloud platform of choice. To do that, while minimising the cost of porting to another cloud provider, here are my recommendations:

1. Use a cloud agnostic approach to infrastructure as code (IAC)

It is really important to describe and create a solution’s infrastructure in a way that supports multiple cloud platforms. Take AWS CloudFormation: it is specific to AWS, so it works fine there, but defining an equivalent environment in Google Cloud Platform requires a completely different configuration format and a completely different runtime to execute it. Basically, a platform-specific IAC solution works for just that platform, and porting to another platform is the equivalent of starting all over again.

Thankfully, at IRESS we are already using an IAC solution that supports multiple cloud platforms: Terraform. Terraform won’t let you describe a solution once and deploy it to different cloud platforms; rather, it provides the means to configure each platform using the same toolset, so the infrastructure still has to be defined once per platform. Your pipelines, toolset and scripting shouldn’t be affected, but the IAC configuration will need a fair amount of rework when it comes to porting to a different cloud platform.

2. Containers, always use containers

Ahhh, containers. These are so important when it comes to cloud development: they give you the ability to pivot in so many ways in how you deploy your solutions to the cloud. With containers you can choose to deploy onto your own client-managed virtual machines or onto fully platform-managed solutions, and you can pivot across these options with minimal rework. More importantly, containers are now a widely supported artefact on cloud platforms and can be ported from one platform to another.

For instance, a Docker container can be deployed to AWS via standard Amazon Linux EC2 instances. Or you can opt for their Docker-optimised ECS EC2 instances, or use their fully managed ECS launch type, Fargate. On Google Cloud Platform, the same Docker containers can be deployed using the Docker support in Compute Engine, or via the managed Kubernetes Engine. Azure has wide support for Docker containers as well, with managed Kubernetes clusters, Docker support in App Service, and fully managed, burst-optimised container jobs via its Batch service.

Serverless cloud solutions are a different matter, though. In fact, containers and serverless are often considered two competing strategies. But one of the biggest challenges in serverless development is developing and testing your solution without needing to run it in the cloud. Relying on the cloud platform for development and testing causes problems, including feedback-loop delays, and any network or internet outage becomes downtime for your delivery team. To mitigate this, you need to be able to develop, run and test your serverless solutions locally. How to do this in a simple, repeatable manner? Containers! It is easy to use solutions such as the GCP emulators, the Azure Functions runtime or LocalStack for AWS to run serverless solutions locally in containers. However, these runtimes don’t excuse poorly structured code, which leads me to the final point…
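To make that local/cloud switch painless, the endpoint your code talks to should be configurable rather than hard-wired. Here is a minimal sketch of that idea: the helper name `client_kwargs` and the `LOCALSTACK_URL` environment variable are assumptions for this example, not part of any SDK.

```python
import os

# Hypothetical helper: build boto3 client keyword arguments so the same code
# talks to a local LocalStack container during development and to real AWS in
# production. LOCALSTACK_URL is an assumed variable name for this sketch.
def client_kwargs(service_name: str) -> dict:
    kwargs = {"service_name": service_name}
    endpoint = os.environ.get("LOCALSTACK_URL")  # e.g. http://localhost:4566
    if endpoint:
        # Point the SDK at the locally running container instead of AWS.
        kwargs["endpoint_url"] = endpoint
        kwargs["region_name"] = "us-east-1"  # LocalStack accepts any region
    return kwargs

# Usage (boto3 call commented out so the sketch stays dependency-free):
# import boto3
# dynamodb = boto3.resource(**client_kwargs("dynamodb"))
```

When `LOCALSTACK_URL` is unset, the defaults kick in and the same code runs unchanged against the real platform.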

3. Structure your code so that cloud-specific integrations don’t commingle with your logic

This is just good coding practice. Your integrations should be separated and isolated from your business logic as much as possible. In fact, you should structure your code to ensure, as far as possible, that the business logic is not tied to a particular platform. This should be the default recommendation. Of course, there can be exceptions to the rule, such as performance considerations, but they should be treated as exceptions, documented, and agreed upon by the developers and maintainers. This enables changing integrations without risking changes to, and regressions in, your business logic.

For example, an integration could be a MySQL database, but you’ve decided to move to PostgreSQL for its advanced query optimisation. If the database integration is spread throughout your business logic, then you will have to make many more changes and risk a higher rate of regression. The same can be said for cloud integrations, like the platform-specific object/data stores (AWS DynamoDB, GCP Cloud Datastore, Azure Table Storage), but also for things like queues and platform service APIs. Isolating them allows you to migrate from one cloud platform to another when the decision is made, minimising the impact and risk to the codebase and existing functionality.
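One common way to keep that isolation is to put persistence behind a narrow interface so swapping MySQL for PostgreSQL (or DynamoDB) touches a single class. A minimal sketch, with hypothetical names (`OrderStore`, `apply_discount`) invented for illustration:

```python
from abc import ABC, abstractmethod

# The business logic depends only on this narrow interface; a MySQLOrderStore
# or PostgresOrderStore would each implement it without the rules changing.
class OrderStore(ABC):
    @abstractmethod
    def save(self, order_id: str, total: float) -> None: ...

    @abstractmethod
    def get_total(self, order_id: str) -> float: ...

def apply_discount(store: OrderStore, order_id: str, percent: float) -> float:
    # Pure business rule: no SQL, no SDK calls, no connection strings.
    discounted = store.get_total(order_id) * (1 - percent / 100)
    store.save(order_id, discounted)
    return discounted

# An in-memory implementation is enough to unit test the rule above.
class InMemoryOrderStore(OrderStore):
    def __init__(self):
        self._totals = {}

    def save(self, order_id, total):
        self._totals[order_id] = total

    def get_total(self, order_id):
        return self._totals[order_id]
```

The migration cost is now concentrated in one implementation class instead of being smeared across every call site.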

Serverless functions such as lambdas should be treated the same way. It is really tempting (and easy) to ignore this with lambdas as they tend to be small, focused functions. Consider the following AWS lambda:
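The gist originally embedded here isn’t reproduced, so the following is a hypothetical stand-in showing the same anti-pattern: parsing, validation, business rules and the DynamoDB call all tangled in one function (the `ORDERS_TABLE` variable and field names are invented for the sketch).

```python
import json
import os

def handler(event, context):
    # Everything lives in one function: request parsing, validation,
    # business rules and the AWS-specific integration are all mixed together.
    body = event.get("body")
    if body is None:
        return {"statusCode": 400, "body": json.dumps({"error": "missing body"})}
    order = json.loads(body)

    # Business rule sitting directly beside platform-specific code.
    total = sum(item["price"] * item["qty"] for item in order["items"])

    # DynamoDB integration inline (boto3 imported here only so the sketch
    # stays importable without the AWS SDK installed).
    import boto3
    table = boto3.resource("dynamodb").Table(os.environ["ORDERS_TABLE"])
    table.put_item(Item={"orderId": order["orderId"], "total": str(total)})

    return {"statusCode": 200, "body": json.dumps({"total": total})}
```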

In the case above, everything is in the same file (as is often the case in online example lambdas). As a result, moving this to another cloud platform is a much riskier task, as it requires changing every file. Updates are harder to follow because it’s not clear which parts of the code are business logic and which are integration-specific. All of this makes it harder to update, harder to review, and increases the chances of introducing a bug.

In the example below, the business logic is separated from the Lambda-specific entry point and the DynamoDB integration. This allows us to change to another provider by replacing only the entry point and datastore integration. One trade-off is the need to update your references to the datastore integration, but this can be mitigated by refactoring further or using inversion-of-control techniques. An added bonus is that the business logic can be tested as a separate unit without needing to invoke the Lambda entry point, making automated testing a lot more manageable.

To close, I want to reiterate that I do believe in the value of being cloud platform agnostic. While it is unlikely that we will see the demise of AWS anytime soon, there are many other factors, such as cost and support, that could drive a pivot to another cloud platform. However, I don’t believe this should come at the cost of designing and developing sub-optimal solutions that fail to fully benefit from the advantages of a cloud platform. Awareness of what is available, good practices, and acceptance that cloud platform migration is not a trivial task will help to minimise the impact of migrating while utilising the best solutions for the chosen platform.

While I hope this has been informative, I’m happy to discuss (or debate) further so feel free to leave any comments, criticisms or more ideas on how to better embrace our cloud platforms below.


Technical Leader in Melbourne, Australia. I believe in growing energetic and collaborative teams focused on solutions and discovery.