Cloud and Enterprise IT

The actual difference and what we need to do about it

Martin Hargreaves
Future and Emerging Technology
Jun 16, 2013


After talking about it for many months, my company bit the bullet and started using AWS for various things in Feb 2012. Since then we’ve been experimenting with it and learning, by trial and error, how to integrate Cloud with commercial IT.

I’ve also been talking to all sorts of people about cloud for the last couple of years, most recently CIOs, but also line of business senior managers, IT folks, vendors and their evangelists, plus some devops folks.

The main comment from potential users is that nobody has any useful advice for them, only presales pitches. I’ll add my experiences and opinions here, based on our experimentation.

There are two communities in the conversation so far: startups running devops style, and older IT shops. The latter are prone to believing vendor pitches that a Private Cloud in a box means they have all the important benefits of cloud. My view is that the IT department is colluding with the vendors, both defending their turf from the new model. In many cases they’ll be successful, probably to the detriment of their organisations.

Where this isn’t the case, there’s confusion about trying to run Enterprise IT as normal in public cloud environments. We want the cost and agility advantage of Cloud, but without any change costs to muck about with our workloads, just like virtualisation gave us.

Cloud is much more problematic than this. It’s harder to do right, but the benefits are larger too for those that make it. Here’s why.

Trust boundaries are different

This is outsourcing key functions to relatively faceless large companies, often US companies (which is definitely a worry in Europe). They have low SLAs and standard products that we can’t vary.

From a designer’s point of view, we’re deploying the same apps into hostile networks, on hostile storage, and we have a completely different set of building blocks and variables, so hardly any of our design patterns are still fully valid.

From an operations and risk point of view, we’re being asked to trust a third party with some critical responsibilities and we need to integrate them into our support and delivery processes.

Commercial impact is different

We can only really gain serious financial advantage by using automated capacity planning. It can be a lot of work to get our servers to the point where we can turn them off at night and on again in the morning without worrying. Even more work is needed to autoscale server farms with demand.
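To put rough numbers on the night-time shutdown point, here is a back-of-envelope sketch. The server count, hourly rate and working hours are made-up assumptions for illustration, not real AWS prices or our own figures.

```python
# Back-of-envelope comparison: always-on servers vs. servers switched off
# outside working hours. All figures are illustrative assumptions.

HOURS_PER_WEEK = 24 * 7


def weekly_cost(servers: int, hourly_rate: float, hours_on: float) -> float:
    """Cost of running `servers` instances for `hours_on` hours in one week."""
    return servers * hourly_rate * hours_on


def saving_fraction(hours_on: float) -> float:
    """Fraction of the always-on bill saved by running only `hours_on` hours."""
    return 1 - hours_on / HOURS_PER_WEEK


if __name__ == "__main__":
    servers = 20            # hypothetical dev/test estate
    hourly_rate = 0.10      # hypothetical price per server-hour
    office_hours = 12 * 5   # on for 12 hours a day, weekdays only

    always_on = weekly_cost(servers, hourly_rate, HOURS_PER_WEEK)
    scheduled = weekly_cost(servers, hourly_rate, office_hours)
    print(f"always on: {always_on:.2f} per week")
    print(f"scheduled: {scheduled:.2f} per week")
    print(f"saving:    {saving_fraction(office_hours):.0%}")
```

The saving depends only on the fraction of the week the servers are off, which is why getting clean, unattended shutdown and startup working matters more than haggling over the hourly rate.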

We only gain agility if we can use the Cloud as an excuse to create a fast path through our usual bureaucracy. We could provision things fast internally if we didn’t have twenty teams all wanting forms filling in, timesheet codes and incident tickets.

Technology is different

Virtualisation added a layer or five to the technology stack. Public cloud, especially Amazon, is really a different stack altogether: the CLI has about 600 or so new commands to do things that don’t always have any parallel in other systems.

Technologists very easily think that new things are like things they know. In this case, VMware and Unix people generally understand AWS and other cloud systems as similar to the stuff they do now. 18 months of hands-on experience has taught us they’re not.

Enterprise IT folks that know all about cloud without having implemented on it are very similar to Mainframe folks that know all about Unix without having tried it.

There are very few staff experienced with Cloud implementations and design patterns working in Enterprise IT. Partly because they don’t have to.

It’s hard to reap the advantages with old applications

You can use IaaS as a big off-premise VMware farm. Most enterprise IT Cloud users seem to do this, and it’s an OK first step. Cloud has several drawbacks here though:

The app’s servers are all on all the time, increasing costs.

The quoted infrastructure SLA that the app relies on to give its own SLA is lower than internal.

The security perimeters the apps rely on for defence are outside the company.

Vendors run TCO figures against this and tell you they’re cheaper than public cloud. They probably are, if you’re just doing this, and if you never plan on doing any more than this (and have decent scale), then OK, private cloud.

However, if you have applications that don’t rely on all components being on all the time, don’t rely on infrastructure SLAs for service SLAs and don’t rely on perimeter security for their own security, then the risk/benefit equation changes significantly towards Cloud.

You also have a much better app to deploy internally, for those that run mission critical production from their own data centres.

Cloud IT Design Patterns

Security

Premise: All networks are hostile, all storage is hostile.

Response: Encryption, lots of encryption

Encryption keys live outside the cloud (on premise or SaaS). All possible filesystems and databases are encrypted, and all network traffic is encrypted. Encryption overhead is a cost of doing business; key management is an ops discipline.

Response: Limit the Attack Surface

Restrict accounts: use something like EnStratus so that accounts and keys are only created on the servers when they’re about to be used.

Use strong IAM / RBAC systems to control access and add processes to ensure separation of responsibility, etc. where it’s needed.

Harden systems by removing packages and services, and use JEOS systems where possible to automate this. If you can, use unusual systems that won’t have off the shelf attack toolkits (e.g. Solaris x86, SmartOS, Illumos, or BSD for internet facing systems rather than Linux or Windows).

Availability

Premise: All infrastructure can fail

Response: Continuous Availability

Active-active systems or duplicate systems in different regions

If you can, design your application to be decoupled, with isolation for fault domains and restarts. Use suitable languages and toolkits for this, like Erlang or Akka, if you can find and retain the programmers (it is hard to).
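A rough way to see why duplicate systems matter when the quoted infrastructure SLA is low: if any one copy can serve users, the copies’ failure probabilities multiply. The sketch below assumes failures are independent and failover is instant, which is optimistic, and the 99.5% figure is a made-up example rather than any provider’s SLA.

```python
# Availability of N redundant copies, assuming independent failures and
# instant failover. Both assumptions are optimistic simplifications.

HOURS_PER_YEAR = 365 * 24


def combined_availability(single: float, copies: int) -> float:
    """Availability when any one of `copies` identical replicas is enough."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    # The service is down only when every copy is down at the same time.
    return 1 - (1 - single) ** copies


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year at the given availability."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    single = 0.995  # hypothetical availability of one region or site
    for copies in (1, 2, 3):
        total = combined_availability(single, copies)
        hours = downtime_hours_per_year(total)
        print(f"{copies} copies: {total:.6f} available, {hours:.2f} h/year down")
```

The point is that the application’s availability comes from its design rather than from the infrastructure SLA, provided the copies really do fail independently.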

Response: High Availability

Replicate between data centres in a region, replicate to other regions, have static standby sites. Most cloud systems, including AWS, automate much of this, so it may just be an ops wrapper to spin up the app on the new site.

Data replication and automation to bring the recovery site up are the key areas. This is bread and butter stuff to a lot of Enterprise IT folks, but it is now the bare minimum needed, and while it is cheaper to do, it uses different tools and models.

Costs (or elasticity)

Premise: You pay only for deployed resources

Response: Unused apps should consume minimal resources

System startups need to be clean and need no operator intervention. Environments should be able to be switched on and off without a lot of fuss.

Environments should be able to be copied and cloned. This usually means creating them from scripts or configs (CloudFormation, vApp, etc.). Everybody will have this shortly, although most implementations aren’t very good yet.
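As a minimal sketch of what “created from a script” can look like, the function below generates a CloudFormation-style template for an environment, so a clone is just the same recipe with different parameters. The function name, the image ID and the instance sizes are placeholders I’ve made up for illustration, and the output hasn’t been validated against the real service.

```python
import json


def environment_template(name: str, instance_type: str, image_id: str, servers: int) -> dict:
    """Build a minimal CloudFormation-style template for one environment.

    `name` must be alphanumeric because it is used in resource logical IDs.
    """
    if not name.isalnum():
        raise ValueError("environment name must be alphanumeric")
    resources = {}
    for index in range(1, servers + 1):
        resources[f"{name.capitalize()}Server{index}"] = {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": image_id,
                "InstanceType": instance_type,
                "Tags": [{"Key": "Environment", "Value": name}],
            },
        }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"{name} environment built from a script",
        "Resources": resources,
    }


if __name__ == "__main__":
    # Placeholder image ID and instance sizes: assumptions for illustration only.
    production = environment_template("prod", "m1.large", "ami-00000000", servers=3)
    # A clone for testing is the same recipe with a different name and size.
    test = environment_template("test", "m1.small", "ami-00000000", servers=1)
    print(json.dumps(test, indent=2))
```

Because the environment is described as data, it can be kept in version control, diffed between environments and thrown away and rebuilt when it isn’t needed.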

Autoscaling the application tiers is needed for dynamic, peaky applications. It’s a tricky trick though, so consider whether it’s actually needed for most apps, or whether just rebooting the environment into a different sized infrastructure would be OK.
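The core sizing decision behind autoscaling is simple; the tricky part is everything around it (clean startup, draining, state). Here is a hedged sketch of the sizing rule only, with a hypothetical per-server capacity, headroom and limits. It isn’t how any particular provider’s autoscaler works.

```python
import math


def desired_servers(
    requests_per_sec: float,
    capacity_per_server: float,
    minimum: int = 1,
    maximum: int = 10,
    headroom: float = 0.2,
) -> int:
    """Servers needed for the current load, with spare headroom, within limits."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    # Size for the load plus a safety margin, rounding up to whole servers.
    needed = math.ceil(requests_per_sec * (1 + headroom) / capacity_per_server)
    # Never drop below the floor or grow past the ceiling we are willing to pay for.
    return max(minimum, min(maximum, needed))


if __name__ == "__main__":
    # Hypothetical load figures over a day, in requests per second.
    for load in (0, 150, 450, 5000):
        print(f"{load:>5} req/s -> {desired_servers(load, capacity_per_server=100)} servers")
```

If the load only changes a few times a day, running the same calculation once and restarting the environment at the new size gets most of the saving without the machinery.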

“Enterprise IT in the Cloud” isn’t a very good model

Enterprise IT applications make a bunch of assumptions about the environment they run in. These aren’t true in a cloud environment without quite a lot of work, work that probably costs more than you’re going to save in terms of environment costs.

The downside of private cloud is another article, but the upshot is “did you want to be a service provider or a service consumer?”

“Cloud IT in the Enterprise” is a good model

An app written or designed to run in a cloud environment will also be easy to live with in a traditional IT environment. All the things apps can suck at and have ops look after (like booting up properly) will have been sorted out.

Likewise, the decoupling and service based nature of an app that can do autoscaling also creates an app that scales well in house, and probably allows horizontal scaling on commodity systems rather than needing big iron.

Can’t get the Staff

Why don’t we just get on with it then? Not enough people, and a lot of demand for them.

As a run of the mill enterprise IT shop you’ll have many people that won’t get it (see above), vendors telling you that whatever they’re selling is the one true cloud, corporate firewalls and processes that stop people experimenting, and very few staff that have any hands-on experience designing or implementing on cloud.

It will be hard, even at CIO level, to get a view of what’s possible without some actual experience, and it will probably stay hard for a few years yet.
