As a senior tech manager, your organization is challenging you to live up to these hard-to-define buzzwords, and much more. To address these expectations, perhaps you decided to migrate to the cloud. But are you seeing the improvements you’d hoped for?
Being in the cloud is not just a matter of lifting and shifting your organization’s infrastructure from its datacenter to a virtual location, pocketing some savings, and calling the project done. In fact, the move itself is only the first step.
Making effective use of the cloud is an ongoing process, as Anthony DiMarco, Chief Technology Officer of StockTwits, can attest. DiMarco characterizes StockTwits’ cloud evolution as a “…transition from an antiquated way of thinking where we deployed code to a fixed number of physical servers that we owned and operated, to deployments on EC2 instances, to containers orchestrated by Kubernetes, to a few fully serverless applications running with Lambda and API Gateway.”
This is a process that requires you to reexamine everything — your technologies and your culture — through the lens of “cloudy thinking.” What is cloudy thinking? It is the exercise of continuously evaluating your organization versus the following best practices:
· Optimize for cost
· Become scalable
· “Think services, not servers”
· Automate and monitor everything
· Manage your data in a cloudy way
Optimize for Cost
In terms of a cloud infrastructure, cost optimization often means moving as much inflexible capital expense to flexible operating expense as you can. You may have achieved some of this when you migrated from your on-prem servers to cloud instances. To continue to see cost improvements, however, you need to take the pay-as-you-go mentality even further.
Cloudy thinking involves asking questions such as:
· Are you optimizing the size and number of your instances for your workload, and responding in a timely fashion to both surges and quiet periods?
· Have you increased the redundancy of your cloud resources to remove single points of failure?
· Are you taking advantage of your cloud provider’s different pricing structures?
· And perhaps most important: Do your infrastructure and application provide the information you need to make these decisions?
For example, perhaps you are moving away from a monolithic application with a login function to a login service used by all product teams throughout your organization. In the old model, you had to buy the servers you needed for the maximum load. But now you can instrument your system so that when demand increases, so does the size of your infrastructure — and when it decreases, you adjust again, saving money.
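The instrumentation described above often takes the form of a target-tracking scaling policy: name a metric and a target, and capacity grows when demand rises and shrinks when it falls. The sketch below builds such a policy as a plain dictionary; the group name and CPU target are illustrative assumptions, not StockTwits’ configuration. With AWS, the resulting dict could be passed as keyword arguments to boto3’s `put_scaling_policy`.

```python
# Sketch of a target-tracking scaling policy: keep average CPU near a target,
# so capacity grows with demand and shrinks (saving money) when demand falls.
# All names and numbers here are illustrative assumptions.

def target_tracking_policy(asg_name: str, target_cpu_percent: float) -> dict:
    """Build the request body for an AWS target-tracking scaling policy.

    With boto3, this dict could be unpacked into
    autoscaling_client.put_scaling_policy(**policy).
    """
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu_percent,  # scale out above, in below
        },
    }

policy = target_tracking_policy("login-service-asg", 50.0)
print(policy["PolicyType"])  # → TargetTrackingScaling
```

Because the policy is just data, it can also be version-controlled and reviewed like any other code, which feeds directly into the cost-visibility question raised above.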
DiMarco points out that optimizing for cost is not only about saving money; it also reduces financial and technical risk and encourages innovation. According to DiMarco, “Treating compute as an operational expense allows StockTwits to deliver software at a much higher velocity, and to iterate much more quickly and safely. It also gives us the freedom to experiment and tinker with new ideas in ways that we never could before.”
This cloudy thinking best practice is applicable to an IT group’s culture, as well. If you’ve got employees with overlapping skill sets (for example, people with varying, complementary amounts of development, testing, and security experience), then you’ve increased flexibility and reduced bottlenecks and costs. If you also promote and fund a learning culture, then you can keep your employees engaged. Taking such steps empowers your people to come up with creative ways to continuously improve not only your product pipeline, but also your bottom line.
Become Scalable
For best results, your cloud infrastructure should not replicate your datacenter. Instead, it should provide only the services required by your application. Because you are now using on-demand computing resources, be sure you are monitoring them to detect when you need more resources to satisfy customer demand, and when you can decrease them or turn them off entirely during slower periods.
Implementing this best practice during the migration to AWS was a game-changer for StockTwits. “Suddenly we could scale arbitrarily and roll back with ease,” recounts DiMarco. “We could leverage Auto Scaling Groups to scale elastically with demand, instead of over-provisioning and praying we didn’t get a sudden traffic burst.”
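An Auto Scaling Group of the kind DiMarco describes is, at its core, a set of bounds: a floor for quiet periods and a ceiling that caps cost during bursts. The sketch below builds the group definition as a dictionary; the group name, launch template ID, zones, and sizes are illustrative assumptions. With boto3, the dict could be unpacked into `create_auto_scaling_group`.

```python
# Sketch of an Auto Scaling Group definition: instead of buying servers for
# peak load, set bounds and let the group track demand. Names and sizes are
# illustrative assumptions; with boto3 this dict could be unpacked into
# autoscaling_client.create_auto_scaling_group(**group).

def auto_scaling_group(name: str, launch_template_id: str,
                       min_size: int, max_size: int) -> dict:
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateId": launch_template_id},
        "MinSize": min_size,        # floor: enough capacity for quiet periods
        "MaxSize": max_size,        # ceiling: caps spend during traffic bursts
        "DesiredCapacity": min_size,
        # Multiple zones remove a single point of failure (see the
        # redundancy question under "Optimize for Cost").
        "AvailabilityZones": ["us-east-1a", "us-east-1b"],
    }

group = auto_scaling_group("web-asg", "lt-0abc123", min_size=2, max_size=20)
print(group["MaxSize"])  # → 20
```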
Similarly, your IT group can be scalable. Again, hire for and encourage overlapping skillsets. Decrease (or, better yet, remove entirely) any functional-area silos by creating cross-department teams that make sense for your organization. You might research the DevOps and DevSecOps cultural philosophies, for example, to see if one of them would be a good fit. And, just as in your infrastructure, you can monitor your teams’ workloads and processes, and redeploy people as needed.
Reduce the need for individual heroes and their heroics. Instead, aim for teams that get their work done close enough to schedule and budget to allow for weekends off, vacation time, and the other benefits that lead to healthy and happy employees. In other words: own the productivity of your teams.
“Think services, not servers”
There’s a saying in the cloud world: Think services, not servers. But what does it mean? To most effectively use the cloud, you must regard your computing resources as disposable. This may be a big change from the servers you bought in the past for thousands of dollars and treated as pets, complete with clever naming schemes.
In other words, instead of computing resources and monolithic programs, think of services that can complete a single task, and that can be combined with other services as needed to complete a bigger series of tasks, and so on, until you’ve built an application.
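The composition idea above can be shown with two toy services: each does one job and exposes a small contract (a dict in, a dict out), and the “application” is simply their composition. The service names and data are illustrative, not any real StockTwits API.

```python
# A toy illustration of "services, not servers": each function does one job,
# exposes a small contract, and an application is a composition of services.
# Names and data are illustrative assumptions.

def login_service(request: dict) -> dict:
    """Single-purpose service: authenticate a user."""
    ok = request.get("password") == "secret"  # stand-in for a real check
    return {"authenticated": ok, "user": request.get("user")}

def watchlist_service(session: dict) -> dict:
    """Another single-purpose service, consuming the login service's contract."""
    if not session.get("authenticated"):
        return {"error": "not authenticated"}
    return {"user": session["user"], "watchlist": ["AAPL", "TSLA"]}

# The "application" is just the composition of independent services.
session = login_service({"user": "ada", "password": "secret"})
result = watchlist_service(session)
print(result["watchlist"])  # → ['AAPL', 'TSLA']
```

Because each function has an independent lifecycle, either one can be rewritten, redeployed, or scaled without touching the other, which is exactly the reduced inertia DiMarco describes.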
“Envisioning and building systems as autonomous pieces with independent lifecycles reduces inertia,” says DiMarco, “which lets StockTwits innovate more quickly, and also allows us to scale in a more granular and effective manner.”
“Think services, not servers” extends to security concerns, too. Each service can be responsible for its own security. If each component individually follows security best practices, then an attacker trying to subvert your system must defeat every layer, not just a single perimeter, before achieving success.
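One lightweight way to give each service its own security layer is a wrapper that validates the caller’s credentials before the service runs. The sketch below uses a deliberately simplified token check as a stand-in for real credential verification; every name here is an illustrative assumption.

```python
# Sketch of per-service security: each service validates its own caller
# instead of trusting a single perimeter. The token scheme is a deliberately
# simplified assumption standing in for real credential verification.

VALID_TOKENS = {"token-abc"}  # illustrative; real systems verify, not look up

def requires_token(service):
    """Wrap a service so every call independently checks credentials."""
    def wrapper(request: dict) -> dict:
        if request.get("token") not in VALID_TOKENS:
            return {"status": 403, "error": "forbidden"}
        return service(request)
    return wrapper

@requires_token
def quotes_service(request: dict) -> dict:
    return {"status": 200, "quote": {"symbol": request["symbol"], "px": 101.5}}

# An attacker must defeat each service's own check, not just one gateway.
print(quotes_service({"symbol": "AAPL"})["status"])                        # → 403
print(quotes_service({"symbol": "AAPL", "token": "token-abc"})["status"])  # → 200
```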
You might be wondering how this cloudy thinking best practice translates to your organization’s culture and whether it conflicts with the earlier advice to remove functional-area silos. Here’s some clarity: if each person can be considered a service, and if you’ve hired or grown employees with overlapping skillsets, then you can build focused teams. For example, perhaps the login team builds and maintains the login service that is consumed by other service teams. And perhaps many or even all of the login team members are also on other service teams, as their skillsets and interests — and your organization’s needs — dictate.
In such a scenario, the API that each service team creates acts as a contract, not only between the system’s services, but also between your work teams. In other words, APIs serve as communication methods and accountability measures for both your technology and your culture.
Automate and Monitor Everything
In general, automating repeatable processes, combined with monitoring and testing for continuous improvement, can lead to better quality, faster iterations, and lower costs.
You can automate the building of your infrastructure. You can also monitor your application and infrastructure so that they automatically react to conditions that are important to your organization, such as load, performance, and security alerts. Automated scripts can be triggered to deal with the predictable issues, and the system can alert humans when important anomalies occur.
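The split between scripted responses and human alerts can be sketched as a tiny dispatcher: predictable threshold breaches trigger a remediation script, while true anomalies page a person. The thresholds and action names below are illustrative assumptions; in practice this logic would live in something like CloudWatch alarms wired to scaling policies and a paging service.

```python
# Sketch of "automate the predictable, alert on the anomalous": a tiny
# dispatcher that decides the automated response to a monitored metric.
# Thresholds and action names are illustrative assumptions.

def react_to_metric(name: str, value: float, warn: float, crit: float) -> str:
    """Return the automated response for one metric reading."""
    if value >= crit:
        return f"page-human:{name}"       # anomaly -> wake someone up
    if value >= warn:
        return f"run-remediation:{name}"  # predictable -> scripted fix
    return "ok"                           # nothing to do

print(react_to_metric("p99_latency_ms", 250, warn=200, crit=500))
# → run-remediation:p99_latency_ms
```

The valuable part is the policy, not the plumbing: deciding, per metric, where “scripted fix” ends and “wake a human” begins is what keeps on-call load sane.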
DiMarco has found this to be an effective strategy at StockTwits. “The confidence to deploy more frequently was bolstered by the increased adoption of instrumentation in our deployed software. We had a repeatable process for building software, an automated means of deploying it, and detailed information about its real-time performance in production.”
Keep logs and monitor everything. Your employees can use big data and/or machine learning techniques on the data you have collected to continually improve the effectiveness and quality of every service.
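Even a simple pass over structured logs yields a quality signal per service. The sketch below computes an error rate from log records; the record fields (`service`, `status`) are assumptions about how your logs are structured, and real pipelines would run the same idea over far larger data.

```python
# Sketch of mining collected logs for quality signals: compute a per-service
# error rate from structured log records. The record fields are assumptions
# about how your logs are shaped.
from collections import Counter

def error_rates(logs: list) -> dict:
    """Fraction of requests per service with a 5xx status."""
    total, errors = Counter(), Counter()
    for rec in logs:
        total[rec["service"]] += 1
        if rec["status"] >= 500:
            errors[rec["service"]] += 1
    return {svc: errors[svc] / total[svc] for svc in total}

logs = [
    {"service": "login", "status": 200},
    {"service": "login", "status": 500},
    {"service": "quotes", "status": 200},
    {"service": "quotes", "status": 200},
]
print(error_rates(logs))  # → {'login': 0.5, 'quotes': 0.0}
```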
What does this cloudy thinking best practice look like in your IT group? You can automate the boring and repetitive tasks, and then let your people loose on the interesting technical and business challenges that will arise. Hold blameless post-mortems. Monitor team productivity and hours worked and determine whether your organization’s educational and PTO benefits are being used. Again, you don’t need heroes — you need engaged, creative, and reliable employees for the long haul.
Manage Your Data in a Cloudy Way
Cloudy ways of managing data begin with understanding and characterizing your data. Although this can be a daunting task, it helps you decide which data must be stored permanently, which redundantly, which accessibly, and which securely, in whatever combination your needs dictate. You may be able to automate some of these processes so that services perform them for you — in scalable ways that optimize costs.
You can also take advantage of the features of your cloud provider’s storage offerings. You may be able to optimize your NoSQL database’s throughput levels, implement storage versioning, or take advantage of other options. For example, StockTwits encrypts data stored in AWS’s S3 and RDS services. “Having those things available as a built-in option makes those kinds of decisions very easy,” says DiMarco.
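These provider features, too, reduce to configuration. The sketch below expresses a storage lifecycle rule and a default-encryption rule as dictionaries; the prefix, day counts, and bucket details are illustrative assumptions. With boto3, they could be supplied to `put_bucket_lifecycle_configuration` and `put_bucket_encryption` respectively.

```python
# Sketch of "cloudy" data management as configuration: tier rarely read data
# down to cheaper storage classes, and encrypt everything by default.
# Prefix and day counts are illustrative assumptions; with boto3 these dicts
# could be supplied as LifecycleConfiguration and
# ServerSideEncryptionConfiguration for an S3 bucket.

lifecycle = {
    "Rules": [{
        "ID": "archive-old-logs",
        "Status": "Enabled",
        "Filter": {"Prefix": "logs/"},
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},  # infrequent access
            {"Days": 365, "StorageClass": "GLACIER"},     # deep archive
        ],
    }]
}

encryption = {
    "Rules": [{
        "ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"},
    }]
}

print(lifecycle["Rules"][0]["ID"])  # → archive-old-logs
```

Once the data has been characterized, as described above, rules like these let the provider enforce your storage decisions automatically and at any scale.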
Your employees can be thought of not only as services, but also as knowledge (“data”) repositories, with their own specialties, experiences, and personalities. As discussed in the “Think services, not servers” section, you can create focused teams. But you can also ensure that you’ve spread people from each core service team among the other teams, either by making them team members or allowing them a presence at each other’s standups. For example, each service team that makes use of the login service could invite a login team member to their meetings.
As you can see, there’s a lot of work to do before you can claim that your organization is truly engaging in cloudy thinking. It’s an ongoing process of continuous improvement, which requires monitoring, analysis, and course correction.
DiMarco acknowledges that the process is not always an easy one. For StockTwits, “Decomposing monolithic software into microservices involved an initial increase in complexity. It introduced some integration pain until interface contracts were agreed upon, automation was finely tuned, and everyone involved was comfortable with the new way of thinking.”
This new way of thinking, aka “cloudy thinking,” can lead to benefits such as higher-quality products, with faster release cycles, at lower costs — and an engaged, happy workforce. That’s when you can proudly proclaim your organization to be nimble, high velocity, cost effective — and so much more.