CloudOps Conference Pills: What’s new for AWS Cloud Operations?

Michael Capponi
Published in Storm Reply
Jun 20, 2023

From March 14 to 17, 2023, the AWS CloudOps Conference took place at the AWS headquarters in Seattle. Dedicated exclusively to AWS partners, the conference focused on the latest developments and innovations that AWS is adopting and proposing to its customers.

As an AWS Premier Partner, Storm Reply was invited to attend the conference, where my colleagues and I had the opportunity to discuss AWS technology with AWS itself and with other partner companies. We learned from the experts and exchanged ideas with fellow professionals, fostering collaboration and knowledge sharing.

CloudOps: why adopt it?

IT operations are at the heart of every organization. Operating in the Cloud allows IT teams to focus on business outcomes, optimizing IT processes while accelerating software development and innovation.

Nowadays, it is no longer a question of whether your organization is moving to the Cloud but how fast you can move with security, visibility, control, and safety.

AWS Cloud Operations provides a model and tools for a secure and efficient way to operate in the Cloud. You can transform your organization, modernize and migrate your applications, and accelerate innovation with AWS.

Conference

The conference covered three main topics:

  • Governance
  • Operations
  • Observability

Governance

AWS presented its approach and solutions for keeping systems secure and compliant with best practices across an entire organization.

Compliance at an enterprise level

For an Enterprise-level customer looking to migrate to the Cloud, AWS offers the possibility to structure their systems within an Organization, dividing different subsystems into multiple organizational units (OUs) and accounts, thereby providing scalability, isolation, and greater control over resource billing within the OUs.

Through dedicated AWS services, it is then possible to provide an advanced level of security at three different levels:

  • Organization, OU, Account level: Service Control Policies (SCP)
    Recently, a mechanism for proactive controls has been introduced to prevent unauthorized access to services and to limit user and group privileges.
  • Application level: CloudFormation Guard and Open Policy Agent
  • Resource level: AWS Config and AWS Security Hub
    Combining these two services makes it possible to inventory all resources (of supported services), monitor their compliance based on predefined rules, and apply automated remediation for non-compliant resources.
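To make the organization-level guardrail more concrete, here is a minimal sketch of what a Service Control Policy document can look like. It builds, as a plain Python dictionary, a policy that denies requests outside a set of approved Regions. The Region list, the function name, and the set of exempted global services are assumptions chosen for illustration, not something presented at the conference.

```python
import json

# Hypothetical list of approved Regions; adjust to your organization.
ALLOWED_REGIONS = ["eu-west-1", "eu-central-1"]


def build_region_guardrail_scp(allowed_regions):
    """Return an SCP document (as a dict) that denies requests outside allowed_regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                # Global services are exempted here; this list is an assumption
                # and should be reviewed before real use.
                "NotAction": ["iam:*", "organizations:*", "sts:*", "support:*"],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": list(allowed_regions)
                    }
                },
            }
        ],
    }


scp = build_region_guardrail_scp(ALLOWED_REGIONS)
print(json.dumps(scp, indent=2))
```

The printed JSON is the kind of document you would attach to an OU or an account. Because an SCP only restricts what is allowed, IAM permissions inside the accounts are still needed to grant access.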

AWS Audit Manager

An example helps illustrate its features and usefulness. Issues with auditing customer systems and applications are often linked to:

  • Evidence collected and sampled at individual points in time, which gives only a limited view.
  • High costs and significant time spent collecting evidence.
  • High risk of non-compliance with regulations.

The AWS Audit Manager service addresses these needs, simplifying the auditing process by continuously collecting evidence demonstrating the compliance of resources and systems.

FinOps

FinOps, short for Financial Operations, is a set of practices and processes that aims to optimize cloud spending and maximize the value of cloud investments. It combines financial management, cloud governance, and cross-functional collaboration to improve cost control and efficiency in cloud environments.

By adopting FinOps practices, organizations can gain better visibility into their cloud spending, make data-driven decisions to optimize costs, align cloud usage with business priorities, and achieve financial accountability and control in their cloud operations.

Within a company, selecting a Cloud Financial Management (CFM) Organizational Model is critical.

Achieving a balance between speed of innovation and financial responsibility allows a company to innovate and move quickly while maintaining profitability.
At different stages of growth, one will outweigh the other (for example, an economic downturn may put more focus on financial responsibility). Still, the goal is always to move back toward equilibrium.
During the conference, AWS presented some CFM practitioner responsibilities to follow when adopting FinOps best practices:

  • Evangelize cultural change within the organization
  • Create cloud cost management best practices
  • Create benchmarks for teams to use
  • Create visibility and transparency into Cloud costs
  • Create or inform cloud budgets and forecasts
  • Create automated solutions to clean up and reduce waste
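Two of these responsibilities, cost visibility and waste reduction, lend themselves to a small illustration. The sketch below aggregates monthly cost per team and flags resources that look idle. The `ResourceCost` record, the "team" tag, the sample figures, and the 5% CPU threshold are all hypothetical; in practice the data would come from your billing and monitoring exports.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceCost:
    resource_id: str
    team: Optional[str]      # value of a hypothetical "team" cost-allocation tag
    monthly_cost: float      # in your billing currency
    avg_cpu_percent: float   # average CPU utilization over the month


def cost_by_team(resources):
    """Sum monthly cost per team; untagged resources are grouped together."""
    totals = defaultdict(float)
    for r in resources:
        totals[r.team or "untagged"] += r.monthly_cost
    return dict(totals)


def waste_candidates(resources, cpu_threshold=5.0):
    """Return IDs of resources whose average CPU is below the threshold."""
    return [r.resource_id for r in resources if r.avg_cpu_percent < cpu_threshold]


# Made-up sample data for illustration only.
resources = [
    ResourceCost("i-aaa", "payments", 120.0, 43.0),
    ResourceCost("i-bbb", "payments", 80.0, 2.0),
    ResourceCost("i-ccc", None, 50.0, 1.5),
]

print(cost_by_team(resources))
print(waste_candidates(resources))
```

Grouping untagged resources under their own label keeps missing tags visible instead of hiding them, which is often the first step toward cost transparency.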

Operations

The second conference session focused primarily on operations management in AWS.
The Systems Manager service is at the center of these operations.
AWS Systems Manager (AWS SSM) allows for comprehensive management of the applications and systems you run in the Cloud.
AWS is working to meet the increasing demand for control that is as centralized as possible over all actions performed on Cloud resources, ensuring standardization and compliance with operations practices.

Three main points emerged:

  • Enhance visibility
  • Proactively automate
  • Remediate centrally

AWS SSM operates at different layers:

  • IT Management Layer: AWS SSM offers features to manage and operate within a ticketing system. It lets you schedule and automate operations on resources, and it provides a centralized view of each CloudOps activity at the account level.
  • Application Management Layer: some AWS SSM features let you securely set up and manage application parameters within the AWS environment.
  • Node Management Layer: using AWS SSM, you can build resource inventories to catalog, manage, and operate at the application server level.
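As an illustration of the Application Management Layer, the sketch below reads application parameters through a client that exposes `get_parameter`, the call that the boto3 SSM client provides for Parameter Store. The parameter names, the `/shop/prod` prefix, and the `FakeSsmClient` stand-in are assumptions made so that the example runs without AWS credentials.

```python
def load_app_config(ssm_client, prefix, names):
    """Fetch parameters named <prefix>/<name> and return them as {name: value}."""
    config = {}
    for name in names:
        response = ssm_client.get_parameter(
            Name=f"{prefix}/{name}",
            WithDecryption=True,  # decrypts SecureString parameters
        )
        config[name] = response["Parameter"]["Value"]
    return config


class FakeSsmClient:
    """In-memory stand-in mimicking the shape of get_parameter responses."""

    def __init__(self, store):
        self._store = store

    def get_parameter(self, Name, WithDecryption=False):
        return {"Parameter": {"Name": Name, "Value": self._store[Name]}}


# Hypothetical parameters for a hypothetical application.
fake = FakeSsmClient({
    "/shop/prod/db_host": "db.internal",
    "/shop/prod/db_user": "app",
})
config = load_app_config(fake, "/shop/prod", ["db_host", "db_user"])
print(config)
```

Against a real account, you would pass `boto3.client("ssm")` instead of the fake client. Passing the client in as an argument keeps the function easy to test.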

Patch Manager: Quick Setup Patch Policies

One of the most common operational challenges emerging on the journey to the Cloud is patch management and compliance.

AWS now supports centralized patch management across an AWS Organization using AWS SSM.
Previously, customers could scan all instances in their organization daily for missing patches through the Host Management Quick Setup configuration. Additionally, customers could implement patching using default patch baselines in patch groups.

In January 2023, AWS released Quick Setup Patch Policies, powered by Patch Manager, which make it easy to set up patch management across an AWS Organization. Patch policies allow customers to scan and schedule patch installation for multiple patch baselines across AWS accounts and AWS Regions.

You can apply the AWS default patch baselines or your own custom ones to multiple operating systems. You can also target Amazon Elastic Compute Cloud (EC2) instances and hybrid managed nodes across the entire AWS Organization or in specific Organizational Units (OUs) and Regions, and select all managed nodes or filter them based on specific resource tags. You can create and manage multiple patch policies simultaneously, enabling you to control patching operations for different instances.
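The targeting logic described above can be pictured with a small conceptual model. This is not the Patch Manager API: `ManagedNode`, `PatchPolicyTarget`, and the sample nodes are hypothetical names used only to show how OU, Region, and tag filters combine to select managed nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ManagedNode:
    node_id: str
    ou: str
    region: str
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class PatchPolicyTarget:
    ous: Optional[Set[str]] = None       # None means "entire organization"
    regions: Optional[Set[str]] = None   # None means "all Regions"
    required_tags: Dict[str, str] = field(default_factory=dict)  # empty means "all nodes"

    def matches(self, node):
        """Return True when the node satisfies every configured filter."""
        if self.ous is not None and node.ou not in self.ous:
            return False
        if self.regions is not None and node.region not in self.regions:
            return False
        return all(node.tags.get(k) == v for k, v in self.required_tags.items())


# Made-up inventory: three EC2 instances and one hybrid managed node.
nodes = [
    ManagedNode("i-001", "Production", "eu-west-1", {"PatchGroup": "web"}),
    ManagedNode("i-002", "Production", "us-east-1", {"PatchGroup": "web"}),
    ManagedNode("mi-003", "Sandbox", "eu-west-1"),
    ManagedNode("i-004", "Production", "eu-west-1", {"PatchGroup": "db"}),
]

web_prod = PatchPolicyTarget(
    ous={"Production"},
    regions={"eu-west-1"},
    required_tags={"PatchGroup": "web"},
)
selected = [n.node_id for n in nodes if web_prod.matches(n)]
print(selected)
```

A target with no filters set selects every node, which mirrors the option to apply a policy to the entire organization.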

With the introduction of patch policies in Quick Setup, you have enhanced control over scanning and applying patches to managed nodes throughout your environment. Previously, customers might have had to log into numerous accounts to observe patch compliance and apply patches. With this release, customers can apply a patch policy across their entire organization and review resource compliance for the targeted managed nodes.

Announcement blog post: https://aws.amazon.com/it/blogs/mt/centrally-deploy-patching-operations-across-your-aws-organization-using-systems-manager-quick-setup/

Observability

Full-stack observability at AWS includes AWS-native, Application Performance Monitoring (APM), and open-source solutions, allowing you to understand what is happening across your technology stack at any time.

AWS observability lets you collect, correlate, aggregate, and analyze telemetry in your network, infrastructure, and applications in the cloud, hybrid, or on-premises environments to gain insights into your system’s behavior, performance, and health.
These insights help you detect, investigate, and remediate problems fast. Moreover, coupled with artificial intelligence and machine learning, they let you predict and prevent problems proactively.
AWS supports two approaches to observability: a native one and an open-source one.

Today, if your application is publicly distributed over the web, understanding the impact of internet events on performance and availability is crucial for delivering an exceptional user experience with your AWS applications. Factors beyond your control can significantly affect user experience and often go unnoticed. To ensure a reliable digital user experience for your applications, you need to understand both the “internet weather” and how users perceive performance. Focusing on these two insights helps you raise the bar and provide an outstanding user experience.

Recently, AWS developed a new CloudWatch feature to provide visibility and insights into these factors.

Amazon CloudWatch Internet Monitor

AWS released this feature at the end of 2022.

Internet Monitor offers ongoing visibility into internet availability and performance, customized to your workload footprint on AWS. By utilizing Internet Monitor, you gain access to valuable information regarding average internet performance metrics over time and insights into issues (events) categorized by location and internet service provider (ISP). This tool enables effortless identification of the events directly impacting the end-user experience for applications hosted on Amazon CloudFront, Amazon WorkSpaces directories, or within Amazon Virtual Private Clouds (VPCs).

The measurements associated with your application endpoints help you quickly identify the extent, location, and root cause of issues so that you can take the necessary actions for remediation. You can do this without modifying your application code or impacting your workload performance.

Internet Monitor covers the internet path between your users and your application, completing the CloudWatch observability stack:

  • User experience: CloudWatch Synthetics and CloudWatch Real User Monitoring (RUM)
  • Internet health: Internet Monitor
  • Application stack health: CloudWatch ServiceLens and AWS X-Ray
  • Resource health: CloudWatch Metrics and CloudWatch Logs
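To give an idea of how insights categorized by location and ISP can be used, the sketch below flags the city and ISP pairs whose availability drops below a threshold and orders them by traffic share. The `Measurement` record, the 0 to 100 score scale, the sample values, and the 95-point threshold are assumptions loosely modeled on this kind of data, not the actual Internet Monitor output format.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    city: str
    isp: str
    availability_score: float  # assumed scale: 0-100, higher is better
    traffic_share: float       # fraction of total application traffic


def impacted_city_networks(measurements, min_availability=95.0):
    """Return measurements below the threshold, largest traffic share first."""
    flagged = [m for m in measurements if m.availability_score < min_availability]
    return sorted(flagged, key=lambda m: m.traffic_share, reverse=True)


# Made-up sample data for illustration only.
measurements = [
    Measurement("Milan", "ISP-A", 99.5, 0.40),
    Measurement("Rome", "ISP-B", 91.0, 0.10),
    Measurement("Turin", "ISP-A", 88.0, 0.25),
]

impacted = impacted_city_networks(measurements)
for m in impacted:
    print(f"{m.city} / {m.isp}: availability {m.availability_score}, "
          f"traffic share {m.traffic_share:.0%}")
```

Sorting by traffic share rather than by score puts the events that touch the most users at the top, which is usually what matters when deciding where to act first.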

Announcement blog post: https://aws.amazon.com/it/blogs/networking-and-content-delivery/introducing-amazon-cloudwatch-internet-monitor/

AWS Observability Accelerator

Detecting and remediating performance slowdowns is a common concern for customers: it can take time and effort, and it can cause disruptions.

AWS has recently proposed a purpose-built observability solution for Amazon EKS cluster workloads, delivered as IaC developed with Terraform.

The solution relies on open-source tools managed by AWS.
By deploying the infrastructure using Terraform, you can automate and accelerate the setup of a monitoring system within the EKS cluster:

  • AWS Distro for OpenTelemetry collects data from Kubernetes cluster components
  • Metrics and traces are sent to AWS services to store and catalog them based on specific customized rules
  • Finally, Amazon Managed Grafana lets the users query and visualize data by consulting dashboards

Announcement blog post: https://aws.amazon.com/it/blogs/mt/introducing-amazon-eks-observability-accelerator/

GitHub Repository: https://github.com/aws-observability/terraform-aws-observability-accelerator

Conclusion

The services and topics above are just some of those covered at the AWS CloudOps Conference.
The direction AWS is taking with Cloud Operations is clear: providing fast and powerful solutions that let its customers operate in the Cloud.
To reach this goal, AWS is going all in on centralized and automated solutions that let you gain complete control and move safely across cloud environments.

Storm Reply, as a consulting AWS Partner, has offered its knowledge and expertise in designing and implementing innovative Cloud-based solutions and services since 2012.
Our company keeps up with the evolution of CloudOps, from the very first steps of Cloud adoption through automation, containerization, autoscaling, and so on.
If you wish to know more about who we are, please visit our website, or get in touch through our contacts page.
