AWS App Mesh — Deep Dive

Hareesh Iyer
Jan 9, 2023


App Mesh — Overview

AWS App Mesh is a service mesh that makes it easy to monitor and control services. A service mesh is an infrastructure layer dedicated to handling service-to-service communication, usually through an array of lightweight network proxies deployed alongside the application code.

App Mesh standardizes how your services communicate, giving you end-to-end visibility and consistent network traffic controls for every service in an application, and helping to ensure high availability.

App Mesh uses the open source Envoy proxy to manage all traffic into and out of a service’s containers. App Mesh configures this proxy to automatically handle all of the service’s application communications.

The AWS documentation provides a good overview of App Mesh.

App Mesh — Features

Let’s look at App Mesh features, focusing on three key use cases of a service mesh.

  • Traffic Management
  • Traffic Policies
  • Traffic Telemetry

Traffic Management

Request Routing

App Mesh lets you configure services to connect directly to each other instead of requiring code within the application or using a load balancer. When each service starts, its proxy connects to App Mesh and receives configuration data about the locations of other services in the mesh. You can use controls in App Mesh to dynamically update traffic routing between services with no changes to your application code.

App Mesh supports routing based on URI path, headers, query parameters, traffic port, and other request attributes. You can find more details here.
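As a rough illustration of what such a match looks like, here is a minimal sketch that builds the match portion of an HTTP route as a plain Python dictionary. The field names (prefix, headers, queryParameters, exact) follow my understanding of the App Mesh route spec and should be checked against the API reference; the helper http_route_match, the header x-user-group, and the parameter values are hypothetical.

```python
import json


def http_route_match(prefix, headers=None, query_params=None):
    """Build an HTTP route match from a path prefix plus optional
    exact-match headers and query parameters ({name: value} dicts).

    Hypothetical helper: it only assembles a dictionary and does not
    call any AWS API.
    """
    if not prefix.startswith("/"):
        raise ValueError("prefix must start with '/'")
    match = {"prefix": prefix}
    if headers:
        match["headers"] = [
            {"name": name, "match": {"exact": value}}
            for name, value in headers.items()
        ]
    if query_params:
        match["queryParameters"] = [
            {"name": name, "match": {"exact": value}}
            for name, value in query_params.items()
        ]
    return match


# Assumed example: send beta users of /checkout to a dedicated route.
beta_checkout_match = http_route_match(
    "/checkout",
    headers={"x-user-group": "beta"},
    query_params={"version": "2"},
)
print(json.dumps(beta_checkout_match, indent=2))
```

A dictionary like this would sit under the match key of an HTTP route spec, next to the action that names the target virtual nodes.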

Traffic shifting

With App Mesh, you can distribute traffic across one or more versions of a service by using target virtual nodes. You can specify a relative weight for each virtual node.
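To make the relative weighting concrete, the sketch below builds a weighted-target list and computes the share of traffic each virtual node would receive. The weightedTargets, virtualNode, and weight field names reflect my understanding of the App Mesh route spec, and the 0 to 100 weight range is an assumption to verify; the node names orders-v1 and orders-v2 and both helper functions are hypothetical.

```python
def weighted_targets(weights):
    """Turn {virtual_node_name: weight} into a weightedTargets list."""
    for name, weight in weights.items():
        # Assumption: weights are integers in the range 0-100.
        if not 0 <= weight <= 100:
            raise ValueError(f"weight for {name} must be between 0 and 100")
    return [{"virtualNode": name, "weight": weight}
            for name, weight in weights.items()]


def traffic_share(targets):
    """Return the fraction of traffic each target receives.

    Weights are relative, so each share is weight / sum of all weights.
    """
    total = sum(target["weight"] for target in targets)
    if total == 0:
        raise ValueError("at least one target needs a non-zero weight")
    return {target["virtualNode"]: target["weight"] / total
            for target in targets}


# Assumed canary setup: 90% to the current version, 10% to the new one.
canary_targets = weighted_targets({"orders-v1": 90, "orders-v2": 10})
canary_route_spec = {
    "httpRoute": {
        "match": {"prefix": "/"},
        "action": {"weightedTargets": canary_targets},
    }
}
print(traffic_share(canary_targets))
```

Because the weights are relative, 9 and 1 would give the same split as 90 and 10; shifting traffic is a matter of updating the route with new weights.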

Here is a blog on how to use App Mesh to do canary deployments on ECS

Request Timeouts

You can configure timeouts on App Mesh virtual nodes and on individual routes. App Mesh supports two types of timeout: a per-request timeout, which limits how long a requester waits for a complete response, and an idle timeout, which sets how long a connection can remain without active streams before it is terminated. You can set both at the virtual node listener level or on each individual route.
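The two levels can be sketched as plain Python dictionaries. The perRequest and idle keys and the unit/value duration shape follow my understanding of the App Mesh API; the port, the timeout values, and the duration helper are assumptions chosen for illustration.

```python
def duration(value, unit="s"):
    """Build a duration object; units are assumed to be 's' or 'ms'."""
    if unit not in ("s", "ms"):
        raise ValueError("unit must be 's' or 'ms'")
    if value < 0:
        raise ValueError("value must not be negative")
    return {"unit": unit, "value": value}


# Timeout on a virtual node listener (applies to traffic the node receives).
listener_with_timeout = {
    "portMapping": {"port": 8080, "protocol": "http"},
    "timeout": {
        "http": {
            "perRequest": duration(15),  # wait up to 15 s for a full response
            "idle": duration(300),       # close after 300 s without streams
        }
    },
}

# Timeout on an individual HTTP route.
route_with_timeout = {
    "httpRoute": {
        "match": {"prefix": "/reports"},
        "timeout": {
            "perRequest": duration(5),
            "idle": duration(60),
        },
    }
}
```

Note that on the listener the timeout is nested under the protocol (http here), while on the route it sits directly inside the protocol-specific route object.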

Here is an example on how to configure Request timeouts on App Mesh

Circuit Breaker

You can control connection pool configuration in the mesh and leverage outlier detection functionality that simplifies implementing circuit breaker capabilities. With the connection pool configuration for your service mesh, you can limit the number of simultaneous connections or requests to your application endpoints. The outlier detection feature enables your service to track the health status of individual hosts in each of its upstream services and temporarily stop routing traffic to hosts that exhibit elevated errors. Connection pool configuration and outlier detection enable you to limit the impact of failures, latency spikes and network fluctuations on your application behavior.
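A listener that combines both settings might look like the sketch below. The connectionPool and outlierDetection field names follow my understanding of the virtual node listener spec; the helper circuit_breaker_listener, the fixed 10 s interval and 30 s ejection duration, and all numeric limits are assumptions for illustration only.

```python
def duration(value, unit="s"):
    """Build a duration object; units are assumed to be 's' or 'ms'."""
    return {"unit": unit, "value": value}


def circuit_breaker_listener(port, max_connections, max_pending_requests,
                             max_server_errors, max_ejection_percent):
    """Build an HTTP listener with connection limits and outlier detection."""
    if not 0 <= max_ejection_percent <= 100:
        raise ValueError("max_ejection_percent must be between 0 and 100")
    return {
        "portMapping": {"port": port, "protocol": "http"},
        # Connection pool: cap simultaneous connections and queued requests.
        "connectionPool": {
            "http": {
                "maxConnections": max_connections,
                "maxPendingRequests": max_pending_requests,
            }
        },
        # Outlier detection: eject a host after too many server errors.
        "outlierDetection": {
            "maxServerErrors": max_server_errors,
            "interval": duration(10),              # how often hosts are checked
            "baseEjectionDuration": duration(30),  # minimum time a host stays out
            "maxEjectionPercent": max_ejection_percent,
        },
    }


listener = circuit_breaker_listener(
    port=8080,
    max_connections=100,
    max_pending_requests=20,
    max_server_errors=5,
    max_ejection_percent=50,
)
```

Capping maxEjectionPercent below 100 keeps some hosts in rotation even when many of them are reporting errors.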

More details here

Retry

A retry policy enables clients to protect themselves from intermittent network failures or intermittent server-side failures. You can define a retry policy as part of the route configuration. The settings include the maximum number of retries and the timeout for each retry attempt.
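Here is a minimal sketch of a retry policy as a Python dictionary, with a rough estimate of how long a caller could end up waiting. The maxRetries, perRetryTimeout, httpRetryEvents, and tcpRetryEvents names and the event values follow my understanding of the RetryPolicy API object; the chosen values and the helper minimum_worst_case_ms are assumptions.

```python
retry_policy = {
    "maxRetries": 3,
    "perRetryTimeout": {"unit": "ms", "value": 2000},
    # Retry on 5xx-style failures and on connection problems.
    "httpRetryEvents": ["server-error", "gateway-error"],
    "tcpRetryEvents": ["connection-error"],
}


def minimum_worst_case_ms(policy):
    """Estimate the time spent if every attempt times out.

    One initial attempt plus maxRetries retries, each allowed to run for
    perRetryTimeout. This ignores any backoff between attempts, so the
    real worst case can be longer.
    """
    timeout = policy["perRetryTimeout"]
    per_attempt_ms = timeout["value"] * (1000 if timeout["unit"] == "s" else 1)
    return (policy["maxRetries"] + 1) * per_attempt_ms


print(minimum_worst_case_ms(retry_policy))
```

An estimate like this is useful when choosing a per-request timeout for the same route, since a request timeout shorter than the combined retry time would cut the retries off.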

If you don’t define a retry policy, then App Mesh may automatically create a default policy for each of your routes.

Here is the RetryPolicy API object documentation.

Ingress/Egress

A virtual gateway allows resources that are outside of your mesh to communicate with resources that are inside of your mesh. The virtual gateway represents an Envoy proxy running in an Amazon ECS service, in a Kubernetes service, or on an Amazon EC2 instance. Unlike a virtual node, which represents Envoy running with an application, a virtual gateway represents Envoy deployed by itself.

More details here

DNS

App Mesh relies on VPC DNS or AWS Cloud Map for DNS resolution. App Mesh currently does not support DNS proxying.

Traffic Policies

mTLS

App Mesh supports TLS and mTLS for inbound and outbound traffic. You need to configure TLS authentication for your mesh endpoints, such as virtual nodes or gateways. These endpoints provide certificates and specify trusted authorities.

More details here

Certificate Management

App Mesh allows you to provide the TLS certificate to the proxy in the following ways:

  • A private certificate from AWS Certificate Manager (ACM) that is issued by an AWS Private Certificate Authority (AWS Private CA)
  • A certificate stored on the local file system of a virtual node that is issued by your own Certificate Authority (CA)
  • A certificate provided by a Secrets Discovery Service (SDS) endpoint over a local Unix domain socket
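The three options above map to three shapes of the listener TLS configuration, sketched below as Python dictionaries. The acm, file, and sds keys and their fields follow my understanding of the App Mesh listener TLS spec; the helper listener_tls, the placeholder ARN, the file paths, and the secret name are all hypothetical.

```python
def listener_tls(source, mode="STRICT", **settings):
    """Build a listener TLS block for one of three certificate sources."""
    builders = {
        # Private certificate issued through ACM.
        "acm": lambda s: {"acm": {"certificateArn": s["certificate_arn"]}},
        # Certificate and key files on the proxy's local file system.
        "file": lambda s: {"file": {
            "certificateChain": s["certificate_chain"],
            "privateKey": s["private_key"],
        }},
        # Secret served by a local SDS endpoint.
        "sds": lambda s: {"sds": {"secretName": s["secret_name"]}},
    }
    if source not in builders:
        raise ValueError(f"unknown certificate source: {source}")
    return {"mode": mode, "certificate": builders[source](settings)}


# Placeholder values, not real resources.
acm_tls = listener_tls(
    "acm",
    certificate_arn="arn:aws:acm:us-east-1:111122223333:certificate/example",
)
file_tls = listener_tls(
    "file",
    certificate_chain="/certs/service_cert_chain.pem",
    private_key="/certs/service_key.pem",
)
sds_tls = listener_tls("sds", secret_name="example-service-secret")
```

Only one source is used per listener; the mode value (STRICT here) controls whether the listener also accepts plaintext connections.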

More details here

Traffic Telemetry

Metrics

Envoy emits many statistics on both its own operation and various dimensions on inbound and outbound traffic. These metrics are available through the /stats endpoint on the proxy’s administration port, which is typically 9901.
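For a quick look at these statistics without any collector, the sketch below parses the plain-text output of the /stats endpoint, which lists one "name: value" pair per line. The sample text is made up for illustration, and fetch_envoy_stats assumes the admin port is reachable on localhost:9901 from where the script runs.

```python
from urllib.request import urlopen


def parse_envoy_stats(text):
    """Parse 'name: value' lines into a dict of integer stats.

    Lines whose value is not a plain integer (histogram summaries, for
    example) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        name, separator, value = line.partition(": ")
        if not separator:
            continue
        try:
            stats[name] = int(value)
        except ValueError:
            continue
    return stats


def fetch_envoy_stats(host="localhost", port=9901, timeout=2.0):
    """Fetch and parse stats from the Envoy admin endpoint."""
    with urlopen(f"http://{host}:{port}/stats", timeout=timeout) as response:
        return parse_envoy_stats(response.read().decode("utf-8"))


# Illustrative sample only; real stat names depend on your mesh resources.
SAMPLE_STATS = """\
cluster.example_backend.upstream_rq_total: 42
cluster.example_backend.upstream_rq_5xx: 3
cluster.example_backend.upstream_rq_time: P0(nan,0) P50(nan,12)
server.live: 1
"""

parsed = parse_envoy_stats(SAMPLE_STATS)
error_rate = (parsed["cluster.example_backend.upstream_rq_5xx"]
              / parsed["cluster.example_backend.upstream_rq_total"])
```

In practice a collector does this scraping for you, but parsing the raw output by hand can help when deciding which metrics are worth collecting.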

You can install the CloudWatch agent or Prometheus in your cluster and configure it to collect a subset of metrics from your proxies. The App Mesh metrics extension provides a subset of useful metrics that give you insight into the behavior of the resources you define in your mesh.

Logs

When you create your virtual nodes and virtual gateways, you have the option to configure Envoy access logs. You can export them to a log storage and processing service like CloudWatch Logs using standard Docker log drivers such as awslogs.

Tracing

App Mesh supports distributed tracing through the Envoy proxies. You can visualize the traces with X-Ray, Jaeger, or Datadog.

Final Thoughts

App Mesh, being a managed service, reduces the complexity and overhead of managing the service mesh. App Mesh’s roadmap is public and is available here.
