Capturing and Retaining OTEL Logs from AWS EKS: Strategies for Cost-Effective Searchability

Jul 8, 2025

OpenTelemetry (OTEL) has emerged as the de facto standard for observability in modern cloud-native applications. It enables developers to emit metrics, traces, and logs in a vendor-neutral format. When running applications in Amazon EKS (Elastic Kubernetes Service), leveraging OTEL can provide deep insight into workloads. However, capturing, storing, and making OTEL logs searchable for an extended time period, while maintaining cost efficiency and performance, requires careful design.

Before OTEL’s rise, engineers typically relied on a patchwork of log collectors like Fluent Bit, Fluentd, Filebeat, or vendor-specific agents tied to logging platforms such as Datadog, Splunk, or ELK stacks. While these tools remain viable in many contexts, they often lock users into proprietary formats, complicate interoperability, or fragment telemetry data into isolated silos. OTEL stands out by offering a unified, extensible specification for logs, traces, and metrics, which simplifies instrumentation and opens the door to standardized observability pipelines. This makes OTEL the preferred choice for cloud-native environments where flexibility, portability, and long-term sustainability are essential.

This blog post explores three architectural approaches for capturing OTEL logs in AWS from EKS workloads, compares them in terms of complexity, cost, and query performance, and concludes with a recommendation optimized for long-term retention and searchability.

Approach 1: Push to CloudWatch Logs with Export to S3

The most straightforward strategy is to use the OpenTelemetry Collector to push logs directly to Amazon CloudWatch Logs. From there, a CloudWatch Logs subscription filter routes log events to Amazon Data Firehose (formerly Kinesis Data Firehose), which delivers the data to Amazon S3 for long-term storage.

Architecture Overview:

  • OTEL Collector sends logs to CloudWatch Logs.
  • A CloudWatch Logs subscription filter forwards log events to a Kinesis Data Firehose stream.
  • Firehose buffers the events and writes them to S3 in batches.
  • Athena is used to query logs from S3.
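
As a rough sketch of the second step, the subscription filter can be created through the CloudWatch Logs `PutSubscriptionFilter` API. The log group name, ARNs, account ID, and filter name below are hypothetical placeholders, and the IAM role is assumed to already allow CloudWatch Logs to write to the Firehose stream.

```python
def subscription_filter_request(log_group, firehose_arn, role_arn,
                                name="otel-logs-to-firehose"):
    """Build keyword arguments for CloudWatch Logs PutSubscriptionFilter."""
    return {
        "logGroupName": log_group,
        "filterName": name,            # hypothetical filter name
        "filterPattern": "",           # empty pattern forwards every log event
        "destinationArn": firehose_arn,
        "roleArn": role_arn,           # role CloudWatch Logs assumes to write to Firehose
    }


def apply_subscription_filter(request):
    """Send the request with boto3 (needs AWS credentials and permissions)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    boto3.client("logs").put_subscription_filter(**request)


# Hypothetical names and ARNs, for illustration only.
request = subscription_filter_request(
    log_group="/eks/otel/app-logs",
    firehose_arn="arn:aws:firehose:us-east-1:123456789012:deliverystream/otel-logs-firehose",
    role_arn="arn:aws:iam::123456789012:role/cwl-to-firehose",
)
print(request["destinationArn"])
```

Calling `apply_subscription_filter(request)` would create the filter. Keeping the request builder separate makes it easy to review or unit-test the settings before touching the account.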

Pros:

  • Fully managed services with native AWS integration.
  • Minimal operational overhead.
  • Athena provides SQL-like querying on logs in S3.

Cons:

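  • CloudWatch Logs ingestion ($0.50/GB) adds up quickly at high volume: 500 GB/month costs about $250 before any storage or query charges.
  • Athena queries over archived data in S3 are slower than searches against an index.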
  • Real-time search capabilities are limited.

This method is reliable and scalable but becomes cost-inefficient if real-time querying and extensive indexing are required.

Approach 2: Route OTEL Logs Directly to OpenSearch via Fluent Bit

Another design involves using Fluent Bit or Fluentd as a log forwarder on each EKS node to collect and forward logs to Amazon OpenSearch Service. OpenTelemetry logs are emitted to stdout/stderr or files, and Fluent Bit parses and pushes them to OpenSearch.

Architecture Overview:

  • Applications write their OTEL logs locally, to stdout/stderr or to files.
  • Fluent Bit collects logs from containers and forwards them to OpenSearch.
  • Index State Management (ISM) in OpenSearch rolls over indices and moves older data to UltraWarm or cold storage.
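
To make the lifecycle step concrete, here is a minimal sketch that builds an ISM policy document as a Python dictionary. The index pattern, the 30-day and 180-day ages, and the policy description are assumptions for illustration. Rollover and cold storage are left out to keep it short, and `warm_migration` is the Amazon OpenSearch Service action for moving an index to UltraWarm.

```python
import json


def build_ism_policy(index_pattern, warm_after_days, delete_after_days):
    """Return an ISM policy body: hot -> warm (UltraWarm) -> delete, by index age."""
    return {
        "policy": {
            "description": "Age out OTEL log indices: hot -> warm -> delete",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [],
                    "transitions": [{
                        "state_name": "warm",
                        "conditions": {"min_index_age": f"{warm_after_days}d"},
                    }],
                },
                {
                    "name": "warm",
                    "actions": [{"warm_migration": {}}],  # move the index to UltraWarm
                    "transitions": [{
                        "state_name": "delete",
                        "conditions": {"min_index_age": f"{delete_after_days}d"},
                    }],
                },
                {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
            ],
            # Attach the policy automatically to new indices matching the pattern.
            "ism_template": {"index_patterns": [index_pattern], "priority": 100},
        }
    }


# Assumed values: daily indices named otel-logs-*, warm after 30 days, deleted after 180.
policy = build_ism_policy("otel-logs-*", warm_after_days=30, delete_after_days=180)
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of body that would be sent with `PUT _plugins/_ism/policies/<policy_id>`. Dropping the `delete` state keeps old indices in UltraWarm instead of removing them.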

Pros:

  • Near real-time search and filtering.
  • Powerful full-text search and OpenSearch Dashboards (the Kibana fork) for visualization.
  • Good for operational and forensic investigation.

Cons:

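  • Even as a managed service, OpenSearch domains need capacity planning, shard sizing, and upgrades, which is more operational work than the CloudWatch and S3 path.
  • Indexing costs grow quickly with high log volume.
  • Index State Management (ISM) helps manage cost, but storage is still more expensive than S3-based options.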

Approach 3: Tiered Architecture: Hot in OpenSearch, Cold in S3 via OTEL Collector

A hybrid model provides a balance between cost and performance. The OpenTelemetry Collector sends every log record to two destinations at once using multiple exporters: OpenSearch, which retains six months of data for real-time analysis, and S3, which holds the full archive for long-term retention.

Architecture Overview:

  • OTEL Collector is deployed as a sidecar or DaemonSet in EKS.
  • Logs are sent to Amazon OpenSearch (hot path).
  • Simultaneously, logs are delivered to Amazon S3 via Kinesis Firehose or directly through the OTEL Collector.
  • Amazon Athena queries the long-term archive in S3, with AWS Glue serving as the data catalog.

Pros:

  • Fast search on recent data.
  • Lower cost for older, cold storage logs.
  • No vendor lock-in due to open formats (e.g., Parquet or JSON).

Cons:

  • Requires a dual-export logs pipeline in the OTEL Collector (two exporters on the same pipeline).
  • Slightly more complex configuration and monitoring.

Cost Comparison

Here’s a simplified breakdown based on 500 GB/month of OTEL log data:

CloudWatch + S3:

  • CloudWatch Ingestion: ~$250/month
  • CloudWatch Retention (6 months): ~$75
  • Firehose + S3 Storage: ~$15/month
  • Athena Queries: Low and usage-based

OpenSearch Only:

  • Cluster (3 nodes): ~$600/month
  • S3 Snapshots: ~$10/month

Tiered (OpenSearch + S3):

  • OpenSearch (6 months): ~$400/month
  • S3 (18 months): ~$30/month
  • Athena Queries: ~$10/month (light usage)

The tiered model delivers a strong balance between cost, query performance, and retention flexibility.
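
The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted above and assumes every line item is a monthly charge, including the CloudWatch retention figure. The Athena cost for the CloudWatch option is left out because it is only described as low and usage-based.

```python
# Monthly line items in USD, copied from the breakdown above (500 GB/month).
MONTHLY_COSTS_USD = {
    "CloudWatch + S3": {
        "cloudwatch_ingestion": 250,
        "cloudwatch_retention": 75,   # assumed to be a monthly figure
        "firehose_and_s3": 15,
    },
    "OpenSearch only": {"cluster_3_nodes": 600, "s3_snapshots": 10},
    "Tiered (OpenSearch + S3)": {"opensearch": 400, "s3": 30, "athena": 10},
}

# CloudWatch ingestion: 500 GB at $0.50 per GB.
ingestion_usd = 500 * 0.50

# Sum the line items for each approach.
totals = {name: sum(items.values()) for name, items in MONTHLY_COSTS_USD.items()}

for name, total in totals.items():
    print(f"{name}: ~${total}/month")
```

On these numbers the totals are about $340, $610, and $440 per month. The tiered model costs more than the CloudWatch path but adds real-time search, and it costs less than running OpenSearch alone.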

Implementation Notes

To implement the tiered model, configure the OTEL Collector as follows:

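receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  opensearch:
    http:
      endpoint: https://search-otel-logs.region.es.amazonaws.com
    # Daily index name; the option name (logs_index here) and date syntax are
    # assumptions to verify against your exporter version.
    logs_index: otel-logs-%{+yyyy.MM.dd}
  awskinesis:
    aws:
      stream_name: otel-logs-firehose  # a Kinesis data stream that Firehose reads from
      region: us-east-1
    encoding:
      name: otlp_json

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [opensearch, awskinesis]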

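The awskinesis exporter writes to a Kinesis data stream, so attach a Firehose delivery stream that reads from it and delivers the logs into an S3 bucket, partitioned by date or metadata so that Athena scans only the relevant prefixes. Catalog the bucket with Glue for easy exploration.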
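
As an illustration of date partitioning, the sketch below computes a Hive-style S3 prefix for a timestamp and builds a matching Athena query. The `logs` base prefix, the `otel_logs` table name, and the `year`/`month`/`day` partition columns are hypothetical. The layout mirrors what a Firehose custom prefix such as `logs/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/` would produce.

```python
from datetime import datetime, timezone


def partition_prefix(ts, base="logs"):
    """Return the Hive-style S3 key prefix for the day containing ts."""
    return f"{base}/year={ts:%Y}/month={ts:%m}/day={ts:%d}/"


def athena_query(table, ts, limit=100):
    """Build a query restricted to one day's partition so Athena scans less data."""
    return (
        f"SELECT * FROM {table} "
        f"WHERE year = '{ts:%Y}' AND month = '{ts:%m}' AND day = '{ts:%d}' "
        f"LIMIT {limit}"
    )


day = datetime(2025, 7, 8, tzinfo=timezone.utc)
prefix = partition_prefix(day)
query = athena_query("otel_logs", day)  # hypothetical table name
print(prefix)
print(query)
```

Because Athena bills by data scanned, filtering on the partition columns keeps both cost and query time down for lookups in the archive.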

Final Recommendation

For applications in Amazon EKS using OpenTelemetry, a tiered architecture with Amazon OpenSearch for hot storage and Amazon S3 + Athena for cold storage is the most efficient way to meet long-term search and retention requirements.

It offers:

  • Real-time access to logs from the past six months
  • Low-cost, queryable archive for long-term retention
  • Reduced vendor lock-in and more flexibility

While CloudWatch or OpenSearch alone may suffice in specific scenarios, the hybrid model is best suited for teams looking to optimize both performance and budget.

Written by Don Spidell

Cloud Architect Lead at Allocore (formerly Summit Technology Group). Long-time AWS user highlighting interesting use cases and solutions built on AWS.
