Into the Kestrel’s Nest: A Deep Dive into High-Volume Log Management

Published in Agoda Engineering & Design · 3 min read · Jul 27, 2023

This week, Agoda hosted a tech talk, "Log Kestrel: A Deep Dive into High-Volume Log Management." Our speaker, Evgeniy Zuykin, Technical Lead at Agoda, explained log aggregation and how Log Kestrel has changed the way Agoda handles vast amounts of log data. This blog post summarizes the key points discussed during the talk.

Watch the full talk for insights into Log Kestrel's capabilities and how it empowers Agoda to manage and analyze logs efficiently.

This post covers data compression, chunk-based storage, query processing, and scalability. These technical details show how Log Kestrel handles large-scale log data efficiently and gives developers and engineers real-time access to it.

Data Compression

Log Kestrel combines string interning with the LZ4 compression algorithm to balance compression ratio against decompression speed. ZSTD offers a higher compression ratio, but LZ4 decompresses faster, which makes it the better fit for Log Kestrel and gives a better overall user experience.
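To make the string interning idea concrete, here is a minimal sketch in Python. It is not Log Kestrel's implementation: the `StringInterner` class, the `encode_logs` helper, and the record layout are hypothetical names chosen for illustration. The sketch replaces repeated values such as application names and log levels with small integer IDs. A compressor such as LZ4 would then run over the encoded chunk; that step is omitted because it needs a third-party library.

```python
class StringInterner:
    """Maps each distinct string to a small integer ID (hypothetical helper)."""

    def __init__(self):
        self._ids = {}       # string -> ID
        self._strings = []   # ID -> string

    def intern(self, value: str) -> int:
        """Return the ID for value, assigning the next free ID on first sight."""
        idx = self._ids.get(value)
        if idx is None:
            idx = len(self._strings)
            self._ids[value] = idx
            self._strings.append(value)
        return idx

    def lookup(self, idx: int) -> str:
        """Return the original string for an ID."""
        return self._strings[idx]


def encode_logs(records, interner):
    """Replace the repetitive app and level fields with interned IDs.

    Assumes each record is an (app, level, message) tuple.
    """
    return [
        (interner.intern(app), interner.intern(level), message)
        for app, level, message in records
    ]


interner = StringInterner()
encoded = encode_logs(
    [
        ("booking-api", "ERROR", "payment timeout"),
        ("booking-api", "INFO", "request served"),
        ("search-api", "ERROR", "index unavailable"),
    ],
    interner,
)
```

Each distinct string is stored once in the interner, so a chunk with millions of records for the same application carries the application name only once.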

Chunk-Based Storage and Organization

Chunk-based storage forms the backbone of Log Kestrel’s architecture, allowing efficient grouping and organization of log data. Each chunk contains a batch of logs grouped by hourly block, application name, and log level. Chunks are stored on disk, while their metadata is kept in memory. This approach lets Log Kestrel handle vast amounts of log data daily, speeds up data retrieval, and minimizes disk reads. Log Kestrel also manages blocks in a way that helps reduce the number of entries in S3 storage.
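The grouping described above can be sketched as follows. This is an illustrative Python example, not Log Kestrel's code: `ChunkKey`, `chunk_key`, `group_into_chunks`, `chunks_for_query`, and the record layout are all assumptions. It shows how a key built from the hour, application name, and log level lets a query select matching chunks from in-memory metadata before touching any log payload.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import NamedTuple


class ChunkKey(NamedTuple):
    """Hypothetical chunk identity: hourly block, application name, log level."""
    hour: str
    app: str
    level: str


def chunk_key(timestamp: float, app: str, level: str) -> ChunkKey:
    """Truncate a Unix timestamp to its UTC hour and build the chunk key."""
    hour = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%dT%H")
    return ChunkKey(hour, app, level)


def group_into_chunks(records):
    """Group (timestamp, app, level, message) records into chunks by key."""
    chunks = defaultdict(list)
    for timestamp, app, level, message in records:
        chunks[chunk_key(timestamp, app, level)].append(message)
    return dict(chunks)


def chunks_for_query(chunks, app, level):
    """Pick matching chunk keys using the keys alone, without reading messages."""
    return sorted(key for key in chunks if key.app == app and key.level == level)


chunks = group_into_chunks(
    [
        (0.0, "booking-api", "ERROR", "payment timeout"),
        (120.0, "booking-api", "ERROR", "payment timeout again"),
        (3600.0, "booking-api", "ERROR", "card declined"),
        (10.0, "search-api", "INFO", "request served"),
    ]
)
matching = chunks_for_query(chunks, "booking-api", "ERROR")
```

In this sketch the keys play the role of the in-memory metadata, and the message lists stand in for the chunk bodies that would live on disk.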

Query Processing: Balancing Real-Time and Historical Data

Log Kestrel focuses on robust access to the most recent data, storing it on ingestion nodes. Historical nodes apply aggressive caching strategies to minimize S3 and network workload.
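As an illustration of caching on a historical node, here is a minimal read-through cache sketch in Python. The talk summary does not say which caching strategy or eviction policy Log Kestrel uses, so the least-recently-used policy, the `ChunkCache` class, and the `fetch` callback (standing in for a read from S3) are all assumptions.

```python
from collections import OrderedDict


class ChunkCache:
    """Read-through cache with least-recently-used eviction (assumed policy)."""

    def __init__(self, fetch, capacity: int):
        self._fetch = fetch          # called on a miss, e.g. a read from remote storage
        self._capacity = capacity    # maximum number of cached chunks
        self._items = OrderedDict()  # ordered from least to most recently used
        self.misses = 0

    def get(self, key):
        if key in self._items:
            # Hit: mark the chunk as most recently used and skip the remote read.
            self._items.move_to_end(key)
            return self._items[key]
        # Miss: load the chunk, store it, and evict the oldest entry if over capacity.
        self.misses += 1
        value = self._fetch(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)
        return value


# Hypothetical loader that stands in for fetching a chunk from S3.
cache = ChunkCache(fetch=lambda key: f"contents of {key}", capacity=2)
cache.get("chunk-a")
cache.get("chunk-b")
cache.get("chunk-a")  # served from the cache
```

Every hit avoids one remote read, which is how caching on historical nodes reduces S3 and network workload.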

Scalability and Federated API

Log Kestrel scales through sharding to meet the demands of large-scale log data processing. Data streams are divided into shards, each with its own primary and backup instances. This distribution allows efficient handling of data from Kafka and ensures high availability during deployments and maintenance.

The Federated API plays a critical role in distributing queries between real-time and historical data. It divides the query workload across instances using rendezvous hashing, which keeps cache hit rates high and the workload evenly distributed. However, this method requires caching metadata on all historical instances, which might require additional vertical scaling of historical nodes.
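Rendezvous hashing (also called highest random weight hashing) can be sketched in a few lines of Python. This is a generic illustration of the technique rather than Log Kestrel's code: the node names, the chunk key format, and the choice of SHA-256 as the scoring hash are assumptions.

```python
import hashlib


def _score(node: str, key: str) -> int:
    """Deterministic pseudo-random weight for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_node(nodes, key: str) -> str:
    """Route key to the node with the highest score for that key."""
    if not nodes:
        raise ValueError("no nodes available")
    return max(nodes, key=lambda node: _score(node, key))


# Hypothetical historical instances and chunk identifiers.
nodes = ["historical-1", "historical-2", "historical-3"]
keys = [f"chunk-{i}" for i in range(200)]
before = {key: pick_node(nodes, key) for key in keys}

# Remove one instance: only the keys it owned are reassigned.
remaining = ["historical-1", "historical-2"]
after = {key: pick_node(remaining, key) for key in keys}
```

Because every caller computes the same winner for a given key, repeated queries for the same chunk land on the same instance and reuse its cache. When an instance leaves, only the keys it owned move, so the caches on the other instances stay warm.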

Conclusion

Log Kestrel is Agoda’s approach to modern log aggregation. It emphasizes data compression, chunk-based storage, and optimized query handling, and it gives developers the tools to access log data in real time, increasing the speed and accuracy of their analyses.

It balances the need for detailed historical data against the challenges of scalability. Built for large-scale, data-heavy environments, Log Kestrel is a refined offering among log aggregation platforms. As Agoda continues to enhance and grow Log Kestrel, it promises to significantly reshape how logs are managed and analyzed.