How Airtel revamped the in-house API Logging Framework

Sachin Sharma
Published in Airtel Digital · Oct 14, 2022

Brief about Smart API

Every modern application based on REST APIs requires extensive logging and auditing of all the interactions.

Application logs can serve this purpose, but debugging through them is tiresome and time-consuming. At Airtel, we came up with the idea of defining a definite schema for logging all requests and responses. The idea was to make all the required data easy to visualize by restricting it to a predefined schema, while still giving applications the flexibility to choose data points on a per-use-case basis.

The key issues which led to the building of Smart API:

· Highly unstructured nature of application logs.

· Too much noise in application logs.

· Tools like Logstash could be used, but they are CPU intensive and carry maintenance and integration overheads for every new application.

What is Smart API?

Smart API is a framework which leverages Kafka to push audit logs and then persists them in Elasticsearch to enable visualization via Kibana.

Sounds almost like Logstash, doesn’t it? The difference is in the data that is pushed to Kafka: Smart API does not push unstructured application logs; instead, it leverages Apache Avro schemas to push structured data.

Apache Avro™ is a data serialization system.
Avro relies on schemas. When Avro data is read, the schema used when writing it is always present. This permits each datum to be written with no per-value overheads, making serialization both fast and small. This also facilitates use with dynamic, scripting languages, since data, together with its schema, is fully self-describing.
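For illustration only, an audit-log schema along these lines can be declared with Avro's SchemaBuilder. The record and field names below are hypothetical placeholders, not the actual Smart API schema:

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class AuditSchemaExample {

    // Hypothetical audit-log record; the real Smart API schema differs.
    public static final Schema API_AUDIT_LOG = SchemaBuilder
            .record("ApiAuditLog").namespace("com.example.audit")
            .fields()
            .requiredString("apiName")          // logical name of the API
            .requiredLong("timestamp")          // epoch millis of the call
            .requiredString("requestPayload")   // serialized request body
            .requiredString("responsePayload")  // serialized response body
            .optionalString("errorDetails")     // populated only on failure
            .requiredInt("httpStatus")
            .endRecord();
}
```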

SMART API ARCHITECTURE:

Need for Revamp?

Though we had been using Smart API in many of our applications for a few years, we recently hit a bottleneck in its core logging framework. For context, this is a Java artifact that can be included in any application; it exposes an annotation that can be used to push smart logs (Avro-schema-based logs) to Kafka.

The existing Smart API has proved to be quite robust and efficient at handling audit logs, but as mentioned above we ran into a few bottlenecks:

  1. The existing framework cannot handle the reactive stack, on either Netty or Tomcat; we had to work around this with an async-event bypass. It would also be difficult to handle gRPC calls via the existing system.
  2. It does not support Java modules. So, to use it in a project that declares a module, we have to pass --add-reads my.module=ALL-UNNAMED to the JVM.
  3. It relies extensively on Spring’s RequestContextHolder. There are scenarios where we need to do some logging asynchronously and no HttpContext is available.

Challenges?

While designing a generic system which can support multiple stacks, we faced some really interesting challenges:

  • Unavailability of the javax.servlet API in an app running on Netty using Spring WebFlux.
  • Our utility should be usable by both Named and Unnamed Java Modules.
  • Reactor is completely non-blocking and we did not want to interfere with the actual request execution, so we wanted to perform logging in doOn* hooks. But, as mentioned in this GitHub issue, doOn* hooks do not support context propagation.
  • The utility should be easily extensible to new stacks and new logging strategies in the future.
  • The utility should provide a definite structure for populating log data, but at the same time be flexible enough to allow applications to choose their data ingestion sources.

Solution:

As a solution, we decided to re-write (re-arch 😁) the smart-api utility.

This new Smart API exposes a configurable annotation, which takes two parameters (a usage sketch follows the list):

  1. loggerName: An Enum which defines the kind of logging to use. Currently we support NON_REACTIVE_AVRO_LOGGER and REACTIVE_AVRO_LOGGER (for Mono and Flux return types).
  2. loggingEvent: An interface which defines base operations performed as a part of all logging strategies.
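For illustration, applying the annotation to a reactive endpoint might look roughly like the sketch below. The annotation name @SmartApiLog, the enum type LoggerName, and the OrderLoggingEvent/OrderService/Order classes are assumed names for the sake of the example; only the two parameters and the enum constants come from the design above.

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

@RestController
public class OrderController {

    private final OrderService orderService;

    public OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    // Reactive return type (Mono), so the REACTIVE_AVRO_LOGGER strategy is used.
    @SmartApiLog(loggerName = LoggerName.REACTIVE_AVRO_LOGGER,
                 loggingEvent = OrderLoggingEvent.class)
    @GetMapping("/orders/{id}")
    public Mono<Order> getOrder(@PathVariable String id) {
        return orderService.findById(id);
    }
}
```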

Logging Event:

This interface is what allows the utility to be flexible enough to handle different scenarios; it provides multiple interception points at different stages of logging and API execution.

  1. getApiName: Returns a name to identify the different APIs
  2. generateNewLoggingSchema: The type of Logging Schema is a generic. The important thing to note here is that the pipeline relies on logs to follow a definite schema. So logging events should declare their schema. The type of schema is used to determine the strategy for logging.
  3. populateRequestFields: Here, the application is expected to populate any information before the actual execution of the API. The input arguments are provided in the arguments array; this method provides an interception point before actual execution.
  4. populateResponseFields: Here, the application is expected to add any logs related to the result of the API execution.
  5. populateExceptionDetails: This interception point is invoked in case the API execution throws an error.
  6. getLoggableRequestContext: This allows an application to provide a custom request context, thereby enabling the application to control request context ingestion. If not provided, the utility is smart enough to identify the server stack the application is running on and use built-in utilities to populate the request context.

This interface can always be extended to provide specific operations for different logging strategies. Currently, we have only the Avro-based logging strategy, which adds one more method:

getKafkaTopic(): The Kafka topic to push API logs to.
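Put together, the contract might look roughly like the sketch below. The signatures and the generic parameter are assumptions inferred from the descriptions above, not the published interface:

```java
// Rough sketch of the contract described above; signatures are assumed.
public interface LoggingEvent<T> {

    // Identifies the API being logged.
    String getApiName();

    // Declares the schema instance that the pipeline will serialize and push.
    T generateNewLoggingSchema();

    // Interception point before API execution; arguments are the method inputs.
    void populateRequestFields(T schema, Object[] arguments);

    // Interception point after successful execution.
    void populateResponseFields(T schema, Object result);

    // Interception point when execution throws.
    void populateExceptionDetails(T schema, Throwable error);

    // Optionally supply a custom request context; returning null lets the
    // utility detect the server stack and populate the context itself.
    default LoggableRequestContext getLoggableRequestContext() {
        return null;
    }
}

// Placeholder for the utility's request-context type.
interface LoggableRequestContext { }

// Avro-specific extension adds the Kafka topic to publish to.
interface AvroLoggingEvent<T extends org.apache.avro.specific.SpecificRecord>
        extends LoggingEvent<T> {
    String getKafkaTopic();
}
```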

Challenges Resolved:

  1. The abstractions that we have make our utility flexible enough to handle multiple logging strategies and application stacks.
  2. We identified all the packages that applications will need to use, and exported them to all modules in our module descriptor.
  3. For javax.servlet, we included it as an optional dependency in our pom and a static requirement (requires static) in our module-info.java (see the module-info sketch after this list). This ensures that the dependency is available at compile time but is not required at link time or run time; it assumes that whenever the servlet-related functionality is invoked, the dependency will be provided by the application or system.
  4. We use Project Reactor’s Mono Context and a Hook with a ThreadLocal to maintain the request context in reactive applications.
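A minimal sketch of what such a module declaration might look like; the module, package, and dependency module names are placeholders (the servlet API's module name in particular depends on the artifact version you use):

```java
// module-info.java — illustrative names only.
module com.airtel.smartapi {
    // Compile-time only: the application or system provides the servlet API at runtime.
    requires static java.servlet;

    // Packages that applications consume are exported to all modules.
    exports com.airtel.smartapi.annotation;
    exports com.airtel.smartapi.event;
}
```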

ReactiveRequestContextFilter uses Mono.contextWrite to write the requestContext into the Mono context before the execution of the request.
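Conceptually, such a filter is a WebFilter that puts the incoming request into the Reactor Context; the class below is a sketch of that idea with an assumed context key, not the actual implementation:

```java
import org.springframework.http.server.reactive.ServerHttpRequest;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ServerWebExchange;
import org.springframework.web.server.WebFilter;
import org.springframework.web.server.WebFilterChain;
import reactor.core.publisher.Mono;

// Sketch: writes the incoming request into the Reactor Context so that
// downstream logging code can rebuild a request context without HttpContext.
@Component
public class ReactiveRequestContextFilter implements WebFilter {

    public static final String REQUEST_CONTEXT_KEY = "smartApiRequestContext"; // assumed key

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, WebFilterChain chain) {
        ServerHttpRequest request = exchange.getRequest();
        return chain.filter(exchange)
                    .contextWrite(ctx -> ctx.put(REQUEST_CONTEXT_KEY, request));
    }
}
```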

a) ThreadLocalContextFilter: It implements a BiFunction which can be used as a Hook in a Project Reactor app and intercepts all subscriptions. Here we use a ThreadLocal to preserve the context, and then use this context to create our LoggableRequestContext.
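One possible shape of such a filter, assuming the BiFunction signature expected by Reactor's Operators.lift and the context key from the WebFilter sketch above; the actual implementation may differ:

```java
import java.util.function.BiFunction;

import org.reactivestreams.Subscription;
import reactor.core.CoreSubscriber;
import reactor.core.Scannable;
import reactor.util.context.Context;

// Sketch: wraps every subscriber so the Reactor Context written by
// ReactiveRequestContextFilter is copied into a ThreadLocal, where logging
// code can read it even when no HttpContext is available.
public class ThreadLocalContextFilter
        implements BiFunction<Scannable, CoreSubscriber<? super Object>, CoreSubscriber<? super Object>> {

    public static final ThreadLocal<Object> REQUEST_CONTEXT = new ThreadLocal<>();

    @Override
    public CoreSubscriber<? super Object> apply(Scannable scannable,
                                                CoreSubscriber<? super Object> actual) {
        return new CoreSubscriber<Object>() {
            @Override
            public Context currentContext() {
                return actual.currentContext();
            }

            @Override
            public void onSubscribe(Subscription s) {
                // Copy the request stored by the WebFilter into the ThreadLocal.
                actual.currentContext()
                      .getOrEmpty(ReactiveRequestContextFilter.REQUEST_CONTEXT_KEY)
                      .ifPresent(REQUEST_CONTEXT::set);
                actual.onSubscribe(s);
            }

            @Override public void onNext(Object value) { actual.onNext(value); }
            @Override public void onError(Throwable t) { actual.onError(t); }
            @Override public void onComplete() { actual.onComplete(); }
        };
    }
}
```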

b) This ContextFilter is registered in Reactor Hooks like this:
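The original snippet is not reproduced here; a registration along the following lines, using Hooks.onEachOperator with Operators.lift, is one way this could be wired up (the hook key name is an assumption):

```java
import reactor.core.publisher.Hooks;
import reactor.core.publisher.Operators;

public final class SmartApiReactorHooks {

    private SmartApiReactorHooks() { }

    // Called once at application startup (e.g. from a configuration class).
    public static void registerContextHook() {
        // Lift every operator so each new subscription passes through the filter.
        Hooks.onEachOperator("smartApiThreadLocalContext",
                Operators.lift(new ThreadLocalContextFilter()));
    }
}
```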

Conclusion:

We have been able to use this across all our Spring-based projects to capture and view more detailed audit logs carrying the important business KPIs, and to provide better debugging with more insightful data rather than just access logs. The USPs of this revamped version are:

1. It can be used with both Spring Reactive (WebFlux) and Web MVC, as well as non-web applications.

2. The ability to give the application more control over logging the request context.
