Post custom logs containing requests and responses passing through Azure API Management!
The original article was written in Japanese on July 12, 2023.
Recently the number of Azure OpenAI Service (AOAI) users has increased rapidly. Some of them asked me a question about custom logs.
Query
How do we use AOAI and Azure API Management (APIM) to post custom logs containing prompts and responses from/to AOAI?
I asked them to elaborate on their use cases and requirements.
- They are developing internal systems powered by AOAI.
- They want to capture requests and responses to/from AOAI to monitor whether each prompt and response is appropriate.
- As logs collected from AOAI don’t include request/response payload information, they deploy APIM in front of AOAI to capture requests/responses to/from AOAI at the API gateway layer.
- They configured the APIM diagnostic settings by following the URL below. However, this does not meet their needs, because this configuration cannot capture payloads larger than 8,192 bytes.
All of them are worried about whether they can capture the whole payload when using large-context models such as gpt-35-turbo-16k and gpt-4-32k.
One possible solution
I shared the following idea with them. This is not the only solution, and others may well come up with different approaches.
- Compose the log data in the APIM outbound section and post it to Event Hubs.
- A Function App or Logic App listens to Event Hubs and passes the log data to a persistence layer such as storage or a NoSQL database.
- To simplify the configuration, apply this policy at the API scope rather than the operation scope.
Of course, calling a different API instead of Event Hubs also works, but in either case the main approach is the same: collect the information together in the outbound section. The Event Hubs Basic SKU can handle events up to 256 KB (up to 1 MB for the Standard/Premium SKUs), so the Basic SKU is sufficient for this case.
Try it!
For Event Hubs, APIM has a policy called log-to-eventhub.
1. inbound section
Context variables set in the inbound section remain available in the outbound section, so the headers and body of each request are captured and stored in context variables in the inbound section. This example obtains the request body and the api-key from the request header.
<set-variable name="request" value="@(context.Request.Body.As<JObject>(preserveContent: true))" />
<set-variable name="api-key" value="@(context.Request.Headers.GetValueOrDefault("api-key",""))" />
Please note that if a value is retrieved with context.Request.Body.As<T>(), the value is no longer available after that, as described below. If you want to keep the value, you need to specify preserveContent: true as an argument of context.Request.Body.As<T>().
By default, the As<T> and AsFormUrlEncodedContent() methods:
- Use the original message body stream.
- Render it unavailable after it returns.
To avoid that and have the method operate on a copy of the body stream, set the preserveContent parameter to true, as shown in examples for the set-body policy.
2. outbound section
Similarly, the response from the back-end service is stored in a context variable. It is also okay to deal with the body stream directly, of course.
<set-variable name="response" value="@(context.Response.Body.As<JObject>(preserveContent: true))" />
<set-variable name="id" value="@(context.Response.Headers.GetValueOrDefault("apim-request-id",""))" />
Next, the log-to-eventhub policy is used to post formatted logs to Event Hubs. In this case, all elements are packed into a JSON object (JObject). As the log-to-eventhub policy accepts only a string, the JSON must be converted to a string before it is passed to the policy.
<log-to-eventhub logger-id="logicojp-eventhubs">
    @{
        return new JObject(
            new JProperty("id", context.Variables["id"]),
            new JProperty("api-key", context.Variables["api-key"]),
            new JProperty("request-body", context.Variables["request"]),
            new JProperty("response-body", context.Variables["response"])
        ).ToString();
    }
</log-to-eventhub>
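As noted for the outbound section, the response body can also be read directly instead of going through a context variable. The following is a minimal sketch of that variation, not the author's configuration. It assumes the request and api-key variables were set in the inbound section as shown earlier.

```xml
<log-to-eventhub logger-id="logicojp-eventhubs">
    @{
        // Read a copy of the response body inside the expression;
        // preserveContent: true keeps the body available for the caller.
        var response = context.Response.Body?.As<JObject>(preserveContent: true);
        return new JObject(
            new JProperty("id", context.Response.Headers.GetValueOrDefault("apim-request-id", "")),
            new JProperty("api-key", context.Variables["api-key"]),
            new JProperty("request-body", context.Variables["request"]),
            new JProperty("response-body", response)
        ).ToString();
    }
</log-to-eventhub>
```

This removes two set-variable policies from the outbound section. The variable-based form may be easier to follow when the same values are reused by other policies.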
The logger-id needs to be configured by referring to the following document.
Please note that…
- We cannot create/update/delete/get the logger referenced by logger-id via Azure CLI as of now.
- It can be done in PowerShell if you want to connect to Event Hubs with a connection string.
- If you configure RBAC using a managed identity, it can only be done via the REST API, Bicep, or an ARM template (if you use the REST API, I recommend calling it with az rest).
That’s it! Now let’s test it!
Test it!
In this test, Logic Apps captures the custom logs posted from APIM to Event Hubs and inserts them into Cosmos DB (other options such as Functions or Stream Analytics would work as well). The backend service would normally be AOAI, but to confirm that the 8,192-byte limit does not apply, I set up a mock service with Logic Apps that returns a 9,000-byte string.
The Logic App is quite simple. When messages arrive in Event Hubs, the app is triggered and inserts them into Cosmos DB.
To simulate calling AOAI APIs, I called the API hosted by APIM with the following request message.
{
"prompt": "Generate a summary of the below conversation in the following format:\nCustomer problem:\nOutcome of the conversation:\nAction items for follow-up:\nCustomer budget:\nDeparture city:\nDestination city:\n\nConversation:\nUser: Hi there, I’m off between August 25 and September 11. I saved up 4000 for a nice trip. If I flew out from San Francisco, what are your suggestions for where I can go?\nAgent: For that budget you could travel to cities in the US, Mexico, Brazil, Italy or Japan. Any preferences?\nUser: Excellent, I’ve always wanted to see Japan. What kind of hotel can I expect?\nAgent: Great, let me check what I have. First, can I just confirm with you that this is a trip for one adult?\nUser: Yes it is\nAgent: Great, thank you, In that case I can offer you 15 days at HOTEL Sugoi, a 3 star hotel close to a Palace. You would be staying there between August 25th and September 7th. They offer free wifi and have an excellent guest rating of 8.49/10. The entire package costs 2024.25USD. Should I book this for you?\nUser: That sounds really good actually. Please book me at Sugoi.\nAgent: I can do that for you! Can I help you with anything else today?\nUser: No, thanks! Please just send me the itinerary to my email soon.\n\nSummary:",
"temperature": 0.0,
"top_p": 1,
"frequency_penalty": 0,
"presence_penalty": 0,
"max_tokens": 350,
"stop": null
}
When I checked the data inserted into the Cosmos DB container through Cosmos DB Explorer, I confirmed that the expected data was stored in the container.
This configuration can be set up at any scope, so we don’t have to repeat it in each operation scope.
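Putting the pieces together, an API-scope policy might look like the sketch below. It only assembles the inbound and outbound snippets shown earlier. The surrounding layout with <base /> elements is an assumption about a typical policy document and should be adapted to whatever the API already contains.

```xml
<policies>
    <inbound>
        <base />
        <!-- Capture the request body (as a copy) and the api-key header for later logging. -->
        <set-variable name="request" value="@(context.Request.Body.As<JObject>(preserveContent: true))" />
        <set-variable name="api-key" value="@(context.Request.Headers.GetValueOrDefault("api-key",""))" />
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
        <!-- Capture the response body (as a copy) and the request ID returned by the backend. -->
        <set-variable name="response" value="@(context.Response.Body.As<JObject>(preserveContent: true))" />
        <set-variable name="id" value="@(context.Response.Headers.GetValueOrDefault("apim-request-id",""))" />
        <!-- Pack everything into one JSON string and post it to the configured Event Hubs logger. -->
        <log-to-eventhub logger-id="logicojp-eventhubs">
            @{
                return new JObject(
                    new JProperty("id", context.Variables["id"]),
                    new JProperty("api-key", context.Variables["api-key"]),
                    new JProperty("request-body", context.Variables["request"]),
                    new JProperty("response-body", context.Variables["response"])
                ).ToString();
            }
        </log-to-eventhub>
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>
```

Because the policy sits at the API scope, every operation under the API inherits it through its own <base /> element, which is what avoids repeating the configuration per operation.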