In the previous post, we inspected calls to OpenAI APIs triggered within Langchain and LlamaIndex by using OpenTelemetry auto-instrumentation. The spans shown in the Jaeger UI were nice to see, but were missing the rich information expected from a proper instrumentation approach. In this post, we will explore how to enrich spans with additional information using manual instrumentation.
Manual instrumentation
OpenTelemetry provides the means to add additional attributes to spans. The OpenTelemetry specification defines two rules:
1. Keys must be non-null string values
2. Values must be a non-null string, boolean, floating point value, integer, or an array of these values
Additionally, most commonly used fields follow naming conventions and are referred to as semantic attributes.
Note: Beware of adding fields that may contain personally identifiable information (PII) to the span context. Unless you can guarantee that all systems processing the telemetry drop stored data after a fixed period of time (e.g. 30 days), you may run into challenges with privacy regulations, such as the GDPR and its ‘Right to be forgotten’.
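To make the two rules concrete, here is a minimal, stdlib-only sketch that checks a key/value pair against them. The helper is_valid_attribute is hypothetical and not part of the OpenTelemetry API; it only illustrates the rules:

```python
# Hypothetical helper illustrating the two attribute rules above;
# it is NOT part of the OpenTelemetry API.
def is_valid_attribute(key, value) -> bool:
    # Rule 1: keys must be non-null string values
    if not isinstance(key, str):
        return False
    # Rule 2: values must be a non-null string, boolean, floating point
    # value, integer, or an array of these values
    allowed = (str, bool, int, float)
    if isinstance(value, (list, tuple)):
        return all(isinstance(v, allowed) for v in value)
    return isinstance(value, allowed)

print(is_valid_attribute("arg", 42))           # True
print(is_valid_attribute("arg", None))         # False
print(is_valid_attribute("tags", ["a", "b"]))  # True
print(is_valid_attribute("obj", {"a": 1}))     # False: nested objects are not allowed
```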
Adding instrumentation to your own code
Instrumenting your own code is as simple as shown in the code below. It starts a new span called function_name with an attribute arg set to the value 42. Any spans added via auto-instrumentation to functions called by function_name automatically become its child spans.
from opentelemetry import trace

(...)

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("function_name") as span:
    arg = 42
    span.set_attribute("arg", arg)
    result = function_name(arg)
Alternatively, one can use the provided decorator, which results in simpler code when it is not necessary to capture any attributes on the spans.
@tracer.start_as_current_span("foobar")
def foobar(arg):
    result = foo_bar(arg)
Adding instrumentation to Langchain’s LLM Chains
Langchain offers Custom Callback Handlers as a means to execute additional functions at well-defined stages of a chain. To collect statistics on the prompts and token usage from the LLM calls, we can add spans in the on_llm_start and on_llm_end callbacks:
def on_llm_start(
    self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
) -> None:
    with tracer.start_as_current_span("on_llm_start") as span:
        prompts_len = sum(len(prompt) for prompt in prompts)
        span.set_attribute("num_processed_prompts", len(prompts))
        span.set_attribute("prompts_len", prompts_len)

def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
    with tracer.start_as_current_span("on_llm_end") as span:
        # example output: {'completion_tokens': 14, 'prompt_tokens': 71, 'total_tokens': 85}
        token_usage = response.llm_output["token_usage"]
        for k, v in token_usage.items():
            span.set_attribute(k, v)
Adding instrumentation for OpenAI Embeddings in LlamaIndex
LlamaIndex does not provide callback mechanisms for its embedding functions. Instead, we can extend the OpenAIEmbedding class, include instrumentation code in the overridden methods, and pass an instance of this class to the relevant methods of the library. In the added spans, we collect the text lengths as span attributes.
class InstrumentingOpenAIEmbedding(OpenAIEmbedding):
    def __init__(
        self,
        mode: str = OpenAIEmbeddingMode.TEXT_SEARCH_MODE,
        model: str = OpenAIEmbeddingModelType.TEXT_EMBED_ADA_002,
        deployment_name: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        """Init params."""
        super().__init__(**kwargs)
        self.mode = OpenAIEmbeddingMode(mode)
        self.model = OpenAIEmbeddingModelType(model)
        self.deployment_name = deployment_name

    def _get_query_embedding(self, query: str) -> List[float]:
        with tracer.start_as_current_span("_get_query_embedding") as span:
            span.set_attribute("query_length", len(query))
            return super()._get_query_embedding(query)

    def _get_text_embedding(self, text: str) -> List[float]:
        with tracer.start_as_current_span("_get_text_embedding") as span:
            span.set_attribute("text_length", len(text))
            return super()._get_text_embedding(text)

    async def _aget_text_embedding(self, text: str) -> List[float]:
        with tracer.start_as_current_span("_aget_text_embedding") as span:
            span.set_attribute("text_length", len(text))
            embeddings = await super()._aget_text_embedding(text)
            return embeddings

    def _get_text_embeddings(self, texts: List[str]) -> List[List[float]]:
        with tracer.start_as_current_span("_get_text_embeddings") as span:
            span.set_attribute("texts_len", sum(len(txt) for txt in texts))
            return super()._get_text_embeddings(texts)

    async def _aget_text_embeddings(self, texts: List[str]) -> List[List[float]]:
        with tracer.start_as_current_span("_aget_text_embeddings") as span:
            span.set_attribute("texts_len", sum(len(txt) for txt in texts))
            embeddings = await super()._aget_text_embeddings(texts)
            return embeddings
The obvious downside of this approach is that the code needs to be kept in sync with the extended base class, which increases maintenance effort for library upgrades.
Inspecting the spans
Running and using the code mentioned earlier produces two traces. First, the embedding span with the added attribute texts_len:

Next, the embedding traces and the on_llm_start and on_llm_end traces with the captured query_length and token usage attributes:
Writing an Instrumentor for OpenAI Embeddings in LlamaIndex
Extending classes can be cumbersome and adds unnecessary maintenance overhead. The built-in instrumentation offered by many of the OpenTelemetry instrumentation packages for Python provides inspiration for a different approach: instrumenting with function wrappers.
Following the example of the Redis instrumentation library, we use the convenient wrapt package to write a simple wrapper function for three methods of the OpenAIEmbedding class. The wrapper _traced calculates the length of the passed string(s) depending on the function’s argument type (str or List[str]).
from typing import Collection

from wrapt import wrap_function_wrapper

from llama_index.embeddings.openai import OpenAIEmbedding
from opentelemetry import trace
from opentelemetry.instrumentation.instrumentor import BaseInstrumentor
from opentelemetry.instrumentation.utils import unwrap
from opentelemetry.trace import SpanKind, Tracer, get_tracer


def _instrument(tracer: Tracer):
    def _traced(func, instance, args, kwargs):
        with tracer.start_as_current_span(
            "get_embedding", kind=SpanKind.CLIENT
        ) as span:
            if span.is_recording():
                if len(args) > 0 and args[0]:
                    if isinstance(args[0], list):
                        span.set_attribute("text_length", sum(len(e) for e in args[0]))
                    else:
                        span.set_attribute("text_length", len(args[0]))
            response = func(*args, **kwargs)
            return response

    wrap_function_wrapper("llama_index.embeddings.openai", "OpenAIEmbedding.get_query_embedding", _traced)
    wrap_function_wrapper("llama_index.embeddings.openai", "OpenAIEmbedding.get_text_embedding", _traced)
    wrap_function_wrapper("llama_index.embeddings.openai", "OpenAIEmbedding._get_text_embeddings", _traced)


class OpenAIEmbeddingInstrumentor(BaseInstrumentor):
    def instrumentation_dependencies(self) -> Collection[str]:
        return ("llama-index ~= 0.4.32",)

    def _instrument(self, **kwargs):
        """Instruments llama-index module"""
        tracer_provider = kwargs.get("tracer_provider")
        tracer = get_tracer(__name__, "custom-tracer-version", tracer_provider)
        _instrument(tracer)

    def _uninstrument(self, **kwargs):
        unwrap(OpenAIEmbedding, "get_query_embedding")
        unwrap(OpenAIEmbedding, "get_text_embedding")
        unwrap(OpenAIEmbedding, "_get_text_embeddings")
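Independent of OpenTelemetry and wrapt, the wrap/unwrap mechanics can be illustrated with a stdlib-only sketch. Embedder, wrap_method, and unwrap_method below are hypothetical stand-ins for OpenAIEmbedding, wrap_function_wrapper, and unwrap; the wrapper receives (func, instance, args, kwargs), just like _traced:

```python
import functools

# Hypothetical stand-ins: Embedder mimics OpenAIEmbedding, wrap_method
# mimics wrapt's wrap_function_wrapper, unwrap_method mimics unwrap.
class Embedder:
    def get_text_embedding(self, text):
        return [0.0] * 3  # placeholder embedding

captured = []

def _traced(func, instance, args, kwargs):
    # Record the text length, analogous to span.set_attribute above.
    captured.append(len(args[0]))
    return func(*args, **kwargs)

def wrap_method(cls, name, wrapper):
    original = getattr(cls, name)

    @functools.wraps(original)
    def bound(self, *args, **kwargs):
        return wrapper(functools.partial(original, self), self, args, kwargs)

    setattr(cls, name, bound)  # functools.wraps stored original in __wrapped__

def unwrap_method(cls, name):
    setattr(cls, name, getattr(cls, name).__wrapped__)

wrap_method(Embedder, "get_text_embedding", _traced)
result = Embedder().get_text_embedding("hello world")
print(captured)  # [11]

unwrap_method(Embedder, "get_text_embedding")
Embedder().get_text_embedding("hello world")
print(captured)  # still [11]: the wrapper no longer runs
```

Restoring the original attribute in unwrap_method is, in essence, what the unwrap helper from opentelemetry.instrumentation.utils does via wrapt.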
To ensure the instrumentor is actually used, it needs to be initialized with OpenAIEmbeddingInstrumentor().instrument() before the first library calls are initiated. The resulting traces generated by the OpenAIEmbeddingInstrumentor are as follows:
Summary
We explored adding additional context to spans by adding instrumentation in three different ways: (1) manual instrumentation of individual function calls, (2) extending classes to override methods with ones that include tracing code, (3) instrumenting library code using function wrappers. When to use which approach is highly contextual and depends on the use case at hand. Approach 1 is best used for one’s own code, approach 3 for instrumenting libraries, and approach 2 when a high degree of control over instrumentation is required.
Be careful not to overdo instrumentation, and rely on the provided instrumentation packages whenever applicable. When considering manual instrumentation, balance the benefits of additional detail against the complexity it may introduce. Note that in production deployments, tracing data is often sampled to cope with high data volume and keep the cost footprint of tracing in check, which affects the accuracy of the collected data.
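As a rough sketch of how such sampling behaves, a ratio-based head sampler keeps a trace when its id falls below a threshold derived from the ratio. The should_sample function below is hypothetical, a simplification of what OpenTelemetry's TraceIdRatioBased sampler does:

```python
import random

# Hypothetical sketch of ratio-based head sampling: keep a trace when its
# 64-bit id falls below ratio * 2**64 (a simplification of OpenTelemetry's
# TraceIdRatioBased sampler).
def should_sample(trace_id: int, ratio: float) -> bool:
    return trace_id < ratio * 2**64

random.seed(0)  # fixed seed for a reproducible illustration
kept = sum(should_sample(random.getrandbits(64), 0.25) for _ in range(10_000))
print(f"{kept} of 10000 traces kept")  # roughly a quarter survive sampling
```

Any attribute-derived statistics (token counts, prompt lengths) computed from traces are therefore estimates over the sampled subset, not exact totals.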