Originally published on Medium.
In the previous post, we inspected calls to OpenAI APIs triggered within Langchain and LlamaIndex by using OpenTelemetry auto-instrumentation. The spans shown in Jaeger UI were nice to see, but were missing rich information that is expected from a proper instrumentation approach. In this post, we will explore how to enrich spans with additional information using manual instrumentation.
Manual instrumentation
OpenTelemetry provides the means to attach additional attributes to spans. The OpenTelemetry specification defines two rules:
- Keys must be non-null string values
- Values must be a non-null string, boolean, floating point value, integer, or an array of these values
Additionally, most commonly used fields follow naming conventions and are referred to as semantic attributes.
Note: Beware of adding fields that may contain PII information to span context. Unless you guarantee that all systems processing the telemetry drop stored data after a fixed period of time (e.g. 30 days), you may run into challenges related to privacy regulation, such as GDPR and its ‘Right to be forgotten’.
Adding instrumentation to your own code
Instrumenting your own code is as simple as the code below shows. It starts a new span called function_name with an attribute arg set to the value 42.
Any spans that are added using auto-instrumentation to functions called by function_name will automatically become its child spans.
Alternatively, one can use the provided decorator, which results in simpler code when it's not necessary to capture any attributes in the spans.
Adding instrumentation to Langchain’s LLM Chains
Langchain offers Custom Callback Handlers as a means to execute additional functions at well-defined stages of a chain. To collect statistics on the prompts and the token usage of the LLM calls, we can add spans in the on_llm_start and on_llm_end callbacks:
Adding instrumentation for OpenAI Embeddings in LlamaIndex
LlamaIndex does not provide callback mechanisms for its embeddings functions. Instead, we can extend the OpenAIEmbedding class, include instrumentation code in the overridden methods, and pass an instance of this class to the relevant methods of the library. In the added spans, we collect the text lengths as span attributes.
The obvious downside of this approach is that the code needs to be kept in sync with the extended base class, which increases the maintenance effort for library upgrades.
Inspecting the spans
Running and using the code mentioned earlier produces two traces. First, the embedding span with the added attribute texts_len:

Screenshot from Jaeger UI showing the added embedding span with its attributes
Next, the embedding traces and the on_llm_start and on_llm_end spans with the captured query_length and token usage attributes:

Screenshot from Jaeger UI showing the captured traces and LLM token usage attributes
Writing an Instrumentor for OpenAI Embeddings in LlamaIndex
Extending classes can be cumbersome and adds unnecessary maintenance overhead. The built-in instrumentation offered by many OpenTelemetry instrumentation packages for Python provides inspiration for a different approach: instrumentation using function wrappers.
Following the example of the Redis instrumentation library, we use the convenient wrapt package to write a simple wrapper function for three methods in the OpenAIEmbedding class. The wrapper _traced calculates the length of the passed string(s) depending on the function’s argument type (str or List[str]).
To ensure the instrumentor is actually used, it needs to be initialized with OpenAIEmbeddingInstrumentor().instrument() before the first library calls are initiated. The resulting traces generated by the instrumentor code are as follows:

Screenshot from Jaeger UI depicting the span generated by the generic OpenAIEmbeddingInstrumentor
Summary
We explored enriching spans with additional context in three different ways: (1) manual instrumentation of individual function calls, (2) extending classes to override methods with versions that include tracing code, and (3) instrumenting library code using function wrappers. Which approach to use is highly contextual and depends on the use case at hand. Approach 1 is best suited to one's own code, approach 3 to instrumenting libraries, and approach 2 to cases where a high degree of control over the instrumentation is required.
It’s important not to overdo instrumentation and to rely on the provided instrumentation packages whenever applicable. When considering manual instrumentation, balance the benefit of additional detail against the complexity it introduces. Note that in production deployments, tracing data is often sampled to cope with high data volumes and keep the cost footprint of tracing in check; this affects the accuracy of the collected data.