LLM Providers

TraceVerde auto-instruments 19+ LLM providers. Beyond a single genai_otel.instrument() call at startup, no application code changes are needed: install the provider SDK and TraceVerde handles the rest.

Providers with Full Cost Tracking

| Provider | Models | Install Extra | Example |
| --- | --- | --- | --- |
| OpenAI | GPT-4o, GPT-4 Turbo, GPT-5.2, o1/o3, embeddings (50+) | [openai] | example |
| OpenRouter | All models via OpenAI-compatible API | [openrouter] | example |
| Anthropic | Claude Sonnet 4.6, Claude 3.5/3 series (15+) | [anthropic] | example |
| Google AI | Gemini 2.5/2.0 Pro/Flash, PaLM 2 (30+) | [google] | example |
| AWS Bedrock | Amazon Titan, Claude, Llama, Mistral (25+) | [aws] | example |
| Azure OpenAI | Same as OpenAI with Azure pricing | [openai] | example |
| Cohere | Command R/R+, Embed v4/v3, rerankers (15+) | [cohere] | example |
| Mistral AI | Large/Medium/Small, Mixtral, embeddings (20+) | [mistral] | example |
| Together AI | DeepSeek-R1, Llama 3.x, Qwen (25+) | [together] | example |
| Groq | Llama 3.x, Mixtral, Gemma, Whisper (20+) | [groq] | example |
| Ollama | All local models with token tracking | [ollama] | example |
| Vertex AI | Gemini models via Google Cloud | [vertexai] | example |
| SambaNova | Llama 3.x, DeepSeek, Qwen | [sambanova] | example |
| Sarvam AI | Sarvam-M, Saarika, Bulbul (12+, Indian language models) | [sarvamai] | example |
| Replicate | Hardware-based pricing ($/second) | [replicate] | example |
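Each Install Extra in the table maps onto a pip extra of the genai-otel-instrument package. For example, for OpenAI:

```shell
pip install genai-otel-instrument[openai]
```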

Quick Example: OpenAI

import genai_otel
genai_otel.instrument()

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is OpenTelemetry?"},
    ],
    max_tokens=150,
)

print(f"Response: {response.choices[0].message.content}")
print(f"Tokens used: {response.usage.total_tokens}")
# Traces, metrics, and costs are automatically captured

Quick Example: Anthropic

import genai_otel
genai_otel.instrument()

import anthropic

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain quantum computing in one sentence."}
    ],
)

print(message.content[0].text)
# Cost tracking and token usage automatically captured

Quick Example: Ollama (Local)

import genai_otel
genai_otel.instrument()

import ollama

response = ollama.chat(
    model="llama2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

print(response["message"]["content"])
# Local model traces captured with token counting

Special Providers

HuggingFace Transformers

Local model execution with estimated costs based on parameter count.

pip install genai-otel-instrument[huggingface]

Instruments:

  • pipeline()
  • AutoModelForCausalLM.generate()
  • AutoModelForSeq2SeqLM.generate()
  • InferenceClient API calls
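A sketch of one of the instrumented entry points above. The model name distilgpt2 is illustrative (any Transformers text-generation checkpoint works), and the first run downloads model weights:

```python
import genai_otel
genai_otel.instrument()  # instrument before creating the pipeline

from transformers import pipeline

# pipeline() is one of the instrumented entry points listed above;
# spans carry a parameter-count-based cost estimate for local models.
generator = pipeline("text-generation", model="distilgpt2")
result = generator("OpenTelemetry is", max_new_tokens=20)
print(result[0]["generated_text"])
```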

See the HuggingFace examples in the examples/ directory.

Hyperbolic

Requires OTLP gRPC exporter due to requests library conflicts.

export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export GENAI_ENABLED_INSTRUMENTORS="openai,anthropic,hyperbolic"

See Hyperbolic example.

Google GenAI (new SDK)

pip install genai-otel-instrument[google]

See Google GenAI example.
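A minimal sketch with the new google-genai SDK, assuming an API key is set in the environment; the model name is illustrative:

```python
import genai_otel
genai_otel.instrument()

from google import genai

# Client() picks up GEMINI_API_KEY / GOOGLE_API_KEY from the environment
client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Explain OpenTelemetry in one sentence.",
)
print(response.text)
```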

LiteLLM (Multi-Provider Proxy)

pip install genai-otel-instrument[openinference]

LiteLLM enables cost tracking across 100+ providers via a single proxy. See LiteLLM example.
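A sketch of routing through LiteLLM: one completion() signature for every backend, with the provider selected by the model string (the model names below are illustrative):

```python
import genai_otel
genai_otel.instrument()

import litellm

# Swap the model string to route to a different provider, e.g.
# "anthropic/claude-3-5-sonnet-20241022" or "groq/llama-3.1-8b-instant".
response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```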

Smolagents (HuggingFace Agents)

pip install genai-otel-instrument[openinference]

See Smolagents example.

Captured Attributes

For every LLM call:

| Attribute | Description |
| --- | --- |
| gen_ai.system | Provider name (e.g., "openai") |
| gen_ai.request.model | Requested model |
| gen_ai.response.model | Actual model used |
| gen_ai.request.type | Call type (chat, embedding) |
| gen_ai.usage.prompt_tokens | Input token count |
| gen_ai.usage.completion_tokens | Output token count |
| gen_ai.usage.total_tokens | Total tokens |
| gen_ai.cost.amount | Estimated cost in USD |
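For intuition, gen_ai.cost.amount is the usage counts priced per token. A standard-library sketch of that arithmetic, with hypothetical per-million-token prices (not TraceVerde's actual price table):

```python
# Illustrative per-1M-token prices in USD (hypothetical numbers).
PRICES_PER_MTOK = {
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Price input and output tokens separately, as cost attributes typically are."""
    p = PRICES_PER_MTOK[model]
    return (prompt_tokens * p["input"] + completion_tokens * p["output"]) / 1_000_000

print(estimate_cost("gpt-4o-mini", 1000, 500))  # ≈ 0.00045 USD
```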

All Examples

Browse all provider examples in the examples/ directory.