# LLM Providers

TraceVerde auto-instruments 19+ LLM providers. No provider-specific code changes are needed: install the provider SDK, call `genai_otel.instrument()` once, and TraceVerde handles the rest.

## Providers with Full Cost Tracking
| Provider | Models | Install Extra | Example |
|---|---|---|---|
| OpenAI | GPT-4o, GPT-4 Turbo, GPT-5.2, o1/o3, embeddings (50+) | `[openai]` | example |
| OpenRouter | All models via OpenAI-compatible API | `[openrouter]` | example |
| Anthropic | Claude Sonnet 4.6, Claude 3.5/3 series (15+) | `[anthropic]` | example |
| Google AI | Gemini 2.5/2.0 Pro/Flash, PaLM 2 (30+) | `[google]` | example |
| AWS Bedrock | Amazon Titan, Claude, Llama, Mistral (25+) | `[aws]` | example |
| Azure OpenAI | Same as OpenAI, with Azure pricing | `[openai]` | example |
| Cohere | Command R/R+, Embed v4/v3, rerankers (15+) | `[cohere]` | example |
| Mistral AI | Large/Medium/Small, Mixtral, embeddings (20+) | `[mistral]` | example |
| Together AI | DeepSeek-R1, Llama 3.x, Qwen (25+) | `[together]` | example |
| Groq | Llama 3.x, Mixtral, Gemma, Whisper (20+) | `[groq]` | example |
| Ollama | All local models, with token tracking | `[ollama]` | example |
| Vertex AI | Gemini models via Google Cloud | `[vertexai]` | example |
| SambaNova | sarvam-m, Saarika, Bulbul (12+) | `[sambanova]` | example |
| Sarvam AI | Indian language models | `[sarvamai]` | example |
| Replicate | Hardware-based pricing ($/second) | `[replicate]` | example |
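To enable a provider, install TraceVerde with the matching extra from the table above. A minimal sketch, assuming the package is distributed as `genai-otel` (the distribution name is not stated in this section; adjust to the actual package name):

```shell
# Install with the OpenAI extra (assumed distribution name)
pip install "genai-otel[openai]"

# Extras can be combined to enable several providers at once
pip install "genai-otel[openai,anthropic,ollama]"
```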
## Quick Example: OpenAI

```python
import genai_otel

genai_otel.instrument()

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is OpenTelemetry?"},
    ],
    max_tokens=150,
)
print(f"Response: {response.choices[0].message.content}")
print(f"Tokens used: {response.usage.total_tokens}")
# Traces, metrics, and costs are captured automatically
```
## Quick Example: Anthropic

```python
import genai_otel

genai_otel.instrument()

import anthropic

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain quantum computing in one sentence."}
    ],
)
print(message.content[0].text)
# Cost tracking and token usage are captured automatically
```
## Quick Example: Ollama (Local)

```python
import genai_otel

genai_otel.instrument()

import ollama

response = ollama.chat(
    model="llama2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
# Local model traces are captured with token counting
```
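Ollama exposes token counts directly in its chat responses, so token tracking for local models is simple arithmetic. A minimal sketch, assuming the standard `prompt_eval_count` and `eval_count` fields that Ollama's chat API returns (not TraceVerde's internal code):

```python
def total_tokens(resp: dict) -> int:
    """Sum input and output tokens from an Ollama-style chat response."""
    prompt = resp.get("prompt_eval_count", 0)   # input (prompt) tokens
    completion = resp.get("eval_count", 0)      # output (generated) tokens
    return prompt + completion

# Mocked response shape for illustration
mock = {
    "message": {"content": "Rayleigh scattering."},
    "prompt_eval_count": 26,
    "eval_count": 298,
}
print(total_tokens(mock))  # → 324
```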
## Special Providers

### HuggingFace Transformers

Local model execution with estimated costs based on parameter count.

Instruments:

- `pipeline()`
- `AutoModelForCausalLM.generate()`
- `AutoModelForSeq2SeqLM.generate()`
- `InferenceClient` API calls
See examples:
- Basic HuggingFace
- AutoModel
- With PII detection
- With toxicity detection
- With bias detection
- Multiple evaluations
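The exact parameter-count-based estimation formula is internal to TraceVerde and not documented here. As a hedged illustration only, a cost that scales linearly with model size might be sketched like this; the `RATE_PER_BILLION_PARAMS` constant is a made-up placeholder, not TraceVerde's actual rate:

```python
RATE_PER_BILLION_PARAMS = 1e-8  # hypothetical USD per token per billion parameters

def estimate_cost(param_count: int, total_tokens: int) -> float:
    """Rough local-model cost estimate: tokens x rate, scaled by model size."""
    billions = param_count / 1e9
    return total_tokens * billions * RATE_PER_BILLION_PARAMS

# A 7B-parameter model generating 1,000 tokens under this toy rate
print(estimate_cost(7_000_000_000, 1000))
```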
### Hyperbolic

Requires the OTLP gRPC exporter due to conflicts with the `requests` library.

```bash
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export GENAI_ENABLED_INSTRUMENTORS="openai,anthropic,hyperbolic"
```

See the Hyperbolic example.
### Google GenAI (new SDK)

See the Google GenAI example.

### LiteLLM (Multi-Provider Proxy)

LiteLLM enables cost tracking across 100+ providers via a single proxy. See the LiteLLM example.

### Smolagents (HuggingFace Agents)

See the Smolagents example.
## Captured Attributes

For every LLM call:

| Attribute | Description |
|---|---|
| `gen_ai.system` | Provider name (e.g., `"openai"`) |
| `gen_ai.request.model` | Requested model |
| `gen_ai.response.model` | Actual model used |
| `gen_ai.request.type` | Call type (chat, embedding) |
| `gen_ai.usage.prompt_tokens` | Input token count |
| `gen_ai.usage.completion_tokens` | Output token count |
| `gen_ai.usage.total_tokens` | Total tokens |
| `gen_ai.cost.amount` | Estimated cost in USD |
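The estimated cost is derived from the token-count attributes and a per-model price table. The arithmetic can be sketched as follows; the prices shown are illustrative placeholders, not TraceVerde's bundled pricing data:

```python
# Illustrative per-1M-token prices in USD (placeholder values)
PRICES = {
    "gpt-4o-mini": {"prompt": 0.15, "completion": 0.60},
}

def cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Compute a gen_ai.cost.amount-style estimate from token counts."""
    p = PRICES[model]
    return (prompt_tokens * p["prompt"]
            + completion_tokens * p["completion"]) / 1_000_000

# 1,200 prompt tokens + 300 completion tokens under the placeholder prices
print(f"{cost_usd('gpt-4o-mini', 1200, 300):.6f}")  # → 0.000360
```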
## All Examples

Browse all provider examples in the `examples/` directory.