Groq

Trace Groq-powered LLM workflows.

Overview

Groq exposes an OpenAI-compatible API, so the TraceLLM OpenAI integration works with Groq out of the box. Use wrap_openai() or TraceOpenAI with a client whose base_url is set to https://api.groq.com/openai/v1.

All the same capabilities apply: prompt and response capture, latency measurement, token tracking, streaming support, and retry recording.

Installation

Terminal (bash)

pip install "tracellm[openai]"

The OpenAI client library handles communication with Groq because Groq's API follows the OpenAI request and response format. No separate Groq package is required.

Setup

Create an OpenAI client pointed at Groq's base URL, then wrap it with TraceLLM:

groq_setup.py (python)

from openai import OpenAI
from tracellm.integrations.openai import wrap_openai

# Point the standard OpenAI client at Groq's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="gsk_...",  # Your Groq API key
)
# Wrap the client so TraceLLM records each completion call.
client = wrap_openai(client)

Or use the TraceOpenAI class with a custom base URL:

groq_traceopenai.py (python)

from tracellm.integrations.openai import TraceOpenAI

client = TraceOpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="gsk_...",
)
# Auto-wrapped — no need to call wrap_openai()
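Hard-coding the key, as in the snippets above, is fine for a quick test. A common alternative is to read it from the environment. The sketch below assumes an environment variable named GROQ_API_KEY (a conventional name, not something TraceLLM is known to require) and a hypothetical helper, groq_client_kwargs(), that builds the constructor arguments.

```python
import os

GROQ_BASE_URL = "https://api.groq.com/openai/v1"


def groq_client_kwargs(env_var: str = "GROQ_API_KEY") -> dict:
    """Build keyword arguments for OpenAI(...) or TraceOpenAI(...).

    Hypothetical helper: the environment variable name is an assumption.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Groq API key"
        )
    return {"base_url": GROQ_BASE_URL, "api_key": api_key}


# Usage, with the imports from the snippets above:
#   client = wrap_openai(OpenAI(**groq_client_kwargs()))
#   client = TraceOpenAI(**groq_client_kwargs())
```

Failing early with a clear message avoids sending a request that would come back as 401 Unauthorized.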

Example

groq_example.py (python)

from openai import OpenAI
from tracellm import trace
from tracellm.integrations.openai import wrap_openai

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="gsk_your_groq_api_key",
)
client = wrap_openai(client)

@trace(
    prompt="groq_inference",
    model_name="llama-3.3-70b-versatile",
    project="multi-provider",
    environment="production",
)
def run_groq(prompt: str) -> str:
    response = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    )
    # message.content can be None, so fall back to an empty string.
    return response.choices[0].message.content or ""

result = run_groq("Explain Groq LPU inference in one paragraph.")
print(result)

Streaming

Groq streaming works identically to OpenAI streaming through the wrapped client:

groq_stream.py (python)

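from openai import OpenAI
from tracellm import trace
from tracellm.integrations.openai import wrap_openai

# Same wrapped client as in Setup.
client = wrap_openai(OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="gsk_...",
))

@trace(prompt="groq_stream", project="multi-provider")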
def groq_stream(prompt: str) -> str:
    stream = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    full = ""
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            full += chunk.choices[0].delta.content
    return full
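The guard in the loop matters because an OpenAI-style stream can include chunks with an empty choices list or a delta whose content is None. The offline sketch below isolates that accumulation logic. It uses SimpleNamespace stand-ins rather than real Groq chunks, and make_chunk and collect_text are hypothetical names used only for illustration.

```python
from types import SimpleNamespace


def make_chunk(content=None, empty=False):
    """Build a stand-in for a streamed chunk (hypothetical, for illustration)."""
    if empty:
        return SimpleNamespace(choices=[])
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def collect_text(stream) -> str:
    """Join the text deltas of a stream, skipping chunks that carry no text."""
    parts = []
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


fake_stream = [
    make_chunk(None),        # chunk whose delta has no content
    make_chunk("Hello, "),
    make_chunk("Groq!"),
    make_chunk(empty=True),  # chunk with no choices at all
]
print(collect_text(fake_stream))  # prints "Hello, Groq!"
```

Collecting the pieces in a list and joining once avoids repeated string concatenation on long responses; the result is the same as the `full +=` loop above.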

What the Trace Captures

Data	Source
Prompt	Messages array sent to Groq API
Response	Full model output text
Model name	From request kwargs (e.g. llama-3.3-70b-versatile)
Latency	time.perf_counter() before and after the API call
Token count	Read from the usage field when Groq returns it, otherwise estimated via heuristic; counted from chunks in streaming
Steps	Single openai_chat step per completion
Retries	When max_retries is set, each attempt is recorded as its own step
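To make the Latency and Token count rows concrete, the sketch below times a call with time.perf_counter() and attaches a rough character-based token estimate. It illustrates the general technique only: timed_call, estimate_tokens, and the four-characters-per-token ratio are assumptions, not TraceLLM internals.

```python
import time
from typing import Any, Callable, Dict


def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate; the 4-characters-per-token ratio is an assumption."""
    if not text:
        return 0
    return max(1, len(text) // chars_per_token)


def timed_call(fn: Callable[..., str], *args: Any, **kwargs: Any) -> Dict[str, Any]:
    """Run fn and record its output, wall-clock latency, and estimated tokens."""
    start = time.perf_counter()
    output = fn(*args, **kwargs)
    latency_s = time.perf_counter() - start
    return {
        "output": output,
        "latency_s": latency_s,
        "estimated_tokens": estimate_tokens(output),
    }


def fake_completion(prompt: str) -> str:
    """Stand-in for a Groq call so the sketch runs offline."""
    time.sleep(0.01)
    return f"Echo: {prompt}"


record = timed_call(fake_completion, "hello")
print(record["estimated_tokens"], f"{record['latency_s']:.3f}s")
```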

Verification

1. Run the example and confirm the trace summary appears in the console.
2. Open the TraceLLM dashboard and locate the trace by project name.
3. Inspect the step detail to verify the Groq model name and response.
4. Compare latency against direct Groq API calls to confirm minimal overhead.
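One way to run the latency comparison is to time the same request through the traced and untraced paths several times and compare medians. The harness below is a sketch under stated assumptions: median_latency is a hypothetical helper, and direct_call and traced_call are offline stand-ins that would be replaced with real requests.

```python
import statistics
import time
from typing import Callable


def median_latency(fn: Callable[[], object], runs: int = 5) -> float:
    """Return the median wall-clock time of fn over several runs, in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-ins so the sketch runs offline. In practice, replace the bodies with
# a call on an unwrapped OpenAI client and a call to the traced function.
def direct_call() -> None:
    time.sleep(0.010)


def traced_call() -> None:
    time.sleep(0.011)


direct = median_latency(direct_call)
traced = median_latency(traced_call)
print(
    f"direct={direct * 1000:.1f} ms  traced={traced * 1000:.1f} ms  "
    f"overhead={(traced - direct) * 1000:.1f} ms"
)
```

Network variance usually dominates a single request, so the median over several runs gives a steadier comparison than one measurement.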

Troubleshooting

Issue	Cause	Fix
401 Unauthorized	Invalid or missing Groq API key	Set api_key= to a valid Groq key from console.groq.com
404 Not Found	Incorrect base_url or model name	Use https://api.groq.com/openai/v1 and a valid model name
Model not found	Groq does not host the requested model	Check available models at console.groq.com/docs/models
No traces recorded	wrap_openai() not called on the Groq client	Ensure client = wrap_openai(client) is executed
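For the "Model not found" row, the list of hosted models can also be checked from code. The sketch below assumes Groq serves the OpenAI-style GET /models endpoint under the same base URL and returns the usual {"data": [{"id": ...}]} shape; verify both against Groq's documentation. build_models_request, extract_model_ids, and list_model_ids are hypothetical helpers that use only the standard library.

```python
import json
import urllib.request

GROQ_BASE_URL = "https://api.groq.com/openai/v1"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the models list."""
    return urllib.request.Request(
        f"{GROQ_BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def extract_model_ids(payload: dict) -> list:
    """Pull the sorted model ids out of an OpenAI-style models response."""
    return sorted(item["id"] for item in payload.get("data", []))


def list_model_ids(api_key: str) -> list:
    """Network call: requires a valid Groq API key and internet access."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=10) as resp:
        return extract_model_ids(json.load(resp))


# Usage (needs a real key):
#   "llama-3.3-70b-versatile" in list_model_ids("gsk_...")
```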