Groq
Trace Groq-powered LLM workflows.
Overview
Groq exposes an OpenAI-compatible API, so the TraceLLM OpenAI integration works with Groq out of the box. Use wrap_openai() or TraceOpenAI with a client whose base URL is https://api.groq.com/openai/v1.
The same capabilities apply as with OpenAI: prompt and response capture, latency measurement, token tracking, streaming support, and retry recording.
Installation
```bash
pip install "tracellm[openai]"
```
The OpenAI client library talks to Groq through its OpenAI-compatible endpoint, so no separate Groq package is required.
Setup
Create an OpenAI client pointed at Groq's base URL, then wrap it with TraceLLM:
groq_setup.py
```python
from openai import OpenAI
from tracellm.integrations.openai import wrap_openai

# Point the standard OpenAI client at Groq's OpenAI-compatible endpoint
client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="gsk_...",  # Your Groq API key
)

# Wrap the client so calls made through it are traced
client = wrap_openai(client)
```
Or use the TraceOpenAI class with a custom base URL:
groq_traceopenai.py
```python
from tracellm.integrations.openai import TraceOpenAI

client = TraceOpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="gsk_...",
)
# Auto-wrapped: no need to call wrap_openai()
```
Example
groq_example.py
```python
from openai import OpenAI
from tracellm import trace
from tracellm.integrations.openai import wrap_openai

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="gsk_your_groq_api_key",
)
client = wrap_openai(client)


@trace(
    prompt="groq_inference",
    model_name="llama-3.3-70b-versatile",
    project="multi-provider",
    environment="production",
)
def run_groq(prompt: str) -> str:
    # The call goes through the wrapped client, so it is recorded in the trace
    response = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content


result = run_groq("Explain Groq LPU inference in one paragraph.")
print(result)
```
Streaming
Groq streaming works identically to OpenAI streaming through the wrapped client:
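groq_stream.py
```python
from openai import OpenAI
from tracellm import trace
from tracellm.integrations.openai import wrap_openai

client = wrap_openai(OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="gsk_your_groq_api_key",
))

@trace(prompt="groq_stream", project="multi-provider")
def groq_stream(prompt: str) -> str:
    stream = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    full = ""
    for chunk in stream:
        # Skip chunks with no choices or an empty delta
        if chunk.choices and chunk.choices[0].delta.content:
            full += chunk.choices[0].delta.content
    return full
```
What the Trace Captures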
| Data | Source |
|---|---|
| Prompt | Messages array sent to Groq API |
| Response | Full model output text |
| Model name | From request kwargs (e.g. llama-3.3-70b-versatile) |
| Latency | time.perf_counter() before and after the API call |
| Token count | Estimated via heuristic when Groq returns no usage data; counted from chunks in streaming |
| Steps | Single openai_chat step per completion |
| Retries | Each attempt made under max_retries is recorded as its own step |
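The latency and retry rows above can be illustrated with a small standalone sketch. It is not TraceLLM's implementation: `Step` and `call_with_steps` are hypothetical names chosen for illustration, and only the use of `time.perf_counter()` comes from the table. The sketch times each attempt separately and records one step per attempt.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


@dataclass
class Step:
    """One recorded attempt (hypothetical structure, not TraceLLM's schema)."""
    attempt: int
    latency_s: float
    error: Optional[str]


def call_with_steps(fn: Callable[[], T], max_retries: int = 2) -> Tuple[T, List[Step]]:
    """Call fn up to max_retries + 1 times, recording one Step per attempt."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    steps: List[Step] = []
    attempt = 0
    while True:
        attempt += 1
        start = time.perf_counter()  # timestamp taken just before the call
        try:
            result = fn()
        except Exception as exc:
            steps.append(Step(attempt, time.perf_counter() - start, repr(exc)))
            if attempt > max_retries:
                raise  # out of attempts: surface the last error
        else:
            steps.append(Step(attempt, time.perf_counter() - start, None))
            return result, steps


# Simulated call that fails once, then succeeds
attempts = {"n": 0}


def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise TimeoutError("simulated transient failure")
    return "ok"


result, steps = call_with_steps(flaky, max_retries=2)
print(result, [(s.attempt, s.error is None) for s in steps])
# prints: ok [(1, False), (2, True)]
```

In a real trace the failed first attempt and the successful second attempt would each appear as a separate step, which is what the Retries row describes.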
Verification
- Run the example and confirm the trace summary appears in the console
- Open the TraceLLM dashboard and locate the trace by project name
- Inspect the step detail to verify the Groq model name and response
- Compare latency against direct Groq API calls to confirm minimal overhead
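
The latency comparison in the last step can be sketched with a small timing helper. `median_latency` is a hypothetical helper, not part of TraceLLM, and the two stand-in functions below simulate work with `time.sleep()`; in practice they would be replaced by one call through the wrapped client and one call through an unwrapped OpenAI client pointed at Groq.

```python
import statistics
import time
from typing import Callable


def median_latency(fn: Callable[[], object], runs: int = 5) -> float:
    """Return the median wall-clock latency of fn, in seconds, over several runs."""
    if runs < 1:
        raise ValueError("runs must be >= 1")
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-ins for real API calls (assumptions for illustration only)
def direct_call() -> None:
    time.sleep(0.010)


def traced_call() -> None:
    time.sleep(0.012)


direct = median_latency(direct_call)
traced = median_latency(traced_call)
overhead_ms = (traced - direct) * 1000
print(f"direct={direct * 1000:.1f} ms  traced={traced * 1000:.1f} ms  overhead={overhead_ms:.1f} ms")
```

Network variance usually dwarfs any wrapper overhead, so taking the median over several runs gives a more stable comparison than a single pair of calls.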
Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| 401 Unauthorized | Invalid or missing Groq API key | Set api_key= to a valid Groq key from console.groq.com |
| 404 Not Found | Incorrect base_url or model name | Use https://api.groq.com/openai/v1 and a valid model name |
| Model not found | Groq does not host the requested model | Check available models at console.groq.com/docs/models |
| No traces recorded | wrap_openai() not called on the Groq client | Ensure client = wrap_openai(client) is executed |
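
Several of the issues above come down to configuration, so a preflight check before creating the client can catch them early. The sketch below rests on assumptions: `check_groq_config` and the `GROQ_API_KEY` environment variable name are hypothetical choices, and the `gsk_` prefix check is inferred from the example keys on this page rather than from a documented guarantee.

```python
import os
from typing import List, Mapping, Optional

GROQ_BASE_URL = "https://api.groq.com/openai/v1"  # base URL used throughout this page


def check_groq_config(base_url: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of configuration problems (empty if none were found)."""
    env = os.environ if env is None else env
    problems: List[str] = []

    api_key = env.get("GROQ_API_KEY", "")  # hypothetical variable name
    if not api_key:
        problems.append("GROQ_API_KEY is not set (would cause 401 Unauthorized)")
    elif not api_key.startswith("gsk_"):
        problems.append("GROQ_API_KEY does not start with 'gsk_' (possibly not a Groq key)")

    # Tolerate a trailing slash, but flag anything else that differs
    if base_url.rstrip("/") != GROQ_BASE_URL:
        problems.append(
            f"base_url is {base_url!r}, expected {GROQ_BASE_URL!r} (may cause 404 Not Found)"
        )
    return problems


# Example: a missing key and a base URL without the /openai/v1 path
for problem in check_groq_config("https://api.groq.com", env={}):
    print(problem)
```

A check like this does not confirm that the key is valid or that the model exists; those can only be verified by a real request to Groq.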