Monitor, trace, and evaluate your AI pipelines

Observability tools show what your LLM applications are doing: token costs, latency, prompt traces, evaluation scores, and regressions. That visibility is essential for running AI in production.

3 observability tools tracked

Top LLM Observability

Ranked by GitHub stars

Langfuse

Active

Open-source LLM engineering platform: tracing, evals, prompt management.

Observability and evaluation platform for LLM and agent applications.

Open-source LLM observability: routing, cost tracking, agent tracing.

observability

★ 5.8k

Frequently asked questions about LLM Observability

What is LLM observability?

LLM observability is the practice of monitoring AI applications in production: tracing prompt/response chains, tracking costs and latency, running automated evaluations, and detecting regressions.
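As a rough illustration of tracing prompt/response pairs together with latency and cost, the sketch below wraps a model call and records one span per call. It is not the API of any listed tool: `Tracer`, `Span`, `fake_model`, and the per-token prices are hypothetical names and values chosen for the example, and the whitespace token count is a crude stand-in for a real tokenizer.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical per-1K-token prices in USD; real prices depend on provider and model.
PRICE_PER_1K_INPUT = 0.001
PRICE_PER_1K_OUTPUT = 0.002


@dataclass
class Span:
    """One traced model call."""
    name: str
    prompt: str
    response: str
    input_tokens: int
    output_tokens: int
    latency_s: float
    cost_usd: float


@dataclass
class Tracer:
    spans: List[Span] = field(default_factory=list)

    def traced_call(self, name: str, model_fn: Callable[[str], str], prompt: str) -> str:
        """Call model_fn(prompt) and record prompt, response, latency, and cost."""
        start = time.perf_counter()
        response = model_fn(prompt)
        latency = time.perf_counter() - start
        # Assumption: whitespace splitting approximates token counts.
        input_tokens = len(prompt.split())
        output_tokens = len(response.split())
        cost = (input_tokens * PRICE_PER_1K_INPUT
                + output_tokens * PRICE_PER_1K_OUTPUT) / 1000
        self.spans.append(
            Span(name, prompt, response, input_tokens, output_tokens, latency, cost)
        )
        return response

    def total_cost(self) -> float:
        return sum(span.cost_usd for span in self.spans)


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call."""
    return "echo: " + prompt


tracer = Tracer()
tracer.traced_call("summarize", fake_model, "Summarize this ticket please")
tracer.traced_call("classify", fake_model, "Is this urgent")
```

In a real deployment the recorded spans would be exported to a tracing backend rather than kept in memory.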

What are the best open-source LLM observability tools?

Langfuse leads open-source LLM tracing (self-hostable). Phoenix (Arize) is strong for evals. Helicone is the simplest proxy-based solution. OpenLLMetry provides OpenTelemetry-compatible instrumentation.

How do I reduce LLM API costs?

Observability tools help by showing which prompts are expensive. Common fixes: prompt caching, smaller models for simple tasks, reducing output length, and batching requests. Tools like LiteLLM add caching layers.
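One of the fixes above, caching, can be sketched as an exact-match response cache placed in front of the model call. This is only an illustration under simple assumptions: `CachedModel` is a hypothetical wrapper, it caches whole responses for identical prompts in memory, and it is not how LiteLLM or any provider's prompt caching is implemented.

```python
import hashlib
from typing import Callable, Dict


class CachedModel:
    """Wraps a model function and reuses responses for identical prompts."""

    def __init__(self, model_fn: Callable[[str], str]) -> None:
        self._model_fn = model_fn
        self._cache: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def __call__(self, prompt: str) -> str:
        # Hash the prompt so the cache key has a fixed size.
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        # Cache miss: pay for one real call and remember the result.
        self.misses += 1
        response = self._model_fn(prompt)
        self._cache[key] = response
        return response


def fake_model(prompt: str) -> str:
    """Stand-in for a billable LLM call."""
    return prompt.upper()


model = CachedModel(fake_model)
model("what is observability")
model("what is observability")  # served from the cache, no second model call
```

Exact-match caching only helps when the same prompt recurs verbatim; the `hits` and `misses` counters give the cache hit rate needed to judge whether it is paying off.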

What metrics matter most for LLM production monitoring?

The core metrics are token cost per session, p95 latency, hallucination rate (via LLM-as-judge), user feedback scores, cache hit rate, and error/retry rate. Set alerts on cost spikes and latency regressions.
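Several of these metrics can be computed from per-request records, as in the sketch below. The `RequestRecord` fields, the nearest-rank p95 definition, and the alert thresholds are assumptions made for the example; hallucination rate and user feedback scores are left out because they need a judge model or user input.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RequestRecord:
    session_id: str
    latency_ms: float
    cost_usd: float
    cache_hit: bool
    error: bool


def p95(values: List[float]) -> float:
    """Nearest-rank 95th percentile: the value at rank ceil(0.95 * n)."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


def summarize(records: List[RequestRecord]) -> Dict[str, float]:
    """Aggregate cost per session, p95 latency, cache hit rate, and error rate."""
    n = len(records)
    sessions = {r.session_id for r in records}
    return {
        "cost_per_session_usd": sum(r.cost_usd for r in records) / len(sessions),
        "p95_latency_ms": p95([r.latency_ms for r in records]),
        "cache_hit_rate": sum(r.cache_hit for r in records) / n,
        "error_rate": sum(r.error for r in records) / n,
    }


def check_alerts(metrics: Dict[str, float],
                 max_p95_ms: float,
                 max_cost_per_session_usd: float) -> List[str]:
    """Return the names of the thresholds that were exceeded."""
    alerts = []
    if metrics["p95_latency_ms"] > max_p95_ms:
        alerts.append("latency")
    if metrics["cost_per_session_usd"] > max_cost_per_session_usd:
        alerts.append("cost")
    return alerts


# Synthetic data: 20 requests across 4 sessions, latencies 100 ms to 2000 ms.
records = [
    RequestRecord(f"s{i % 4}", (i + 1) * 100.0, 0.01, i % 2 == 0, i == 0)
    for i in range(20)
]
metrics = summarize(records)
alerts = check_alerts(metrics, max_p95_ms=1500.0, max_cost_per_session_usd=0.10)
```

With this synthetic data the p95 latency exceeds the hypothetical 1500 ms threshold, so only the latency alert fires.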
