What is LLM observability?
LLM observability is the practice of monitoring AI applications in production: tracing prompt/response chains, tracking costs and latency, running automated evaluations, and detecting regressions.
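To make "tracing prompt/response chains" concrete, here is a minimal sketch of nested spans that record the prompt, response, and latency of each step. The `Tracer` and `Span` names are hypothetical and exist only for illustration; they are not the API of any tool mentioned in this FAQ.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Span:
    """One step in a chain: what was sent, what came back, how long it took."""
    name: str
    prompt: str
    response: Optional[str] = None
    latency_s: float = 0.0
    children: List["Span"] = field(default_factory=list)


class Tracer:
    """Hypothetical in-memory tracer; a real tool would export spans elsewhere."""

    def __init__(self) -> None:
        self.roots: List[Span] = []
        self._stack: List[Span] = []

    @contextmanager
    def span(self, name: str, prompt: str) -> Iterator[Span]:
        s = Span(name=name, prompt=prompt)
        # Attach to the currently open span, or start a new trace.
        parent = self._stack[-1] if self._stack else None
        (parent.children if parent else self.roots).append(s)
        self._stack.append(s)
        start = time.perf_counter()
        try:
            yield s
        finally:
            # Record latency even if the wrapped step raised.
            s.latency_s = time.perf_counter() - start
            self._stack.pop()
```

Usage would look like `with tracer.span("answer", prompt) as s: s.response = ...`, with inner `span` calls nesting under the outer one so a slow or costly step can be located within its chain.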
What are the best open-source LLM observability tools?
Langfuse is a widely used open-source, self-hostable platform for LLM tracing. Phoenix (Arize) is strong for evals. Helicone is the simplest proxy-based option. OpenLLMetry provides OpenTelemetry-compatible instrumentation.
How do I reduce LLM API costs?
Observability tools help by showing which prompts are expensive. Common fixes: prompt caching, smaller models for simple tasks, reducing output length, and batching requests. Tools like LiteLLM add caching layers.
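As one illustration of a caching layer, the sketch below wraps a model call with an exact-match, in-memory response cache and counts hits and misses. It is an application-level cache, which is simpler than (and different from) provider-side prompt caching, and it does not reflect how LiteLLM or any other tool implements caching. `CachedClient` and the `call_model` callable are hypothetical names.

```python
import hashlib
import json
from typing import Callable, Dict


class CachedClient:
    """Wraps a model-calling function with an exact-match in-memory cache."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        # call_model is a hypothetical function: (model, prompt) -> response text.
        self._call_model = call_model
        self._cache: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Hash model and prompt together so different models never share entries.
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        # Cache miss: pay for one real call and remember the result.
        self.misses += 1
        result = self._call_model(model, prompt)
        self._cache[key] = result
        return result

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

An exact-match cache only pays off when identical prompts recur, and it assumes a repeated answer is acceptable; a production version would also need eviction and expiry, which this sketch omits.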
What metrics matter most for LLM production monitoring?
Token cost per session, p95 latency, hallucination rate (via LLM-as-judge), user feedback scores, cache hit rate, and error/retry rate. Set alerts on cost spikes and latency regressions.
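Several of these metrics can be computed directly from per-request logs. The sketch below assumes a hypothetical `RequestRecord` shape and shows p95 latency (nearest-rank method), cache hit rate, error rate, mean cost per session, and a simple ratio-based alert against a baseline. The field names and the 1.5x threshold are illustrative assumptions, not recommendations from any specific tool.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RequestRecord:
    """Hypothetical log entry for one LLM request."""
    session_id: str
    latency_ms: float
    cost_usd: float
    cache_hit: bool
    error: bool


def p95(values: List[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list."""
    if not values:
        raise ValueError("p95 requires at least one value")
    ordered = sorted(values)
    rank = math.ceil(95 * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(records: List[RequestRecord]) -> Dict[str, float]:
    """Aggregate request logs into the headline monitoring metrics."""
    if not records:
        raise ValueError("summarize requires at least one record")
    cost_by_session: Dict[str, float] = defaultdict(float)
    for r in records:
        cost_by_session[r.session_id] += r.cost_usd
    n = len(records)
    return {
        "p95_latency_ms": p95([r.latency_ms for r in records]),
        "cache_hit_rate": sum(r.cache_hit for r in records) / n,
        "error_rate": sum(r.error for r in records) / n,
        "mean_cost_per_session_usd": sum(cost_by_session.values()) / len(cost_by_session),
    }


def alerts(current: Dict[str, float], baseline: Dict[str, float],
           max_ratio: float = 1.5) -> List[str]:
    """Names of watched metrics exceeding max_ratio times their baseline."""
    watched = ("p95_latency_ms", "mean_cost_per_session_usd")
    return [k for k in watched
            if baseline[k] > 0 and current[k] > max_ratio * baseline[k]]
```

Comparing against a rolling baseline rather than a fixed number keeps the alert meaningful as traffic and model choices change. Hallucination rate and user feedback scores need an evaluator or a feedback channel and are left out of this sketch.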