Langfuse
Langfuse records production traces of LLM and agent applications and layers prompt management, datasets, and evaluations on top of them, so teams can debug and measure those applications.
Description
Langfuse is an open-source, self-hostable LLM engineering platform built by Langfuse GmbH. Applications send traces of their LLM calls, tool calls, and agent steps to Langfuse through SDKs and integrations, and the platform stores them for inspection. On top of those traces it provides prompt versioning, dataset-based experiments, and model-based evaluation, so it sits beside an application as an observability and evaluation backend rather than running the agent loop itself.
Solution
Langfuse does not run an agent loop; it is an out-of-band observability sink. The application's own loop emits spans for each LLM call, tool call, and step, and Langfuse ingests them as nested traces. Evaluators then run asynchronously over a sampled share of incoming traces, attaching scores back to them.
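The flow described above can be sketched in plain Python. This is a minimal illustration of the pattern only, not the Langfuse SDK: `Span`, `Trace`, `is_sampled`, `non_empty_output`, and `evaluate_sampled` are hypothetical names invented for this example, the evaluator is a toy stand-in for a model-based judge, and the "asynchronous" step is approximated by a separate pass that runs after the application has finished emitting spans.

```python
import hashlib
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Span:
    """One unit of work emitted by the application's own loop (hypothetical type)."""
    name: str
    kind: str                      # "llm", "tool", or "step"
    parent_id: Optional[str]       # None for a root span
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    started_at: float = field(default_factory=time.time)
    ended_at: Optional[float] = None
    output: Optional[str] = None


class Trace:
    """Collects spans for one request and nests them via a parent stack."""

    def __init__(self, trace_id: str) -> None:
        self.trace_id = trace_id
        self.spans: List[Span] = []
        self.scores: Dict[str, float] = {}
        self._stack: List[Span] = []

    @contextmanager
    def span(self, name: str, kind: str):
        # The innermost open span becomes the parent of the new one.
        parent_id = self._stack[-1].span_id if self._stack else None
        current = Span(name=name, kind=kind, parent_id=parent_id)
        self.spans.append(current)
        self._stack.append(current)
        try:
            yield current
        finally:
            current.ended_at = time.time()
            self._stack.pop()


def is_sampled(trace_id: str, rate: float) -> bool:
    """Deterministically map a trace id to [0, 1) and compare it with the rate."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


def non_empty_output(trace: Trace) -> float:
    """Toy evaluator: fraction of LLM spans that produced any output."""
    llm_spans = [s for s in trace.spans if s.kind == "llm"]
    if not llm_spans:
        return 0.0
    return sum(1 for s in llm_spans if s.output) / len(llm_spans)


def evaluate_sampled(
    traces: List[Trace],
    rate: float,
    evaluators: Dict[str, Callable[[Trace], float]],
) -> None:
    """Run evaluators over the sampled traces and attach scores back to them."""
    for trace in traces:
        if not is_sampled(trace.trace_id, rate):
            continue
        for score_name, evaluator in evaluators.items():
            trace.scores[score_name] = evaluator(trace)


# The application's loop emits spans; it is never driven by the sink.
trace = Trace("trace-1")
with trace.span("agent-step", "step"):
    with trace.span("chat-completion", "llm") as llm_span:
        llm_span.output = "Paris"
    with trace.span("search", "tool") as tool_span:
        tool_span.output = "3 results"

# Evaluation happens later, out of band, over a sampled share of traces.
evaluate_sampled([trace], rate=1.0, evaluators={"non_empty_output": non_empty_output})
```

Hashing the trace id rather than drawing a random number keeps the sampling decision stable if the same trace is seen again, which is one reasonable choice for a sketch like this; the source does not say how Langfuse itself selects traces.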
Primary use cases
- tracing and observability for LLM and agent applications
- prompt management and versioning
- LLM-as-a-judge evaluation of production traces
- dataset-based experiments and offline evaluation
- cost and latency monitoring