Framework · Enterprise Platforms

LangSmith

LangSmith is a hosted platform for tracing, evaluating, and monitoring LLM applications across development and production.

Description

LangSmith captures traces of LLM and agent runs and evaluates them with code rules, human review, and LLM-as-judge evaluators. Offline experiments gate changes during development, before deploy, while online evaluators run automatically on production traces for monitoring and alerting. Failing production traces can be added back to datasets to drive targeted offline experiments.
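As a rough illustration of the offline gate described above, the sketch below runs a toy application over a small dataset, scores each output with a code-rule evaluator, and compares the mean score to a threshold. It is plain Python and does not use the LangSmith SDK; `Example`, `exact_match`, `run_experiment`, `gate`, `toy_app`, and the 0.8 threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    """One dataset row: an input and a reference answer (hypothetical shape)."""
    question: str
    expected: str


# An evaluator maps (application output, dataset example) to a score in [0, 1].
Evaluator = Callable[[str, Example], float]


def exact_match(output: str, example: Example) -> float:
    """Code-rule evaluator: 1.0 when the normalized strings match, else 0.0."""
    same = output.strip().lower() == example.expected.strip().lower()
    return 1.0 if same else 0.0


def run_experiment(app: Callable[[str], str],
                   dataset: List[Example],
                   evaluator: Evaluator) -> float:
    """Run the app on every example and return the mean evaluator score."""
    scores = [evaluator(app(ex.question), ex) for ex in dataset]
    return sum(scores) / len(scores) if scores else 0.0


def gate(mean_score: float, threshold: float = 0.8) -> bool:
    """Return True when the experiment is good enough to ship (assumed threshold)."""
    return mean_score >= threshold


def toy_app(question: str) -> str:
    """Stand-in for the real LLM application under test."""
    answers = {"capital of france?": "Paris", "2 + 2?": "4"}
    return answers.get(question.lower(), "unknown")


dataset = [
    Example("Capital of France?", "paris"),
    Example("2 + 2?", "4"),
    Example("Largest planet?", "Jupiter"),
]

score = run_experiment(toy_app, dataset, exact_match)
print(f"mean score: {score:.2f}, passes gate: {gate(score)}")
```

In a real setup the evaluator could equally be a human-review queue or an LLM-as-judge call; the gate logic stays the same because it only depends on the resulting scores.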

Solution

LangSmith does not run the application's agent loop; it observes and evaluates it. The application sends a trace of each run to LangSmith, where evaluators score it. In the offline track, evaluators run over curated datasets as experiments. In the online track, evaluators run automatically on incoming production runs or threads to provide monitoring and alerting, and flagged runs feed back into datasets.
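The online track and its feedback into datasets can be sketched in the same spirit. The code below is a minimal simulation, again in plain Python rather than the LangSmith SDK: each incoming run is scored by a set of evaluators, and any run that fails one is flagged and copied into a dataset for later offline experiments. `Run`, `non_empty_answer`, `no_refusal`, `evaluate_online`, and `monitor` are hypothetical names, and the "flag on any score below 1.0" rule is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Run:
    """A traced production run (hypothetical, simplified shape)."""
    run_id: str
    inputs: str
    output: str
    scores: Dict[str, float] = field(default_factory=dict)


def non_empty_answer(run: Run) -> float:
    """Code-rule evaluator: fail runs that returned an empty output."""
    return 1.0 if run.output.strip() else 0.0


def no_refusal(run: Run) -> float:
    """Code-rule evaluator: fail runs whose output looks like a refusal."""
    return 0.0 if "i cannot" in run.output.lower() else 1.0


ONLINE_EVALUATORS: Dict[str, Callable[[Run], float]] = {
    "non_empty_answer": non_empty_answer,
    "no_refusal": no_refusal,
}


def evaluate_online(run: Run, evaluators: Dict[str, Callable[[Run], float]]) -> bool:
    """Attach one score per evaluator; return True if any score is below 1.0."""
    run.scores = {name: fn(run) for name, fn in evaluators.items()}
    return any(score < 1.0 for score in run.scores.values())


def monitor(runs: List[Run],
            evaluators: Dict[str, Callable[[Run], float]],
            dataset: List[dict]) -> List[str]:
    """Score incoming runs and append flagged ones to the dataset."""
    flagged = []
    for run in runs:
        if evaluate_online(run, evaluators):
            flagged.append(run.run_id)
            # Feedback loop: the failing input becomes a future offline test case.
            dataset.append({"inputs": run.inputs, "bad_output": run.output})
    return flagged


incoming = [
    Run("r1", "Summarize the report", "The report covers Q3 revenue."),
    Run("r2", "Translate to French", ""),
    Run("r3", "Draft an email", "I cannot help with that."),
]

regression_dataset: List[dict] = []
flagged_ids = monitor(incoming, ONLINE_EVALUATORS, regression_dataset)
print("flagged:", flagged_ids)
print("dataset size:", len(regression_dataset))
```

Note that the application itself never appears in this loop: the evaluators only see finished runs, which mirrors the observe-and-evaluate role described above.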

Primary use cases

  • tracing LLM and agent runs
  • offline evaluation experiments before deploy
  • online evaluation and monitoring of production traffic
