App · Enterprise Platforms · active

LangSmith

Type: app · Vendor: LangChain Inc. · Language: TypeScript · License: proprietary · Status: active · Status in practice: mature · First released: 2023

LangSmith is a hosted platform for tracing, evaluating, and monitoring LLM applications across development and production.

Description. LangSmith captures traces of LLM and agent runs and evaluates them with code rules, human review, and LLM-as-judge evaluators. Offline experiments gate changes during development, while online evaluators run automatically on production traces for monitoring and alerting. Failing production traces can be added back to datasets to drive targeted offline experiments.
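To make the automated evaluator kinds concrete, here is a minimal sketch in which a code rule and an LLM-as-judge evaluator share one signature. It is illustrative only: Feedback, exact_match, llm_judge, and the stubbed fake_judge_model are hypothetical names, not the LangSmith SDK's API, and the judge stands in for a real model call. Human review is omitted because it happens asynchronously.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feedback:
    key: str      # name of the metric
    score: float  # 1.0 = pass, 0.0 = fail in this sketch


def exact_match(output: str, reference: Optional[str]) -> Feedback:
    """Code rule: deterministic comparison against a reference answer."""
    matched = reference is not None and output.strip() == reference.strip()
    return Feedback("exact_match", 1.0 if matched else 0.0)


def fake_judge_model(prompt: str) -> str:
    """Stand-in for an LLM call (assumption); a real judge would query a model."""
    return "FAIL" if "sorry" in prompt.lower() else "PASS"


def llm_judge(output: str, reference: Optional[str]) -> Feedback:
    """LLM-as-judge: ask a model to grade the output, then parse its verdict."""
    verdict = fake_judge_model(f"Grade this answer as PASS or FAIL:\n{output}")
    return Feedback("judge_verdict", 1.0 if verdict == "PASS" else 0.0)
```

Because both evaluators return the same Feedback shape, their scores can be stored and aggregated side by side regardless of how each one was computed.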

Agent loop shape. LangSmith does not run the application's agent loop; it observes and evaluates it. Application runs send traces to LangSmith, where evaluators score them. In the offline track, evaluators run over curated datasets as experiments. In the online track, evaluators run automatically on incoming production runs or threads to provide monitoring and alerting, and flagged runs feed back into datasets.
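The two tracks described above can be sketched as a small loop. This is a simplified model under stated assumptions, not LangSmith's implementation: run_experiment, monitor, gate, and non_empty are hypothetical helpers, the offline check is a plain exact match, and the online check is a trivial reference-free rule.

```python
import random
from typing import Callable, Dict, List

Example = Dict[str, str]  # {"input": ..., "reference": ...}
Run = Dict[str, str]      # {"input": ..., "output": ...}


def run_experiment(app: Callable[[str], str], dataset: List[Example]) -> float:
    """Offline track: fraction of dataset examples the app answers correctly."""
    if not dataset:
        raise ValueError("dataset is empty")
    passed = sum(app(ex["input"]) == ex["reference"] for ex in dataset)
    return passed / len(dataset)


def gate(score: float, threshold: float = 0.9) -> bool:
    """Deploy gate: allow the release only if the experiment score is high enough."""
    return score >= threshold


def non_empty(output: str) -> bool:
    """Reference-free check usable on production runs."""
    return bool(output.strip())


def monitor(runs: List[Run], dataset: List[Example],
            sample_rate: float = 1.0, seed: int = 0) -> List[Run]:
    """Online track: check sampled runs and feed failures back into the dataset."""
    rng = random.Random(seed)
    flagged = []
    for run in runs:
        if rng.random() >= sample_rate:
            continue  # this run was not sampled
        if not non_empty(run["output"]):
            flagged.append(run)
            # The reference is left blank for a reviewer to fill in later.
            dataset.append({"input": run["input"], "reference": ""})
    return flagged
```

The point of the design is the feedback edge: examples appended by monitor become inputs to the next run_experiment, so production failures turn into regression cases.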

Primary use cases

  • tracing LLM and agent runs
  • offline evaluation experiments before deploy
  • online evaluation and monitoring of production traffic

Key concepts

  • Trace and run [decision-log]: A run is a single span of work (an LLM call, a retrieval, a formatting step) and a trace is the collection of runs for one end-to-end operation; together they form the observability record LangSmith stores per request.
  • Experiment [dual-evaluation-offline-online]: An offline evaluation run that applies a set of evaluators over a curated dataset to score an application version before deploy, the gate side of LangSmith's two evaluation tracks.
  • Online evaluator [dual-evaluation-offline-online]: An evaluator configured to run automatically on incoming production runs or threads (optionally sampled) to provide live monitoring, anomaly detection, and alerting on production traffic.
  • Annotation queue: A human-review surface where runs are queued for people to label or score, complementing the automated code-rule and LLM-as-judge evaluators.
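The run and trace relationship can be pictured as a tree of spans that share one trace identifier. The sketch below is a hypothetical data model, not LangSmith's actual schema: the Run class, its fields, and group_by_trace are illustrative names, and the convention that a root run's id doubles as the trace id is an assumption of this sketch.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Run:
    name: str
    run_type: str                    # e.g. "chain", "retriever", "llm"
    parent: Optional["Run"] = None
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    trace_id: str = ""

    def __post_init__(self) -> None:
        # A root run starts a new trace; a child inherits its parent's trace.
        self.trace_id = self.parent.trace_id if self.parent else self.id


def group_by_trace(runs: List[Run]) -> Dict[str, List[Run]]:
    """Collect runs into traces, one list per end-to-end operation."""
    traces: Dict[str, List[Run]] = {}
    for run in runs:
        traces.setdefault(run.trace_id, []).append(run)
    return traces


# One request: a root run with a retrieval step and an LLM call beneath it.
root = Run("answer_question", "chain")
retrieval = Run("retrieve_docs", "retriever", parent=root)
llm_call = Run("call_model", "llm", parent=root)
```

Grouping by trace_id recovers the per-request record, while the parent links preserve the order and nesting of the individual steps.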

Patterns this app implements —
