For every issue it surfaces, LangSmith Engine proposes three resolution actions:
1. Opens a PR: drafts a targeted code or prompt change and opens it against your repo, where you can review and merge it.
2. Creates a custom online evaluator: proposes an evaluator scoped to the exact problem, so the issue is resurfaced if it happens again.
3. Adds to your offline eval suite: pulls failing production traces into a dataset of ground-truth examples.
Every issue you resolve improves eval coverage along the way.
LangSmith Engine Automates Agent Issue Resolution with PRs and Evals
LangChain's LangSmith Engine now automatically proposes three resolution actions for every agent issue it identifies: opening a pull request (PR), creating a custom online evaluator, and adding failing traces to an offline evaluation suite. This aims to accelerate the agent development lifecycle by automating issue diagnosis and fix validation.
- Resolution Actions: opens a PR, creates a custom online evaluator, adds to the offline eval suite
- Availability: public beta
- Integration: existing LangSmith tracing projects, optional repository connection
- Issue Detection Signals: explicit errors, online evaluator failures, trace anomalies, negative user feedback, unusual behaviors
This update addresses the manual and time-consuming cycle of reviewing agent traces, identifying failure patterns, and creating fixes. By continuously monitoring production traces and clustering failures into named issues, LangSmith Engine diagnoses root causes against connected codebases and proposes solutions, aiming to prevent regressions and strengthen evaluation coverage over time.
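To make the monitor-and-cluster step concrete, here is a minimal sketch of grouping flagged traces into named issues, each carrying the three proposed actions. It is not LangSmith's API or its actual clustering method: `Trace`, `Issue`, `Signal`, `cluster_issues`, and the `signature` field are hypothetical names, and grouping by an exact `(signal, signature)` pair is a simplifying assumption.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Signal(Enum):
    # Detection signals named in the announcement.
    EXPLICIT_ERROR = "explicit_error"
    EVALUATOR_FAILURE = "online_evaluator_failure"
    TRACE_ANOMALY = "trace_anomaly"
    NEGATIVE_FEEDBACK = "negative_user_feedback"
    UNUSUAL_BEHAVIOR = "unusual_behavior"


# The three resolution actions named in the announcement.
ACTIONS = ("open_pr", "create_online_evaluator", "add_to_offline_eval_suite")


@dataclass
class Trace:
    trace_id: str
    signal: Optional[Signal]  # None means nothing was flagged
    signature: str = ""       # hypothetical: short label for the failure pattern


@dataclass
class Issue:
    name: str
    trace_ids: List[str] = field(default_factory=list)
    proposed_actions: Tuple[str, ...] = ACTIONS


def cluster_issues(traces: List[Trace]) -> List[Issue]:
    """Group flagged traces by (signal, signature) into named issues."""
    groups = defaultdict(list)
    for t in traces:
        if t.signal is not None:
            groups[(t.signal, t.signature)].append(t.trace_id)
    return [
        Issue(name=f"{sig.value}: {signature}", trace_ids=ids)
        for (sig, signature), ids in groups.items()
    ]
```

Calling `cluster_issues` on a batch of traces returns one `Issue` per distinct failure pattern, in the order the patterns were first seen; unflagged traces are ignored.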
LangSmith Engine is built on existing LangSmith tracing and evaluation infrastructure. Connect a tracing project and optionally a code repository, and the Engine will automatically begin surfacing issues from production traces. Every resolved issue also generates an evaluator to monitor performance, making future improvements more robust.
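The idea of an evaluator scoped to one resolved issue can be sketched as follows. This is an illustration under stated assumptions, not the evaluator the Engine generates: `make_issue_evaluator`, `resurface`, the trace dictionaries, and the 0/1 scoring are all hypothetical.

```python
from typing import Callable, Dict, List


def make_issue_evaluator(issue_name: str,
                         is_failure: Callable[[Dict], bool]) -> Callable[[Dict], Dict]:
    """Build an evaluator that scores a trace 0 if the issue recurs, else 1."""
    def evaluator(trace: Dict) -> Dict:
        failed = is_failure(trace)
        return {"key": issue_name, "score": 0 if failed else 1}
    return evaluator


def resurface(traces: List[Dict], evaluator: Callable[[Dict], Dict]) -> List[str]:
    """Return the ids of traces on which the scoped evaluator fails."""
    return [t["id"] for t in traces if evaluator(t)["score"] == 0]


# Hypothetical check: the agent replied with an empty answer.
empty_answer = make_issue_evaluator(
    "empty_answer", lambda t: t.get("output", "").strip() == ""
)
```

Running `resurface(new_traces, empty_answer)` over fresh production traces would list the ones where the same problem has reappeared, which is the monitoring role the paragraph above describes.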



