langgraph-reflection
https://github.com/fanqingsong/langgraph-reflection
This reflection agent uses two subagents:
- A "main" agent, which is the agent attempting to solve the user's task
- A "critique" agent, which checks the main agent's work and offers any critiques

The reflection agent has the following architecture:
1. First, the main agent is called
2. Once the main agent is finished, the critique agent is called
3. Based on the result of the critique agent:
   - If the critique agent finds something to critique, the main agent is called again
   - If there is nothing to critique, the overall reflection agent finishes
4. Repeat until the overall reflection agent finishes
We make some assumptions about the graphs:
- The main agent should take as input a list of messages
- The reflection agent should return a user message if there are any critiques; otherwise it should return no messages.
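
Under those assumptions, the loop can be sketched directly with LangGraph's `StateGraph`. This is only a minimal illustration of the control flow described above, not the package's own implementation; the two node functions are stubs standing in for your real main and critique agents.

```python
from langgraph.graph import StateGraph, MessagesState, START, END


def main_agent(state: MessagesState) -> dict:
    # Stub: call your real agent here; it consumes the message list and
    # returns its answer as an assistant message.
    return {"messages": [{"role": "assistant", "content": "draft answer"}]}


def critique_agent(state: MessagesState) -> dict:
    # Stub: return a user message containing critiques, or no messages
    # at all if the draft is acceptable.
    return {"messages": []}


def route_after_critique(state: MessagesState) -> str:
    # If the critique agent appended a user (human) message, there is
    # something to fix, so loop back to the main agent; otherwise finish.
    last = state["messages"][-1]
    return "main_agent" if last.type == "human" else END


builder = StateGraph(MessagesState)
builder.add_node("main_agent", main_agent)
builder.add_node("critique_agent", critique_agent)
builder.add_edge(START, "main_agent")
builder.add_edge("main_agent", "critique_agent")
builder.add_conditional_edges("critique_agent", route_after_critique)
reflection_graph = builder.compile()
```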
https://python.langchain.ac.cn/docs/how_to/tool_calling/#typeddict-class
LangChain tools
LangChain also implements a `@tool` decorator that gives further control over the tool schema, such as the tool name and argument descriptions. See the how-to guide for details.
Pydantic class
You can equivalently define the schemas with Pydantic, without an accompanying function.
Note that all fields are `required` unless a default value is provided.
from pydantic import BaseModel, Field


class add(BaseModel):
    """Add two integers."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")


class multiply(BaseModel):
    """Multiply two integers."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")
TypedDict class
Or using TypedDicts and annotations:
from typing_extensions import Annotated, TypedDict


class add(TypedDict):
    """Add two integers."""

    # Annotations must have the type and can optionally include a default value and description (in that order).
    a: Annotated[int, ..., "First integer"]
    b: Annotated[int, ..., "Second integer"]


class multiply(TypedDict):
    """Multiply two integers."""

    a: Annotated[int, ..., "First integer"]
    b: Annotated[int, ..., "Second integer"]


tools = [add, multiply]
To actually bind these schemas to a chat model, we use the `.bind_tools()` method. This handles converting the `add` and `multiply` schemas into the format the model expects; the tool schemas are then passed in on every call to the model.
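
For example, binding the schemas above and inspecting the resulting tool calls looks roughly like this (the specific chat model is an assumption; any tool-calling model works):

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")  # assumed model; swap in your own
llm_with_tools = llm.bind_tools(tools)

msg = llm_with_tools.invoke("What is 3 * 12? Also, what is 11 + 49?")
# msg.tool_calls is a list of dicts such as
# {"name": "multiply", "args": {"a": 3, "b": 12}, "id": "...", "type": "tool_call"}
print(msg.tool_calls)
```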
https://github.com/langchain-ai/openevals
Much like tests in traditional software, evals are an important part of bringing LLM applications to production. The goal of this package is to help provide a starting point for you to write evals for your LLM applications, from which you can write more custom evals specific to your application.
If you are looking for evals specific to evaluating LLM agents, please check out agentevals.
https://github.com/langchain-ai/agentevals
Agentic applications give an LLM freedom over control flow in order to solve problems. While this freedom can be extremely powerful, the black box nature of LLMs can make it difficult to understand how changes in one part of your agent will affect others downstream. This makes evaluating your agents especially important.
This package contains a collection of evaluators and utilities for evaluating the performance of your agents, with a focus on agent trajectory, or the intermediate steps an agent takes as it runs. It is intended to provide a good conceptual starting point for your agent's evals.
If you are looking for more general evaluation tools, please check out the companion package openevals.
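
To make the notion of a trajectory concrete, the snippet below is not the agentevals API, just an illustration of the kind of data its trajectory evaluators compare: the full message history an agent produced, including its tool calls, checked against a reference run.

```python
# A reference trajectory: OpenAI-style messages, including the agent's tool calls.
reference_trajectory = [
    {"role": "user", "content": "What's the weather in SF?"},
    {"role": "assistant", "tool_calls": [{"function": {"name": "get_weather"}}]},
    {"role": "tool", "content": "It's 75 degrees and sunny."},
    {"role": "assistant", "content": "It's 75 degrees and sunny in SF."},
]


def tool_call_names(trajectory):
    """Collect the tool names the agent called, in order."""
    return [
        call["function"]["name"]
        for msg in trajectory
        if msg.get("role") == "assistant"
        for call in msg.get("tool_calls", [])
    ]


def strict_tool_match(outputs, reference) -> bool:
    """Naive 'strict match': the same tools were called in the same order."""
    return tool_call_names(outputs) == tool_call_names(reference)
```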
https://zhuanlan.zhihu.com/p/26791319223
As large language models (LLMs) are used widely across applications, evaluating them scientifically and systematically has become an important topic. OpenEvals is an open-source project from the LangChain team that provides comprehensive evaluation tooling for LLM applications, helping developers quickly validate model behavior and improve output quality. The library is not only powerful but also highly flexible, and fits a wide range of scenarios.
Core highlights and features:
- LLM as judge: OpenEvals takes an innovative approach: use another LLM to score a model's outputs.
  - Supports multiple evaluation dimensions, such as correctness, conciseness, and hallucination.
  - Highly flexible: developers can customize prompt templates, choose different judge models, and define their own scoring criteria.
- Structured output evaluation: for tasks that produce structured data (tool calls, text extraction, etc.), OpenEvals offers precise evaluation:
  - Supports both exact match and LLM judging to confirm outputs meet expectations.
- Diverse evaluation metrics: beyond LLM judges, OpenEvals ships common evaluation methods:
  - Exact match: for tasks that require strict consistency.
  - Edit distance: measures how close generated text is to the target text.
  - Embedding similarity: evaluates semantic similarity via vector embeddings.
- Async support: every evaluator supports Python's async mode, which speeds up evaluation considerably and suits high-volume workloads.
- LangSmith integration: OpenEvals plugs directly into the LangSmith platform, recording evaluation results for experiment tracking and performance analysis.
- Test-framework integration: developers can wire OpenEvals evaluators into common test frameworks (pytest, Vitest, Jest) for automated testing.
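
A rough sketch of the LLM-as-judge workflow, assuming the `create_llm_as_judge` factory and the prebuilt `CORRECTNESS_PROMPT` that the openevals README describes (treat the exact import paths and judge-model string as assumptions):

```python
from openevals.llm import create_llm_as_judge
from openevals.prompts import CORRECTNESS_PROMPT

# Build an evaluator that asks a judge model to grade correctness.
correctness_evaluator = create_llm_as_judge(
    prompt=CORRECTNESS_PROMPT,   # prebuilt correctness rubric
    model="openai:o3-mini",      # judge model; assumption, swap in your own
    feedback_key="correctness",
)

result = correctness_evaluator(
    inputs="How much does a doodad cost?",
    outputs="A doodad costs $5.",
    reference_outputs="Doodads are $5 each.",
)
# result is a score payload, e.g. {"key": "correctness", "score": True, "comment": "..."}
print(result)
```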
https://github.com/microsoft/pyright
Pyright is a full-featured, standards-based static type checker for Python. It is designed for high performance and can be used with large Python source bases.
Pyright includes both a command-line tool and an extension for Visual Studio Code.
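
As a quick illustration (a hypothetical file, not from the Pyright docs), running `pyright example.py` on the snippet below reports a type error on the last call, because an `int` is passed where the annotation requires a `str`:

```python
# example.py -- deliberately ill-typed so the checker has something to flag.
def shout(message: str) -> str:
    return message.upper() + "!"


shout("hello")  # OK
shout(42)       # Pyright reports an error: int is not assignable to "message: str"
```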
