
LangChain 1.0 Agent Development: Structured Output

Uploaded anonymously

Published: 2025-11-10 13:45:02

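Structured output lets an agent return data in a specific, predictable format. Instead of parsing natural-language responses, you get structured data as JSON objects, Pydantic models, or dataclasses that your application can use directly. LangChain's create_agent interface handles structured output automatically: you set the desired output schema, and when the model generates structured data, that data is captured, validated, and returned under the structured_response key of the agent state.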
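The capture, validate, and return flow can be sketched with the standard library alone. This is a minimal illustration, not LangChain's implementation: the helper capture_structured_response and the two-field ContactInfo dataclass are hypothetical names introduced only for this sketch.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ContactInfo:
    """Hypothetical schema used only for this sketch."""
    name: str
    email: str


def capture_structured_response(raw_json: str, schema: type) -> dict:
    """Parse raw model output, validate it against `schema`, and return a state dict."""
    data = json.loads(raw_json)
    expected = {f.name: f.type for f in fields(schema)}
    missing = sorted(expected.keys() - data.keys())
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for name, expected_type in expected.items():
        if not isinstance(data[name], expected_type):
            raise TypeError(f"field {name!r} must be {expected_type.__name__}")
    # Mirror the agent state: the validated object lives under "structured_response".
    return {"structured_response": schema(**{name: data[name] for name in expected})}


state = capture_structured_response(
    '{"name": "John Doe", "email": "john@example.com"}', ContactInfo
)
print(state["structured_response"])
# ContactInfo(name='John Doe', email='john@example.com')
```

The application reads a typed object from the state instead of parsing free text; a real agent performs the equivalent steps internally.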

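1. Response Format

To control how the agent returns structured data, pass the response_format parameter to create_agent when creating the agent. It accepts the following values: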

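ToolStrategy[StructuredResponseT]: uses tool calling to obtain structured output
ProviderStrategy[StructuredResponseT]: uses the provider's native structured output
type[StructuredResponseT]: a schema type; the best strategy is selected automatically based on the model's capabilities
None: no structured output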

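When a schema type is provided directly, LangChain selects the strategy automatically:

For models that support native structured output (such as OpenAI and Grok), it uses ProviderStrategy.
For all other models, it uses ToolStrategy.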
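The automatic selection rule can be pictured as a simple lookup. The sketch below is an assumption-laden illustration, not LangChain's actual detection logic: the provider set and the function select_strategy are hypothetical, and the two provider names are taken only from the text above.

```python
# Assumption: providers with native structured output, as listed in the text.
NATIVE_STRUCTURED_OUTPUT_PROVIDERS = {"openai", "grok"}


def select_strategy(provider: str) -> str:
    """Return the name of the strategy that would be chosen for a bare schema type."""
    if provider.lower() in NATIVE_STRUCTURED_OUTPUT_PROVIDERS:
        return "ProviderStrategy"
    # Every other model falls back to tool calling.
    return "ToolStrategy"


print(select_strategy("OpenAI"))
print(select_strategy("some-other-provider"))
```

Passing an explicit ToolStrategy or ProviderStrategy bypasses this kind of selection entirely.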

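The structured response is returned under the structured_response key of the agent's final state.

2. ProviderStrategy (Provider-Native Strategy)

Some model providers support structured output natively through their APIs (currently only OpenAI and Grok). Where available, this is the most reliable method. To use this strategy, either set create_agent.response_format to a ProviderStrategy, or pass the schema type directly to create_agent.response_format; in the latter case, LangChain uses ProviderStrategy automatically when the model supports native structured output.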

class ProviderStrategy(Generic[SchemaT]):
    schema: type[SchemaT]

Usage example:

from pydantic import BaseModel, Field
from langchain.agents import create_agent


class ContactInfo(BaseModel):
    """Contact information for a person."""
    name: str = Field(description="The name of the person")
    email: str = Field(description="The email address of the person")
    phone: str = Field(description="The phone number of the person")

agent = create_agent(
    model="gpt-5",
    tools=[],  # no additional tools are needed for this example
    response_format=ContactInfo  # Auto-selects ProviderStrategy
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
})

result["structured_response"]
# ContactInfo(name='John Doe', email='john@example.com', phone='(555) 123-4567')

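Provider-native structured output offers high reliability and strict validation because the model provider enforces the schema. Use it whenever it is available.

3. ToolStrategy (Tool-Calling Strategy)

For models that do not support native structured output, LangChain achieves the same result through tool calling. This works with every model that supports tool calling, which most modern models do. To use this strategy, either set create_agent.response_format to a ToolStrategy, or pass the schema type directly to create_agent.response_format; in the latter case, LangChain uses ToolStrategy automatically when the model does not support native structured output.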

class ToolStrategy(Generic[SchemaT]):
    schema: type[SchemaT]
    tool_message_content: str | None
    handle_errors: Union[
        bool,
        str,
        type[Exception],
        tuple[type[Exception], ...],
        Callable[[Exception], str],
    ]

Usage example:

from pydantic import BaseModel, Field
from typing import Literal
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class MeetingAction(BaseModel):
    """Action items extracted from a meeting transcript."""
    task: str = Field(description="The specific task to be completed")
    assignee: str = Field(description="Person responsible for the task")
    priority: Literal["low", "medium", "high"] = Field(description="Priority level")

agent = create_agent(
    model="gpt-5",
    tools=[],
    response_format=ToolStrategy(
        schema=MeetingAction,
        tool_message_content="Action item captured and added to meeting notes!"  # Optional: content of the resulting ToolMessage
    )
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "From our meeting: Sarah needs to update the project timeline as soon as possible"}]
})

result["structured_response"]
# MeetingAction(task='Update project timeline', assignee='Sarah', priority='high')
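To make the mechanism concrete, here is a standard-library sketch of what happens conceptually when the schema is exposed as a tool: the model "calls" the tool, and the call's arguments become the structured response. The tool-call dictionary shape, the helper handle_structured_tool_call, and the fallback message text are assumptions for illustration, not LangChain internals.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MeetingAction:
    task: str
    assignee: str
    priority: str


def handle_structured_tool_call(
    tool_call: dict, schema: type, tool_message_content: Optional[str] = None
) -> Tuple[object, dict]:
    """Convert a tool call targeting `schema` into (structured_response, tool_message)."""
    if tool_call["name"] != schema.__name__:
        raise ValueError(f"unexpected tool: {tool_call['name']}")
    response = schema(**tool_call["args"])
    # Use the custom content if given; the fallback text here is hypothetical.
    content = tool_message_content or f"Returning structured response: {response}"
    tool_message = {"role": "tool", "tool_call_id": tool_call["id"], "content": content}
    return response, tool_message


call = {
    "id": "call_1",
    "name": "MeetingAction",
    "args": {"task": "Update project timeline", "assignee": "Sarah", "priority": "high"},
}
response, message = handle_structured_tool_call(
    call, MeetingAction, "Action item captured and added to meeting notes!"
)
print(response)
print(message["content"])
```

This also shows where tool_message_content fits: it only changes the text of the tool message appended to the conversation, not the structured response itself.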

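3.1 Error Handling

Models can make mistakes when generating structured output through tool calling. LangChain provides an intelligent retry mechanism that handles these errors automatically.

3.1.1 Multiple Structured Outputs Error

When the model incorrectly calls multiple structured-output tools, the agent returns error feedback in a ToolMessage and prompts the model to retry.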

from pydantic import BaseModel, Field
from typing import Union
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy


class ContactInfo(BaseModel):
    name: str = Field(description="Person's name")
    email: str = Field(description="Email address")

class EventDetails(BaseModel):
    event_name: str = Field(description="Name of the event")
    date: str = Field(description="Event date")

agent = create_agent(
    model="gpt-5",
    tools=[],
    response_format=ToolStrategy(Union[ContactInfo, EventDetails])  # Default: handle_errors=True
)

agent.invoke({
    "messages": [{"role": "user", "content": "Extract info: John Doe (john@example.com) is organizing Tech Conference on March 15th"}]
})
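The check behind this feedback can be sketched as follows. This is a simplified illustration under stated assumptions: the tool-call dictionaries, the function feedback_for_multiple_outputs, and the wording of the error text are hypothetical and do not reproduce LangChain's actual messages.

```python
from typing import Dict, List, Set


def feedback_for_multiple_outputs(
    tool_calls: List[Dict], schema_names: Set[str]
) -> List[Dict]:
    """Return one error tool message per structured-output call if more than one was made."""
    structured = [call for call in tool_calls if call["name"] in schema_names]
    if len(structured) <= 1:
        return []  # zero or one structured output is fine
    names = ", ".join(call["name"] for call in structured)
    return [
        {
            "role": "tool",
            "tool_call_id": call["id"],
            "content": (
                f"Error: only one structured response was expected, "
                f"but several were returned ({names}). Please fix your mistakes."
            ),
        }
        for call in structured
    ]


calls = [
    {"id": "call_1", "name": "ContactInfo", "args": {"name": "John Doe"}},
    {"id": "call_2", "name": "EventDetails", "args": {"event_name": "Tech Conference"}},
]
for msg in feedback_for_multiple_outputs(calls, {"ContactInfo", "EventDetails"}):
    print(msg["tool_call_id"], msg["content"])
```

Each offending call receives its own tool message, so the model sees the error in context and can answer again with a single structured output.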

3.1.2 Schema Validation Error

When the structured output does not match the expected schema, the agent returns specific error feedback and prompts the model to retry.

from pydantic import BaseModel, Field
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ProductRating(BaseModel):
    rating: int | None = Field(description="Rating from 1-5", ge=1, le=5)
    comment: str = Field(description="Review comment")

agent = create_agent(
    model="gpt-5",
    tools=[],
    response_format=ToolStrategy(ProductRating),  # Default: handle_errors=True
    system_prompt="You are a helpful assistant that parses product reviews. Do not make any field or value up."
)

agent.invoke({
    "messages": [{"role": "user", "content": "Parse this: Amazing product, 10/10!"}]
})
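The validate-and-retry cycle can be simulated without a real model. In this sketch, fake_model, run_with_retries, and max_attempts are hypothetical stand-ins, and the dataclass replaces the Pydantic model so that the example runs with the standard library only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProductRating:
    rating: Optional[int]
    comment: str

    def __post_init__(self) -> None:
        # Same constraint as ge=1, le=5 in the Pydantic version.
        if self.rating is not None and not 1 <= self.rating <= 5:
            raise ValueError("rating must be between 1 and 5")


def run_with_retries(
    model: Callable[[Optional[str]], dict], max_attempts: int = 3
) -> ProductRating:
    """Ask the model for arguments, feeding validation errors back until they pass."""
    feedback = None
    for _ in range(max_attempts):
        args = model(feedback)
        try:
            return ProductRating(**args)
        except (TypeError, ValueError) as exc:
            feedback = f"Error: {exc}. Please fix your mistakes."
    raise RuntimeError("no valid structured output produced")


def fake_model(feedback: Optional[str]) -> dict:
    """Stand-in for an LLM: returns an out-of-range rating first, then corrects it."""
    if feedback is None:
        return {"rating": 10, "comment": "Amazing product"}
    return {"rating": 5, "comment": "Amazing product"}


print(run_with_retries(fake_model))
# ProductRating(rating=5, comment='Amazing product')
```

The first attempt fails validation because 10 is outside the 1 to 5 range; the error text is passed back, and the second attempt succeeds.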

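3.1.3 Custom Error Handling Strategies

Use the handle_errors parameter to customize how errors are handled:

True: the default; catches all errors, uses the default error template, and retries automatically
str: catches all errors, uses the given custom message, and retries automatically
type[Exception]: catches only that exception type and retries with the default message; any other error is raised immediately
tuple[type[Exception], ...]: catches only the listed exception types and retries with the default message; any other error is raised immediately
Callable[[Exception], str]: a custom function that receives the error and returns the error message
False: no retry; the exception propagates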
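The options above amount to a small dispatch on the type of handle_errors. The following sketch shows one way such a dispatch could look; resolve_error_message and DEFAULT_TEMPLATE are hypothetical names, and the template wording is an assumption rather than LangChain's actual default.

```python
from typing import Callable, Optional, Tuple, Type, Union

# Assumption: placeholder wording for the default error template.
DEFAULT_TEMPLATE = "Error: {error}. Please fix your mistakes."

HandleErrors = Union[
    bool, str, Type[Exception], Tuple[Type[Exception], ...], Callable[[Exception], str]
]


def resolve_error_message(handle_errors: HandleErrors, error: Exception) -> Optional[str]:
    """Return the feedback to send to the model, or None if the error should propagate."""
    if handle_errors is False:
        return None
    if handle_errors is True:
        return DEFAULT_TEMPLATE.format(error=error)
    if isinstance(handle_errors, str):
        return handle_errors
    # Check exception classes before the generic callable case: classes are callable too.
    if isinstance(handle_errors, type) and issubclass(handle_errors, Exception):
        return DEFAULT_TEMPLATE.format(error=error) if isinstance(error, handle_errors) else None
    if isinstance(handle_errors, tuple):
        return DEFAULT_TEMPLATE.format(error=error) if isinstance(error, handle_errors) else None
    if callable(handle_errors):
        return handle_errors(error)
    raise TypeError(f"unsupported handle_errors value: {handle_errors!r}")


err = ValueError("bad rating")
print(resolve_error_message(True, err))
print(resolve_error_message("Try again.", err))
print(resolve_error_message(KeyError, err))
```

A None result here stands for "do not retry, let the exception propagate", which covers both handle_errors=False and an exception type that does not match.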

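Example (passing a callable to handle_errors):

from typing import Union
from langchain.agents.structured_output import (
    MultipleStructuredOutputsError,
    StructuredOutputValidationError,
    ToolStrategy,
)

# Custom function that maps each error to the feedback sent to the model
def custom_error_handler(error: Exception) -> str:
    if isinstance(error, StructuredOutputValidationError):
        return "There was an issue with the format. Try again."
    elif isinstance(error, MultipleStructuredOutputsError):
        return "Multiple structured outputs were returned. Pick the most relevant one."
    else:
        return f"Error: {str(error)}"

# ContactInfo and EventDetails are the models defined in section 3.1.1
ToolStrategy(
    schema=Union[ContactInfo, EventDetails],
    handle_errors=custom_error_handler
)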