6.2 · Model Invocation Pipeline

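Application orchestration and external knowledge integration · This page is a standalone chapter of the Chinese translation of the Dify DeepWiki. It preserves the original links, source anchors, module tags, and chapter hierarchy.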

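Project: Dify · Chapter: 6.2 · Status: full translation · Modules: Model invocation and provider adaptation; Ingestion and parsing; System architecture; Document objects and metadata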

Source references

api/core/agent/base_agent_runner.py
api/core/agent/cot_agent_runner.py
api/core/agent/cot_chat_agent_runner.py
api/core/agent/cot_completion_agent_runner.py
api/core/agent/entities.py
api/core/agent/fc_agent_runner.py
api/core/agent/output_parser/cot_output_parser.py
api/core/app/app_config/easy_ui_based_app/agent/manager.py
api/core/app/app_config/easy_ui_based_app/model_config/converter.py
api/core/app/apps/agent_chat/app_runner.py

Module tags

Model invocation and provider adaptation
Ingestion and parsing
System architecture
Document objects and metadata
Agent runtime

Translated text

Original DeepWiki page: https://deepwiki.com/langgenius/dify/6.2-model-invocation-pipeline

Translated at: 2026-05-27T08:44:28.700Z

Translation model: deepseek-chat

Source character count: 14024

Project: Dify (dify)

---

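Model Invocation Pipeline

Purpose and Scope

This document describes the complete invocation pipeline for large language models (LLMs) in Dify, from prompt construction through response handling and usage tracking. The pipeline supports streaming and blocking responses, structured output, reasoning-content extraction, multimodal input, and load balancing. The document explains how HostingConfiguration supplies credentials and how the ProviderManager and ModelInstance classes create model instances for invocation.

For model provider configuration and credential management, see Provider and Model Architecture. For quota management and credit pools, see Quota Management and Credit Pools.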

---

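Core Invocation Architecture

The model invocation pipeline centers on ModelManager and ProviderManager, which resolve configuration into executable ModelInstance objects.

System Flow Overview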

graph TB
    subgraph "Entry points"
        LLMNode["LLMNode.run()"]
        AgentRunner["BaseAgentRunner.run()"]
        WorkflowNode["WorkflowNode.execute()"]
    end

    subgraph "Model resolution"
        PM["ProviderManager.get_configurations()"]
        MI_INIT["ModelInstance.__init__"]
        Bundle["ProviderModelBundle"]
        MInstance["ModelInstance"]
    end

    subgraph "Credential management"
        CP["ProviderConfiguration.get_current_credentials()"]
        LB["LBModelManager"]
        Hosting["HostingConfiguration"]
    end

    subgraph "Invocation execution"
        InvokeLLM["ModelInstance.invoke_llm()"]
        Runtime["ModelRuntime"]
        AIModel["LargeLanguageModel.invoke()"]
    end

    LLMNode --> MI_INIT
    AgentRunner --> MI_INIT
    WorkflowNode --> MI_INIT

    MI_INIT --> PM
    PM --> Bundle
    Bundle --> MInstance

    MInstance --> CP
    CP --> LB
    CP --> Hosting

    MInstance --> InvokeLLM
    InvokeLLM --> Runtime
    Runtime --> AIModel

Sources: api/core/model_manager.py:35-56, api/core/provider_manager.py:64-136, api/core/workflow/nodes/llm/node.py:70-78

---

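Credentials and Hosting Configuration

Dify handles credentials for hosted models in a uniform way through the HostingConfiguration class. This matters most in the Dify cloud edition, where Dify manages trial and paid quotas on behalf of users.

Hosted Providers and Quotas

The HostingConfiguration class initializes a provider_map containing HostingProvider objects for providers such as OpenAI, Anthropic, and Gemini.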

classDiagram
    class HostingConfiguration {
        +dict provider_map
        +init_app(app)
        +init_openai() HostingProvider
        +init_azure_openai() HostingProvider
    }
    class HostingProvider {
        +bool enabled
        +dict credentials
        +QuotaUnit quota_unit
        +list quotas
    }
    class HostingQuota {
        +ProviderQuotaType quota_type
        +list restrict_models
    }
    class RestrictModel {
        +str model
        +str base_model_name
        +ModelType model_type
    }
    HostingConfiguration --> HostingProvider
    HostingProvider --> HostingQuota
    HostingQuota --> RestrictModel

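Key implementation details:

Initialization: init_app populates the provider map only after confirming that dify_config.EDITION == "CLOUD" (api/core/hosting_configuration.py:52-66).
Credential injection: for Azure OpenAI, the method reads the key from dify_config.HOSTED_AZURE_OPENAI_API_KEY and sets a default base_model_name (api/core/hosting_configuration.py:71-76).
Model restrictions: quotas are usually limited to specific models through RestrictModel objects, so that users on the trial or paid tier can reach only the permitted models (api/core/hosting_configuration.py:82-121).

Sources: api/core/hosting_configuration.py:43-164, api/core/entities/provider_entities.py:8-32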
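
The edition check and the model restriction described above can be sketched roughly as follows. This is a simplified illustration, not Dify's actual code: the function names build_provider_map and is_model_allowed, the "azure_openai" map key, the "api_key" credential field, and the model name "example-chat-model" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class ProviderQuotaType(Enum):
    # Illustrative subset of quota types.
    TRIAL = "trial"
    PAID = "paid"


@dataclass
class RestrictModel:
    model: str
    base_model_name: Optional[str] = None


@dataclass
class HostingQuota:
    quota_type: ProviderQuotaType
    restrict_models: List[RestrictModel] = field(default_factory=list)


@dataclass
class HostingProvider:
    enabled: bool
    credentials: Optional[dict] = None
    quotas: List[HostingQuota] = field(default_factory=list)


def build_provider_map(edition: str, azure_api_key: Optional[str]) -> Dict[str, HostingProvider]:
    """Hypothetical stand-in for init_app: fill the map only on the cloud edition."""
    provider_map: Dict[str, HostingProvider] = {}
    if edition != "CLOUD":
        return provider_map  # self-hosted editions get no hosted providers

    if azure_api_key:
        trial = HostingQuota(
            quota_type=ProviderQuotaType.TRIAL,
            # "example-chat-model" is a placeholder, not a real restriction list.
            restrict_models=[RestrictModel("example-chat-model", "example-chat-model")],
        )
        provider_map["azure_openai"] = HostingProvider(
            enabled=True,
            credentials={"api_key": azure_api_key},  # assumed credential field name
            quotas=[trial],
        )
    else:
        provider_map["azure_openai"] = HostingProvider(enabled=False)
    return provider_map


def is_model_allowed(provider: HostingProvider, quota_type: ProviderQuotaType, model: str) -> bool:
    """Return True only if the model is listed under the given quota type."""
    if not provider.enabled:
        return False
    for quota in provider.quotas:
        if quota.quota_type is quota_type:
            return any(r.model == model for r in quota.restrict_models)
    return False
```

Keeping the restriction list on the quota rather than on the provider lets trial and paid tiers expose different model sets while sharing one set of hosted credentials.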

---

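Model Instance Creation

ModelInstance is the primary object used for execution. It encapsulates the model name, the provider bundle, and the resolved credentials.

Credential Resolution Flow

When a ModelInstance is initialized, it resolves credentials through a hierarchical lookup:

Provider configuration: checks whether the tenant is using ProviderType.SYSTEM or ProviderType.CUSTOM (api/core/entities/provider_configuration.py:137-158).
Load balancing: if load balancing is enabled for a custom provider, an LBModelManager is instantiated to choose among multiple credential sets (api/core/model_manager.py:83-116).
Model level versus provider level: credentials can be defined for a specific model or inherited from the provider's global settings (api/core/entities/provider_configuration.py:162-170).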

graph TD
    Start["ModelInstance.__init__"] --> CheckBundle["_fetch_credentials_from_bundle()"]
    CheckBundle --> GetConfig["ProviderConfiguration.get_current_credentials()"]

    GetConfig --> TypeCheck{"using_provider_type?"}
    TypeCheck -->|SYSTEM| SystemCreds["Return SystemConfiguration.credentials"]
    TypeCheck -->|CUSTOM| CustomCreds["Check CustomModelConfiguration, then CustomProviderConfiguration"]

    CustomCreds --> LBCheck{"LB enabled?"}
    LBCheck -->|Yes| LBManager["LBModelManager selects credentials"]
    LBCheck -->|No| ReturnCreds["Return the resolved dict"]

Sources: api/core/model_manager.py:40-56, api/core/entities/provider_configuration.py:122-181, api/services/model_load_balancing_service.py:92-168
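
The lookup order in the diagram (system credentials, otherwise model-level custom credentials, otherwise provider-level custom credentials) can be illustrated with a small sketch. The class ProviderConfigSketch and its field names are hypothetical; only ProviderType.SYSTEM, ProviderType.CUSTOM, and the method name get_current_credentials come from the text above, and the real method's signature may differ.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class ProviderType(Enum):
    SYSTEM = "system"
    CUSTOM = "custom"


@dataclass
class ProviderConfigSketch:
    """Hypothetical, flattened stand-in for a provider configuration."""
    using_provider_type: ProviderType
    system_credentials: Optional[dict] = None
    custom_provider_credentials: Optional[dict] = None
    # Model-level overrides, keyed by model name.
    custom_model_credentials: Dict[str, dict] = field(default_factory=dict)

    def get_current_credentials(self, model: str) -> Optional[dict]:
        # 1. Hosted (system) credentials win when the tenant uses the system provider.
        if self.using_provider_type is ProviderType.SYSTEM:
            return self.system_credentials
        # 2. For custom providers, a model-specific entry takes precedence.
        if model in self.custom_model_credentials:
            return self.custom_model_credentials[model]
        # 3. Otherwise fall back to the provider-wide credentials.
        return self.custom_provider_credentials
```

A caller would receive None when nothing is configured at any level, which makes it the caller's job to decide whether that is an error.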

---

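Prompt Message Construction

The pipeline assembles PromptMessage objects. Components such as TokenBufferMemory and SimplePromptTransform manage conversation history and template rendering while respecting the model's token limit.

Memory and File Integration

The TokenBufferMemory class builds prompt messages containing multimodal content (such as images) by converting MessageFile entities into PromptMessageContent (api/core/memory/token_buffer_memory.py:47-121).

Component	Function	Purpose
`TokenBufferMemory`	`get_history_prompt_messages`	Fetches and truncates the conversation history `api/core/memory/token_buffer_memory.py:122-154`
`SimplePromptTransform`	`get_prompt`	Converts application inputs and templates into a list of `PromptMessage` objects `api/core/prompt/simple_prompt_transform.py:48-60`
`PromptMessage`	Entity	Data structure for the SYSTEM, USER, and ASSISTANT roles `api/graphon/model_runtime/entities/message_entities.py:16-25`

Sources: api/core/memory/token_buffer_memory.py:30-123, api/core/prompt/simple_prompt_transform.py:43-91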
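
One common way to truncate history against a token limit is to drop the oldest messages until the remainder fits. The sketch below illustrates that idea under stated assumptions: truncate_history and the whitespace-based word_count tokenizer are hypothetical, and the text above does not confirm that TokenBufferMemory uses exactly this strategy.

```python
from typing import Callable, List, Tuple

# A message is modelled as a (role, text) pair for illustration only.
Message = Tuple[str, str]


def word_count(messages: List[Message]) -> int:
    """Toy token counter: one token per whitespace-separated word."""
    return sum(len(text.split()) for _, text in messages)


def truncate_history(
    messages: List[Message],
    max_tokens: int,
    count_tokens: Callable[[List[Message]], int] = word_count,
) -> List[Message]:
    """Drop the oldest messages until the history fits within max_tokens."""
    kept = list(messages)  # copy so the caller's list is left untouched
    while kept and count_tokens(kept) > max_tokens:
        kept.pop(0)  # discard the oldest message first
    return kept
```

In a real pipeline the counter would be the model's own tokenizer, since word counts can differ substantially from token counts.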

---

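Invocation Execution

The ModelInstance.invoke_llm method is the final gateway to the model runtime. It supports the different invocation modes that agents and workflows require.

Invocation Modes

Blocking mode: returns a complete LLMResult (api/core/model_manager.py:130-138).
Streaming mode: returns a generator of LLMResultChunk objects (api/core/model_manager.py:119-127).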
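
The difference between the two modes can be sketched with a single function whose return type depends on a stream flag. Everything here is illustrative: ResultSketch, ResultChunkSketch, fake_model_stream, and invoke_llm_sketch are hypothetical stand-ins, and the fake model simply echoes the prompt word by word.

```python
from dataclasses import dataclass
from typing import Iterator, Union


@dataclass
class ResultChunkSketch:
    delta: str  # incremental piece of text


@dataclass
class ResultSketch:
    text: str  # full response text


def fake_model_stream(prompt: str) -> Iterator[ResultChunkSketch]:
    """Hypothetical model: echoes the prompt one word at a time."""
    for word in prompt.split():
        yield ResultChunkSketch(delta=word + " ")


def invoke_llm_sketch(prompt: str, stream: bool = True) -> Union[ResultSketch, Iterator[ResultChunkSketch]]:
    chunks = fake_model_stream(prompt)
    if stream:
        # Streaming mode: hand the generator to the caller, who consumes it lazily.
        return chunks
    # Blocking mode: drain the generator and return one complete result.
    return ResultSketch(text="".join(chunk.delta for chunk in chunks))
```

A caller in streaming mode iterates over the returned generator, for example `for chunk in invoke_llm_sketch("hello world"): ...`, while a caller in blocking mode reads `.text` from the single result.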

sequenceDiagram
    participant App as Runner (Agent/Workflow)
    participant MI as ModelInstance
    participant LB as LBModelManager
    participant RT as ModelRuntime

    App->>MI: invoke_llm(prompt_messages, stream=True)
    MI->>LB: (optional) select credentials
    MI->>RT: model_type_instance.invoke(...)
    RT->>RT: LargeLanguageModel.invoke(...)
    RT-->>MI: yield chunk
    MI-->>App: yield chunk

Agent invocation example: CotAgentRunner calls the model in a loop, parsing thoughts and actions from the stream (api/core/agent/cot_agent_runner.py:126-133). Usage is tracked and accumulated across iterations (api/core/agent/cot_agent_runner.py:89-99).

Sources: api/core/model_manager.py:151-165, api/core/agent/cot_agent_runner.py:47-133, api/core/agent/fc_agent_runner.py:93-100
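
The loop-and-accumulate pattern described for the agent runner might look roughly like the sketch below. It is a simplified illustration, not the CotAgentRunner code: UsageSketch, run_agent_loop, and the call_model callback are hypothetical, and an action of None is assumed to mean the model produced a final answer.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class UsageSketch:
    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


# call_model(iteration) returns (action, usage); action is None for a final answer.
ModelCall = Callable[[int], Tuple[Optional[str], UsageSketch]]


def run_agent_loop(call_model: ModelCall, max_iterations: int) -> Tuple[int, UsageSketch]:
    """Call the model repeatedly and sum usage over every iteration."""
    total = UsageSketch()
    iterations = 0
    while iterations < max_iterations:
        action, usage = call_model(iterations)
        iterations += 1
        # Accumulate usage even for the final iteration, so billing covers all calls.
        total.prompt_tokens += usage.prompt_tokens
        total.completion_tokens += usage.completion_tokens
        if action is None:
            break  # the model answered directly; no tool call to execute
    return iterations, total
```

The max_iterations bound keeps a model that never stops requesting actions from looping indefinitely.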

---

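Load Balancing and High Availability

Dify supports model load balancing, which distributes requests across multiple API keys or providers to avoid rate limits.

Enabling: managed through ModelLoadBalancingService.enable_model_load_balancing (api/services/model_load_balancing_service.py:50-69).
Configuration: stores multiple LoadBalancingModelConfig records, each holding encrypted credentials (api/services/model_load_balancing_service.py:129-145).
Inheritance: a configuration can inherit credentials from the existing provider configuration by using the reserved name __inherit__ (api/services/model_load_balancing_service.py:150-168).

Sources: api/services/model_load_balancing_service.py:45-168, api/core/model_manager.py:83-116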
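
To make the idea of spreading requests across credential sets concrete, here is a minimal in-memory sketch of round-robin selection with a cooldown for credentials that recently hit a rate limit. The selection strategy is an assumption chosen for illustration; the text above does not specify how LBModelManager picks credentials. The class RoundRobinCredentialPicker, its methods, and the "id" key are hypothetical.

```python
import time
from typing import Callable, Dict, List, Optional


class RoundRobinCredentialPicker:
    """Hypothetical picker: rotates through configs, skipping those in cooldown."""

    def __init__(self, configs: List[dict], clock: Callable[[], float] = time.monotonic):
        self._configs = configs
        self._index = 0
        self._cooldown_until: Dict[str, float] = {}
        self._clock = clock  # injectable so tests can control time

    def cooldown(self, config_id: str, seconds: float) -> None:
        """Mark a config as unavailable, e.g. after a rate-limit error."""
        self._cooldown_until[config_id] = self._clock() + seconds

    def fetch_next(self) -> Optional[dict]:
        """Return the next available config, or None if all are cooling down."""
        for _ in range(len(self._configs)):
            config = self._configs[self._index % len(self._configs)]
            self._index += 1
            if self._cooldown_until.get(config["id"], 0.0) <= self._clock():
                return config
        return None
```

A caller would typically invoke cooldown for a credential set after a rate-limit response and then call fetch_next again to retry with a different one.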