6.4 · LLM Adapter 架构（LLM Adapter Architecture）

记忆管道与知识图谱构建 · 本章是 Cognee DeepWiki 中文译文的独立章节页，保留原始链接、源码锚点、模块标签和章节层级。

项目Cognee 章节6.4 状态全文译文模块模型调用与提供方适配、界面与交互、系统架构、测试、发布与运维

项目要点页2.5 参考项目项目章节目录Cognee DeepWiki 原始章节LLM Adapter Architecture 上一章6.3 下一章6.5

源码线索

cognee-mcp/src/strip_vectors.py
cognee/infrastructure/llm/LLMGateway.py
cognee/infrastructure/llm/structured_output_framework/baml/baml_src/extraction/acreate_structured_output.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/azure_openai/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llama_cpp/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py

模块标签

模型调用与提供方适配
界面与交互
系统架构
测试、发布与运维
记忆与上下文

中文译文

LLM Adapter 架构（中文译文）

原始 DeepWiki 页面：https://deepwiki.com/topoteretes/cognee/6.4-llm-adapter-architecture

翻译时间：2026-05-27T08:45:19.486Z

翻译模型：deepseek-chat

原文字符数：12936

项目：Cognee (cognee)

---

大语言模型（LLM）适配器架构

架构总览

大语言模型适配器系统结合了基于协议（Protocol）的接口和类继承，在保持类型安全的同时提供了可扩展性。该架构由四层组成：

网关层：LLMGateway 负责高层路由、上下文注入和使用量跟踪 cognee/infrastructure/llm/LLMGateway.py:52-111。
协议层：LLMInterface 定义了所有适配器的契约 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py:8-34。
基础适配器层：GenericAPIAdapter 使用 litellm 和 instructor 为兼容 OpenAI 的 API 提供通用功能 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py:58-114。
提供商适配器：针对每个大语言模型提供商的专门实现。

适配器类层次结构

graph TB
    LLMInterface["LLMInterface<br/>(协议)"]
    GenericAPIAdapter["GenericAPIAdapter<br/>(基类)"]
    OpenAIAdapter["OpenAIAdapter"]
    AzureOpenAIAdapter["AzureOpenAIAdapter"]
    GeminiAdapter["GeminiAdapter"]
    MistralAdapter["MistralAdapter"]
    AnthropicAdapter["AnthropicAdapter"]
    OllamaAPIAdapter["OllamaAPIAdapter"]
    LlamaCppAPIAdapter["LlamaCppAPIAdapter"]

    LLMInterface -.实现.-> GenericAPIAdapter
    LLMInterface -.实现.-> AnthropicAdapter
    LLMInterface -.实现.-> OllamaAPIAdapter
    LLMInterface -.实现.-> LlamaCppAPIAdapter

    GenericAPIAdapter -->|继承| OpenAIAdapter
    OpenAIAdapter -->|继承| AzureOpenAIAdapter
    GenericAPIAdapter -->|继承| GeminiAdapter
    GenericAPIAdapter -->|继承| MistralAdapter

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py:58-114、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py:37-56、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/azure_openai/adapter.py:38-44、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py:32-39、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llama_cpp/adapter.py:34-57

LLMGateway 与上下文注入

LLMGateway 是所有大语言模型请求的入口点。它负责选择结构化输出框架（Instructor 或 BAML）、注入持久化内存上下文以及记录会话使用量。

内存注入逻辑

在请求发送到适配器之前，LLMGateway 会调用 _inject_agent_memory cognee/infrastructure/llm/LLMGateway.py:65。此函数使用 get_current_agent_memory_context() 检索活动的 AgentMemoryContext cognee/infrastructure/llm/LLMGateway.py:15-17。如果内存上下文处于活动状态且包含检索到的数据，则会将其前置到原始用户输入之前 cognee/infrastructure/llm/LLMGateway.py:18-21。

def _inject_agent_memory(text_input: str) -> str:
    from cognee.modules.agent_memory import get_current_agent_memory_context

    context = get_current_agent_memory_context()
    if context is None or not context.memory_context:
        return text_input

    return f"附加内存上下文：\n{context.memory_context}\n\n原始输入：\n{text_input}"

来源：cognee/infrastructure/llm/LLMGateway.py:14-21、cognee/modules/agent_memory/runtime.py:79-81

使用量跟踪

网关将大语言模型调用包装在 _record_session_usage_after 中 cognee/infrastructure/llm/LLMGateway.py:92。此工具使用 record_llm_call 将输入文本、模型名称和序列化响应（对于 Pydantic 模型使用 model_dump_json()）记录到活动会话跟踪器中 cognee/infrastructure/llm/LLMGateway.py:35-46。

来源：cognee/infrastructure/llm/LLMGateway.py:24-49

LLMInterface 协议

LLMInterface 协议定义了所有大语言模型适配器必须实现的最小契约。它使用 Python 的 Protocol 类型实现结构子类型化。

核心方法

方法	用途	返回类型
`acreate_structured_output()`	使用 Pydantic 模型进行异步结构化输出生成。	`BaseModel`
`create_transcript()`	处理音频到文本的转录。	`TranscriptionReturnType`
`transcribe_image()`	处理图像到文本/描述的任务。	`str`

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py:8-34、cognee/infrastructure/llm/LLMGateway.py:59-111

大语言模型适配器实现

GenericAPIAdapter

这是使用 litellm 和 instructor 组合的提供商的基类。它使用 tenacity 实现了标准重试逻辑 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py:118-126，并在主模型因内容策略违规失败时支持回退模型 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py:187-210。

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py:58-210

OpenAI 和 AzureOpenAI

OpenAIAdapter 处理标准 GPT 模型，并默认对新模型使用 json_schema_mode cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py:58-104。AzureOpenAIAdapter 扩展了此功能以支持 Azure 特定的端点和通过 DefaultAzureCredential 实现的托管标识认证 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/azure_openai/adapter.py:108-171。

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py:37-106、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/azure_openai/adapter.py:38-171

LlamaCppAPIAdapter

通过 llama-cpp-python 支持本地大语言模型执行。它提供两种模式：

服务器模式：连接到兼容 OpenAI 的 HTTP 服务器 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llama_cpp/adapter.py:127-141。
本地模式：直接在进程中加载模型文件，并使用 instructor 修补 Llama 对象 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llama_cpp/adapter.py:96-125。

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llama_cpp/adapter.py:34-125

多模态能力

适配器实现了 create_transcript 和 transcribe_image 以处理非文本输入。

转录：MistralAdapter 使用原生 Mistral 客户端的音频 API cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py:168-174。OllamaAPIAdapter 使用兼容 OpenAI 的 whisper 端点 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py:161-166。
视觉：OllamaAPIAdapter 通过将图像进行 base64 编码，并使用 "这张图片里有什么？" 提示发送给模型来实现 transcribe_image cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py:203-219。

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py:148-176、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py:142-219

代理内存集成

大语言模型适配器架构与 agent_memory 模块紧密耦合，该模块使用 contextvars 在异步调用之间维护状态。

执行上下文数据流

sequenceDiagram
    participant App["应用程序代码"]
    participant Decorator["@agent_memory"]
    participant Runtime["AgentMemoryContext"]
    participant Gateway["LLMGateway"]
    participant LLM["大语言模型适配器"]

    App->>Decorator: 调用 support_agent()
    Decorator->>Runtime: 创建 AgentMemoryConfig
    Decorator->>Runtime: set_current_agent_memory_context()
    Decorator->>App: 执行 support_agent 主体
    App->>Gateway: acreate_structured_output(text_input)
    Gateway->>Gateway: _inject_agent_memory(text_input)
    Note over Gateway: 从 Runtime 检索 memory_context
    Gateway->>LLM: acreate_structured_output(enriched_input)
    LLM-->>Gateway: Pydantic 响应
    Gateway->>Gateway: _record_session_usage_after()
    Gateway-->>App: 结果
    App-->>Decorator: 返回结果
    Decorator->>Runtime: reset_current_agent_memory_context()

来源：cognee/infrastructure/llm/LLMGateway.py:59-92、cognee/modules/agent_memory/runtime.py:59-95、examples/guides/agent_memory_quickstart.py:43-52

结果消毒

对于 MCP 和其他上下文敏感的客户端，strip_vectors cognee-mcp/src/strip_vectors.py:15-27 用于在将搜索结果传递给大语言模型或客户端之前递归移除大型 text_vector 字段，以防止上下文窗口耗尽 cognee-mcp/src/strip_vectors.py:4-9。

来源：cognee-mcp/src/strip_vectors.py:1-27