
6.1 · LLM Client Architecture

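Temporal knowledge graph and dynamic fact memory · This chapter is a standalone page from a translation of the Graphiti DeepWiki; it preserves the original links, source anchors, module tags, and section hierarchy.

Project: Graphiti · Section: 6.1 · Status: full translation · Modules: model invocation and provider adaptation, interfaces and service contracts, UI and interaction, ingestion and parsing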
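Module tags
  • Model invocation and provider adaptation
  • Interfaces and service contracts
  • UI and interaction
  • Ingestion and parsing
  • System architecture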


Original DeepWiki page: https://deepwiki.com/getzep/graphiti/6.1-llm-client-architecture
Translated at: 2026-05-27T08:45:05.510Z
Translation model: deepseek-chat
Source length: 16253 characters
Project: Graphiti (graphiti)

---

Large Language Model (LLM) Client Architecture

Relevant source files

The following files were used as context for generating this wiki page:

  • examples/azure-openai/azure_openai_neo4j.py
  • examples/gliner2/.env.example
  • examples/gliner2/README.md
  • examples/gliner2/gliner2_neo4j.py
  • graphiti_core/cross_encoder/gemini_reranker_client.py
  • graphiti_core/embedder/azure_openai.py
  • graphiti_core/embedder/gemini.py
  • graphiti_core/llm_client/anthropic_client.py
  • graphiti_core/llm_client/azure_openai_client.py
  • graphiti_core/llm_client/client.py
  • graphiti_core/llm_client/config.py
  • graphiti_core/llm_client/errors.py
  • graphiti_core/llm_client/gemini_client.py
  • graphiti_core/llm_client/gliner2_client.py
  • graphiti_core/llm_client/groq_client.py
  • graphiti_core/llm_client/openai_base_client.py
  • graphiti_core/llm_client/openai_client.py
  • graphiti_core/llm_client/openai_generic_client.py
  • graphiti_core/llm_client/token_tracker.py
  • mcp_server/config/mcp_config_stdio_example.json
  • mcp_server/src/services/factories.py
  • tests/cross_encoder/test_gemini_reranker_client.py
  • tests/llm_client/test_anthropic_client.py
  • tests/llm_client/test_azure_openai_client.py
  • tests/llm_client/test_errors.py
  • tests/llm_client/test_gemini_client.py
  • tests/test_text_utils.py

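This document describes the large language model (LLM) client subsystem in graphiti-core: the abstract base class, the configuration object, every concrete provider implementation, and the cross-cutting behavior that wraps each LLM call (retry logic, caching, tracing, and token tracking).

For the integration of embedding and reranking services, see 6.2. For the prompt templates passed to these clients, see 6.3. For end-to-end configuration of a specific provider, see 9.3.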

---

Class Hierarchy

All LLM clients in Graphiti share a single inheritance tree rooted at LLMClient.

Class hierarchy of the LLM clients

classDiagram
    class LLMClient {
        +LLMConfig config
        +str model
        +str small_model
        +float temperature
        +int max_tokens
        +bool cache_enabled
        +Tracer tracer
        +TokenUsageTracker token_tracker
        +set_tracer(tracer)
        +generate_response(messages, response_model, max_tokens, model_size, group_id, prompt_name)
        #_generate_response(messages, response_model, max_tokens, model_size)*
        #_clean_input(input)
        #_get_cache_key(messages)
        #_generate_response_with_retry(...)
    }

    class BaseOpenAIClient {
        +int MAX_RETRIES
        #_convert_messages_to_openai_format(messages)
        #_get_model_for_size(model_size)
        #_handle_structured_response(response)
        #_handle_json_response(response)
        #_create_completion()*
        #_create_structured_completion()*
    }

    class OpenAIClient {
        +AsyncOpenAI client
        #_create_structured_completion(...)
        #_create_completion(...)
    }

    class AzureOpenAILLMClient {
        +AsyncAzureOpenAI|AsyncOpenAI client
        +int MAX_RETRIES
        #_create_structured_completion(...)
        #_create_completion(...)
        #_handle_structured_response(response)
        #_supports_reasoning_features(model)
    }

    class OpenAIGenericClient {
        +AsyncOpenAI client
        +int MAX_RETRIES
        #_generate_response(...)
        +generate_response(...)
    }

    class AnthropicClient {
        +AsyncAnthropic client
        #_create_tool(response_model)
        #_extract_json_from_text(text)
        #_resolve_max_tokens(requested, model)
        #_generate_response(...)
    }

    class GeminiClient {
        +genai.Client client
        +int MAX_RETRIES
        #_check_safety_blocks(response)
        #_check_prompt_blocks(response)
        #_resolve_max_tokens(requested, model)
        #_generate_response(...)
    }

    class GroqClient {
        +AsyncGroq client
        #_generate_response(...)
    }

    class GLiNER2Client {
        +generate_response(...)
    }

    LLMClient <|-- BaseOpenAIClient
    LLMClient <|-- OpenAIGenericClient
    LLMClient <|-- AnthropicClient
    LLMClient <|-- GeminiClient
    LLMClient <|-- GroqClient
    LLMClient <|-- GLiNER2Client
    BaseOpenAIClient <|-- OpenAIClient
    BaseOpenAIClient <|-- AzureOpenAILLMClient

Sources: graphiti_core/llm_client/client.py:71-147, graphiti_core/llm_client/openai_base_client.py:40-95, graphiti_core/llm_client/openai_client.py:27-125, graphiti_core/llm_client/azure_openai_client.py:31-167, graphiti_core/llm_client/openai_generic_client.py:37-214, graphiti_core/llm_client/anthropic_client.py:103-150, graphiti_core/llm_client/gemini_client.py:72-127, graphiti_core/llm_client/groq_client.py:48-85, graphiti_core/llm_client/gliner2_client.py:34-118

---

LLMConfig

LLMConfig is the configuration object passed to every client constructor. It is a plain Python class (not a Pydantic model).

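| Field | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | `str \| None` | None | Provider API key |
| model | `str \| None` | None | Primary model identifier |
| small_model | `str \| None` | None | Smaller, cheaper model for simple prompts |
| base_url | `str \| None` | None | Override for the API base URL (e.g., for a local endpoint) |
| temperature | float | 1.0 | Sampling temperature graphiti_core/llm_client/config.py:20 |
| max_tokens | int | 16384 | Maximum number of output tokens graphiti_core/llm_client/config.py:19 |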

ModelSize is an enum with two values: small and medium graphiti_core/llm_client/config.py:23-25. Every call to generate_response accepts a model_size parameter; clients route ModelSize.small to small_model and ModelSize.medium to model.

Sources: graphiti_core/llm_client/config.py:19-69
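
The size-based routing described above can be sketched as follows. This is a minimal illustration, not the library's code: SketchConfig and pick_model are hypothetical stand-ins, and the fallback order used when a field is None is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ModelSize(Enum):
    # Two sizes, mirroring the enum described in the text.
    small = 'small'
    medium = 'medium'


@dataclass
class SketchConfig:
    """Hypothetical stand-in for LLMConfig; only the routing fields are kept."""

    model: Optional[str] = None
    small_model: Optional[str] = None


def pick_model(config: SketchConfig, size: ModelSize, default: str) -> str:
    """Route small -> small_model and medium -> model.

    Falling back to `model` and then `default` is an assumption of this sketch.
    """
    if size is ModelSize.small:
        return config.small_model or config.model or default
    return config.model or default


config = SketchConfig(model='big-model', small_model='tiny-model')
chosen_small = pick_model(config, ModelSize.small, default='fallback-model')
chosen_medium = pick_model(config, ModelSize.medium, default='fallback-model')
chosen_empty = pick_model(SketchConfig(), ModelSize.small, default='fallback-model')
```

The model names here are placeholders; in practice each client supplies its own default when the config leaves a field unset.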

---

The LLMClient Abstract Base Class

LLMClient, defined at graphiti_core/llm_client/client.py:71-147, is the abstract base class for all provider implementations.

Constructor
LLMClient(config: LLMConfig | None, cache: bool = False)

If config is None, a default LLMConfig() is used graphiti_core/llm_client/client.py:73-74. When cache=True, an LLMCache instance pointing at ./llm_cache is created graphiti_core/llm_client/client.py:35, graphiti_core/llm_client/client.py:87-88.

generate_response: public interface

This is the only public entry point for callers. Its signature is:

async generate_response(
    messages: list[Message],
    response_model: type[BaseModel] | None = None,
    max_tokens: int | None = None,
    model_size: ModelSize = ModelSize.medium,
    group_id: str | None = None,
    prompt_name: str | None = None,
) -> dict[str, Any]

The base-class implementation performs the following steps in order graphiti_core/llm_client/client.py:155-247:

  1. If response_model is provided, append its JSON schema to the last message graphiti_core/llm_client/client.py:167-173.
  2. Append the multilingual extraction instruction (from get_extraction_language_instruction(group_id)) to the first message graphiti_core/llm_client/client.py:176.
  3. Call _clean_input on every message to strip invalid Unicode and control characters graphiti_core/llm_client/client.py:178-179.
  4. Open a tracing span (llm.generate) and set attributes including llm.provider, model.size, max_tokens, cache.enabled, and optionally prompt.name graphiti_core/llm_client/client.py:182-191.
  5. Check the cache; on a hit, return immediately graphiti_core/llm_client/client.py:194-197.
  6. Call _generate_response_with_retry, which wraps the abstract _generate_response in Tenacity retry logic graphiti_core/llm_client/client.py:202-212.
  7. If caching is enabled, store the result in the cache graphiti_core/llm_client/client.py:214-216.
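
The ordering of these steps can be sketched with a small stand-in class. Everything here is hypothetical (SketchClient, EchoClient, the dict-based message shape, the in-memory cache); the language instruction, tracing, and retry steps are omitted, and the cache key is the raw serialized payload for brevity.

```python
import asyncio
import json
from abc import ABC, abstractmethod
from typing import Any, Optional

Message = dict[str, str]  # assumed shape: {'role': ..., 'content': ...}


class SketchClient(ABC):
    """Hypothetical stand-in that mirrors the step order, not the library class."""

    def __init__(self, model: str, cache: bool = False) -> None:
        self.model = model
        self._cache: Optional[dict[str, dict[str, Any]]] = {} if cache else None

    @abstractmethod
    async def _generate_response(self, messages: list[Message]) -> dict[str, Any]:
        """Provider-specific call; subclasses implement this."""

    @staticmethod
    def _clean_input(text: str) -> str:
        # Drop control characters but keep newline, carriage return and tab.
        return ''.join(ch for ch in text if ch in '\n\r\t' or ord(ch) >= 32)

    async def generate_response(
        self, messages: list[Message], schema: Optional[dict[str, Any]] = None
    ) -> dict[str, Any]:
        messages = [dict(m) for m in messages]  # do not mutate the caller's list
        if schema is not None:
            # Step 1: append the JSON schema to the last message.
            messages[-1]['content'] += '\n\nRespond with JSON matching: ' + json.dumps(schema)
        for message in messages:
            # Step 3: clean every message.
            message['content'] = self._clean_input(message['content'])
        key = self.model + json.dumps(messages, sort_keys=True)
        if self._cache is not None and key in self._cache:
            return self._cache[key]  # Step 5: cache hit returns immediately.
        response = await self._generate_response(messages)  # Step 6, without retry.
        if self._cache is not None:
            self._cache[key] = response  # Step 7: store the result.
        return response


class EchoClient(SketchClient):
    """Fake provider that echoes the last message and counts its calls."""

    def __init__(self, model: str, cache: bool = False) -> None:
        super().__init__(model, cache)
        self.calls = 0

    async def _generate_response(self, messages: list[Message]) -> dict[str, Any]:
        self.calls += 1
        return {'echo': messages[-1]['content']}


client = EchoClient(model='demo-model', cache=True)
original = [{'role': 'user', 'content': 'hi\x00 there'}]
first = asyncio.run(client.generate_response(original, schema={'type': 'object'}))
second = asyncio.run(client.generate_response(original, schema={'type': 'object'}))
```

In this run the second call is served from the cache, so the fake provider is invoked only once.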

Abstract method: _generate_response
@abstractmethod
async def _generate_response(
    self,
    messages: list[Message],
    response_model: type[BaseModel] | None = None,
    max_tokens: int = DEFAULT_MAX_TOKENS,
    model_size: ModelSize = ModelSize.medium,
) -> dict[str, typing.Any]:
    pass

Sources: graphiti_core/llm_client/client.py:139-147

---

Concrete Implementations

Comparison of concrete client classes

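| Class | Upstream SDK | Default primary model | Structured output method |
| --- | --- | --- | --- |
| OpenAIClient | openai | gpt-4.1-mini | responses.parse (reasoning) / chat.completions (standard) |
| AzureOpenAILLMClient | openai (Azure) | _(set by the caller)_ | responses.parse (o1/o3/gpt-5) / beta.chat.completions.parse (standard) |
| OpenAIGenericClient | openai | gpt-4.1-mini | json_schema response format |
| AnthropicClient | anthropic | claude-haiku-4-5-latest | Tool use (_create_tool) |
| GeminiClient | google-genai | gemini-3-flash-preview | response_mime_type=application/json |
| GroqClient | groq | llama-3.1-70b-versatile | json_object response format |
| GLiNER2Client | gliner | gliner_medium-v2.1 | Local model inference |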

Sources: graphiti_core/llm_client/openai_client.py:27-125, graphiti_core/llm_client/azure_openai_client.py:31-167, graphiti_core/llm_client/openai_generic_client.py:37-214, graphiti_core/llm_client/anthropic_client.py:103-150, graphiti_core/llm_client/gemini_client.py:72-127, graphiti_core/llm_client/groq_client.py:48-85, graphiti_core/llm_client/gliner2_client.py:34-118

OpenAI family (BaseOpenAIClient, OpenAIClient, AzureOpenAILLMClient)

BaseOpenAIClient holds the logic shared across OpenAI-compatible APIs graphiti_core/llm_client/openai_base_client.py:40-58. It defines two abstract hooks: _create_structured_completion and _create_completion.

OpenAIClient detects reasoning models by prefix (gpt-5, o1, o3) graphiti_core/llm_client/openai_client.py:77-79. For these models it calls client.responses.parse graphiti_core/llm_client/openai_client.py:99; for standard models it calls client.chat.completions.create with response_format={'type': 'json_object'} graphiti_core/llm_client/openai_client.py:119-125.

AzureOpenAILLMClient routes requests to either responses.parse or beta.chat.completions.parse, depending on _supports_reasoning_features(model) graphiti_core/llm_client/azure_openai_client.py:74-104.
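
Prefix-based routing of this kind can be sketched as below. The function names are hypothetical and the sketch only returns the name of the endpoint that would be used; it makes no SDK call and is not the clients' actual code.

```python
# Prefixes named in the text as marking reasoning models.
REASONING_PREFIXES = ('gpt-5', 'o1', 'o3')


def is_reasoning_model(model: str) -> bool:
    """Return True when the model name starts with a reasoning-model prefix."""
    return model.startswith(REASONING_PREFIXES)


def pick_endpoint(model: str) -> str:
    """Name the endpoint an OpenAIClient-style router would choose (illustrative only)."""
    if is_reasoning_model(model):
        return 'responses.parse'
    return 'chat.completions.create'


endpoint_reasoning = pick_endpoint('o3-mini')
endpoint_standard = pick_endpoint('gpt-4.1-mini')
```

A plain prefix check is cheap, but it has to be updated whenever a new reasoning-model family appears.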

OpenAIGenericClient

Designed for local models (Ollama, LM Studio). It uses the json_schema response format graphiti_core/llm_client/openai_generic_client.py:115-121. The default max_tokens is 16,384 for compatibility graphiti_core/llm_client/openai_generic_client.py:75-76.
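
For orientation, a json_schema response format payload for an OpenAI-compatible chat endpoint has roughly the shape below. This is a sketch: the helper name and the example schema are hypothetical, and the exact payload OpenAIGenericClient builds may differ.

```python
from typing import Any


def json_schema_response_format(name: str, schema: dict[str, Any]) -> dict[str, Any]:
    """Wrap a JSON schema in the response_format structure of the chat completions API."""
    return {
        'type': 'json_schema',
        'json_schema': {'name': name, 'schema': schema},
    }


# Hand-written schema for illustration; with Pydantic v2 it could come
# from SomeModel.model_json_schema() instead.
person_schema = {
    'type': 'object',
    'properties': {'name': {'type': 'string'}},
    'required': ['name'],
}
response_format = json_schema_response_format('Person', person_schema)
```

Unlike json_object, which only asks for syntactically valid JSON, this format passes the expected structure to the server.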

AnthropicClient

Uses the tool-use API for structured output. _create_tool generates a tool definition from response_model graphiti_core/llm_client/anthropic_client.py:177-220. Model-specific token limits are handled through ANTHROPIC_MODEL_MAX_TOKENS graphiti_core/llm_client/anthropic_client.py:75-97.
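
A minimal sketch of the tool-use approach, assuming the tool definition follows the Anthropic Messages API shape (name, description, input_schema) and that a forced tool_choice is used. The helper names are hypothetical; the text fallback is one plausible reading of what a helper like _extract_json_from_text could do, not its actual implementation.

```python
import json
from typing import Any


def make_tool(
    name: str, description: str, schema: dict[str, Any]
) -> tuple[list[dict[str, Any]], dict[str, str]]:
    """Build a one-tool list plus a tool_choice that forces the model to call it."""
    tools = [{'name': name, 'description': description, 'input_schema': schema}]
    tool_choice = {'type': 'tool', 'name': name}
    return tools, tool_choice


def extract_json_from_text(text: str) -> dict[str, Any]:
    """Fallback: parse the outermost {...} span found in free-form text."""
    start, end = text.find('{'), text.rfind('}')
    if start == -1 or end <= start:
        raise ValueError('no JSON object found in text')
    return json.loads(text[start:end + 1])


tools, tool_choice = make_tool(
    'extract_entities',  # hypothetical tool name
    'Return the extracted entities as structured data.',
    {'type': 'object', 'properties': {'names': {'type': 'array'}}},
)
parsed = extract_json_from_text('Here is the result: {"names": ["Ada"]} Done.')
```

Forcing a single tool means the structured result arrives as the tool's input arguments rather than as free text, which avoids most parsing failures.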

GeminiClient

Integrates with google-genai. It handles safety filters via _check_safety_blocks graphiti_core/llm_client/gemini_client.py:128-152 and blocked prompts via _check_prompt_blocks graphiti_core/llm_client/gemini_client.py:154-162. It supports thinking_config for Gemini 2.5+ models graphiti_core/llm_client/gemini_client.py:109-110.

---

Cross-Cutting Behavior

Call flow through generate_response

sequenceDiagram
    participant "Caller" as caller
    participant "LLMClient.generate_response" as gr
    participant "LLMCache" as cache
    participant "Tracer" as tracer
    participant "_generate_response_with_retry" as retry
    participant "Provider API" as api

    caller->>gr: "generate_response(messages, response_model, ...)"
    gr->>gr: "Append JSON schema to the last message (if response_model is provided)"
    gr->>gr: "Append get_extraction_language_instruction() to messages[0]"
    gr->>gr: "Call _clean_input() on each message"
    gr->>tracer: "start_span('llm.generate')"
    gr->>cache: "get(cache_key)"
    alt "Cache hit"
        cache-->>gr: "cached dict"
        gr-->>caller: "cached dict"
    else "Cache miss"
        gr->>retry: "_generate_response_with_retry(messages, ...)"
        retry->>api: "_generate_response()"
        api-->>retry: "response dict"
        retry-->>gr: "response dict"
        gr->>cache: "set(cache_key, response)"
        gr-->>caller: "response dict"
    end
    gr->>tracer: "end span"

Sources: graphiti_core/llm_client/client.py:155-247

Retry logic

Clients use Tenacity for automatic retries. is_server_or_retry_error decides whether an exception (such as RateLimitError or a 5xx status code) should be retried graphiti_core/llm_client/client.py:62-69.

| Client | Strategy | Attempts |
| --- | --- | --- |
| LLMClient | Exponential backoff (5-120 s) | 4 graphiti_core/llm_client/client.py:117-118 |
| BaseOpenAIClient | Class constant | 2 graphiti_core/llm_client/openai_base_client.py:49 |
| AnthropicClient | SDK-internal | 1 graphiti_core/llm_client/anthropic_client.py:146 |
| GeminiClient | Class constant | 2 graphiti_core/llm_client/gemini_client.py:93 |

Sources: graphiti_core/llm_client/client.py:116-126, graphiti_core/llm_client/openai_base_client.py:49, graphiti_core/llm_client/anthropic_client.py:146, graphiti_core/llm_client/gemini_client.py:93
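
The retry behavior can be illustrated with a standard-library sketch. It does not use Tenacity; it hand-rolls a predicate-gated loop with randomized exponential backoff clamped to 5-120 seconds and 4 attempts, matching the figures in the table. TransientError, is_retryable, and call_with_retry are hypothetical names, and the sleep function is injectable so the demo runs instantly.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar('T')


class TransientError(Exception):
    """Hypothetical stand-in for a rate-limit or 5xx error."""


def is_retryable(exc: BaseException) -> bool:
    # Plays the role of a predicate such as is_server_or_retry_error.
    return isinstance(exc, TransientError)


async def call_with_retry(
    fn: Callable[[], Awaitable[T]],
    attempts: int = 4,
    min_wait: float = 5.0,
    max_wait: float = 120.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Retry fn on retryable errors, waiting a clamped random exponential delay."""
    for attempt in range(1, attempts + 1):
        try:
            return await fn()
        except Exception as exc:
            if not is_retryable(exc) or attempt == attempts:
                raise  # non-retryable, or out of attempts
            wait = min(max_wait, max(min_wait, random.uniform(0, 2 ** attempt)))
            await sleep(wait)
    raise RuntimeError('attempts must be at least 1')


recorded_waits: list[float] = []
call_count = {'n': 0}


async def fake_sleep(seconds: float) -> None:
    recorded_waits.append(seconds)  # record instead of actually sleeping


async def flaky() -> str:
    call_count['n'] += 1
    if call_count['n'] < 3:
        raise TransientError('temporarily unavailable')
    return 'ok'


result = asyncio.run(call_with_retry(flaky, sleep=fake_sleep))
```

Errors that fail the predicate propagate on the first attempt, so only transient failures pay the backoff cost.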

Token tracking

TokenUsageTracker graphiti_core/llm_client/token_tracker.py records usage per prompt. After receiving an API response, each concrete client records usage so that input and output tokens are tracked graphiti_core/llm_client/openai_base_client.py:127-130, graphiti_core/llm_client/anthropic_client.py:417-422.
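
A per-prompt usage tracker along these lines might look like the sketch below. SketchTokenTracker, its method names, and the 'unknown' bucket for unnamed prompts are assumptions for illustration, not the TokenUsageTracker API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0
    calls: int = 0


class SketchTokenTracker:
    """Hypothetical tracker that accumulates token counts keyed by prompt name."""

    def __init__(self) -> None:
        self._usage: dict[str, Usage] = defaultdict(Usage)

    def record(self, prompt_name: Optional[str], input_tokens: int, output_tokens: int) -> None:
        # Unnamed prompts share one bucket (an assumption of this sketch).
        entry = self._usage[prompt_name or 'unknown']
        entry.input_tokens += input_tokens
        entry.output_tokens += output_tokens
        entry.calls += 1

    def get(self, prompt_name: str) -> Usage:
        return self._usage[prompt_name]

    def total(self) -> Usage:
        return Usage(
            input_tokens=sum(u.input_tokens for u in self._usage.values()),
            output_tokens=sum(u.output_tokens for u in self._usage.values()),
            calls=sum(u.calls for u in self._usage.values()),
        )


tracker = SketchTokenTracker()
tracker.record('extract_nodes', input_tokens=100, output_tokens=20)  # hypothetical prompt name
tracker.record('extract_nodes', input_tokens=50, output_tokens=10)
tracker.record(None, input_tokens=5, output_tokens=1)
per_prompt = tracker.get('extract_nodes')
overall = tracker.total()
```

Keying by prompt name makes it possible to see which prompts dominate cost rather than only the grand total.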

Response caching

LLMCache graphiti_core/llm_client/cache.py stores responses under ./llm_cache graphiti_core/llm_client/client.py:35. The cache key is an MD5 hash of the model and the messages graphiti_core/llm_client/client.py:149-153.
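
One way to derive such a key is sketched below. The serialization details (JSON with sorted keys, model name appended) are assumptions; the actual _get_cache_key may serialize messages differently.

```python
import hashlib
import json


def cache_key(model: str, messages: list[dict[str, str]]) -> str:
    """Hash the serialized messages together with the model name."""
    # sort_keys makes the key independent of dict insertion order.
    payload = json.dumps(messages, sort_keys=True) + model
    return hashlib.md5(payload.encode('utf-8')).hexdigest()


messages = [{'role': 'user', 'content': 'Who founded Zep?'}]
reordered = [{'content': 'Who founded Zep?', 'role': 'user'}]
key_a = cache_key('model-a', messages)
key_a_reordered = cache_key('model-a', reordered)
key_b = cache_key('model-b', messages)
```

Including the model in the hash keeps responses from different models from colliding in the same cache directory. MD5 serves here only as a fast fingerprint, not as a security measure.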

---

Provider-to-Code Mapping

File and class location for each provider

graph TB
    subgraph "graphiti_core/llm_client/"
        A["client.py\nLLMClient (abstract base class)"]
        B["config.py\nLLMConfig, ModelSize"]
        C["openai_base_client.py\nBaseOpenAIClient"]
        D["openai_client.py\nOpenAIClient"]
        E["azure_openai_client.py\nAzureOpenAILLMClient"]
        F["openai_generic_client.py\nOpenAIGenericClient"]
        G["anthropic_client.py\nAnthropicClient"]
        H["gemini_client.py\nGeminiClient"]
        I["groq_client.py\nGroqClient"]
        J["gliner2_client.py\nGLiNER2Client"]
        K["token_tracker.py\nTokenUsageTracker"]
    end

    A --> B
    A --> K
    C --> A
    D --> C
    E --> C
    F --> A
    G --> A
    H --> A
    I --> A
    J --> A

Sources: graphiti_core/llm_client/client.py:1-147, graphiti_core/llm_client/openai_base_client.py:1-38, graphiti_core/llm_client/anthropic_client.py:1-44, graphiti_core/llm_client/gemini_client.py:1-43, graphiti_core/llm_client/groq_client.py:1-34, graphiti_core/llm_client/gliner2_client.py:1-32