4.2 · Graph-Based Search

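Memory pipeline and knowledge graph construction · This chapter is a standalone section page from the Chinese translation of the Cognee DeepWiki. It preserves the original links, source-code anchors, module tags, and section hierarchy.

Project: Cognee · Section: 4.2 · Status: full translation · Modules: retrieval, recall, and indexing; graphs and relationships; system architecture; model invocation and provider adaptation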
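Module tags
  • Retrieval, recall, and indexing
  • Graphs and relationships
  • System architecture
  • Model invocation and provider adaptation
  • Authentication, permissions, and security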

Chinese translation

Graph-Based Search (Chinese translation)

Original DeepWiki page: https://deepwiki.com/topoteretes/cognee/4.2-graph-based-search
Translation time: 2026-05-27T08:45:24.944Z
Translation model: deepseek-chat
Source character count: 16893
Project: Cognee (cognee)

---

Graph-Based Search

Relevant source files

The following files were used as context for generating this wiki page:

  • cognee/infrastructure/llm/prompts/extract_query_time.txt
  • cognee/modules/graph/cognee_graph/CogneeAbstractGraph.py
  • cognee/modules/graph/cognee_graph/CogneeGraph.py
  • cognee/modules/graph/cognee_graph/CogneeGraphElements.py
  • cognee/modules/graph/cognee_graph/__init__.py
  • cognee/modules/graph/exceptions/__init__.py
  • cognee/modules/graph/exceptions/exceptions.py
  • cognee/modules/retrieval/base_retriever.py
  • cognee/modules/retrieval/completion_retriever.py
  • cognee/modules/retrieval/graph_completion_context_extension_retriever.py
  • cognee/modules/retrieval/graph_completion_cot_retriever.py
  • cognee/modules/retrieval/graph_completion_retriever.py
  • cognee/modules/retrieval/graph_summary_completion_retriever.py
  • cognee/modules/retrieval/temporal_retriever.py
  • cognee/modules/retrieval/utils/brute_force_triplet_search.py
  • cognee/modules/retrieval/utils/completion.py
  • cognee/modules/retrieval/utils/node_edge_vector_search.py
  • cognee/modules/users/methods/get_authenticated_user.py
  • cognee/tests/test_search_db.py
  • cognee/tests/unit/modules/graph/cognee_graph_elements_test.py
  • cognee/tests/unit/modules/graph/cognee_graph_test.py
  • cognee/tests/unit/modules/retrieval/chunks_retriever_test.py
  • cognee/tests/unit/modules/retrieval/graph_completion_retriever_context_extension_test.py
  • cognee/tests/unit/modules/retrieval/graph_completion_retriever_cot_test.py
  • cognee/tests/unit/modules/retrieval/graph_completion_retriever_test.py
  • cognee/tests/unit/modules/retrieval/rag_completion_retriever_test.py
  • cognee/tests/unit/modules/retrieval/summaries_retriever_test.py
  • cognee/tests/unit/modules/retrieval/temporal_retriever_test.py
  • cognee/tests/unit/modules/retrieval/test_brute_force_triplet_search.py
  • cognee/tests/unit/modules/retrieval/test_node_edge_vector_search.py
  • cognee/tests/unit/modules/retrieval/triplet_retriever_test.py
  • cognee/tests/unit/modules/users/test_conditional_authentication.py

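Purpose and Scope

Graph-based search in Cognee uses the structure of the knowledge graph to retrieve information through entity relationships and connections, rather than through semantic similarity alone. The core implementation is the GraphCompletionRetriever class, which runs a triplet-based search and then uses a large language model (LLM) completion to generate an answer grounded in the graph structure.

This page covers the GraphCompletionRetriever class, the brute_force_triplet_search() algorithm, the in-memory graph structure (CogneeGraph), and the graph-aware LLM completion strategies, including Chain-of-Thought and Context Extension.

Sources: cognee/modules/retrieval/graph_completion_retriever.py:33-40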

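Graph-Based Search Architecture

Graph-based search in Cognee follows the retriever pattern and has three main stages: retrieving graph triplets, resolving them into a text context, and generating an LLM completion.

GraphCompletionRetriever pipeline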

graph TB
    Query["User query"]

    subgraph "GraphCompletionRetriever methods"
        GetRetrievedObjects["get_retrieved_objects(query)<br/>Returns: List[Edge]"]
        GetContextFromObjects["get_context_from_objects(query, retrieved_objects)<br/>Returns: str"]
        GetCompletionFromContext["get_completion_from_context(query, retrieved_objects, context)<br/>Returns: List[Any]"]
    end

    subgraph "Stage 1: Triplet retrieval"
        GetTriplets["get_triplets(query)<br/>Calls: brute_force_triplet_search()"]
        CheckEmpty["Check unified_engine.graph.is_empty()"]
        VectorSearch["Vector search collections:<br/>Entity_name, TextSummary_text<br/>EntityType_name, DocumentChunk_text<br/>EdgeType_relationship_name"]
        ProjectGraph["CogneeGraph.project_graph_from_db()"]
        RankTriplets["calculate_top_triplet_importances(k=top_k)<br/>score = n1_dist + n2_dist + edge_dist"]
    end

    subgraph "Stage 2: Context resolution"
        ResolveEdges["resolve_edges_to_text(retrieved_edges)<br/>Formats triplets as readable text"]
    end

    subgraph "Stage 3: LLM completion"
        CheckSession{"Session enabled?"}
        SessionCompletion["SessionManager.generate_completion_with_session()"]
        DirectCompletion["generate_completion(query, context)"]
        LLMResult["LLM-generated answer"]
    end

    Query --> GetRetrievedObjects
    GetRetrievedObjects --> CheckEmpty
    CheckEmpty -->|not empty| GetTriplets
    GetTriplets --> VectorSearch
    VectorSearch --> ProjectGraph
    ProjectGraph --> RankTriplets
    RankTriplets --> GetContextFromObjects

    GetContextFromObjects --> ResolveEdges
    ResolveEdges --> GetCompletionFromContext

    GetCompletionFromContext --> CheckSession
    CheckSession -->|yes + user_id| SessionCompletion
    CheckSession -->|no| DirectCompletion
    SessionCompletion --> LLMResult
    DirectCompletion --> LLMResult

Sources: cognee/modules/retrieval/graph_completion_retriever.py:101-136, cognee/modules/retrieval/utils/brute_force_triplet_search.py:119-150
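The three stages can be pictured as a small retriever skeleton. The sketch below is a toy, synchronous stand-in and not Cognee's code: `ToyGraphRetriever`, the tuple-based triplets, the word-matching retrieval, and the injected `llm` callable are all assumptions made for illustration. Only the method names are taken from the diagram.

```python
from typing import Callable, List, Tuple

Triplet = Tuple[str, str, str]  # (source node, relationship, target node)


class ToyGraphRetriever:
    """Hypothetical stand-in that mirrors the three stage names only."""

    def __init__(self, triplets: List[Triplet], llm: Callable[[str], str]):
        self.triplets = triplets
        self.llm = llm

    def get_retrieved_objects(self, query: str) -> List[Triplet]:
        # Stage 1: keep triplets that share a word with the query (toy matching).
        words = {w.lower() for w in query.split()}
        return [t for t in self.triplets if words & {p.lower() for p in t}]

    def get_context_from_objects(self, query: str, retrieved: List[Triplet]) -> str:
        # Stage 2: turn triplets into readable lines.
        return "\n".join(f"{s} --[{r}]--> {o}" for s, r, o in retrieved)

    def get_completion_from_context(
        self, query: str, retrieved: List[Triplet], context: str
    ) -> List[str]:
        # Stage 3: hand question and context to the injected model.
        return [self.llm(f"Context:\n{context}\n\nQuestion: {query}")]

    def get_completion(self, query: str) -> List[str]:
        retrieved = self.get_retrieved_objects(query)
        context = self.get_context_from_objects(query, retrieved)
        return self.get_completion_from_context(query, retrieved, context)


retriever = ToyGraphRetriever(
    triplets=[("Alice", "works_at", "Acme"), ("Bob", "lives_in", "Paris")],
    llm=lambda prompt: prompt.upper(),  # placeholder for a real model call
)
answer = retriever.get_completion("Where does Alice work")
```

Passing the model in as a callable keeps the sketch runnable without any provider; in the real retriever that role is played by the completion utilities described later on this page.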

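GraphCompletionRetriever Class

The GraphCompletionRetriever class implements the base functionality of the graph-based search retrievers. It inherits from BaseRetriever and provides a three-stage retrieval pipeline.

GraphCompletionRetriever class structure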

classDiagram
    class BaseRetriever {
        <<abstract>>
        +get_retrieved_objects(query)
        +get_context_from_objects(query, retrieved_objects)
        +get_completion_from_context(query, retrieved_objects, context)
        +get_completion(query)
    }

    class GraphCompletionRetriever {
        -str user_prompt_path
        -str system_prompt_path
        -Optional~str~ system_prompt
        -int top_k
        -Optional~int~ wide_search_top_k
        -Optional~Type~ node_type
        -Optional~List~str~~ node_name
        -float triplet_distance_penalty
        -Optional~str~ session_id
        -Type response_model
        -Optional~int~ neighborhood_depth
        -Optional~int~ neighborhood_seed_top_k
        +get_retrieved_objects(query) List~Edge~
        +get_triplets(query) List~Edge~
        +get_context_from_objects(query, retrieved_objects) str
        +resolve_edges_to_text(retrieved_edges) str
        +get_completion_from_context(query, retrieved_objects, context) List~Any~
        +get_completion(query) List~Any~
        -_use_session_cache() bool
        -_get_vector_index_collections() List~str~
    }

    class GraphCompletionCotRetriever {
        -str validation_system_prompt_path
        -str validation_user_prompt_path
        -str followup_system_prompt_path
        -str followup_user_prompt_path
        -int max_iter
        +get_retrieved_objects(query) List~Edge~
        -_run_cot_completion(query_batch, conversation_history) tuple
    }

    class GraphCompletionContextExtensionRetriever {
        -int context_extension_rounds
        +get_retrieved_objects(query) List~Edge~
        -_run_extension_round(states) void
    }

    BaseRetriever <|-- GraphCompletionRetriever
    GraphCompletionRetriever <|-- GraphCompletionCotRetriever
    GraphCompletionRetriever <|-- GraphCompletionContextExtensionRetriever

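Sources: cognee/modules/retrieval/graph_completion_retriever.py:33-80, cognee/modules/retrieval/graph_completion_cot_retriever.py:36-95, cognee/modules/retrieval/graph_completion_context_extension_retriever.py:14-54

Initialization parameters

| Parameter | Type | Default | Purpose |
| --- | --- | --- | --- |
| user_prompt_path | str | "graph_context_for_question.txt" | User prompt template that carries the context |
| system_prompt_path | str | "answer_simple_question.txt" | System prompt template |
| top_k | int | 5 | Number of top triplets to retrieve |
| wide_search_top_k | int | 100 | Limit on vector search results |
| triplet_distance_penalty | float | 6.5 | Default distance for unmapped elements |
| neighborhood_depth | Optional[int] | None | Depth of the neighborhood-based graph projection |
| neighborhood_seed_top_k | Optional[int] | 10 | Maximum number of seed nodes for neighborhood extraction |

Sources: cognee/modules/retrieval/graph_completion_retriever.py:42-79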

Core Methods

get_retrieved_objects(query) → List[Edge]

Runs the triplet search and returns a list of Edge objects. The method:

  1. Validates the input with validate_retriever_input (cognee/modules/retrieval/graph_completion_retriever.py:116)
  2. Checks whether the graph is empty via self._unified_engine.graph.is_empty() (cognee/modules/retrieval/graph_completion_retriever.py:119)
  3. Calls get_triplets(query), which retrieves triplets through brute_force_triplet_search() (cognee/modules/retrieval/graph_completion_retriever.py:125)

Sources: cognee/modules/retrieval/graph_completion_retriever.py:101-136
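The control flow of these three steps might look roughly like the following. This is a minimal sketch under stated assumptions: `get_retrieved_objects_sketch` and its injected callables are hypothetical, the validation rule is a simple stand-in for validate_retriever_input, and returning an empty list for an empty graph is an assumption that the text does not confirm.

```python
import asyncio
from typing import Awaitable, Callable, List


async def get_retrieved_objects_sketch(
    query: str,
    graph_is_empty: Callable[[], Awaitable[bool]],
    get_triplets: Callable[[str], Awaitable[List[str]]],
) -> List[str]:
    # Step 1: reject unusable input (stand-in for validate_retriever_input).
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    # Step 2: an empty graph cannot yield triplets, so skip the search (assumed).
    if await graph_is_empty():
        return []
    # Step 3: delegate to the triplet search.
    return await get_triplets(query)


async def _demo():
    async def empty() -> bool:
        return True

    async def not_empty() -> bool:
        return False

    async def search(q: str) -> List[str]:
        return [f"triplet for {q}"]

    return (
        await get_retrieved_objects_sketch("q", empty, search),
        await get_retrieved_objects_sketch("q", not_empty, search),
    )


empty_result, found = asyncio.run(_demo())
```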

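Triplet Search Algorithm

The brute_force_triplet_search() function implements the core algorithm of graph-based search. It combines vector search with analysis of the graph structure.

Triplet search algorithm steps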

graph TB
    Start["brute_force_triplet_search(query, top_k)"]

    subgraph "Step 1: Vector search"
        EmbedQuery["embed_text([query])"]
        SearchCollections["Search collections in parallel:<br/>- Entity_name<br/>- TextSummary_text<br/>- EdgeType_relationship_name"]
    end

    subgraph "Step 2: Extract relevant IDs"
        ExtractIDs["NodeEdgeVectorSearch.extract_relevant_node_ids()"]
    end

    subgraph "Step 3: Graph projection"
        CheckNeighborhood{"neighborhood_depth<br/>provided?"}
        ProjectNeighborhood["project_neighborhood_from_db()<br/>Extracts the k-hop neighborhood around the seeds"]
        ProjectGraph["project_graph_from_db()<br/>Full or ID-filtered projection"]
    end

    subgraph "Step 4: Ranking"
        MapDist["map_vector_distances_to_graph_nodes/edges()"]
        CalcImportance["calculate_top_triplet_importances(k=top_k)<br/>score = n1_dist + n2_dist + edge_dist"]
    end

    Start --> SearchCollections
    SearchCollections --> ExtractIDs
    ExtractIDs --> CheckNeighborhood
    CheckNeighborhood -->|yes| ProjectNeighborhood
    CheckNeighborhood -->|no| ProjectGraph
    ProjectNeighborhood --> MapDist
    ProjectGraph --> MapDist
    MapDist --> CalcImportance

Sources: cognee/modules/retrieval/utils/brute_force_triplet_search.py:119-150
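Steps 1 and 2 of the diagram can be illustrated with a toy that searches several collections concurrently and then gathers node IDs. Everything here is assumed for illustration: `FAKE_INDEX` is hard-coded data standing in for a vector store, and `search_collection`, `search_all`, and `extract_node_ids_sketch` are hypothetical helpers rather than Cognee functions. Only the collection names come from the diagram.

```python
import asyncio
from typing import Dict, List, Tuple

Hit = Tuple[str, float]  # (element id, vector distance)

# Hypothetical, hard-coded search results standing in for a vector store.
FAKE_INDEX: Dict[str, List[Hit]] = {
    "Entity_name": [("n2", 0.40), ("n1", 0.12)],
    "TextSummary_text": [("n3", 0.25), ("n1", 0.50)],
    "EdgeType_relationship_name": [("works_at", 0.30)],
}


async def search_collection(name: str, limit: int) -> List[Hit]:
    await asyncio.sleep(0)  # yield control, as a real I/O-bound search would
    return sorted(FAKE_INDEX[name], key=lambda hit: hit[1])[:limit]


async def search_all(collections: List[str], limit: int) -> Dict[str, List[Hit]]:
    # Step 1: query every collection concurrently.
    results = await asyncio.gather(*(search_collection(c, limit) for c in collections))
    return dict(zip(collections, results))


def extract_node_ids_sketch(results: Dict[str, List[Hit]]) -> List[str]:
    # Step 2: collect unique node ids, closest first; edge-type hits are not nodes.
    best: Dict[str, float] = {}
    for name, hits in results.items():
        if name == "EdgeType_relationship_name":
            continue
        for element_id, distance in hits:
            best[element_id] = min(distance, best.get(element_id, distance))
    return sorted(best, key=best.get)


results = asyncio.run(search_all(list(FAKE_INDEX), limit=100))
node_ids = extract_node_ids_sketch(results)
```

The `limit=100` argument mirrors the documented wide_search_top_k default. The deduplication rule (keep the smallest distance per ID) is a design choice of this sketch.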

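Neighborhood-Based Projection

If neighborhood_depth is specified, the algorithm takes a subset of the vector search results as "seed nodes" and extracts their local graph neighborhoods: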

if neighborhood_depth is not None and relevant_ids_to_filter:
    seed_ids = relevant_ids_to_filter[:neighborhood_seed_top_k]
    await memory_fragment.project_neighborhood_from_db(
        graph_engine,
        seed_node_ids=seed_ids,
        depth=neighborhood_depth,
        triplet_distance_penalty=triplet_distance_penalty,
    )

Sources: cognee/modules/retrieval/utils/brute_force_triplet_search.py:81-92
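To make "k-hop neighborhood around the seeds" concrete, here is a small breadth-first sketch over an in-memory edge list. It is not how project_neighborhood_from_db() is implemented (that call queries the graph database); `k_hop_neighborhood` is a hypothetical helper, and treating edges as undirected is an assumption.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def k_hop_neighborhood(
    edges: Iterable[Tuple[str, str]], seeds: List[str], depth: int
) -> Set[str]:
    # Build an adjacency map; edges are treated as undirected (assumption).
    adjacency: Dict[str, Set[str]] = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    visited: Set[str] = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, dist = frontier.popleft()
        if dist == depth:
            continue  # do not expand beyond the requested depth
        for neighbor in adjacency.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return visited


edges = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")]
ranked_ids = ["a", "x", "d"]   # vector-search hits, best first
seed_ids = ranked_ids[:1]       # same slicing idea as neighborhood_seed_top_k
neighborhood = k_hop_neighborhood(edges, seed_ids, depth=2)
```

With one seed and depth 2, only nodes within two hops of "a" survive, which is the point of the projection: the ranking stage then works on a local subgraph instead of the full graph.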

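Triplet Importance Scoring

Triplets are ranked by combining the vector distances of the two connected nodes and of the edge itself.

Scoring formula

For each edge e connecting nodes n1 and n2: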

triplet_score = n1.attributes['vector_distance']
              + n2.attributes['vector_distance']
              + e.attributes['vector_distance']

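A lower score indicates higher relevance. Nodes and edges that have no vector distance receive the default penalty score (default: 6.5).

Sources: cognee/modules/retrieval/utils/brute_force_triplet_search.py:119-150, cognee/modules/graph/cognee_graph/CogneeGraph.py:83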
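A worked example may help show how the penalty shapes the ranking. The sketch below applies the formula to plain dictionaries; `triplet_score`, `top_triplets`, and the sample distances are hypothetical, and `heapq.nsmallest` is simply one convenient way to pick the k lowest scores, not necessarily what calculate_top_triplet_importances does internally.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Triplet = Tuple[str, str, str]  # (n1 id, relationship, n2 id)
PENALTY = 6.5  # default penalty quoted in the text


def triplet_score(
    n1_dist: Optional[float],
    n2_dist: Optional[float],
    edge_dist: Optional[float],
    penalty: float = PENALTY,
) -> float:
    # Missing distances (None) are replaced by the penalty before summing.
    return sum(penalty if d is None else d for d in (n1_dist, n2_dist, edge_dist))


def top_triplets(
    triplets: List[Triplet],
    node_dist: Dict[str, float],
    edge_dist: Dict[Triplet, float],
    k: int,
) -> List[Triplet]:
    def score(t: Triplet) -> float:
        n1, _, n2 = t
        return triplet_score(node_dist.get(n1), node_dist.get(n2), edge_dist.get(t))

    # Lower score means more relevant, so take the k smallest.
    return heapq.nsmallest(k, triplets, key=score)


triplets = [
    ("alice", "works_at", "acme"),
    ("bob", "lives_in", "paris"),
    ("alice", "knows", "bob"),
]
node_dist = {"alice": 0.1, "acme": 0.2}
edge_dist = {("alice", "works_at", "acme"): 0.3}
best = top_triplets(triplets, node_dist, edge_dist, k=2)
```

The fully matched triplet scores about 0.6, the one with only "alice" matched scores 0.1 + 6.5 + 6.5 = 13.1, and the unmatched one scores 19.5, so a single unmatched element is enough to push a triplet far down the list.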

Graph Database Implementation

CogneeGraph

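The CogneeGraph class serves as the in-memory representation of graph data retrieved from the database. It manages Node and Edge objects and provides methods for projecting subgraphs.

CogneeGraph features

  • Distance management: reset_distances() resets the vector distances used by the search (cognee/modules/graph/cognee_graph/CogneeGraph.py:80)
  • Edge indexing: edges are organized by distance key in edges_by_distance_key for efficient lookup (cognee/modules/graph/cognee_graph/CogneeGraph.py:36)
  • Database projection: _get_full_or_id_filtered_graph() and _get_filtered_graph() support full, ID-filtered, or attribute-filtered projection (cognee/modules/graph/cognee_graph/CogneeGraph.py:117-160)

Sources: cognee/modules/graph/cognee_graph/CogneeGraph.py:18-172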
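As a rough mental model of such an in-memory graph, consider the toy below. `ToyNode`, `ToyEdge`, and `ToyGraph` are hypothetical and far simpler than the real classes. In particular, having reset_distances() set every element back to the penalty value is an assumption about what "reset" means, and edge indexing is left out.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyNode:
    id: str
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ToyEdge:
    node1: ToyNode
    node2: ToyNode
    attributes: Dict[str, Any] = field(default_factory=dict)


class ToyGraph:
    def __init__(self, penalty: float = 6.5):
        self.penalty = penalty
        self.nodes: Dict[str, ToyNode] = {}
        self.edges: List[ToyEdge] = []

    def add_edge(self, source_id: str, target_id: str, relationship: str) -> ToyEdge:
        # Create endpoints on demand so each id maps to exactly one node object.
        n1 = self.nodes.setdefault(source_id, ToyNode(source_id))
        n2 = self.nodes.setdefault(target_id, ToyNode(target_id))
        edge = ToyEdge(n1, n2, {"relationship_name": relationship})
        self.edges.append(edge)
        return edge

    def reset_distances(self) -> None:
        # Assumption: "reset" means every element falls back to the penalty.
        for element in [*self.nodes.values(), *self.edges]:
            element.attributes["vector_distance"] = self.penalty


graph = ToyGraph()
edge = graph.add_edge("alice", "acme", "works_at")
graph.add_edge("alice", "bob", "knows")
graph.nodes["alice"].attributes["vector_distance"] = 0.1  # left over from one search
graph.reset_distances()  # cleared before the next search
```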

Retriever Variants

GraphCompletionCotRetriever

Implements Chain-of-Thought (CoT) reasoning. It validates the initial answer and iteratively generates follow-up questions to retrieve missing context (cognee/modules/retrieval/graph_completion_cot_retriever.py:170-175). It manages query state through QueryState to track convergence (cognee/modules/retrieval/graph_completion_cot_retriever.py:166).

Sources: cognee/modules/retrieval/graph_completion_cot_retriever.py:36-176
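The validate-then-follow-up loop can be sketched in a few lines. This is a generic illustration, not the retriever's actual logic: `cot_loop` and the four injected callables are hypothetical, the lookup table replaces both the graph search and the model, and the iteration cap of 4 is an arbitrary stand-in for max_iter, whose default the text does not state.

```python
from typing import Callable, List, Tuple


def cot_loop(
    question: str,
    retrieve: Callable[[str], List[str]],
    answer: Callable[[str, List[str]], str],
    validate: Callable[[str, str, List[str]], bool],
    follow_up: Callable[[str, str], str],
    max_iter: int = 4,  # hypothetical cap
) -> Tuple[str, List[str]]:
    context = list(retrieve(question))
    completion = answer(question, context)
    for _ in range(max_iter):
        if validate(question, completion, context):
            break  # the answer is judged sufficient
        # Ask a follow-up question and merge any newly found context.
        for item in retrieve(follow_up(question, completion)):
            if item not in context:
                context.append(item)
        completion = answer(question, context)
    return completion, context


facts = {
    "founder of Acme": ["Acme founded_by Alice"],
    "Alice birthplace": ["Alice born_in Paris"],
}
completion, context = cot_loop(
    question="founder of Acme",
    retrieve=lambda q: facts.get(q, []),
    answer=lambda q, ctx: "Paris" if "Alice born_in Paris" in ctx else "unknown",
    validate=lambda q, a, ctx: a != "unknown",
    follow_up=lambda q, a: "Alice birthplace",
)
```

Here the first answer fails validation, one follow-up query fetches the missing fact, and the second answer passes, so the loop stops after a single extra round.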

GraphCompletionContextExtensionRetriever

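Iteratively extends the context by using the generated completion as a new search query to find related triplets, until it converges or reaches the round limit (context_extension_rounds) (cognee/modules/retrieval/graph_completion_context_extension_retriever.py:92-96).

Sources: cognee/modules/retrieval/graph_completion_context_extension_retriever.py:14-131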
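One way to read "until it converges or reaches the round limit" is the loop below. It is a sketch under assumptions: `extend_context` is hypothetical, convergence is taken to mean "a round found no new triplets", and the substring-based `retrieve` and the concatenating `complete` are toys that replace the graph search and the model.

```python
from typing import Callable, Dict, List, Tuple


def extend_context(
    query: str,
    retrieve: Callable[[str], List[str]],
    complete: Callable[[str, List[str]], str],
    rounds: int = 4,  # hypothetical stand-in for context_extension_rounds
) -> Tuple[str, List[str]]:
    context = list(retrieve(query))
    for _ in range(rounds):
        completion = complete(query, context)
        # Use the completion itself as the next search query.
        new_items = [t for t in retrieve(completion) if t not in context]
        if not new_items:
            break  # converged: nothing new was found
        context.extend(new_items)
    return complete(query, context), context


toy_graph: Dict[str, List[str]] = {
    "Alice": ["Alice works_at Acme"],
    "Acme": ["Acme based_in Berlin"],
    "Berlin": [],
}


def retrieve(text: str) -> List[str]:
    # Return the triplets of every entity whose name appears in the text.
    return [t for name, ts in toy_graph.items() if name in text for t in ts]


def complete(query: str, context: List[str]) -> str:
    return " ".join(context) or query


final_answer, final_context = extend_context("Alice", retrieve, complete)
```

Starting from "Alice", the first completion mentions Acme, which pulls in the Berlin triplet on the next search; the round after that finds nothing new and the loop ends.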

TemporalRetriever

Handles queries with temporal constraints. It uses the LLM, via extract_time_from_query(), to extract a time interval from the query (cognee/modules/retrieval/temporal_retriever.py:84). It then uses collect_time_ids() to filter graph events by those timestamps (cognee/modules/retrieval/temporal_retriever.py:126).

Sources: cognee/modules/retrieval/temporal_retriever.py:19-154
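The filtering half of this flow is easy to illustrate once an interval is available. In the sketch below the interval is hard-coded where the real retriever would ask the LLM; `collect_ids_in_interval` and the sample events are hypothetical, and treating a missing bound as open-ended is an assumption.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Event = Tuple[str, datetime]  # (event id, timestamp)


def collect_ids_in_interval(
    events: List[Event], start: Optional[datetime], end: Optional[datetime]
) -> List[str]:
    # A bound of None means "no limit on that side" (assumption).
    return [
        event_id
        for event_id, ts in events
        if (start is None or ts >= start) and (end is None or ts <= end)
    ]


events = [
    ("launch", datetime(2021, 3, 1)),
    ("funding", datetime(2022, 6, 15)),
    ("ipo", datetime(2024, 1, 10)),
]

# Interval a model might extract from "What happened in 2022?" (hard-coded here).
start, end = datetime(2022, 1, 1), datetime(2022, 12, 31, 23, 59, 59)
ids_2022 = collect_ids_in_interval(events, start, end)
ids_since_2022 = collect_ids_in_interval(events, start, None)
```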

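LLM Completion Integration

The generate_completion function in cognee/modules/retrieval/utils/completion.py is the main interface for generating answers from graph context.

Data flow in completion

  1. Prompt rendering: render_prompt injects question and context into the template specified by user_prompt_path (cognee/modules/retrieval/utils/completion.py:20)
  2. System prompt: the system prompt is read from system_prompt_path, or the provided system_prompt is used (cognee/modules/retrieval/utils/completion.py:21)
  3. Conversation history: if provided, the history is prepended to the task description (cognee/modules/retrieval/utils/completion.py:24)
  4. Structured output: LLMGateway.acreate_structured_output is called to obtain the final result, optionally conforming to a Pydantic response_model (cognee/modules/retrieval/utils/completion.py:30-34)

Sources: cognee/modules/retrieval/utils/completion.py:9-38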
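The four steps can be approximated without any model provider. The sketch below is illustrative only: `build_prompts` and `generate_completion_sketch` are hypothetical, `string.Template` stands in for the real prompt renderer, the wording used to prepend the history is invented, and a plain callable replaces the structured-output call.

```python
from string import Template
from typing import Callable, Optional, Tuple


def build_prompts(
    question: str,
    context: str,
    user_template: str,
    system_prompt: str,
    conversation_history: Optional[str] = None,
) -> Tuple[str, str]:
    # Step 1: inject question and context into the user template.
    user_prompt = Template(user_template).substitute(question=question, context=context)
    # Steps 2-3: start from the system prompt and prepend history if present
    # (the exact wording of this prefix is an assumption).
    if conversation_history:
        system_prompt = (
            f"Previous conversation:\n{conversation_history}\n\nTASK:\n{system_prompt}"
        )
    return system_prompt, user_prompt


def generate_completion_sketch(llm: Callable[[str, str], str], **prompt_args) -> str:
    # Step 4: call the model; a real implementation could also parse the
    # reply into a response model.
    system_prompt, user_prompt = build_prompts(**prompt_args)
    return llm(system_prompt, user_prompt)


prompt_args = dict(
    question="Where does Alice work?",
    context="Alice --[works_at]--> Acme",
    user_template="Question: $question\nContext: $context",
    system_prompt="Answer using only the context.",
    conversation_history="user: hi",
)
system, user = build_prompts(**prompt_args)
reply = generate_completion_sketch(lambda s, u: f"{len(s)}:{len(u)}", **prompt_args)
```

Keeping prompt construction separate from the model call makes the rendering testable on its own, which is the main reason the sketch splits the two functions.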