9 · Retrieval and Search System

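Complex document understanding and citation retrieval · This page is a standalone chapter of the Chinese translation of the RAGFlow DeepWiki. It preserves the original links, source-code anchors, module tags, and chapter hierarchy.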

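Project: RAGFlow · Chapter: 9 · Status: full translation · Modules: Retrieval, Recall, and Indexing; Document Objects and Metadata; System Architecture; Workflow and Orchestration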


Source references

api/db/__init__.py
api/db/db_models.py
api/db/services/dialog_service.py
api/db/services/document_service.py
api/db/services/file_service.py
api/db/services/knowledgebase_service.py
api/db/services/llm_service.py
api/db/services/task_service.py
api/db/services/user_service.py
common/query_base.py

Module tags

Retrieval, Recall, and Indexing
Document Objects and Metadata
System Architecture
Workflow and Orchestration
Memory and Context

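Chinese Translation

Retrieval and Search System (Chinese Translation)

Original DeepWiki page: https://deepwiki.com/infiniflow/ragflow/9-retrieval-and-search-system

Translated at: 2026-05-27T08:44:48.707Z

Translation model: deepseek-chat

Source character count: 14439

Project: RAGFlow (ragflow)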

---

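Retrieval and Search System

Purpose and Scope

This document gives an overview of RAGFlow's retrieval and search system, which combines BM25 text search with vector similarity search to retrieve the document chunks relevant to a conversational AI response. The system supports hybrid search strategies, reranking, metadata filtering, and automatic citation insertion.

For details on the document storage layer, see Document Store Abstraction Layer. For the query processing algorithms, see Query Processing and Hybrid Search. For reranking model integration, see Reranking and Filtering. For context formatting and large language model (LLM) response generation, see Response Generation and Citations.

---

System Architecture

The retrieval system is the bridge between user queries and the document store. It coordinates embedding generation, hybrid search, reranking, and citation insertion.

Retrieval Orchestration Diagram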

graph TB
    subgraph "Entry Points"
        AsyncChat["DialogService.async_chat()<br/>api/db/services/dialog_service.py"]
        RetrievalTest["retrieval_test()<br/>api/apps/chunk_app.py"]
    end

    subgraph "Retrieval Orchestration"
        GetModels["get_model_config_by_type_and_name()<br/>api/db/joint_services/tenant_model_service.py"]
        Retriever["settings.retriever<br/>Dealer instance<br/>common/settings.py"]
    end

    subgraph "Search_Pipeline_rag_nlp_search_py"
        DealerClass["Dealer.search()<br/>rag/nlp/search.py:132"]
        GetVector["get_vector()<br/>rag/nlp/search.py:53"]
        HybridSearch["Hybrid search<br/>BM25 + vector<br/>rag/nlp/search.py:115-148"]
        Rerank["Dealer.rerank()<br/>rag/nlp/search.py:296"]
        InsertCitations["insert_citations()<br/>rag/nlp/search.py:177"]
    end

    subgraph "Document Store"
        DocStoreBase["DocStoreConnection<br/>common/doc_store/doc_store_base.py"]
        ESConn["ESConnection<br/>rag/utils/es_conn.py"]
        InfinityConn["InfinityConnection<br/>rag/utils/infinity_conn.py"]
    end

    subgraph "Models"
        LLMBundle["LLMBundle<br/>api/db/services/llm_service.py:85"]
        EmbedModel["Embedding model"]
        RerankModel["Reranking model"]
    end

    AsyncChat --> GetModels
    RetrievalTest --> Retriever

    GetModels --> LLMBundle
    LLMBundle --> EmbedModel
    LLMBundle --> RerankModel

    AsyncChat --> Retriever
    Retriever --> DealerClass

    DealerClass --> GetVector
    GetVector --> LLMBundle

    DealerClass --> HybridSearch
    HybridSearch --> DocStoreBase
    DocStoreBase --> ESConn
    DocStoreBase --> InfinityConn

    DealerClass --> Rerank
    Rerank --> LLMBundle

    AsyncChat --> InsertCitations
    InsertCitations --> LLMBundle

Sources:

api/db/services/dialog_service.py:98-150
rag/nlp/search.py:37-41
rag/nlp/search.py:132-172
api/db/services/llm_service.py:85-163

---

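Core Components

Dealer Class

The Dealer class rag/nlp/search.py:37-41 is the central orchestrator of retrieval operations. It encapsulates the following:

Query processing with FulltextQueryer rag/nlp/query.py:28-40, used for BM25 tokenization and expansion.
Document store connection management through self.dataStore rag/nlp/search.py:40.
Query embedding generation through get_vector() rag/nlp/search.py:53-61.
Hybrid search coordination using FusionExpr rag/nlp/search.py:25.
Citation insertion logic through insert_citations() rag/nlp/search.py:177-267.

The system instantiates a single Dealer instance as settings.retriever common/settings.py, and that instance is used throughout the application.

Document Store Abstraction Layer

RAGFlow supports several document store backends through the unified interface defined by DocStoreConnection common/doc_store/doc_store_base.py:

Storage engine	Implementation class	Engine flag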
Elasticsearch	`ESConnection` `rag/utils/es_conn.py`	`DOC_ENGINE=elasticsearch`
Infinity	`InfinityConnection` `rag/utils/infinity_conn.py`	`DOC_ENGINE=infinity`
OpenSearch	`OpenSearchConnection` `rag/utils/opensearch_conn.py`	`DOC_ENGINE=opensearch`

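Each implementation provides a search() method rag/nlp/search.py:132-172 that accepts the following parameters:

The source fields to retrieve (src) rag/nlp/search.py:149-153.
The highlight fields used for keyword emphasis rag/nlp/search.py:168-172.
Filter conditions (kb_id, doc_id, and so on) built by get_filters() rag/nlp/search.py:120-130.
Match expressions (text, dense vector, fusion) rag/nlp/search.py:173-180.
The sort specification given through OrderByExpr rag/nlp/search.py:142.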
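
To make the parameter list above concrete, here is a minimal sketch of a store interface with a single search() method and a toy in-memory backend. The names SearchRequest, DocStore, and InMemoryStore are hypothetical and do not reproduce RAGFlow's DocStoreConnection API. Match expressions and highlighting are left out, so only filtering, sorting, and field projection are shown.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SearchRequest:
    """Hypothetical bundle of the search() arguments described above."""
    select_fields: list        # source fields to return
    filters: dict              # e.g. {"kb_id": ["kb1"]}
    order_by: str = ""         # field to sort on; empty means no sorting
    descending: bool = True
    limit: int = 10


class DocStore(ABC):
    """Hypothetical stand-in for a unified document store interface."""

    @abstractmethod
    def search(self, req):
        ...


class InMemoryStore(DocStore):
    """Toy backend that filters, sorts, and projects plain dicts."""

    def __init__(self, rows):
        self.rows = list(rows)

    @staticmethod
    def _matches(row, filters):
        # A filter value may be a single value or a list of allowed values.
        for key, wanted in filters.items():
            allowed = wanted if isinstance(wanted, list) else [wanted]
            if row.get(key) not in allowed:
                return False
        return True

    def search(self, req):
        hits = [r for r in self.rows if self._matches(r, req.filters)]
        if req.order_by:
            hits.sort(key=lambda r: r.get(req.order_by, 0), reverse=req.descending)
        return [{f: r.get(f) for f in req.select_fields} for r in hits[: req.limit]]


store = InMemoryStore([
    {"id": "c1", "kb_id": "kb1", "content": "alpha", "score": 0.2},
    {"id": "c2", "kb_id": "kb2", "content": "beta", "score": 0.9},
    {"id": "c3", "kb_id": "kb1", "content": "gamma", "score": 0.7},
])
request = SearchRequest(select_fields=["id", "content"],
                        filters={"kb_id": ["kb1"]}, order_by="score")
results = store.search(request)
print(results)
```

Keeping the backend behind one abstract method is what would let an Elasticsearch, Infinity, or OpenSearch connection be swapped in without changing the caller.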

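LLMBundle for Models

The LLMBundle class api/db/services/llm_service.py:85 provides a unified interface to the following:

Embedding models: encode() api/db/services/llm_service.py:95 and encode_queries() api/db/services/llm_service.py:120 generate vector representations.
Reranking models: similarity() api/db/services/llm_service.py:136 computes relevance scores between a query and a set of texts.
Chat models: used for response generation and query refinement.

Sources: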

rag/nlp/search.py:37-61
api/db/services/llm_service.py:85-163
rag/nlp/query.py:28-40

---

Retrieval Flow in Conversational AI

The following sequence diagram shows how retrieval is integrated into the conversational AI pipeline:

sequenceDiagram
    participant User
    participant async_chat
    participant get_models
    participant Dealer
    participant EmbedModel as Embedding Model
    participant DocStore as Document Store
    participant RerankModel as Reranking Model
    participant insert_citations

    User->>async_chat: Question + dialog configuration
    async_chat->>get_models: Load model configuration
    get_models-->>async_chat: embd_mdl, rerank_mdl, chat_mdl

    alt kb_ids present
        async_chat->>async_chat: Refine the query if multi-turn (full_question)
        async_chat->>async_chat: apply_meta_data_filter

        async_chat->>Dealer: search(req, idx_names, kb_ids, emb_mdl)

        Dealer->>EmbedModel: encode_queries(question)
        EmbedModel-->>Dealer: query_vector

        Dealer->>Dealer: Build MatchTextExpr (BM25)
        Dealer->>Dealer: Build MatchDenseExpr (vector)
        Dealer->>Dealer: Build FusionExpr (weighted_sum)

        Dealer->>DocStore: search(filters, matchExprs, orderBy, limit)
        DocStore-->>Dealer: SearchResult (ids, fields, highlights, scores)

        opt Reranking enabled
            Dealer->>Dealer: rerank(sres, query, rerank_mdl)
            Dealer->>RerankModel: similarity(query, chunks)
            RerankModel-->>Dealer: Rerank scores
        end

        Dealer-->>async_chat: sres (ids, field, highlight)

        async_chat->>async_chat: kb_prompt() - format chunks
        async_chat->>async_chat: Generate LLM response (streaming)

        opt Citations enabled
            async_chat->>insert_citations: answer, chunks, embd_mdl
            insert_citations->>EmbedModel: encode(answer_sentences)
            insert_citations->>insert_citations: hybrid_similarity(ans_vec, chunk_vec)
            insert_citations-->>async_chat: Answer with [ID:X] markers
        end
    end

    async_chat-->>User: Final answer with citations

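Key steps:

Model loading: get_model_config_by_type_and_name api/db/joint_services/tenant_model_service.py:78 loads the embedding, chat, and optional reranking models for the dialog's knowledge bases.
Query refinement: if multi-turn conversation is enabled, an LLM produces a standalone question through full_question() api/db/services/dialog_service.py:51.
Metadata filtering: if document metadata filters are configured, they are applied through apply_meta_data_filter api/db/services/dialog_service.py:38.
Hybrid search: BM25 text search and vector similarity search are combined by weighted fusion rag/nlp/search.py:115-148.
Reranking: results are optionally reranked with a dedicated reranking model or with similarity-based scores rag/nlp/search.py:296-387.
Citation insertion: if enabled, [ID:X] markers are injected into the answer by matching answer sentences against the source chunks rag/nlp/search.py:177-267.

Sources: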

api/db/services/dialog_service.py:16-100
rag/nlp/search.py:132-172
rag/nlp/search.py:177-267

---

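Hybrid Search Strategy

RAGFlow implements a hybrid search approach that combines lexical (BM25) and semantic (vector) retrieval:

Match Expressions

The search pipeline builds match expressions rag/nlp/search.py:115-130:

MatchTextExpr (BM25):

- Generated by FulltextQueryer.question() rag/nlp/query.py:42-181.
- Searches the tokenized fields defined by FulltextQueryer: title_tks, important_kwd, content_ltks, and others rag/nlp/query.py:32-40.
- Uses field-specific weights (for example, title_tks^10 and important_kwd^30).

MatchDenseExpr (vector):

- Uses the query embedding from get_vector() rag/nlp/search.py:53-61.
- Matches against the q_{dim}_vec column rag/nlp/search.py:60.
- Uses the cosine similarity metric with a configurable threshold rag/nlp/search.py:61.

FusionExpr (combination):

- Method: weighted_sum rag/nlp/search.py:25.
- Combines results from the lexical and semantic indexes.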
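
The idea behind weighted-sum fusion can be sketched in a few lines. This is an illustration only: in RAGFlow the fusion is carried out by the document store engine, and the function name weighted_sum_fusion, the weights 0.3 and 0.7, and the assumption that both score sets are already normalized to a comparable range are all hypothetical choices made for this example.

```python
def weighted_sum_fusion(text_scores, vector_scores, text_weight, vector_weight):
    """Combine two {chunk_id: score} maps into one ranked list.

    A chunk that is missing from one map contributes 0 for that signal.
    """
    fused = {}
    for chunk_id in set(text_scores) | set(vector_scores):
        fused[chunk_id] = (text_weight * text_scores.get(chunk_id, 0.0)
                           + vector_weight * vector_scores.get(chunk_id, 0.0))
    # Highest fused score first; ties are broken by chunk id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Illustrative scores, assumed to be normalized to [0, 1] beforehand.
bm25_scores = {"a": 1.0, "b": 0.5}
dense_scores = {"b": 1.0, "c": 0.5}
ranked = weighted_sum_fusion(bm25_scores, dense_scores, text_weight=0.3, vector_weight=0.7)
print([chunk_id for chunk_id, _ in ranked])
```

Chunk "b" is found by both signals and therefore outranks chunks that only one index returns, which is the main benefit of fusing lexical and semantic results.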

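Fallback Strategy

If the initial search returns no results, the system may retry with broader parameters. Dealer also includes a _prune_deleted_chunks mechanism rag/nlp/search.py:76-118 that checks DocumentService.get_by_ids() rag/nlp/search.py:72 to ensure that the retrieved chunks correspond to documents that still exist in the database.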
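
The two behaviors described above, retrying with looser parameters and dropping chunks whose documents no longer exist, might look roughly like the sketch below. The helper names search_with_fallback and prune_deleted_chunks, the threshold values, and the in-memory index are assumptions for illustration. The source does not say which parameters RAGFlow relaxes on retry.

```python
def search_with_fallback(search_fn, thresholds):
    """Call search_fn(threshold) with progressively looser thresholds until it returns hits."""
    for threshold in thresholds:
        hits = search_fn(threshold)
        if hits:
            return hits, threshold
    return [], None


def prune_deleted_chunks(chunks, existing_doc_ids):
    """Keep only chunks whose parent document is still present in the database."""
    existing = set(existing_doc_ids)
    return [c for c in chunks if c["doc_id"] in existing]


# Toy index standing in for the document store.
index = [
    {"id": "c1", "doc_id": "d1", "sim": 0.40},
    {"id": "c2", "doc_id": "d2", "sim": 0.25},
]


def toy_search(threshold):
    return [c for c in index if c["sim"] >= threshold]


hits, used = search_with_fallback(toy_search, [0.5, 0.2])
live = prune_deleted_chunks(hits, existing_doc_ids={"d1"})  # d2 was deleted
print(used, [c["id"] for c in live])
```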

Sources:

rag/nlp/search.py:132-172
rag/nlp/query.py:32-40
rag/nlp/search.py:53-61

---

Reranking and Scoring

Reranking Method

The Dealer.rerank() method rag/nlp/search.py:296-387 refines search results using several scoring mechanisms:

graph LR
    SearchResult["SearchResult<br/>BM25 + vector scores"] --> ExtractVectors["Extract chunk vectors<br/>q_{dim}_vec field"]
    SearchResult --> ExtractTokens["Tokenize content<br/>content_ltks field"]

    ExtractVectors --> HybridSim["hybrid_similarity()<br/>Token + vector similarity"]
    ExtractTokens --> HybridSim
    QueryVector["Query vector"] --> HybridSim
    QueryTokens["Query tokens"] --> HybridSim

    HybridSim --> CombineScores["Combine:<br/>tkweight * token_sim<br/>+ vtweight * vector_sim"]

    SearchResult --> RankFeature["_rank_feature_scores()<br/>Tag similarity + PageRank"]
    QueryTags["Query tags"] --> RankFeature

    CombineScores --> FinalScore["Final score:<br/>hybrid_sim * 100<br/>+ rank_feature"]
    RankFeature --> FinalScore

    FinalScore --> Resorted["Re-sorted results"]

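Scoring components:

Hybrid similarity rag/nlp/query.py:183-187:

- Token overlap: token_similarity computes the overlap between query tokens and chunk tokens.
- Vector cosine: the cosine similarity between the query embedding and the chunk embedding.
- Formula: vtweight * vector_sim + tkweight * token_sim rag/nlp/query.py:183-187.

Rank features rag/nlp/search.py:269-294:

- Tag similarity: the cosine similarity between query tags and chunk tags rag/nlp/search.py:286.
- PageRank: a document importance score taken from PAGERANK_FLD rag/nlp/search.py:290.

Final score rag/nlp/search.py:373-387:

- Base score: hybrid_similarity * 100.
- Boost: + rank_feature_score.
- Results are re-sorted by final score in descending order.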
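
The scoring formula above can be worked through with a small example. The sketch below is not RAGFlow's implementation: the token similarity here is a plain set-overlap ratio, the weights tkweight=0.3 and vtweight=0.7 are illustrative defaults, and the rank_feature values are made up. Only the shape of the formula (weighted hybrid similarity times 100, plus a rank-feature boost) follows the text.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def token_similarity(query_tokens, chunk_tokens):
    """Fraction of distinct query tokens that also occur in the chunk (simplified)."""
    query = set(query_tokens)
    return len(query & set(chunk_tokens)) / len(query) if query else 0.0


def hybrid_similarity(query_vec, chunk_vec, query_tokens, chunk_tokens,
                      tkweight=0.3, vtweight=0.7):
    return (vtweight * cosine(query_vec, chunk_vec)
            + tkweight * token_similarity(query_tokens, chunk_tokens))


def rerank(query_vec, query_tokens, chunks):
    """Return (final_score, chunk_id) pairs, best first."""
    scored = []
    for chunk in chunks:
        sim = hybrid_similarity(query_vec, chunk["vec"], query_tokens, chunk["tokens"])
        scored.append((sim * 100 + chunk.get("rank_feature", 0.0), chunk["id"]))
    return sorted(scored, reverse=True)


chunks = [
    {"id": "c1", "vec": [1.0, 0.0], "tokens": ["hybrid"], "rank_feature": 0.0},
    {"id": "c2", "vec": [0.0, 1.0], "tokens": ["hybrid", "search"], "rank_feature": 5.0},
]
ranking = rerank([1.0, 0.0], ["hybrid", "search"], chunks)
print(ranking)
```

With these inputs c1 scores about 85 (cosine 1.0, token overlap 0.5) and c2 about 35 (cosine 0.0, token overlap 1.0, plus a boost of 5), so the vector signal dominates under these weights.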

Sources:

rag/nlp/search.py:296-387
rag/nlp/query.py:183-187
rag/nlp/search.py:269-294

---

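Citation Insertion

The insert_citations() method rag/nlp/search.py:177-267 adds citation markers to a generated answer by matching answer sentences against the retrieved chunks.

Algorithm

Sentence splitting: the answer is split into sentences with a regular expression that handles a range of punctuation marks and avoids splitting inside code blocks rag/nlp/search.py:182-210.
Embedding: each sentence is embedded with the embedding model rag/nlp/search.py:236.
Similarity computation: for each sentence, hybrid_similarity is computed against all retrieved chunks rag/nlp/search.py:241.
Threshold matching: a citation is added if the similarity exceeds a threshold that starts at 0.63 and decays to 0.3 rag/nlp/search.py:228, 246.
Marker insertion: markers such as [ID:X] are inserted after qualifying sentences rag/nlp/search.py:252-266.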
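
The steps above can be sketched as follows. To stay self-contained, this example swaps the embedding-based hybrid_similarity for a simple Jaccard token overlap, uses a much simpler sentence splitter than the one described, and assumes a multiplicative decay factor of 0.8. None of these choices come from the RAGFlow source. Only the 0.63 starting threshold, the 0.3 floor, and the [ID:X] marker format are taken from the text.

```python
import re


def split_sentences(answer):
    """Split after ., ! or ? followed by whitespace (a simplified splitter)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]


def jaccard(a, b):
    """Token-overlap similarity, standing in for the embedding-based score."""
    tokens_a = set(re.findall(r"\w+", a.lower()))
    tokens_b = set(re.findall(r"\w+", b.lower()))
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def insert_citations(answer, chunks, similarity=jaccard,
                     start=0.63, floor=0.3, decay=0.8):
    sentences = split_sentences(answer)
    sims = [[similarity(s, c) for c in chunks] for s in sentences]
    cited = {}  # sentence index -> index of the best-matching chunk
    threshold = start
    # Loosen the threshold step by step until some sentence qualifies
    # or the threshold reaches the floor.
    while threshold > floor:
        for i, row in enumerate(sims):
            if row and max(row) >= threshold:
                cited[i] = row.index(max(row))
        if cited:
            break
        threshold *= decay
    parts = [f"{s} [ID:{cited[i]}]" if i in cited else s
             for i, s in enumerate(sentences)]
    return " ".join(parts)


print(insert_citations(
    "RAGFlow uses hybrid search. The weather is nice.",
    ["RAGFlow uses hybrid search", "Citations are inserted after generation"],
))
```

Starting with a strict threshold and relaxing it only when nothing matches keeps citations precise for well-grounded answers while still attributing loosely paraphrased ones.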

Sources:

rag/nlp/search.py:177-267

---

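API Integration

Metadata Filtering

Metadata filtering is handled while the search request is built. The get_filters method rag/nlp/search.py:120-130 maps request parameters to document store filters, including kb_ids, doc_ids, and specific metadata fields such as knowledge_graph_kwd, entity_kwd, or available_int. More advanced filtering logic based on boolean conditions is applied through apply_meta_data_filter api/db/services/dialog_service.py:38.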
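
A minimal sketch of this kind of request-to-filter mapping is shown below. It is a guess at the general shape, not a copy of RAGFlow's get_filters: the renaming of kb_ids to kb_id and doc_ids to doc_id follows the field names mentioned earlier in this chapter, while the rule that empty or missing values are skipped is an assumption.

```python
def get_filters(req):
    """Map request parameters to store-level filter conditions (illustrative only)."""
    filters = {}
    # Plural request keys are mapped to the singular field names used in the store.
    for req_key, field in (("kb_ids", "kb_id"), ("doc_ids", "doc_id")):
        if req.get(req_key):
            filters[field] = req[req_key]
    # Metadata fields are passed through unchanged when present.
    for key in ("knowledge_graph_kwd", "entity_kwd", "available_int"):
        if req.get(key) is not None:
            filters[key] = req[key]
    return filters


request = {"question": "what is hybrid search?", "kb_ids": ["kb1"],
           "doc_ids": [], "available_int": 1}
print(get_filters(request))
```

Unrelated keys such as question are ignored, so the same request dictionary can carry both the query text and the filter inputs.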

Sources:

rag/nlp/search.py:120-130
api/db/services/dialog_service.py:38