arxiv_cs_ai February 10, 2026

A Technique for Carrying Out RAG Retrieval Progressively

Progressive Searching for Retrieval in RAG

Translated: 2026/3/7 12:27:55

rags, retrieval-augmented-generation, lang-mods, embedding-vectors, algorithms

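Japanese Translation (rendered in English)

Retrieval Augmented Generation (RAG) remains a promising way to mitigate two important limitations of large language models (LLMs): outdated information and hallucinations. A RAG system stores documents in a database as embedding vectors. When a query is presented, the system searches for the most relevant documents and places the top matches in the LLM's prompt to generate a response. Because relevant information has to be found in reasonable time and with adequate accuracy, RAG needs a search technique that is both efficient and accurate. The authors developed a cost-effective search algorithm for the retrieval process. Their progressive search algorithm starts from low-dimensional embedding vectors and then moves toward the higher target dimensionality, narrowing the candidate set at each step. This multi-stage approach reduces search time while maintaining the desired accuracy. The findings indicate that progressive search in RAG systems strikes a balance among dimensionality, speed, and accuracy, making scalable, high-performance retrieval possible even for large databases.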

Original Content

arXiv:2602.07297v1 Announce Type: cross Abstract: Retrieval Augmented Generation (RAG) is a promising technique for mitigating two key limitations of large language models (LLMs): outdated information and hallucinations. A RAG system stores documents as embedding vectors in a database. Given a query, a search is executed to find the most closely related documents. The top-matching documents are then inserted into the LLM's prompt to generate a response. Efficient and accurate searching is critical for RAG to retrieve relevant information. We propose a cost-effective searching algorithm for the retrieval process. Our progressive searching algorithm incrementally refines the candidate set through a hierarchy of searches, starting from low-dimensional embeddings and progressing to a higher target dimensionality. This multi-stage approach reduces retrieval time while preserving the desired accuracy. Our findings demonstrate that progressive search in RAG systems achieves a balance between dimensionality, speed, and accuracy, enabling scalable and high-performance retrieval even for large databases.
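
The multi-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the low-dimensional embeddings are produced or how many candidates survive each stage, so everything concrete here is an assumption made for illustration: low-dimensional views are taken as prefixes of the full vector, similarity is cosine, and the names `cosine_prefix`, `progressive_search`, and the `stages` schedule are hypothetical. It is not the authors' implementation.

```python
import heapq
import math
import random


def cosine_prefix(a, b, dim):
    """Cosine similarity over the first `dim` coordinates only.

    Assumption: a low-dimensional embedding is a prefix of the full one.
    """
    dot = sum(x * y for x, y in zip(a[:dim], b[:dim]))
    norm_a = math.sqrt(sum(x * x for x in a[:dim]))
    norm_b = math.sqrt(sum(y * y for y in b[:dim]))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def progressive_search(query, docs, stages):
    """Return document indices, best first, after a coarse-to-fine search.

    `stages` is a hypothetical schedule of (dim, keep) pairs with
    increasing `dim`. Each stage rescores only the survivors of the
    previous stage and keeps the `keep` best; the last `keep` is the
    number of results returned.
    """
    candidates = range(len(docs))
    for dim, keep in stages:
        candidates = heapq.nlargest(
            keep,
            candidates,
            key=lambda j: cosine_prefix(query, docs[j], dim),
        )
    return list(candidates)


# Toy data: 500 random 64-dimensional "embeddings"; the query is a copy
# of document 42, so that document should rank first.
random.seed(0)
docs = [[random.gauss(0.0, 1.0) for _ in range(64)] for _ in range(500)]
query = list(docs[42])

# Cheap 8-dim pass over everything, 32-dim pass over 100 survivors,
# full 64-dim pass over the last 20.
stages = [(8, 100), (32, 20), (64, 5)]
top = progressive_search(query, docs, stages)
print(top)
```

In this sketch the expensive full-dimensional comparison runs on only 20 of the 500 documents; the price is that a relevant document can be dropped early if its low-dimensional score is misleading, which is the dimensionality, speed, and accuracy trade-off the abstract refers to. Larger `keep` values make the result approach a brute-force search at higher cost.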