arxiv_cs_ai 2026/2/10

HypRAG: Hyperbolic Dense Retrieval for Retrieval Augmented Generation

Translated: 2026/3/7 12:40:28

hyperbolic, transformer, representation-pooling, hierarchical-structure, retrieval-augmented-generation

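Summary

Natural language has a hierarchical structure that runs from broad topics down to specific entities, and Euclidean embeddings fail to preserve it, so semantically distant documents can appear spuriously similar. To address this, we apply the Lorentz model of hyperbolic space and develop two model variants, HyTE-FH and HyTE-H. HyTE-FH is a fully hyperbolic transformer, while HyTE-H is a hybrid architecture that projects pre-trained Euclidean embeddings into hyperbolic space. To prevent representational collapse during sequence aggregation, we introduce the Outward Einstein Midpoint, a geometry-aware pooling operator that preserves hierarchical structure. On MTEB, HyTE-FH outperforms equivalent Euclidean baselines. On RAGBench, HyTE-H achieves gains of up to 29% over Euclidean baselines in context relevance and answer relevance, using substantially smaller models than current state-of-the-art retrievers. Our analysis further shows that hyperbolic representations encode document specificity through norm-based separation, with a radial increase of over 20% from general to specific concepts, a property absent in Euclidean embeddings. This underscores the role of geometric inductive bias in faithful retrieval-augmented generation.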

Original Content

arXiv:2602.07739v1 Announce Type: cross Abstract: Embedding geometry plays a fundamental role in retrieval quality, yet dense retrievers for retrieval-augmented generation (RAG) remain largely confined to Euclidean space. However, natural language exhibits hierarchical structure from broad topics to specific entities that Euclidean embeddings fail to preserve, causing semantically distant documents to appear spuriously similar and increasing hallucination risk. To address these limitations, we introduce hyperbolic dense retrieval, developing two model variants in the Lorentz model of hyperbolic space: HyTE-FH, a fully hyperbolic transformer, and HyTE-H, a hybrid architecture projecting pre-trained Euclidean embeddings into hyperbolic space. To prevent representational collapse during sequence aggregation, we introduce the Outward Einstein Midpoint, a geometry-aware pooling operator that provably preserves hierarchical structure. On MTEB, HyTE-FH outperforms equivalent Euclidean baselines, while on RAGBench, HyTE-H achieves up to 29% gains over Euclidean baselines in context relevance and answer relevance using substantially smaller models than current state-of-the-art retrievers. Our analysis also reveals that hyperbolic representations encode document specificity through norm-based separation, with over 20% radial increase from general to specific concepts, a property absent in Euclidean embeddings, underscoring the critical role of geometric inductive bias in faithful RAG systems.
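The abstract names the Lorentz model, a projection of Euclidean embeddings into hyperbolic space, and midpoint-based pooling without giving formulas. As a rough illustration only, the sketch below shows the textbook versions of these operations in NumPy, assuming curvature -1. The function names are hypothetical, the exponential map at the origin is one common choice of projection rather than the one confirmed for HyTE-H, and the pooling shown is the plain Einstein midpoint, not the paper's Outward Einstein Midpoint, whose definition is not given here.

```python
import numpy as np

def lorentz_inner(x, y):
    """Lorentzian inner product <x, y>_L = -x0*y0 + sum_i xi*yi over the last axis."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)

def exp_map_origin(v, eps=1e-9):
    """Map Euclidean vectors v (shape [..., d]) onto the unit hyperboloid (shape [..., d+1]).

    Assumption: curvature -1 and the exponential map at the origin as the projection.
    """
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    safe = np.maximum(norm, eps)          # avoid division by zero for v = 0
    time = np.cosh(norm)                  # time-like coordinate x0
    space = np.sinh(norm) * v / safe      # space-like coordinates x1..xd
    return np.concatenate([time, space], axis=-1)

def lorentz_distance(x, y):
    """Geodesic distance on the hyperboloid: arccosh(-<x, y>_L)."""
    # Clipping guards against arguments slightly below 1 caused by rounding.
    return np.arccosh(np.clip(-lorentz_inner(x, y), 1.0, None))

def einstein_midpoint(points):
    """Plain (unweighted) Einstein midpoint of points with shape [n, d+1].

    In Lorentz coordinates this is the sum of the points rescaled back onto
    the hyperboloid. This is NOT the Outward variant from the paper.
    """
    s = points.sum(axis=0)
    return s / np.sqrt(-lorentz_inner(s, s))

# Toy token embeddings (made-up values), pooled into one document vector.
tokens = np.array([[0.1, 0.0], [0.0, 0.3], [0.5, 0.5]])
points = exp_map_origin(tokens)
pooled = einstein_midpoint(points)

# Distance from the origin acts as a radial coordinate in this model.
origin = np.array([1.0, 0.0, 0.0])
radius = lorentz_distance(origin, pooled)
```

In this model the distance of a point from the origin serves as a radial coordinate, which is the kind of quantity the norm-based separation result refers to. How the paper measures its radial increase is not specified in the abstract.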