arxiv_cs_ai April 24, 2026

Knowledge Capsules: Structured Nonparametric Memory Units for LLMs

Translated: 2026/4/24 20:36:06

knowledge-capsules, llm, rag, memory-augmentation, nonparametric-memory

Japanese Translation

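Paper: arXiv:2604.20487v2 Announce Type: replace-cross Abstract: Large language models (LLMs) encode knowledge in parametric weights, which makes updating or extending that knowledge without retraining costly. Retrieval-augmented generation (RAG) mitigates this limitation by appending retrieved text to the input, but it works only through context expansion: external knowledge merely competes as tokens within the attention mechanism, so its influence is indirect and unstable, especially in long-context and multi-hop reasoning scenarios. We propose Knowledge Capsules, structured nonparametric memory units that represent normalized relational knowledge and can be constructed directly from document corpora using a frozen base model. Instead of injecting knowledge as text, we introduce an External Key Value Injection (KVI) framework that compiles capsules into attention-compatible key-value representations, so that external knowledge participates directly in the model's attention computation. By shifting knowledge integration from context-level augmentation to memory-level interaction, the proposed framework consistently outperforms RAG and GraphRAG across multiple QA benchmarks, with improved stability and accuracy in long-context and multi-hop reasoning, while requiring no parameter updates.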

Original Content

arXiv:2604.20487v2 Announce Type: replace-cross Abstract: Large language models (LLMs) encode knowledge in parametric weights, making it costly to update or extend without retraining. Retrieval-augmented generation (RAG) mitigates this limitation by appending retrieved text to the input, but operates purely through context expansion, where external knowledge competes as tokens within the attention mechanism. As a result, its influence is indirect and often unstable, particularly in long-context and multi-hop reasoning scenarios. We propose Knowledge Capsules, structured nonparametric memory units that represent normalized relational knowledge and can be constructed directly from document corpora using a frozen base model. Instead of injecting knowledge as text, we introduce an External Key Value Injection (KVI) framework that compiles capsules into attention-compatible key-value representations, enabling external knowledge to directly participate in the model's attention computation. By shifting knowledge integration from context-level augmentation to memory-level interaction, the proposed framework consistently outperforms RAG and GraphRAG across multiple QA benchmarks, with improved stability and accuracy in long-context and multi-hop reasoning, while requiring no parameter updates.
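
The abstract does not say how capsules are compiled or where in the model the injected pairs enter, so the following is only a toy sketch of the general idea of letting external key-value pairs take part in attention. It assumes single-head scaled dot-product attention and assumes that injection means concatenating external pairs with the context's own pairs before the softmax. The function names (`attend`, `attend_with_injection`) and all vectors are hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights

def attend_with_injection(query, ctx_keys, ctx_values, ext_keys, ext_values):
    """Assumed injection scheme (not confirmed by the source): external
    key-value pairs are placed ahead of the context pairs, so both sets
    share a single softmax."""
    return attend(query, ext_keys + ctx_keys, ext_values + ctx_values)

# Toy 2-dimensional example; every number here is made up for illustration.
query = [1.0, 0.0]
ctx_keys = [[0.0, 1.0], [0.0, -1.0]]
ctx_values = [[1.0, 0.0], [0.0, 1.0]]
ext_keys = [[2.0, 0.0]]    # hypothetical compiled capsule key
ext_values = [[5.0, 5.0]]  # hypothetical compiled capsule value

baseline, _ = attend(query, ctx_keys, ctx_values)
output, weights = attend_with_injection(
    query, ctx_keys, ctx_values, ext_keys, ext_values
)
```

In this toy setup the external key aligns with the query, so it receives the largest attention weight and pulls the output toward the external value without any change to model weights. That is the contrast the abstract draws with RAG, where the same knowledge would have to be tokenized and passed through the whole input pipeline first.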