arxiv_cs_lg April 24, 2026

Training-free retrieval-augmented generation with reinforced reasoning for flood damage nowcasting

typescript, machine-learning, retrieval-augmented-generation, flood-damage-nowcasting, reasoning-centric

Original Content

arXiv:2602.10312v2 | Announce Type: replace

Abstract: We propose R2RAG-Flood, a training-free retrieval-augmented generation framework for flood damage nowcasting with reinforced reasoning. The framework builds a reasoning-centric knowledge base from labeled tabular records, where each sample includes structured predictors, a compact text-mode summary, and a model-generated reasoning trajectory. During inference, the target prompt is augmented with geographically local neighbors and selected few-shot examples to support case-based reasoning without task-specific fine-tuning. A two-stage procedure first determines damage occurrence and then refines severity within a three-level Property Damage Extent (PDE) classification, followed by a conservative downgrade check for weakly supported over-severe outputs. In a Hurricane Harvey case study in Harris County, Texas, the supervised tabular baseline achieves 0.714 overall accuracy and 0.859 accuracy on the damaged classes (medium and high PDE). Across seven LLM backbones, R2RAG-Flood achieves 0.613–0.668 overall accuracy and 0.757–0.896 accuracy on the damaged classes while providing a structured rationale for each prediction. Under the severity-per-cost metric used in this study, lighter R2RAG-Flood variants are more cost-efficient than the supervised baseline and larger LLM backbones. These results demonstrate the feasibility of a reasoning-centric, training-free pipeline for flood damage nowcasting in a realistic case-study setting.
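
The inference flow described in the abstract (retrieve geographically local neighbors, augment the prompt, decide damage occurrence, refine severity, then apply a conservative downgrade check) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the record fields, the PDE labels "low", "medium", and "high", the question wording, the haversine-based neighbor search, and the neighbor-count downgrade rule are all hypothetical, since the abstract does not specify them. `ask_llm` stands in for any function that sends a prompt to a language model and returns its text reply.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import Callable, Dict, List

# Assumed question wording; the abstract does not give the actual prompts.
STAGE1_Q = "Is this property damaged? Answer yes or no."
STAGE2_Q = "Is the severity medium or high? Answer with one word."


@dataclass
class KBRecord:
    """One knowledge-base sample (field names are hypothetical)."""
    lat: float
    lon: float
    predictors: Dict[str, float]  # structured predictors
    summary: str                  # compact text summary
    reasoning: str                # model-generated reasoning trajectory
    pde: str                      # assumed labels: "low", "medium", "high"


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres (Earth radius taken as 6371 km)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def local_neighbors(kb: List[KBRecord], lat: float, lon: float, k: int = 3) -> List[KBRecord]:
    """Return the k knowledge-base records closest to the target location."""
    return sorted(kb, key=lambda r: haversine_km(lat, lon, r.lat, r.lon))[:k]


def build_prompt(target_summary: str, neighbors: List[KBRecord], shots: List[KBRecord]) -> str:
    """Augment the target description with selected examples and local neighbors."""
    lines = ["Reference cases:"]
    for r in list(shots) + list(neighbors):
        lines.append(f"- {r.summary} | reasoning: {r.reasoning} | PDE: {r.pde}")
    lines.append(f"Target: {target_summary}")
    return "\n".join(lines)


def two_stage_predict(
    ask_llm: Callable[[str], str],
    prompt: str,
    neighbors: List[KBRecord],
    min_support: int = 1,
) -> str:
    """Stage 1: damage occurrence. Stage 2: severity. Then a downgrade check."""
    answer = ask_llm(f"{prompt}\n{STAGE1_Q}").strip().lower()
    if answer != "yes":
        return "low"
    level = ask_llm(f"{prompt}\n{STAGE2_Q}").strip().lower()
    if level not in ("medium", "high"):
        level = "medium"  # fall back to the milder damaged class on unparseable replies
    # Assumed downgrade rule: keep "high" only if enough retrieved neighbors are "high".
    if level == "high":
        support = sum(1 for r in neighbors if r.pde == "high")
        if support < min_support:
            level = "medium"
    return level
```

Because `ask_llm` is passed in as a plain callable, the control flow can be exercised with a stub before any model is attached, and no task-specific training step appears anywhere in the sketch.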