arxiv_cs_cv February 10, 2026

RAP: 3D Rasterization Augmented End-to-End Planning

Translated: 2026/3/15 14:04:45

Tags: rasterization, end-to-end-planning, robotics, reinforcement-learning, data-augmentation

Original Content

arXiv:2510.04333v2 (Announce Type: replace)

Abstract: Imitation learning for end-to-end driving trains policies only on expert demonstrations. Once deployed in a closed loop, such policies lack recovery data: small mistakes cannot be corrected and quickly compound into failures. A promising direction is to generate alternative viewpoints and trajectories beyond the logged path. Prior work explores photorealistic digital twins via neural rendering or game engines, but these methods are prohibitively slow and costly, and thus mainly used for evaluation. In this work, we argue that photorealism is unnecessary for training end-to-end planners. What matters is semantic fidelity and scalability: driving depends on geometry and dynamics, not textures or lighting. Motivated by this, we propose 3D Rasterization, which replaces costly rendering with lightweight rasterization of annotated primitives, enabling augmentations such as counterfactual recovery maneuvers and cross-agent view synthesis. To transfer these synthetic views effectively to real-world deployment, we introduce a Raster-to-Real feature-space alignment that bridges the sim-to-real gap. Together, these components form Rasterization Augmented Planning (RAP), a scalable data augmentation pipeline for planning. RAP achieves state-of-the-art closed-loop robustness and long-tail generalization, ranking first on four major benchmarks: NAVSIM v1/v2, Waymo Open Dataset Vision-based E2E Driving, and Bench2Drive. Our results show that lightweight rasterization with feature alignment suffices to scale E2E training, offering a practical alternative to photorealistic rendering. Project page: https://alan-lanfeng.github.io/RAP/.
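
To make the idea of rasterizing annotated primitives from a perturbed viewpoint concrete, the following is a minimal sketch and not the authors' implementation. It assumes a pinhole camera, a world frame with x forward, y left, and z up, and boxes given as (center, size, yaw, label). Each box is drawn as the bounding rectangle of its projected corners, which is a deliberate simplification of true face rasterization. All function names, parameters, and numeric values (render, focal, cam_height, the 1 m lateral offset) are hypothetical.

```python
import math

def box_corners(cx, cy, cz, length, width, height, yaw):
    """Return the 8 world-frame corners of a yaw-rotated 3D box."""
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-length / 2, length / 2):
        for dy in (-width / 2, width / 2):
            for dz in (-height / 2, height / 2):
                corners.append((cx + c * dx - s * dy,
                                cy + s * dx + c * dy,
                                cz + dz))
    return corners

def world_to_camera(point, ego_pose, cam_height):
    """World (x fwd, y left, z up) -> camera (x right, y down, z fwd)."""
    ego_x, ego_y, ego_yaw = ego_pose
    dx, dy = point[0] - ego_x, point[1] - ego_y
    c, s = math.cos(ego_yaw), math.sin(ego_yaw)
    forward = c * dx + s * dy
    left = -s * dx + c * dy
    return (-left, cam_height - point[2], forward)

def render(boxes, ego_pose, width=64, height=48, focal=40.0, cam_height=1.5):
    """Rasterize labeled boxes into a semantic image (0 = background)."""
    canvas = [[0] * width for _ in range(height)]
    drawable = []
    for center, size, yaw, label in boxes:
        cam_pts = [world_to_camera(p, ego_pose, cam_height)
                   for p in box_corners(*center, *size, yaw)]
        # Simplification: skip boxes that are partly behind the camera.
        if min(p[2] for p in cam_pts) <= 0.1:
            continue
        us = [focal * p[0] / p[2] + width / 2 for p in cam_pts]
        vs = [focal * p[1] / p[2] + height / 2 for p in cam_pts]
        depth = sum(p[2] for p in cam_pts) / len(cam_pts)
        drawable.append((depth, us, vs, label))
    # Painter's algorithm: draw far boxes first so near ones overwrite them.
    for _, us, vs, label in sorted(drawable, key=lambda d: -d[0]):
        u0, u1 = max(0, math.floor(min(us))), min(width - 1, math.floor(max(us)))
        v0, v1 = max(0, math.floor(min(vs))), min(height - 1, math.floor(max(vs)))
        for v in range(v0, v1 + 1):
            for u in range(u0, u1 + 1):
                canvas[v][u] = label
    return canvas

def mean_column(canvas):
    """Mean column index of all non-background pixels."""
    cols = [u for row in canvas for u, value in enumerate(row) if value]
    return sum(cols) / len(cols)

# One annotated vehicle 10 m ahead: (center, (length, width, height), yaw, label).
boxes = [((10.0, 0.0, 0.75), (4.0, 2.0, 1.5), 0.0, 1)]

logged_view = render(boxes, ego_pose=(0.0, 0.0, 0.0))
# Hypothetical counterfactual viewpoint: ego drifted 1 m to the left.
shifted_view = render(boxes, ego_pose=(0.0, 1.0, 0.0))
```

In this sketch the same annotations are re-rendered from a pose the logged drive never visited, and the vehicle moves to the right in the image when the ego drifts left. A recovery trajectory back toward the logged path could then serve as the training target for such a view, though how RAP constructs those targets is not described in the abstract.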