arxiv_cs_lg February 10, 2026

Hybrid Feedback-Guided Optimal Learning for Wireless Interactive Panoramic Scene Delivery

hybrid-feedback, online-learning, wireless-networks, multi-armed-bandit, panoramic-delivery

Original Content

arXiv:2602.07273v1 Announce Type: new

Abstract: Immersive applications such as virtual and augmented reality impose stringent requirements on frame rate, latency, and synchronization between physical and virtual environments. To meet these requirements, an edge server must render panoramic content, predict the user's head motion, and transmit a portion of the scene that is large enough to cover the user's viewport while remaining within wireless bandwidth constraints. Each portion produces two feedback signals: prediction feedback, indicating whether the selected portion covers the actual viewport, and transmission feedback, indicating whether the corresponding packets are successfully delivered. Prior work models this problem as a multi-armed bandit with two-level bandit feedback, but fails to exploit the fact that prediction feedback can be retrospectively computed for all candidate portions once the user's head pose is observed. As a result, prediction feedback constitutes full-information feedback rather than bandit feedback. Motivated by this observation, we introduce a two-level hybrid feedback model that combines full-information and bandit feedback, and formulate the portion selection problem as an online learning task under this setting. We derive an instance-dependent regret lower bound for the hybrid feedback model and propose AdaPort, a hybrid learning algorithm that leverages both feedback types to improve learning efficiency. We further establish an instance-dependent regret upper bound that matches the lower bound asymptotically, and demonstrate through real-world trace-driven simulations that AdaPort consistently outperforms state-of-the-art baseline methods.
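The two-level hybrid feedback structure described in the abstract can be illustrated with a small Python sketch. This is not AdaPort, whose details the abstract does not give; it is a plain UCB-style heuristic used only to show how the two feedback types differ. All names (`HybridFeedbackLearner`, `simulate`, `margins`, `tx_prob`) are hypothetical. The environment is also an assumption: the head-motion prediction error is drawn uniformly from [0, 1), a portion covers the viewport when the error is below its margin, and each portion has a fixed delivery probability.

```python
import math
import random


class HybridFeedbackLearner:
    """Toy learner (not AdaPort): full-information coverage estimates
    combined with a UCB bonus on the bandit-feedback delivery estimates."""

    def __init__(self, num_portions):
        self.k = num_portions
        self.t = 0                              # rounds observed so far
        self.cover_sum = [0.0] * num_portions   # updated for every portion each round
        self.tx_sum = [0.0] * num_portions      # updated only for the chosen portion
        self.tx_count = [0] * num_portions

    def select(self):
        # Try each portion once so every delivery estimate is defined.
        for arm in range(self.k):
            if self.tx_count[arm] == 0:
                return arm

        def index(arm):
            cover_hat = self.cover_sum[arm] / self.t
            tx_hat = self.tx_sum[arm] / self.tx_count[arm]
            bonus = math.sqrt(2.0 * math.log(self.t + 1) / self.tx_count[arm])
            # Optimism is applied only to the delivery term, because the
            # coverage term is observed for all portions in every round.
            return cover_hat * min(1.0, tx_hat + bonus)

        return max(range(self.k), key=index)

    def update(self, chosen, coverage_all, delivered):
        self.t += 1
        # Prediction feedback: available for every candidate portion.
        for arm, covered in enumerate(coverage_all):
            self.cover_sum[arm] += covered
        # Transmission feedback: available only for the portion that was sent.
        self.tx_sum[chosen] += delivered
        self.tx_count[chosen] += 1


def simulate(margins, tx_prob, rounds, seed=0):
    """Run the toy learner in the assumed environment; return pull counts."""
    rng = random.Random(seed)
    learner = HybridFeedbackLearner(len(margins))
    pulls = [0] * len(margins)
    for _ in range(rounds):
        arm = learner.select()
        # Once the head pose is known, coverage can be checked retrospectively
        # for all portions (assumed model: covered if error < margin).
        error = rng.random()
        coverage_all = [1 if error < m else 0 for m in margins]
        delivered = 1 if rng.random() < tx_prob[arm] else 0
        learner.update(arm, coverage_all, delivered)
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Larger portions cover more often but are assumed harder to deliver.
    print(simulate([0.3, 0.6, 0.9], [0.95, 0.9, 0.4], rounds=5000, seed=1))
```

In this sketch the coverage estimates improve at the same rate for every portion regardless of which one is sent, so exploration is needed only to learn the delivery probabilities. That is the intuition behind treating prediction feedback as full-information rather than bandit feedback.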