arxiv_cs_lg 2026年2月10日

線形アンサンブルサンプリングのシャープ解析

Sharp analysis of linear ensemble sampling

Translated: 2026/3/15 14:50:26

linear-banditsbandit- Algorithmslinear-banditsregret-analysisbandit- Algorithims

Japanese Translation

arXiv:2602.08026v1 Announce Type: new 要旨: 我々は、確率的な線形バンディットにおいて標準のガウス擾乱を用いた線形アンサンブルサンプリング（ES）を解析する。我々は、アンサンブルサイズ $m = \Theta(d \log n)$ の場合、ES が高確率で $\tilde O(d^{3/2} \sqrt n)$ のリグレットを達成することを示し、これによりトンプソンサンプリングのベンチマークとのギャップを埋め、同時に変算量を比較可能な水準で保つ。この証明は、線形バンディットにおけるランダム化探索について新しい視点を带来し、分析を $m$ 個の独立したブラウン運動の時間非uniform超過問題に還元する。興味深いことに、この連続時間的な視点のは強要されているわけではなく、むしろ自然であり、おそらく必要である：離散時間の問題は連続時間の解決策を求めているように思える、そして我々は他のシャープな ES の境界を得る方法を知っているわけではない。

Original Content

arXiv:2602.08026v1 Announce Type: new Abstract: We analyse linear ensemble sampling (ES) with standard Gaussian perturbations in stochastic linear bandits. We show that for ensemble size $m=\Theta(d\log n)$, ES attains $\tilde O(d^{3/2}\sqrt n)$ high-probability regret, closing the gap to the Thompson sampling benchmark while keeping computation comparable. The proof brings a new perspective on randomized exploration in linear bandits by reducing the analysis to a time-uniform exceedance problem for $m$ independent Brownian motions. Intriguingly, this continuous-time lens is not forced; it appears natural--and perhaps necessary: the discrete-time problem seems to be asking for a continuous-time solution, and we know of no other way to obtain a sharp ES bound.