arxiv_cs_lg February 10, 2026

DSL：コンペティション意識的なスケーリングを用いたソフトマックス推奨システムにおける理解と改善

DSL: Understanding and Improving Softmax Recommender Systems with Competition-Aware Scaling

Translated: 2026/3/15 13:04:50

dsl, softmax-loss, recommender-systems, machine-learning, distributionally-robust-optimization

Japanese Translation

arXiv:2602.07206v1 Announce Type: new Abstract: ソフトマックス損失 (Softmax Loss, SL) は、その優れた性能、頑健性、そして公平性を示したことで、推奨システム (RS) において次第に採用されるようになっています。しかし、暗黙的なフィードバックにおいて、単一のグローバルな温度と均等な負例の扱いが、サンプリングされたセットに異なる程度の関連性や情報量のコンピテーター（競合者）が含まれる可能性があるため、脆いトレーニングを招く可能性があります。特定の負例セットを持つユーザー・アイテムペアにおける最適な損失シャープネスは、異なる負例セットを持つ別のペアに対して亜最適または不安定となる可能性があります。我々は、コンペティションそのものから有効なシャープネスを推論する「Dual-scale Softmax Loss (DSL)」を導入しました。DSL は、log-sum-exp バックボーンに 2 つの補完ブランチを追加します。まず、各トレーニングインスタンス内の負例を難易度とアイテム間の類似度に基づいて再重み付けします。次に、構築されたコンピテータースレートにおけるコンペティション強度に基づいて、サンプルごとの温度を適応させます。これら 2 つの成分は、SL の幾何学構造を維持しつつ、負例間およびサンプル間におけるコンペティション分布を再形成します。複数の代表ベンチマークとバックボーンにおいて、DSL は強力なベースラインに対して著しい向上をもたらしており、いくつかの設定で SL に対する改善が $10\%$ を超え、データセット、指標、バックボーンを平均して $6.22\%$ となりました。分布外 (OOD) ポピュラリティシフトの下では、利益はさらに大きくなり、SL に対する平均改善は $9.31\%$ となりました。我々は、DSL が曖昧なインスタンスにおけるロバストペイオフと KL 離れをどのように再形成するかを示す、理論的かつ分布的に頑健な最適化 (DRO) 分析も提供しました。これは、経験的に観察された精度と頑健性の向上を説明する上で役立ちました。

Original Content

arXiv:2602.07206v1 Announce Type: new Abstract: Softmax Loss (SL) is increasingly adopted for recommender systems (RS) because it has demonstrated better performance, robustness, and fairness. Yet in implicit-feedback settings, a single global temperature and equal treatment of uniformly sampled negatives can lead to brittle training, because sampled sets may contain competitors of varying relevance and informativeness. The loss sharpness that is optimal for a user-item pair with a particular set of negatives can be suboptimal or destabilising for another pair with different negatives. We introduce Dual-scale Softmax Loss (DSL), which infers effective sharpness from the sampled competition itself. DSL adds two complementary branches to the log-sum-exp backbone. First, it reweights negatives within each training instance using hardness and item--item similarity. Second, it adapts a per-example temperature from the competition intensity over a constructed competitor slate. Together, these components preserve the geometry of SL while reshaping the competition distribution across negatives and across examples. Over several representative benchmarks and backbones, DSL yields substantial gains over strong baselines, with improvements over SL exceeding $10\%$ in several settings and averaging $6.22\%$ across datasets, metrics, and backbones. Under out-of-distribution (OOD) popularity shift, the gains are larger, with an average improvement of $9.31\%$ over SL. We further provide a theoretical distributionally robust optimisation (DRO) analysis, which demonstrates how DSL reshapes the robust payoff and the KL deviation for ambiguous instances. This helps explain the empirically observed improvements in accuracy and robustness.
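The abstract gives no formulas, so the following is only a minimal sketch of the idea it describes, not the paper's method. `softmax_loss` is the standard sampled softmax loss with one global temperature. `dual_scale_loss` is a hypothetical variant: its weighting rule (`alpha`, `beta`), its intensity measure, its temperature rule (`gamma`), and its use of the sampled negatives as the "competitor slate" are all assumptions made for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def softmax_loss(pos_score, neg_scores, tau=0.1):
    """Standard sampled softmax loss with a single global temperature tau."""
    logits = [pos_score / tau] + [s / tau for s in neg_scores]
    return logsumexp(logits) - pos_score / tau


def dual_scale_loss(pos_score, neg_scores, neg_pos_sims,
                    base_tau=0.1, alpha=1.0, beta=1.0, gamma=0.5):
    """Hypothetical dual-scale variant; every formula here is an assumption.

    neg_pos_sims[j] is an item-item similarity between negative j and the
    positive item (for example, a cosine similarity of item embeddings).
    """
    # Branch 1 (assumed form): weight each negative by hardness (its score)
    # and by similarity to the positive. Weights are normalised to average 1.
    raw = [alpha * s + beta * sim for s, sim in zip(neg_scores, neg_pos_sims)]
    norm = logsumexp(raw) - math.log(len(raw))
    log_w = [r - norm for r in raw]

    # Branch 2 (assumed form): competition intensity is the softmax mass that
    # the negatives take from the positive at the base temperature. The
    # direction (soften when competition is intense) is an arbitrary choice.
    p_pos = math.exp(-softmax_loss(pos_score, neg_scores, base_tau))
    intensity = 1.0 - p_pos
    tau = base_tau * (1.0 + gamma * intensity)

    # Same log-sum-exp backbone as softmax_loss, with per-negative log-weights
    # and a per-example temperature.
    logits = [pos_score / tau] + [
        s / tau + lw for s, lw in zip(neg_scores, log_w)
    ]
    return logsumexp(logits) - pos_score / tau


if __name__ == "__main__":
    pos, negs, sims = 0.5, [0.4, -0.2, 0.1], [0.9, 0.1, 0.3]
    print("SL :", softmax_loss(pos, negs))
    print("DSL:", dual_scale_loss(pos, negs, sims))
```

With `alpha = beta = gamma = 0`, the sketch reduces exactly to `softmax_loss`. This is one way to read the claim that the two branches reshape the competition distribution while keeping the log-sum-exp form.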