arxiv_cs_lg April 20, 2026

What Makes LLMs Effective Sequential Recommenders? A Study on Preference Intensity and Temporal Context

Translated: 2026/4/20 11:06:40

large-language-models, sequential-recommendation, preference-intensity, temporal-context, recpo

Original Content

arXiv:2506.02261v3
Announce Type: replace-cross

Abstract: What enables large language models (LLMs) to effectively model user preferences in sequential recommendation? Our investigation reveals that existing preference-alignment approaches largely rely on binary pairwise comparisons, overlooking two critical factors: preference intensity (the structured strength of affinity or aversion) and temporal context (the extent to which recent interactions better reflect a user's current intent). Through controlled experiments, we show that leveraging comprehensive feedback with structured preference signals substantially improves recommendation performance, indicating that binary modeling discards essential information. Motivated by these findings, we propose RecPO, a unified preference optimization framework that maps both explicit and implicit feedback into a common preference signal and constructs adaptive reward margins that jointly account for preference intensity and interaction recency. Experiments across five datasets show that RecPO consistently outperforms state-of-the-art baselines while exhibiting behavioral patterns aligned with human decision-making, including favoring immediate satisfaction, maintaining preference coherence, and avoiding dispreferred items. Our results highlight that preference intensity and temporal context are fundamental ingredients for effective LLM-based recommendation.
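
The abstract gives no formula for the adaptive reward margin, so the following is only a minimal sketch of how such a margin could enter a pairwise preference loss. Everything here is an assumption made for illustration and not RecPO's actual definition: the function names, the exponential recency decay, the `decay`, `scale`, and `beta` parameters, and the margin-shifted log-sigmoid loss.

```python
import math


def adaptive_margin(pref_chosen, pref_rejected, steps_ago, decay=0.1, scale=1.0):
    """Hypothetical margin: preference gap discounted by interaction age.

    pref_chosen / pref_rejected: preference signals in [0, 1]
        (for example, a rating divided by the maximum rating).
    steps_ago: how many interactions back the chosen item occurred
        (0 means the most recent interaction).
    """
    gap = pref_chosen - pref_rejected          # preference intensity term
    recency = math.exp(-decay * steps_ago)     # assumed exponential decay
    return scale * gap * recency


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return -(max(-x, 0.0) + math.log1p(math.exp(-abs(x))))


def margin_preference_loss(logratio_chosen, logratio_rejected, margin, beta=1.0):
    """Pairwise loss: -log sigmoid(beta * (chosen - rejected) - margin).

    The log-ratios stand in for policy-vs-reference log-probability ratios,
    as in DPO-style objectives (an assumption, not taken from the abstract).
    """
    return -log_sigmoid(beta * (logratio_chosen - logratio_rejected) - margin)


if __name__ == "__main__":
    recent = adaptive_margin(1.0, 0.2, steps_ago=0)
    old = adaptive_margin(1.0, 0.2, steps_ago=10)
    print("margin (recent):", recent)
    print("margin (old):   ", old)
    print("loss (recent):  ", margin_preference_loss(1.0, 0.0, recent))
    print("loss (old):     ", margin_preference_loss(1.0, 0.0, old))
```

Under these assumptions, a strongly preferred and recent item yields a larger margin, so the model must separate the chosen and rejected items by more before the loss becomes small. With a zero margin the loss reduces to a plain binary pairwise comparison.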