arxiv_cs_ai April 20, 2026

Preconditioned Test-Time Adaptation for Out-of-Distribution Debiasing in Narrative Generation

Translated: 2026/4/20 11:18:46

test-time-adaptation, large-language-models, out-of-distribution, llm-debiasing, loRA

Original Content

arXiv:2603.13683v3 Announce Type: replace-cross

Abstract: Although debiased large language models (LLMs) handle known or low-bias prompts well, they often fail on unfamiliar, high-bias prompts. We demonstrate via out-of-distribution (OOD) detection that these high-bias prompts cause a distribution shift that degrades the performance of static models. To enable real-time correction, we propose CAP-TTA, a test-time adaptation framework. CAP-TTA triggers context-aware LoRA updates only when a bias-risk score exceeds a preset threshold. An offline-precomputed diagonal preconditioner keeps the optimization fast and stable. Across multiple benchmarks and human evaluations, CAP-TTA effectively reduces toxicity and bias scores with significantly lower latency than standard optimization methods (e.g., AdamW or SGD). Furthermore, it prevents catastrophic forgetting and substantially improves narrative fluency over state-of-the-art baselines without compromising debiasing performance.
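The gating and preconditioning ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`precompute_diag_preconditioner`, `gated_adapt`), the inverse-RMS-gradient recipe for the diagonal preconditioner, the learning rate, the step count, and the toy quadratic loss standing in for a debiasing objective over LoRA parameters are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def precompute_diag_preconditioner(calib_grads: Sequence[Vector],
                                   eps: float = 1e-8) -> Vector:
    """Offline step (assumed recipe): per-coordinate inverse RMS of
    gradients collected on calibration data."""
    n = len(calib_grads)
    dim = len(calib_grads[0])
    mean_sq = [sum(g[i] ** 2 for g in calib_grads) / n for i in range(dim)]
    return [1.0 / (ms ** 0.5 + eps) for ms in mean_sq]


def gated_adapt(params: Vector,
                risk_score: float,
                threshold: float,
                grad_fn: Callable[[Vector], Vector],
                precond: Vector,
                lr: float = 0.5,
                steps: int = 5) -> Tuple[Vector, bool]:
    """Test-time step: update the adapter parameters only when the
    bias-risk score exceeds the threshold; otherwise return them unchanged."""
    if risk_score <= threshold:
        return list(params), False
    theta = list(params)
    for _ in range(steps):
        grad = grad_fn(theta)
        # Diagonal preconditioning: rescale each coordinate's gradient.
        theta = [t - lr * p * g for t, p, g in zip(theta, precond, grad)]
    return theta, True


# Toy ill-conditioned quadratic standing in for the adaptation loss
# (hypothetical; a real system would backpropagate through the LLM).
CURVATURE = [100.0, 1.0]


def toy_loss(theta: Vector) -> float:
    return 0.5 * sum(h * t * t for h, t in zip(CURVATURE, theta))


def toy_grad(theta: Vector) -> Vector:
    return [h * t for h, t in zip(CURVATURE, theta)]


# Offline: gradients at two hypothetical calibration points.
calib = [toy_grad([1.0, 1.0]), toy_grad([-1.0, -1.0])]
precond = precompute_diag_preconditioner(calib)

start = [1.0, 1.0]
# Low-risk prompt: no update is triggered.
skipped, ran_low = gated_adapt(start, 0.2, 0.5, toy_grad, precond)
# High-risk prompt: a few preconditioned steps are taken.
adapted, ran_high = gated_adapt(start, 0.9, 0.5, toy_grad, precond)
```

In this toy setting the preconditioner equalizes the two very different curvatures, so both coordinates shrink at the same rate with a single learning rate. Plain gradient descent would need a learning rate below 0.02 to stay stable on the steep coordinate, which makes progress on the flat one slow. This is one plausible reading of why a precomputed diagonal preconditioner could give "fast and stable" updates.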