arxiv_cs_ai February 10, 2026

Understanding Emotion in Discourse: Recognition Insights and Linguistic Patterns for Generation

Translated: 2026/2/14 8:11:08

Translated Abstract

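Despite strong recent progress in Emotion Recognition in Conversation (ERC), two gaps remain: it is unclear which modeling choices materially affect performance, and there is little linguistic analysis linking recognition findings to actionable cues for generation. We address both through a systematic study on IEMOCAP. For recognition, we run controlled ablations with 10 random seeds and paired tests (corrected for multiple comparisons), which yield three findings. First, conversational context dominates: performance saturates quickly, with roughly 90% of the gain obtained from only the most recent 10-30 preceding turns. Second, hierarchical sentence representations improve utterance-only recognition (K=0), but the benefit vanishes once turn-level context is available, suggesting that conversational history subsumes intra-utterance structure. Third, integrating an external affective lexicon (SenticNet) does not improve results, consistent with pretrained encoders already capturing the affective signal. Under a strictly causal (past-only) setting, our simple models still reach strong performance (82.69% 4-way; 67.07% 6-way weighted F1). For the linguistic analysis, we examine 5,286 discourse-marker occurrences and find a reliable association between emotion and marker position (p < 0.0001). Sad utterances use left-periphery markers less often (21.9%) than other emotions (28-32%), in line with accounts that link left-periphery markers to active discourse management. This pattern is consistent with Sad benefiting most from conversational context (+22%p), suggesting that sadness relies more on discourse history than on overt pragmatic signaling.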
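The abstract mentions paired tests with a correction for multiple comparisons but does not say which correction is used. As an illustration only, the sketch below assumes a Holm-Bonferroni step-down adjustment; the function name `holm_adjust` and the example p-values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def holm_adjust(p_values: Sequence[float]) -> List[float]:
    """Return Holm-Bonferroni adjusted p-values in the original order.

    Assumption: the paper does not name its correction method; Holm is
    used here purely as an example of a multiple-comparison adjustment.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, etc.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values from three paired ablation comparisons.
    raw = [0.01, 0.04, 0.03]
    for p, q in zip(raw, holm_adjust(raw)):
        print(f"raw={p:.3f} adjusted={q:.3f}")
```

A comparison would then be reported as significant only if its adjusted value falls below the chosen threshold (for example 0.05), which keeps the family-wise error rate controlled across all ablations.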

Original Content

arXiv:2601.00181v2 Announce Type: replace-cross Abstract: Despite strong recent progress in Emotion Recognition in Conversation (ERC), two gaps remain: we lack a clear understanding of which modeling choices materially affect performance, and we have limited linguistic analysis linking recognition findings to actionable generation cues. We address both via a systematic study on IEMOCAP. For recognition, we conduct controlled ablations with 10 random seeds and paired tests (with correction for multiple comparisons), yielding three findings. First, conversational context is dominant: performance saturates quickly, with roughly 90% of the gain achieved using only the most recent 10-30 preceding turns. Second, hierarchical sentence representations improve utterance-only recognition (K=0), but the benefit vanishes once turn-level context is available, suggesting that conversational history subsumes intra-utterance structure. Third, integrating an external affective lexicon (SenticNet) does not improve results, consistent with pretrained encoders already capturing the affective signal. Under a strictly causal (past-only) setting, our simple models attain strong performance (82.69% 4-way; 67.07% 6-way weighted F1). For linguistic analysis, we examine 5,286 discourse-marker occurrences and find a reliable association between emotion and marker position (p < 0.0001). Sad utterances show reduced left-periphery marker usage (21.9%) relative to other emotions (28-32%), aligning with accounts linking left-periphery markers to active discourse management. This pattern is consistent with Sad benefiting most from conversational context (+22%p), suggesting that sadness relies more on discourse history than on overt pragmatic signaling.
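To make the strictly causal (past-only) setting with a context size K concrete, here is a minimal sketch of how the K preceding turns might be gathered for each target utterance. The function names `causal_context` and `build_inputs` and the toy dialogue are assumptions for illustration; the paper's actual preprocessing is not described in the abstract.

```python
from typing import List, Sequence, Tuple


def causal_context(turns: Sequence[str], index: int, k: int) -> List[str]:
    """Return up to k turns that strictly precede turns[index].

    No future turn is ever included, which is what "past-only" means here.
    With k = 0 the result is empty (utterance-only recognition).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if not 0 <= index < len(turns):
        raise IndexError("index out of range")
    start = max(0, index - k)
    return list(turns[start:index])


def build_inputs(turns: Sequence[str], k: int) -> List[Tuple[List[str], str]]:
    """Pair every utterance with its past-only context window of size k."""
    return [(causal_context(turns, i, k), turns[i]) for i in range(len(turns))]


if __name__ == "__main__":
    # Hypothetical toy dialogue, not drawn from IEMOCAP.
    dialogue = ["Hi.", "How are you?", "Not great.", "What happened?"]
    for context, target in build_inputs(dialogue, k=2):
        print(context, "->", target)
```

Sweeping `k` over a range of values and plotting the metric for each would be one way to observe the saturation behavior the abstract reports, under the assumption that the model consumes the context and target in this paired form.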