arxiv_cs_ai February 10, 2026

Understanding Emotion in Discourse: Recognition Insights and Linguistic Patterns for Generation

Translated: 2026/2/14 8:11:08

Translated Abstract

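Despite strong recent progress in Emotion Recognition in Conversation (ERC), two gaps remain: it is unclear which modeling choices materially affect performance, and there is little linguistic analysis connecting recognition findings to actionable cues for generation. We address both through a systematic study on IEMOCAP. For recognition, we run controlled ablations with 10 random seeds and paired tests (corrected for multiple comparisons), which yield three findings. First, conversational context dominates: performance saturates quickly, with roughly 90% of the gain reached using only the most recent 10-30 preceding turns. Second, hierarchical sentence representations help utterance-only recognition (K=0), but the benefit disappears once turn-level context is added, suggesting that conversational history subsumes intra-utterance structure. Third, integrating an external affective lexicon (SenticNet) does not improve results, consistent with pretrained encoders already capturing affective signal. Under a strictly causal (past-only) setting, our simple models still perform strongly (82.69% 4-way; 67.07% 6-way weighted F1). For linguistic analysis, we examine 5,286 discourse-marker occurrences and find a reliable association between emotion and marker position (p < 0.0001). Sad utterances use left-periphery markers less often (21.9%) than the other emotions (28-32%), in line with accounts that tie left-periphery markers to active discourse management. This pattern is consistent with Sad benefiting the most from conversational context (+22%p), which suggests that sadness relies more on discourse history than on overt pragmatic signaling.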
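As a rough illustration of the strictly causal (past-only) setting with a window of K preceding turns, the sketch below selects context for a target utterance. The function name `causal_context` and the toy dialogue are hypothetical and are not taken from the paper; the abstract does not describe how the authors build their context windows.

```python
from typing import List, Sequence


def causal_context(utterances: Sequence[str], index: int, k: int) -> List[str]:
    """Return up to k utterances that precede position `index` (past-only).

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if not 0 <= index < len(utterances):
        raise IndexError("index out of range")
    start = max(0, index - k)
    # The slice stops before `index`, so the target utterance and every
    # later turn are excluded from the context.
    return list(utterances[start:index])


# Toy dialogue, invented for this example only.
dialogue = ["Hi.", "Hey, how are you?", "Not great.", "What happened?", "I lost my job."]

print(causal_context(dialogue, 4, 2))  # the two turns before the last one
print(causal_context(dialogue, 4, 0))  # K=0: utterance-only, so no context
```

With K=0 the context is empty, which corresponds to the utterance-only condition mentioned in the abstract; larger K simply widens the window backward and never includes future turns.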

Original Content

arXiv:2601.00181v2 Announce Type: replace-cross Abstract: Despite strong recent progress in Emotion Recognition in Conversation (ERC), two gaps remain: we lack a clear understanding of which modeling choices materially affect performance, and we have limited linguistic analysis linking recognition findings to actionable generation cues. We address both via a systematic study on IEMOCAP. For recognition, we conduct controlled ablations with 10 random seeds and paired tests (with correction for multiple comparisons), yielding three findings. First, conversational context is dominant: performance saturates quickly, with roughly 90% of the gain achieved using only the most recent 10-30 preceding turns. Second, hierarchical sentence representations improve utterance-only recognition (K=0), but the benefit vanishes once turn-level context is available, suggesting that conversational history subsumes intra-utterance structure. Third, integrating an external affective lexicon (SenticNet) does not improve results, consistent with pretrained encoders already capturing affective signal. Under a strictly causal (past-only) setting, our simple models attain strong performance (82.69% 4-way; 67.07% 6-way weighted F1). For linguistic analysis, we examine 5,286 discourse-marker occurrences and find a reliable association between emotion and marker position (p < 0.0001). Sad utterances show reduced left-periphery marker usage (21.9%) relative to other emotions (28-32%), aligning with accounts linking left-periphery markers to active discourse management. This pattern is consistent with Sad benefiting most from conversational context (+22%p), suggesting that sadness relies more on discourse history than on overt pragmatic signaling.
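The abstract mentions paired tests with a correction for multiple comparisons but does not say which correction is used. As one common possibility, the sketch below applies the Holm-Bonferroni step-down adjustment to a list of p-values. Holm is an assumption made here for illustration, and the function name `holm_adjust` and the input p-values are hypothetical, not results from the paper.

```python
from typing import List, Sequence


def holm_adjust(p_values: Sequence[float]) -> List[float]:
    """Return Holm-Bonferroni adjusted p-values in the original order.

    Assumption: Holm is used only as an example; the paper's actual
    correction method is not stated in the abstract.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Step-down multiplier: m for the smallest p-value, then m-1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Illustrative p-values only, e.g. one per ablation comparison.
raw = [0.01, 0.04, 0.03]
print(holm_adjust(raw))
```

A comparison would then be called significant when its adjusted p-value falls below the chosen level (for example 0.05). The raw p-values themselves would come from a paired test over per-seed scores.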