arxiv_cs_ai April 24, 2026

Enhancing Science Classroom Discourse Analysis through Joint Multi-Task Learning for Reasoning-Component Classification

discourse-analysis, multi-task-learning, classroom-discourse, llm-augmentation, reasoning-classification

Original Content

arXiv:2604.21137v1 Announce Type: cross Abstract: Analyzing the reasoning patterns of students in science classrooms is critical for understanding knowledge-construction mechanisms and for improving instructional practice to maximize cognitive engagement, yet manual coding of classroom discourse at scale remains prohibitively labor-intensive. We present an automated discourse analysis system (ADAS) that jointly classifies teacher and student utterances along two complementary dimensions, Utterance Type (UT) and Reasoning Component (RC), both derived from our prior CDAT framework. To address severe label imbalance among minority classes, we (1) perform a stratified re-split of the annotated corpus, (2) apply LLM-based synthetic data augmentation targeting minority classes, and (3) train a dual-probe head RoBERTa-base classifier. A zero-shot GPT-5.4 baseline achieves macro-F1 of 0.467 on UT and 0.476 on RC, establishing meaningful upper bounds for prompt-only approaches and motivating fine-tuning. Beyond classification, we conduct discourse pattern analyses, including UT×RC co-occurrence profiling, per-session Cognitive Complexity Index (CCI) computation, lag-sequential analysis, and IRF chain analysis. These analyses reveal that teacher Feedback-with-Question (Fq) moves are the most consistent antecedents of student inferential reasoning (SR-I). Our results demonstrate that LLM-based augmentation meaningfully improves UT minority-class recognition, and that the structural simplicity of the RC task makes it tractable even for lexical baselines.
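The abstract reports macro-F1, a natural choice when minority classes matter because every class contributes equally to the score regardless of how often it occurs. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function names `macro_f1` and `accuracy`, the toy label sequence, and the label `Q` are assumptions made for illustration; only `Fq` appears in the abstract.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of utterances labeled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy imbalanced data (hypothetical): 8 majority-class "Q" utterances, 2 minority "Fq".
y_true = ["Q"] * 8 + ["Fq"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["Q"] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")
```

On this toy data the always-majority classifier reaches 0.8 accuracy, but its macro-F1 is only 4/9 (about 0.444), because the missed minority class contributes an F1 of zero to the average.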