arxiv_cs_ai February 10, 2026


Latent Reasoning with Supervised Thinking States

Translated: 2026/3/7 13:28:54

machine-learning, large-language-models, latent-reasoning, chain-of-thought

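Summary (translated from Japanese)

Reasoning with a chain of thought (CoT) lets Large Language Models solve complex tasks, but generating long rationales makes inference costly. To address this, the paper proposes Thinking States, a method that performs reasoning while the input is being processed. It has two key advantages. First, it keeps the recurrent nature of CoT, with the thoughts generated as the input is processed. Second, because the thoughts are represented as tokens, they can be learned from natural language supervision using teacher forcing, which is parallelizable. Across several reasoning tasks, Thinking States outperforms other latent reasoning methods, and on state-tracking tasks it shows stronger reasoning behavior than CoT, extrapolating to sequences longer than those seen during training.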

Original Content

arXiv:2602.08332v1 Announce Type: cross

Abstract: Reasoning with a chain-of-thought (CoT) enables Large Language Models (LLMs) to solve complex tasks but incurs significant inference costs due to the generation of long rationales. We propose Thinking States, a method that performs reasoning while the input is processing. Specifically, Thinking States generates sequences of thinking tokens every few input tokens, transforms the thoughts back into embedding space, and adds them to the following input tokens. This has two key advantages. First, it captures the recurrent nature of CoT, but where the thought tokens are generated as input is processing. Second, since the thoughts are represented as tokens, they can be learned from natural language supervision, and using teacher-forcing, which is parallelizable. Empirically, Thinking States outperforms other latent reasoning methods on multiple reasoning tasks, narrowing the gap to CoT on math problems, and matching its performance on 2-Hop QA with improved latency. On state-tracking tasks, we show Thinking States leads to stronger reasoning behavior than CoT, successfully extrapolating to longer sequences than seen during training.
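
The data flow described in the abstract (generate thought tokens every few input tokens, map them back into embedding space, add them to the following input tokens) can be illustrated with a toy sketch. This is not the authors' implementation: the names `make_embedding_table`, `toy_think`, and `thinking_states_forward`, the parameters `chunk_size` and `num_thoughts`, the mean-pooling of thought embeddings, and the deterministic rule that picks thought tokens are all assumptions made for illustration. In the actual method a trained model would generate the thoughts.

```python
import random


def make_embedding_table(vocab_size, dim, seed=0):
    """Random embedding table (hypothetical stand-in for learned embeddings)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def toy_think(chunk_states, num_thoughts, vocab_size):
    """Hypothetical thought generator.

    A real system would decode thinking tokens with the language model;
    here we derive token ids deterministically from the chunk states.
    """
    total = sum(sum(vec) for vec in chunk_states)
    base = int(abs(total) * 1000)
    return [(base + i) % vocab_size for i in range(num_thoughts)]


def thinking_states_forward(token_ids, table, chunk_size=4, num_thoughts=2):
    """Process the input chunk by chunk, carrying thoughts forward.

    Returns the per-token input states and the thought tokens per chunk.
    """
    dim = len(table[0])
    state = [0.0] * dim  # embedding of the previous chunk's thoughts
    outputs = []
    thoughts_per_chunk = []
    for start in range(0, len(token_ids), chunk_size):
        chunk = token_ids[start:start + chunk_size]
        # Add the previous thoughts' embedding to each input token embedding.
        chunk_states = [[e + s for e, s in zip(table[t], state)] for t in chunk]
        outputs.extend(chunk_states)
        # Generate thought tokens for this chunk (toy rule, see above).
        thoughts = toy_think(chunk_states, num_thoughts, len(table))
        thoughts_per_chunk.append(thoughts)
        # Map thoughts back into embedding space (assumed: mean of embeddings).
        state = [sum(table[t][d] for t in thoughts) / num_thoughts for d in range(dim)]
    return outputs, thoughts_per_chunk


if __name__ == "__main__":
    table = make_embedding_table(vocab_size=16, dim=3)
    states, thoughts = thinking_states_forward([1, 2, 3, 4, 5, 6, 7, 8, 9], table)
    print(len(states), "token states;", len(thoughts), "thought groups")
```

Because each chunk's thoughts depend on states that already include the previous chunk's thoughts, the sketch has the recurrent structure the abstract mentions, while the thoughts themselves remain discrete tokens that could in principle be supervised with natural-language targets.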