arxiv_cs_cv April 24, 2026

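Directional Confusions in Human and Machine Vision Reveal, Through the Direction of Misclassifications, Divergent Inductive Biases That Misclassification Frequency Hides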

Directional Confusions Reveal Divergent Inductive Biases Through Rate-Distortion Geometry in Human and Machine Vision

Translated: 2026/4/24 19:47:15

rate-distortion, inductive-bias, visual-models, confusion-matrix, human-computer-vision

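Translation of the Japanese Abstract

arXiv:2604.21909v1 Announce Type: new Abstract: Humans and state-of-the-art vision models achieve similar classification accuracy yet make systematically different kinds of mistakes. They differ not in how often they err but in which category is mistaken for which, and in which direction the confusion runs. We show that these directional confusions reveal distinct inductive biases that accuracy alone cannot expose. Using matched responses from humans and deep vision models on a natural-image categorization task under 12 perturbation conditions, we quantify asymmetry in confusion matrices and relate it to generalization geometry through a rate-distortion (RD) framework, summarized by three geometric signatures: slope (β), curvature (κ), and efficiency (AUC). We find that humans show broad but weak asymmetries, whereas deep vision models show sparse, strong directional collapses. Robustness training reduces global asymmetry but does not restore the human-like breadth-strength profile of graded similarity. Mechanistic simulations further show that different organizations of asymmetry shift the RD frontier in opposite directions even when performance is matched. Together, these results position directional confusions and RD geometry as compact, interpretable signatures of inductive bias under distribution shift.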

Original Content

arXiv:2604.21909v1 Announce Type: new Abstract: Humans and modern vision models can reach similar classification accuracy while making systematically different kinds of mistakes, differing not in how often they err, but in who gets mistaken for whom, and in which direction. We show that these directional confusions reveal distinct inductive biases that are invisible to accuracy alone. Using matched human and deep vision model responses on a natural-image categorization task under 12 perturbation types, we quantify asymmetry in confusion matrices and link it to generalization geometry through a Rate-Distortion (RD) framework, summarized by three geometric signatures: slope (beta), curvature (kappa), and efficiency (AUC). We find that humans exhibit broad but weak asymmetries, whereas deep vision models show sparser, stronger directional collapses. Robustness training reduces global asymmetry but fails to recover the human-like breadth-strength profile of graded similarity. Mechanistic simulations further show that different asymmetry organizations shift the RD frontier in opposite directions, even when matched for performance. Together, these results position directional confusions and RD geometry as compact, interpretable signatures of inductive bias under distribution shift.
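
The abstract does not say how asymmetry in a confusion matrix is measured, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that asymmetry for a class pair (i, j) can be read as the gap between P(respond j | true i) and P(respond i | true j), that "breadth" is the fraction of pairs whose gap exceeds a cutoff, and that "strength" is the mean gap over those pairs. The function names, the `threshold` parameter, and both toy matrices are hypothetical.

```python
def row_normalize(counts):
    """Convert raw confusion counts to P(response | true class), row by row."""
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs


def directional_asymmetry(counts, threshold=0.05):
    """Summarize confusion asymmetry with two illustrative numbers.

    These definitions are assumptions made for this sketch:
      breadth  = share of class pairs whose directional gap exceeds `threshold`
      strength = mean directional gap among those pairs
    """
    p = row_normalize(counts)
    n = len(p)
    # Directional gap |P(j|i) - P(i|j)| for every unordered pair i < j.
    gaps = [abs(p[i][j] - p[j][i]) for i in range(n) for j in range(i + 1, n)]
    asymmetric = [g for g in gaps if g > threshold]
    breadth = len(asymmetric) / len(gaps) if gaps else 0.0
    strength = sum(asymmetric) / len(asymmetric) if asymmetric else 0.0
    return {"breadth": breadth, "strength": strength}


def accuracy(counts):
    """Fraction of trials on the diagonal of the confusion matrix."""
    correct = sum(counts[i][i] for i in range(len(counts)))
    return correct / sum(sum(row) for row in counts)


# Toy matrices (rows = true class, columns = response), both at 80% accuracy.
broad_weak = [[80, 14, 6], [6, 80, 14], [14, 6, 80]]      # many small gaps
sparse_strong = [[40, 60, 0], [0, 100, 0], [0, 0, 100]]   # one large gap

for name, matrix in [("broad_weak", broad_weak), ("sparse_strong", sparse_strong)]:
    print(name, accuracy(matrix), directional_asymmetry(matrix))
```

With these toy inputs the two matrices share the same accuracy, yet the first has every pair above the cutoff with a gap of about 0.08, while the second has one pair in three above the cutoff with a gap of 0.6. That is the kind of contrast the abstract describes as broad but weak versus sparse but strong, although the actual metrics in the paper may be defined differently.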