arxiv_cs_cv 2026/2/10

Topological Signatures vs. Gradient Histograms: A Comparative Study for Retinal Fundus Image Classification

Topological Signatures vs. Gradient Histograms: A Comparative Study for Medical Image Classification

Translated: 2026/3/15 7:01:57

topological-data-analysis, histogram-of-oriented-gradients, retinal-fundus-image, medical-image-classification, persistent-homology

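Translated Abstract

arXiv:2507.03006v2 Announce Type: replace

Abstract: This paper compares two fundamentally different feature extraction paradigms, Histogram of Oriented Gradients (HOG) and Topological Data Analysis (TDA), for retinal fundus image classification. HOG captures local structural information by modeling the distribution of gradient orientations within spatial regions, which effectively encodes texture and edge patterns. TDA, implemented via cubical persistent homology, instead extracts global topological descriptors that characterize shape, connectivity, and intensity-based structure. We evaluate both approaches on the publicly available APTOS retinal fundus dataset on two tasks: binary classification (normal vs. diabetic retinopathy (DR)) and five-class DR severity grading. From each image we extract 26,244 HOG features and 800 TDA features, and use each feature set independently to train seven classical machine learning models (logistic regression, random forest, XGBoost, support vector machine, decision tree, k-nearest neighbors, and Extra Trees) with 10-fold cross-validation. XGBoost performs best with both feature types. Binary classification reaches accuracies of 94.29% (HOG) and 94.18% (TDA); multi-class classification reaches 74.41% and 74.69%, respectively. These results indicate that gradient-based and topological features provide complementary representations of retinal image structure, and that combining the two approaches is a promising route to interpretable and robust medical image classification.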
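
The abstract does not state the HOG parameters behind the 26,244 features per image. As a rough sanity check on that figure, the sketch below computes the length of a dense HOG descriptor under the usual layout, in which blocks of cells slide across the image one cell at a time. The function name and the chosen setting (224 x 224 input, 8 x 8-pixel cells, 2 x 2-cell blocks, 9 orientation bins) are assumptions made for illustration. This setting happens to reproduce the reported count, but other settings can as well, so it should not be read as the authors' configuration.

```python
def hog_feature_length(height, width, pixels_per_cell=8,
                       cells_per_block=2, orientations=9):
    """Length of a dense HOG descriptor with blocks sliding one cell at a time."""
    # Number of whole cells that fit along each axis.
    cells_y = height // pixels_per_cell
    cells_x = width // pixels_per_cell
    # Number of overlapping block positions along each axis.
    blocks_y = cells_y - cells_per_block + 1
    blocks_x = cells_x - cells_per_block + 1
    if blocks_y < 1 or blocks_x < 1:
        raise ValueError("image is smaller than one block")
    # Each block contributes one histogram per cell it contains.
    return blocks_y * blocks_x * cells_per_block * cells_per_block * orientations


# Assumed (hypothetical) setting: 224 x 224 image, 8 x 8 cells, 2 x 2 blocks, 9 bins.
n_features = hog_feature_length(224, 224)
print(n_features)  # 27 * 27 * 2 * 2 * 9 = 26244
```

The arithmetic also shows why HOG vectors grow so much larger than the 800 TDA features: the length scales with the number of block positions, which is roughly quadratic in image side length.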

Original Content

arXiv:2507.03006v2 Announce Type: replace

Abstract: This work presents a comparative evaluation of two fundamentally different feature extraction paradigms--Histogram of Oriented Gradients (HOG) and Topological Data Analysis (TDA)--for medical image classification, with a focus on retinal fundus imagery. HOG captures local structural information by modeling gradient orientation distributions within spatial regions, effectively encoding texture and edge patterns. In contrast, TDA, implemented through cubical persistent homology, extracts global topological descriptors that characterize shape, connectivity, and intensity-based structure across images. We evaluate both approaches on the publicly available APTOS retinal fundus dataset for two classification tasks: binary classification (normal vs. diabetic retinopathy (DR)) and five-class DR severity grading. From each image, 26,244 HOG features and 800 TDA features are extracted and independently used to train seven classical machine learning models, including logistic regression, random forest, XGBoost, support vector machines, decision trees, k-nearest neighbors, and Extra Trees, using 10-fold cross-validation. Experimental results show that XGBoost achieves the best performance across both feature types. For binary classification, accuracies of 94.29% (HOG) and 94.18% (TDA) are obtained, while multi-class classification yields accuracies of 74.41% and 74.69%, respectively. These results demonstrate that gradient-based and topological features provide complementary representations of retinal image structure and highlight the potential of integrating both approaches for interpretable and robust medical image classification.
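
The abstract does not say how the persistence output is turned into 800 numbers. One common summary is a Betti curve, which records how many topological features exist at each intensity threshold. The sketch below illustrates the simplest case: the number of connected components (Betti-0) of the sublevel set {pixel <= t} of a grayscale image, counted by a direct flood fill rather than by a persistent homology library. The function names, the 4-connectivity convention (pixels treated as vertices), and the toy image are all assumptions made for illustration, not the authors' pipeline.

```python
from collections import deque


def betti0(image, threshold):
    """Count 4-connected components of the sublevel set {pixel <= threshold}."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] > threshold or seen[r][c]:
                continue
            # Unvisited pixel inside the sublevel set: a new component starts here.
            components += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] <= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return components


def betti0_curve(image, thresholds):
    """Betti-0 of the sublevel set at each threshold, as a fixed-length vector."""
    return [betti0(image, t) for t in thresholds]


# Toy 3 x 3 "image": four dark corners separated by a bright cross.
toy = [[0, 9, 0],
       [9, 9, 9],
       [0, 9, 5]]
print(betti0_curve(toy, [0, 5, 9]))  # [3, 4, 1]
```

In the toy image three corners appear at threshold 0, the fourth joins as a separate component at 5, and everything merges into one component at 9. Sampling such a curve at a fixed set of thresholds gives a vector whose length does not depend on image size, which is one way a topological descriptor can stay as compact as a few hundred features.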