arxiv_cs_cv February 10, 2026

Sequential Attention-based Sampling for Histopathological Analysis

Translated: 2026/3/15 17:02:50

histopathology, deep-reinforcement-learning, multi-instance-learning, whole-slide-imaging, medical-image-analysis

Original Content

arXiv:2507.05077v4 Announce Type: replace-cross

Abstract: Deep neural networks are increasingly applied in automated histopathology. Yet whole-slide images (WSIs) are often acquired at gigapixel sizes, which makes them computationally infeasible to analyze entirely at high resolution. Diagnostic labels are largely available only at the slide level, because expert annotation of images at a finer (patch) level is both laborious and expensive. Moreover, regions with diagnostic information typically occupy only a small fraction of the WSI, making it inefficient to examine the entire slide at full resolution. Here, we propose SASHA (Sequential Attention-based Sampling for Histopathological Analysis), a deep reinforcement learning approach for efficient analysis of histopathological images. First, SASHA learns informative features with a lightweight, hierarchical, attention-based multiple instance learning (MIL) model. Second, SASHA samples intelligently and zooms selectively into a small fraction (10-20%) of high-resolution patches to achieve reliable diagnoses. We show that SASHA matches state-of-the-art methods that analyze the WSI fully at high resolution, at a fraction of their computational and memory costs. In addition, it significantly outperforms competing sparse-sampling methods. We propose SASHA as an intelligent sampling model for medical imaging challenges that involve automated diagnosis with exceptionally large images containing sparsely informative features. Model implementation is available at: https://github.com/coglabiisc/SASHA.
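
The two stages named in the abstract (attention-based MIL features, then selective zooming into 10-20% of patches) can be illustrated with a toy sketch. This is not the SASHA implementation, which lives in the repository linked above. It assumes generic attention pooling of the form a_i = softmax_i(w · tanh(V h_i)), and it swaps the learned reinforcement-learning policy for a simple greedy rule that zooms into the unvisited patch with the highest attention weight. All names (`attention_pool`, `sequential_zoom`, `budget_fraction`) and the random feature vectors are hypothetical and exist only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, V, w):
    """Attention pooling: a_i = softmax_i(w . tanh(V h_i)), z = sum_i a_i h_i."""
    scores = []
    for h in features:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wj * uj for wj, uj in zip(w, hidden)))
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(a * h[k] for a, h in zip(weights, features)) for k in range(dim)]
    return weights, pooled


def sequential_zoom(low_res, high_res, V, w, budget_fraction=0.2):
    """Greedy stand-in for a learned policy (an assumption, not the paper's agent).

    Starts from low-resolution features for every patch. At each step it zooms
    into the unvisited patch with the highest attention weight, replaces that
    patch's feature with its high-resolution counterpart, and re-scores.
    """
    n = len(low_res)
    budget = max(1, round(budget_fraction * n))
    state = [list(h) for h in low_res]
    visited = []
    for _ in range(budget):
        weights, _ = attention_pool(state, V, w)
        candidates = [i for i in range(n) if i not in visited]
        chosen = max(candidates, key=lambda i: weights[i])
        state[chosen] = list(high_res[chosen])
        visited.append(chosen)
    _, slide_embedding = attention_pool(state, V, w)
    return visited, slide_embedding


# Synthetic stand-ins for patch features; a real pipeline would use a feature extractor.
rng = random.Random(0)
n_patches, dim, hidden_dim = 50, 4, 3
low_res = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_patches)]
high_res = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_patches)]
V = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(hidden_dim)]
w = [rng.gauss(0, 1) for _ in range(hidden_dim)]

visited, slide_embedding = sequential_zoom(low_res, high_res, V, w, budget_fraction=0.2)
print(len(visited), "of", n_patches, "patches examined at high resolution")
```

With `budget_fraction=0.2`, only a fifth of the patches are ever replaced by high-resolution features, which is where the compute and memory savings described in the abstract would come from. A slide-level classifier would then operate on `slide_embedding`; that step is omitted here.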