arxiv_cs_cv April 20, 2026

Diffusion Autoencoder for Unsupervised Artifact Restoration in Handheld Fundus Images

Translated: 2026/4/20 10:42:02

diffusion-models, image-restoration, unsupervised-learning, ophthalmology, handheld-imaging

Original Content

arXiv:2604.15723v1 Announce Type: new

Abstract: The advent of handheld fundus imaging devices has made ophthalmologic diagnosis and disease screening more accessible, efficient, and cost-effective. However, images captured with these devices often suffer from artifacts such as flash reflections, exposure variations, and motion-induced blur, which degrade image quality and hinder downstream analysis. While generative models have been effective in image restoration, most depend on paired supervision or predefined artifact structures, making them less adaptable to the unstructured degradations commonly observed in handheld fundus images. To address this, we propose an unsupervised diffusion autoencoder that integrates a context encoder with the denoising process to learn semantically meaningful representations for artifact restoration. The model is trained only on high-quality table-top fundus images and, at inference, restores artifact-affected handheld acquisitions. We validate the restorations through quantitative and qualitative evaluations and show that diagnostic accuracy increases to 81.17% on an unseen dataset and under multiple artifact conditions.
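The abstract does not spell out the training objective, so the following is only a minimal sketch of how a context-conditioned denoising loss is commonly set up in diffusion autoencoders. Everything here is an assumption for illustration: the linear beta schedule, the function names, the flattened pixel list standing in for an image, and especially `context_encoder` and `toy_denoiser`, which are hand-written stand-ins for networks that would be learned in practice. It is not the authors' implementation.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, alpha_bar, eps):
    """Forward noising: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]


def context_encoder(x0):
    """Hypothetical stand-in for a learned context encoder: returns (mean, std)."""
    mean = sum(x0) / len(x0)
    var = sum((x - mean) ** 2 for x in x0) / len(x0)
    return (mean, math.sqrt(var))


def toy_denoiser(x_t, alpha_bar, z):
    """Hypothetical stand-in for eps_theta(x_t, t, z).

    It guesses that every clean pixel equals the encoded mean and solves
    the forward equation for the noise under that guess.
    """
    mean, _ = z
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - a * mean) / b for x in x_t]


def training_loss(x0, alpha_bar, rng):
    """Noise-prediction MSE on a clean image, conditioned on its own encoding."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = q_sample(x0, alpha_bar, eps)
    z = context_encoder(x0)  # the encoder only ever sees clean training images
    eps_hat = toy_denoiser(x_t, alpha_bar, z)
    return sum((e - h) ** 2 for e, h in zip(eps, eps_hat)) / len(x0)


def predict_x0(x_t, alpha_bar, eps_hat):
    """Invert the forward equation to estimate the clean image from predicted noise."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - b * e) / a for x, e in zip(x_t, eps_hat)]
```

In this toy setup the loss is zero only for a perfectly uniform image, because the stand-in encoder keeps just the mean and spread. A learned encoder and denoiser would be trained to drive the same loss down on real table-top images. How the trained model is then applied to artifact-affected handheld images is not detailed in the abstract and is left out of the sketch.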