arxiv_cs_cv February 10, 2026

SegQuant: A Semantics-Aware and Generalizable Quantization Framework for Diffusion Models

Translated: 2026/3/15 8:02:19

segquant, diffusion-models, quantization, ptq, generative-ai

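Translation

arXiv:2507.14811v5 Announce Type: replace Abstract: Diffusion models show outstanding generative capabilities, but their high computational cost poses serious challenges for deployment in resource-constrained or low-latency environments. Quantization is an effective way to reduce model size and computational cost, and post-training quantization (PTQ) is particularly attractive because it remains compatible with pre-trained models while requiring neither retraining nor training data. However, existing PTQ methods for diffusion models carry architecture-specific biases, so they generalize poorly and are hard to integrate into industrial deployment pipelines. To address these limitations, this paper proposes SegQuant, a unified quantization framework that adaptively combines complementary techniques to improve cross-model versatility. SegQuant consists of a segment-aware, graph-based quantization strategy (SegLinear) that captures structural semantics and spatial heterogeneity, and a dual-scale quantization scheme (DualScale) that preserves polarity-asymmetric activations, which are essential for maintaining the visual fidelity of generated outputs. SegQuant is broadly applicable beyond Transformer-based diffusion models, delivering strong performance while ensuring seamless compatibility with mainstream deployment tools.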

Original Content

arXiv:2507.14811v5
Announce Type: replace

Abstract: Diffusion models have demonstrated exceptional generative capabilities but are computationally intensive, posing significant challenges for deployment in resource-constrained or latency-sensitive environments. Quantization offers an effective means to reduce model size and computational cost, with post-training quantization (PTQ) being particularly appealing due to its compatibility with pre-trained models without requiring retraining or training data. However, existing PTQ methods for diffusion models often rely on architecture-specific heuristics that limit their generalizability and hinder integration with industrial deployment pipelines. To address these limitations, we propose SegQuant, a unified quantization framework that adaptively combines complementary techniques to enhance cross-model versatility. SegQuant consists of a segment-aware, graph-based quantization strategy (SegLinear) that captures structural semantics and spatial heterogeneity, along with a dual-scale quantization scheme (DualScale) that preserves polarity-asymmetric activations, which is crucial for maintaining visual fidelity in generated outputs. SegQuant is broadly applicable beyond Transformer-based diffusion models, achieving strong performance while ensuring seamless compatibility with mainstream deployment tools.
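
The abstract does not spell out how DualScale works, so the following is only a minimal sketch of one plausible reading of "polarity-asymmetric activations": after an activation such as SiLU, negative values occupy a much narrower range than positive ones, and a single symmetric scale leaves them with very few quantization levels. The function names (`quantize_single_scale`, `quantize_dual_scale`), the choice of SiLU, and the 8-bit setting are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def silu(x):
    """SiLU activation: bounded below near -0.28, unbounded above."""
    return x / (1.0 + np.exp(-x))

def quantize_single_scale(x, n_bits=8):
    """Symmetric fake-quantization with one scale for the whole tensor."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale  # dequantized values

def quantize_dual_scale(x, n_bits=8):
    """Hypothetical dual-scale variant: separate scales per polarity.

    Assumption: the negative and non-negative parts are quantized
    independently and summed after dequantization.
    """
    qmax = 2 ** (n_bits - 1) - 1
    neg = np.minimum(x, 0.0)  # zero wherever x is non-negative
    pos = np.maximum(x, 0.0)  # zero wherever x is negative
    neg_scale = max(np.max(-neg), 1e-12) / qmax
    pos_scale = max(np.max(pos), 1e-12) / qmax
    neg_dq = np.clip(np.round(neg / neg_scale), -qmax, 0) * neg_scale
    pos_dq = np.clip(np.round(pos / pos_scale), 0, qmax) * pos_scale
    return neg_dq + pos_dq

# Synthetic activations standing in for a real layer's outputs.
rng = np.random.default_rng(0)
acts = silu(rng.normal(0.0, 3.0, size=10_000))

err_single = np.mean((acts - quantize_single_scale(acts)) ** 2)
err_dual = np.mean((acts - quantize_dual_scale(acts)) ** 2)
print(f"single-scale MSE: {err_single:.3e}")
print(f"dual-scale MSE:   {err_dual:.3e}")
```

In this toy setting the positive side is quantized identically by both functions, so any gain comes entirely from giving the narrow negative range its own scale. A deployable version would also have to handle the two scales inside the subsequent matrix multiplication, which this sketch does not attempt.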