arxiv_cs_lg April 24, 2026

Task-Stratified Knowledge Scaling Laws for Post-Training Quantized Large Language Models

llm, quantization, scaling-laws, post-training-quantization, large-language-models

Original Content

arXiv:2508.18609v4 Announce Type: replace-cross

Abstract: Post-Training Quantization (PTQ) is a critical strategy for the efficient deployment of Large Language Models (LLMs). However, existing scaling laws primarily focus on general performance, overlooking crucial fine-grained factors and how quantization differentially impacts diverse knowledge capabilities. To address this, we establish Task-Stratified Knowledge Scaling Laws. By stratifying capabilities into memorization, application, and reasoning, we develop a framework that unifies model size, bit-width, and two fine-grained factors: group size and calibration set size. Validated on 293 diverse PTQ configurations, our framework demonstrates strong fit and cross-architecture consistency. It reveals distinct sensitivities across knowledge capabilities: reasoning is precision-critical, application is scale-responsive, and memorization is calibration-sensitive. We highlight that in low-bit scenarios, optimizing these fine-grained factors is essential for preventing performance collapse. These findings provide an empirically backed foundation for designing knowledge-aware quantization strategies.
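To make the terms "bit-width" and "group size" in the abstract concrete, the following is a minimal sketch of generic round-to-nearest, min-max group-wise quantization applied to synthetic Gaussian weights. It is not the paper's method and does not reproduce its scaling law. The function names (`quantize_groupwise`, `mean_squared_error`), the synthetic weights, and the chosen bit-widths and group sizes are all illustrative assumptions. Calibration set size is not modeled, because plain round-to-nearest needs no calibration data.

```python
import random


def quantize_groupwise(weights, bits, group_size):
    """Quantize then dequantize a flat weight list, one group at a time.

    Generic asymmetric min-max round-to-nearest scheme (illustrative only):
    each group of `group_size` weights gets its own offset and scale.
    """
    levels = (1 << bits) - 1  # highest integer code, e.g. 15 for 4 bits
    dequantized = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # Guard against a constant group, where hi == lo.
        scale = (hi - lo) / levels if hi > lo else 1.0
        for w in group:
            code = round((w - lo) / scale)  # integer code in [0, levels]
            dequantized.append(lo + code * scale)
    return dequantized


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic stand-in for one weight tensor (an assumption, not real model data).
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

errors = {}
for bits in (2, 3, 4, 8):
    for group_size in (32, 128, 1024):
        restored = quantize_groupwise(weights, bits, group_size)
        errors[(bits, group_size)] = mean_squared_error(weights, restored)
        print(f"bits={bits} group_size={group_size} "
              f"mse={errors[(bits, group_size)]:.6f}")
```

In this toy setting, the reconstruction error shrinks as the bit-width grows and as groups get smaller, since each group's scale then covers a narrower range. How such error translates into task-level accuracy is what the paper's scaling laws address, and that is outside the scope of this sketch.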