arxiv_cs_lg February 10, 2026

Active Learning Using Aggregated Acquisition Functions: Accuracy and Sustainability Analysis

active-learning, machine-learning, acquisition-functions, energy-efficiency, sampling-strategy

Original Content

arXiv:2602.07440v1 Announce Type: new

Abstract: Active learning (AL) is a machine learning (ML) approach that strategically selects the most informative samples for annotation during training, aiming to minimize annotation costs. This strategy not only reduces labeling expenses but also results in energy savings during neural network training, thereby enhancing both data and energy efficiency. In this paper, we implement and evaluate various state-of-the-art acquisition functions, analyzing their accuracy and computational costs, while discussing the advantages and disadvantages of each method. Our findings reveal that representativity-based acquisition functions effectively explore the dataset but do not prioritize boundary decisions, whereas uncertainty-based acquisition functions focus on refining boundary decisions already identified by the neural network. This trade-off is known as the exploration-exploitation dilemma. To address this dilemma, we introduce six aggregation structures: series, parallel, hybrid, adaptive feedback, random exploration, and annealing exploration. Our aggregated acquisition functions alleviate common AL pathologies such as batch mode inefficiency and the cold start problem. Additionally, we focus on balancing accuracy and energy consumption, contributing to the development of more sustainable, energy-aware artificial intelligence (AI). We evaluate our proposed structures on various models and datasets. Our results demonstrate the potential of these structures to reduce computational costs while maintaining or even improving accuracy. Innovative aggregation approaches, such as alternating between the acquisition functions BALD and BADGE, have shown robust results. Sequentially running functions like $K$-Centers followed by BALD has achieved the same performance goals with up to 12% fewer samples, while reducing the acquisition cost by almost half.
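The abstract names a series structure ($K$-Centers followed by BALD) but does not say how it is implemented. The following is a minimal sketch under stated assumptions, not the authors' code: a greedy $K$-Centers pass first narrows the unlabeled pool to a diverse candidate set, and BALD then ranks only those candidates. The function names (`k_center_greedy`, `bald_scores`, `series_select`), the parameters `pool_size` and `batch_size`, and the use of Monte Carlo class probabilities as input are all hypothetical choices made for illustration.

```python
import numpy as np


def k_center_greedy(features, n_select, rng):
    """Greedy k-center: repeatedly pick the point farthest from the chosen set."""
    n = features.shape[0]
    n_select = min(n_select, n)
    first = int(rng.integers(n))
    selected = [first]
    # Distance from every point to its nearest selected center.
    min_dist = np.linalg.norm(features - features[first], axis=1)
    min_dist[first] = -np.inf  # never pick the same point twice
    while len(selected) < n_select:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist)
        min_dist[nxt] = -np.inf
    return np.array(selected)


def bald_scores(mc_probs):
    """BALD = entropy of the mean prediction minus the mean per-sample entropy.

    mc_probs has shape (T, N, C): class probabilities from T stochastic
    forward passes (for example MC dropout) over N points and C classes.
    """
    eps = 1e-12
    mean = mc_probs.mean(axis=0)
    predictive_entropy = -(mean * np.log(mean + eps)).sum(axis=1)
    expected_entropy = -(mc_probs * np.log(mc_probs + eps)).sum(axis=2).mean(axis=0)
    return predictive_entropy - expected_entropy


def series_select(features, mc_probs, pool_size, batch_size, seed=0):
    """Series aggregation (assumed form): K-Centers pre-filter, then BALD ranking."""
    rng = np.random.default_rng(seed)
    candidates = k_center_greedy(features, pool_size, rng)
    # Score only the diverse candidates, not the whole unlabeled pool.
    scores = bald_scores(mc_probs[:, candidates, :])
    top = np.argsort(scores)[::-1][:batch_size]
    return candidates[top]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    feats = rng.normal(size=(200, 16))                 # stand-in embeddings
    probs = rng.dirichlet(np.ones(10), size=(8, 200))  # stand-in MC predictions
    print(series_select(feats, probs, pool_size=40, batch_size=10))
```

In this sketch the probabilities for the whole pool are passed in for simplicity. In a real pipeline the stochastic forward passes would only need to run on the `pool_size` candidates, which is one plausible source of the reduced acquisition cost the abstract reports, though the paper itself should be consulted for the actual mechanism.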