arxiv_cs_ai April 24, 2026

Adversarial Evasion in Non-Stationary Malware Detection: Minimizing Drift Signals through Similarity-Constrained Perturbations

adversarial-attack, malware-detection, deep-learning, drift-detection, optimization

Original Content

arXiv:2604.21310v1 Announce Type: cross

Abstract: Deep learning has emerged as a powerful approach for malware detection, demonstrating impressive accuracy across various data representations. However, these models face critical limitations in real-world, non-stationary environments where both malware characteristics and detection systems continuously evolve. Our research investigates a fundamental security question: Can an attacker generate adversarial malware samples that simultaneously evade classification and remain inconspicuous to drift monitoring mechanisms? We propose a novel approach that generates targeted adversarial examples in the classifier's standardized feature space, augmented with similarity regularizers. By constraining perturbations to maintain distributional similarity with clean malware, we create an optimization objective that balances targeted misclassification against drift signal minimization. We quantify the effectiveness of this approach by comparing classifier output probabilities under multiple drift metrics. Our experiments demonstrate that similarity constraints can reduce output drift signals, with $\ell_2$ regularization showing the most promising results. We observe that the perturbation budget significantly influences the evasion-detectability trade-off: a larger budget yields higher attack success rates but also stronger drift signals.
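
The abstract does not state the optimization objective in closed form. As a rough illustration only, the Python sketch below assumes a logistic-regression stand-in for the deep classifier, an $\ell_\infty$ perturbation budget `eps`, a squared $\ell_2$ similarity penalty weighted by `lam`, and a two-sample Kolmogorov-Smirnov statistic on the output probabilities as the drift metric. All of these names and choices are hypothetical and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attack(x, w, b, eps, lam, steps=200, lr=0.1):
    """Targeted projected-gradient attack toward the benign class (label 0).

    x   : standardized malware features, shape (n, d)
    w, b: weights of a logistic-regression stand-in classifier (assumption)
    eps : l_inf perturbation budget (assumption)
    lam : weight of the squared l2 similarity penalty (assumption)
    """
    delta = np.zeros_like(x)
    for _ in range(steps):
        p = sigmoid((x + delta) @ w + b)  # P(malware) for each sample
        # The gradient of -log(1 - p) with respect to delta is p * w;
        # the penalty lam * ||delta||^2 contributes 2 * lam * delta.
        grad = p[:, None] * w[None, :] + 2.0 * lam * delta
        # Gradient step, then projection back onto the budget.
        delta = np.clip(delta - lr * grad, -eps, eps)
    return x + delta

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic between two 1-D samples."""
    grid = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), grid, side="right") / a.size
    cdf_b = np.searchsorted(np.sort(b), grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=5)
    b = 0.1
    # Synthetic "malware" features, shifted toward the malware side.
    x = rng.normal(size=(200, 5)) + 0.3 * w
    p_clean = sigmoid(x @ w + b)
    for eps in (0.25, 0.5, 1.0):
        for lam in (0.0, 1.0):
            p_adv = sigmoid(attack(x, w, b, eps, lam) @ w + b)
            evasion = float(np.mean(p_adv < 0.5))
            drift = ks_statistic(p_clean, p_adv)
            print(f"eps={eps} lam={lam} evasion={evasion:.2f} ks={drift:.2f}")
```

Sweeping `eps` and `lam` in this toy setting is one way to look at the evasion-detectability trade-off described in the abstract: the evasion rate and the output-drift statistic can be read side by side for each configuration. The numbers it prints say nothing about the paper's actual results.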