arxiv_cs_lg April 24, 2026

Improving Molecular Force Fields with Minimal Temporal Information

molecular-dynamics, force-fields, neural-networks, ai-for-science, machine-learning

Original Content

arXiv:2604.19806v1 Announce Type: cross Abstract: Accurate prediction of energy and forces for 3D molecular systems is one of the fundamental challenges at the core of AI for Science applications. Many powerful and data-efficient neural networks predict molecular energies and forces from single atomic configurations. However, one crucial aspect of the data generation process is rarely considered when training these models, namely Molecular Dynamics (MD) simulation. MD simulations generate time-ordered trajectories of atomic positions that fluctuate in energy and explore regions of the potential energy surface (e.g., under standard NVE/NVT ensembles), rather than being constructed to steadily lower the potential energy toward a minimum as in geometry relaxations. This work explores a novel way to leverage MD data, when available, to improve the performance of such predictors. We introduce a novel training strategy called FRAMES, which uses an auxiliary loss function to exploit the temporal relationships within MD trajectories. Counter-intuitively, on two atomistic benchmarks and a synthetic system, we observe that minimal temporal information, captured by pairs of just two consecutive frames, is often sufficient to obtain the best performance, while adding longer trajectory sequences can introduce redundancy and degrade performance. On the widely used MD17 and ISO17 benchmarks, FRAMES significantly outperforms its Equiformer baseline, achieving highly competitive results in both energy and force accuracy. Our work not only presents a novel training strategy that improves the accuracy of the model, but also provides evidence that, for distilling physical priors of atomic systems, more temporal data is not always better.
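The abstract does not spell out the form of the FRAMES auxiliary loss, so the following is only an illustrative sketch of how a loss over pairs of consecutive frames could look, not the authors' method. It assumes a hypothetical consistency term built on the standard relation F = -∇E: the predicted energy change between two adjacent frames should match the work estimated from the predicted forces (trapezoidal rule). The names `consecutive_pairs`, `temporal_consistency_loss`, `harmonic_energy`, and `harmonic_force` are invented for this example, and a toy harmonic well stands in for a learned model.

```python
from typing import Callable, List, Sequence, Tuple

Frame = List[float]  # flattened atomic coordinates of one MD frame


def consecutive_pairs(trajectory: Sequence[Frame]) -> List[Tuple[Frame, Frame]]:
    """Pair each frame with its immediate successor in a time-ordered trajectory."""
    return list(zip(trajectory[:-1], trajectory[1:]))


def temporal_consistency_loss(
    energy_fn: Callable[[Frame], float],
    force_fn: Callable[[Frame], Frame],
    pairs: Sequence[Tuple[Frame, Frame]],
) -> float:
    """Hypothetical auxiliary loss (an assumption, not the FRAMES loss).

    Penalises the squared mismatch between the predicted energy change and
    the trapezoidal estimate -0.5 * (F0 + F1) . (x1 - x0) of that change.
    """
    total = 0.0
    for x0, x1 in pairs:
        f0, f1 = force_fn(x0), force_fn(x1)
        # Work done by the mean predicted force along the displacement x1 - x0.
        work = sum(0.5 * (a + b) * (q - p) for a, b, p, q in zip(f0, f1, x0, x1))
        delta_e = energy_fn(x1) - energy_fn(x0)
        # Since F = -grad E, a consistent model has delta_e close to -work.
        total += (delta_e + work) ** 2
    return total / len(pairs)


# Toy stand-in for a learned model: harmonic well, E = 0.5 * |x|^2 and F = -x.
def harmonic_energy(x: Frame) -> float:
    return 0.5 * sum(c * c for c in x)


def harmonic_force(x: Frame) -> Frame:
    return [-c for c in x]


trajectory = [[0.0, 1.0], [0.1, 0.9], [0.2, 0.7]]  # three made-up frames
pairs = consecutive_pairs(trajectory)
aux = temporal_consistency_loss(harmonic_energy, harmonic_force, pairs)
```

For the harmonic toy model the trapezoidal estimate is exact, so `aux` vanishes up to floating-point rounding; a model whose forces disagree with its energies gives a positive value. In a training loop, a term of this kind would be added with some weight to the usual energy and force regression losses. The weight and the exact form of the term are design choices that the abstract leaves open.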