arxiv_cs_lg April 24, 2026

Multi-Objective Reinforcement Learning for Generating Covalent Inhibitor Candidates

Translated: 2026/4/24 19:55:21

reinforcement-learning, drug-discovery, covalent-inhibitors, machine-learning, chemistry

Original Content

arXiv:2604.20019v1 Announce Type: new

Abstract: Rational design of covalent inhibitors requires simultaneously optimizing multiple properties, such as binding affinity, target selectivity, or electrophilic reactivity. This presents a multi-objective problem not easily addressed by screening alone. Here we present a machine learning pipeline for generating covalent inhibitor candidates using multi-objective reinforcement learning (RL), applied to two targets: epidermal growth factor receptor (EGFR) and acetylcholinesterase (ACHE). A SMILES-based pretrained LSTM serves as the generative model, optimized via policy gradient RL with Pareto crowding distance to balance competing scoring functions including synthetic accessibility, predicted covalent activity, residue affinity, and an approximated docking score. The pipeline rediscovers known covalent inhibitors at rates of up to 0.50% (EGFR) and 0.74% (ACHE) in 10,000-structure runs, with candidate structures achieving warhead-to-residue distances as short as 5.5 Å (EGFR) and 3.2 Å (ACHE) after further docking-based screening. More notably, the pipeline spontaneously generates structures bearing warhead motifs absent from the training data, including allenes, 3-oxo-$\beta$-sultams, and $\alpha$-methylene-$\beta$-lactones, all of which have independent literature support as covalent warheads. These results suggest that RL-guided generation can explore covalent chemical space beyond its training distribution, and may be useful as a tool for medicinal chemists working on covalent drug discovery.
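
The abstract names Pareto crowding distance as the mechanism for balancing the competing scoring functions, but does not say how it enters the RL reward. As a rough illustration only, the sketch below computes a Pareto front and the crowding distance as usually defined in NSGA-II, assuming every objective is to be maximized. The function names and the two-objective score vectors are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def crowding_distance(scores: List[Sequence[float]]) -> List[float]:
    """NSGA-II style crowding distance; larger means a less crowded region."""
    n = len(scores)
    if n == 0:
        return []
    dist = [0.0] * n
    for k in range(len(scores[0])):
        # Sort candidates along objective k.
        order = sorted(range(n), key=lambda i: scores[i][k])
        lo, hi = scores[order[0]][k], scores[order[-1]][k]
        # Boundary candidates are always kept, so they get infinite distance.
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue  # objective k does not separate the candidates
        for pos in range(1, n - 1):
            gap = scores[order[pos + 1]][k] - scores[order[pos - 1]][k]
            dist[order[pos]] += gap / (hi - lo)
    return dist


# Hypothetical (objective_1, objective_2) scores for five generated molecules.
scores = [(0.9, 0.1), (0.6, 0.5), (0.5, 0.6), (0.1, 0.9), (0.4, 0.4)]
front = pareto_front(scores)
distances = crowding_distance([scores[i] for i in front])
```

One plausible use, again an assumption rather than the paper's stated method, is to give candidates on the front a reward bonus that grows with their crowding distance, so the policy is nudged toward sparsely populated trade-offs instead of collapsing onto a single objective.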