Reinforcement Learning Driven Heuristic Optimization for Global Adsorption Structure Prediction

I. Introduction

Exploring the vast configuration space to identify global minimum structures is a formidable challenge in computational materials science, particularly when studying adsorbate binding motives on catalyst surfaces. Traditional molecular dynamics (MD) simulations often fall short, becoming trapped in local energy basins and failing to locate the true global minimum. Furthermore, uncertainty-driven training schemes for machine learning (ML) potentials can inadvertently prioritize high-energy configurations, hindering the discovery of crucial low-energy minima.

Drawing inspiration from the Gaussian Approximation Potential (GAP)-driven random structure search (RSS), this work introduces a novel global optimization protocol. This protocol, similar to GAP-driven RSS, avoids a priori assumptions about adsorbate geometry or binding sites. It concurrently conducts global structure searches and GAP fitting. An iterative candidate structure pool is generated through Minima Hopping (MH) and is then utilized to select new training samples, effectively refining the potential and guiding the search towards the global minimum. However, the specific demands of surface-adsorbate geometry searches necessitate a tailored workflow, as RSS is not ideally suited for preserving the molecular integrity of adsorbates.

This comprehensive protocol comprises three key stages. Initially, iterative GAP fitting is performed, employing a series of MH searches to generate training data. Upon convergence of the training process, an extensive MH production run is executed, leveraging parallel simulations that contribute to a shared pool of energy minima. From this pool, a diverse set of promising structures is selected using Kernel Principal Component Analysis (PCA) and k-means clustering. These selected structures then undergo local relaxation at the Density Functional Theory (DFT) level, ensuring high accuracy for the final candidates. The complete workflow is visually depicted in Fig. 1, with each step elaborated upon in the subsequent sections.

Fig. 1: Workflow overview. a Schematic diagram illustrating the global optimization workflow. Colors indicate the application of different geometric constraints. The diagram highlights the heuristic approach to sampling five structures throughout the workflow. b During minima hopping, Hookean constraints are applied to adsorbate bonds, and the metal slab is fixed (blue). In DFT relaxations, only the bottom two layers of the slab remain fixed (orange). c A representative trajectory from a minima hopping run, demonstrating the sampling strategy that involves selecting the global minimum, two random local minima, and two random Molecular Dynamics snapshots.

Full size image

II. Iterative Training: Reinforcing Potential Accuracy

The workflow begins by accepting a SMILES string representing the adsorbate and the relaxed geometry of the clean surface as inputs. An initial gas-phase geometry of the adsorbate is generated using the Merck Molecular Force Field (MMFF) implemented in RDkit. It is important to acknowledge that empirical force fields like MMFF may not accurately represent (poly-)radical adsorbates common in heterogeneous catalysis. Nevertheless, the generated geometries are sufficient for our purposes. They prevent unrealistic atomic contacts, maintain the adsorbate’s bonding structure, and provide reasonable bond lengths. These bond lengths are then used to define Hookean constraints, ensuring the molecular identity of the adsorbate is preserved throughout the simulation. Furthermore, during all GAP simulations, metal surface atoms are constrained, while only the lower layers of the surface are constrained during the final DFT relaxation (refer to Fig. 1b).

The initial geometry is randomly placed onto the catalyst surface. DFT is then used to calculate the energy and forces of the complete system. This single configuration forms the initial training set for the GAP. As expected, the initial potential quality is low, resulting in physically implausible structures during the first MH run (see Fig. 2). While these structures are not directly useful for global optimization, they are crucial for significantly improving the GAP by identifying high-energy regions of the potential energy surface (PES). Subsequent iterations initiate with newly randomized starting structures, further assisting MH runs in exploring diverse regions of the PES.

Fig. 2: Putative global minimum configurations of CH3CHOH on Rh(111). Visual comparison of the global minimum configurations at iteration 0 of the iterative training process (left), with the converged GAP (center), and the final minimum structure after DFT relaxation (right), demonstrating the improvement in structural accuracy through iterative training.

Full size image

Based on the generated structures, a limited number of DFT calculations are performed. These calculations serve two purposes: validating the GAP model’s accuracy and generating new training data for the next iteration. A key design choice in this workflow is the selection of structures to be added to the training set. The primary objective is to discover the global minimum geometry for a given adsorbate-surface combination. Therefore, the GAP must accurately predict the geometries of all local minima and reliably rank their relative stabilities. Consequently, minimum structures are essential components of the training set. However, relying solely on putative minima would omit information about energy barriers on the PES and potentially lead to numerical instabilities during the MD runs required for MH (see Fig. 1c).

To establish a robust data selection procedure, several strategies were evaluated, assuming a fixed budget of five DFT calculations per iteration. A baseline approach involved randomly selecting five systems from all configurations generated by an MH run, termed full random. Since MD snapshots typically outnumber relaxed geometries, full random selection inherently favors the former. To address this imbalance, a stratified random approach was tested. In this method, two configurations each were randomly selected from MD snapshots and local minima, while the global minimum from each iteration was always included in the training set. This stratification reflects the anticipated importance of each configuration type within the training dataset.

While random selection provides a strong baseline, many active learning methods employ more sophisticated selection criteria, such as uncertainty estimation or farthest-point heuristics. Therefore, selection schemes using the Farthest Point Sampling (FPS) method were also tested. Analogous to the random approaches, these are termed full FPS and stratified FPS.

All four selection methods were tested across eighteen surface/adsorbate combinations (details in SI). The results indicated that stratification is crucial for achieving convergence of energy and force errors on the minima within a reasonable number of iterations (under 30). The difference between FPS and random selection was minimal, with random schemes slightly outperforming FPS overall. This aligns with previous findings suggesting that FPS is most beneficial when starting with larger datasets, as its emphasis on outliers can be problematic for very small training sets. Based on this comparative analysis, the stratified random approach was chosen for subsequent steps.

III. Model Convergence: Ensuring Reliable Predictions

Beyond providing training data, DFT calculations are also used to estimate the out-of-sample error of the current GAP. Ideally, iterative training concludes when the predicted energies and forces for minima reach sufficient accuracy. However, the limited number of calculations per iteration means that the Root Mean Square Deviation (RMSD) relative to DFT references offers only a noisy approximation of the true out-of-sample error. As illustrated in Fig. 3 for CH2CO on Rh(211), the RMSD fluctuates significantly across iterations. Notably, the RMSD can be very low in certain iterations and then spike, demonstrating its unreliability as a convergence measure.

To overcome this limitation, the Exponential Moving Average (EMA) of the RMSD is used to assess training procedure convergence:

$${{{rm{EMA}}}}(i)=(1-alpha ){{{rm{EMA}}}}(i-1)+alpha {{{rm{RMSD}}}}(i)$$

Here, the hyperparameter α controls the decay rate of previous RMSD weights in the average. Setting α = 1.0 recovers the current RMSD at each iteration i. As shown in Fig. 3, the EMA exhibits a slower, smoother decay than the RMSD. A value of 0.3 for α was used throughout the workflow, striking a balance between providing a conservative error estimate and avoiding excessive iterations. GAP convergence is determined when the EMA falls below 8 meV/atom for energy and 0.15 eV Å−1 for forces. These criteria are empirically determined to provide a good balance between data efficiency and accuracy for the systems investigated.

Fig. 3: Workflow convergence. The graph displays the GAP convergence trend for energy (top) and force (bottom) root mean squared deviations (RMSD) for CH2CO on Rh(211). The shaded region indicates the convergence criteria: RMSD(E) < 8 meV/atom and RMSD(F) < 0.15 eV Å−1. The red line represents the RMSD, while the blue line shows the exponential moving average (EMA) of the RMSD, illustrating the EMA’s smoother convergence behavior.

Full size image

IV. Parallel Minima Hopping: Efficient Global Search

Upon convergence, the training process yields a substantial set of potential minimum structures. However, the quality of these structures is lower in the initial iterations. Therefore, the converged GAP model is used in an extensive MH production run. A parallel MH scheme is employed, where multiple independent MH simulations are initiated. These simulations explore different regions of the PES concurrently, sharing a common history of visited minima. This approach mitigates the risk of a single MH run becoming trapped in a PES region far from the global minimum due to high energy barriers. Initial structures are diversified through random rotations of the rigid molecule around its center of mass and random translations across the metal surface. Forty parallel processes are executed.

Determining when to terminate the parallel MH run is challenging because the global minimum structure is generally unknown a priori. Previous work used MD simulation temperature as a stopping criterion; the temperature increases by a specific factor whenever known minima are revisited. This strategy is also adopted here, with each process performing an independent simulation with its own temperature. Termination occurs when the temperature reaches twice the initial temperature (4000 K) or a maximum iteration count (80) is reached. Parallel MH significantly accelerates convergence, as multiple processes often converge on overlapping PES regions, rediscovering minima previously identified by nearby walkers.

V. Candidate Selection and DFT Refinement: High-Fidelity Minima

The parallel MH production run aims for comprehensive exploration of adsorbate conformer and binding site space, resulting in a large collection of minimum adsorption structures. However, these minima are on the GAP PES and subject to Hookean constraints on adsorbate geometry. The ultimate goal is to obtain unconstrained minima on the DFT PES. Therefore, a subset of filtered minima (ten structures in this study) is further refined at the DFT level using the Bayesian Error Estimation Functional with a non-local van-der-Waals correction (BEEF-vdW).

A naive approach would be to select the ten lowest energy structures from the GAP ensemble, assuming the global minimum is sought. However, this often leads to a set of minima with very similar geometries, representing limited structural diversity. Furthermore, molecular dissociation can occur during DFT relaxation, potentially because a candidate structure is only a minimum on the constrained PES or due to inconsistencies between the GAP and DFT PES.

To address these issues, structural diversity within the candidate ensemble is assessed before selecting configurations for refinement. Structures from the conformer ensemble are mapped into a 2D space using Kernel PCA, with the averaged SOAP vector representing each system. This allows visualization of structural similarity distribution (see Fig. 4). With a fixed computational budget of ten DFT relaxations, k-means clustering is applied to divide the Kernel PCA space into k = 10 distinct regions. Finally, the structure with the lowest formation energy, E_form (according to GAP), from each cluster is selected as a representative candidate for DFT relaxation.

Fig. 4: Maps of adsorbate configurations. Kernel PCA representation of the adsorbate ensemble for CH3CHOH on Rh(211). The top-left panel highlights clusters obtained via k-means and the lowest E_form structures within each cluster. The bottom panel displays five of the selected geometries. The right panel shows Kernel PCA representations colored by E_form, illustrating the energy distribution within the structural space.

Full size image

VI. Application: Molecular Adsorbates on Rhodium Catalysts

To demonstrate the applicability of this workflow in heterogeneous catalysis, a set of thirteen small to mid-sized adsorbates on Rh(111) and five on Rh(211) were studied. These adsorbates were previously investigated by Yang et al. in their study of CO hydrogenation selectivity on Rh. Yang et al. also used MH simulations to identify putative global minima, employing approximate energies and forces from a custom density functional tight binding model (DFTB) implemented in Hotbit. This provides a benchmark to evaluate the accuracy of the present workflow for complex adsorbates, using identical computational settings as in Yang et al.

Formation energies obtained from this workflow are compared to those of Yang et al. in Fig. 5. The global minima predicted by the proposed workflow generally exhibit comparable or lower formation energies than previously reported, with only two exceptions where global minima from Yang et al. are marginally lower. This indicates the GAP potential and MH search are effective in predicting useful starting points for DFT relaxation. The largest energy difference was observed for CH3CHOH on Rh(211), where a configuration 0.26 eV lower in energy than the previous global minimum was identified. Such energy differences can significantly impact catalytic activity.

Fig. 5: Relaxed adsorbate configurations. Upper panel: Comparison of global minima from ref. 5 (blue) and this work (red) for Rh(111) (left) and Rh(211) (right). Faint red bars indicate additional local minima discovered by the present workflow, highlighting its comprehensive search capability. Lower panel: Comparison of adsorption structures for three example cases with the largest ΔE_form compared to ref. 5. ΔE_form represents the difference in formation energies between putative global minima from ref. 5 and this work in eV, showcasing the energy improvements achieved by the new workflow.

Full size image

Analyzing the discrepancies in more detail (Fig. 5, bottom) reveals that large energy differences can arise from different adsorption sites, as seen for CHCO on Rh(211) and CH2CO on Rh(111). In other cases, adsorbate orientation and conformation on the site are critical, as demonstrated by CH3CHOH on Rh(211). Both configurations are attached at similar sites but exhibit different orientations. The GAP/MH workflow excels in this scenario, simultaneously exploring binding sites and molecular conformations, potentially surpassing discrete graph-based algorithms.

Both this workflow and Yang et al.’s work combine MH on an approximate PES with local DFT relaxations. However, this workflow yields a more extensive set of minima, consistently including low-lying minima and, in some instances, identifying significantly more stable configurations. These improvements likely stem from the use of a DFTB model (fitted to BEEF+vdW energies) for the MH search in ref. 5. Ref. 5 acknowledged the limited accuracy of their DFTB model, using it primarily as a structure generator. The current results suggest that better agreement between the approximate PES used for MH and the target PES is beneficial for reliable prediction of low-energy adsorbate geometries, even with final DFT refinement.

VII. Computational Cost: Efficiency of the Workflow

To contextualize the computational advantages of this workflow, timing data for a representative system (CH2CO on Rh(211)) is briefly discussed (see SI). DFT single-point calculations for training set generation (15%) and final DFT relaxations (81%) dominate the core-hour costs. GAP model fitting and MH simulations are computationally negligible. Single-point calculation cost depends on PES complexity and is system-dependent. CH2CO on Rh(211) represents a worst-case scenario among studied adsorbates, requiring 26 iterations for convergence. Simpler adsorbates like H2O and CH3 required only 15-16 iterations. Even for complex adsorbates like CH2CO, the entire workflow completes in under 8000 core-hours (on a 40-core Intel Skylake 6148 node). Given that local DFT relaxations consume the majority of this time, this workflow achieves comprehensive global optimization at a computational cost comparable to local optimization. Performing a full parallel MH run at the DFT level would be approximately 155 times more expensive.

Reinforcement Learning Driven Heuristic Optimization for Global Adsorption Structure Prediction

I. Introduction

II. Iterative Training: Reinforcing Potential Accuracy

III. Model Convergence: Ensuring Reliable Predictions

IV. Parallel Minima Hopping: Efficient Global Search

V. Candidate Selection and DFT Refinement: High-Fidelity Minima

VI. Application: Molecular Adsorbates on Rhodium Catalysts

VII. Computational Cost: Efficiency of the Workflow

Comments

Leave a Reply Cancel reply