Structure of an Ultrathin Oxide on Pt3Sn(111) Solved by Machine Learning Enhanced Global Optimization

Abstract: Determination of the atomic structure of solid surfaces typically depends on comparison of measured properties with simulations based on hypothesized structural models. For simple structures, the models may be guessed, but for more complex structures there is a need for reliable theory-based search algorithms. So far, such methods have been limited by the combinatorial complexity and computational expense of sufficiently accurate energy estimation for surfaces. However, the introduction of machine learning methods has the potential to change this radically. Here, we demonstrate how an evolutionary algorithm, utilizing machine learning for accelerated energy estimation and diverse population generation, can be used to solve an unknown surface structure, the (4×4) surface oxide on Pt3Sn(111), based on limited experimental input. The algorithm is efficient and robust, and should be broadly applicable in surface studies, where it can replace manual, intuition-based model generation.


Content included: Methods, Figs. S1 to S4, Table S1

Methods

GOFEE details
For the evolutionary structure search, we employed the GOFEE method, which is detailed in Ref. 31. The machine-learned energy landscape was constructed based on the global fingerprint feature from Oganov and Valle. 50 A Gaussian process regression model was built utilizing the same kernel as in Ref. 31. The kernel has two squared exponential terms with different characteristic length scales, whose values, together with that of the maximal covariance, were found via optimization of the marginal likelihood.
In the present work, a sample of prior DFT structures replaces the population used in the original work. 31 The sample is constructed with the k-means++ clustering method, using the Euclidean distance between the global fingerprint features as the distance measure between structures. All DFT structures with energies within $\Delta E_\mathrm{sample}$ of the most stable structure found so far are used in the clustering. We use $\Delta E_\mathrm{sample} = 5$ eV, as structures of larger energy are not expected to represent interesting regions of configuration space. The sample size, $N_\mathrm{sample}$, was chosen to be 10 for the surface oxide structures, as this proved efficient. Previous use of the GOFEE method has been successful with a similar size for the population. For illustrative purposes, $N_\mathrm{sample} = 5$ is used in Fig. 1. Introducing the present k-means based sampling method in place of the original evolving population method eliminates the parameter $k_\mathrm{max}$, which decides how different a new population member must be from existing population members to be adopted. Figure S1 displays a comparison of the two methods, showing that the k-means based method is both faster and more reliable than the population based method (for two different choices of $k_\mathrm{max}$). Figure S2 illustrates the actual composition of a sample from one of the conducted GOFEE searches for Sn11O12.
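The sampling step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`kmeans_pp_init`, `kmeans`, `select_sample`) are hypothetical, the fingerprint features are assumed to be precomputed as plain vectors, and the choice of the lowest-energy member of each cluster follows the description in the caption of Figure S2.

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    """k-means++ seeding: each new center is drawn with probability
    proportional to the squared distance to the nearest existing center."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        diff = X[:, None, :] - np.array(centers)[None, :, :]
        d2 = (diff ** 2).sum(axis=-1).min(axis=1)
        total = d2.sum()
        if total == 0.0:
            # All points coincide with existing centers; fall back to uniform.
            idx = rng.integers(len(X))
        else:
            idx = rng.choice(len(X), p=d2 / total)
        centers.append(X[idx])
    return np.array(centers)

def kmeans(X, k, rng, n_iter=100):
    """Plain Lloyd iterations with Euclidean distance; returns cluster labels."""
    centers = kmeans_pp_init(X, k, rng)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def select_sample(features, energies, n_sample=10, delta_e=5.0, seed=0):
    """Return indices of the sampled structures, sorted by energy.

    Only structures within delta_e (eV) of the lowest energy are clustered;
    the lowest-energy member of each non-empty cluster is kept.
    """
    features = np.asarray(features, dtype=float)
    energies = np.asarray(energies, dtype=float)
    rng = np.random.default_rng(seed)
    keep = np.flatnonzero(energies <= energies.min() + delta_e)
    k = min(n_sample, len(keep))
    labels = kmeans(features[keep], k, rng)
    sample = []
    for j in range(k):
        members = keep[labels == j]
        if len(members) > 0:
            sample.append(int(members[np.argmin(energies[members])]))
    return sorted(sample, key=lambda i: energies[i])
```

Because the lowest-energy structure is always the best member of whichever cluster it falls into, it is always part of the sample in this sketch, while structures outside the energy window never are.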
Figure S1: Comparison of k-means based sampling with an evolving population. The search for the Sn11O12 structure was restarted 300 times using either the new k-means based sampling or the original evolving population. For the latter, two different criteria for how similar a structure may be to other population members were used, allowing a maximum kernel element of either 0.99 or 0.999 between any two structures. Success is considered achieved when a structure has a total energy within 0.2 eV of that of the global minimum (GM) energy structure. The shaded regions represent 95% confidence intervals. The new sample-based method is both faster and more reliable. For instance, ∼67% of the independent restarts find the GM in less than ∼600 episodes, while for the evolving population based approach, ∼800 episodes are needed for the same fidelity. Similarly, the new method achieves 86% success after 1000 episodes, while the old method achieves just short of 75% success. The structure searches were conducted with GOFEE implemented in a modified code base. They used 24 candidates and one DFT calculation per episode. The DFT calculations were sped up by using 1 k-point and a 300 eV energy cutoff for the plane waves.

In each episode, 200 new independent candidates were constructed via rattling of the atoms in the sampled structures. From these, the best candidate according to $E_\mathrm{LCB} = E - \kappa\sigma$, where $E$ and $\sigma$ are the model energy and uncertainty, respectively, was chosen. For the constant $\kappa$, we used the value 2.
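The lower-confidence-bound selection can be illustrated with a short sketch. The function name `select_by_lcb` and the array inputs are assumptions made for illustration; in practice the model energies and uncertainties would come from the Gaussian process regression model.

```python
import numpy as np

def select_by_lcb(model_energies, model_sigmas, kappa=2.0):
    """Pick the candidate minimizing E_LCB = E - kappa * sigma.

    Returns the index of the selected candidate and the LCB values.
    """
    e = np.asarray(model_energies, dtype=float)
    s = np.asarray(model_sigmas, dtype=float)
    lcb = e - kappa * s
    return int(np.argmin(lcb)), lcb

# Three hypothetical candidates: the second has a higher predicted energy
# than the first but a larger uncertainty, so it wins under the LCB.
best, lcb = select_by_lcb([-10.0, -9.5, -9.0], [0.1, 0.5, 0.2])
```

A larger $\kappa$ weights the uncertainty more heavily and therefore favors exploration of poorly sampled regions over exploitation of the current model minimum.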
In the DFT evaluation step of GOFEE, we employed the original double-step procedure, where two single-point DFT calculations are performed. First, one is done for the candidate structure as it emerges from the acquisition. Next, another single-point DFT calculation is done for the structure modified by $\vec{F}\,\Delta x$, where $\vec{F}$ is the DFT force just calculated for the first structure, and where $\Delta x$ is a step length. This procedure seeks to provide data for the machine-learned landscape that encodes the proper direction of the energy gradient, and hence enables efficient relaxation in this surrogate energy landscape.
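As a rough illustration of the second point in the double-step procedure, the sketch below displaces a structure along the forces. It takes the expression in the text literally, so the hypothetical `step_length` parameter is assumed to absorb any unit conversion or normalization; the text does not specify how the step is scaled, and the function name is likewise an assumption.

```python
import numpy as np

def displaced_positions(positions, forces, step_length=0.1):
    """Return positions shifted along the forces: x' = x + F * step_length.

    positions, forces: arrays of shape (n_atoms, 3).
    step_length: scalar step; its scaling is an assumption of this sketch.
    """
    positions = np.asarray(positions, dtype=float)
    forces = np.asarray(forces, dtype=float)
    return positions + step_length * forces
```

The second DFT single point would then be evaluated at the returned coordinates, so that the pair of energies carries information about the downhill direction.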

Population
In the main text

DFT calculations
GOFEE searches were carried out on a fixed support consisting of two layers of Pt3Sn(111).
DFT evaluations during the searches were performed using the Atomic Simulation Environment (ASE) 51 with the grid-based projector-augmented wave (GPAW) code 52,53 in plane wave mode with an energy cutoff of 400 eV and a (2 × 2 × 1) k-point grid. The generalized gradient approximation (GGA) with the PBE functional 54 was used to describe the exchange-correlation interaction. The four best candidates for each composition were subsequently transferred to a five-layer support and relaxed, fixing only the bottom two layers, using a (4 × 4 × 1) k-point grid and a 500 eV energy cutoff until all atomic forces were below 0.05 eV/Å. Fig. 3 reports the best candidate for each composition.
Four independent GOFEE searches were carried out for each of the 16 compositions, to allow the consistency of the searches to be evaluated. For most compositions, the assumed global minimum structure was identified in all four searches. As exceptions, it was only reproduced thrice for Sn9O13, twice for Sn9O11 and once for Sn9O14.

Figure S2: Sketch of the adopted population scheme applied to data from the first 250 iterations (500 structures) of a GOFEE search on the Sn11O12 surface composition. The scheme considers all structures evaluated so far with an energy within $\Delta E_\mathrm{sample} = 5$ eV of the currently lowest energy structure found. The upper left plot depicts a feature space representation of these structures, projected onto two dimensions using principal component analysis (PCA), and colored according to energy. In the center, the same structures are colored according to a clustering performed in the full feature space using the k-means algorithm. Example structures are shown for some of the clusters, with atoms in the slab dimmed to highlight structural differences. The population is formed by selecting the lowest-energy structure from each cluster. The five enumerated example structures are part of the population for this particular data set and clustering. The enumeration is according to energy, with structure 1 being lowest in energy. This is coincidentally also the global minimum structure for this composition. The PCA dimensionality reduction captures 76% of the variance in the data, with the remainder (the other dimensions) accounting for the apparent overlap of clusters in the figure.

Stability comparison
To compare the thermodynamic stability across the different compositions, and following Reuter and Scheffler, 55 the surface free energy $\gamma$ was calculated for each structure as

$$\gamma(T, p) = \frac{1}{A}\left[E_\mathrm{slab} - N_\mathrm{Sn}\,\mu_\mathrm{Sn}(T, p) - N_\mathrm{O}\,\mu_\mathrm{O}(T, p)\right],$$

where $A$ is the surface area in the computational cell, $E_\mathrm{slab}$ is the total energy of the relaxed surface slab, $N_\mathrm{Sn}$ and $N_\mathrm{O}$ denote, respectively, the number of tin and oxygen atoms in the slab, and finally $\mu_\mathrm{Sn}$ and $\mu_\mathrm{O}$ denote the corresponding chemical potentials. Contributions from the pressure and entropy terms are neglected. 55 Similarly, temperature and pressure contributions are neglected when evaluating the chemical potential of tin, such that $\mu_\mathrm{Sn}(T, p) = \mu_\mathrm{Sn}$. With this, the surface free energy can be simplified to

$$\gamma(T, p) = \frac{1}{A}\left[E_\mathrm{slab} - N_\mathrm{Sn}\,\mu_\mathrm{Sn} - N_\mathrm{O}\,\mu_\mathrm{O}(T, p)\right].$$

To estimate $\mu_\mathrm{Sn}$, we assume the surface to be in equilibrium with the Pt3Sn bulk phase, which supplies the tin atoms and turns into bulk Pt7Sn. This gives $\mu_\mathrm{Sn}$ from the energy balance of the reaction 7 Pt3Sn → 3 Pt7Sn + 4 Sn. Finally, assuming the O2 atmosphere to form an ideal gas reservoir, the chemical potential of oxygen is taken from Ref. 55 to be

$$\mu_\mathrm{O}(T, p) = \frac{1}{2}E_{\mathrm{O}_2} + \Delta\mu_\mathrm{O}(T, p_0) + \frac{1}{2}k_\mathrm{B}T\ln\!\left(\frac{p}{p_0}\right),$$

where $E_{\mathrm{O}_2}$ is the energy of the isolated molecule, $k_\mathrm{B}$ is the Boltzmann constant, and $\Delta\mu_\mathrm{O}(T, p_0)$ is the size of the temperature and pressure dependent contribution to the chemical potential at temperature $T$ and pressure $p_0$, which is tabulated in Ref. 55. The surface free energy, $\gamma$, is related to the free energy per (4 × 4) cell given in Fig. 3 through the surface area $A$ of the cell.

Figure S3: [...] phase, shown with expanded z-coordinates to highlight corrugation within the layers. c) Individual layers in the structure. Labels indicate symmetrically distinct atoms with coordinates given in Table S1.

Surface X-ray diffraction
SXRD measurements were performed at the I07 beamline at Diamond Light Source, where the sample was prepared and characterized in an ultra-high vacuum system equipped with facilities for ion sputtering, annealing and low-pressure gas exposure. 56 The sample was a 6 mm di- [...] were assumed. The resulting rod fit yielded a reduced $\chi^2 = 0.7$. In-plane structure factors were fitted afterward, using the final, fixed coordinates from the rod fitting and allowing variation of only an overall intensity factor and an in-plane Debye-Waller parameter. The correspondence between experimental and simulated patterns for the structure is good, with R = 0.15. The best-fit structure is depicted in Figure S3, with coordinates presented in Table S1. Fits to SXRD [...]

Figure S4: AFM/STM measurements with varying tip heights. Sequence of AFM images acquired in constant-height mode as the tip was stepped successively closer to the sample surface. Tip heights are given relative to that acquired at the smallest distance. Short-range interactions are exclusively repulsive between the tip and the protruding Sn, as indicated by the increasingly positive frequency shift (bright contrast).

STM and AFM measurements
Scanning tunneling microscopy (STM) characterization was performed using an Omicron VT STM located at the MAX IV Laboratory, Lund, Sweden. Measurements were acquired at room temperature with an etched W tip in constant current mode. The sample was prepared in the same manner as in the SXRD experiments.
Atomic force microscopy (AFM) was performed at the Vienna University of Technology (TU Wien) using an Omicron LT-STM equipped with a qPlus sensor, a W tip, and a custom preamplifier. 61 Measurements were acquired at ∼5 K after a similar sample preparation procedure. At short tip-sample distances (bottom images in Fig. S4 [...]

[...] mesh of k-points. The $6s^1 5d^9$ Pt, $5s^2 5p^2$ Sn, and $2s^2 2p^4$ O valence electrons were explicitly modeled, while the remaining (core) electrons were modeled using the corresponding element-specific pseudopotentials. The criterion for ionic convergence was set to 0.01 eV/Å, together with the strong criterion for electronic convergence. The electrostatic potential was generated such that it includes the ionic potential, the Hartree contribution and the exchange-correlation potential.
Constant-height AFM images were simulated by calculating the force acting on the virtual AFM tip over a range of distances from the surface. The force maps were subsequently transformed into frequency shifts according to the small oscillation amplitude approximation. 64 Three different simulation methods were applied: (i) probing the DFT-optimized electrostatic potential above the surface with a unit point charge; (ii) the Probe Particle Model with an empirical Lennard-Jones potential modeling the interaction between the surface atoms and a negatively charged, oxygen-terminated tip; 65 and (iii) explicitly calculated DFT force-distance curves between a CO molecule above the top-protruding Sn atom of the surface, according to a procedure implemented in Ref. 39 (here, an increased vacuum volume was used). In each case, the resulting simulated AFM image corresponds well to the measured AFM images. The simulated image shown in Fig. 4c in the main text was created with method (iii), 3.9 Å away from the protruding Sn atoms.
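The conversion from force to frequency shift in the small-amplitude limit can be sketched as follows. This assumes the commonly used relation $\Delta f(z) = -\frac{f_0}{2k}\,\frac{\partial F}{\partial z}$, with $F$ the vertical tip-sample force (positive when repulsive) and $z$ the tip height; the function name and the sensor parameters `f0` (resonance frequency) and `k` (stiffness) are hypothetical placeholders, and no values for them are given in the text.

```python
import numpy as np

def frequency_shift(z, force, f0, k):
    """Small-amplitude frequency shift: df = -(f0 / (2 k)) * dF/dz.

    z:     1D array of tip heights (increasing order).
    force: vertical force at each height, in units consistent with k and z.
    f0:    resonance frequency of the sensor (assumed input).
    k:     sensor stiffness (assumed input).
    """
    z = np.asarray(z, dtype=float)
    force = np.asarray(force, dtype=float)
    dF_dz = np.gradient(force, z)  # numerical derivative of the force curve
    return -f0 / (2.0 * k) * dF_dz
```

With this sign convention, a repulsive force that grows as the tip approaches the surface has a negative gradient and therefore gives a positive frequency shift, in line with the bright contrast described for Figure S4.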