Introduction

Carbon is the fourth most abundant element in the universe and, while it readily forms a much wider range of compounds than any other element (including the bio-polymers crucial for life), its behaviour is just as rich in elemental form as well. Carbon atoms can bond to each other in fascinatingly diverse ways, forming a wide range of two- and three-dimensional allotropes, amorphous phases, clusters, fullerenes and multi-layered particles that give carbon one of the most diverse ranges of chemical and physical properties among materials1,2,3,4,5,6,7,8. The Samara carbon database, which catalogues simulation data for these proposed structures of carbon, consists of more than five-hundred periodic configurations9 (as of December 2021). Furthermore, the properties of these structures are often unique, such as the hardness of diamond; the electronic properties of graphene; or the high ductile strength of carbon-fibres, resulting in extensive use of carbon across a wide range of industries, from battery design to advanced optical technologies10,11.

One of carbon’s best known features is its phase transition from graphite to cubic diamond at pressures above 2 GPa. Diamond and graphite’s vastly different density and structural properties are reflected in carbon’s melting curve, which exhibits a dramatic change at the corresponding triple point; shifting from a subtly non-monotonic curve at lower pressures where graphite is formed, to diamond’s melting curve that quickly increases in temperature as greater pressures are applied. Diamond remains stable up to at least 300 GPa, but due to the extreme pressure little is known experimentally of carbon’s atomic structure beyond this. Ab initio calculations suggest a maximum in diamond’s melting line at around 450 GPa, as well as a transition to bc8 between 890–1000 GPa, and shock-wave experiments provide evidence for the accuracy of these predictions12,13. The bc8 structure is also predicted to have a maximum in the melting temperature at around 1450 GPa, due to changes in the coordination number in the liquid phase14. In the terapascal regime further phase transitions are predicted, such as bc8-simple cubic and simple cubic-simple hexagonal3.

Atomistic simulations have thus played a major role in discovering novel phases of carbon; furthering our understanding of its phase diagram; and driving the development of new applications by providing useful insight into their structure and properties. However, the diverse properties of carbon mean that capturing its various characteristics within interatomic potential models is particularly difficult, especially when creating models that aim to be transferable among different allotropes and reproduce carbon’s macroscopic properties reliably under a wide range of conditions.

Several empirical interatomic potential models have been developed for carbon in the past 35 years. The bond-order potential introduced by Tersoff15 in 1988 is still considered to be the fastest and simplest carbon potential. Its elegant functional form, in which the strengths of chemical bonds are modified according to the number of nearest neighbours, allows for rapid calculation of chemical properties without a significant sacrifice in accuracy when compared to other, more expensive potentials16. Despite its shortcomings, the primary one being its lack of consideration for long-range interactions, it is still an ideal choice for testing the performance of new computational methods and more complex chemical potentials. Other early carbon models include the Stillinger-Weber potentials parameterised for diamond and graphitic carbon17,18, although, due to their fixed coordination, these models are limited in their transferability across structures. The “reactive” bond-order potentials, REBO (also referred to as the Brenner potential)19 and REBO-II20, were developed from the Tersoff model to include a wider range of terms (conjugation and torsion). These were further improved by the inclusion of a long-range term that accounts for the effects of dispersion, giving the adaptive intermolecular REBO (AIREBO)21. The environment-dependent interaction potential (EDIP) consists of a two-body pair energy, a three-body angular penalty, and a generalised description of coordination22. EDIP is known to successfully predict topological properties of carbonaceous films as well as clusters23,24. One of the first empirical models capable of providing an accurate description of low-to-medium pressure phases of carbon is the long-range carbon bond-order potential (LCBOP).
The LCBOP model is partially based on ab initio data, closely matches the ab initio MD results for the liquid structure, and accounts for interplanar interactions in graphite25. Ghiringhelli et al. calculated the pressure-temperature phase diagram of the LCBOP potential, including the melting line up to 60 GPa and the graphite–diamond transition, and found good agreement with experimental findings26. Another family of potentials, the reactive force field (ReaxFF) potentials27,28, was developed to accurately describe carbon’s bond formation and dissociation.

The emergence of machine-learning (ML) techniques makes it possible to construct potential models that are comparable in cost to classical interatomic potentials and, at the same time, comparable in accuracy to ab initio-level calculations. Using the Gaussian approximation potential (GAP) formalism29,30, an ML potential was developed to describe the behaviour of liquid and amorphous carbon accurately31. This was later extended to include properties of crystalline bulk phases, defects and surfaces, resulting in the GAP-20 model32,33. The C60 GAP force field includes van der Waals corrections and is especially suited for the simulation of C60 fullerene structures34, with another recent version specifically trained for nano-porous carbon35. Recently, two other ML carbon potentials were also developed, using neural networks36 and the ACE formalism37.

The performance and reliability of these potentials have been compared from different perspectives. Their (in)ability to describe amorphous structures16,23 has underscored transferability issues and highlighted the need for thorough investigation of models in order to trust the interpretation of simulation results. Their accuracy in predicting microscopic properties (e.g., surface energy, formation energy of common defects) has also been compared32. The performance of seven models in predicting the properties of carbon nano-clusters has recently been investigated, with a focus on their accuracy in structure search and global optimisation24. The GAP-20 model emerged as the best performing model.

While these studies provide a detailed picture of the microscopic properties of carbon potentials, our knowledge of their macroscopic properties is limited. In order to understand the reliability and predictive power of computational results, it is important to examine the potential models’ macroscopic behaviour and evaluate their phase stability, unbiased by our chemical intuition. Ultimately, this also informs the development of new generations of potentials, such as ML-based models, highlighting strengths as well as areas for improvement.

In the current work we aim to evaluate the performance of carbon potentials and calculate their pressure-temperature phase diagram, by performing an exhaustive and predictive sampling of the potential energy surface, using the nested sampling technique38,39. Nested sampling (NS) was first introduced by John Skilling in the area of Bayesian statistics40,41, later taken up by various research fields39 and adapted to sample the potential energy surface of atomistic systems38,42. The main advantages of NS are that it automatically generates thermodynamically relevant structures without any prior knowledge of, e.g., (meta)stable crystalline structures; moreover it provides unique and easy access to the notoriously elusive partition function. Thermodynamic properties that are otherwise difficult to determine, such as the heat capacity or free energy, thus become straightforwardly calculable. The added usefulness of NS resides in the fact that a broad picture of the phase diagram can be gained by a single technique, overcoming the typical procedural barriers one faces when working with multiple simulation methods and/or packages.
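As an illustration of how the partition function gives access to the heat capacity, the following is a minimal sketch, not the implementation used in this work. It assumes a constant-pressure NS run with `n_walkers` walkers in which one walker is replaced per iteration, so that the phase-space fraction remaining after iteration i is approximated by (K/(K+1))^i. The function names and the restriction to the configurational (enthalpy-fluctuation) part of the heat capacity are choices made here for illustration; the ideal kinetic contribution would have to be added separately.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def ns_log_weights(n_iter, n_walkers):
    """Return log w_i for i = 1..n_iter (assumes one walker replaced per iteration).

    With G_i = (K / (K + 1))**i the phase-space weight of sample i is
    w_i = G_{i-1} - G_i = G_{i-1} / (K + 1).
    """
    log_shrink = math.log(n_walkers) - math.log(n_walkers + 1)
    return [(i - 1) * log_shrink - math.log(n_walkers + 1)
            for i in range(1, n_iter + 1)]


def boltzmann_probabilities(log_w, enthalpies, temperature):
    """Normalised probabilities p_i proportional to w_i * exp(-H_i / k_B T)."""
    beta = 1.0 / (K_B * temperature)
    log_p = [lw - beta * h for lw, h in zip(log_w, enthalpies)]
    shift = max(log_p)  # subtract the maximum to avoid overflow
    p = [math.exp(x - shift) for x in log_p]
    norm = sum(p)
    return [x / norm for x in p]


def heat_capacity(enthalpies, n_walkers, temperature):
    """Configurational heat capacity (eV/K) from enthalpy fluctuations.

    `enthalpies` are the enthalpies (eV) of the samples discarded at each NS
    iteration, in the order they were recorded.
    """
    log_w = ns_log_weights(len(enthalpies), n_walkers)
    p = boltzmann_probabilities(log_w, enthalpies, temperature)
    mean_h = sum(pi * h for pi, h in zip(p, enthalpies))
    var_h = sum(pi * (h - mean_h) ** 2 for pi, h in zip(p, enthalpies))
    return var_h / (K_B * temperature ** 2)
```

Because the weights depend only on the iteration index, a single recorded sequence of enthalpies can be re-weighted at any temperature, which is why one NS run yields the whole heat capacity curve at a given pressure.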

The power and usefulness of NS have been thoroughly demonstrated in studying various systems, as well as in comparison to widely used computational techniques. Examples of its application include cluster formation42,43,44; calculation of the quantum partition function45; sampling transition paths46; and the calculation of the pressure-temperature phase diagram for various metals47,48,49, alloys50,51, and model potentials52, identifying previously unknown stable solid phases.

In the current work we compare the behaviour of three widely used interatomic potential models for carbon using NS, which span a suitable range in terms of complexity, accuracy and computational cost. We first use the ML potential, GAP-2032, considered to be the state-of-the-art model for carbon3,16,23, to examine its reliability outside its original training conditions and hence understand better the extent of the model’s transferability and predictive power. The majority of our GAP-20 calculations were performed using the original model detailed in Ref. 32, and we also provide supplemental results generated with the updated version of the model, GAP-20U, released recently33. As the fastest and simplest model, we evaluate the phase diagram of the Tersoff model in the original parameterisation form, as available in LAMMPS (although valuable modifications to the Tersoff carbon potential also exist53,54). We also selected EDIP22 for modelling, providing a mid-point in accuracy and computation cost between the Tersoff and GAP-20 potentials.

Results

GAP-20 and GAP-20U

Nested sampling runs with the GAP-20 potential were carried out with a system size of 16 atoms at ten different pressures between 0.1 and 1000 GPa. Due to the large computational cost of the GAP potential, fewer calculations were carried out with 32 atoms - at pressures of 1, 10, 50, 500 and 800 GPa - to assess the finite size effects at pressures where different solid phases are expected. The configuration space of the GAP-20U potential was also sampled using NS, at pressures of 0.1, 1, 10 and 50 GPa, to assess the extent to which the melting line may deviate from the original GAP-20 model in the graphite and cubic diamond phases. The resulting pressure-temperature phase diagram is illustrated in Fig. 1. The experimentally determined phase boundaries55,56,57 are shown by solid black lines, highlighting that the graphite melting line has a slight maximum, as above 0.4 GPa the density of graphite becomes lower than that of the liquid, causing the melting line to have a negative gradient. This change however is very subtle, driven by the relatively weak interaction between graphite’s neighbouring hexagonal layers. Above 20 GPa the liquid carbon freezes into the high-density cubic diamond structure, resulting in melting temperatures increasing rapidly with pressure in comparison to the graphite phase.

Fig. 1: Pressure-temperature phase diagram of GAP-20 ML potentials.

Black lines show experimental phase boundaries55,56,57; red dashed lines show high-pressure phase transitions predicted by DFT from ref. 14 (with the phase above 1 TPa being bc8); purple and green lines and symbols show NS results of GAP-20 with 16 and 32 atoms respectively; orange lines and symbols correspond to the GAP-20U potential with 16 atoms; and blue lines and symbols show the results of GAP-20U+gr with 16 atoms. Symbols reflect the most stable phase predicted by NS at the corresponding temperature and pressure. Error bars represent the full widths at half maximum of the heat capacity peaks.

The melting curve predicted by the GAP-20 model follows these experimental features with reasonable accuracy, though at pressures below 10 GPa there is a clear positive gradient in the graphite melting line where the experimental melting line is non-monotonic. It is in the graphite region of the phase diagram that we also observe the only notable deviation between the melting lines of the GAP-20 and GAP-20U models, with a difference of around 10% in melting temperatures at 0.1 GPa, such that the GAP-20U model’s phase boundary has a steeper gradient and deviates further from experimental trends than that of the GAP-20. Figure 2 shows the heat capacity curves calculated by NS using GAP-20 and illustrates how the points on the melting line were determined from the locations of the peaks. The peaks corresponding to 32-atom runs are sharper than those of the 16-atom runs, reflecting how in the thermodynamic limit the heat capacity diverges at first-order phase transitions. The difference between the transition temperatures predicted by the 16- and 32-atom runs is approximately 8% at lower pressures and diminishes at pressures above 100 GPa, suggesting that finite size effects become negligible at higher pressures.
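The peak-based determination of a transition temperature, together with the full width at half maximum used for the error bars in Fig. 1, can be sketched as follows. This is an illustrative sketch only: the function name, the linear interpolation of the half-maximum crossings and the `baseline` parameter (the level from which the peak height is measured, zero by default) are assumptions made here, since the text does not specify these details. The curve is assumed to contain a single peak that rises above the baseline.

```python
import math


def peak_and_fwhm(temperatures, cp, baseline=0.0):
    """Return (T_peak, FWHM) of a heat capacity curve sampled on a grid.

    `temperatures` must be increasing; `cp` holds the heat capacity at each
    grid point. The half-maximum level is measured relative to `baseline`.
    """
    i_max = max(range(len(cp)), key=lambda i: cp[i])
    half = baseline + 0.5 * (cp[i_max] - baseline)

    def crossing(i_lo, i_hi):
        # Temperature at which cp equals `half`, by linear interpolation
        t0, t1 = temperatures[i_lo], temperatures[i_hi]
        c0, c1 = cp[i_lo], cp[i_hi]
        return t0 + (half - c0) * (t1 - t0) / (c1 - c0)

    # Walk outwards from the maximum until the curve drops to the half level
    left = i_max
    while left > 0 and cp[left] > half:
        left -= 1
    right = i_max
    while right < len(cp) - 1 and cp[right] > half:
        right += 1

    # If the curve never drops to the half level, fall back to the grid edge
    t_left = crossing(left, left + 1) if cp[left] <= half else temperatures[left]
    t_right = crossing(right - 1, right) if cp[right] <= half else temperatures[right]
    return temperatures[i_max], t_right - t_left
```

For a finite system the peak is broadened, so the width returned here is a measure of how sharply the transition is resolved and not a statistical uncertainty in the strict sense.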

Fig. 2: Heat capacity and densities of GAP-20 ML potentials.

Top panel, a heat capacity as a function of temperature at various pressures, calculated using the GAP-20 with 16 atoms (dashed lines) and 32 atoms (solid lines). For better visibility, heat capacity values are shifted according to pressure. Bottom panel, b density as a function of temperature calculated using the GAP-20, GAP-20U and GAP-20U+gr potentials, using 16 atoms. Arrows point to the temperatures at which the peaks of the corresponding heat capacity curves are found for GAP-20. Filled and open triangles on the density axis show experimentally determined room temperature density values of graphite and diamond, respectively75.

At 0.1 GPa, the liquid phase generated by NS is dominated by chain-like structures. This is in agreement with the known low-coordinated liquid phase formed at low pressures, dominated by branch-like structures31. To demonstrate the change in the typical coordination of carbon atoms at different temperatures and pressures, we calculated the NS weighted average of the coordination number over a range of temperatures using Eq. (1), as shown in Fig. 3. Here we see that at 0.1 GPa the average number of neighbours in the liquid phase reaches a maximum of two before rapidly increasing to three at the freezing transition. As pressure increases, the liquid can no longer sustain the chain-like structures, and we observe an increase in the average coordination number. The average coordination number also reflects the structure of the solid phases, with three nearest neighbours in the case of graphite and four in the case of diamond, with higher values for the extreme high pressure phases.
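The coordination analysis described above can be sketched in two steps: counting neighbours within a cutoff for a single configuration, then forming a thermally weighted average over the NS samples. The sketch below makes several assumptions that are not taken from the text: the simulation cell is treated as orthorhombic so that a simple minimum-image convention applies (valid only while the cutoff is smaller than half the shortest cell edge), the weighted average is written as a standard Boltzmann-weighted NS average, and all function names are hypothetical.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def coordination_numbers(positions, box, cutoff=1.8):
    """Number of neighbours within `cutoff` (Å) for each atom.

    `positions` is a list of (x, y, z) tuples and `box` the three edge
    lengths of an orthorhombic periodic cell (an assumption of this sketch).
    """
    n = len(positions)
    counts = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            d2 = 0.0
            for a in range(3):
                d = positions[i][a] - positions[j][a]
                d -= box[a] * round(d / box[a])  # minimum-image convention
                d2 += d * d
            if d2 < cutoff * cutoff:
                counts[i] += 1
                counts[j] += 1
    return counts


def ns_weighted_average(values, log_w, enthalpies, temperature):
    """Average of a per-configuration observable with weights w_i exp(-H_i / k_B T)."""
    beta = 1.0 / (K_B * temperature)
    log_p = [lw - beta * h for lw, h in zip(log_w, enthalpies)]
    shift = max(log_p)  # subtract the maximum to avoid overflow
    p = [math.exp(x - shift) for x in log_p]
    return sum(pi * v for pi, v in zip(p, values)) / sum(p)
```

In use, `values` would hold the mean of `coordination_numbers(...)` for each sampled configuration, and the average would be evaluated on a grid of temperatures to produce a curve like those in Fig. 3.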

Fig. 3: Average coordination number of GAP-20 model at different pressures.

Average coordination number within a cutoff of 1.8 Å, as a function of temperature, using GAP-20 with 16 atoms, sampled by NS. Multiple lines correspond to results of multiple parallel NS runs.

Up to 20 GPa the liquid freezes into the graphite structure. While at lower pressures the density of the graphite is found to be higher than that of the liquid, this trend changes, and at 20 GPa we can observe a maximum on the density curve at the transition, shown in the bottom panel of Fig. 2. This is consistent with the expectation that the melting line has a negative gradient in that pressure range. We can therefore deduce that within the GAP-20 and GAP-20U models there is a compensation point around 10–20 GPa where the density of the liquid phase is equal to that of graphite at the melting transition, corresponding to a maximum in the phase boundary. This is in qualitative agreement with experiment, though the maximum is expected to occur at lower pressures, around 0.5 GPa.
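The connection between the density ratio and the sign of the melting-line gradient follows from the Clausius–Clapeyron relation. This is standard thermodynamics, not a result of this work, and the symbols below are introduced here only for illustration: ΔV is the change in specific volume on melting and ΔH > 0 is the latent heat of melting.

```latex
\frac{\mathrm{d}T_m}{\mathrm{d}P} = \frac{T_m\,\Delta V}{\Delta H},
\qquad
\Delta V = V_{\mathrm{liquid}} - V_{\mathrm{graphite}}
         = \frac{1}{\rho_{\mathrm{liquid}}} - \frac{1}{\rho_{\mathrm{graphite}}}
```

When graphite is denser than the liquid, ΔV is positive and the melting temperature rises with pressure; when the liquid is denser, ΔV is negative and the gradient is negative. The maximum of the melting line therefore sits where the two densities are equal.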

The graphite configurations explored by NS are diverse both in terms of the distance between adjacent graphite layers and in stacking pattern. Among the configurations generated by NS we can find the most energetically favourable AB and ABC stacking variants58,59, alongside AA stacking and a variety of unique arrangements where adjacent layers are shifted only partially in relation to each other, spanning the phase space between the typical AA, AB and ABC patterns. In terms of the distance between the graphite layers, we see a significant change with respect to temperature and pressure. Figure 4 shows the distribution of carbon atoms along the normal vector of the graphite structure at different pressures and temperatures, calculated as the phase space–weighted average (using Eq. (1)) from configurations generated by NS. At 0.1 GPa the typical spacing between neighbouring layers is around 3.8 Å at temperatures below 2000 K, while the intralayer distributions become significantly broader as the temperature increases. This layer distance corresponds to a lattice parameter of c = 7.6 Å, much larger than the experimentally observed value of c = 6.71 Å60. The underlying reason for this discrepancy becomes obvious by calculating the energy of graphite structures as a function of the lattice parameters, shown in panel (a) of Fig. 5. These calculations reveal that the graphitic energy surface has multiple minima with respect to layer spacing in the case of GAP-20, with the lowest energy distance confirmed to be at c = 7.6 Å. At higher pressures the contribution of the pressure-volume term to the enthalpy becomes significant enough that local minima corresponding to shorter layer distances become enthalpically favourable. This is reflected in the histograms of Fig. 4, which show that NS runs at 10 and 20 GPa sampled graphite configurations that are consistent with the local minimum at c = 5.5 Å. 
We even observe a phase transition at 10 GPa as temperature is reduced below 2000 K, as the average spacing rapidly decreases from c = 6.3 Å to 5.5 Å, with a double peak feature at 1000 K that reflects the simultaneous sampling of graphite basins with distinct layer separations. It is important to note that this behaviour naturally influences the average density of the sampled graphite phases as well. Specifically, it leads to a lower-than-expected density at low pressures and a higher density than expected at higher pressures. We could speculate that this behaviour, if affecting the density ratio between graphite and liquid carbon, is capable of changing the gradient of the melting curve and shifting the expected maximum in the melting temperature to higher pressures. We will address this idea further in a later section on improving the potential. The multiple minima as a function of graphite lattice parameters can be still observed, though to a lesser extent, in the case of GAP-20U (see Fig. 5, panel b). As in the case of the GAP-20, the updated model exhibits a phase transition at 10 GPa from high to low density graphite just below 2000 K, though the change in density is significantly smaller. Calculations of the graphite energy landscape using DFT (shown in panel d of Fig. 5) show that these curves should be completely smooth, with only a single minimum at 6.7 Å.

Fig. 4: Pressure and temperature dependence of graphite layer spacing.

Distribution of the carbon atoms along the direction perpendicular to the graphite layers, calculated as the weighted average from the NS configurations, using the GAP-20 potential and 16-atom runs. Distribution around 0.0 Å acts as the reference layer, its widths representing the deviation from a perfectly flat layer. Vertical lines show the equilibrium graphite layer distance, coloured according to their respective pressures.

Fig. 5: Minimum energy layer spacing of graphite using different models.

Potential energy of the graphite AB structure as a function of lattice parameters a and c, represented in the inset of panel a. Calculated using GAP-20 (panel a), GAP-20U (panel b), our re-trained potential including more graphite structures, GAP-20U+gr (panel c) and DFT (panel d). Symbols in panel c represent lattice parameters of the graphite configurations we added to the training set.

Although the liquid freezes to the graphite structure at 20 GPa, in the case of the GAP-20U potential we observe the solid-solid transition to diamond at 2800 K. This is marked by a sudden and significant jump in density, which can be seen in the bottom panel of Fig. 2. Using a combination of density and the Steinhardt bond-order parameters61 Q4 and W4, we are able to distinguish between diamond and graphite configurations generated by NS and calculate their contributions to the Gibbs free energy separately, as a function of temperature. Comparing these free energy contributions allows us to locate the phase transition between the different crystalline structures and the liquid, as shown in Fig. 6. As expected, the temperatures at which graphite and diamond become the most stable phases correspond exactly with peaks in the heat capacity, as well as the sudden step in density that was previously noted.
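The separation of free-energy contributions by phase can be illustrated with a short sketch. It assumes that each NS sample has already been assigned a phase label (in the text this is done using density together with Q4 and W4; the labelling itself is not reproduced here) and that the free energy of a phase is taken as minus k_B T times the logarithm of the partition-function sum restricted to that phase. The function names and the label strings are hypothetical.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def log_partial_partition(log_w, enthalpies, labels, phase, temperature):
    """log of sum_i w_i exp(-H_i / k_B T) over samples labelled `phase`."""
    beta = 1.0 / (K_B * temperature)
    terms = [lw - beta * h
             for lw, h, lab in zip(log_w, enthalpies, labels) if lab == phase]
    if not terms:
        return -math.inf  # phase was never sampled
    shift = max(terms)  # subtract the maximum to avoid overflow
    return shift + math.log(sum(math.exp(t - shift) for t in terms))


def free_energy_difference(log_w, enthalpies, labels, phase_a, phase_b, temperature):
    """G_a - G_b in eV; negative when phase_a is the more stable phase.

    Both phases must appear in `labels`, otherwise the result is not finite.
    """
    log_z_a = log_partial_partition(log_w, enthalpies, labels, phase_a, temperature)
    log_z_b = log_partial_partition(log_w, enthalpies, labels, phase_b, temperature)
    return -K_B * temperature * (log_z_a - log_z_b)
```

Evaluating `free_energy_difference` on a temperature grid and locating its sign change gives an estimate of the solid-solid transition temperature that can be compared with the heat capacity peak.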

Fig. 6: Graphite–diamond phase transition of GAP-20U.

Nested sampling results at 20 GPa, using 16 atoms and the GAP-20U potential. Top panel: density of individual configurations sampled during NS, symbols are coloured by the average Q4 bond-order parameter of the configuration. Middle panel: Gibbs free-energy difference compared to the diamond phase. Bottom panel: The corresponding heat capacity curve. Vertical grey lines show the melting temperature, Tm and the solid-solid transition between the diamond and graphite phases, TDG.

While NS simulations at 40, 50 and 100 GPa also explored graphite and hexagonal diamond structures to some extent, these phases remain metastable at all temperatures, as the cubic diamond structure becomes the dominant phase. Crucially, the change in the stable solid phase, from graphite to diamond, also corresponds to the change in melting line from a roughly vertical curve to one with a large positive gradient, as also observed experimentally55,56,57. These agreements are particularly notable, as high-pressure behaviour was not explicitly considered in the potential development process, and there is no indication that structure optimisation was performed at non-zero pressures in the training data. It must be noted however that the training data contains configurations where the stress tensor has non-zero diagonal elements, corresponding to pressures ranging between −100 and 100 GPa, isotropic or otherwise.

To evaluate the accuracy of the GAP-20U model more generally – across the liquid, graphite and cubic diamond phases – we take configurations generated with NS at three different pressures (0.1, 10 and 50 GPa) over a suitable range of temperatures (500–11000 K) and for each sample calculate the difference in potential energy predicted by the GAP-20U and DFT models. The results, shown in Fig. 7, reveal a maximum energy difference of ~0.35 eV/atom in the liquid phase at 0.1 GPa. At each pressure, the energies of the liquid configurations are typically underestimated by the GAP-20U, and unsurprisingly the overall distribution of energy differences in the liquid phase is considerably wider than those of the solid phases, with a sharp decrease at the freezing transitions. In the graphite phase at 0.1 GPa and 10 GPa we see that the agreement between the GAP-20U and DFT energies improves as the temperature decreases and crystal order increases; however, at 10 GPa there is a sudden deviation in energies just below 1000 K, corresponding to the graphite spacing transition that can be seen in Fig. 4. In comparison, the energy differences in the cubic diamond phase at 50 GPa are far smaller than in the graphite phases at low temperatures, suggesting that diamond’s higher degree of crystal symmetry and stronger, isotropic bonding make its energy landscape less difficult to approximate via machine learning.

Fig. 7: Comparison between GAP-20U and DFT energies.

Energy differences between the GAP-20U and DFT models of configurations generated during NS runs at 0.1, 10 and 50 GPa, plotted as a function of temperature. Points are coloured according to the density of their corresponding configuration. Grey shaded areas show phase transitions, with the widths representing the full width at half maximum of the corresponding heat capacity peaks.

Before continuing on to discuss the GAP-20 model’s extreme high pressure behaviour, we acknowledge the erroneous stability of a very low density bcc phase in the PES of the GAP-20, which would later be addressed in the updated GAP-20U model33. Due to its large nearest neighbour bonds (~80% larger than typical carbon bonds) and a coordination structure that is very different from that of the corresponding liquid phase, this phase is not explored by the NS, nor by the structure searches performed in the original work. Hence, we speculate that the phase space volume of the low density bcc phase is likely to be negligible compared to the graphite structure, and separated from the liquid by extremely high free energy barriers. Further evidence of the bcc structure possessing a relatively small phase space volume can be found in the Discussion section of the Supplementary Material.

The predictive power of the GAP-20 and GAP-20U models is reasonably good, even up to 100 GPa. Further increasing the pressure will certainly break down the reliability of the model, but exploring to what extent and under what conditions this occurs can still provide us with critical information about the ability of the ML model to extrapolate, as well as areas for future improvement. NS simulations above 100 GPa suggest that the melting line closely follows the trend expected from DFT calculations (see Fig. 1), but at extreme high pressures two new phases emerge as ground state structures of the GAP-20, both in the 16-atom and 32-atom simulations. At 500 and 800 GPa the stable structure predicted by NS is a strained variant of cubic diamond, where the strain is positive in the direction of an arbitrary cubic axis and coupled with a compression along the perpendicular axes. Between 800 and 1000 GPa the system transitions to a highly compressed hexagonal close-packed structure. This belongs to the P63/mmc space group, having two atoms in the unit cell, each with eight nearest neighbours. We will refer to this structure as strained hexagonal close-packed (strained hcp). Figure 8 shows snapshots of these two new structures along with cubic diamond, as well as the corresponding radial distribution functions, with all three structures optimised at 300 GPa. The enthalpy differences between the different optimised structures at 0 K are shown in Fig. 9, calculated by the GAP-20 and GAP-20U potentials up to 1 TPa, as well as with DFT for comparison up to 10 TPa. While both GAP-20 and GAP-20U predict the stabilisation of a strained cubic diamond structure at very high pressures, cubic diamond becomes the ground state again above 380 GPa in the case of GAP-20U. This demonstrates that changes to an ML potential from refitting may influence the behaviour of the model in data-sparse regions of phase space, far from its fitting conditions.
It is notable that, as the bc8 structure was not included in the training data, neither version of the GAP model predicts it to be a low-enthalpy state at pressures above 1 TPa. Geometry optimisations carried out with the same DFT parameters as those used in the training show good agreement with previous ab initio random structure search results3, showing a ground state transition from cubic diamond to bc8, simple cubic, then to simple hexagonal as pressure increases. While neither of the high-pressure configurations predicted by the GAP models has proven to be a ground state structure according to DFT, they are nevertheless low-enthalpy metastable states that may be worth further consideration. Finally, the considerable agreement between the extreme high pressure melting lines predicted by GAP and DFT, in spite of the GAP’s erroneous phase stability, implies that the model maintains an accurate description of carbon’s macroscopic density.
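The 0 K comparison of candidate structures rests on the enthalpy H = E + PV, with the ground state at a given pressure being the structure of lowest enthalpy. The sketch below illustrates this bookkeeping only. It assumes each structure is represented by a few tabulated (energy, volume) points per atom and takes the minimum over those points as a crude stand-in for a geometry optimisation at fixed pressure; the function names and this representation are hypothetical and are not taken from the text.

```python
GPA_PER_EV_PER_A3 = 160.2176634  # 1 eV/Å^3 expressed in GPa


def enthalpy(energy_ev, volume_a3, pressure_gpa):
    """Enthalpy H = E + P V in eV, with V in Å^3 and P in GPa."""
    return energy_ev + pressure_gpa * volume_a3 / GPA_PER_EV_PER_A3


def ground_state(candidates, pressure_gpa):
    """Return (name, enthalpies) for the lowest-enthalpy candidate structure.

    `candidates` maps a structure name to a list of (energy, volume) pairs
    per atom; the enthalpy of a structure is the minimum over its points.
    """
    best = {
        name: min(enthalpy(e, v, pressure_gpa) for e, v in points)
        for name, points in candidates.items()
    }
    winner = min(best, key=best.get)
    return winner, best
```

Subtracting the enthalpy of a reference structure (cubic diamond in Fig. 9) from each entry of the returned dictionary gives relative enthalpies of the kind plotted against pressure.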

Fig. 8: Extreme high pressure structures using GAP-20.

High pressure structures found to be stable by NS, using the GAP-20 potential: cubic diamond (a), strained diamond (b) and strained hexagonal close-packed (c). The lower panel shows the radial distribution functions of these three structures, optimised at 300 GPa.

Fig. 9: Ground state structures of different models as a function of pressure.

Enthalpy of several thermodynamically relevant carbon crystal structures, calculated as a function of pressure using the GAP-20 potential (top), GAP-20U potential (middle), and DFT (bottom). In each case, energies are compared to the cubic diamond structure at the same pressure, and shaded areas highlight the pressure region when a structure is the ground state. The simple cubic and simple hexagonal structures are marked with sc and sh, respectively, in the bottom panel.

GAP-20U+gr

The exhaustive and unbiased sampling of carbon’s phase space afforded by NS allows us to identify regions where each model’s description could be improved. Moreover, it helps to identify structural features that are captured inaccurately by the model. An obvious area for improvement is the extreme high pressure behaviour, i.e., the relative stability of crystal structures at pressures above 200 GPa - most notably the lack of a stable bc8 phase. Making these improvements will require the inclusion of configurations of multiple crystalline phases in the training set, with repeated exhaustive sampling to confirm the finite temperature stability of the solid phases. Given the associated computational cost of this particular flavour of GAP modelling, which focuses on providing an accurate description of the long range van der Waals interactions, we will address these improvements in a future project by concentrating on shorter range interactions which dominate at high pressures.

However, using NS we were able to identify another shortfall of the GAP-20 models: the erroneous local minima with respect to inter-layer spacing in the graphite phase. In this section we aim to improve the accuracy of the GAP-20U model’s description of the graphite phase, the primary goal being the elimination of local minima that we have previously shown to result in unphysical solid-solid graphite phase transitions. We have therefore expanded the DFT dataset on which the potential is trained by including an additional 165 ordered graphite configurations with AA, AB and ABC stacking patterns (the entire training dataset, including these new configurations, is available at DOI:10.5281/zenodo.7463706). The lattice parameter c spans a range of ±40% of the equilibrium value for each stacking pattern (determined from DFT), where c_AA = 7.02 Å, c_AB = 6.64 Å and c_ABC = 6.70 Å, while a is varied by ±2% about an equilibrium value of 2.47 Å. We otherwise used the same GAP fitting parameters as in the original GAP-20U, in order to preserve the work that was done in optimising the potential’s transferability32. The additional data points from the AB-ordered set are illustrated in panel (c) of Fig. 5 alongside the energy landscape of the updated potential, which we refer to as the GAP-20U+gr. These results resemble DFT calculations much more closely than those of either the GAP-20 or GAP-20U model. The minimum-energy layer separation remains the same, with the c lattice parameter being 6.7 Å. Performing structure optimisations with the new potential reveals that the 0 K graphite–diamond transition has been shifted to 7.2 GPa, much closer to the ab initio prediction of 5.8 GPa than the 9.0 GPa obtained with the GAP-20U.
Given that the GAP-20U+gr remains practically unchanged from the GAP-20U with respect to energies of non-graphite configurations and the pressure-dependent stability of different crystalline phases (see the Discussion section of the Supplementary Material), we do not expect significant deviation from the GAP-20U in other respects, though of course this is difficult to fully evaluate without considerable time and resources. Though the error with respect to the DFT graphite energy landscape is reduced considerably, there remains a shallow local minimum around c = 7.4 Å. Additional tests show that this artifact persists even when additional data points are included in this region, suggesting it is the result of influence from other configurations in the dataset. It should also be noted that long range interactions such as those between adjacent graphite layers are difficult to accurately capture with ML methods, due to the inherent increase in configurational complexity as the potential’s cut-off radius is increased. This is why recent ML potentials aiming to model the graphite phase have opted to tabulate the long range interactions35.
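
As a rough illustration of how such a scan over lattice parameters could be laid out, the sketch below enumerates (a, c) pairs for the three stacking patterns from the equilibrium values and ranges quoted above. The grid sizes (5 values of a and 11 values of c per stacking) are an assumption chosen only so that the total comes to 165; the text does not state how the configurations were distributed, and the function name is hypothetical.

```python
import numpy as np

# Equilibrium c values (Å) for each stacking pattern, as quoted in the text.
C_EQ = {"AA": 7.02, "AB": 6.64, "ABC": 6.70}
A_EQ = 2.47  # equilibrium in-plane lattice parameter (Å)

def graphite_scan(n_a=5, n_c=11, a_range=0.02, c_range=0.40):
    """Return a list of (stacking, a, c) tuples covering the scan ranges.

    n_a and n_c are assumed grid sizes, not values taken from the paper.
    """
    grid = []
    for stacking, c_eq in C_EQ.items():
        a_values = np.linspace(A_EQ * (1 - a_range), A_EQ * (1 + a_range), n_a)
        c_values = np.linspace(c_eq * (1 - c_range), c_eq * (1 + c_range), n_c)
        for a in a_values:
            for c in c_values:
                grid.append((stacking, float(a), float(c)))
    return grid

configs = graphite_scan()
print(len(configs))
```

Each tuple would then be turned into a periodic cell and passed to the DFT code; that step depends on the structure-building tools in use and is omitted here.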

In order to evaluate the performance of the enhanced potential in the case of unbiased PES sampling, we have performed single NS runs — using the same NS parameters as those used for the GAP-20U — at four different pressures: 0.1 GPa, 1 GPa, 10 GPa and 20 GPa. The resulting phase transitions are included in Fig. 1 and the corresponding densities are shown in the bottom panel of Fig. 2. We find that the enhanced potential, GAP-20U+gr, predicts graphite densities that are closer to experimental values than the GAP-20U, which can most clearly be seen at 10 GPa, where the GAP-20U+gr model does not undergo a phase transition to a lower density graphite phase as temperature increases, as the GAP-20U does just below 2000 K. Performing the same thermal averaging analysis as shown in Fig. 4 on the new potential, the local minimum at c = 7.4 Å does appear to affect the average spacing at 0.1 GPa by broadening the distribution, but the resulting decrease in average density is minimal.

However, despite these improvements, we do not observe a significant change in the melting behaviour, with melting temperatures matching the GAP-20U results almost perfectly. This suggests that the inaccuracy of the gradient of the melting line, closely tied to the density ratio between graphite and liquid carbon, may in fact originate from problems not with the graphite density, as we originally suspected, but from the liquid being less dense than expected.
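
The link between the gradient of the melting line and the two densities can be made explicit with the Clausius–Clapeyron relation. This is a standard thermodynamic result rather than an expression taken from the present work, and it is written here per unit mass:

$$\frac{{\rm d}T_{\rm m}}{{\rm d}P}=\frac{T_{\rm m}\,\Delta V}{\Delta H}=\frac{T_{\rm m}}{\Delta H}\left(\frac{1}{\rho_{\rm liquid}}-\frac{1}{\rho_{\rm graphite}}\right),$$

where ΔV and ΔH are the changes in specific volume and enthalpy on melting, with ΔH > 0. The gradient is therefore negative only where the liquid is denser than graphite. Assuming ΔH is not strongly affected, a liquid whose density is underestimated gives a larger ΔV and shifts the gradient towards more positive values, even if the graphite density is accurate.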

Tersoff potential

The phase diagram calculated with the Tersoff potential is shown in Fig. 10. Compared to the GAP-20 models, the Tersoff potential shows a significantly larger finite-size effect, consistent with the finite-size effects seen in other empirical potentials47,49. The effect’s significance, quantified by the difference in temperature between 16- and 64-atom runs, is non-monotonic with respect to pressure, peaking around the graphite–diamond transition at 50 GPa. Overall, the melting line reflects the experimental trend reasonably well at 64 atoms; however, the pressure-dependent phase stability is less accurate. Though graphite is formed below 50 GPa, the melting line does not reflect the expected negative gradient at lower pressures, nor the significant change in the melting line gradient above the graphite–diamond–liquid triple point, which is overestimated by around 400% compared to experimental results. The origin of Tersoff’s monotonic melting curve in the graphite phase is its small cutoff of 4.1 Å, which leads to a dramatic underestimation of the equilibrium lattice spacing compared to DFT, by around 40%. This corresponds to a graphite phase that is more dense than the liquid phase at all pressures, hence the lack of a maximum in the melting curve.

Fig. 10: Temperature-pressure phase diagram of the Tersoff potential.

Black lines show experimental phase boundaries55,56,57, coloured lines and symbols correspond to NS results with different system sizes. Error bars represent the full widths at half maximum of the heat capacity peaks.

Once again we use the Steinhardt bond-order parameters to sort solid configurations into diamond and graphite basins, allowing for the calculation of each phase’s contribution to the Gibbs free energy and the determination of solid-solid phase transitions. In Fig. 11 we demonstrate this at two different pressures. At 30 GPa, the large majority of the solid configurations fall into the basin of the graphite structure; however, the metastable diamond phase is also sampled to a lesser extent. A third and smaller basin (appearing to have Q4 = 0.45) can be observed between these, representing a structure where small graphite-like motifs are interconnected by four-coordinated carbon atoms. As the pressure increases, the diamond structure becomes more dominant, until it becomes the ground state structure at 80 GPa. Due to the Tersoff potential’s short range, the potential energies of the perfect cubic and hexagonal diamond structures are the same, and at pressures where the diamond phases are stable their sampling is about equal, suggesting that their free energies are comparable as well.
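
To illustrate why Q4 separates the two basins, the following minimal sketch evaluates a local Steinhardt Q4 for a single atom from the directions of its bonds. It is not the analysis code used here: the averaging convention of ref. 61 and the neighbour cutoff are not reproduced, the function names are hypothetical, and the spherical-harmonic addition theorem is used in place of explicit spherical harmonics.

```python
import numpy as np

def legendre_p4(x):
    """Legendre polynomial P_4(x)."""
    return (35.0 * x**4 - 30.0 * x**2 + 3.0) / 8.0

def local_q4(bond_vectors):
    """Steinhardt Q_4 of one atom from the vectors to its neighbours.

    By the addition theorem, Q_4^2 = (1/N^2) * sum_ij P_4(cos theta_ij),
    where theta_ij is the angle between bonds i and j (i = j included).
    """
    v = np.asarray(bond_vectors, dtype=float)
    unit = v / np.linalg.norm(v, axis=1, keepdims=True)
    cosines = np.clip(unit @ unit.T, -1.0, 1.0)
    total = max(float(legendre_p4(cosines).sum()), 0.0)  # guard against round-off
    return np.sqrt(total) / len(unit)

# Ideal four-coordinated (diamond-like) and three-coordinated (graphite-like) sites.
tetrahedral = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
trigonal = [(1, 0, 0), (-0.5, np.sqrt(3) / 2, 0), (-0.5, -np.sqrt(3) / 2, 0)]

q4_diamond = local_q4(tetrahedral)
q4_graphite = local_q4(trigonal)
print(f"diamond-like site:  Q4 = {q4_diamond:.3f}")
print(f"graphite-like site: Q4 = {q4_graphite:.3f}")
```

For these ideal environments the two values are well separated, so a configuration mixing three- and four-coordinated atoms would be expected to fall in between.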

Fig. 11: Simultaneous sampling of graphite and diamond phases using Tersoff potential.

Average Q4 bond order parameter61 of configurations generated by NS, using the Tersoff potential at two different pressures. Each point corresponds to a configuration generated by NS and coloured according to the average W4 order parameter. Arrows point to the Q4 values of diamond and graphite structures. Vertical dashed lines represent the phase transitions as determined by the peaks of the heat capacity curves.

EDIP

Nested sampling calculations using the EDIP potential were performed with 16 and 32 atoms, at pressures ranging from 1 GPa to 1500 GPa. The resulting phase diagram is shown in Fig. 12, showing overall excellent agreement with experimental phase behaviour up to 100 GPa. The melting line follows the experimental trends well, with a considerably smaller finite-size effect compared to the Tersoff potential. At lower pressures graphite is formed upon freezing, as expected, with typical layer spacings at low temperature corresponding to a lattice parameter of c = 6.4 Å, only a 5% underestimation of experimental data. To explore the EDIP’s graphite phase further, we plot its energy as a function of lattice parameters in Fig. 13. One of the potential’s shortcomings is its lack of dispersive, long-range interactions, and that is reflected in its graphitic energy landscape, as we see no change in energy beyond c = 6.4 Å. While this is not consistent with the clearly defined minimum separation predicted by DFT, the influence of the PV term in the enthalpy effectively prevents larger separations from being energetically relevant at finite pressures and zero temperature. To evaluate the finite temperature effect of this short interplanar cutoff, we plot thermally averaged distributions of carbon atoms perpendicular to the graphite planes in Fig. 14, for pressures of 1 and 10 GPa. These show an expected broadening of carbon atom dispersion in the reference layer at higher temperatures, due to thermal disorder, but for the nearest-layer distributions this broadening becomes more biased towards larger spacings as temperature increases, suggesting that the lack of a long-range energy barrier allows unphysically large layer separations to overcome the PV term and become thermodynamically relevant.

Fig. 12: Temperature-pressure phase diagram of the EDIP potential.

Black lines show experimental phase boundaries55,56,57, red dashed lines show high-pressure phase transitions predicted by DFT from ref. 14 (with the phase above 1 TPa being bc8), purple and green coloured lines and symbols show NS results with different system sizes. Error bars represent the full widths at half maximum of the heat capacity peaks. The top panel includes the high-pressure range of the phase diagram at a different temperature scale to show the maximum of the melting line.

Fig. 13: Minimum energy layer spacing of graphite using EDIP.

Potential energy of the graphite AB structure as a function of lattice parameters, calculated using the EDIP potential.

Fig. 14: Pressure and temperature dependence of graphite layer spacing using EDIP.

Distribution of carbon atoms along the direction perpendicular to the graphite layers, calculated as the weighted average from NS configurations, using the EDIP potential. Distribution around 0.0 Å acts as the reference layer, its widths representing the deviation from a perfectly flat layer. Bars are semi-transparent to aid the visibility of the distributions. Vertical lines show the equilibrium graphite layer distance at the two different pressures.

This bias towards large separations persists at higher pressures; however, the effect is diminished, which can be intuitively understood as the increased pressure (and PV energy) encouraging smaller volumes, and preventing thermal fluctuations from stabilising larger-separation structures. As the temperature decreases, there is less kinetic energy available to smear the energies of the optimised structure, and thus fewer large-separation configurations can be energetically viable. In spite of EDIP’s short interplanar cutoff and its effects, its description of the graphite melting line is remarkably accurate at only 32 atoms. Given the importance of the ratio between liquid and solid densities in shaping the melting line, and that graphite’s volume is particularly sensitive to interplanar separation, these results show that an accurate description of graphite’s phase space is essential for determining its melting behaviour.
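
The scale of the PV contribution can be estimated with a short back-of-the-envelope calculation. The sketch below assumes an AB graphite cell with four atoms, uses the in-plane lattice parameter a = 2.47 Å quoted earlier for DFT (EDIP's own value may differ slightly), and compares the enthalpy cost per atom of increasing c by 1 Å with the thermal energy at 2000 K. It is an order-of-magnitude illustration only, and the function name is hypothetical.

```python
import math

EV_PER_GPA_A3 = 1.0e9 * 1.0e-30 / 1.602176634e-19  # 1 GPa * Å^3 expressed in eV
K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def pv_cost_per_atom(pressure_gpa, delta_c, a=2.47, atoms_per_cell=4):
    """PV enthalpy cost (eV/atom) of stretching a hexagonal cell by delta_c (Å)."""
    basal_area = math.sqrt(3.0) / 2.0 * a**2  # area of the hexagonal cell base (Å^2)
    return pressure_gpa * EV_PER_GPA_A3 * basal_area * delta_c / atoms_per_cell

for p in (1.0, 10.0):
    cost = pv_cost_per_atom(p, delta_c=1.0)
    print(f"{p:4.0f} GPa: PV cost = {1000 * cost:5.1f} meV/atom, "
          f"kT at 2000 K = {1000 * K_B * 2000:.0f} meV")
```

Under these assumptions the PV penalty grows linearly with pressure, which is consistent with the bias towards large separations being weaker at 10 GPa than at 1 GPa.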

At a pressure of 10 GPa we begin to observe a small number of cubic and hexagonal diamond structures among the sampled configurations, around the freezing transition. However, these structures quickly lose thermodynamic relevance in comparison to the graphite phase. When pressure is increased to 20 GPa, the diamond configurations become enthalpically viable enough that the NS algorithm simultaneously samples them along with the graphite phase, such that we can identify the solid-solid transition. Supplementary Fig. 15 shows the densities and Q4 parameters of configurations sampled by NS, which we sort into different structural basins as before. The resulting free energies of the diamond and graphite structures are shown in the middle panel of Fig. 15, demonstrating that below the melting point graphite is more stable than diamond; however, their free energy difference is smaller than in the GAP-20U model. The most notable difference between the two potentials is that, in the temperature region where the liquid phase is the most stable, EDIP’s graphite phase is less stable than the diamond, whereas the GAP-20U shows graphite to be more stable than diamond up to the solid-solid transition. This is likely a result of the EDIP’s short cutoff providing an unphysically broad distribution of layer spacings at higher temperatures, which necessarily incurs an entropic energy penalty. As in the case of the Tersoff potential, the potential energies of the perfect cubic and hexagonal diamond structures are the same using EDIP, and above 20 GPa NS runs sampled both structures equally.
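
A minimal sketch of how a free-energy difference between two basins could be obtained from a single NS run is given below. It assumes that every sampled configuration already carries a basin label (for example from its density and Q4 value), and it uses the standard partial-partition-function expression G = −kBT ln Σ wi exp(−βHi) restricted to each basin. The exact procedure used for Fig. 15 is not spelled out in the text, so the function names and the labelling scheme are illustrative.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def log_weights(enthalpies, n_walkers, temperature):
    """ln[(Γ_{i-1} - Γ_i) exp(-βH_i)] with Γ_i = (K/(K+1))^i, i = 1..N."""
    h = np.asarray(enthalpies, dtype=float)
    i = np.arange(1, len(h) + 1)
    # Γ_{i-1} - Γ_i = Γ_{i-1} / (K + 1)
    log_dgamma = (i - 1) * np.log(n_walkers / (n_walkers + 1.0)) - np.log(n_walkers + 1.0)
    return log_dgamma - h / (K_B * temperature)

def basin_free_energy_difference(enthalpies, labels, n_walkers, temperature,
                                 basin, reference):
    """Return G_basin - G_reference (eV) from the partial partition functions."""
    lw = log_weights(enthalpies, n_walkers, temperature)
    labels = np.asarray(labels)

    def log_z(name):
        selected = lw[labels == name]
        shift = selected.max()  # log-sum-exp shift for numerical stability
        return shift + np.log(np.exp(selected - shift).sum())

    return -K_B * temperature * (log_z(basin) - log_z(reference))
```

A negative value of `basin_free_energy_difference(..., basin="graphite", reference="diamond")` would indicate that graphite is the more stable phase at that temperature.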

Fig. 15: Graphite–diamond phase transition of EDIP.

Nested sampling results at 20 GPa, using 16 atoms and the EDIP potential. Top panel: density of individual configurations sampled during NS, symbols are coloured by the average Q4 bond order parameter of the configuration. Middle panel: Gibbs free-energy difference compared to the diamond phase in three parallel runs. Configurations were associated with basins using the number density and the bond order parameter. Bottom panel: Heat capacity curves of the three parallel runs. Vertical grey lines show the melting temperature, Tm and the solid-solid transition between the diamond and graphite phases, TDG.

The top panel of Fig. 12 shows that at extreme high pressures, above 100 GPa, the EDIP’s diamond melting line rapidly increases in temperature before reaching a maximum of 24000 K at 1000 GPa. This turning point is at a much larger temperature than those predicted by the GAP-20 and DFT14, 2.5 and 3 times larger respectively. The pressure at which it occurs is also larger, though by only 10%. At 1500 GPa NS calculations explore the six-coordinated P212121 structure below the freezing transition, while zero temperature structure optimisations confirm that this structure is stable for EDIP above 2650 GPa.

Discussion

In the current work we reviewed the performance of three interatomic potential models of carbon, ranging from fast but less transferable empirical force fields to slower ML potentials with state-of-the-art accuracy. Our study focused on assessing their ability to reproduce experimentally observed macroscopic properties. We used the nested sampling technique to sample the potential energy surface of these models over a wide pressure range, calculating their pressure-temperature phase diagram and predicting crystalline phases. We emphasise that nested sampling is a unique tool that allows exhaustive exploration of the phase space and makes the calculation of the entire phase diagram a relatively straightforward process, while also being predictive and not restricted by known or considered crystalline structures. All three models, GAP-20, Tersoff and EDIP, predicted the graphite structure to be more stable at low pressures and the diamond structure at higher pressures. However, the transition between these, as well as the location of the melting line, differed considerably. Empirical potentials are often fitted to specific microscopic properties, for example to the typical coordination of graphite and diamond structures, hence their high-temperature and high-pressure behaviour cannot be expected to accurately reflect the diverse structural properties of carbon. Nevertheless, while the macroscopic properties of the Tersoff potential differ from the experimental phase diagram considerably, we found the phase diagram of the EDIP potential to be very close to experimentally observed behaviour, accurately reflecting both the predicted graphite–diamond transition and the melting line up to relatively high pressures.

Machine learning (ML) potentials provide the state of the art in descriptions of atomic interactions, opening up routes to materials discovery that are otherwise out of our reach and, to some degree, offering ab initio level accuracy at an affordable computational cost. However, the main criticism of ML potentials is that they are inherently best suited to interpolation problems, and perform reliably only in the regime of configuration space where the potential was trained. This means that their behaviour in unexplored territory, in configurational regions where the potential is forced to extrapolate from the training data, can be unrealistic or unphysical, inhibiting their use in scientific discovery. Therefore, our finding that the GAP-20 potential performs well and predicts the expected phase transitions reliably up to 200 GPa, well outside the original training conditions, is remarkable, emphasising the power of including a diverse range of local atomic environments in the training process. Moreover, the exhaustive exploration provided by NS also highlighted local weaknesses of the model, such as the stabilisation of unexpected graphite-layer distances or the prediction of erroneous phases at very high pressures, offering areas for potential improvement and extensions of the GAP-20 model. Using these observations we have presented an enhanced version of the GAP-20U potential called the GAP-20U+gr, which includes additional ordered graphite configurations in the training set to successfully avoid graphite phases with unphysical layer spacings becoming stable under certain thermodynamic conditions. However, these improvements to the model’s description of the graphite phase did not provide a more accurate melting line, leading us to conclude that the density of the liquid phase at low pressures must also be addressed in further updates to the machine-learned potential.

Methods

Nested sampling

The NS calculations were performed as presented in ref. 47. After the sampling has finished, we calculate the partition function and derive thermodynamic response functions to determine the phase behaviour. We use the position of peaks in the heat capacity to locate phase transitions, and calculate the phase space-weighted averages of observables (e.g., coordination number) to evaluate their finite temperature values using the following equation:

$$\langle A\rangle \approx \frac{1}{\Delta}\sum_{i}A_{i}\left(\Gamma_{i-1}-\Gamma_{i}\right)e^{-\beta H_{i}}, \tag{1}$$

where Δ is the isobaric partition function; β is the inverse temperature; and Ai, Hi and Γi are the observable value, enthalpy and phase space volume of the i-th configuration respectively, where Γi = (K/(K + 1))^i and K is the number of walkers in the simulation.
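
Equation (1) can be evaluated directly from the ordered list of enthalpies recorded during a run. The sketch below is a minimal illustration rather than the production analysis code: the function names are hypothetical, weights are handled in log space to avoid overflow, and the heat capacity includes only the configurational enthalpy fluctuations, with kinetic contributions omitted.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def ns_average(observable, enthalpies, n_walkers, temperature):
    """Evaluate Eq. (1): phase-space-weighted average at one temperature."""
    a = np.asarray(observable, dtype=float)
    h = np.asarray(enthalpies, dtype=float)
    i = np.arange(1, len(h) + 1)
    ratio = n_walkers / (n_walkers + 1.0)
    # ln[(Γ_{i-1} - Γ_i) exp(-βH_i)], using Γ_i = ratio**i
    log_w = (i - 1) * np.log(ratio) + np.log(1.0 - ratio) - h / (K_B * temperature)
    w = np.exp(log_w - log_w.max())  # the shift cancels on normalisation
    return float(np.sum(a * w) / np.sum(w))

def heat_capacity(enthalpies, n_walkers, temperature):
    """Configurational heat capacity from enthalpy fluctuations (kinetic terms omitted)."""
    h = np.asarray(enthalpies, dtype=float)
    h_shift = h - h.min()  # fluctuations are unchanged by a constant shift
    mean = ns_average(h_shift, h, n_walkers, temperature)
    mean_sq = ns_average(h_shift**2, h, n_walkers, temperature)
    return (mean_sq - mean**2) / (K_B * temperature**2)
```

Because the normalisation is computed from the same weights, the partition function Δ never has to be formed explicitly.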

In an infinite system, the heat capacity peaks would be divergent due to a first order discontinuity in the corresponding enthalpy vs. temperature curves, but the finite size of these systems causes a broadening of the peaks. The temperature of a given transition and its error are ascertained from the combination of data from each of the three independent runs we performed at every pressure. In order to test the convergence of the simulations we fit Gaussian functions to the heat capacity peaks, and the lower and upper bounds of the error are taken to be the minimum and maximum temperature values of the peaks’ half-maxima. The simulations were run at constant pressure, and the bounding cell of variable shape and size contained 16, 32 or 64 particles (depending on the potential), in order to estimate the finite size effect. Previous calculations show that the small system size usually causes the melting temperature to be overestimated; however, the solid-solid transitions are less affected, with sampled crystalline phases usually remaining consistent across different system sizes38,47. We note that these results may be augmented by further calculations using standard simulation techniques (e.g. parallel tempering62,63, coexistence simulations64, thermodynamic integration65) with larger system sizes, using the NS-predicted phases as a guide. For each calculation, the number of walkers, K, was chosen such that the resulting heat capacity peaks were sufficiently converged, so that predicted transition temperatures were generally within a range of 200 K (exceptions are noted). Using a larger number of walkers means a sampling of higher resolution, with the computational cost increasing linearly with K. The number of walkers used for each potential and system size is recorded in Table 1.
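
As a simple illustration of the half-maximum bounds, the following sketch locates the highest point of a heat capacity curve and the temperatures at which the curve falls to half of that height. Note the swapped-in technique: the bounds are read directly off the tabulated curve with linear interpolation, instead of fitting Gaussian functions as described above, and the peak height is measured from zero without subtracting a baseline. The function name is hypothetical.

```python
import numpy as np

def peak_half_max_bounds(temperatures, heat_capacity):
    """Return (T_peak, T_low, T_high): peak position and half-maximum bounds.

    Walks outwards from the highest point until the curve drops to half of
    the peak height, then interpolates linearly between grid points.
    """
    t = np.asarray(temperatures, dtype=float)
    c = np.asarray(heat_capacity, dtype=float)
    k = int(np.argmax(c))
    half = 0.5 * c[k]

    lo = k
    while lo > 0 and c[lo] > half:
        lo -= 1
    hi = k
    while hi < len(c) - 1 and c[hi] > half:
        hi += 1

    def crossing(j_out, j_in):
        # If the curve never reaches half maximum on this side, return the grid edge.
        if c[j_out] > half:
            return t[j_out]
        return t[j_out] + (half - c[j_out]) * (t[j_in] - t[j_out]) / (c[j_in] - c[j_out])

    return t[k], crossing(lo, lo + 1), crossing(hi, hi - 1)
```

Applied to each of the three independent runs, the smallest lower bound and largest upper bound would give an error bar in the spirit of the procedure described above.
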
Initial sample configurations were generated randomly to simulate the gas phase, while subsequent samples were acquired by performing a sufficiently large number of randomly selected “moves”, referred to as the number of model calls in Table 1. These include Hamiltonian Monte Carlo (all-atom) moves; isotropic volume changes; and perturbations to the shape of the simulation cell via stretch and shear transformations, where the probability that each move occurs is given by the ratio 5:3:2:2 (atom:volume:stretch:shear)38.
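
The quoted ratio fixes the probability of each move type (5/12, 3/12, 2/12 and 2/12). A minimal sketch of drawing moves with these probabilities is shown below. How the NS code actually schedules moves, and how they are counted towards the number of model calls, is not specified in the text, so the names and structure here are purely illustrative.

```python
import random

# Relative move frequencies quoted in the text (atom:volume:stretch:shear).
MOVE_RATIO = {"atom": 5, "volume": 3, "stretch": 2, "shear": 2}

def choose_moves(n_moves, seed=None):
    """Draw a random sequence of move types with probabilities set by MOVE_RATIO."""
    rng = random.Random(seed)
    names = list(MOVE_RATIO)
    weights = [MOVE_RATIO[name] for name in names]
    return rng.choices(names, weights=weights, k=n_moves)

sequence = choose_moves(12000, seed=0)
fractions = {name: sequence.count(name) / len(sequence) for name in MOVE_RATIO}
print(fractions)
```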

Table 1 Summary of NS parameters used for each carbon model.

DFT calculations

In order to compare the energies of configurations and phase stability predicted by the GAP-20 and GAP-20U models, we employ density functional theory (DFT) with the same input parameters as those used to generate the data on which the GAP-20U model was trained33. DFT calculations are therefore carried out using the Vienna ab initio Simulation Package (VASP), with the dispersion-inclusive optB88-vdW exchange-correlation functional66,67,68,69, and the projector augmented wave (PAW) pseudopotential method (PAW_PBE C 08Apr2002)70,71,72 with a plane-wave cutoff of 600 eV. In each case, reciprocal space is sampled using an automatically generated, Γ-centred Monkhorst-Pack mesh such that the smallest spacing between k-points is no greater than 0.2 Å−1, and energy levels are smeared by Gaussian distributions with widths of 0.1 eV.
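
To give a sense of what this spacing criterion implies, the sketch below computes the smallest Γ-centred mesh for which the spacing along each reciprocal lattice vector does not exceed 0.2 Å−1. It assumes reciprocal vectors defined with the 2π factor, which may differ from the convention of a given code or setting, and it uses the AB graphite lattice parameters quoted earlier. The function name is hypothetical and nothing here calls VASP itself.

```python
import numpy as np

def kpoint_mesh(lattice, max_spacing=0.2):
    """Smallest mesh whose k-point spacing does not exceed max_spacing (Å^-1).

    Assumes reciprocal vectors that include the factor of 2π.
    """
    cell = np.asarray(lattice, dtype=float)            # rows are lattice vectors (Å)
    reciprocal = 2.0 * np.pi * np.linalg.inv(cell).T   # rows are reciprocal vectors
    lengths = np.linalg.norm(reciprocal, axis=1)
    return tuple(int(max(1, np.ceil(b / max_spacing))) for b in lengths)

a, c = 2.47, 6.64  # AB graphite values quoted in the text (Å)
graphite_ab = [[a, 0.0, 0.0],
               [-a / 2.0, a * np.sqrt(3.0) / 2.0, 0.0],
               [0.0, 0.0, c]]
mesh = kpoint_mesh(graphite_ab)
print(mesh)
```

Under these assumptions the short in-plane lattice vectors require a much denser mesh than the long c axis, and compressing or stretching c in the graphite scans changes the number of subdivisions along that direction.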