Introduction

Rocks of Hadean age (>4 Ga) are lacking from the Earth’s rock archive. Much of our knowledge regarding the Earth’s earliest history has been gleaned from the geochemical features of the physico-chemically resistant mineral zircon, which occurs as detrital grains in metasedimentary rocks from the Jack Hills (JH), Western Australia1,2,3,4,5,6. Deciphering the source rocks of JH zircons is thus critically important in establishing the composition and tectonic affiliation of the Earth’s earliest crust2,7,8, as well as the potential for initial terrestrial habitability9,10. Studies to date have argued for mafic source rocks11,12, impact melts13,14, and felsic source rocks8,15,16,17. However, the elevated δ18O in many JH zircons7,17,18 and the predicted high source melt SiO2 contents of the JH zircons19,20 are not consistent with a dominant mafic source rock origin. The possibility of an impact melt sheet origin was also subsequently ruled out due to the noticeable distinctions between JH zircons and those from rocks at the Sudbury impact crater16,19,21. Although there is an increasing consensus for the derivation of the JH zircons from felsic melts in a continental setting, the exact source rocks remain disputed. A large number of studies22,23,24,25 have suggested that the Hadean continental crust should have compositions comparable to granitoids of the tonalite–trondhjemite–granodiorite (TTG) series (generally produced by melting and/or crystallization of a basaltic source26). This seems reasonable based on the subsequent dominance of TTGs in the Archean (4.0–2.5 Ga) continental crust26,27. This argument has also been justified by comprehensive zircon Hf isotope studies28, recent thermodynamic modeling29, and calculated model melts based on Ti-calibrated zircon/melt partition coefficients30.

Others have proposed the formation of the JH zircons in near-H2O saturated metaluminous (i.e., I-type) and/or peraluminous (i.e., S-type) magmas commonly seen in modern convergent plate margins6,17,31, rather than in TTG magmas. The supporting evidence includes the low crystallization temperature illustrated by Ti-in-zircon thermometry (the opposite of what is expected from TTG magmas)32,33 and mineral inclusion assemblages that are indicative of I- and S-type granitoids15,17. However, none of the above evidence is unequivocal. The calculated crystallization temperature is highly sensitive to the choice of the TiO2 activity, and using a low TiO2 activity (e.g., 0.4–0.5 versus 1) for JH zircons will return a similar temperature range to TTG magmas29. Meanwhile, whether the observed inclusions are primary or not is debated34. Furthermore, even among the studies that advocate derivation from I- and S-type magmas, controversy extends to whether S-type source rocks dominated the felsic portion17, or I-type source rocks prevailed in the Earth’s earliest continental crust16. The predominance of muscovite and quartz inclusions (accounting for nearly three-quarters of inclusions in Hadean JH zircons), if indeed primary, is more consistent with derivation from an S-type dominated magma source15,17. In contrast, aluminum and phosphorus contents in JH zircons are argued to suggest mainly I- rather than S-type protoliths8,16,35. However, neither Al nor P proxies can effectively identify TTG zircons8,16, and thus TTGs remain a possible protolith of the JH zircons.

To address the composition of the Earth’s earliest continental crust and its possible tectonic significance, in this study we establish a machine learning method that can distinguish detrital zircons derived from TTGs, I-type granites, and S-type granites; we then apply this method to identify the source rocks of the JH zircons. Our results show that most JH zircons (almost 70%) are from I- and S-type granites, rather than TTGs. This finding runs counter to the general view and has important implications for the style of Earth’s earliest tectonic regimes.

Results and discussion

A machine learning method for distinguishing source rocks of detrital zircons

To provide solid constraints on the Hadean continental crust, a better understanding of the source rock information recorded by zircon is required. In this study, we compiled a zircon trace element dataset from a variety of source rocks. This set included 3168 published zircon analyses from I-type rocks, 2056 from S-type rocks, and 808 from TTGs (see Methods), which were all the data available to the authors at the time of writing. This offers a unique opportunity to investigate the relationship between the trace element geochemistry of zircon and its provenance, which in turn can be used to decipher the likely geochemical source of detrital grains for which a connection with the original source is not preserved. Theoretically, S-type granites are more reduced than I-type granites36 and also probably TTGs (because both I-type granites and TTGs are derived from igneous source rocks). Most TTGs should also be more depleted in the HREE than S-type rocks because of the widely accepted derivation of the dominant TTG groups (i.e., medium- and high-pressure groups that account for ~80% of global TTGs) from a garnet-bearing mafic source26, despite some claims to the contrary37. Thus, it is expected that S-type zircons can be distinguished from I-type and TTG zircons according to Ce/Ce* and Eu/Eu*, which are indicators of magma oxidation state38, whereas most TTG zircons should have HREE contents (e.g., Yb) lower than those of S-type zircons. In practice, however, distinguishing source rock composition using zircon geochemistry is complicated39, as indicated by the noticeable overlap in the chondrite-normalized rare earth element (REE) patterns (Fig. 1a–c and Supplementary Fig. 1), and in the Ce/Ce* versus Eu/Eu* diagram of zircons from those rock types (Fig. 1d).
The reason for such overlaps is that the trace element chemistry of zircon is affected by many different variables in addition to parental melt composition (e.g., temperature, pressure, oxygen fugacity, and competition from other minerals40,41,42). Differentiating the relative significance (and thus further deconvolving the effects) of these variables is extremely challenging, especially for detrital grains that lack a direct link with the source rock from which they were derived39. Therefore, although parental melt composition may act as a first-order control on the trace element composition of zircons that crystallized from it, the relationship between zircon compositions and their parental magma may not be as intuitive as we have expected.

Fig. 1: Zircon trace element diagrams.

Kernel density plots of chondrite-normalized REE patterns are shown for zircons from TTGs (a), I-type (b) and S-type (c) rocks. d Zircon Ce/Ce* versus Eu/Eu* diagram (red circles for I-type zircons, green circles for S-type zircons and yellow circles for TTG zircons). a–c were generated using the ksdensity normal kernel smoothing function in Matlab. Chondrite values are from Sun and McDonough92. The exponential power function, as described by Zhong et al.38, was used to calculate Ce/Ce* and Eu/Eu* in d.

In this study, we therefore applied machine learning (ML) technology to relate the trace element geochemistry of zircon to provenance (Fig. 2). Compared with traditional classification methods that are based on single elements8,16 or some binary and/or triangular diagrams (where generally only a couple of elements are utilized)39, the advantage of ML is that it can effectively utilize more features and capture complex nonlinear relationships among large datasets43,44,45. This approach promises to achieve a much higher level of classification accuracy than the previous methods. Moreover, ML learns the classification features by itself without being explicitly programmed, and thus the internal, complex relationships within the data can be discovered algorithmically without the requirement for pre-existing knowledge. To acquire the best classification models to identify the granitic sources from which detrital zircons are derived, we applied three common supervised ML approaches, namely, Support Vector Machine (SVM), Random Forest (RF), and Multilayer Perceptron (MLP), and compared their prediction performance. Details about zircon selection criteria, data curation, and modeling procedures are presented in the Methods section and Supplementary Table 1. It should be noted that our ML classifiers can only output three types of source rocks. This may be problematic when these classifiers are used to identify the provenance of zircons that may come from source rocks other than I-type, S-type and TTG rocks. Thus, a preliminary study is needed to mitigate such concerns before the use of our ML classifiers. However, as discussed at the beginning, previous studies have demonstrated that the source rocks of JH zircons should be characterized by the overwhelming majority of I-type, S-type rocks and/or TTGs over other source rocks (if any). Thus, an ML classifier trained on I-type, S-type and TTG zircons is appropriate for provenance studies of the JH zircons.

Fig. 2: Overview of the study design.

Data collection & filtering: compositions of three types of zircons were compiled from publications52 and filtered to exclude analyses on cracks, inclusions, and/or noticeable hydrothermal alteration features. We used an undersampling technique to handle the class imbalance problem, after which data were split into 80% training and 20% test. 17 features were extracted, including 11 REEs (except La, Pr and Nd), Th, U, Th/U, U/Yb, Ce/Ce*, and Eu/Eu*. Model training: zircon classifiers were trained with a tenfold cross-validation hyperparameter search, using the algorithms Support Vector Machine (SVM), Random Forest (RF), and Multilayer Perceptron (MLP). Evaluation & interpretability: classification performance was measured using metrics commonly used in ML (confusion matrix and ROC curves) and the influence of every input feature value on the model output was explained by the calculated SHAP values48; model performance was also tested using the well-studied detrital zircons from the Gangdese magmatic belt, southern Tibet and the Western Dharwar Craton, southern India. Application & interpretation: SVM and MLP, which show better performance, were used to identify the source rocks of 4.4–3.3 Ga Jack Hills detrital zircons; the diverse sources of these detrital zircons indicate a special tectonic environment for Earth’s earliest continental crust.

Seventeen features—including 11 REEs (Ce, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, and Lu), Th, U and 4 derived trace element ratios (Th/U, U/Yb, Ce/Ce* and Eu/Eu*)—were used for the ML algorithms. Th and U concentrations were corrected for radioactive decay since the time of crystallization (see Methods). These 17 features were selected because (1) they are routinely analyzed in many laboratories and are more commonly reported in the literature; and (2) they have been shown to be useful in discriminating zircon provenance39,41,46, despite some claims to the contrary11,47. Moreover, our statistical analysis work has indicated that although none of these selected elements and/or ratios is able to independently identify all three types of zircons, each can distinguish at least one zircon type from the rest (Supplementary Fig. 2). For example, most S-type zircons can be distinguished by lower Ce and higher Tb; most I-type zircons can be distinguished by higher Th/U and higher Ce/Ce*; and most TTG zircons can be distinguished by much lower Th and U.
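The derived ratio features can be illustrated with a minimal sketch. It assumes the conventional geometric-mean definition of Eu/Eu* rather than the exponential power function of Zhong et al.38 used in this study, and it omits Ce/Ce* because that calculation is not specified in enough detail here. The normalizing values are those commonly quoted from Sun and McDonough92 and should be checked before use; the function names and the example analysis are hypothetical.

```python
import math

# CI-chondrite normalizing values (ppm) as commonly quoted from
# Sun and McDonough; treated here as assumptions to be verified.
CHONDRITE = {"Sm": 0.153, "Eu": 0.058, "Gd": 0.2055}


def eu_anomaly(sm_ppm, eu_ppm, gd_ppm):
    """Eu/Eu* from the geometric mean of chondrite-normalized Sm and Gd.

    This is the conventional formula, swapped in for illustration; it is
    not the exponential power function used by the authors.
    """
    sm_n = sm_ppm / CHONDRITE["Sm"]
    eu_n = eu_ppm / CHONDRITE["Eu"]
    gd_n = gd_ppm / CHONDRITE["Gd"]
    return eu_n / math.sqrt(sm_n * gd_n)


def derived_ratios(analysis):
    """Return ratio features for one zircon analysis (dict of ppm values)."""
    return {
        "Th/U": analysis["Th"] / analysis["U"],
        "U/Yb": analysis["U"] / analysis["Yb"],
        "Eu/Eu*": eu_anomaly(analysis["Sm"], analysis["Eu"], analysis["Gd"]),
    }


# Hypothetical analysis (ppm); the numbers are invented for illustration.
example = {"Th": 120.0, "U": 240.0, "Yb": 480.0,
           "Sm": 3.06, "Eu": 0.58, "Gd": 20.55}
ratios = derived_ratios(example)
print(ratios)
```

In a full workflow the Th and U values passed to such a function would be the decay-corrected concentrations rather than the measured ones.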

For each model, the individual metrics for each fold during the tenfold cross-validation process are reported in Supplementary Table 2, with the average performance metrics for the test set after the tenfold cross-validation process summarized in Supplementary Fig. 3 and Supplementary Table 3. According to the performance metrics of the test set, all three trained ML algorithms perform well in identifying the source rocks of zircons, with an overall accuracy of 0.88 for SVM, 0.84 for RF, and 0.87 for MLP (Supplementary Table 3). In the trained SVM and MLP models, the precision for each type of zircon is higher than 0.82; in the RF model, the individual precision is similarly high (0.82–0.87) except for TTG zircon (0.79). Moreover, all three trained ML models are characterized by high AUC values (0.967 for SVM, 0.965 for RF, and 0.968 for MLP). All the above results confirm that the trained models perform very well in predicting zircon types. To investigate how the models had learned input-output relationships, we used an explainable artificial intelligence approach (SHAP48). As described in detail in the Methods section, a SHAP value is calculated for each feature of each zircon type. The amplitude of the SHAP value reflects how important a feature is for a certain zircon type, while the sign of the SHAP value reflects whether the feature contributes positively or negatively to the prediction of that zircon type.

By comparing with the statistical analysis pattern (Supplementary Fig. 2), the SHAP summary plots indicate that the relationship between input and output was captured plausibly (Supplementary Figs. 4–6). For example, Th/U was captured as the most important feature in distinguishing I-type zircons for all three models (Supplementary Figs. 4–6); for I-type zircons high Th/U inputs (red) produce high SHAP values and therefore have a strong positive influence on the model output, whereas for S-type and TTG zircons, low Th/U inputs (blue) produce high SHAP values and therefore have a strong positive influence on the model output. This corresponds with our basic understanding derived from the statistical analysis result, where I-type zircons are visually characterized by noticeably higher Th/U than other types of zircons (Supplementary Fig. 2). The other four most important features in distinguishing I-type zircons are Tb, Eu, Yb and Lu for the SVM model (Supplementary Fig. 4); Eu/Eu*, Th, Ce/Ce* and Ce for the RF model (Supplementary Fig. 5); and Tb, Lu, Eu/Eu* and Th for the MLP model (Supplementary Fig. 6). Ce and Eu, as well as the derived ratios (Ce/Ce* and Eu/Eu*) are important in distinguishing S-type zircons (Supplementary Figs. 4–6). Th and Th/U are of importance in distinguishing TTG zircons in the RF and MLP models (Supplementary Figs. 5, 6), whereas Dy and Tm are the most important features in the SVM model (Supplementary Fig. 4). Again, these all correspond well with what has been seen from the statistical analysis result (Supplementary Fig. 2), despite the slight distinctions among models in the relative importance of different features. While we now know the input-output relationships in each trained model, their geological significance, for example, why most I-type zircons are characterized by much higher Th/U than other zircon types49, is still unclear and further research is merited.

Plausibility checks by two case studies

It has been suggested that a model’s stated performance may not accurately reflect its performance post-deployment because of, for example, overfitting50 and black-box effects of the used ML methods51. Thus, before applying these trained models to provenance studies of the JH zircons, we first evaluated them using the 150–50 Ma detrital zircons from the Gangdese magmatic belt in southern Tibet and 3600–2700 Ma detrital zircons from the Western Dharwar Craton, southern India52. None of the Phanerozoic Gangdese detrital zircon grains belongs to the TTG zircon population and previous studies have demonstrated that the 150–50 Ma batholith in the Gangdese magmatic belt (and thus detrital zircons in this area with the same age span) is predominantly I-type53. In contrast, the Western Dharwar Craton detrital zircons should dominantly be of TTG origin54. Thus, these detrital zircons provide two ideal examples to test the plausibility of these ML models in distinguishing the provenance of real-world detrital zircons. The provenance results predicted by the three models are shown in Supplementary Fig. 7. It can be seen that the three ML methods give very similar results for each case. Most of the detrital grains (616 of 733 analyses in SVM, 594 in RF, and 586 in MLP) from the Gangdese magmatic belt are classified into the population from I-type rocks with only a few (2–7%) wrongly classified into the TTG population (Supplementary Fig. 7a), whereas TTGs are identified as the dominant source rocks of the Western Dharwar Craton detrital zircons (53 of 65 analyses in SVM, 46 in RF and 50 in MLP; Supplementary Fig. 7b). These model results are consistent with our basic understanding of local geology.
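The counts quoted above can be converted into shares as a quick arithmetic check. The sketch below uses only the numbers reported in the text; the variable and function names are illustrative.

```python
# Counts reported in the text: grains assigned to the expected source type.
gangdese_total = 733
gangdese_i_type = {"SVM": 616, "RF": 594, "MLP": 586}

dharwar_total = 65
dharwar_ttg = {"SVM": 53, "RF": 46, "MLP": 50}


def shares(counts, total):
    """Convert per-model counts into fractions of the analysed grains."""
    return {model: n / total for model, n in counts.items()}


gangdese_share = shares(gangdese_i_type, gangdese_total)
dharwar_share = shares(dharwar_ttg, dharwar_total)

for model in ("SVM", "RF", "MLP"):
    print(f"{model}: Gangdese I-type {gangdese_share[model]:.1%}, "
          f"Dharwar TTG {dharwar_share[model]:.1%}")
```

On these counts, roughly 80–84% of the Gangdese grains are assigned to I-type sources and roughly 71–82% of the Western Dharwar grains to TTG sources, depending on the model.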

Overall, the metric results derived from the three test sets (including one test set during training and the above two external test sets) in this study consistently affirm the robustness of the three trained classifiers. Despite this, in the Gangdese case (Supplementary Fig. 7a), the trained RF model returns a greater proportion of zircons incorrectly classified as TTG (7%) compared to the SVM and MLP models (both 2%), indicating the relatively low performance of the RF model in distinguishing TTG zircons. This is also reflected in the confusion matrix, where the precision of the trained RF model in identifying TTG zircons (0.79) is lower than that of the two other models (both 0.88; see Supplementary Fig. 3 and Supplementary Table 3). Considering that correctly distinguishing the detrital zircons of TTG origin is of particular importance for this study, only the trained SVM and MLP models are used for the provenance studies of the JH zircons.

The provenance of the Jack Hills zircons

We compiled a high-quality JH detrital zircon database comprising 666 published trace element analyses52. The classification results based on the trained SVM and MLP models are given in Fig. 3. The two ML models give a very consistent zircon-type distribution pattern with time, further indicating the high reliability of the results. According to the SVM classifier, 36% of the 666 compiled JH grains are from I-type rocks, 33% from S-type rocks, and 31% from TTG rocks (Fig. 3a). The MLP classifier gives a very similar result, with 32% of grains classified into the TTG population (Fig. 3b). Figure 3c further shows that the JH zircons derived from I- and S-type source rocks dominate over those from TTGs except in the Hadean. During 4.2–4.0 Ga, the proportion of JH zircons derived from TTGs is broadly consistent with, and only locally higher than, that from I- and S-type rocks. The same pattern is also observed during 4.4–4.2 Ga, although only 12 grains in the 666 JH data (accounting for less than 2%) give ages older than 4.2 Ga, and thus the zircon sources for this timeframe are less constrained. The above source rock pattern for the JH zircons may be flawed to some degree by a preservation bias inherent in using detrital zircons. However, in the absence of a natural selection mechanism that preferentially excludes zircons formed from TTG magmas, TTGs are unlikely to have contributed noticeably to the Hadean JH zircon population. Overall, it can be seen that the JH continental crust pattern is different from typical Archean continental crust where TTGs account for an overwhelming proportion (>80%) of felsic rocks55.

Fig. 3: Provenance of the compiled JH zircons.

a Classification result based on the trained SVM model. b Classification result based on the trained MLP model. c The average distribution of zircon types with time.

Implications for the early Earth

Our study shows that the JH continental crust—which probably represents Earth’s earliest continental crust—was not predominantly composed of TTGs. On the contrary, it encompasses a high proportion of I- and S-type rocks (in the Hadean and especially in the Archean) that are commonly richer in K2O. The variety of granitic sources for the JH zircons is noticeably different from the typical Archean continental crust. The record derived from the surviving Archean crust suggests that TTGs should constitute more than 80% of the felsic portion55, while potassic granitoids (including typical I- and S-type rocks) appear later in Earth’s history, locally after 3.2 Ga and globally by the end of the Archean56,57,58,59.

Remarkably, comprehensive trace element analyses have not been conducted for Hadean zircons from terrestrial localities beyond the JH region. This hinders to some degree a robust comparison of JH zircons with other Hadean populations, which, in turn, makes it unclear how representative the JH crust is of the Hadean world60. Nonetheless, we can glean some clues from the 4.02 Ga Idiwhaa tonalitic gneiss (ITG) within the Acasta Gneiss Complex in Canada. This rock is the oldest well-preserved terrestrial rock unit61. The whole-rock trace-element systematics of the ITG are markedly different from those of average Archean TTGs. TTGs are generally characterized by noticeably depleted HREE and indistinct Eu anomalies (due to the involvement of garnet in melting or magma fractionation)62; in contrast, the REE pattern of the ITG shows little fractionation of LREEs from HREEs and pronounced negative Eu anomalies (probably due to noticeable plagioclase fractionation)62. Thus, these lines of evidence, combined with those from JH zircons, collectively support the notion that Hadean continental crust was composed of a more diverse suite of granitoids than just the TTGs that predominate in typical Archean crust.

What, then, can be said about the early Earth? The diverse assemblage of I- and S-type granitoids and TTGs in Earth’s earliest continental crust must in part reflect the tectonic setting. The geochemical diversity of Archean TTGs has been generally ascribed to two geodynamic settings: subduction models and plateau-like models. The subduction models involve plate tectonics and dominance of horizontal forces, suggesting that TTGs were produced by partial melting of a subducting slab63,64,65,66. The plateau-like models instead suggest the formation of TTGs near the base of thick, plateau-like basaltic crust in non-plate tectonic regimes37,67,68,69. Many numerical modeling studies have suggested that the hotter mantle conditions in the Hadean Earth than in the present-day Earth (as indicated by geological and geochemical data70,71) may not allow continuous subduction and thus have supported the formation of TTGs under plateau-like vertical tectonic regimes72. The ITG, though compositionally different from TTGs, has been proposed to be nearly identical to some intermediate rocks from Iceland, a modern-day plateau setting formed via a mantle plume62. This seemingly supports the proposal of vertical tectonic settings for the early Earth.

However, for occurrences where a plateau-like setting (whether or not formed via vertical tectonics) is proposed, few I- and S-type granites have been reported22,26. Alternatively, according to modern-day environments, the I- and S-type granites mainly occur in convergent plate margin settings17. This is consistent with recent geochemical modeling and B and Ca isotope studies19,30,73,74, in which a modern continental arc-like setting is proposed for the Hadean Earth. It is noted that modern continental arcs are generally characterized by the overwhelming majority of I- over S-type rocks, whereas S-type source rocks inferred from JH zircons are only slightly lower than the I-type population (Fig. 3). The relatively higher proportion of peraluminous zircons in the JH region than in the modern arcs, if not resulting from a preservation bias, could be explained by the moderate to high-pressure fractionation of hydrous mafic material (although the most volumetrically crucial way to produce strongly peraluminous JH melts might be still through melting of weathered sediments), which could produce alumina-rich melts and thus a higher proportion of peraluminous zircons over those found in typical modern arcs8. Lateral motion of the Hadean and early Archean lithosphere and its recycling into the mantle does not necessarily suggest ‘modern-style’ plate tectonics58,75,76, but is consistent with a more general mobile lid environment, which may take many forms59,77,78.

Overall, our work shows that the dominant components of Earth’s earliest continental crust are consistent with some form of lateral motion of the lithosphere and are difficult to produce with models based purely on (plume-driven) vertical tectonics. If the earliest TTGs inferred from the JH zircons were indeed formed by plateau-like models as argued for their Archean counterparts37,67,68,69, then the most plausible scenario for the early Earth is that, as in modern Earth, the Hadean continental crust was organized into two different tectonic regimes63. The major difference is probably that in the Hadean Earth, vertical tectonics dominated over a modern plate tectonic-like regime79. This is consistent with many arguments that the plate tectonic regime on Earth was unlikely to have commenced synchronously, but rather began locally and progressively became more widespread58,59,77,80,81. Additional work is needed to illustrate how these different tectonic styles were reconciled with each other in the early Earth.

Methods

Data compilation and feature selection

Zircon compositions from I-type, S-type and TTG rocks worldwide, as well as detrital zircon compositions from the Gangdese magmatic belt, the Western Dharwar Craton, and the JH region, were compiled from over 140 references (ca. 14,500 analyses) available to the authors at the time of writing52. Previous studies have shown that adakitic rocks, which are mainly characterized by high whole-rock Sr/Y and La/Yb ratios82, represent a special end member of I-type rocks that shares many geochemical features with Archean TTGs67. To equip the ML models with the ability to distinguish this special I-type end member from TTG zircons, 4600 zircon analyses from adakitic rocks have been compiled into the I-type dataset. Eleven REEs (Ce, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, and Lu), Th, U, Th/U, U/Yb, Ce/Ce* and Eu/Eu* were used as features for all ML algorithms. Due to radioactive decay, the measured Th and U concentrations would be lower than those at the time of crystallization. This is especially true for the Archean TTG zircons and the JH zircons. Thus, both Th and U (and thus the derived Th/U and U/Yb) were corrected back to the time of crystallization. We did not include three REEs (La, Pr and Nd) in our ML models. This is because La and Pr are present at very low levels in natural zircons and are generally close to or below the instrumental detection limits. Thus, they are missing in many zircon analyses, and even where they are reported, their contents may not be reliable. Although the Nd concentrations of magmatic zircons are generally far above the detection limit, our statistical analysis shows that the differences in Nd (e.g., in the medians and interquartile ranges) among the three types of zircons are not as pronounced as those observed for other REEs (Supplementary Fig. 2). Some elements like Al, P, Sc, Hf and Y, which may also be useful in the identification of the origin of zircon, were not used in this study. This is because many geochemical analyses of zircons did not report these elements, and thus excluding these elements allows us to use more published data.
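The decay correction of Th and U can be sketched with the standard radioactive decay law, N0 = N·exp(λt). The text does not spell out the exact procedure, so the following is an assumed implementation: the decay constants are the widely used values for 232Th, 238U and 235U, the present-day 238U/235U ratio is taken as 137.88, and mass fractions are treated as atomic fractions, which is a small approximation. The function names are hypothetical.

```python
import math

# Decay constants in 1/yr (widely used values; assumptions for this sketch).
LAMBDA_232TH = 4.9475e-11
LAMBDA_238U = 1.55125e-10
LAMBDA_235U = 9.8485e-10
U238_U235_TODAY = 137.88  # assumed present-day atomic ratio


def th_at_crystallization(th_ppm, age_ma):
    """Back-calculate Th (ppm) at crystallization from the measured value."""
    t = age_ma * 1e6  # Ma -> years
    return th_ppm * math.exp(LAMBDA_232TH * t)


def u_at_crystallization(u_ppm, age_ma):
    """Back-calculate total U (ppm), treating 238U and 235U separately."""
    t = age_ma * 1e6
    f238 = U238_U235_TODAY / (U238_U235_TODAY + 1.0)  # present-day fraction
    f235 = 1.0 - f238
    growth = (f238 * math.exp(LAMBDA_238U * t)
              + f235 * math.exp(LAMBDA_235U * t))
    return u_ppm * growth


# Hypothetical 4000 Ma grain with 100 ppm Th and 200 ppm U measured today.
th0 = th_at_crystallization(100.0, 4000.0)
u0 = u_at_crystallization(200.0, 4000.0)
print(th0, u0, th0 / u0)
```

Under these assumptions, a 4.0 Ga grain has initial Th about 1.2 times and initial U about 2.2 times the measured values, so the corrected Th/U is noticeably lower than the measured ratio. This is why the correction matters most for the Archean TTG and JH zircons.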

Treatment of missing values and data filtering

As described above, the ML modeling in this study requires at least 13 trace elements (11 REEs plus Th and U). However, not all of these 13 elements were determined in every analysis, causing gaps in the compiled database. This is inevitable because these data were compiled from different studies that used different analytical procedures, not all of which were capable of determining the full range of elements. For simplicity, we excluded the analyses with missing values for Th and U. In contrast, the analyses with partially missing REE data were not excluded, since the missing values can be readily estimated from the other REE concentrations using the method by Zhong et al.38. The use of this composite dataset, which comprises different sources of data, also requires quality control to handle outliers. Statistical outliers can be easily identified (e.g., using the standard deviation), but this approach relies on the data being normally distributed. However, geochemical data rarely exhibit normal or log-normal distributions83, indicating that statistical outliers may arise as a natural product of diverse geological processes84. Thus, in this study, we did not exclude statistical outliers, because discarding them might ultimately bias the models. Instead, our focus is on assessing outliers resulting from analytical or human errors. Many studies have shown that zircon compositions (especially LREEs) are highly susceptible to contamination by accessory mineral inclusions85,86. The common accessory mineral inclusions include apatite, titanite, monazite, allanite and xenotime3,85, which are characterized by noticeably higher La contents than the host zircon85. To exclude such artifacts, we followed previous studies and used a selection criterion of La < 1 ppm87. After this filtering, 9050 individual analyses remained. A further 2193 analyses with noticeably discordant ages (discordance more than 20%), which are in general related to alteration and/or metamorphism, were also discarded. For the JH zircons, grains with 207Pb/206Pb age < 3300 Ma were further discarded since they might have experienced noticeable Pb loss. After filtering, 3168 of 8500 zircon grains from I-type rocks, 2056 of 5350 from S-type rocks, 808 of 2305 from TTGs, 733 of 1494 from the Gangdese magmatic belt, 66 of 192 from the Western Dharwar Craton, and 666 of 905 from the JH region were retained. The statistical features of the compiled zircon data from the three source rocks are shown in Supplementary Fig. 2. As already mentioned, the three distinct zircon populations can be distinguished to a certain degree by each element and/or ratio. The datasets for I- and S-type rocks and TTGs were then randomly subdivided into training (80%) and test (20%) sets, each preserving the proportion of high and low values for a given element of the full dataset.
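The filtering rules and the 80/20 split can be sketched with pandas and scikit-learn. The table below is synthetic, the column names (`La`, `discordance_pct`, `source`) are hypothetical, and the split is stratified by class label, which is an assumption: the text describes preserving the distribution of element values, which may correspond to a different stratification scheme.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
# Synthetic stand-in for the compiled table; all values are invented.
df = pd.DataFrame({
    "La": rng.lognormal(mean=-2.0, sigma=1.5, size=n),   # ppm
    "discordance_pct": rng.uniform(0.0, 40.0, size=n),   # % age discordance
    "Th": rng.lognormal(mean=4.0, sigma=0.5, size=n),    # ppm
    "U": rng.lognormal(mean=5.0, sigma=0.5, size=n),     # ppm
    "source": rng.choice(["I-type", "S-type", "TTG"], size=n,
                         p=[0.53, 0.34, 0.13]),
})
df.loc[df.index % 25 == 0, "Th"] = np.nan  # simulate analyses lacking Th

# 1. Drop analyses with missing Th or U.
kept = df.dropna(subset=["Th", "U"])
# 2. Drop analyses likely contaminated by inclusions (La >= 1 ppm).
kept = kept[kept["La"] < 1.0]
# 3. Drop analyses with more than 20% age discordance.
kept = kept[kept["discordance_pct"] <= 20.0]

# 80/20 split, stratified by class label (an assumption of this sketch).
train, test = train_test_split(
    kept, test_size=0.2, stratify=kept["source"], random_state=0)
print(len(df), len(kept), len(train), len(test))
```

Stratifying by label keeps the minority TTG class represented in both subsets, which matters when one class makes up only about 13% of the data.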

Treatment of class imbalance problem

In this study, the compiled zircon analyses from different source rocks are imbalanced: the proportion of zircon from TTGs (13%) is noticeably lower than that from S-type rocks (34%), and both are lower than that from I-type rocks (53%). Such a class imbalance is a common problem in ML. Previous studies have demonstrated that in such a situation most classifiers may be biased towards the major classes and thus probably show poor classification rates for minor classes88. The common technique to solve this problem is oversampling the minority class or undersampling the majority class to produce a relatively class-balanced database. In this study, undersampling was used because our preliminary work showed that it worked better than oversampling according to the performance metrics. Specifically, we used the Tomek Link undersampling technique developed by Tomek89. The advantage of Tomek Link is that it does not aim to reach an absolute balance between different classes; rather, it focuses on removing the boundary values and the noise from the dataset and does not alter the rest of the dataset89. Thus, there is less chance of losing important information, which has been argued to be a common problem for undersampling90.
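The idea behind Tomek Link undersampling can be illustrated with a small NumPy sketch. A Tomek link is a pair of samples from different classes that are each other's nearest neighbours; the sketch removes only the members of such pairs that belong to designated classes. This is an illustrative reimplementation, not the code used in this study, and the function name, the `removable_labels` parameter and the one-dimensional toy data are hypothetical.

```python
import numpy as np


def tomek_link_mask(X, y, removable_labels):
    """Return a boolean mask that drops removable-class members of Tomek links."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances; exclude self-distances.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    nearest = d2.argmin(axis=1)  # index of each sample's nearest neighbour

    keep = np.ones(len(y), dtype=bool)
    for i, j in enumerate(nearest):
        is_mutual = nearest[j] == i
        if is_mutual and y[i] != y[j] and y[i] in removable_labels:
            keep[i] = False  # sample i sits on a class boundary
    return keep


# Toy one-feature example: the I-type sample at 1.0 and the TTG sample
# at 1.05 are mutual nearest neighbours from different classes.
X = np.array([[0.0], [0.1], [1.0], [1.05], [2.0], [3.0]])
y = np.array(["I", "I", "I", "TTG", "TTG", "TTG"])
keep = tomek_link_mask(X, y, removable_labels={"I"})
X_resampled, y_resampled = X[keep], y[keep]
print(keep)
```

Only the boundary sample of the removable class is dropped; the rest of the dataset is left unchanged, which is the property highlighted in the text.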

ML model training

In this study, three supervised ML methods, namely Support Vector Machine (SVM), Random Forest (RF), and Multilayer Perceptron (MLP), were used to determine statistical relationships between zircon trace element concentrations and their source rocks. A large number of studies have illustrated that these methods are robust in solving problems in geoscience43. All three ML algorithms were implemented in Python (version 3.7.0) using scikit-learn 0.23.2: sklearn.svm.SVC for SVM, sklearn.ensemble.RandomForestClassifier for RF, and sklearn.neural_network.MLPClassifier for MLP. To achieve the best performance, for each model we used a grid search with tenfold cross-validation to find the optimal hyper-parameters. Supplementary Table 1 lists the values of the main hyper-parameters used in this study for each model. For parameters that are not listed in Supplementary Table 1, default values were used.
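A grid search with tenfold cross-validation over the three classifiers might look like the following sketch. It runs on synthetic three-class data with 17 features; the hyper-parameter grids are deliberately tiny and hypothetical (they are not the values in Supplementary Table 1), and the feature standardization applied before SVM and MLP is an assumption of this sketch, not a step described in the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import (GridSearchCV, StratifiedKFold,
                                     train_test_split)
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the zircon feature table (17 features, 3 classes).
X, y = make_classification(n_samples=300, n_features=17, n_informative=8,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Hypothetical grids, kept small so the example runs quickly.
searches = {
    "SVM": (make_pipeline(StandardScaler(), SVC(random_state=0)),
            {"svc__C": [1, 10], "svc__gamma": ["scale", 0.01]}),
    "RF": (RandomForestClassifier(random_state=0),
           {"n_estimators": [50, 100], "max_depth": [None, 10]}),
    "MLP": (make_pipeline(StandardScaler(),
                          MLPClassifier(max_iter=500, random_state=0)),
            {"mlpclassifier__hidden_layer_sizes": [(32,), (32, 16)]}),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for name, (estimator, grid) in searches.items():
    search = GridSearchCV(estimator, grid, cv=cv, scoring="accuracy")
    search.fit(X_train, y_train)  # refits the best model on all training data
    results[name] = (search.best_params_, search.score(X_test, y_test))

for name, (params, acc) in results.items():
    print(name, params, round(acc, 3))
```

Holding the test set out of the search keeps the reported test accuracy independent of the hyper-parameter selection, mirroring the separation between cross-validation and test evaluation described in this section.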

Model evaluation and interpretability

For each algorithm, the model obtained from the training dataset was then applied to the test dataset, and its performance was evaluated using various metrics, including the confusion matrix, accuracy, and the area under the receiver operating characteristic (ROC) curve (AUC)91. A confusion matrix presents the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), which can be used to calculate the overall accuracy and the precision for each type of zircon based on the following equations (Eqs. 1 and 2). The AUC provides a single measure of the overall model accuracy that is threshold independent. An AUC value of 0.5 indicates that the prediction is no better than random, whereas 1 indicates perfect prediction.

$$\mathrm{Overall\ accuracy}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}}$$
(1)
$$\mathrm{Precision}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}$$
(2)
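As a worked illustration of Eqs. 1 and 2, the sketch below derives TP, FP, FN and TN for each class from a three-class confusion matrix. The matrix is hypothetical and is not taken from Supplementary Table 3. Eq. 1 is applied here in a one-vs-rest sense for each class, and the fraction of correctly classified analyses (diagonal over total) is also reported; treating the latter as the multiclass overall accuracy is an assumption of this sketch.

```python
import numpy as np

labels = ["I-type", "S-type", "TTG"]
# Hypothetical confusion matrix: rows are true classes, columns predictions.
cm = np.array([
    [50, 5, 5],
    [4, 40, 6],
    [2, 3, 35],
])

tp = np.diag(cm)                 # correctly predicted, per class
fp = cm.sum(axis=0) - tp         # predicted as the class but actually another
fn = cm.sum(axis=1) - tp         # belonging to the class but predicted otherwise
tn = cm.sum() - tp - fp - fn     # everything else

precision = tp / (tp + fp)                          # Eq. 2, per class
one_vs_rest_accuracy = (tp + tn) / (tp + tn + fp + fn)  # Eq. 1, per class
overall_accuracy = tp.sum() / cm.sum()              # fraction classified correctly

for name, p, a in zip(labels, precision, one_vs_rest_accuracy):
    print(f"{name}: precision={p:.3f}, one-vs-rest accuracy={a:.3f}")
print(f"overall accuracy={overall_accuracy:.3f}")
```

Precision is the more relevant quantity when the question is how trustworthy a "TTG" label is, since it counts how many grains predicted as TTG truly are TTG.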

Because ML models weight the input features internally, it is often difficult to interpret their results without knowledge of how inputs are transformed into outputs; in this sense, the models behave like a black box. To overcome this limitation, our study applied SHapley Additive exPlanations (SHAP48) to estimate the importance of the studied features and to interpret and analyze the results (see Supplementary Figs. 4–6). Grounded in cooperative game theory, SHAP provides a reliable and consistent ranking of the relative importance of each feature. In addition to ranking the additive importance of all features, SHAP allows for examining interactions between features in a model. A positive SHAP value indicates that the feature contributes positively to the prediction of the zircon type of interest, while a negative value indicates a negative contribution.
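The game-theoretic idea behind SHAP can be illustrated by computing exact Shapley values for a toy model by brute force. This is not the SHAP library and not the procedure used in this study; the scoring function, the baseline and the input values are hypothetical. Features absent from a coalition are set to their baseline value, which is one simple convention among several.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their value from x, the rest from baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                coalition = set(s)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


def toy_score(z):
    """Hypothetical 'I-type' score from three inputs (Th/U, Eu/Eu*, Yb)."""
    th_u, eu, yb = z
    return 2.0 * th_u - 1.0 * eu + 0.5 * th_u * yb


x = [1.0, 0.5, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
print(phi)
```

The contributions sum to the difference between the score at `x` and at the baseline, and the interaction term between the first and third inputs is shared equally between them. The sign of each value shows whether that input pushes the score up or down, matching the interpretation of positive and negative SHAP values given above.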