Introduction

Traditionally, materials scientists investigate or characterize engineering materials by analyzing a series of micrographs that reveal their complex microstructure at scales ranging from the millimeter down to the nanometer. These analyses are often performed manually by individual scientists, sometimes aided by computational techniques1,2. Such human-centered workflows have severe drawbacks, e.g., the demand for expertise, poor repeatability, and long processing times. Since the development of materials informatics, increasing attention has been paid to improving the status quo through data-driven techniques, including machine learning and other artificial intelligence approaches1,2,3,4. As a core component of deep learning for image recognition, convolutional neural networks (CNNs) have the potential to speed up the analysis of micrographs, improve its repeatability, and reveal unforeseen patterns and details that would remain hidden without advanced data-mining techniques2,4. In this work, we demonstrate the application of a CNN for automatically recognizing nanoscale ordered structures within the point cloud of an atom probe reconstruction.

For crystalline materials, depending on the composition, certain elements can preferentially occupy specific crystallographic sites, resulting in the occurrence of ordered nanoparticles in alloys. For face-centered cubic (FCC) alloys, a high-density dispersion of coherent L12 ordered nanoparticles is highly desirable for enhancing mechanical performance, such as Al3Sc particles in additively manufactured aluminum alloys5,6, γ′ precipitates in superalloys6,7,8, and L12-type nanodomains in some high-entropy alloys9,10,11. Accurately recognizing these nanodomains by developing advanced characterization tools or analysis algorithms, and thereby revealing an alloy’s structure-property relationship, is currently an active research topic.

There are two mainstream techniques for this task: transmission electron microscopy (TEM) and atom probe tomography (APT). TEM can capture accurate crystallographic information, even down to the atomic scale, but it is not generally a three-dimensional (3D) technique, and TEM diffraction patterns correspond to the interaction volumes through which the electron beam passes, resulting in non-specific results. Atomic mass information is also lacking in such studies12,13. In comparison, APT is capable of detecting 3D elemental distributions with a near-atomic spatial resolution (in the best scenario, 0.3 nm in the lateral direction and 0.1 nm in the depth direction) and high chemical sensitivity (10–100 ppm)12,14,15. However, two significant drawbacks limit its ability to precisely reveal the crystallographic and chemical information of fine-scale ordered structures: the detection efficiency, limited to 50–80%, and trajectory aberrations with their associated reconstruction artefacts15. Although a single atom cannot be imaged precisely by APT, a large number of reconstructed field-evaporated atoms can provide statistical distributions of different elements, and local crystallographic information can be revealed by spatial distribution maps (SDMs)15,16. SDMs are produced by first calculating the 3D offset vectors between each pair of atoms in either a local or global set. These offsets are then accumulated into 3D voxelized histograms, 2D histograms (projected upon a plane, commonly the xy, yz, or zx plane), or 1D histograms (commonly along the z-axis). In this work, only the 2D zx-SDM and the 1D z-SDM were used. Previously, SDMs have been applied to investigate ordered structures in certain binary14,16 and ternary alloys17,18. Different structures generate distinctly different SDM signatures, and thus the desired ordered structures can be recognized within APT data to further reveal the structure-property relationship.
However, analyzing large APT datasets and recognizing a large number of SDM images (about 100,000 images in this study) is computationally intensive and almost impossible to attempt manually. Thus, an automatic ordered structure identification method is required. Note that the previous solution17,18 exploits the difference in composition between the ordered structures and the matrix, and then analyzes the SDMs of different subsets divided by isocomposition surfaces of a certain composition, such as 8 at% Li in an Al-Li-Mg system17. However, this filtering method cannot ensure that atoms from the matrix are excluded, i.e., the choice of 8 at% Li as the dividing line is somewhat arbitrary17. Moreover, this approach becomes invalid when there is little difference in composition between the ordered structures and the matrix, as for the L10 ordered structure in the Au-43Cu-7Ag (at%) alloy19, or when characterizing short-range ordered structures13,20,21.
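The SDM construction described above (pairwise offset vectors accumulated into histograms) can be sketched as follows. This is a minimal illustration, not the authors' parallel program: the function name `spatial_distribution_maps`, the window half-width `r_max`, and the bin width are assumptions chosen for the example, and the brute-force pairwise step is only practical for a small voxel of atoms.

```python
import numpy as np

def spatial_distribution_maps(points, r_max=1.0, bin_width=0.02):
    """Return (zx_sdm, z_sdm, edges) for an (N, 3) array of atom positions in nm.

    r_max and bin_width are illustrative values, not taken from the paper.
    """
    points = np.asarray(points, dtype=float)
    # All pairwise offset vectors r_j - r_i, excluding self-pairs (i == j).
    offsets = points[None, :, :] - points[:, None, :]
    offsets = offsets[~np.eye(len(points), dtype=bool)]
    # Keep only offsets inside a cube of half-width r_max around the origin.
    offsets = offsets[np.all(np.abs(offsets) <= r_max, axis=1)]
    n_bins = int(round(2 * r_max / bin_width))
    edges = np.linspace(-r_max, r_max, n_bins + 1)
    # 2D zx-SDM: histogram of (dz, dx), i.e. the offsets projected along y.
    zx_sdm, _, _ = np.histogram2d(offsets[:, 2], offsets[:, 0], bins=[edges, edges])
    # 1D z-SDM: histogram of the depth component dz only.
    z_sdm, _ = np.histogram(offsets[:, 2], bins=edges)
    return zx_sdm, z_sdm, edges
```

For a voxel containing only one species pair (e.g., Al–Al), passing the positions of that species gives the species-specific maps; the resulting 2D array can then be rescaled to a greyscale image.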

Recently, machine learning has been applied to APT data to automate the identification of a specimen’s crystallographic orientation or improve microstructural feature extraction22,23. Machine learning algorithms have the potential to unveil ordered structures by learning characteristic patterns in experimentally obtained SDMs. As a representative in the field of image recognition, CNNs have been used to automate the identification of microstructural and crystallographic features using micrographs4,24,25. A remarkable advantage of CNNs is the automatic extraction of features with minimal human intervention26,27. The essence of image recognition via CNNs is to extract different levels of features such as low-level edges and color features as well as more abstract features through a series of convolutional and pooling layers. Different crystal structures will generate different interplanar spacing in zx-SDM patterns with different relative color scaling. These edges and color features in SDMs are the key pieces of information for determining the structural types. Another advantage of CNNs for computer vision is its translation invariance but this phenomenon was not met in our study due to the pre-defined image generation procedure.

In this work, a CNN-based strategy is proposed to automatically recognize nanoscale L12-type ordered structures in FCC-based alloys using APT data with an ultra-high recognition ability. Firstly, a crystal structure library was built to include a wide range of possible configurations, which was then used to produce many simulations of APT data, all based on either the L12 or FCC crystal structure. From these simulated structures, the corresponding zx-SDMs along a specific crystallographic direction were generated. The obtained SDMs (used as inputs) combined with their corresponding crystal structures (used as labels) were divided into training, validation, and test datasets, which were then used to train a CNN to generate an L12 ordered structure recognition model. A second training procedure was also performed after enriching these synthetic datasets with a few experimentally obtained data, to enhance the model performance and speed up the training. Finally, the experimentally obtained SDMs from an Al-Li-Mg alloy were input into this recognition model to identify the 3D distributions of the L12–type δ′–Al3(LiMg) particles in the FCC matrix. This result was further compared with the previous isocomposition approach to highlight its advantages.

Results

APT data analysis

The typical 3D atom probe tomography of Al–6.79Li–5.18Mg (at%) alloy is shown in Fig. 1a. Four crystallographic poles are observed from Fig. 1b, corresponding to [\(\bar 1\)01], [\(\bar 1\)11], [\(\bar 1\)02], [\(\bar 1\)13], respectively. The 3D atom probe tomography along [\(\bar 1\)01] crystallographic pole is reconstructed and presented in Fig. 1c. Figure 1d provides the close-up of a thin slice in Fig. 1c, where atomic planes are imaged clearly, suggesting the high resolution in the depth direction which is necessary for generating the zx-SDMs and the z-SDMs. The isocomposition surfaces containing more than 8 at% Li are visualized.

Fig. 1: APT data of Al–5.18Mg–6.79Li (at%) alloy.

a 3D atom probe tomography; b 2D desorption map histogram; c reconstructed 3D atom probe tomography along [\(\bar 1\)01] crystallographic pole; d a close-up of a thin slice in c; examples of zx-SDMs along [\(\bar 1\)01]: Al–Al interatomic vector tomograms of e the FCC matrix and f a L12 precipitate. The corresponding unit cell models are also given.

A parallel Python program was written to quickly scan the dataset shown in Fig. 1c with a cube of side length a and generate a large number of the corresponding zx-SDMs along [\({\bar{1}}\)01]. Figure 1e, f gives two examples of zx-SDMs corresponding to FCC matrix and L12 precipitate voxels (a = 4 nm), respectively. Note that only the zx-SDMs of Al-Al pairs are shown because the zx-SDMs of other element pairs cannot provide useful information due to the limited amount of data in these voxels. The interplanar spacing shown in Fig. 1f is twice as large as the one in Fig. 1e. This difference allows one to recognize the structural difference and thus highlight the sites of L12 particles.

Simulation of zx-SDMs

For the machine learning part, the first step was to build the synthetic dataset used to train the CNN parameters. Here, a crystal structure library was built to include various possible configurations based on the FCC and L12 crystal structures. Several parameters were considered in constructing the crystal structure library, including lattice parameters, crystallographic rotation, positional noise simulating limited spatial resolution, loss of atoms simulating imperfect detection efficiency, and data size (number of atoms). Then, the simulated zx-SDMs along [011] were generated from these simulated configurations. Note that [011] is equivalent to the [\({\bar{1}}\)01] pole shown in Fig. 1c.

The procedure to generate the simulated zx-SDMs is shown in Fig. 2. Firstly, the FCC and L12 crystal structures with a volume of 25 nm³ were built using the Posgen program28, as shown in Fig. 2a. The lattice constant of the FCC-Al structure was defined as 0.405 nm29,30. The same lattice constant was set for the L12-type Al3(MgLi) structure due to its fully coherent nature17. Note that Mg and Li atoms were not separately labeled in this model because only the zx-SDMs of Al–Al pairs were used for structure recognition in this paper. Secondly, the Euler transformation was applied to change the projection pole from [001] to [011], as shown in Fig. 2b. Thirdly, certain levels of Gaussian noise were added to shift the atoms in the x, y, and z reconstruction directions to model finite spatial resolution, as shown in Fig. 2c. Note that the standard deviation (σ) of the Gaussian noise in the z-direction is smaller than those in the x and y directions, simulating the higher resolution in the depth direction. Fourthly, a certain fraction of atoms was randomly removed to simulate imperfect detection efficiency (see Fig. 2d). Finally, the corresponding zx-SDMs of Al–Al pairs of the two crystal structures with different parameters were generated to build the zx-SDMs dictionary, as shown in Fig. 2e.
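The four simulation steps above (supercell, rotation, Gaussian displacement, random atom removal) can be sketched with NumPy as below. This is a hedged illustration rather than the Posgen-based workflow used in the paper: the function name `simulate_al_positions`, the supercell size, and the default noise and efficiency values are assumptions for the example, and only the 0.405 nm lattice constant comes from the text. The L12 case assumes the solute (Mg/Li) occupies the cube-corner site, so only the three face-centre sites hold Al.

```python
import numpy as np

# Fractional coordinates of the four sites of the conventional FCC cell.
FCC_BASIS = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.0],
                      [0.5, 0.0, 0.5], [0.0, 0.5, 0.5]])

def simulate_al_positions(structure, n_cells=5, a=0.405, sigma_xy=0.1,
                          sigma_z=0.02, efficiency=0.55, seed=0):
    """Return simulated Al positions (nm) for structure "FCC" or "L12".

    n_cells, sigma_xy, sigma_z, efficiency, and seed are illustrative values.
    """
    rng = np.random.default_rng(seed)
    # (a) Supercell: in L1_2 Al3X the corner site is the solute, so drop it.
    basis = FCC_BASIS if structure == "FCC" else FCC_BASIS[1:]
    grid = np.meshgrid(*[np.arange(n_cells)] * 3, indexing="ij")
    cells = np.stack(grid, axis=-1).reshape(-1, 3)
    pos = (cells[:, None, :] + basis[None, :, :]).reshape(-1, 3) * a
    # (b) Rotate by 45 degrees about x so that [011] points along z.
    c = s = np.sqrt(0.5)
    rot = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
    pos = pos @ rot.T
    # (c) Gaussian displacement, smaller in z (better depth resolution).
    pos = pos + rng.normal(0.0, [sigma_xy, sigma_xy, sigma_z], size=pos.shape)
    # (d) Randomly keep a fraction of atoms (imperfect detection efficiency).
    keep = rng.random(len(pos)) < efficiency
    return pos[keep]
```

With the noise set to zero and the efficiency set to one, the FCC output reduces to ideal planes spaced a/(2√2) apart along z, which is a convenient sanity check before sweeping the noise and efficiency parameters.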

Fig. 2: Procedure to build crystal structure library and generate simulated zx-SDMs dictionary.

a Generating supercell; b making Euler transformation; c adding Gaussian noise to shift atoms in x, y, z directions; d removing part of atoms; e simulated zx-SDMs of Al–Al pairs. Note that the Mg and Li atoms are not separately labeled in the L12 structure.

Table 1 summarizes the parameters used for building a crystal structure library and generating the corresponding zx-SDMs dictionary. Two kinds of crystal structures were included with different noise levels and detection efficiencies. The upper and lower boundaries of the Gaussian noise levels were chosen based on the similarity between the simulated and measured zx-SDMs. The adjusted detection efficiency is used to change the number of atoms, and thus it is not kept a fixed experimental value. The generated zx-SDMs dictionary was augmented by generating more zx-SDMs which were rotated at certain angles randomly. The rotation augmentation was applied to simulate the observed small-angle pattern rotations in the experimental zx-SDMs. In total, 18416 simulated images were included in this dictionary and divided almost equally into two classes. A noise estimation31 was made on the simulated and experimental images, respectively, as shown in Supplementary Fig. 1a, b. The simulated data can well embrace the noise level of the experimental zx-SDMs. Note that adding different levels of noise to the input data of the CNN can help in out-of-distribution generalization and transfer to the real data32.

Table 1 Parameters for generating the crystal structure library and generating the corresponding zx-SDMs dictionary.

Network configuration

The simulated zx-SDMs were split into 90% for training and validation and 10% for testing. Five-fold cross-validation was used to train the model. The input images were 150 × 150 pixel greyscale images with one channel, whose pixel values were between 0 (black) and 255 (white). The adopted CNN is shown in Fig. 3a and has a six-layer structure consisting of four convolutional (plus max pooling) layers and two fully connected layers (including the final output layer). The detailed architecture of each layer is shown in Fig. 3b. Note that other CNN configurations were tested but exhibited considerably higher training and validation loss values, as shown in Fig. 3c.
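To give a feel for how four convolutional (plus max pooling) blocks reduce a 150 × 150 input, the following sketch tracks the feature-map side length through the network. The kernel size (3 × 3, no padding, stride 1) and pooling size (2 × 2) are assumptions for illustration only; the actual layer settings are those given in Fig. 3b.

```python
def conv_pool_output_sizes(size=150, n_blocks=4, kernel=3, pool=2):
    """Side length of the feature map after each (convolution -> max pooling) block.

    kernel and pool are assumed values, not taken from the paper.
    """
    sizes = [size]
    for _ in range(n_blocks):
        size = size - kernel + 1  # unpadded convolution with stride 1
        size = size // pool       # non-overlapping max pooling
        sizes.append(size)
    return sizes

print(conv_pool_output_sizes())  # [150, 74, 36, 17, 7]
```

Under these assumptions, the final 7 × 7 maps would be flattened and passed to the two fully connected layers.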

Fig. 3: Proposed CNN configuration.

a The architecture of CNN for identifying ordered structures; b the detailed architecture of each layer; c the final loss values of training and validation after 7 epochs of different CNN structures including different convolutional (plus max pooling) layers as shown in a only using simulated data.

Two cases were performed: (1) all training and validation datasets consist of simulated data while the test dataset consists of simulated data and experimental data; (2) training, validation, and test datasets consist of simulated data and experimental data. Here, 7 epochs were used to find the minimum cross-entropy loss and the entire training process only took approximately 14 min on an Intel Core i7-9700 CPU 3.00 GHz.

Training, validation, and test results

For case 1, Fig. 3c shows the final loss values of training and validation of the optimized CNN after 7 epochs. The history of loss values is shown in Supplementary Fig. 2a. This model trained only on synthetic data reaches a 100% classification accuracy on the test data, which suggests that no overfitting occurs. Thus, L2 regularization was not further applied in this CNN. Nevertheless, it does not generalize to real experimental data, yielding a poor classification accuracy on a small test dataset composed of 48 experimental zx-SDMs of the Al–Li–Mg system, as shown in Fig. 4a. It was found that several images corresponding to the FCC structure were wrongly given high L12 probabilities, like image number 0 and 47. This is mainly because the simulated zx-SDMs still cannot fully imitate the complex experimentally-obtained zx-SDMs of the FCC structure.

Fig. 4: Test results in experimental data.

Predicted L12 structure probability of a small volume of APT data (Fig. 1c) in a case 1 and b case 2. The corresponding part zx-SDMs are attached. c ROC curves of the two cases.

To solve this, in case 2, only 12 experimental zx-SDMs corresponding to the FCC structure were augmented into 84 samples using the same method as for the simulated data and added to the training of the CNN. The same procedure was implemented, and a model was obtained with similar training and validation losses of about \(1.3 \times 10^{-4}\). The history of loss values is shown in Supplementary Fig. 2b. This model was also tested using the 10% test dataset (containing only the simulated data), and 100% classification accuracy was achieved. The predicted results for the 48 experimental zx-SDMs are shown in Fig. 4b. Compared with Fig. 4a, this model exhibited very good prediction ability on both the simulated and experimental data. A predicted value near zero means that the test image is close to the zx-SDM of the FCC structure, while a value near one signifies similarity with the zx-SDM of the L12 structure. Note that a value around 0.5 means that the test image is a mix of the two structures, as shown by image number 27 in Fig. 4b, or is an otherwise invalid image due to limited data (Supplementary Fig. 3).

To more quantitatively illustrate the model’s performance, a receiver operating characteristic curve (ROC) analysis33,34 was further made on the 48 experimental images for the two cases, as shown in Fig. 4c. The area under the ROC curve (AUC) is 0.995 in case 2, which is higher than 0.963 in case 1. The higher AUC value suggests better model classification ability.

Application to Al–Li–Mg APT data

After verifying the ordered structure recognition model on a small test dataset, it was applied to the large dataset shown in Fig. 1c. The side length a of the scanning cube was set to 4 nm, and the scanning stride was 1 nm. The volume was divided into smaller cubes with a 1-nm side length, and the probability of each was represented by the sum of the predicted probabilities of all overlapping 4-nm cubes. Thus, the summed probabilities ranged between 0 and 64. The frequency distribution of the L12 structure probabilities of 98175 4-nm cubes is shown in Supplementary Fig. 3. Similar to Fig. 4b, a lower value is closer to the FCC structure while a higher value is closer to the L12 structure. Figure 5a shows the frequency distribution of the L12 structure probabilities of the 1-nm voxels. Two distinct peaks were observed close to the values of 0 and 64, which corresponded to the FCC matrix and L12 structure, respectively. The data from three zones in Fig. 5a were extracted and their corresponding zx-SDMs are illustrated in Fig. 5b. The zx-SDM of zone 1 exhibits an obvious L12 signature, while this signature is unclear in zone 2 or 3. Finally, the value of 62 was taken as the threshold separating the two crystal structures, which is close to the AUC value in case 2 multiplied by 64, i.e., 63.68. Figure 6a shows the map of 1-nm voxels with an L12 structure probability above 62. The corresponding nanoparticle size distribution is shown in Supplementary Fig. 4, but more APT data are needed to give a statistical result. This will be used as an input into microstructure and strength models to build the structure-property relationship30. Note that the minimum recognized precipitate radius can be as small as 0.5 nm, suggesting the ultra-high recognition ability. Figure 6b, c shows the species-specific z-SDMs for the segmented L12 structures and FCC matrix, respectively, plotted with arbitrary units for ease of comparison. All peak-peak distances in Fig. 6c corresponded to the interplanar spacing of the FCC matrix, while all peak-peak distances in Fig. 6b were twice those in Fig. 6c. As a comparison, Fig. 6d shows the previous atom map obtained using the above 8 at% Li isocomposition17. The corresponding species-specific intensity distributions along the z-axis in the z-SDMs of above and below 8 at% Li are shown in Fig. 6e, f, respectively, showing the same periodicities as seen in Fig. 6b, c. Moreover, the sites of precipitates from the two methods are very similar.
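The overlap-summing step described above can be sketched as follows: with 4-nm cubes scanned at a 1-nm stride, an interior 1-nm voxel is covered by 4 × 4 × 4 = 64 cubes, so the summed score ranges from 0 to 64. The function name `accumulate_votes` and the array layout (one predicted probability per cube, indexed by the cube's corner voxel) are assumptions made for this illustration.

```python
import numpy as np

def accumulate_votes(cube_probs, cube=4):
    """Sum the probabilities of all overlapping cubes for every 1-nm voxel.

    cube_probs[i, j, k] is the predicted L1_2 probability of the cube whose
    corner sits at voxel (i, j, k); cubes are scanned with a stride of 1 voxel.
    """
    cube_probs = np.asarray(cube_probs, dtype=float)
    nx, ny, nz = cube_probs.shape
    votes = np.zeros((nx + cube - 1, ny + cube - 1, nz + cube - 1))
    # A cube with corner (i, j, k) covers voxels i..i+cube-1 along each axis,
    # so shifting the probability array by every offset adds each cube's score
    # to every voxel it contains.
    for di in range(cube):
        for dj in range(cube):
            for dk in range(cube):
                votes[di:di + nx, dj:dj + ny, dk:dk + nz] += cube_probs
    return votes

# Voxels whose summed score exceeds the threshold used in the text.
votes = accumulate_votes(np.ones((6, 6, 6)))
l12_mask = votes > 62
```

Voxels near the boundary of the scanned volume are covered by fewer than 64 cubes, so their maximum attainable score is lower than that of interior voxels.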

Fig. 5: Prediction of the CNN-based L12-ordered structure recognition model in Al–Li–Mg APT data.

a Frequency distribution of the predicted L12 structure probabilities of the 1-nm voxels generated from the dataset shown in Fig. 1c based on Supplementary Fig. 3; b the zx-SDMs generated from the data corresponding to zone 1, 2, and 3 in a.

Fig. 6: Comparison of 3D atom tomography of L12 ordered structures revealed by two methods.

3D atom probe tomography of precipitates (a) by the proposed CNN approach with the predicted L12 structure probability (P) above 62 and (d) by the previous report using 8 at% Li isocomposition. For (a), the corresponding species-specific intensity distributions along the z-axis in the z-SDMs with the L12 structure probability above (b) and below (c) 62. For (d), the corresponding species-specific intensity distributions along the z-axis in the z-SDMs of above (e) and below (f) 8 at% Li.

Visualization of the CNN model

Deep learning models are often treated as black-box methods. To understand where the obtained CNN model is looking in an image, the authors employed two kinds of visualization methods: feature maps and gradient-weighted class activation mapping (Grad-CAM). Feature maps are the outputs of the convolutional and pooling layers after applying different filters to the input image, which helps us learn how the proposed layer structure processes an input image35,36. Figure 7a exhibits the feature maps learned by the first and fourth convolutional layers from two different zx-SDMs corresponding to FCC and L12 structures. After applying the filters in the first convolutional layer, many versions of the FCC or L12 zx-SDMs were produced with different features highlighted in Fig. 7a. For example, some focus on the edges, while others highlight the foreground or background. Deeper in the CNN structure, the model identifies more abstract concepts, and these deeper feature maps often cannot be interpreted directly.

Fig. 7: Visualization of the CNN model.

Part of (a) feature maps and (b) Grad-CAM of zx-SDMs of the simulated FCC and L12 structures after the first to the fourth convolutional layers in the applied CNN.

The output of Grad-CAM is a heatmap visualization for a given class label37,38. We can use this heatmap to visually verify where in the image the CNN is looking. As can be seen from Fig. 7b, the obtained CNN model mostly focuses on the interplanar spacing feature in the deeper convolutional layers, which is the desired result.

Discussion

In this paper, a CNN-assisted APT approach has been successfully applied to recognize L12 ordered precipitates in the FCC matrix with an ultra-high recognition ability. The proposed CNN-APT approach has several advantages over the traditional method based on isocomposition thresholding. The most important is that the traditional method is based only on differences in composition, while the present method attempts to take into account the entire crystal structure information, including the occupancy sites and types of different atoms, or more exactly, how this crystallographic information manifests its signature in 2D zx-SDM images. This enables the proposed method to precisely recognize ordered structures in different crystalline materials. As for the traditional method, on the one hand, it is arbitrary to choose a single isocomposition value to filter out matrix data, such as 8 at% Li in this Al–Li–Mg system (Fig. 6d). It is hard to ensure that matrix atoms are not included in such precipitate characterization, which will significantly affect the composition measurements17. The proposed CNN approach is capable of clearly distinguishing the different crystal structures. As shown in Fig. 5, two obvious peaks were observed and 62 was reasonably chosen to filter the data. This chosen threshold matched well with the obtained AUC value in case 2 multiplied by 64, i.e., 63.68. As shown in Fig. 6, the average radius of precipitates from the isocomposition and proposed methods is 2.59 ± 0.9 and 2.54 ± 1.03 nm, respectively. On the other hand, the isocomposition method could fail when the differences in composition are weak, as in the short-range ordered structures occurring in Ti and high-entropy systems13,20,21,39. The proposed method based on full crystal structure information is quite promising for handling this challenge in the near future.

Moreover, a 1-nm voxel, i.e., with 0.5 nm radius, is identified as the L12 structure only when the sum of the predicted probabilities of the surrounding overlapped 4-nm voxels is above 62. This suggests that the average predicted probability of the individual 4-nm voxel is above 0.96875. As shown in Supplementary Fig. 3, the average value corresponds to the clear L12 structure signature in the 4-nm voxel. Thus, this ensures that the 1-nm voxel most likely is the L12 structure. The supplementary Fig. 5a, b gives an example of the 2D zx-SDM and 1D Al-Al z-SDM of a 1-nm voxel extracted from Fig. 6a. This shows a slight L12 signature. However, this signal is weak and that is why the authors finally employed larger voxels to make the CNN recognition. Overall, the minimum radius of the detectable nanoparticle is down to 0.5 nm using the proposed method.
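The two numbers quoted above follow from simple arithmetic, assuming each interior 1-nm voxel is covered by \(4 \times 4 \times 4 = 64\) overlapping 4-nm cubes; the symbol \(\bar{P}\) for the average per-cube probability is introduced here only for illustration:

$$\bar{P} > \frac{62}{64} = 0.96875, \qquad 0.995 \times 64 = 63.68$$

The first expression is the average per-cube probability implied by the threshold of 62, and the second is the case 2 AUC value scaled to the 0–64 range used for the summed probabilities.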

The CNN employed in this work processes local regions of an image using filters, which enables the neural network to look at a field rather than a single pixel24. Each convolutional layer contains several filters that scan the image using a kernel of a specific size. As shown in Fig. 7, through the first convolutional layer, clean edge and gray-level features are detected. With deeper convolutional layers, more abstract and sparser features are obtained, and the CNN model mostly focuses on the desired interplanar spacing feature. If the deeper convolutional layers are not added, a higher loss value is obtained, as shown in Fig. 3c. This ablation study highlights the importance of deeper convolutional layers. Two case studies have been performed to train the CNN with or without real experimental data. The models obtained from the two cases exhibited a high classification accuracy on the simulated test dataset, but only the second case, with some experimental data, performed well on both the simulated and experimental test datasets. This is attributed to the more complex situations occurring in experimental data, which are not fully captured by the simulated data.

It is worth mentioning that transfer learning was considered at the beginning but ultimately abandoned. Firstly, applying transfer learning requires enough data to train the neural networks40. Here, we only employed 12 experimental zx-SDMs corresponding to the FCC structure, which were further augmented into 84 images. These are unlikely to be enough for transfer learning. Moreover, as mentioned above, the addition of noise helps in out-of-distribution generalization and transfer to the real data32. In case 1 with only simulated data, the poor performance arose mainly because several images corresponding to the FCC structure were wrongly given high L12 probabilities. Thus, the authors utilized a small set of experimental data to better capture the complex FCC structure. This made the present model perform well in L12 structure recognition.

In this paper, zx-SDMs have been successfully used to make L12 structure recognition via CNN. Another possible analysis way is to deal with the experimental z-SDMs (like Fig. 6b, c) and some curve analysis methods could be performed, like 1D CNN. In addition, the potential of applying 3D CNN to directly handle 3D APT cloud points could be explored, although it may be quite difficult. Note that the uncertainty quantification41,42, as a non-trivial question and current research hot topic43, should be considered in the next step, which is one of the challenges of using CNNs for scientific application. Moreover, the proposed method can easily be extended to other ordered structures encountered in FCC alloys. One only needs to build an appropriate crystal structure library and corresponding zx-SDMs dictionary. The methodology could also be extended to BCC and HCP alloying systems in the future. It should be pointed out that the success of the proposed CNN-APT method requires the occurrence of the pole structures in the detector event histogram (like Fig. 1b) where APT data exhibits the high depth resolution. The tomographic reconstruction is often calibrated by using the pole structures17. The pole information can be found in various metallic materials, such as aluminum, magnesium, and titanium alloys. There are several other methods to extract precipitates within matrix using APT data, such as pair correlation function (PCF) and K-nearest neighbor (KNN) distance analysis15. Two obvious advantages of the two approaches are that they do not require the occurrence of pole structure and involve examining the average local neighborhood as a function of distance along all directions. A drawback is that the information along the lateral direction having the lower resolution could hinder the recognition of small-scale clusters/precipitates. 
In fact, the SDM technique is very similar to the PCF but is applied along a particular crystallographic direction with the higher resolution, and thus the best structural information can be exploited to reveal potential clusters/precipitates. In the future, a promising direction is to recognize ordered structures or clusters by coupling the proposed CNN framework with those other approaches based on localized compositional measurements or solute pair distances, especially when no pole occurs.

In conclusion, this work demonstrates the potential of the CNN-based method for ordered structure recognition within APT data. This image recognition approach is capable of revealing nanoscale L12 ordered particles in an FCC system using simulated zx-SDMs and a small amount of experimental data. The minimum radius of a detectable nanodomain can be as small as 0.5 nm. Compared with the traditional method based on isocomposition, the proposed CNN-APT approach does not rely on an arbitrary composition threshold and reveals L12 ordered precipitates with an average radius of 2.54 nm in the FCC Al-Li-Mg system. The next step is to extend the proposed methodology to more challenging short-range ordering phenomena.

Methods

APT experiments

The studied APT data (45 million) of Al–6.79Li–5.18Mg (at%) (Al–1.8Li–5Mg (wt%)) aged for 8 h at 423 K is from ref. 17. The Cameca Inc. LEAP 3000XSi was applied to gather atomic-scale data with a 55% detection efficiency. IVAS 3.8.4 was used to make data reconstruction and visualization. The reconstruction parameters, i.e., the field factor and image compression factor, were calibrated by the method introduced in refs 44,45.

Convolutional neural networks

For the used CNN, all layers used ReLu (Eq. 1) as the activation function, except the output layer which used Sigmoid (Eq. 2) for classification purposes.

$${\mathrm{ReLu}}(x) = {\mathrm{max}}(0,x)$$
(1)
$${\mathrm{Sigmoid}}\left( x \right) = 1/\left( 1 + {\exp}\left(-{x} \right) \right)$$
(2)

The two kinds of activation functions were applied to each neuron in the CNN to determine whether the neuron should be activated or not. ReLu sets negative inputs to zero and leaves positive inputs unchanged, while Sigmoid maps the output to a range between 0 and 1. The binary cross-entropy46 was chosen as the loss function, which is often used to train a binary classifier. The loss value is given by:

$${\mathrm{Loss}} = - \frac{1}{N}\sum\limits_{i = 1}^N \left( {y_i \cdot \log \left( {p\left( {y_i} \right)} \right) + \left( {1 - y_i} \right) \cdot \log \left( {1 - p\left( {y_i} \right)} \right)} \right)$$
(3)

where \(y_i\) is the label of image i (0 for FCC and 1 for L12 structure) and \(p(y_i)\) is the predicted probability that the image corresponds to the L12 structure, for all N images. The training was performed using RMSprop optimization47 with a fixed learning rate. RMSprop is an optimization method used to iteratively update the parameters in the CNN so that the loss value is minimized. Different learning rates were tested and the optimum value of 0.001 was finally chosen in this study. Here, the training and validation datasets were divided into mini-batches of size 32, i.e., images were randomly chosen without replacement in each epoch. For neural networks, several iterations are needed to pass through the entire dataset once, which is called one epoch. The CNN was implemented using Keras 2.2.448 with the TensorFlow 1.13.1 backend49 on Python 3.7.
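Equations (1) to (3) and a single RMSprop parameter update can be written out in NumPy as a minimal sketch. This is not the Keras implementation used in the paper: the clipping constant `eps` in the loss, the decay rate `rho = 0.9`, and the helper name `rmsprop_step` are assumptions added for illustration, while the learning rate of 0.001 is the value reported in the text.

```python
import numpy as np

def relu(x):
    """Eq. (1): element-wise max(0, x)."""
    return np.maximum(0.0, x)

def sigmoid(x):
    """Eq. (2): 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def binary_cross_entropy(y, p, eps=1e-7):
    """Eq. (3); p is clipped to [eps, 1 - eps] to avoid log(0) (assumed detail)."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def rmsprop_step(w, grad, cache, lr=0.001, rho=0.9, eps=1e-7):
    """One RMSprop update: scale the step by a running RMS of past gradients."""
    cache = rho * cache + (1.0 - rho) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache
```

For a maximally uncertain prediction (p = 0.5 for every image), the loss equals ln 2 ≈ 0.693, which is a useful reference point when reading loss curves.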

ROC curve

The ROC curve is applied to illustrate binary-classifier performance33,34. The curve is created on two basic evaluation measurements: true positive rate and false-positive rate. The former is a performance measurement of the positive part, while the latter is of the negative part. The CNN model provides a probability score of L12 structure for each tested image. Model evaluation measures are calculated by moving threshold values across the scores. The AUC is calculated for model comparison.
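The threshold sweep described above can be sketched in NumPy as follows. The function names `roc_curve_points` and `auc` are illustrative, and the four-image example is made-up data rather than results from this study; the area is computed with the trapezoidal rule.

```python
import numpy as np

def roc_curve_points(labels, scores):
    """Return (fpr, tpr) obtained by sweeping a threshold across the scores.

    labels: 1 for the positive class (L1_2), 0 for the negative class (FCC).
    """
    labels = np.asarray(labels, dtype=int)
    scores = np.asarray(scores, dtype=float)
    # Start above every score (nothing predicted positive), then lower the
    # threshold through each distinct score value.
    thresholds = np.concatenate(([np.inf], np.unique(scores)[::-1]))
    n_pos = np.sum(labels == 1)
    n_neg = np.sum(labels == 0)
    tpr = np.array([np.sum((scores >= t) & (labels == 1)) / n_pos for t in thresholds])
    fpr = np.array([np.sum((scores >= t) & (labels == 0)) / n_neg for t in thresholds])
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Illustrative data only: two FCC images (label 0) and two L1_2 images (label 1).
fpr, tpr = roc_curve_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc(fpr, tpr))  # 0.75
```

An AUC of 1.0 corresponds to scores that separate the two classes perfectly, and 0.5 to a classifier no better than chance.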