Introduction

Significant advances in experimental and computational methods have led to a rise in the quantity and complexity of information-rich materials datasets. The datasets present major challenges in processing and extracting information that can advance our understanding of the underlying mechanisms of complex processes in material systems. Therefore, it is imperative to store large datasets and process them effectively. To handle the quantity and complexity of datasets, there has been a gradual shift in materials research from an empirical science to data-driven science [1,2,3,4]. The shift led to the emergence of an interdisciplinary field called materials data science, in which data science techniques are implemented to solve materials science problems [5, 6].

The primary objective of materials data science is to mine large materials datasets to extract high-value and useful insights about the material system under consideration. Materials datasets are diverse and multimodal in nature, ranging from tabular data to images. In particular, data from various microscopy techniques provide a valuable source of real-space information that captures and quantifies the temporal evolution of phase transformations in materials [6, 7].

Phase transformations in systems ranging from metals to polymers are among the most widely explored topics in materials science [8,9,10,11,12,13,14]. Phase transformations, such as the nucleation and growth of particles, play a critical role in determining the properties of material systems. Understanding the growth kinetics of phase transformations is essential for developing new materials with desired properties.

The evolution of particle growth can be tracked using various techniques including microscopy [14,15,16], which can generate large datasets. To efficiently analyze large batches of microscopy data capturing particle growth, deep learning (DL) methods can be employed to obtain materials parameters of interest, such as the areas of particles and their temporal evolution. DL is a specialized subset of machine learning in which neural networks are used to learn effective representations of data [17]. The concepts of DL can be extended to image analysis tasks. Applications of DL have been demonstrated in fields in which image data are ubiquitous, such as medicine and astrophysics [18,19,20,21,27, 28].

Spatiotemporal graphs (st-graphs) can capture the spatial and temporal relationships between individual entities, such as particles, in an ensemble [27]. In a simple st-graph representation, each particle is represented as a node or vertex, and the spatial proximity between particles is encoded using edges between nodes. In this work, we use the nearest-neighbor distances between particles to define the edges in an st-graph, which in turn define the connections and relationships among the particles within the graph representation. Our work on st-graphs serves as a precursor to developing st-scene graphs [31].

Using the framework, we detected twelve unique crystallites across 700 image sequences using a U-Net variant, achieving a pixel-wise classification accuracy of 0.97. Pixel-wise classification accuracy is referred to as accuracy throughout this article. We tracked the growth kinetics of crystallites over time, distinguishing isolated from impinged crystallites. We explored st-graph representations at different timesteps to understand the proximity of crystallites during the growth process. We integrated this work into a visual query tool in which the user can query any timestep to see the relationship of an image sequence to the corresponding st-graph representation and growth kinetics. The visual query tool is published as a Python package on the PyPI repository [32]. The framework analyzes all image sequences and crystallites at once, enabling detailed insights into the growth kinetics.

Methods

Experimental Methods

FK-800 is a co-polymer of chlorotrifluoroethylene (CTFE) and vinylidene fluoride (VDF), with a CTFE:VDF ratio of 3:1 [33]. Between the glass transition temperature (\(\sim \)28 \(^\circ \)C) and the melt temperature (\(\sim \)110 \(^\circ \)C), the fluoroelastomer rearranges and crystallizes. The fluoroelastomer sample was prepared by dissolving 0.5 wt % FK-800 powder into ethyl acetate under ambient conditions and stirring it using a magnetic stir bar for \(\sim \)2 h. To generate a film, 5 μL of solution was dropped onto a plasma-cleaned 10 \(\times \) 10 mm silicon wafer and then immediately spun at 2450 rpm for 3 s and 8000 rpm for 30 s. Samples were then annealed at 125 \(^\circ \)C for 1 h and rapidly quenched into liquid nitrogen-cooled water to generate an amorphous film. This process resulted in an \(\sim \)50 nm thick amorphous film on a silicon substrate.

Atomic force microscopy (AFM) image sequences were captured using an Oxford Instruments Asylum Research Cypher ES equipped with a hot-stage set to 50.0 ± 0.2 \(^\circ \)C. We acquired image sequences in tapping mode using an AC160 cantilever at a line scan rate of 1.96 Hz, resulting in an image acquisition time of 261 s. Due to the slow growth of the crystallites (<0.02 nm/s), there were no measurable changes during the image acquisition time. The lateral pixel resolution for the 2\(\times \)2 μm\(^{2}\) area was about 4 nm and the tip radius had a nominal value of about 7 nm, resulting in an image resolution of \(\sim \)10 nm.
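As a quick consistency check on these numbers, and assuming each image consists of 512 scan lines of 512 pixels (the pixel resolution of the extracted images), the acquisition time and the lateral pixel size follow directly from the line rate and the field of view:

$$\begin{aligned} t_\textrm{image} &= \frac{512\ \textrm{lines}}{1.96\ \textrm{lines/s}} \approx 261\ \textrm{s}, \\ \Delta x &= \frac{2000\ \textrm{nm}}{512\ \textrm{px}} \approx 3.9\ \textrm{nm/px} \end{aligned}$$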

The image sequences generated by the equipment were saved as IGOR Pro (Wavemetrics) Binary Wave (.ibw) files by default. These .ibw files were preprocessed to extract the images as TIFF files and the metadata as .csv files. The resulting TIFF files had a pixel resolution of 512 \(\times \) 512. A total of 1200 image sequences were extracted. The dataset was reduced to the first 700 images of the sequence, since full crystallization is achieved by then and later images are redundant. We used the AFM height images in this study, although complementary amplitude and phase images may also be used in our future work. Times refer to the start time of the image (± 0.5 s). Time differences between images were approximated as the difference between start times.

Forty-six AFM height images were manually labeled using the polygon tool in LabelMe to identify crystallites [34]. Annotations were extracted from LabelMe as JSON files that were converted into binary TIFF images to serve as the ground truth for model training. The annotated images were manually selected to ensure all stages of crystallite growth were represented within the dataset. The dataset was split into training, validation, and test subsets, where the validation and test sets were each composed of four manually selected images. The validation and test images spanned all crystallite growth stages to provide robust evaluation statistics. The model was trained with and without data augmentation. For data augmentation during training, images were randomly flipped, rotated, and had their brightness or contrast adjusted in each iteration.
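The augmentation step can be sketched as follows. This is a minimal, dependency-free illustration rather than the pipeline used in the study: the rotation angles (multiples of 90 degrees), the brightness and contrast ranges, and all function names are assumptions, and both intensity adjustments are applied together for brevity. The point it illustrates is that geometric transforms must be applied identically to the image and its ground-truth mask, whereas intensity changes apply to the image only.

```python
import random


def flip_horizontal(img):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in img]


def rotate_90(img):
    """Rotate a 2D image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def adjust_brightness_contrast(img, brightness, contrast):
    """Scale (contrast) and shift (brightness) intensities, clipped to [0, 1]."""
    return [[min(1.0, max(0.0, contrast * v + brightness)) for v in row]
            for row in img]


def augment(image, mask, rng):
    """Apply the same random geometric transform to the image and its mask.

    Intensity changes touch the image only, so the mask stays binary.
    """
    if rng.random() < 0.5:
        image, mask = flip_horizontal(image), flip_horizontal(mask)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns (assumed)
        image, mask = rotate_90(image), rotate_90(mask)
    brightness = rng.uniform(-0.1, 0.1)  # assumed range
    contrast = rng.uniform(0.9, 1.1)     # assumed range
    return adjust_brightness_contrast(image, brightness, contrast), mask


rng = random.Random(0)
image = [[0.1, 0.2], [0.8, 0.9]]  # toy height image scaled to [0, 1]
mask = [[0, 0], [1, 1]]           # toy ground-truth mask
aug_image, aug_mask = augment(image, mask, rng)
```

Because the mask only undergoes flips and rotations, the number of labeled crystallite pixels is preserved by construction.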

Data Science Methods

We selected a U-Net architecture as our DL model to segment crystalline from amorphous regions. Our U-Net implementation is constructed with four encoder and four decoder blocks. Each encoder block is composed of a convolutional block followed by a max pooling layer. Each decoder block is composed of a transposed convolutional layer, concatenation layer, and convolutional block in that order. The convolutional blocks are composed of two convolutional layers with a kernel size of 3, where each convolutional layer is followed by a batch normalization layer and rectified linear unit (ReLU) activation. The network was implemented from scratch using TensorFlow.

The model was trained for 100 epochs with early stopping set to a patience of 10. Adam was used as the optimizer with a learning rate of 0.001 and binary cross-entropy was selected as our loss function. We measure model performance across four metrics: accuracy, precision, recall, and binary intersection over union (IoU), which are defined in terms of true positive (TP), true negative (TN), false positive (FP) and false negative (FN).
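The early-stopping rule can be illustrated with a short sketch. The monitored quantity is not stated above, so validation loss is assumed here; the function name and the synthetic loss values are hypothetical. Training halts once the monitored value has failed to improve for `patience` consecutive epochs.

```python
def early_stopping_epoch(val_losses, patience=10):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Epochs are 0-indexed. Training stops at the first epoch where the loss
    has not improved for `patience` consecutive epochs; if that never
    happens, the last epoch is returned as the stop epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Synthetic losses: five improving epochs followed by a plateau
losses = [1.0, 0.8, 0.6, 0.5, 0.4] + [0.45] * 20
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=10)
```

In this synthetic case the best epoch is the last improving one, and training stops ten epochs later, well short of the 100-epoch budget.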

  • Accuracy: The accuracy measures the proportion of correctly classified pixels overall.

    $$\begin{aligned} \textrm{Accuracy} = \frac{\textrm{TP} + \textrm{TN}}{\textrm{TP} + \textrm{TN} + \textrm{FP} + \textrm{FN}} \end{aligned}$$
    (1)
  • Precision: The precision evaluates the proportion of pixels classified as positive that are actually positive.

    $$\begin{aligned} \textrm{Precision} = \frac{\textrm{TP}}{\textrm{TP} + \textrm{FP}} \end{aligned}$$
    (2)
  • Recall: The recall calculates the proportion of actual positive pixels that are correctly classified.

    $$\begin{aligned} \textrm{Recall} = \frac{\textrm{TP}}{\textrm{TP} + \textrm{FN}} \end{aligned}$$
    (3)
  • Intersection over Union (IoU): The IoU, also known as the Jaccard index, calculates the proportion of overlap between the predicted segmentation mask and the ground truth mask.

    $$\begin{aligned} \textrm{IoU} = \frac{\textrm{TP}}{\textrm{TP} + \textrm{FP} + \textrm{FN}} \end{aligned}$$
    (4)
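Equations 1 to 4 can be evaluated directly from a predicted mask and a ground-truth mask. The sketch below is a plain-Python illustration with hypothetical function names and toy masks; returning 0.0 for a zero denominator is an assumed convention.

```python
def confusion_counts(pred, truth):
    """Count TP, TN, FP, FN between two binary masks of equal shape."""
    tp = tn = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif not p and not t:
                tn += 1
            elif p and not t:
                fp += 1
            else:
                fn += 1
    return tp, tn, fp, fn


def segmentation_metrics(pred, truth):
    """Evaluate Eqs. 1-4; a zero denominator yields 0.0 (assumed convention)."""
    tp, tn, fp, fn = confusion_counts(pred, truth)

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }


truth = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
pred = [[1, 0, 0, 0],
        [1, 1, 1, 0]]
metrics = segmentation_metrics(pred, truth)
```

For these toy masks there are three true positives, three true negatives, one false positive, and one false negative, so accuracy, precision, and recall are all 0.75 while the IoU is 0.6. The IoU is the strictest of the four because true negatives do not contribute to it.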

The U-Net model predicts a probability map in which each pixel is assigned a value from 0 to 1. This value is a confidence score for the pixel: 0 indicates no confidence and 1 indicates complete confidence that the pixel belongs to the crystallite class. We selected 0.5 as the threshold to binarize the probability map into a binary mask, where each pixel is labeled as either crystallite or amorphous region.
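A minimal sketch of the binarization step is given below. Whether a score of exactly 0.5 counts as crystallite is not specified above; the inclusive comparison is an assumption, as are the function name and the toy values.

```python
def binarize(prob_map, threshold=0.5):
    """Label a pixel as crystallite (1) when its score is at least `threshold`."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


# Toy probability map; real maps are 512 x 512
prob_map = [[0.05, 0.49, 0.50],
            [0.51, 0.97, 0.10]]
mask = binarize(prob_map)
```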

The best-performing model was saved based on validation metrics and used to generate predictions for all image sequences. After generating predictions for the full dataset, connected-component labeling was used to quantify the statistics of individual crystallites. The following statistics were recorded for each crystallite: area, centroid, bounding box, and edge coordinates. Because connected-component labeling relies on neighboring pixels, two pixels from different crystallites that lie next to one another cause the algorithm to label both crystallites as a single crystallite.
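The labeling step and its merging behavior can be illustrated with a small flood-fill sketch. This is not the implementation used in the study (the library is not named above); the function name, the connectivity options, and the toy mask are assumptions, and edge coordinates are omitted for brevity.

```python
from collections import deque


def label_components(mask, connectivity=8):
    """Label connected foreground regions and return (labels, stats).

    stats maps each label to its area, centroid (row, col), and bounding
    box (min_row, min_col, max_row, max_col).
    """
    rows, cols = len(mask), len(mask[0])
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    labels = [[0] * cols for _ in range(rows)]
    stats = {}
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or labels[r0][c0]:
                continue
            next_label += 1
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            pixels = []
            while queue:  # breadth-first flood fill of one region
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr, dc in offsets:
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not labels[nr][nc]):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            area = len(pixels)
            rs = [p[0] for p in pixels]
            cs = [p[1] for p in pixels]
            stats[next_label] = {
                "area": area,
                "centroid": (sum(rs) / area, sum(cs) / area),
                "bbox": (min(rs), min(cs), max(rs), max(cs)),
            }
    return labels, stats


# The pixel at (2, 2) touches the upper-left block only diagonally
mask = [[1, 1, 0, 0, 0],
        [1, 1, 0, 0, 1],
        [0, 0, 1, 0, 1]]
labels, stats = label_components(mask, connectivity=8)
_, stats4 = label_components(mask, connectivity=4)
```

With 8-connectivity the diagonal contact merges two regions into one, giving two components; with 4-connectivity the same mask yields three. Either way, any direct contact between neighboring crystallites fuses them into a single label.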

To solve this issue, masks predicted by the U-Net model were processed to impute borders between crystallites. This post-processing procedure involved taking the final image sequence and manually annotating pixels that belonged to the amorphous region. This yielded a border mask that enforced division between crystallites that appear visually merged. Since a pixel which is part of the amorphous region in the final image sequence will also be part of the amorphous region in all previous image sequences, the border mask can be applied to all U-Net predictions across all the image sequences. This high-fidelity procedure ensured that borders between crystallites always existed and connected components captured crystallite statistics appropriately.
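The effect of the border mask can be sketched as follows, under the assumption that the mask marks final-frame amorphous pixels with 1. The function names and toy arrays are hypothetical; the component counter uses 4-connectivity for brevity.

```python
def apply_border_mask(pred_mask, border_mask):
    """Set predicted pixels to 0 wherever the border mask marks amorphous material."""
    return [
        [0 if b else p for p, b in zip(pred_row, border_row)]
        for pred_row, border_row in zip(pred_mask, border_mask)
    ]


def count_components(mask):
    """Count 4-connected foreground regions with an iterative flood fill."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            count += 1
            seen[r0][c0] = True
            stack = [(r0, c0)]
            while stack:
                r, c = stack.pop()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
    return count


# Two crystallites predicted as one merged region
pred_mask = [[1, 1, 1, 1, 1],
             [1, 1, 1, 1, 1]]
# Manually annotated amorphous pixels from the final image (toy example)
border_mask = [[0, 0, 1, 0, 0],
               [0, 0, 1, 0, 0]]
separated = apply_border_mask(pred_mask, border_mask)
```

Before the border mask is applied the toy prediction contains one component; afterwards it contains two. Because the same border mask is valid for every earlier frame, it can be applied to all predictions without further annotation.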

We calculated nearest-neighbor distances using Euclidean distances for each pair of crystallites in a matrix format. Given two sets of edge coordinates \(E_{1}\) and \(E_{2}\), we constructed a Euclidean distance matrix between them and used the minimum value as the nearest-neighbor distance between a pair of crystallites. We generated the nearest-neighbor distances for all crystallites in a sequence of images.
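The distance calculation can be sketched in plain Python as below. The function names and the toy edge coordinates are hypothetical; a brute-force search over all coordinate pairs is used for clarity, which is equivalent to taking the minimum of the full Euclidean distance matrix between \(E_{1}\) and \(E_{2}\).

```python
import math


def nearest_neighbor_distance(edges_a, edges_b):
    """Minimum Euclidean distance between two sets of (row, col) edge coordinates."""
    return min(math.dist(p, q) for p in edges_a for q in edges_b)


def pairwise_nearest_neighbor(edge_sets):
    """Symmetric nearest-neighbor distance matrix for all crystallites in one image.

    edge_sets maps crystallite ID -> list of edge coordinates; the result is
    a nested dict indexed as matrix[id_a][id_b], with zeros on the diagonal.
    """
    ids = sorted(edge_sets)
    matrix = {i: {j: 0.0 for j in ids} for i in ids}
    for pos, i in enumerate(ids):
        for j in ids[pos + 1:]:
            d = nearest_neighbor_distance(edge_sets[i], edge_sets[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Toy edge coordinates in pixels
edge_sets = {
    1: [(0, 0), (0, 3), (3, 0), (3, 3)],
    2: [(3, 7), (6, 7)],
    3: [(10, 0)],
}
distances = pairwise_nearest_neighbor(edge_sets)
```

The brute-force search scales with the product of the two edge lengths, which is acceptable for a dozen crystallites per frame; a spatial index would be a natural substitute for denser images.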

The interface of the visual query tool was designed using NetworkX, a Python package for the creation, operation, visualization, and analysis of complex networks [35]. For the st-graph representations shown in this article, we used the igraph Python library along with NetworkX.

Results

The dataset we analyzed in this study consists of AFM image sequences that track the temporal evolution of a fluoroelastomer film while heating.

Figure 1 shows the nucleation and growth of crystallites from an amorphous FK-800 film. Nucleation likely occurs on the defect sites and expands in time to form crystallites. The crystallites typically rise 10–15 nm above the amorphous film and appear as lighter colors in the height images. As the crystallites grow, they consume material near their growth front, leading to a depleted zone (darker colors) at their edges. In our analysis, we approximate the film as having two components (amorphous and crystalline), and include the depleted zone as part of the amorphous phase. In our future work, we will treat the depleted zone as a third component. We also approximate the particles as two-dimensional because the lateral dimensions (in microns) are much greater than the height (<25 nm).

Fig. 1

Hot-stage atomic force microscopy (AFM) height image sequences illustrating the nucleation and growth in FK-800 fluoroelastomer. a At the beginning, the film is predominately amorphous. Crystallites grow after thermal aging at 50.0\(^\circ \)C for b 1 h, c 8 h, d 12 h, e 1 day, and f 2 days. All images have a field of view of 2 \(\times \) 2 μm\(^{2}\) and a false-color z-scale of 25 nm (shown at the right). Times are precise to within 300 s and temperature is accurate to ±0.2\(^\circ \)C

Figure 2 illustrates the AFM images and the corresponding U-Net predictions obtained by segmenting the image sequences. In the U-Net predictions, crystallites are shown as white pixels and amorphous regions as black pixels. The AFM images, along with the corresponding U-Net predictions, are displayed for two selected timesteps in the sequence. The AFM images are shown in grayscale, with lighter shades representing the taller crystallites emerging from the amorphous region.

The accuracy of image segmentation is 0.94 with data augmentation and 0.97 without data augmentation. The metrics that characterize the quality of the U-Net predictions are summarized in the table accompanying Fig. 2.

Fig. 2

AFM height images (a, c) and corresponding U-Net predictions (b, d) at two time steps. a, b Timestep 67 corresponds to an image start time of 14050 s and c, d timestep 236 corresponds to an image start time of 49492 s. The table shows the evaluation parameters for U-Net predictions

Beyond categorizing crystalline and amorphous regions, we need to identify each individual crystallite to understand its temporal evolution. We also categorized crystallites as isolated or impinged, and as within or outside the field of view. Categorizing crystallites in this manner enables crystallite tracking and quantification of crystallite kinetics, and allows us to flag crystallites that behave differently. Over the entire range of time, we identified a total of twelve crystallites. Each crystallite was assigned a unique numerical ID reflecting the order of its nucleation. Figure 3 shows the crystallite IDs for an image toward the end of the time period, when the film is almost fully crystallized and the crystallites impinge on each other. Crystallite 2 originally nucleated within the field of view and grew partially out of it over time. Other crystallites (such as crystallite 8) nucleated outside the field of view. Only crystallite 3 remained fully in frame throughout the crystallization process.

Fig. 3

Labels for crystallites formed during the crystallization process

To study phase transformations, the Avrami equation has been widely used [36]. Equation 5 shows the Avrami equation, where n is the Avrami exponent, k is the rate constant, y is the fraction of material transformed, and t is the time for the transformation to occur.

$$\begin{aligned} y = 1 - e^{-kt^{n}} \end{aligned}$$
(5)

Figure 4 shows the overall temporal progression of crystallization by plotting the fraction of the surface that is crystalline. The normalized areas throughout this study were obtained by dividing areas by the total image area. The Avrami equation was fitted to the normalized total crystalline area versus time using n = 2 to represent 2D growth. The curve starts near zero, increases rapidly at first, and then reaches a plateau. The value of k estimated from the fit is 3.1\(\times \)10\(^{-10}\) s\(^{-2}\).
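One way to estimate k for a fixed exponent is sketched below. The fitting procedure used in the study is not described, so this is only an illustration: it linearizes Eq. 5 as \(-\ln (1 - y) = kt^{n}\) and fits a line through the origin. The function names are hypothetical and the data are synthetic, generated from the reported k value.

```python
import math


def avrami(t, k, n=2):
    """Transformed fraction y = 1 - exp(-k * t**n) (Eq. 5)."""
    return 1.0 - math.exp(-k * t ** n)


def fit_rate_constant(times, fractions, n=2):
    """Estimate k for a fixed exponent n by linearized least squares.

    Uses -ln(1 - y) = k * t**n and fits a line through the origin.
    Points with y outside (0, 1) are skipped because the logarithm diverges.
    """
    num = den = 0.0
    for t, y in zip(times, fractions):
        if 0.0 < y < 1.0:
            x = t ** n
            num += x * -math.log(1.0 - y)
            den += x * x
    if den == 0.0:
        raise ValueError("no usable data points")
    return num / den


# Synthetic, noise-free data generated with the k value reported in the text
k_true = 3.1e-10  # s^-2
times = [261.0 * i for i in range(1, 400, 10)]  # seconds
fractions = [avrami(t, k_true) for t in times]
k_fit = fit_rate_constant(times, fractions)
```

The linearization weights points near full coverage heavily, since \(-\ln (1 - y)\) grows without bound as y approaches 1. For noisy measurements, a nonlinear least-squares fit of Eq. 5 in its original form may therefore be preferable.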

Fig. 4

Normalized crystalline coverage versus time. The plot shows the fraction of the surface that is crystalline as determined by the U-Net model (gray dots). Avrami fitting is done for normalized area versus time. The blue line indicates the Avrami fit using an Avrami exponent, n = 2

Figure 5 shows the temporal evolution of individual crystallite growth using normalized areas. Summing the areas of all crystallites gives the curve in Fig. 4, such that at the final time of observation, the total normalized crystalline area approaches 1.00. Owing to different nucleation times, the total period of observation differs for each crystallite. Offsetting by the nucleation time allows us to compare curve shapes, which vary substantially. The crystallite areas asymptotically approach a plateau, corresponding to the final crystallite size. We attribute the curve shape variation to several factors including anisotropic growth, impingement, and partial area measurements. We observed that only one of the twelve crystallites (crystallite 3) remains fully within the field of view over the entire period of observation. Crystallite 2 remains fully within the field of view only for the first 125 timesteps (26214 s).

To account for impingement, we investigated the kinetics of each crystallite in the context of its nearest neighbors. Figure 5 displays the areas of crystallites, either isolated or impinged by neighboring crystallites, as a function of time. The isolated crystallites are denoted by solid circles and impinged crystallites are denoted by open circles. The criterion for impingement is that two crystallites are separated by 10 px (\(\sim \)0.04 μm) or less, which is approximately the width of the depleted zone. Once a crystallite is impinged by a neighboring crystallite, we mark it as impinged and refer to this event as first impingement. Impingement by neighboring crystallites occurs at a different time for each crystallite. Time after nucleation was therefore used so that all curves share a common starting time and are comparable. The crystallite areas increase slowly at first; the rate of increase then passes through a maximum before the areas asymptotically approach a plateau corresponding to the final crystallite size.
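The impingement criterion lends itself to a short sketch. The function names and the distance values below are hypothetical; the only quantities taken from the text are the 10 px threshold and the rule that a crystallite stays marked as impinged after its first impingement.

```python
IMPINGEMENT_PX = 10  # criterion from the text: separation of 10 px or less


def first_impingement(nn_distances, threshold=IMPINGEMENT_PX):
    """Index of the first timestep with nearest-neighbor distance <= threshold.

    Entries of None (no neighbor present yet) are ignored. Returns None if
    the crystallite is never impinged.
    """
    for step, d in enumerate(nn_distances):
        if d is not None and d <= threshold:
            return step
    return None


def impingement_status(nn_distances, threshold=IMPINGEMENT_PX):
    """Label each timestep 'isolated' or 'impinged'; once impinged, always impinged."""
    first = first_impingement(nn_distances, threshold)
    return [
        "isolated" if first is None or step < first else "impinged"
        for step in range(len(nn_distances))
    ]


# Hypothetical distances (px) from one crystallite to its closest neighbor
nn = [None, 85.0, 40.2, 12.5, 10.0, 6.3, 11.0]
status = impingement_status(nn)
```

In this example the crystallite is first impinged at index 4, where the separation reaches exactly 10 px, and it keeps that status even when a later measurement rises slightly above the threshold.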

Fig. 5

Normalized areas of crystallites as a function of time after nucleation. Each curve represents a different crystallite ID. Isolated and impinged crystallites are denoted by open and solid points, respectively

Figure 6 shows the temporal evolution of the effective radius of individual crystallites in their isolated and impinged states. The effective radius was obtained by approximating the 2D crystallite shape as a circle using A = \(\pi \) \(r^{2}\), where A is the crystallite area, and r is the effective radius. The overall curvature is similar to Fig. 5.

Fig. 6

Temporal evolution of effective radii of crystallites. Each panel corresponds to a single crystallite. The crystallite status is distinguished based on when it is isolated (purple) and when it is first impinged (orange)

Using effective radii, we calculated effective radial growth rates for crystallites using the finite difference derivative of effective radii and times. Figure 7 shows the temporal variation of the effective radial growth rates (\(G_\textrm{r}\)) for isolated and impinged crystallites. Error bands were estimated based on error propagation for the finite difference derivative, where the primary error stems from the radial measurements and was assumed to be 1 pixel (\(\sim \)0.004 μm). To minimize the time error, we used image sequences that were scanned in the same direction (i.e., both down) such that the differences in the image start time reflected the difference in time at any pixel position within the image. Hence, the effective radial growth rates were calculated for every 5 timesteps (\(\sim \)1305 s) to capture significant changes in radius over time.
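The growth-rate calculation can be sketched as follows. The exact error model is not given above, so the propagation used here is an assumption: independent radius errors of 1 px at both ends of each interval and negligible timing error, giving \(\sigma _{G} = \sqrt{2}\,\sigma _{r}/\Delta t\). Function names and the synthetic crystallite are hypothetical.

```python
import math

PX_TO_UM = 2.0 / 512  # 2 um field of view over 512 px, about 0.004 um/px


def effective_radius(area_um2):
    """Radius of the circle with the same area: A = pi * r**2."""
    return math.sqrt(area_um2 / math.pi)


def radial_growth_rates(times_s, areas_um2, stride=5, sigma_r_um=PX_TO_UM):
    """Finite-difference radial growth rate every `stride` timesteps.

    Returns a list of (t_mid, G_r, sigma_G) tuples in s, um/s, and um/s.
    sigma_G assumes independent radius errors of sigma_r_um at both ends
    of the interval and negligible timing error.
    """
    radii = [effective_radius(a) for a in areas_um2]
    results = []
    for i in range(0, len(radii) - stride, stride):
        dt = times_s[i + stride] - times_s[i]
        rate = (radii[i + stride] - radii[i]) / dt
        sigma = math.sqrt(2.0) * sigma_r_um / dt
        t_mid = 0.5 * (times_s[i] + times_s[i + stride])
        results.append((t_mid, rate, sigma))
    return results


# Synthetic crystallite whose radius grows at a constant 1e-5 um/s
times = [261.0 * i for i in range(11)]
areas = [math.pi * (0.05 + 1e-5 * t) ** 2 for t in times]
rates = radial_growth_rates(times, areas)
```

Under these assumptions the uncertainty per 1305 s interval is of the same order as the growth rate itself for slowly growing crystallites, which illustrates why differencing over several timesteps rather than consecutive frames is needed to resolve a significant change in radius.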

Fig. 7

Temporal evolution of effective radial growth rates with error bands (highlighted in gray). Each panel represents a crystallite. Isolated (purple) and impinged (orange) crystallites are captured in this plot

In most cases, \(G_\textrm{r}\) of crystallites decreases monotonically in time, approaching zero as all edges of the crystallite become impinged or grow out of the field of view. However, for crystallites 1 and 2, \(G_\textrm{r}\) increases to a maximum value, decreases, and then approaches zero. We attribute this behavior to crystallites that nucleate with their fast growth direction perpendicular to the substrate plane rather than in the substrate plane. The propensity to nucleate vertically may be more likely in the initial stages of growth, as this behavior is observed for the crystallites that nucleated earlier. More statistics would be needed to verify this.

To quantify the proximity of crystallites to each other, we obtained st-graph representations for different timesteps. Figure 8 shows the relationship of AFM image sequences to the corresponding st-graph representations. The nodes in the graph represent the crystallites, and the presence of an edge denotes that the nearest-neighbor distance between two crystallites is less than 100 px (\(\sim \)0.4 μm). These representations show the proximity of crystallites and the formation of new crystallites over time.
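The graph construction can be sketched with plain dictionaries, as below. This is an illustration rather than the code behind the visual query tool: the function names, timestep labels, and distances are hypothetical, and an adjacency dictionary stands in for a graph library object. The same structure maps onto NetworkX through `Graph.add_node` and `Graph.add_edge`.

```python
def build_graph(crystallite_ids, pair_distances, max_distance_px=100):
    """Return an adjacency dict {node: {neighbor: distance}} for one timestep.

    Every crystallite becomes a node; an edge is added when the
    nearest-neighbor distance is below `max_distance_px`.
    """
    graph = {cid: {} for cid in crystallite_ids}
    for (a, b), d in pair_distances.items():
        if d < max_distance_px:
            graph[a][b] = d
            graph[b][a] = d
    return graph


def build_st_graph(frames, max_distance_px=100):
    """Map each timestep to its graph, giving a spatiotemporal sequence."""
    return {
        step: build_graph(ids, dists, max_distance_px)
        for step, (ids, dists) in frames.items()
    }


# Hypothetical distances (px) at two timesteps; crystallite 3 nucleates later
frames = {
    10: ([1, 2], {(1, 2): 140.0}),
    20: ([1, 2, 3], {(1, 2): 35.0, (1, 3): 120.0, (2, 3): 80.0}),
}
st_graph = build_st_graph(frames)
```

At the earlier timestep the two crystallites are too far apart to be connected. At the later timestep a new node appears and edges form wherever the separation has dropped below 100 px, which is how the representation captures both nucleation and growing proximity.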

We have integrated the information from the kinetics of crystallite growth and the st-graphs into the visual query tool (MDS3-KGraph). The visual query tool lets the user go to any particular timestep of the crystallization process to observe the corresponding st-graph representation and the kinetics up to that timestep. We can obtain a summary of crystallite characteristics, such as area and centroid, at any timestep. The visual query tool is available on PyPI [32].

Fig. 8

AFM image sequences represented as spatiotemporal graphs over time. The presence of an edge means that the nearest-neighbor distance between two crystallites is less than 100 px (\(\sim \)0.4 μm). The pixel (px) values of the nearest-neighbor distances are denoted in the figure

Discussion

We demonstrated different components of our modular materials science framework, which enables us to do image segmentation, particle growth analysis, and build st-graph representations. In this section, we explain some of the caveats for this type of analysis and discuss our results further.

Information about the Dataset

We conducted a 2D quantitative analysis of AFM height images. This is an approximation, as AFM height images are three-dimensional in nature. The FK-800 fluoroelastomer film has crystallites that are less than 50 nm thick and have lateral dimensions in microns, so the crystallites can be approximated as 2D disks. The change in crystallite height is negligibly small compared to the change in lateral dimensions during growth. 2D analysis provides important information about the areal coverage of crystallites and a useful starting point for future 3D analysis.

Although we can formally quantify individual crystallite areas over time and calculate effective radii and time-dependent effective radial growth rates, this dataset is not ideal for such detailed analyses due to its narrow field of view. In an ideal situation, we would obtain the inherent effective radial growth rates from isolated crystallites that are not influenced by neighboring crystallites and remain fully within the field of view for accurate area measurements. Therefore, the conclusions we can draw regarding effective radial growth rates in this study are limited. However, we have demonstrated a viable framework for such a study using AFM image sequences. This framework could be implemented for image sequences with a larger field of view or for data obtained using other imaging techniques, such as optical microscopy.

Even with these limitations, the framework enables us to calculate various growth characteristics like areas and effective radial growth rates for all crystallites across all the timesteps. In addition, we track isolated and impinged crystallites as well as crystallites growing in-frame or out of the image frame with time. This enables us to see the behavior of crystallites throughout time, without requiring us to sample certain crystallites or timesteps.

Analysis of Image Sequences

As crystallites grow and occupy space, they will invariably impinge upon each other at their growth fronts, slowing their growth rates. We define impingement as occurring when two crystallites are separated by 10 px (\(\sim \)0.04 μm) or less. This threshold was chosen qualitatively by observing several crystallites and the depleted zones between pairs of crystallites. We have incorporated this information in our data, giving us a holistic picture of crystallite growth in the isolated and impinged cases.

We selected a U-Net variant as the DL image analysis method in this study. A key advantage of U-Nets is the use of a convolutional encoder-decoder architecture that allows efficient end-to-end learning by combining high-resolution features from the input image with context from the downsampling path. This enables precise localization and segmentation of objects while also giving global context from the image. Given their demonstrated success in similar segmentation tasks, such as nuclei detection in medical imaging [37], U-Nets are an excellent choice to accurately identify and delineate the boundaries of our crystallite structures.

In another study, we implemented the You Only Look Once (YOLO) algorithm for crystallite detection and segmentation [38]. YOLO is fast at inference, being able to segment at over 30 frames per second (FPS). This speed comes from its single-shot design that divides the image into a grid and makes predictions for bounding boxes and class probabilities directly from full image context in one pass. While very fast, this grid design struggles to produce segmentation maps as precise as those of U-Nets, especially for small, thin, or overlapping objects. Therefore, for a task where precision is more important than speed, we chose the U-Net model.

The success of DL methods in identifying crystallites enables several avenues of analysis that would be difficult, though not impossible, if the images were analyzed manually. First, these analyses can be used to quantify the crystalline and amorphous fractions of the surface. We can obtain the surface-averaged time course of crystallization, which can be compared with crystallite growth models. Second, identification of individual crystallites allows comparison between them, opening up the possibility of quantifying statistical variations. Third, kinetics of individual crystallites can be investigated in the context of their neighbors, which is expected to be important when there are long-range interactions between crystallites. Although this particular dataset is not ideal for demonstrating all of these effects on fluoroelastomer crystallization kinetics, it is sufficiently complex to demonstrate the workflow, the issues, and the potential.

Our current image analysis approach contains some limitations. To validate across the dataset using the U-Net model, we selected four diverse images. This strategic selection was to maximize dissimilarity between images by considering different stages of growth, which allowed us to assess the robustness of the U-Net model. Expanding the validation set with images from different length scales, temperatures, and nucleation densities is important in our future work. This will provide a more comprehensive assessment of the model’s generalizability and robustness. Currently, crystallite separation is enforced by imputing the final amorphous region across all the predictions. However, this requires a manual annotation of the final image and post-processing predicted masks, which is inefficient. Future directions might include leveraging instance segmentation rather than semantic segmentation, a weighted loss term to penalize erroneous borders, or alternative model architectures that can incorporate temporal information to distinguish boundaries.

Growth Behavior of the Entire System and Individual Crystallites

In Avrami fitting, we used an n value of 2 to approximate the crystallites as 2D disks. The same approximation was used in another similar study from our research group where YOLO was utilized [38].

In this study, we approximated the growth of these crystallites to be isotropic to simplify the calculation of the effective radius. Each individual crystallite is seen to have different rates of growth in different directions at different times, leading to growth anisotropy. The temperature and concentration gradients associated with this phenomenon will be explored in a future study.

Crystallite geometries play an important role in our understanding of the fast growth directions. Crystallites that initiate growth out of plane transition to in-plane growth over time. We have not considered these aspects in our current work, but we hope to explore them in greater detail, along with different crystallite morphologies, in our future work.

Conclusion

We have developed a general materials data science framework for the 2D analysis of particle growth. The novelty of our approach consists of employing deep learning and visualizing growth kinetics for image sequences without sampling for certain crystallites or timesteps. We have shown that this approach can be used for quantitative analysis of the temporal evolution of particle dispersions, especially the analysis of particle growth, and that the information can be effectively summarized in a visual query tool. The relationship between image sequences and spatiotemporal graphs could help us in understanding growth kinetics in the presence of neighboring crystallites and potential long-range interactions.

We implemented this general framework to investigate growth kinetics associated with the formation of crystallites in a fluoroelastomer film. It is critical to formulate an algorithm for particle recognition (image segmentation) in order to avoid artifacts. We expect that the methods used in the study are suitable for studying growth kinetics of any general materials system.