Introduction and Motivation

Since 2012, society has seen drastic improvements in the fields of automated/autonomous data analysis, informatics, and deep learning (defined later). The advancements stem from gains in widespread digital data, computing power, and algorithms applied to machine-learning (ML) and artificial intelligence (AI) systems. Here, we distinguish the term ML as obtaining a computed model of complex non-linear relationships or complex patterns within data (usually beyond human capability or established physics to define), and AI as the framework for making machine-based decisions and actions using ML tools and analyses. Both of these are necessary but not sufficient steps for attaining autonomous systems. Autonomy requires at least three concurrently operating technologies: (i) perception or sensing a field of information and making analyses (i.e., ML); (ii) predicting or forecasting how the sensed field will evolve or change over time; and (iii) establishing a policy or decision basis for a machine (robot) to take unsupervised action based on (i) and (ii). We note that item (ii) in the aforementioned list is not often discussed with respect to ML, since its technical essence resides within the realm of control theory/control systems engineering. Nonetheless, these control systems are increasingly using both models and ML/AI for learning the trajectories of the sensed field evolution and generating the navigation policy, going beyond ML for interpretation of the sensed field [1,2,3]. In this context, we also note that making predictions or forecasts about engineered systems is a core strength of the materials, processes, and structures engineering (MPSE) fields of practice. As we discuss later, that core strength will be essential to leverage both for bringing some aspects of ML/AI tools into MPSE and for aiding with understanding the tools themselves. Thus, a natural basis exists for a marriage between ML or data science and MPSE for attaining autonomous materials discovery systems.

From another perspective, in engineering and materials, “Big Data” often refers to data itself and repositories for it. However, more vexing issues are tied to the myriad sources of data and the often sparse nature of materials data. Within current MPSE practices, the scale and velocity of acquiring data, the veracity of data, and even the volatility of the data are additional challenges for practitioners. These challenges raise the question of how to analyze and use MPSE data in a practical manner that supports decisions for developers and designers. That challenge looms large since the data sources and their attributes have defied development within a structured overall ontology, thus leaving MPSE data “semi-structured” at best. Here too, ML/AI technologies are likely transformational for advancing new solutions to the long-standing data structure challenge. By embracing ML/AI tools for dealing with data, one naturally evolves data structures associated with the use of ML tools, related to both the input and output forms. Further, when the tools are employed, one gains insight into the sufficiency of data for attaining a given level of analysis. Finally, since the tools for ML and AI are primarily being developed to treat unstructured data, there may be gains in understanding the broad MPSE data ontology by employing them within MPSE.

Materials data have wide-ranging scope and often relatively little depth. In this context, depth can be interpreted as the number of independent observations of the state of a system. The lack of data depth stems not only from the historically high costs and difficulty of acquiring materials data, especially experimentally, but also from the nature of the data itself (i.e., small numbers (< 100) of mechanical tests, micrographs or images, chemical spectra, etc.). Yet utilizing data to its fullest is a key aspect of advanced engineering design systems. Consequently, the emerging ML/AI technologies that support mining and extracting knowledge from data may form an important aspect of future data, informatics, and visualization aspects of engineering design systems, provided that the ML/AI tools can be evolved for use within more limited data sets. That evolution must include modeling the means/systems for acquiring the data itself. That is, because the data are so expensive and typically difficult to acquire, the data must exist within model frameworks that permit synthesizing data related to what is actually acquired, or filling gaps in the data to facilitate further analysis and modeling. Having such structures would permit ML/AI tools to form rigorous relationships between these types of data, measured and synthesized. Most likely, MPSE practitioners will need to evolve methods that are purposefully designed to provide the levels of data needed for ML/AI within this data–model construct.

The role of ML/AI in the broader context of integrated computational materials engineering (ICME) is still evolving. Although materials data has been a topic of interest in MPSE for some time [4, 5], ML/AI was not called out in earlier ICME reports and roadmaps [6, 7] or in the Materials Genome Initiative (MGI) that incorporates ICME in the MPSE workflow [8, 9]. However, it is an obvious component of a holistic ICME approach, supporting MGI goals in data analytics and experimental design as well as materials discovery through integrated research and development [9]. As detailed in the discussion below, ML/AI is rapidly being integrated into ICME and MGI efforts, supporting accelerated materials development, autonomous and high-throughput experiments, novel simulation methodologies, and advanced data analytics, among other areas.

ML and AI technologies already impact our everyday lives. However, as practitioners of the physical sciences, we may ask what has changed, or why should a scientist be concerned now with ML and AI technologies for MPSE? Aren’t these technologies simply sophisticated curve fits or “black box” tools? Is there any physics there? Less skeptically and more objectively, one might also ask what are the important achievements from these tools, and how are those achievements related to familiar physics? Or, how can one best apply the newest advances in ML and AI to improve MPSE results? Speculating still further, why are there no emerging AI-based engineering design systems that recognize component features, attributes, or intended performance to make recommendations about directions for final design, manufacturing processes, and materials selections or developments? Such systems are possible over the next 20 years. Indeed, Jordan and Mitchell suggest that “…machine learning is likely to be one of the most transformative technologies of the 21st century…” [10] and therefore cannot be neglected in any long-range development of engineering practices.

The present overview is intended to serve as a selective introduction to ML and AI methods and applications, as well as to give perspective on their use in the MPSE fields, especially for modeling and simulation. The computer science and related research communities have produced in excess of 2000 papers per year over the last 3 years (more than 15,000 in the last decade) on new algorithms and applications of ML technologies. One cannot hope to offer a comprehensive review and discussion of these in a readable introductory review. As such, we examined perhaps 10% of the recent literature and chose to highlight a small fraction of the papers examined. These reveal selected aspects of the field (perhaps some of which are lesser known) that we believe should capture the attention of MPSE practitioners, knowing that the review will be outdated upon publication.

Selected Context from Outside of MPSE

Readers may already be familiar with applications of ML- and AI-based commercial technologies, e.g., music identification via real-time signal processing on commodity smartphone hardware; cameras having automatic facial recognition; and recommendation systems for consumers that inform users about movies, news stories, or products [11, 12]. Further, AI technologies are used to monitor agricultural fields for insect types and populations, to manage power usage in computer server centers at beyond-human performance, and are now being deployed in driver-assisted and driverless vehicles [13,14,15,16].

Just since 2016, a data-driven, real-time, computer vision and AI system has been deployed to identify weeds individually in agricultural fields and to locally apply herbicides, as a substitute for broadcast spraying [17]. Google switched its old “rules-based” language translation system to a deep-learning neural network-based system, realizing step-function improvements in the quality of translations, and they continue to grow that effort and many others around deep learning, abandoning rules-based systems [18, 19]. The games of “Go,” “Chess,” and “Poker” have been mastered by machines to a level that exceeds the play of the best human players [20,21,22,23]. Perhaps more important to MPSE, the new power of deep-learning networks was vividly shown in 2012, when researchers not only made step-function improvements in image recognition and classification but also surprisingly discovered that deep networks could teach themselves in an unsupervised fashion [24, 25]. Most recently, a self-taught unsupervised gaming machine exceeded the playing capability of the prior “Go” champion, also a machine, that was developed with human supervised learning [26]. For selected instances, the machines can now even self-teach tasks better than the best-skilled human experts! The powers and applications of ML/AI tools are expanding so rapidly that it is hard to envisage any aspect of MPSE or multiscale modeling and simulation, or engineering overall, that will not be impacted over the next decade. Our primary challenge is to discern how such capabilities can best be integrated into MPSE practices as standard methods, and how to implement them in appropriate ways as soon as possible.

Background and Selected Terms

To better understand aspects of the current ML/AI revolution, it is useful to consider selected background and terms from literature about the field. AI as a field of study has been around since the middle 1950s; however, it is the recent growth in data availability, algorithms, and computing power that has brought a resurgence to the field, especially for ML based on deep-learning neural networks (DLNN) [25, 27]. In practice, it has become important to distinguish the term “AI,” which is now most commonly associated with having machines achieve specific tasks within a narrow domain or discipline, from the term “artificial general intelligence” (AGI), which embodies the original and futuristic goal of having machines behave as humans do. The former is in the present, while the latter is likely beyond foreseeable horizons.

ML has long been used for non-linear regression and for finding patterns in data, and it has served as one approach for achieving AI goals [28, 29]. Three types of learning are commonly recognized: “supervised,” where the system learns from known data; “unsupervised,” where the unassisted system finds patterns in data; and “reinforcement” learning, where the system is programmed to make guesses at solutions and is “rewarded” in some way for correct answers, but is offered no guidance about incorrect answers. All three modes are used at today’s frontiers.

For the purposes of this overview, “data science” is a general term that implies systematic acquisition and analysis, hypothesis testing, and prediction around data. The field thereby encompasses wide-ranging aspects of the information technologies employed in data acquisition, fusion, mining, forecasting, and decision-making [30]. For example, all aspects of data science would be employed for autonomous systems. By contrast, materials “informatics” is focused on analysis of materials data to modify its form and to find the most effective use of the information; i.e., materials informatics is a subset of materials data science. Aspects of these concepts are shown schematically in Fig. 1. Data sciences, informatics, and some ML technologies are related to each other and have been used selectively in research and engineering for over half a century. However, until the last decade, their impact on materials and processes development, structures engineering, and the experimental methods used for parameterizing and verifying models was minimal. The challenges in MPSE were simply too complex, and data were too limited and expensive to obtain. Now, studies do show that the ML technologies can find relationships, occasionally discover physical laws, and suggest functional forms that may otherwise be hidden to ordinary scientific study, but such studies are few [29, 31].

Fig. 1

Data science may be considered as the technologies associated with acquiring data, forming and testing hypotheses about it, and making predictions by learning from the data. Five domains of activity are evident: (1) data acquisition technologies; (2) processing the data and making analyses of it; (3) building models and making forecasts from the data; (4) decision-making and policies driven from the data; and (5) visualizing and presenting the data and results. “Informatics” has primarily been involved with items 2 and 3 and has expanded slightly into items 1 and 5. ML principally encompasses items 1–3 and 5. AI usually encompasses items 1–4, while placing emphasis on item 4

Historical efforts in ML attempted to use “artificial neural networks” (ANN or NN) to mimic the neural connections and information processing understood to take place in human brains (biological neural networks or BNN). In a fashion that loosely mimics the human brain, these networks consist of mathematical frameworks that define complex, non-linear relationships between input information and outputs. Generally, for all network learning methods, the ANN contains layers of nodes (matrices) holding processed data transformed by the functional relationships that connect the nodes. A given node receives weighted inputs from a previous layer, performs an operation to adjust a net weight, and passes the result to the next layer. This is done by forming large matrices of repeatedly applied mathematical functions/transforms connecting nodes, with an expansion of features at each node. To use the network, one employs “training data” of known relationship to the desired outputs, to “teach” the network about the relationships between known inputs and favorable outputs (the weights). By iterating over the training data, the networks “learn” to assign appropriate weighting factors to the mathematical operations (linear, sigmoidal, etc.) that make the connections, and to find both strong and weak relationships within data.
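
To make these ideas concrete, the following minimal sketch (illustrative only, and not drawn from any of the works cited) trains a one-hidden-layer network in Python/NumPy on a toy non-linear function. Each layer applies its weight matrix and a non-linear transform, and iterating over the training data adjusts the weights to reduce the output error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: learn y = sin(x) on [-pi, pi] from 200 samples
X = rng.uniform(-np.pi, np.pi, size=(200, 1))
y = np.sin(X)

# One hidden layer of 16 nodes; weight matrices connect layer to layer
W1 = rng.normal(0, 0.5, size=(1, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, size=(16, 1)); b2 = np.zeros(1)

lr = 0.05
for epoch in range(2000):
    # Forward pass: each layer applies its weights, then a non-linearity
    h = np.tanh(X @ W1 + b1)       # hidden-layer node values
    y_hat = h @ W2 + b2            # network output
    err = y_hat - y                # error against the known outputs

    # Backward pass: propagate the error to adjust the weights
    grad_W2 = h.T @ err / len(X)
    grad_b2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h**2)  # tanh derivative
    grad_W1 = X.T @ dh / len(X)
    grad_b1 = dh.mean(axis=0)

    W2 -= lr * grad_W2; b2 -= lr * grad_b2
    W1 -= lr * grad_W1; b1 -= lr * grad_b1

print("final mean-squared error:", float((err**2).mean()))
```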

Importantly, the early networks typically had only one-to-three hidden layers between the input and output layers, and a limited number of connections between “neurons”; thus, they were not so useful for AI-based decision-making. Until recently, computers lacked the capacity, and algorithms were too underdeveloped, to permit deeper networks or significant progress on large-scale challenges [28, 32,33,34]. The techniques fell short of today’s deep-learning tools connected to AI decision-making. Consequently, with few exceptions, the historical technical approaches for achieving AI, even within specific applications, have been arduously tied to “rules,” requiring human experts to delineate and update the rules for ever-expanding use cases and learned instances—that is, until now.

Generally speaking, today’s ANN have changed completely. The availability of vast amounts of digital data for training; improvements to algorithms that permit new network architectures, ready training, and even self-teaching; and parallel processing and growth in computing power, including graphics processing unit (GPU) and tensor processing unit (TPU) architectures, have all led to deep-learning neural networks (DLNN) or “deep learning” (DL). Such DLNN often contain tens-to-thousands of hidden layers, rather than the historical one-to-three (thus the term “deep learning”). These advanced networks can contain a billion nodes or more, and many more connections between nodes [19, 25]. Placing this into perspective, the human brain is estimated to contain on the order of 100-to-1000 trillion connections (synapses) between fewer than 100 billion neurons; by comparison, today’s best deep networks are still 4–5 orders of magnitude smaller. Nonetheless, BNN still serve as models for the architectures being explored, and even at that scale, today’s networks provide tremendous, unprecedented capabilities.

Within DL technologies, there are several use-case-dependent architectures and implementations that provide powerful approaches to different AI domains. Those based on DLNN typically require extensive data sets for training (tens of thousands to millions of annotated instances). As mentioned previously, this presents a major challenge for their use in MPSE that most likely will have to be mitigated using simulated data in symbiosis with experimentally acquired data. Convolutional and de-convolutional (more appropriately, transposed convolutional) neural networks (CNN and TCNN, respectively) and their variations have three important network architecture attributes: (i) 3D volumes of node arrays arranged in deep layers; (ii) local connectivity, such that only a few tens of nodes communicate with each other at a time; and (iii) shared weights for each unit of connected nodes. These attributes radically speed up training, permitting the all-important greater depths. During use, the mathematical convolution (transposed convolution) operation allows concurrent learning and use of information from all of the locally narrow but deep array elements. Architecturally, the networks roughly mimic the BNN of the human eye, and they have proven their effectiveness in image recognition and classification tasks, now routinely beating human performance in several tasks [34,35,36,37].
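
A minimal sketch of these attributes, assuming the PyTorch library, is shown below. The layer sizes and class name are hypothetical, but the small shared-weight kernels (local 3×3 connectivity applied identically at every image position) and the stacked layers illustrate the CNN architecture just described:

```python
import torch
import torch.nn as nn

# Each Conv2d unit sees only a small local patch (3x3: "local
# connectivity") and applies the same weights at every image position
# ("shared weights"); stacking such layers gives the network its depth.
class TinyCNN(nn.Module):
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 3x3 local patches
            nn.ReLU(),
            nn.MaxPool2d(2),                              # downsample
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # deeper node volume
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, n_classes)

    def forward(self, x):
        x = self.features(x)               # (batch, 32, 8, 8) node volume
        return self.classifier(x.flatten(1))

# A random grayscale 32x32 "image" batch, purely for shape checking
logits = TinyCNN()(torch.randn(4, 1, 32, 32))
print(logits.shape)  # torch.Size([4, 10])
```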

Several even more advanced DLNN architectures emerged recently, including “recurrent” neural networks (RNN), which have taken on renewed utility in unsupervised language translation [38]; “regional” CNN (R-CNN), used for image object detection [39]; and “generative adversarial networks” (GAN) [40], used for unsupervised learning and training-data reduction, to name but a few (for overviews and reviews, see work by Li and by Schmidhuber [41,42,43]). Each of these architectures adapts DL to different task domains. For example, language translation and speech recognition benefit by adding a form of memory for time series analysis (RNN). GAN include simulated-plus-unsupervised training, or S + U learning, in which simulated data are “corrected” using unlabeled real data, as shown in Fig. 2. Reinforcement learning technology, to which GAN training is closely related, was used for the self-taught machine that mastered “Go” and has been used for the most recent language translation methods [26, 44]. Further, in a task that has similarities to aspects of MPSE, S + U training was used to correct facial recognition systems for the effects of pose changes, purely from simulated data [45, 46]. Given the widely expanding applications of DL, there is a high likelihood that architectures, algorithms, and methods for training will continue to evolve rapidly over the next 3–5 years.

Fig. 2

One may envisage producing materials microstructure models using S + U learning. In the schematic, numerous unlabeled real images are fed into a “discriminator” CNN that learns from both real and refined synthetic images, and classifies images from the refiner as real or fake. The “refiner” is a CNN that operates on simulated microstructure images, enhancing them toward the realism of measured micrographs. The simulations may be used to sample more microstructure spaces, or nuances of microstructure, that are difficult or expensive to measure experimentally, while the measured micrographs lend realism to the simulated images. Adapted from [
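
A minimal training-loop sketch of the S + U scheme in Fig. 2, assuming PyTorch and using random tensors as stand-ins for the measured and simulated micrographs (the tiny networks and the regularization weight are illustrative assumptions, not the published implementation):

```python
import torch
import torch.nn as nn

# Minimal stand-ins: a "refiner" that maps simulated micrographs toward
# realism, and a "discriminator" that labels images real vs. refined.
refiner = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
discriminator = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 16 * 16, 1),
)
opt_r = torch.optim.Adam(refiner.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    real = torch.randn(8, 1, 32, 32)   # stand-in: unlabeled real micrographs
    sim = torch.randn(8, 1, 32, 32)    # stand-in: simulated micrographs

    # Discriminator: classify real images as 1, refined synthetics as 0
    refined = refiner(sim).detach()
    d_loss = bce(discriminator(real), torch.ones(8, 1)) + \
             bce(discriminator(refined), torch.zeros(8, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Refiner: fool the discriminator, but stay close to the simulation
    refined = refiner(sim)
    g_loss = bce(discriminator(refined), torch.ones(8, 1)) + \
             0.1 * (refined - sim).abs().mean()   # self-regularization term
    opt_r.zero_grad(); g_loss.backward(); opt_r.step()
```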

Selected Applications and Achievements in Materials and Structures

For the case of multiscale materials and structures, we consider applications of ML/AI techniques in two main areas. First, selected examples illuminate accomplishments for materials discovery and design. While not necessarily noted in the works, these tie directly to MGI and ICME goals. These are followed by some examples applying ML/AI methods in structures analysis. Here again, the MGI goal of accelerating materials development, deployment, and life-cycle sustainment directly ties to the structures analysis aspect of ML/AI.

Materials Discovery

For about the last two decades, ML for materials structure-property relationships has used comparatively mature informatics methods. For example, principal component analysis (PCA) operating upon human-selected materials descriptors can lend insights into data. For PCA, the descriptor space is transformed linearly onto new orthogonal axes ordered by the data variance they capture, yielding a new representation for finding relationships. The new representation usually involves a dimensionality reduction, at the cost of losing more nuanced aspects of the data. Past efforts used microstructure descriptors (in a mean-field sense), such as average grain size, constituent phase fractions or dimensions, or material texture, and sought to relate these to mean-field properties, such as elastic modulus or yield stress [59, 60]. In the absence of high-throughput computational tools for obtaining materials kinetics information, structure-property relationships, and extreme-value microstructure influences, other studies resorted to experimental data to establish or to narrow the search domains for new materials [61,62,63,64,65,66,67,68,69,70].
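
As a worked illustration, the following NumPy sketch applies PCA to a hypothetical table of mean-field descriptors (the descriptor names and values are invented for illustration); it shows the variance-maximizing transformation and the attendant dimensionality reduction described above:

```python
import numpy as np

# Hypothetical mean-field descriptors for 50 alloys: grain size (um),
# phase fraction, and texture index; the columns are purely illustrative.
rng = np.random.default_rng(1)
descriptors = rng.normal(size=(50, 3)) * [12.0, 0.1, 0.3] + [40.0, 0.2, 1.1]

# Standardize, then diagonalize the covariance: the eigenvectors define
# new orthogonal axes ordered by the data variance they capture.
Z = (descriptors - descriptors.mean(0)) / descriptors.std(0)
cov = np.cov(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]          # sort descending by variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep the two leading components: a reduced representation that retains
# most of the variance while discarding more nuanced structure.
scores = Z @ eigvecs[:, :2]
print("variance captured:", eigvals[:2].sum() / eigvals.sum())
```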

Our expectation is that these approaches will also become more efficient, reliable, and prevalent in the coming decade or more, particularly since open data, open-source computing methods, and technology businesses are becoming available to support the methods and approaches [71,72,73,74]. Further, advancements in materials characterization capabilities, process monitoring and sensing methods, and software tools that have taken place over the previous 20 years [75] are giving unprecedented access to 3- and 4D materials microstructure data, and huge data sets pertaining to factory-floor materials processing. Such advancements suggest that the time is ripe for bringing ML/DL/AI tools into the materials and processes domain.

Mechanics, Mechanical Properties, and Structures Analysis

Historically, multiscale modeling, structures analysis, and structures engineering have all benefited from ML/AI tools. Largely because of their general ability to represent non-linear behaviors, different forms of ANN architectures have been used since the 1990s to model materials constitutive equations of various types [76, 77], to optimize composites architectures [78], and to represent hysteresis curves or non-linear behavior in various applications (such as fatigue) [79,80,81]. The closely related field of non-destructive evaluation also benefited from standard ANN techniques [82], though this field is not treated herein. Further, these methods were used for more than two decades in applications such as active structures control [83,84,85], and even for present-day flight control of drones [86]. Today, DL is bringing entirely new capabilities to structures and mechanics analysis.

In more recent work, ML methods are being used to address challenging problems in non-linear materials and dynamical systems and to evolve established ANN and informatics methods [87,88,89,90]. Further, newer deep-learning and other powerful data methods are beginning to be employed. For example, Versino et al. showed that symbolic regression ML is effective for constructing constitutive flow models that span over 16 orders of magnitude in strain rate [91]. Symbolic regression methods involve fitting arbitrarily complex functional forms to data, but doing so under constraints that penalize total function complexity, thus resulting in the simplest function sufficient to adequately fit the data [92]. Integrated frameworks are also beginning to appear [93]. These suggest a promising future that we consider more fully in what follows.
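
The flavor of complexity-penalized function discovery can be sketched with a simple model-selection loop; this is a toy stand-in for true symbolic regression, which searches over expression trees, and the candidate forms, synthetic data, and penalty weight here are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import curve_fit

# Toy flow-stress data: sigma grows logarithmically with strain rate
rng = np.random.default_rng(2)
rate = np.logspace(-4, 4, 60)
sigma = 100 + 12 * np.log10(rate) + rng.normal(0, 2.0, 60)

# Candidate functional forms, each with a hand-assigned complexity score
candidates = {
    "a":                (lambda x, a: 0*x + a,                1),
    "a + b*log10(x)":   (lambda x, a, b: a + b*np.log10(x),   2),
    "a + b*x":          (lambda x, a, b: a + b*x,             2),
    "a + b*x + c*x**2": (lambda x, a, b, c: a + b*x + c*x*x,  3),
}

best = None
for name, (f, complexity) in candidates.items():
    n_params = f.__code__.co_argcount - 1   # drop the x argument
    params, _ = curve_fit(f, rate, sigma, p0=np.ones(n_params))
    mse = np.mean((f(rate, *params) - sigma) ** 2)
    # AIC-like score: fit quality plus a penalty on function complexity,
    # so the simplest sufficient form wins
    score = len(rate) * np.log(mse) + 4.0 * complexity
    if best is None or score < best[0]:
        best = (score, name)
print("selected form:", best[1])
```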

A Perspective on the Unfolding Future

Looking forward, it is appropriate to consider the question, what has changed in ML/AI technologies, and what has fostered the explosive growth of this field? Also, how might these advancements impact MPSE? This section considers these questions and provides selected insights into the prospects for ML/DL/AI and their associated technologies. The perspective focuses on examples of using these tools for materials characterization, model development, and materials discovery, rather than a complete assessment of ML, DL, AI, and data science or informatics. Further, the emphasis is on achievements from 2015 to the present, with many examples from the last year or so.

Imaging and Quantitative Understanding

Most recently, computer vision tools, specifically CNN/DL methods, were applied to microstructure classification, thus forming initial building blocks for objective microstructure methods and opening a pathway to advanced AI-based materials discovery [63, 94,95,96,97,98]. By adopting CNN tools developed for other ML applications outside of engineering, these researchers were able to objectively define microstructure classes and automate micrograph classification [95, 97, 98]. Figure 3 shows an example of the methods being applied to correlate visual appearance to processing conditions for ultrahigh carbon steel microstructures. Today, even while they remain in their infancy, such methods have demonstrated about 94% accuracy in classifying microstructure types, rivaling human capabilities for these challenges [94, 97].
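
Adopting CNN tools from outside engineering is typically done by transfer learning: reusing a network pretrained on natural images and retraining only its final layer on a small labeled micrograph set. A minimal sketch, assuming PyTorch/torchvision and a hypothetical four-class labeling task (the batch here is random data, standing in for real micrographs):

```python
import torch
import torchvision.models as models
from torch import nn

# Load a CNN pretrained on natural images; freeze its learned visual
# features and retrain only the final classification layer.
backbone = models.resnet18(weights="IMAGENET1K_V1")
for p in backbone.parameters():
    p.requires_grad = False
backbone.fc = nn.Linear(backbone.fc.in_features, 4)  # e.g., 4 classes

opt = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stand-in batch: 8 RGB "micrographs" at 224x224 with integer labels
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 4, (8,))

logits = backbone(images)
loss = loss_fn(logits, labels)
opt.zero_grad(); loss.backward(); opt.step()
```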

Fig. 3

A t-SNE map (see L. van der Maaten and G. Hinton, “Visualizing data using t-SNE,” J. Mach. Learn. Res., 9 (2008), p. 2579) of 900 ultrahigh carbon steel microstructures in the database by Hecht [99], showing a reduced-dimensionality representation of the multi-scale CNN representation of these microstructures [94]. Images are grouped by visual similarity. The inset at the bottom right shows the annealing conditions for each image: annealing temperature is indicated by the color map and annealing time is indicated by the relative marker size. The map is computed in an unsupervised fashion from the structural information obtained from the CNN; microstructures having similar structural features tend to have similar processing conditions. This is especially evident tracing the high-temperature micrographs from the bottom of the figure to the top right: as the annealing time increases, the pearlite particles also tend to coarsen. Note: the Widmanstätten structures at the left, despite resulting from similar annealing conditions, were formed during a slow in-furnace cooling process, as opposed to the quench cooling used for most of the other samples
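
Maps such as Fig. 3 combine two steps: extracting a high-dimensional CNN feature vector for each micrograph, then embedding those vectors in 2D with t-SNE. A minimal sketch of the embedding step, assuming scikit-learn and using random vectors as stand-ins for the CNN features:

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for CNN feature vectors of 900 micrographs (e.g., 512-dim
# activations); real features would come from a forward pass of a
# trained network over the image database.
rng = np.random.default_rng(3)
features = rng.normal(size=(900, 512))

# t-SNE embeds the high-dimensional features into 2D so that images
# with similar structural features land near one another on the map.
embedding = TSNE(n_components=2, perplexity=30, init="pca",
                 random_state=0).fit_transform(features)
print(embedding.shape)  # (900, 2)
```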

These early materials image classifiers are also showing promise for improved monitoring of manufacturing processes, such as powder feed material selection for additive manufacturing processes [100, 101]. Over the next 20 years, autonomous image classification will be common, with the classifiers themselves being trained in an unsupervised fashion, choosing the image classes without human intervention, thereby opening entirely new dimensions to the MGI/ICME paradigms [34]. This means that materials and process engineers are likely to have machine companions monitoring all visual- and image-based aspects of their discipline, in order to provide guidance to their decision-making, if not making the decisions autonomously. In the coming decades, machine-based methods may have aggregated sufficient knowledge to autonomously inform engineers without having any prior knowledge of the image data collection context. They will likely operate autonomously to identify outliers in production systems or other data. Thus, one should expect radical changes to materials engineering practices, especially those based upon image data.

Further, current work by DeCost, Holm, and others is beginning to address the challenge of materials image segmentation. While the use of DL and CNN methods has recently made great strides in segmenting and classifying pathologies in biological and medical imaging [35, 102, 103], the methods are completely new in their application to materials and structures analysis. Figure 4 shows an example of metal alloy microstructure image segmentation using a CNN tool. Note how the CNN learns features with increasing depth (layers) of the network, going from left to right in the image. Given that image segmentation and quantification (materials analytics) is among the major obstacles to bringing 3D (and 4D) materials science tools into materials engineering, these ML methods represent nascent capabilities that will result in dramatic advances in 5 years and beyond. Further, current computer science and methods research is focusing on understanding the transference capabilities of CNN/DL tools [104]. Transference refers to understanding and building network architectures that are trained for one type of image class or data set, and then using the same trained network to classify completely different types of images/features on separate data, without re-training.

Fig. 4

Example CNN-based segmentation of a metal alloy microstructure (PixelNet architecture, A. Bansal et al., “PixelNet: Representation of the Pixels, by the Pixels, and for the Pixels,” CoRR (2017))

Beyond imaging, ML methods are being used for developing constitutive models [87, 88, 91], modeling hysteretic response [89], improving reduced order models [165, 166], and even for optimizing numerical methods [167]. At still coarser scales, there is research to understand and model complex dynamical systems and to use ML methods for dimensionality reduction.

Summary and Conclusions

The fields of machine learning, deep learning, and artificial intelligence are rapidly expanding and are likely to continue to do so for the foreseeable future. There are many driving forces for this, as briefly captured in this overview. In some cases, the progress has been dramatic, opening new approaches to long-standing technology challenges, such as advances in computer vision and image analysis. Those capabilities alone are opening new pathways and applications in the ICME/MGI domain. In other instances, the tools have so far provided only evolutionary progress, such as in most aspects of computational mechanics and mechanical behavior of materials. Generally speaking, the fields of materials and processes science and engineering, as well as structural mechanics and design, are lagging other technical disciplines in embracing ML/DL/AI tools and exploring how they may benefit from them. Nor are these fields using their formidable foundations in physics and deep understanding of their data to contribute to the ML/DL/AI fields. Nonetheless, technology leaders and those associated with MPSE should expect unforeseeable and revolutionary impacts across nearly the entire domain of materials and structures, processes, and multiscale modeling and simulation over the next two decades. In this respect, the future is now, and it is appropriate to make immediate investments in bringing these tools into the MPSE fields and their educational processes.