Data-driven physics-constrained recurrent neural networks for multiscale damage modeling of metallic alloys with process-induced porosity

Deng, Shiguang; Hosseinmardi, Shirin; Wang, Libo; Apelian, Diran; Bostanabad, Ramin

doi:10.1007/s00466-023-02429-1

Original Paper
Published: 11 January 2024

Volume 74, pages 191–221, (2024)
Computational Mechanics

Shiguang Deng¹,²,
Shirin Hosseinmardi³,
Libo Wang¹,
Diran Apelian¹ &
Ramin Bostanabad³

Abstract

Computational modeling of heterogeneous materials is increasingly relying on multiscale simulations which typically leverage the homogenization theory for scale coupling. Such simulations are prohibitively expensive and memory-intensive especially when modeling damage and fracture in large 3D components such as cast metallic alloys. To address these challenges, we develop a physics-constrained deep learning model that surrogates the microscale simulations. We build this model within a mechanistic data-driven framework such that it accurately predicts the effective microstructural responses under irreversible elasto-plastic hardening and softening deformations. To achieve high accuracy while reducing the reliance on labeled data, we design the architecture of our deep learning model based on damage mechanics and introduce a new loss component that increases the thermodynamical consistency of the model. We use mechanistic reduced-order models to generate the training data of the deep learning model and demonstrate that, in addition to achieving high accuracy on unseen deformation paths that include severe softening, our model can be embedded in 3D multiscale simulations with fracture. With this embedding, we also demonstrate that state-of-the-art techniques such as teacher forcing result in deep learning models that cause divergence in multiscale simulations. Our numerical experiments indicate that our model is more accurate than pure data-driven models and is much more efficient than mechanistic reduced-order models.

Notes

The accuracy of the surrogate in predicting the microstructural response given deformation paths that are not seen in training.
By Transferable we mean an RNN that can be used as the constitutive law at all macro IPs in a wide range of multi-scale simulations.
In 3D, a load path consists of 6 strain sequences where each sequence is of length $n_{load}$.
Interestingly, state variables are also used in classical constitutive laws such as the over-stress tensor that is used when modeling multi-axial large-strain kinematic hardening.
$\tilde{{\mathbb {C}}}_{n+1}^{alg}$ is equal to the elastic modulus in unloading.
$\tilde{{\mathbb {C}}}_{n+1}^{alg}=(1-{\tilde{D}}_{n+1}){\mathbb {C}}^{el}$ in unloading.

References

Feyel Frédéric, Chaboche Jean-Louis (2000) FE2 multiscale approach for modelling the elastoviscoplastic behaviour of long fibre SiC/Ti composite materials. Comput Methods Appl Mech Eng 183.3–4:309–330
Pascale Kanouté DP, Chaboche Boso Jean-Louis, Schrefler BA (2009) Multiscale methods for composites: a review. Archiv Comput Methods Eng 16(1):31–75
Jian-Ying Wu, Nguyen Vinh Phu, Nguyen Chi Thanh, Sutula Danas, Sinaie Sina, Bordas Stéphane PA (2020) Phase-field modeling of fracture. Adv Appl Mech 53:1–183
Griffith Alan Arnold (1921) VI. The phenomena of rupture and flow in solids. Philos Trans Royal Soc London Ser A Contain Papers Math Phys Character 221:163–198
Dugdale Donald S (1960) Yielding of steel sheets containing slits. J Mech Phys Solids 8(2):100–104
Bouchard Pierre-Olivier, Bay François, Chastel Yvan (2003) Numerical modelling of crack propagation: automatic remeshing and comparison of different criteria. Comput Methods Appl Mech Eng 192.35–36:3887–3908
Vinh Phu Nguyen and Hung Nguyen-Xuan (2013) High-order B-splines based finite elements for delamination analysis of laminated composites. Compos Struct 102:261–275
Jian-Ying Wu (2011) Unified analysis of enriched finite elements for modeling cohesive cracks. Comput Methods Appl Mech Eng 200.45–46:3031–3050
Moës Nicolas, Dolbow John, Belytschko Ted (1999) A finite element method for crack growth without remeshing. Int J Numer Methods Eng 46.1:131–150
Moës Nicolas, Gravouil Anthony, Belytschko Ted (2002) Non-planar 3D crack growth by the extended finite element and level sets–Part I: mechanical model. Int J Numer Methods Eng 53.11:2549–2568
Rashid Yrn R (1968) Ultimate strength analysis of prestressed concrete pressure vessels. Nuclear Eng Design 7.4:334–344
Cervera Miguel, Jian-Ying Wu (2015) On the conformity of strong, regularized, embedded and smeared discontinuity approaches for the modeling of localized failure in solids. Int J Solids Struct 71:19–38
Krajcinovic Dusan (1989) Damage mechanics. Mech Mater 8.2–3:117–197
Jirásek Milan (2007) Mathematical analysis of strain localization. Revue européenne de génie civil 11.7–8:977–991
Simo Juan C, Ju JW (1987) Strain-and stress-based continuum damage models-I. Formulation. Int J Solids Struct 23(7):821–840
De Borst R, Sluys LJ (1991) Localisation in a Cosserat continuum under static and dynamic loading conditions. Comput Methods Appl Mech Eng 90.1–3:805–827
Bazant Zdenek P, Belytschko Ted B, Chang Ta-Peng et al (1984) Continuum theory for strain-softening. J Eng Mech 110.12:1666–1692
Bazant Zdenek P, Jirásek Milan (2002) Nonlocal integral formulations of plasticity and damage: survey of progress. J Eng Mech 128.11:1119–1149
Poh Leong Hien, Sun Gang (2017) Localizing gradient damage model with decreasing interactions. Int J Numer Methods Eng 110.6:503–522
Bram Vandoren, Simone A (2018) Modeling and simulation of quasi-brittle failure with continuous anisotropic stress-based gradient-enhanced damage models. Comput Methods Appl Mech Eng 332:644–685
Dvorak George J (1992) Transformation field analysis of inelastic composite materials. Proc Royal Soc London Ser A Math Phys Sci 437(1900):311–327
Roussette Sophie, Michel Jean-Claude, Suquet Pierre (2009) Nonuniform transformation field analysis of elastic-viscoplastic composites. Compos Sci Technol 69.1:22–27
Liu Zeliang, Bessa MA, Liu Wing Kam (2016) Self-consistent clustering analysis: an efficient multi-scale scheme for inelastic heterogeneous materials. Comput Methods Appl Mech Eng 306:319–341
Tang Shaoqiang, Zhang Lei, Liu Wing Kam (2018) From virtual clustering analysis to self-consistent clustering analysis: a mathematical study. Comput Mech 62.6:1443–1460
Deng Shiguang, Soderhjelm Carl, Apelian Diran, Bostanabad Ramin (2022) Reduced-order multiscale modeling of plastic deformations in 3D alloys with spatially varying porosity by deflated clustering analysis. Computat Mech 70.3:517–548
Shiguang Deng, Diran Apelian, Ramin Bostanabad (2023) Adaptive spatiotemporal dimension reduction in concurrent multiscale damage analysis. Computat Mech 72:1–33
Planas R, Oune N, Bostanabad R (2021) Evolutionary Gaussian processes. J Mech Design 143(11):111703. https://doi.org/10.1115/1.4050746
Oune N, Bostanabad R (2021) Latent map Gaussian processes for mixed variable metamodeling. Comput Methods Appl Mech Eng 387:114128. https://doi.org/10.1016/j.cma.2021.114128
Chen W, Iyer A, Bostanabad R (2022) Data centric design: a new approach to design of microstructural material systems. Engineering 10:89–98. https://doi.org/10.1016/j.eng.2021.05.022
Zanjani Foumani Zahra, Mehdi Shishehbor, Amin Yousefpour, Ramin Bostanabad (2023) Multi-fidelity cost-aware Bayesian optimization. Comput Methods Appl Mech Eng 407:115937. https://doi.org/10.1016/j.cma.2023.115937
Loujaine Mehrez, Jacob Fish, Venkat Aitharaju, Rodgers Will R, Roger Ghanem (2017) A PCE-based multiscale framework for the characterization of uncertainties in complex systems. Comput Mech 61(1–2):219–236. https://doi.org/10.1007/s00466-017-1502-4
Carlos Mora, Tammer Eweis-Labolle Jonathan, Tyler Johnson, Likith Gadde, Ramin Bostanabad (2023) Probabilistic neural data fusion for learning from an arbitrary number of multi-fidelity data sets. Comput Methods Appl Mech Eng 415:116207. https://doi.org/10.1016/j.cma.2023.116207
Jones RE, Templeton JA, Sanders CM, Ostien JT (2018) Machine learning models of plastic flow based on representation theory. Comput Model Eng Sci 117:309–342. https://doi.org/10.31614/cmes.2018.04285
Furukawa Tomonari, Yagawa Genki (1998) Implicit constitutive modelling for viscoplasticity using neural networks. Int J Numer Methods Eng 43.2:195–219
Furukawa Tomonari, Hoffman Mark (2004) Accurate cyclic plastic analysis using a neural network material model. Eng Anal Bound Elem 28.3:195–204
Fernández Mauricio, Rezaei Shahed, Mianroodi Jaber Rezaei, Fritzen Felix, Reese Stefanie (2020) Application of artificial neural networks for the prediction of interface mechanics: a study on grain boundary constitutive behavior. Adv Model Simul Eng Sci 7.1:1–27
Xiaoxin Lu, Yvonnet Julien, Detrez Fabrice, Bai Jinbo (2017) Multiscale modeling of nonlinear electric conductivity in graphene-reinforced nanocomposites taking into account tunnelling effect. J Comput Phys 337:116–131
Mianroodi Jaber Rezaei, Siboni Nima H, Raabe Dierk (2021) Teaching solid mechanics to artificial intelligence–A fast solver for heterogeneous materials. NPJ Comput Mater 7.1:1–10
Haghighat Ehsan, Raissi Maziar, Moure Adrian, Gomez Hector, Juanes Ruben (2021) A physics-informed deep learning framework for inversion and surrogate modeling in solid mechanics. Comput Methods Appl Mech Eng 379:113741
Peivaste Iman, Siboni Nima H, Alahyarizadeh Ghasem, Ghaderi Reza, Svendsen Bob, Raabe Dierk, Mianroodi Jaber Rezaei (2022) Machine-learning-based surrogate modeling of microstructure evolution using phase-field. Comput Mater Sci 214:111750
Mozaffar M, Bostanabad R, Chen W, Ehmann K, Cao Jian, Bessa MA (2019) Deep learning predicts path-dependent plasticity. Proc Natl Acad Sci 116.52:26414–26420
Wang Kun, Sun WaiChing (2018) A multiscale multi-permeability poroplasticity model linked by recursive homogenizations and deep learning. Comput Methods Appl Mech Eng 334:337–380
Ling Wu, Kilingar Nanda Gopala, Noels Ludovic et al (2020) A recurrent neural network-accelerated multi-scale model for elasto-plastic heterogeneous materials subjected to random cyclic and non-proportional loading paths. Comput Methods Appl Mech Eng 369:113234
Ghavamian F, Simone A (2019) Accelerating multiscale finite element simulations of history-dependent materials using a recurrent neural network. Comput Methods Appl Mech Eng 357:112594
Logarzo Hernan J, Capuano German, Rimoli Julian J (2021) Smart constitutive laws: inelastic homogenization through machine learning. Comput Methods Appl Mech Eng 373:113482
Otero Fermin, Oller Sergio, Martinez Xavier (2018) Multiscale computational homogenization: review and proposal of a new enhanced-first-order method. Archiv Comput Methods Eng 25(2):479–505
Tang Shaoqiang, Yang Yang (2021) Why neural networks apply to scientific computing? Theor Appl Mech Lett 11(3):100242
Hornik Kurt, Stinchcombe Maxwell, White Halbert (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366
Lipton Zachary C, Berkowitz John, Elkan Charles (2015) A critical review of recurrent neural networks for sequence learning. In: arXiv preprint arXiv:1506.00019
Hanin Boris (2018) Which neural net architectures give rise to exploding and vanishing gradients?. In: Advances in neural information processing systems vol 31
Staudemeyer Ralf C, Morris Eric Rothstein (2019) Understanding LSTM–A tutorial into long short-term memory recurrent neural networks. In: arXiv preprint arXiv:1909.09586
Karpathy Andrej, Johnson Justin, Fei-Fei Li (2015) Visualizing and understanding recurrent networks. In: arXiv preprint arXiv:1506.02078
Silhavy Miroslav (2013) The mechanics and thermodynamics of continuous media. Springer, Berlin
Yang Han, Sinha Sumeet Kumar, Feng Yuan, McCallen David B, Jeremić Boris (2018) Energy dissipation analysis of elastic-plastic materials. Comput Methods Appl Mech Eng 331:309–326
Feigenbaum Heidi P, Dafalias Yannis F (2007) Directional distortional hardening in metal plasticity within thermodynamics. Int J Solids Struct 44.22–23:7526–7542
Xiang Zixue, Peng Wei, Liu Xu, Yao Wen (2022) Self-adaptive loss balanced physics-informed neural networks. Neurocomputing 496:11–34
Márquez-Neila Pablo, Salzmann Mathieu, Fua Pascal (2017) Imposing hard constraints on deep networks: Promises and limitations. In: arXiv preprint arXiv:1706.02025
Goodfellow Ian, Bengio Yoshua, Courville Aaron (2016) Deep learning. MIT press, Cambridge
Deng Shiguang, Mora Carlos, Apelian Diran, Bostanabad Ramin (2022) Data-driven calibration of Multifidelity multiscale fracture models via latent map Gaussian Process. J Mech Design 145(1):011705
Bazant Zdenek P (2010) Can multiscale-multiphysics methods predict softening damage and structural failure? Int J Multiscale Comput Eng 8(1):61–67
Bengio Samy, Vinyals Oriol, Jaitly Navdeep, Shazeer Noam (2015) Scheduled sampling for sequence prediction with recurrent neural networks. In: Advances in neural information processing systems vol 28
Li Hengyang, Kafka Orion L, Gao Jiaying, Cheng Yu, Nie Yinghao, Zhang Lei, Tajdari Mahsa, Shan Tang Xu, Guo Gang Li et al (2019) Clustering discretization methods for generation of material performance databases in machine learning and design optimization. Comput Mech 64:281–305
Liu Daoping, Hang Yang KI, Elkhodary Shan Tang, Liu Wing Kam, Guo Xu (2022) Mechanistically informed data-driven modeling of cyclic plasticity via artificial neural networks. Comput Methods Appl Mech Eng 393:114766
Bostanabad Ramin, Liang Biao, Gao Jiaying, Liu Wing Kam, Cao Jian, Zeng Danielle, Xuming Su, Hongyi Xu, Li Yang, Chen Wei (2018) Uncertainty quantification in multiscale simulation of woven fiber composites. Comput Methods Appl Mech Eng 338:506–532
Osanov Mikhail, Guest James K (2016) Topology optimization for architected materials design. Annu Rev Mater Res 46:211–233
Zheng-Dong Ma, Noboru Kikuchi, Christophe Pierre, Basavaraju R (2006) Multidomain topology optimization for structural and material designs. J. Appl. Mech. 73(4):565–573
Deng Shiguang, Suresh Krishnan (2016) Multi-constrained 3D topology optimization via augmented topological level-set. Comput Struct 170:1–12
Deng Shiguang, Suresh Krishnan (2015) Multi-constrained topology optimization via the topological sensitivity. Struct Multidiscip Optim 51(5):987–1001
Oliver Javier (1989) A consistent characteristic length for smeared cracking models. Int J Numer Methods Eng 28(2):461–474
Oliver Javier, Huespe Alfredo Edmundo, Pulido MDG, Chaves E (2002) From continuum mechanics to fracture mechanics: the strong discontinuity approach. Eng Fract Mech 69.2:113–136
Liu Zeliang, Fleming Mark, Liu Wing Kam (2018) Microstructural material database for self-consistent clustering analysis of elastoplastic strain softening materials. Comput Methods Appl Mech Eng 330:547–577
Smith Michael (2009) ABAQUS standard user’s manual. In: Dassault Systèmes Simulia Corp, Version 6.9
Oliver Javier, Huespe Alfredo Edmundo, Cante JC (2008) An implicit/explicit integration scheme to increase computability of non-linear material and contact/friction problems. Comput Methods Appl Mech Eng 197.21–24:1865–1889
Liu Gui-Rong (2009) Meshfree methods: moving beyond the finite element method. CRC Press, Boca Raton
Jönsthövel TB, Van Gijzen MB, Vuik C, Kasbergen C, Scarpas A (2009) Preconditioned conjugate gradient method enhanced by deflation of rigid body modes applied to composite materials. Comput Model Eng Sci (CMES) 47.2:97
Saha Sourav, Kafka Orion L, Ye Lu, Cheng Yu, Liu Wing Kam (2021) Macroscale property prediction for additively manufactured in625 from microstructure through advanced homogenization. Integr Mater Manuf Innov 10:360–372
Kafka Orion L, Cheng Yu, Cheng Puikei, Wolff Sarah J, Bennett Jennifer L, Garboczi Edward J, Cao Jian, Xiao Xianghui, Liu Wing Kam (2022) X-ray computed tomography analysis of pore deformation in IN718 made with directed energy deposition via in-situ tensile testing. Int J Solids Struct 256:111943
Yang Yang, Zhang Lei, Tang Shaoqiang (2022) A comparative study of cluster-based methods at finite strain. Acta Mechanica Sinica 38(4):421153
Nie Yinghao, Li Zheng, Cheng Gengdong (2021) Efficient prediction of the effective nonlinear properties of porous material by FEM-Cluster based Analysis (FCA). Comput Methods Appl Mech Eng 383:113921
Dispinar D, Akhtar Shahid, Nordmark Arne, Di Sabatino Marisa, Arnberg LJMS (2010) Degassing, hydrogen and porosity phenomena in A356. Mater Sci Eng A 527.16–17:3719–3725

Acknowledgements

The authors appreciate the support from the ACRC consortium, the feedback from the anonymous reviewers, and the helpful discussions with Dr. Ling Wu and Dr. Ludovic Noels. Ramin Bostanabad also acknowledges support from NSF (award number 2211908) and the Office of Naval Research (award number N00014-23-1-2485).

Author information

Authors and Affiliations

Materials Science and Engineering, University of California, Irvine, USA
Shiguang Deng, Libo Wang & Diran Apelian
Department of Mechanical Engineering, Northwestern University, Evanston, USA
Shiguang Deng
Department of Mechanical and Aerospace Engineering, University of California, Irvine, USA
Shirin Hosseinmardi & Ramin Bostanabad

Corresponding author

Correspondence to Ramin Bostanabad.

Ethics declarations

Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A Multiscale Continuum Damage Model

There are two popular approaches for modeling damage [69, 70]: (1) a discrete approach based on fracture mechanics which models displacement discontinuity at discontinuous interfaces, and (2) a continuous approach based on continuum mechanics which models damage as strain softening by inelastic strains. In our work, we adopt the latter approach to generate a database of microstructural responses. The continuous approach, however, often suffers from non-objectivity where localized damage bands become narrower and narrower upon mesh refinement, i.e., they are mesh dependent (this dependency is caused by ill-posed governing equations due to damage-induced non-positive definiteness). One way to address this issue is to modify the continuum constitutive law by introducing crack bandwidth via nonlocal functions.

However, integrating nonlocal functions within a multiscale damage model is quite difficult [18, 60]. On the one hand, a material characteristic length $l_m$ is needed to stabilize the ill-posed governing equations (caused by a non-positive definite stiffness matrix) and thereby address the mesh dependency at the microscale. On the other hand, a macroscale characteristic length $l_M$ is required to provide nonlocality for distant macro elements. Simultaneously imposing $l_m$ and $l_M$ helps to stabilize the multiscale damage model but is not physically realistic. If only $l_m$ is imposed and $l_M$ is neglected, the multiscale model merely transfers the damage-induced localization to the finer scale. This is equivalent to a single-scale model with $l_m$ and defeats the purpose of multiscale modeling, because microscale damage is not properly transmitted to the macroscale [60].

One way to properly impose nonlocal functions in multiscale damage models is to introduce a material characteristic length in each RVE that accounts for the influence of damage on neighboring RVEs (and associated macro elements) [71]. We adopt another approach to properly impose both the micro and macro damage regularizations [26]: our micro damage model is regularized by a predefined material fracture energy and its resulting effective damage parameter is subsequently regularized by a macro damage model via a nonlocal function for neighboring RVEs. We provide more details on our approach below.

We adopt a continuum damage model to simulate strain softening in ductile metals whose load-carrying capacity drops due to the degradation of yield stress and stiffness. To simulate the onset of softening, we choose a ductile damage initiation criterion which models the effective strain at damage initiation, i.e., ${\bar{E}}_{d}^{pl}$, as a function of stress and strain states. We presume ${\bar{E}}_{d}^{pl}$ is a constant and that damage begins when the equivalent plastic strain is equal to or greater than it, i.e., ${\bar{E}}^{pl} \geqslant {\bar{E}}_{d}^{pl}$.

A major challenge of continuum damage models is the softening-induced non-positive definite stiffness matrix that results in slow solution convergence and negative wave speeds [60]. Specifically, the ill-posed problem causes equilibrium equations to lose objectivity with respect to mesh sizes by exhibiting spurious mesh sensitivity. In microstructural simulations, to address the lack of objectivity to mesh choices, we convert the stress–strain relation in the constitutive equation to a stress-displacement relation to formulate the micro-damage evolution after damage initiation as:

$$\begin{aligned} G_f = \int _{{\bar{E}}_0^{pl}}^{{\bar{E}}_f^{pl}} l_e S_y \,d{{\bar{E}}^{pl}} = \int _{0}^{{\bar{u}}_f^{pl}} S_y \,d{{\bar{u}}^{pl}} \end{aligned}$$

(A-1)

where $l_e$ indicates the element’s characteristic length in an arbitrary RVE, and $G_f$ represents the dissipated energy that opens a unit area of crack after damage initiation. The equivalent plastic displacement ${\bar{u}}^{pl}$ is the fracture work conjugate to the yield stress $S_y$ from the onset of damage (with the effective plastic strain ${\bar{E}}_0^{pl}$ and zero plastic displacement ${\bar{u}}^{pl}$) until the final failure (with the effective fracture strain ${\bar{E}}_f^{pl}$ and the fracture displacement ${\bar{u}}_{f}^{pl}$). Using Equation (A-1) we define the microscale damage evolution based on an exponential form of the released energy [72] as:

$$\begin{aligned} D_m = 1 - \exp \left( -\frac{1}{G_f} \int _{0}^{{\bar{u}}_f^{pl}} S_y \,d{{\bar{u}}^{pl}}\right) \end{aligned}$$

(A-2)

where $D_m$ represents the microscale damage parameter that monotonically increases in the range of [0.0, 1.0]. We note that in our context of isotropic continuum damage, $D_m$ is a scalar and it becomes a tensor in anisotropic damage models. In addition, we note that $D_m$ approaches 1.0 asymptotically with infinitely large ${\bar{u}}^{pl}$ in Equation (A-2). In practice, we set $D_m$ as 1.0 when the dissipated energy exceeds $0.99G_f$. In our continuum damage model, we formulate a softening response of a micro point with an elasto-plastic behavior as:

$$\begin{aligned} {\varvec{S}}_m = (1-D_m){\varvec{S}}^0_m; \quad {\varvec{S}}^0_m = {\mathbb {C}}^{el}:{\varvec{E}}^{el}_m = {\mathbb {C}}^{el}:({\varvec{E}}_m-{\varvec{E}}^{pl}_m) \end{aligned}$$

(A-3)

where ${\varvec{S}}_m$ and ${\varvec{S}}^0_m$ are, respectively, the damaged stress and the reference stress that undergoes the same deformation path but in the absence of damage at a micro material point. ${\mathbb {C}}^{el}$ represents the fourth-order elasticity tensor. ${\varvec{E}}_m$, ${\varvec{E}}^{el}_m$ and ${\varvec{E}}^{pl}_m$ are the microscale total strain, elastic strain, and plastic strain, respectively.
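Equations (A-2) and (A-3) can be illustrated with a minimal Python sketch. It assumes that the energy integral is evaluated up to the current plastic displacement and approximated with the trapezoidal rule, and that the reference stress is already available as a 3x3 nested list. The function names and sample values are hypothetical and are not taken from the paper.

```python
import math


def dissipated_energy(yield_stress, plastic_disp):
    """Trapezoidal estimate of the integral of S_y d(u_pl).

    Assumption: the samples start at damage initiation (u_pl = 0) and end
    at the current equivalent plastic displacement.
    """
    energy = 0.0
    for k in range(1, len(plastic_disp)):
        du = plastic_disp[k] - plastic_disp[k - 1]
        energy += 0.5 * (yield_stress[k] + yield_stress[k - 1]) * du
    return energy


def micro_damage(energy, fracture_energy, cutoff=0.99):
    """Exponential damage law of Equation (A-2) with the 0.99 * G_f cutoff."""
    if fracture_energy <= 0.0:
        raise ValueError("fracture_energy must be positive")
    if energy > cutoff * fracture_energy:
        return 1.0  # treated as fully damaged
    return 1.0 - math.exp(-max(energy, 0.0) / fracture_energy)


def damaged_stress(reference_stress, damage):
    """Equation (A-3): scale the undamaged reference stress by (1 - D_m)."""
    return [[(1.0 - damage) * reference_stress[i][j] for j in range(3)]
            for i in range(3)]


# Hypothetical history: yield stress degrading with plastic displacement.
s_y = [2.0, 2.0, 1.0]
u_pl = [0.0, 0.5, 1.0]
d_m = micro_damage(dissipated_energy(s_y, u_pl), fracture_energy=4.0)
s_m = damaged_stress([[100.0, 0.0, 0.0], [0.0, 50.0, 0.0], [0.0, 0.0, 0.0]], d_m)
```

Because $D_m$ is a scalar in the isotropic setting, every stress component is scaled by the same factor; an anisotropic model would need a tensor-valued damage variable instead.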

From the microscale stress, we can compute the effective stress via Equation (2). Additionally, we compute the RVE’s effective damage parameter [71] by:

$$\begin{aligned} D_M = 1 - \frac{\Vert {\varvec{S}}_M: {\varvec{S}}_M^0 \Vert }{\Vert {\varvec{S}}_M^0:{\varvec{S}}_M^0 \Vert } \end{aligned}$$

(A-4)

where the homogenized damage parameter $D_M$ indicates the damage status of the RVE. Its value depends on the values of the effective stress ${\varvec{S}}_M$ and the effective reference stress ${\varvec{S}}_M^0$ without damage; thus, it is clear that the effective damage parameter is not a function of homogenized plastic strains.
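As a minimal sketch of Equation (A-4), the snippet below computes $D_M$ from two 3x3 stress tensors stored as nested lists. The helper names are hypothetical, and returning zero damage for a vanishing reference stress is an assumption made only to avoid a division by zero.

```python
def double_contraction(a, b):
    """Double contraction a : b of two 3x3 tensors."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))


def effective_damage(stress_damaged, stress_reference):
    """Equation (A-4): D_M = 1 - |S_M : S_M^0| / |S_M^0 : S_M^0|."""
    denom = abs(double_contraction(stress_reference, stress_reference))
    if denom == 0.0:
        # Assumption: an unloaded RVE is reported as undamaged.
        return 0.0
    numer = abs(double_contraction(stress_damaged, stress_reference))
    return 1.0 - numer / denom


# Hypothetical effective stresses: the damaged stress is 70% of the reference.
s_ref = [[100.0, 10.0, 0.0], [10.0, 50.0, 0.0], [0.0, 0.0, 0.0]]
s_dam = [[0.7 * s_ref[i][j] for j in range(3)] for i in range(3)]
d_eff = effective_damage(s_dam, s_ref)
```

Only the two effective stresses enter the computation, which matches the statement above that $D_M$ is not a function of the homogenized plastic strains.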

We proceed to constrain the effective damage parameter $D_M$ via an integral-type non-local damage model to mitigate the spurious mesh dependency on the macroscale as:

$$\begin{aligned} {\hat{D}}_M({\varvec{P}}, {\varvec{P}}') = \int _{B} {\omega (\Vert {\varvec{P}}-{\varvec{P}}'\Vert )}D_M({\varvec{P}}') \,d{\varvec{P}}' \end{aligned}$$

(A-5)

where ${\hat{D}}_M({\varvec{P}}, {\varvec{P}}')$ is the non-local damage parameter at the macroscopic point ${\varvec{P}}$ surrounded by points ${\varvec{P}}'$ in the compact neighborhood B. $D_M({\varvec{P}}')$ represents the local damage parameter at ${\varvec{P}}'$ and $\omega $ indicates the non-local weighting function which depends on the distance $\Vert {\varvec{P}}-{\varvec{P}}'\Vert $ between the studied point and its supporting points. In this work, we define $\omega $ by a polynomial bell-shape function as:

$$\begin{aligned} {\omega (\Vert {\varvec{P}}-{\varvec{P}}'\Vert )} = \frac{\omega _\infty (\Vert {\varvec{P}}-{\varvec{P}}'\Vert )}{\int _{B} {\omega _\infty (\Vert {\varvec{P}}-{\varvec{P}}'\Vert )} \,d{\varvec{P}}'} \end{aligned}$$

(A-6a)

$$\begin{aligned} {\omega _\infty (\Vert {\varvec{P}}-{\varvec{P}}'\Vert )} = \left\langle 1 - \frac{4(\Vert {\varvec{P}}-{\varvec{P}}'\Vert )^2}{l_d^2} \right\rangle ^2 \end{aligned}$$

(A-6b)

where $\langle \dots \rangle $ is the Macaulay bracket defined as $\langle x \rangle = \max (0,x)$, $l_d$ denotes the strain localization bandwidth whose value represents the non-local interacting radius, and the support domain B is a sphere with a radius of $l_d/2$ in 3D models.
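A discrete version of Equations (A-5), (A-6a) and (A-6b) can be sketched as follows. The sketch assumes that the integrals over B are replaced by sums over macroscopic integration points weighted by their volumes; this quadrature, the function names, and the sample data are illustrative assumptions rather than the authors' implementation. It requires Python 3.8 or later for `math.dist`.

```python
import math


def bell_weight(distance, l_d):
    """Unnormalized weight of Equation (A-6b): <1 - 4 r^2 / l_d^2>^2."""
    return max(0.0, 1.0 - 4.0 * distance ** 2 / l_d ** 2) ** 2


def nonlocal_damage(index, points, volumes, damage, l_d):
    """Weighted average of local damage around points[index].

    The normalization of Equation (A-6a) is applied by dividing by the sum
    of the weights, so a uniform damage field is left unchanged.
    """
    numerator = 0.0
    denominator = 0.0
    for p, vol, d in zip(points, volumes, damage):
        w = bell_weight(math.dist(points[index], p), l_d) * vol
        numerator += w * d
        denominator += w
    # The point itself has weight 1, so the denominator is never zero.
    return numerator / denominator


# Hypothetical integration points along a line, with unit volumes.
pts = [(0.0, 0.0, 0.0), (0.25, 0.0, 0.0), (1.0, 0.0, 0.0)]
vols = [1.0, 1.0, 1.0]
d_local = [0.2, 0.6, 1.0]
d_hat = nonlocal_damage(0, pts, vols, d_local, l_d=1.0)
```

In this example the third point lies outside the support radius $l_d/2$ of the first point, so its weight is zero and it does not contribute to the averaged value.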

In our multiscale damage model, we compute the regularized macro stress, ${\varvec{S}}$, as:

$$\begin{aligned} {\varvec{S}}=(1-{\hat{D}}_M){\varvec{S}}^0_M \end{aligned}$$

(A-7)

where we assume the effective reference stress ${\varvec{S}}^0_M$ can be closely approximated by the effective damaged stress (${\varvec{S}}_M$) and the effective damage parameter ($D_M$) from microstructural analyses. That is:

$$\begin{aligned} {\varvec{S}}^0_M={\varvec{S}}_M/(1-D_M) \end{aligned}$$

(A-8)

We can directly relate ${\varvec{S}}$ to the RVEs’ effective damaged stress (${\varvec{S}}_M$) via ${\varvec{S}}={\varvec{S}}_M(1-{\hat{D}}_M)/(1-D_M)$. These two stresses are identical if there are no macro nonlocal functions, i.e., ${\hat{D}}_M=D_M$. However, as discussed before, it is important to properly impose damage regularization on each scale to address the mesh dependency issue of continuum damage mechanics [18, 60]. In our model, we regularize micro damage by material fracture energy in Equation (A-2) and macro damage by nonlocal functions in Equation (A-5). Hence, the two stresses (${\varvec{S}}$ and ${\varvec{S}}_M$) in our case are directly related via a coefficient of $(1-{\hat{D}}_M)/(1-D_M)$.
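The scaling between the two stresses can be written as a short Python sketch. The function name and the tolerance are hypothetical, and returning a zero stress for a fully damaged RVE ($D_M = 1$) is an assumption introduced here to avoid dividing by zero.

```python
def regularized_macro_stress(stress_rve, d_local, d_nonlocal, tol=1e-12):
    """S = S_M * (1 - D_hat_M) / (1 - D_M), from Equations (A-7) and (A-8)."""
    if 1.0 - d_local < tol:
        # Assumption: a fully damaged RVE carries no stress.
        return [[0.0] * 3 for _ in range(3)]
    scale = (1.0 - d_nonlocal) / (1.0 - d_local)
    return [[scale * stress_rve[i][j] for j in range(3)] for i in range(3)]


# Hypothetical values: local damage 0.3, nonlocal (averaged) damage 0.2.
s_rve = [[70.0, 0.0, 0.0], [0.0, 35.0, 0.0], [0.0, 0.0, 0.0]]
s_macro = regularized_macro_stress(s_rve, d_local=0.3, d_nonlocal=0.2)
```

When the nonlocal and local damage values coincide, the scale factor is one and the RVE stress is returned unchanged.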

We now show that the relation in Equation (A-8) leads directly to the definition of the effective damage parameter in Equation (A-4):

$$\begin{aligned} {\varvec{S}}_M = (1-D_M) {\varvec{S}}_M^0 \end{aligned}$$

(A-9a)

$$\begin{aligned} {\varvec{S}}_M: {\varvec{S}}_M^0 = (1-D_M) {\varvec{S}}_M^0: {\varvec{S}}_M^0 \end{aligned}$$

(A-9b)

$$\begin{aligned} ({\varvec{S}}_M:{\varvec{S}}_M^0)/({\varvec{S}}_M^0:{\varvec{S}}_M^0) = (1-D_M) ({\varvec{S}}_M^0:{\varvec{S}}_M^0)/({\varvec{S}}_M^0:{\varvec{S}}_M^0) \end{aligned}$$

(A-9c)

$$\begin{aligned} ({\varvec{S}}_M:{\varvec{S}}_M^0)/({\varvec{S}}_M^0:{\varvec{S}}_M^0) = (1-D_M) \end{aligned}$$

(A-9d)

Because $0 \le 1-D_M \le 1$, the ratio in Equation (A-9d) is non-negative and therefore equals the ratio of norms, which recovers the definition of the effective damage parameter in Equation (A-4).

B Hybrid Constitutive Integration

The non-positive definiteness of the stiffness matrix is the primary reason for the slow convergence of classic implicit time integration schemes that are used in continuum damage simulations. For illustration, consider the constitutive equation of an isotropic damage model integrated by an implicit backward-Euler integration scheme. Its algorithmic tangent operator at an arbitrary macroscopic IP can be written as:

$$\begin{aligned} {\mathbb {C}}_{n+1}^{alg} = \frac{\partial {\varvec{S}}_{n+1}}{\partial {\varvec{E}}_{n+1}} = (1-D_{n+1}){\mathbb {C}}^{el} - \frac{S_{n+1} - H_{n}{\bar{E}}_{n+1}^{pl}}{({\bar{E}}_{n+1}^{pl})^3} {\varvec{S}}_{n+1}^{0} \otimes {\varvec{S}}_{n+1}^{0} \end{aligned}$$

(B-1)

where ${\mathbb {C}}_{n+1}^{alg}$, ${\bar{E}}_{n+1}^{pl}$, $S_{n+1}$, ${\varvec{S}}_{n+1}^{0}$ and $H_{n}$ represent the fourth-order algorithmic tangent operator, equivalent plastic strain, equivalent stress, reference stress tensor, and softening modulus, respectively. The subscripts denote time steps and the symbol $\otimes $ represents the tensor (dyadic) product. Softening causes negative values for $H_{n}$, which can render ${\mathbb {C}}_{n+1}^{alg}$ indefinite. A ${\mathbb {C}}_{n+1}^{alg}$ that is not positive definite leads to an ill-conditioned elemental stiffness matrix with near-zero or negative eigenvalues, and further deteriorates the global stiffness matrix in the element assembly process. Such ill-conditioned matrices dramatically reduce the efficiency of iterative solvers (e.g., Newton–Raphson methods) and often cause the analysis to abort before final convergence.
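The loss of positive definiteness can be probed numerically by evaluating the quadratic form ${\varvec{e}}:{\mathbb {C}}_{n+1}^{alg}:{\varvec{e}}$ for a symmetric probe tensor ${\varvec{e}}$: a negative value along any direction shows that the operator in Equation (B-1) is not positive definite. The Python sketch below assumes isotropic elasticity with Lamé constants `lam` and `mu`, so that ${\varvec{e}}:{\mathbb {C}}^{el}:{\varvec{e}} = \lambda \,(\textrm{tr}\,{\varvec{e}})^2 + 2\mu \,{\varvec{e}}:{\varvec{e}}$. All names and numerical values are illustrative and not taken from the paper.

```python
def quadratic_form(probe, ref_stress, damage, coeff, lam, mu):
    """Evaluate e : C_alg : e for the operator of Equation (B-1).

    Assumptions: probe is a symmetric 3x3 tensor, elasticity is isotropic,
    and coeff = (S - H * E_pl) / E_pl**3 is precomputed by the caller.
    """
    trace = probe[0][0] + probe[1][1] + probe[2][2]
    norm_sq = sum(probe[i][j] ** 2 for i in range(3) for j in range(3))
    elastic = lam * trace ** 2 + 2.0 * mu * norm_sq  # e : C_el : e
    projection = sum(ref_stress[i][j] * probe[i][j]
                     for i in range(3) for j in range(3))  # S0 : e
    # e : (S0 (x) S0) : e equals (S0 : e)^2 for the dyadic product.
    return (1.0 - damage) * elastic - coeff * projection ** 2


# Illustrative values (stresses and moduli in GPa).
lam, mu = 60.0, 26.0
s0 = [[0.3, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
probe = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
eq_stress, eq_plastic, h_soft = 0.15, 0.05, -1.0  # negative H: softening
coeff = (eq_stress - h_soft * eq_plastic) / eq_plastic ** 3
q_soft = quadratic_form(probe, s0, 0.5, coeff, lam, mu)
```

With these numbers the quadratic form is negative, so the tangent is indefinite. A positive value along a single probe direction would not by itself prove positive definiteness; that would require checking all eigenvalues.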

To fundamentally resolve the convergence issue, we adopt a hybrid time integration scheme [26, 73] that integrates the elasto-plastic and softening constitutive equations in a combined explicit-implicit manner. The basic idea of the hybrid integration is to maintain the positive definiteness of the system’s algorithmic tangent operator by integrating the constitutive equations in two consecutive stages via explicit and implicit schemes. In the first stage, we explicitly extrapolate the internal material state variables at time step $n+1$ from step n to compute the explicit stress state $\tilde{{\varvec{S}}}_{n+1}$ that balances the equilibrium equation between internal and external forces. In the second stage, we compute the implicit stress state ${\varvec{S}}_{n+1}$ from the current strain state ${\varvec{E}}_{n+1}$ using the classic backward Euler method, which updates the trial stress tensor and yield functions for the next time step. Throughout, the tangent operator between $\tilde{{\varvec{S}}}_{n+1}$ and ${\varvec{E}}_{n+1}$ is kept positive definite.

For the elasto-plastic model, we choose the material state variable as the incremental plastic strain tensor $\triangle \tilde{{\varvec{E}}}_{n+1}^{pl}$ such that $\tilde{{\varvec{S}}}_{n+1}$ can be computed as:

$$\begin{aligned}{} & {} \tilde{{\varvec{S}}}_{n+1}(\triangle \tilde{{\varvec{E}}}_{n+1}^{pl}) = \tilde{{\varvec{S}}}_{n+1}^{trial} - {\mathbb {C}}^{el}:\triangle \tilde{{\varvec{E}}}_{n+1}^{pl} = {\mathbb {C}}^{el}:{\varvec{E}}_{n+1} \nonumber \\{} & {} \quad - {\mathbb {C}}^{el}:{\varvec{E}}_{n}^{pl} - {\mathbb {C}}^{el}:\triangle \tilde{{\varvec{E}}}_{n+1}^{pl} \nonumber \\{} & {} \quad \triangle \tilde{{\varvec{E}}}_{n+1}^{pl} = \frac{\triangle t_{n+1}}{\triangle t_n} \triangle {\varvec{E}}_n^{pl} \end{aligned}$$

(B-2)

where $\triangle {\varvec{E}}_n^{pl}$ represents the implicit incremental plastic strain tensor at time step n, and $\triangle t_n$ and $\triangle t_{n+1}$ indicate the lengths of two consecutive time steps. The algorithmic tangent operator (under loading^{Footnote 5}) is therefore computed as:

$$\begin{aligned}{} & {} \tilde{{\mathbb {C}}}_{n+1}^{alg} = \frac{\partial {\tilde{{\varvec{S}}}_{n+1}(\triangle \tilde{{\varvec{E}}}_{n+1}^{pl})}}{\partial {{\varvec{E}}_{n+1}}} \nonumber \\{} & {} \quad = \frac{\partial ({\mathbb {C}}^{el}:{\varvec{E}}_{n+1} - {\mathbb {C}}^{el}:{\varvec{E}}_{n}^{pl} - {\mathbb {C}}^{el}:\triangle \tilde{{\varvec{E}}}_{n+1}^{pl})}{\partial {{\varvec{E}}_{n+1}}} = {\mathbb {C}}^{el} \nonumber \\ \end{aligned}$$

(B-3)

In a similar manner, for isotropic continuum damage models, we choose the explicitly extrapolated material state variable in the hybrid integration as the incremental plastic multiplier $\triangle {\tilde{\lambda }}_{n+1}$, i.e., $\triangle {\tilde{\lambda }}_{n+1} = (\triangle t_{n+1} / \triangle t_n) \triangle \lambda _n$. We can then write its explicit damaged stress and algorithmic tangent operator (under loading^{Footnote 6}) as:

$$\begin{aligned}{} & {} \tilde{{\varvec{S}}}_{n+1} = (1-{\tilde{D}}_{n+1}) {\varvec{S}}_{n+1}^0 = (1-{\tilde{D}}_{n+1}) {\mathbb {C}}^{el}:{\varvec{E}}_{n+1};\nonumber \\{} & {} {\tilde{D}}_{n+1} = {\tilde{D}}_{n+1} (D_n, \triangle {\tilde{\lambda }}_{n+1}) \end{aligned}$$

(B-4)

$$\begin{aligned}{} & {} \tilde{{\mathbb {C}}}_{n+1}^{alg} = \frac{\partial {\tilde{{\varvec{S}}}_{n+1}}}{\partial {{\varvec{E}}_{n+1}}} = (1-{\tilde{D}}_{n+1}) {\mathbb {C}}^{el} \end{aligned}$$

(B-5)

where ${\varvec{S}}_{n+1}^0$ is the effective stress tensor, and ${\tilde{D}}_{n+1}$ represents the explicit state of the damage variable which is a function of its previous implicit state $D_n$ and the current explicit incremental plastic multiplier $\triangle {\tilde{\lambda }}_{n+1}$.

In the hybrid integration scheme, the loading tangent operators of the elasto-plastic model in Equation (B-3) and the damage model in Equation (B-5) are simply equal to the elastic modulus ${\mathbb {C}}^{el}$ and $(1-{\tilde{D}}_{n+1}) {\mathbb {C}}^{el}$, respectively. Hence, the hybrid integration scheme preserves the positive definiteness of the governing equations and also allows the global stiffness matrix to be assembled only once before the online simulations. The global stiffness matrix remains constant in the elasto-plastic regime and only needs partial updates of the matrix entries associated with the softening IPs via Equation (B-5). As softening is often highly localized in small regions, the global stiffness matrix can be incrementally updated during the entire elasto-plastic-hardening-softening process [26], which significantly reduces the memory footprint while providing robust convergence.
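The contrast between the implicit tangent of Equation (B-1) and the hybrid tangent of Equation (B-5) can be checked numerically. The following is a minimal sketch under stated assumptions, not the implementation used in this work: the stiffness is written in Voigt notation, the material constants (70 GPa, 0.33), the damage value, and the strain state are hypothetical, and the scalar prefactor of ${\varvec{S}}^0 \otimes {\varvec{S}}^0$ in Equation (B-1) is lumped into a single illustrative number `coeff`.

```python
import numpy as np

def isotropic_stiffness(young, poisson):
    """6x6 isotropic elastic stiffness in Voigt notation (engineering shear strains)."""
    lam = young * poisson / ((1.0 + poisson) * (1.0 - 2.0 * poisson))
    mu = young / (2.0 * (1.0 + poisson))
    C = np.zeros((6, 6))
    C[:3, :3] = lam
    C[np.arange(3), np.arange(3)] += 2.0 * mu
    C[np.arange(3, 6), np.arange(3, 6)] = mu
    return C

def extrapolate_increment(prev_increment, dt_prev, dt_next):
    """Explicit extrapolation of a state-variable increment, as in (B-2)."""
    return (dt_next / dt_prev) * prev_increment

def implicit_tangent(C_el, damage, s0, coeff):
    """Structure of (B-1): (1 - D) C_el - coeff * (S0 dyadic S0).
    `coeff` is a hypothetical stand-in for the scalar prefactor in (B-1)."""
    return (1.0 - damage) * C_el - coeff * np.outer(s0, s0)

def hybrid_tangent(C_el, damage_tilde):
    """Tangent of (B-5): a positive multiple of C_el whenever damage_tilde < 1."""
    return (1.0 - damage_tilde) * C_el

# Hypothetical values for illustration only (stresses in MPa).
C_el = isotropic_stiffness(young=70e3, poisson=0.33)
strain = np.array([0.01, 0.0, 0.0, 0.0, 0.0, 0.0])
s0 = C_el @ strain  # undamaged stress in Voigt form

min_eig_implicit = np.linalg.eigvalsh(implicit_tangent(C_el, 0.3, s0, coeff=1.0)).min()
min_eig_hybrid = np.linalg.eigvalsh(hybrid_tangent(C_el, 0.3)).min()
print("smallest eigenvalue, implicit form:", min_eig_implicit)
print("smallest eigenvalue, hybrid form:  ", min_eig_hybrid)
```

For these illustrative numbers the rank-one term dominates and the implicit form has a negative eigenvalue, whereas the hybrid form only rescales the positive definite ${\mathbb {C}}^{el}$ and therefore keeps all eigenvalues positive.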

C Deflated Clustering Analysis

Simulation of microstructural softening via the classic FE² method involves demanding computational costs, which is prohibitive for generating large training datasets for machine learning models. To accelerate the database generation, we adopt our previously developed mechanistic ROM, i.e., deflated clustering analysis (DCA) [25, 26]. Its high efficiency comes from two facts: (1) the number of unknown variables in the system is dramatically reduced from a large number of finite elements to a few clusters by agglomerating elements via clustering as shown in Fig. 23, and (2) the algebraic equations of the reduced system contain far fewer close-to-zero eigenvalues, which results in better convergence compared to the classic FE system.

Our DCA utilizes k-means clustering, i.e., an unsupervised machine learning technique for data interpretation and grouping, to agglomerate neighboring elements into a set of interactive irregular-shape clusters. The clustering begins with feeding the coordinates of element centroids into a feature space where randomly scattered cluster seeds serve as initial cluster means. Clusters accept or reject elements by iteratively minimizing the within-cluster variance until all elements are assigned to a cluster. The clustering procedure can be mathematically stated as a minimization problem:

$$\begin{aligned} {\varvec{C}} = \min \limits _{{\varvec{C}}}\sum \limits _{I = 1}^k \sum \limits _{n \in C^I} \Vert \pmb {\varphi }_n - \bar{\pmb {\varphi }}_I \Vert ^2 \end{aligned}$$

(C-1)

where ${\varvec{C}}$ represents the k clusters with ${\varvec{C}} = \{C^1, C^2, \dots , C^k\}$. $\pmb {\varphi }_n$ and $\bar{\pmb {\varphi }}_I$ indicate the coordinates of the centroid of the $n^{th}$ element and the mean of the coordinates of the $I^{th}$ cluster, respectively. A clustering example is illustrated in Fig. 23 where the discrete domain of a 2D generic RVE with 5,000 elements is decomposed into 100 clusters.
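The minimization in Equation (C-1) is commonly solved with Lloyd's algorithm, which alternates between assigning elements to the nearest cluster mean and recomputing the means. The sketch below is an illustrative NumPy version, not the DCA package itself: the function names, the regular grid standing in for element centroids, the seeding by randomly chosen elements, and the iteration cap are all assumptions made for this example.

```python
import numpy as np

def assign(points, means):
    """Assign each point to its nearest mean; return labels and the objective of (C-1)."""
    d2 = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    return labels, d2[np.arange(len(points)), labels].sum()

def kmeans_elements(points, k, n_iter=50, seed=0):
    """Lloyd's algorithm on element centroids.
    Returns labels, cluster means, and the objective value recorded at each iteration."""
    rng = np.random.default_rng(seed)
    # Seed the cluster means with k distinct, randomly chosen element centroids.
    means = points[rng.choice(len(points), size=k, replace=False)].copy()
    history = []
    for _ in range(n_iter):
        labels, cost = assign(points, means)
        history.append(cost)
        new_means = means.copy()
        for I in range(k):
            members = points[labels == I]
            if len(members) > 0:  # an empty cluster keeps its previous mean
                new_means[I] = members.mean(axis=0)
        converged = np.allclose(new_means, means)
        means = new_means
        if converged:
            break
    labels, _ = assign(points, means)  # final assignment consistent with the returned means
    return labels, means, np.array(history)

# Hypothetical 2D RVE: 5,000 element centroids on a regular 100 x 50 grid, 100 clusters.
xs = (np.arange(100) + 0.5) / 100.0
ys = (np.arange(50) + 0.5) / 50.0
element_centroids = np.stack(np.meshgrid(xs, ys), axis=-1).reshape(-1, 2)
labels, means, history = kmeans_elements(element_centroids, k=100)
print("elements:", len(labels), "clusters used:", len(np.unique(labels)))
```

Because both steps of the iteration can only lower the within-cluster variance, the recorded objective values are non-increasing, which is a convenient sanity check on any clustering run.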

We construct a clustering-based reduced mesh via Delaunay triangulation by connecting cluster centroids, where the topological relations between clusters are preserved from the original FE mesh. By assuming the motions of cluster centroids are directly related to clustering nodes, we can compute the nodal displacements via polynomial-augmented radial point interpolation [74] as:

$$\begin{aligned} {\varvec{u}}_c = {\varvec{R}}{\varvec{a}} + {\varvec{Z}}{\varvec{b}} \end{aligned}$$

(C-2)

where ${\varvec{u}}_c$ represents the displacements of cluster centroids. ${\varvec{a}}$ is the coefficient vector of the radial basis function matrix ${\varvec{R}}$, and ${\varvec{b}}$ is the coefficient vector of the polynomial basis matrix ${\varvec{Z}}$. Meanwhile, the radial coefficient and the polynomial basis need to satisfy the following equation for every node per cluster and every polynomial basis function to ensure solution uniqueness [74] as:

$$\begin{aligned} {\varvec{Z}}{\varvec{a}} = {\varvec{0}} \end{aligned}$$

(C-3)
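A polynomial-augmented radial point interpolation of this kind can be sketched for a single displacement component as follows. This is a generic illustration under several assumptions that are not taken from the paper: a multiquadric radial basis with a hypothetical shape parameter, a linear polynomial basis, and made-up node positions. With the polynomial matrix stored as an array of shape (nodes, basis functions), the side condition of Equation (C-3) takes the form `Z.T @ a = 0` in this layout.

```python
import numpy as np

def multiquadric(r, shape):
    """Multiquadric radial basis function (one common choice, assumed here)."""
    return np.sqrt(r ** 2 + shape ** 2)

def polynomial_basis(points):
    """Linear polynomial basis [1, x, y, ...] evaluated at each point."""
    return np.hstack([np.ones((len(points), 1)), points])

def fit_rpim(nodes, values, shape=0.2):
    """Solve the augmented system [[R, Z], [Z^T, 0]] [a; b] = [u; 0] for a and b."""
    n = len(nodes)
    r = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=2)
    R = multiquadric(r, shape)
    Z = polynomial_basis(nodes)
    m = Z.shape[1]
    A = np.block([[R, Z], [Z.T, np.zeros((m, m))]])
    rhs = np.concatenate([values, np.zeros(m)])
    coeffs = np.linalg.solve(A, rhs)
    return coeffs[:n], coeffs[n:]

def evaluate_rpim(points, nodes, a, b, shape=0.2):
    """Evaluate u(x) = sum_i a_i phi(|x - x_i|) + sum_j b_j p_j(x) at the given points."""
    r = np.linalg.norm(points[:, None, :] - nodes[None, :, :], axis=2)
    return multiquadric(r, shape) @ a + polynomial_basis(points) @ b

def linear_field(p):
    """Hypothetical displacement component used only to exercise the interpolation."""
    return 1.0 + 2.0 * p[:, 0] - 3.0 * p[:, 1]

# Hypothetical 5 x 5 grid of nodes on the unit square.
g = np.linspace(0.0, 1.0, 5)
nodes = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)
a, b = fit_rpim(nodes, linear_field(nodes))
queries = np.random.default_rng(0).random((10, 2))
u_queries = evaluate_rpim(queries, nodes, a, b)
print("max error on a linear field:", np.abs(u_queries - linear_field(queries)).max())
```

The side condition is what makes the coefficients unique: a field that lies in the span of the polynomial basis is then carried entirely by ${\varvec{b}}$, and the radial coefficients vanish up to round-off.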

The displacements of cluster centroids are augmented with rotational degrees of freedom to represent the six rigid body motions in a 3D deflation space [75], including three translations and three rotations. Upon the completion of a non-linear analysis on the reduced mesh, the displacement solutions can be projected back to the original FE mesh by:

$$\begin{aligned} {\varvec{u}}_{i}^{j} = {\varvec{W}}_{i}^{j} \pmb {\lambda }_{j} \end{aligned}$$

(C-4)

where ${\varvec{u}}_{i}^{j}$ represents the displacement vector at the $i^{th}$ node in the $j^{th}$ cluster. In addition, $\pmb {\lambda }_{j}$ is the rigid body motion of the centroid of the $j^{th}$ cluster, while the ${\varvec{W}}_{i}^{j}$ indicates the deflation matrix for the $i^{th}$ node in the $j^{th}$ cluster as:

$$\begin{aligned} \pmb {\lambda }_{j}= & {} [u_{jx}, u_{jy}, u_{jz}, \theta _{jx}, \theta _{jy}, \theta _{jz}]^{T}; \nonumber \\ {\varvec{W}}_{i}^{j}= & {} \begin{bmatrix} 1 &{} 0 &{} 0 &{} 0 &{} z_{i}^{j} &{} -y_{i}^{j} \\ 0 &{} 1 &{} 0 &{} -z_{i}^{j} &{} 0 &{} x_{i}^{j} \\ 0 &{} 0 &{} 1 &{} y_{i}^{j} &{} -x_{i}^{j} &{} 0 \end{bmatrix} \end{aligned}$$

(C-5)

where $u_{jx}$ and $\theta _{jx}$ are the displacement along and the rotation about the x axis of the $j^{th}$ cluster, and ($x_{i}^{j}$, $y_{i}^{j}$, $z_{i}^{j}$) are the 3D coordinates of the $i^{th}$ node relative to the centroid of the $j^{th}$ cluster. By assuming all elements in the same cluster share identical stress and strain fields, microstructural effective responses can be reproduced highly efficiently because the number of unknowns drops from distinct field variables per element in the FE system to far fewer distinct solutions per cluster in the reduced system.
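The projection in Equations (C-4) and (C-5) amounts to a small matrix-vector product per node. The sketch below builds the deflation matrix exactly as written in Equation (C-5) and applies it to a hypothetical cluster; the function names, node coordinates, and rigid body motion are assumptions for illustration only.

```python
import numpy as np

def deflation_matrix(rel):
    """3x6 deflation matrix of (C-5) for a node at relative position rel = (x, y, z)."""
    x, y, z = rel
    return np.array([
        [1.0, 0.0, 0.0, 0.0,   z,  -y],
        [0.0, 1.0, 0.0,  -z, 0.0,   x],
        [0.0, 0.0, 1.0,   y,  -x, 0.0],
    ])

def project_to_nodes(node_coords, cluster_centroid, rigid_motion):
    """Apply (C-4) to every node of one cluster.
    rigid_motion = [u_x, u_y, u_z, theta_x, theta_y, theta_z]."""
    rel = node_coords - cluster_centroid
    return np.array([deflation_matrix(r) @ rigid_motion for r in rel])

# Hypothetical cluster with four nodes and a small rigid body motion.
cluster_nodes = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
centroid = cluster_nodes.mean(axis=0)
motion = np.array([0.1, 0.0, 0.0, 0.0, 0.0, 0.01])
nodal_displacements = project_to_nodes(cluster_nodes, centroid, motion)
print(nodal_displacements)
```

The last three columns of the matrix reproduce the small-rotation displacement $\pmb {\theta } \times {\varvec{r}}$, so each nodal displacement is the centroid translation plus the linearized rotation of the node about the centroid.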

To demonstrate the efficacy of our DCA, we compare its simulation results on a 3D multiscale cube against the classic FE² method in Fig. 20. The macro-cube is fully constrained at its bottom surface, and it is subject to an upward extension on the top surface with $d = 7$ mm. The cube is meshed with 12 tetrahedral elements of reduced-integration (one IP at the center of each tetrahedron). We assume each macro-IP is associated with the same porous RVE containing one spherical pore in the middle as shown in Fig. 20a.

To determine the number of clusters for a given problem (for any clustering-based ROM, e.g., DCA, SCA, or SCA’s variants), we can perform a quick preliminary convergence study where we gradually increase the number of clusters and determine the minimum number of clusters above which the results change insignificantly. This convergence study can also be done by comparing the results of the ROM to those of direct numerical simulations (DNS). Yet another method is to formulate a data-driven inverse optimization problem [59] where the cluster number is considered as an optimization variable. In this work, we carry out a convergence study to ensure our ROM’s solutions do not change as the number of clusters increases and that they are consistent with the DNS, i.e., FE². Specifically, we apply four clustering levels (k) of 400, 800, 1,200 and 1,600 to an RVE meshed with 15,000 elements and investigate the effects of k on the RVE’s effective softening behaviors, see Fig. 20.

We compare the reaction force-displacement curves from FE² and FE-ROM in Fig. 21a. By considering the FE² solutions as the benchmark, we observe that: (1) the FE-ROM solutions with $k = 400$ slightly overestimate the component’s strength as insufficient clustering in the RVE artificially strengthens the material [23, 25]; and (2) as k increases, the FE-ROM responses (especially the post-failure behaviors) approach the benchmark. Specifically, we observe that when k increases to 1,200 and 1,600, FE-ROMs achieve sufficiently accurate results compared to FE².

We compare the computational costs of the different solvers in Fig. 21b. All experiments are performed on an HPC system using 60 CPU cores in parallel with 360 GB RAM. The wall-clock time of FE² is the longest (about 24.9 hours). The wall-clock times of the ROM with 1,200 and 1,600 clusters are about 2.5 and 3.2 hours, resulting in acceleration factors of 9.9 and 7.8, respectively. Considering that the ROM with $k = 1,200$ is about $28\%$ faster than its counterpart with $k = 1,600$ while achieving similar accuracy, we adopt $k = 1,200$ while building the training dataset in Sect. 4.

For efficient generation of (micro)structure-performance datasets, we note that many other ROMs can also be used for porous microstructural analyses. For example, self-consistent clustering analysis (SCA) [23, 76, 77] and virtual clustering analysis (VCA) [24] can achieve highly efficient and accurate microstructural homogenization results by treating pores as a soft material whose modulus is $0.1\%$ of that of the matrix material [78]. Another method is the FEM-cluster-based analysis (FCA) [79] where the Hill-Mandel theorem is replaced with the energy equivalence theorem without filling pores with reference material properties. As our focus in this paper is on building a deep learning model that can faithfully surrogate microstructural analyses, we use our in-house DCA package and plan to leverage other methods such as SCA in our future works.

D Gated Recurrent Unit

To alleviate the vanishing and exploding gradient issues of RNNs in processing long sequential data, long short-term memory (LSTM) and gated recurrent unit (GRU) cells are typically used. GRU is a variant of the LSTM that, while providing similar accuracy, is more parsimonious and hence computationally more efficient. It is for this reason that we choose GRU as the memory cell in our proposed RNN architecture shown in Fig. 4.

To demonstrate the working mechanism of GRUs, we illustrate three interconnected cells of a GRU layer in Fig. 22. In a GRU layer, a typical cell at an arbitrary time step t generates predictions $\hat{{\varvec{y}}}_t$ and internal memory-like hidden variables ${\varvec{h}}_t$ after reading in the current inputs ${\varvec{x}}_t$ and the hidden variables ${\varvec{h}}_{t-1}$ from the previous cell. Compared to the RNN cell in Fig. 3b, the GRU cell uses reset and update gates to regulate its internal information flow. The reset gate ${\varvec{r}}_t$ reads ${\varvec{x}}_t$ and ${\varvec{h}}_{t-1}$ to determine the candidate hidden state $\tilde{{\varvec{h}}}_t$ by filtering out less important information passed from the previous cell. Its operations include:

$$\begin{aligned} {\varvec{r}}_t&=\sigma \left( {\varvec{W}}_{h r} {\varvec{h}}_{t-1}+{\varvec{W}}_{x r} {\varvec{x}}_t+{\varvec{b}}_r\right) \end{aligned}$$

(D-1a)

$$\begin{aligned} \tilde{{\varvec{h}}}_t&={\text {tanh}}\left( {\varvec{r}}_t \odot {\varvec{W}}_{h {\tilde{h}}} {\varvec{h}}_{t-1}+{\varvec{W}}_{x {\tilde{h}}} {\varvec{x}}_t+{\varvec{b}}_{{\tilde{h}}}\right) \end{aligned}$$

(D-1b)

where $\sigma $ is the sigmoid activation function that returns a value in the range of [0, 1], tanh is the hyperbolic tangent function, and $\odot $ represents the Hadamard product. ${\varvec{W}}_{hr}$, ${\varvec{W}}_{xr}$, ${\varvec{W}}_{h {\tilde{h}}}$, ${\varvec{W}}_{x {\tilde{h}}}$ are the weight matrices associated with the hidden state, the input state, the hidden-to-candidate hidden state and the input-to-candidate hidden state, respectively. ${\varvec{b}}_r$ and ${\varvec{b}}_{{\tilde{h}}}$ are the biases applied to the sigmoid function in the reset gate and the hyperbolic tangent function, respectively.

The update gate (which has its own weights and biases) similarly operates on ${\varvec{x}}_t$ and ${\varvec{h}}_{t-1}$: it linearly interpolates between the previous hidden state ${\varvec{h}}_{t-1}$ and the candidate hidden state $\tilde{{\varvec{h}}}_t$ to update the memory-like hidden state ${\varvec{h}}_t$, which is then passed to the next cell:

$$\begin{aligned} {\varvec{u}}_t&=\sigma \left( {\varvec{W}}_{hu} {\varvec{h}}_{t-1}+{\varvec{W}}_{x u} {\varvec{x}}_t+{\varvec{b}}_u\right) \end{aligned}$$

(D-2a)

$$\begin{aligned} {\varvec{h}}_t&={\varvec{u}}_t \odot {\varvec{h}}_{t-1}+\left( 1-{\varvec{u}}_t\right) \odot \tilde{{\varvec{h}}}_t+{\varvec{b}}_h \end{aligned}$$

(D-2b)

where ${\varvec{W}}_{hu}$ and ${\varvec{W}}_{xu}$ are the weights applied to the hidden state and input state in the update gate. ${\varvec{b}}_u$ and ${\varvec{b}}_h$ are the two biases associated with the sigmoid function and the generation of the current hidden state, respectively. The cell output at the current time step $\hat{{\varvec{y}}}_t$ is then obtained by linearly transforming the hidden state:

$$\begin{aligned} \hat{{\varvec{y}}}_t ={\varvec{W}}_{h y} {\varvec{h}}_t+{\varvec{b}}_y \end{aligned}$$

(D-3)

where ${\varvec{W}}_{hy}$ and ${\varvec{b}}_y$ are the weights and biases associated with the current output state $\hat{{\varvec{y}}}_t$. We note that all the weights and biases of the GRU networks are iteratively updated by BPTT during training.
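Equations (D-1) to (D-3) translate almost line by line into a forward pass for one GRU cell. The sketch below follows the equations as written above, including the additive bias ${\varvec{b}}_h$ in Equation (D-2b); it is not the trained network of this work. The layer sizes, the random initialization, the dictionary keys, and the random input sequence are hypothetical, and training by BPTT is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_gru(n_in, n_hidden, n_out, seed=0):
    """Randomly initialized weights and zero biases (illustrative values only)."""
    rng = np.random.default_rng(seed)

    def w(rows, cols):
        return 0.1 * rng.standard_normal((rows, cols))

    return {
        "W_hr": w(n_hidden, n_hidden), "W_xr": w(n_hidden, n_in), "b_r": np.zeros(n_hidden),
        "W_hc": w(n_hidden, n_hidden), "W_xc": w(n_hidden, n_in), "b_c": np.zeros(n_hidden),
        "W_hu": w(n_hidden, n_hidden), "W_xu": w(n_hidden, n_in), "b_u": np.zeros(n_hidden),
        "b_h": np.zeros(n_hidden),
        "W_hy": w(n_out, n_hidden), "b_y": np.zeros(n_out),
    }

def gru_step(x_t, h_prev, p):
    """One GRU cell evaluation; returns the output y_t and the new hidden state h_t."""
    r_t = sigmoid(p["W_hr"] @ h_prev + p["W_xr"] @ x_t + p["b_r"])             # (D-1a)
    h_cand = np.tanh(r_t * (p["W_hc"] @ h_prev) + p["W_xc"] @ x_t + p["b_c"])  # (D-1b)
    u_t = sigmoid(p["W_hu"] @ h_prev + p["W_xu"] @ x_t + p["b_u"])             # (D-2a)
    h_t = u_t * h_prev + (1.0 - u_t) * h_cand + p["b_h"]                       # (D-2b)
    y_t = p["W_hy"] @ h_t + p["b_y"]                                           # (D-3)
    return y_t, h_t

# Hypothetical sequence: 10 time steps, 6 input and 6 output components, 8 hidden units.
params = init_gru(n_in=6, n_hidden=8, n_out=6)
inputs = 0.01 * np.random.default_rng(1).standard_normal((10, 6))
h = np.zeros(8)
outputs = []
for x_t in inputs:
    y_t, h = gru_step(x_t, h, params)  # the hidden state carries the loading history forward
    outputs.append(y_t)
outputs = np.array(outputs)
print(outputs.shape)
```

With ${\varvec{b}}_h$ set to zero, as in this sketch, the hidden state is a convex combination of the previous state and a tanh output, so every entry stays within $[-1, 1]$ when starting from a zero state.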

E Experimental Material Characterization

For the microstructural simulations in Appendix C, we assume the microstructure only contains porosity and the matrix material (i.e., aluminum alloy A356). In this section, we therefore briefly discuss the experimental characterization process that can be used to obtain the effective elastoplastic and damage properties of the matrix material, see Fig. 23. Our experiment consists of several steps. In the first step, we melt aluminum A356 ingots in a furnace which is pre-heated to about $800^{\circ }\,\hbox {C}$. During the melting process, we apply degassing [80] to remove gases (e.g., hydrogen) and gas-induced porosity before casting the melt into tensile coupons. In the second step, we apply a standard T6 heat treatment to improve the A356 alloy’s strength and toughness. The heat treatment involves a high temperature treatment at $540^{\circ }\,\hbox {C}$ for 8 hours to dissolve alloy elements into the aluminum matrix, a quenching process to freeze alloy elements within the solid solution, and an artificial aging process at about $155^{\circ }\,\hbox {C}$ for 3.5 hours to precipitate alloy elements and form grain structures. We also perform composition analysis and find that our A356 alloy contains about $92.05\%$ aluminum (weight fraction), $6.72\%$ silicon, $0.09\%$ iron, $0.0028\%$ magnesium, and other alloy elements. In the third step, we use X-ray computed tomography (CT) to inspect the tensile coupons for porosity defects to ensure the cast alloy is free of pores. Finally, we perform tensile tests on the coupons and measure their averaged elastoplastic and damage parameters (which are provided in Sect. 4.1).

About this article

Cite this article

Deng, S., Hosseinmardi, S., Wang, L. et al. Data-driven physics-constrained recurrent neural networks for multiscale damage modeling of metallic alloys with process-induced porosity. Comput Mech 74, 191–221 (2024). https://doi.org/10.1007/s00466-023-02429-1

Received: 03 May 2023
Accepted: 02 December 2023
Published: 11 January 2024
Issue Date: July 2024
DOI: https://doi.org/10.1007/s00466-023-02429-1
