Abstract
Optimizing multiple, non-preferential objectives for mixed variable, expensive black-box problems is important in many areas of engineering and science. The expensive, noisy, black-box nature of these problems makes them ideal candidates for Bayesian optimization (BO). Mixed variable and multi-objective problems, however, are a challenge because of BO’s underlying smooth Gaussian process surrogate model. Current multi-objective BO algorithms cannot deal with mixed variable problems. We present MixMOBO, the first mixed variable, multi-objective Bayesian optimization framework for such problems. Using MixMOBO, optimal Pareto fronts for multi-objective, mixed variable design spaces can be found efficiently while ensuring diverse solutions. The method is sufficiently flexible to incorporate different kernels and acquisition functions, including those developed for mixed variable or multi-objective problems by other authors. We also present HedgeMO, a modified Hedge strategy that uses a portfolio of acquisition functions for multi-objective problems, and a new acquisition function, SMC. Our results show that MixMOBO performs well against other mixed variable algorithms on synthetic problems. We apply MixMOBO to the real-world design of an architected material and show that our optimal design, which was experimentally fabricated and validated, has a normalized strain energy density \(10^4\) times greater than existing structures.
Data availability
Complete data sets to reproduce all experiments generated and analysed during the current study, along with the MixMOBO code, are available from the corresponding author on reasonable request.
References
Balandat M, Karrer B, Jiang DR, Daulton S, Letham B, Wilson AG, Bakshy E (2020) BoTorch: a framework for efficient Monte-Carlo Bayesian optimization
Baptista R, Poloczek M (2018) Bayesian optimization of combinatorial structures
Bauer J, Schroer A, Schwaiger R, Kraft O (2016) Approaching theoretical strength in glassy carbon nanolattices. Nat Mater 15(4):438–443
Berger JB, Wadley HNG, McMeeking RM (2017) Mechanical metamaterials at the theoretical limit of isotropic elastic stiffness. Nature 543(7646):533–537
Bergstra J, Bardenet R, Bengio Y, Kégl B (2011) Algorithms for hyper-parameter optimization. In: Proceedings of the 24th international conference on neural information processing systems, NIPS’11, pp 2546–2554, Red Hook, NY, USA. Curran Associates Inc
Bergstra J, Yamins D, Cox D (2013) Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. In: Proceedings of the 30th international conference on machine learning, volume 28 of Proceedings of Machine Learning Research, pp 115–123, Atlanta, Georgia, USA, 17–19 Jun 2013. PMLR
Brochu E, Cora VM, de Freitas N (2010) A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning
Brochu E, Hoffman MW, de Freitas N (2011) Portfolio allocation for Bayesian optimization
Chen D, Skouras M, Zhu B, Matusik W (2018a) Computational discovery of extremal microstructure families. Sci Adv 4(1):eaao7005
Chen Y, Huang A, Wang Z, Antonoglou I, Schrittwieser J, Silver D, de Freitas N (2018b) Bayesian optimization in AlphaGo. CoRR abs/1812.06855
Chen W, Watts S, Jackson JA, Smith WL, Tortorelli DA, Spadaccini CM (2019) Stiff isotropic lattices beyond the Maxwell criterion. Sci Adv 5(9):eaaw1937
Daulton S, Balandat M, Bakshy E (2020) Differentiable expected hypervolume improvement for parallel multi-objective Bayesian optimization
Daulton S, Eriksson D, Balandat M, Bakshy E (2021) Multi-objective Bayesian optimization over high-dimensional search spaces
Daxberger E, Makarova A, Turchetta M, Krause A (2020) Mixed-variable Bayesian optimization. In: Proceedings of the twenty-ninth international joint conference on artificial intelligence, July. https://doi.org/10.24963/ijcai.2020/365
Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197. https://doi.org/10.1109/4235.996017
Deshwal A, Doppa JR (2021) Combining latent space and structured kernels for Bayesian optimization over combinatorial spaces. CoRR abs/2111.01186
Deshwal A, Belakaria S, Doppa JR (2021) Bayesian optimization over hybrid spaces
Fonseca CM, Paquete L, Lopez-Ibanez M (2006) An improved dimension-sweep algorithm for the hypervolume indicator. In: 2006 IEEE international conference on evolutionary computation, pp 1157–1163. https://doi.org/10.1109/CEC.2006.1688440
Frazier PI, Wang J (2015) Bayesian optimization for materials design, Springer Series in Materials Science. Springer, Berlin. pp 45–75. https://doi.org/10.1007/978-3-319-23871-5_3
The GPyOpt authors (2016) GPyOpt: a Bayesian optimization framework in Python. http://github.com/SheffieldML/GPyOpt
Garrido-Merchán EC, Hernández-Lobato D (2020) Dealing with categorical and integer-valued variables in Bayesian optimization with Gaussian processes. Neurocomputing 380:20–35. https://doi.org/10.1016/j.neucom.2019.11.004
Golovin D, Solnik B, Moitra S, Kochanski G, Karro J, Sculley D (2017) Google vizier: a service for black-box optimization. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’17, pp 1487–1495, New York, NY, USA. Association for Computing Machinery. ISBN 9781450348874. https://doi.org/10.1145/3097983.3098043
Gopakumar S, Gupta S, Rana S, Nguyen V, Venkatesh S (2018) Algorithmic assurance: An active approach to algorithmic testing using Bayesian optimisation. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in Neural Information Processing Systems, vol 31. Curran Associates, New York
Hutter F, Hoos HH, Leyton-Brown K (2011) Sequential model-based optimization for general algorithm configuration. In: Coello Coello CC (ed) Learning and Intelligent Optimization, pp 507–523. Springer, Berlin
Korovina K, Xu S, Kandasamy K, Neiswanger W, Poczos B, Schneider J, Xing EP (2020) ChemBO: Bayesian optimization of small organic molecules with synthesizable recommendations. In: Proceedings of the twenty third international conference on artificial intelligence and statistics, volume 108 of Proceedings of Machine Learning Research, pp 3393–3403. PMLR, 26–28 Aug
Krause A, Singh A, Guestrin C (2008) Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. J Mach Learn Res 9(8):235–284
Li R, Emmerich M, Eggermont J, Bovenkamp E, Bäck T, Dijkstra J, Reiber J (2006) Mixed-integer NK landscapes. In: Parallel Problem Solving from Nature—PPSN IX, Lecture Notes in Computer Science, vol 4193, pp 42–51. Springer, Berlin. https://doi.org/10.1007/11844297_5
Lindauer M, Eggensperger K, Feurer M, Biedenkapp A, Deng D, Benjamins C, Sass R, Hutter F (2021) SMAC3: a versatile Bayesian optimization package for hyperparameter optimization
Meza LR, Das S, Greer JR (2014) Strong, lightweight, and recoverable three-dimensional ceramic nanolattices. Science 345(6202):1322–1326
Mockus J (1994) Application of Bayesian approach to numerical methods of global and stochastic optimization. J Glob Optim 4:347–365
Murphy KP (2012) Machine learning: a probabilistic perspective. MIT Press, New York
Nguyen D, Gupta S, Rana S, Shilton A, Venkatesh S (2019) Bayesian optimization for categorical and category-specific continuous inputs
Oh C, Gavves E, Welling M (2018) BOCK: Bayesian optimization with cylindrical kernels. In: Proceedings of the 35th International Conference on Machine Learning, vol. 80 of Proceedings of Machine Learning Research, pp 3868–3877. PMLR, 10–15 Jul
Oh C, Tomczak JM, Gavves E, Welling M (2019) Combinatorial Bayesian optimization using the graph Cartesian product
Oh C, Gavves E, Welling M (2021) Mixed variable Bayesian optimization with frequency modulated kernels
Pelamatti J, Brevault L, Balesdent M, Talbi E-G, Guerin Y (2018) Efficient global optimization of constrained mixed variable problems
Pham M-S, Liu C, Todd I, Lertthanasarn J (2019) Damage-tolerant architected materials inspired by crystal microstructure. Nature 565(7739):305–311
Pyzer-Knapp E (2018) Bayesian optimization for accelerated drug discovery. IBM J Res Dev 11:1–1. https://doi.org/10.1147/JRD.2018.2881731
Qian PZG, Wu H, Jeff Wu CF (2008) Gaussian process models for computer experiments with qualitative and quantitative factors. Technometrics 50(3):383–396
Rasmussen CE, Williams CKI (2006) Gaussian processes for machine learning, Adaptive computation and machine learning. MIT Press, New York
Ru B, Alvi AS, Nguyen V, Osborne MA, Roberts SJ (2020) Bayesian optimisation over multiple continuous and categorical inputs
Scikit-learn (2021) scikit-optimize. https://scikit-optimize.github.io/stable/
Shaw LA, Sun F, Portela CM, Barranco RI, Greer JR, Hopkins JB (2019) Computationally efficient design of directionally compliant metamaterials. Nat Commun 10(1):1–13
Sheikh HM, Callan T, Hennessy K, Marcus P (2021) Shape optimization methodology for fluid flows using mixed variable Bayesian optimization and design-by-morphing. In: APS division of fluid dynamics meeting abstracts, APS Meeting Abstracts, page A15.004
Sheikh HM, Lee S, Wang J, Marcus PS (2022a) Airfoil optimization using design-by-morphing. https://arxiv.org/abs/2207.11448
Sheikh HM, Callan TA, Hennessy KJ, Marcus PS (2022b) Optimization of the shape of a hydrokinetic turbine’s draft tube and hub assembly using design-by-morphing with Bayesian optimization. arXiv:2207.11451
Shu L, Jiang P, Shao X, Wang Y (2020) A new multi-objective Bayesian optimization formulation with the acquisition function for convergence and diversity. J Mech Des 142(9):091703. https://doi.org/10.1115/1.4046508
Snoek J, Larochelle H, Adams RP (2012) Practical Bayesian optimization of machine learning algorithms
Song J, Wang Y, Zhou W, Fan R, Bin Y, Yang L, Li L (2019) Topology optimization-guided lattice composites and their mechanical characterizations. Compos Part B 160:402–411
Song J, Zhou W, Wang Y, Fan R, Wang Y, Chen J, Yang L, Li L (2019) Octet-truss cellular materials for improved mechanical properties and specific energy absorption. Mater Des 173:107773
Song J, Michas C, Chen CS, White AE, Grinstaff MW (2020) From simple to architecturally complex hydrogel scaffolds for cell and tissue engineering applications: opportunities presented by two-photon polymerization. Adv Healthc Mater 9(1):1901217
Srinivas N, Krause A, Kakade SM, Seeger MW (2012) Information-theoretic regret bounds for Gaussian process optimization in the bandit setting. IEEE Trans Inf Theory 58(5):3250–3265. https://doi.org/10.1109/tit.2011.2182033
Kauffman S, Levin S (1987) Towards a general theory of adaptive walks on rugged landscapes. J Theoret Biol 128(1):11–45. https://doi.org/10.1016/S0022-5193(87)80029-2
Suzuki S, Takeno S, Tamura T, Shitara K, Karasuyama M (2020) Multi-objective Bayesian optimization using Pareto-frontier entropy
Tancogne-Dejean T, Diamantopoulou M, Gorji MB, Bonatti C, Mohr D (2018) 3d plate-lattices: an emerging class of low-density metamaterial exhibiting optimal isotropic stiffness. Adv Mater 30(45):1803334
Tušar T, Brockhoff D, Hansen N (2019) Mixed-integer benchmark problems for single- and bi-objective optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO ’19, pp 718–726, New York, NY, USA. Association for Computing Machinery. https://doi.org/10.1145/3321707.3321868
Tiao LC, Klein A, Seeger M, Bonilla EV, Archambeau C, Ramos F (2021) BORE: Bayesian optimization by density-ratio estimation
Vangelatos Z, Komvopoulos K, Grigoropoulos C (2020) Regulating the mechanical behavior of metamaterial microlattices by tactical structure modification. J Mech Phys Solids 144:104112
Vangelatos Z, Sheikh HM, Marcus PS, Grigoropoulos CP, Lopez VZ, Flamourakis G, Farsari M (2021) Strength through defects: a novel Bayesian approach for the optimization of architected materials. Sci Adv 7:41. https://doi.org/10.1126/sciadv.abk2218
Xia X, Afshar A, Yang H, Portela CM, Kochmann DM, Di Leo CV, Greer JR (2019) Electrochemically reconfigurable architected materials. Nature 573(7773):205–213
Hu Y, Hu J, Xu Y, Wang F, Cao R (2011) Contamination control in food supply chain. In: Proceedings of the Winter Simulation Conference, pp 2678–2681. https://doi.org/10.1109/WSC.2010.5678963
Zhang X, Vyatskikh A, Gao H, Greer JR, Li X (2019) Lightweight, flaw-tolerant, and ultrastrong nanoarchitected carbon. Proc Natl Acad Sci USA 116(14):6665–6672
Zhang Y, Apley DW, Chen W (2020) Bayesian optimization for materials design with mixed quantitative and qualitative variables. Sci Rep 10(1):1–13
Zheng X, Lee H, Weisgraber TH, Shusteff M, DeOtte J, Duoss EB, Kuntz JD, Biener MM, Ge Q, Jackson JA et al (2014) Ultralight, ultrastiff mechanical metamaterials. Science 344(6190):1373–1377
Zhou Q, Qian PZG, Zhou S (2011) A simple approach to emulation for computer models with qualitative and quantitative factors. Technometrics 53(3):266–273
Zitzler E, Deb K, Thiele L (2000) Comparison of multiobjective evolutionary algorithms: empirical results. Evol Comput 8(2):173–195. https://doi.org/10.1162/106365600568202
Acknowledgements
The authors would like to thank Chiyu ‘Max’ Jiang, research scientist at Waymo Research, and Professor Uros Seljak, Department of Physics, University of California at Berkeley (UCB) for insightful discussions regarding Bayesian optimization. We would also like to thank Zacharias Vangelatos and Professor Costas P. Grigoropoulos, Department of Mechanical Engineering, University of California at Berkeley (UCB) for the collaboration to design and manufacture architected materials, conduct nanoindentation, SEM, and HIM experiments. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562 through allocation TG-CTS190047.
Funding
The authors declare no funding sources that need to be disclosed.
Author information
Authors and Affiliations
Contributions
H.M.S. conceptualized the algorithm, designed the methodology, and performed the experiments under the supervision of P.S.M. H.M.S. and P.S.M. then wrote and edited the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflicts of interest or competing interests that need to be disclosed.
Replication of Results
All the results in this manuscript can be replicated. The complete data sets, the MixMOBO algorithm, and any other supplementary material and information required for replication are available from the corresponding author on reasonable request.
Additional information
Responsible Editor: Byeng D Youn
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A: Benchmark Test Functions
In this section, we define the benchmark test functions, all of which are set to be maximized during our optimizations.
1.1 Contamination Problem
The contamination problem was introduced by Hu et al. (2011) and is used to test categorical variables with binary categories. The problem aims to maximize the reward function for applying a preventative measure to stop contamination in a food supply chain with D stages. At each stage \(i\), where \(i\in [1,D]\), decontamination efforts can be applied. This effort, however, comes at a cost c and decreases the contamination by a random rate \(\Gamma _i\). If no prevention effort is taken, the contamination spreads at a rate \(\Omega _i\). At each stage i, the fraction of contaminated food \(Z_i\) is given by the recursive relation:
\(Z_i=\Omega _i(1-w_i)(1-Z_{i-1})+(1-\Gamma _i w_i)Z_{i-1},\)
here \(w_i\in \{0,1\}\) is the decision variable that determines whether preventative measures are taken at stage \(i\). The goal is to decide at which stages \(i\) action should be taken so that \(Z_i\) does not exceed an upper limit \(U_i\). \(\Omega _i\) and \(\Gamma _i\) are drawn from uniform distributions. We consider the problem setup with Lagrangian relaxation (Baptista and Poloczek 2018):
\(\max _{{\vec {w}}}\;-\sum _{i=1}^{D}\left[c\,w_i+\frac{\rho }{T}\sum _{k=1}^{T}{\mathbb {1}}\{Z_{i,k}>U_i\}\right]-\lambda \Vert {\vec {w}}\Vert _1,\)
where \(Z_{i,k}\) is the contamination at stage \(i\) under the \(k\)th Monte-Carlo draw of the random rates. Violation of \(Z_i<U_i\) is penalized by \(\rho =1\), summed over the contaminated stages, and the total number of stages (dimensions) is \(D=21\). The cost c is set to 0.2 and \(Z_1=0.01\). As in the setup of Baptista and Poloczek (2018), we use \(T=100\) Monte-Carlo samples, \(U_i=0.1\), \(\lambda =0.01\), and \(\epsilon =0.05\).
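The penalized contamination objective described above can be sketched as a Monte-Carlo estimate. This is a minimal sketch, assuming the uniform rate distributions stated in the text; the function name and code structure are ours, not the authors' implementation:

```python
import numpy as np

def contamination_objective(w, T=100, c=0.2, U=0.1, rho=1.0, lam=0.01, seed=0):
    """Monte-Carlo estimate of the penalized cost, negated so that we maximize.
    w is a binary vector: w[i] = 1 if decontamination is applied at stage i."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    violations = 0.0
    for _ in range(T):                         # T Monte-Carlo draws of the random rates
        Z = 0.01                               # initial contamination Z_1
        for wi in w:
            Omega, Gamma = rng.uniform(size=2) # spread / decontamination rates
            Z = Omega * (1 - wi) * (1 - Z) + (1 - Gamma * wi) * Z
            if Z > U:                          # upper limit violated at this stage
                violations += 1.0
    cost = c * w.sum() + rho * violations / T + lam * np.abs(w).sum()
    return -cost                               # negate: all test functions are maximized
```

Fixing the random seed keeps the noisy objective reproducible across evaluations, which is convenient when benchmarking optimizers.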
1.2 Encrypted Amalgamated
Analytic test functions generally cannot mimic mixed variables. To map a continuous input variable of a test function into a discrete ordinal or categorical variable with N states, the variable’s continuous range is first discretized into N subranges by selecting \((N-1)\) break points, often equally spaced, within its bounds. Each subrange is then assigned a unique integer index (if necessary, the domain is first mapped into a larger one so that each subrange has a unique integer value). For ordinal variables, this discretization suffices. For categorical variables, a random vector is then generated for each variable that scrambles, or ‘encrypts’, the indices, creating random landscapes as is the case for categorical variables with a latent space. The optimization algorithm sees only the encrypted space; the random vector is used only when evaluating the black-box function.
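The discretize-and-scramble procedure can be sketched as a wrapper around any continuous test function. The function names are ours, and equal spacing of the discrete levels is assumed, as in the description above:

```python
import numpy as np

def encrypt(f, lo, hi, n_states, seed=0):
    """Wrap a continuous test function f so that it accepts categorical indices.
    Each of the d variables is discretized into n_states equally spaced values,
    and a per-variable random permutation (hidden from the optimizer) scrambles
    the index-to-value mapping, mimicking a categorical latent space."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    d = lo.size
    levels = np.linspace(lo, hi, n_states)                          # (n_states, d) grid
    keys = np.array([rng.permutation(n_states) for _ in range(d)])  # encryption keys

    def encrypted(idx):
        # the optimizer sees only scrambled indices; decode before evaluating f
        x = levels[keys[np.arange(d), idx], np.arange(d)]
        return f(x)
    return encrypted
```

For example, `encrypt(lambda x: -np.sum(x**2), [-5, -5], [5, 5], 5)` yields a 2-variable categorical sphere function whose index-to-value mapping is scrambled but whose set of attainable values is unchanged.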
We also define a new test function that we call the Amalgamated function, a piece-wise function formed from commonly used analytical test functions with different features (for more details on these functions we refer to Tušar et al. (2019)). The Amalgamated function is non-convex and anisotropic, unlike conventional test functions where isotropy can be exploited.
For \(i=1,\dots ,n\), \(k={\text {mod}}(i-1,7)\):
where
To create the Encrypted Amalgamated function, for categorical and ordinal variables, equally spaced points are taken within the bounds defined above. For our current work, we use \(D=13\), with 8 categorical and 3 ordinal variables with 5 states each, and 2 continuous variables.
1.3 NK Landscapes
NK landscapes were introduced by Kauffman and Levin (1987) as a way of creating optimization problems with categorical variables. N is the number of genes (the number of dimensions D), and K is the number of epistatic links of each gene to other genes, which controls the ‘ruggedness’ of the landscape. A large number of random landscapes can be created for given N and K values; the global optimum of a generated landscape can only be computed through complete enumeration. The landscape cost of any vector is the average of its component costs. Each component cost is based on random values generated for the categories, determined not only by a gene’s own allele but also by the alleles of the other genes connected to it through the random epistasis matrix, with connection probability, or ruggedness, K. A ruggedness of \(K=1\) corresponds to a fully connected genome.
The NK landscapes of Kauffman and Levin (1987) were formulated only for binary variables. They were extended by Li et al. (2006) to multi-categorical problems, which is the formulation we use; details of the NK landscape test functions can be found there. For the current study, we use \(N=8\) with 4 categories each and ruggedness \(K=0.2\).
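A random multi-categorical NK landscape in the spirit of Li et al. (2006) can be sketched as follows. The exact epistasis construction and lazy cost-table sampling here are our assumptions, not the authors' generator:

```python
import numpy as np

def make_nk_landscape(N=8, A=4, K=0.2, seed=0):
    """Random NK landscape with N genes of A categorical alleles each.
    Each gene is linked to every other gene independently with probability K
    (and always to itself); component costs are drawn uniformly at random."""
    rng = np.random.default_rng(seed)
    links = [np.where((rng.random(N) < K) | (np.arange(N) == i))[0] for i in range(N)]
    tables = [dict() for _ in range(N)]        # lazily drawn component-cost tables

    def fitness(genome):
        total = 0.0
        for i, link in enumerate(links):
            key = tuple(genome[j] for j in link)
            if key not in tables[i]:           # draw each allele combination's cost once
                tables[i][key] = rng.random()
            total += tables[i][key]
        return total / N                       # landscape cost = average component cost
    return fitness
```

Because the component costs are cached per allele combination, re-evaluating the same genome always returns the same value, as required for a deterministic benchmark.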
1.4 Rastrigin
The Rastrigin function is a commonly used non-convex optimization function (Tušar et al. 2019) with a large number of local optima. Negated for maximization, it is defined as:
\(f({\vec {w}})=-\left(10D+\sum _{i=1}^{D}\left[w_i^2-10\cos (2\pi w_i)\right]\right),\quad w_i\in [-5.12,5.12].\)
We use \(D=9\) for testing, with 6 ordinal variables with 5 discrete states each and 3 continuous variables. The ordinal variables are equally spaced within the bounds.
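The mixed-variable Rastrigin setup can be sketched as below. The negation follows the convention that all test functions are maximized; the standard bounds \([-5.12, 5.12]\) and the names are our assumptions:

```python
import numpy as np

def rastrigin_neg(w):
    """Negated Rastrigin (we maximize); global maximum 0 at w = 0."""
    w = np.asarray(w, float)
    return -(10.0 * w.size + np.sum(w**2 - 10.0 * np.cos(2.0 * np.pi * w)))

# Mixed-variable wrapper: 6 ordinal variables, each with 5 equally spaced
# levels on [-5.12, 5.12], plus 3 continuous variables.
LEVELS = np.linspace(-5.12, 5.12, 5)

def rastrigin_mixed(ordinal_idx, continuous):
    w = np.concatenate([LEVELS[np.asarray(ordinal_idx)],
                        np.asarray(continuous, float)])
    return rastrigin_neg(w)
```

With 5 equally spaced levels, the middle ordinal state (index 2) lands exactly on 0, so the global optimum remains reachable in the discretized space.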
1.5 Encrypted Styblinski-Tang
We use the Styblinski-Tang function (Tušar et al. 2019), an isotropic non-convex function. The function is considered difficult to optimize because many search algorithms get ‘stuck’ at a local optimum. For use with categorical variables, we encrypt it as described previously. Negated for maximization, the Styblinski-Tang function of input vector \({\vec {w}}\) is defined as:
\(f({\vec {w}})=-\frac{1}{2}\sum _{i=1}^{D}\left(w_i^4-16w_i^2+5w_i\right),\quad w_i\in [-5,5].\)
For the current study, this function was tested with \(D=10\) categorical variables and 5 categories for each variable.
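The negated Styblinski-Tang function is a one-liner; this sketch follows the standard definition (the bounds and the per-dimension optimum value cited in the comment are from the standard benchmark literature, not this paper):

```python
import numpy as np

def styblinski_tang_neg(w):
    """Negated Styblinski-Tang (we maximize); each dimension contributes a
    maximum of about 39.166 at w_i ≈ -2.903534, so the global optimum is
    approximately 39.166 * D."""
    w = np.asarray(w, float)
    return -0.5 * np.sum(w**4 - 16.0 * w**2 + 5.0 * w)
```

Encrypting it for the \(D=10\), 5-category test then only requires the discretize-and-scramble wrapper described earlier.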
1.6 Encrypted ZDT6
ZDT benchmarks are a suite of multi-objective problems, suggested by Zitzler et al. (2000), and are among the most commonly used tests for such problems. We use ZDT6, which is non-convex and non-uniform in its parameter space. We again modify the function by encrypting it to work with categorical problems. In its standard (minimization) form, ZDT6 is defined as:
\(f_1({\vec {w}})=1-\exp (-4w_1)\sin ^6(6\pi w_1),\)
\(g({\vec {w}})=1+9\left[\frac{1}{D-1}\sum _{i=2}^{D}w_i\right]^{0.25},\)
\(f_2({\vec {w}})=g\left[1-\left(f_1/g\right)^2\right].\)
Here \(w_i\in [0,1]\), and the Pareto-optimal set corresponds to \(w_i=0\) for \(i=2,\dots ,D\). The function was tested for \(D=10\) with 5 categories each. We note that to evaluate the performance of MixMOBO, we compared it against the NSGA-II variant (Deb et al. 2002) that can deal with mixed variables, running ZDT4 in a mixed variable setting and ZDT6 with categorical variables. No encryption is necessary for GAs. The GAs required, on average, \(10^2\) times more function calls than MixMOBO.
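The two ZDT6 objectives can be sketched directly from the standard definition (this is the textbook minimization form; negation and encryption for MixMOBO follow as described above):

```python
import numpy as np

def zdt6(w):
    """ZDT6 in its standard two-objective (minimization) form, w_i in [0, 1]."""
    w = np.asarray(w, float)
    f1 = 1.0 - np.exp(-4.0 * w[0]) * np.sin(6.0 * np.pi * w[0]) ** 6
    g = 1.0 + 9.0 * (np.sum(w[1:]) / (w.size - 1)) ** 0.25
    f2 = g * (1.0 - (f1 / g) ** 2)
    return f1, f2
```

On the Pareto-optimal set (\(w_i=0\) for \(i\ge 2\)), \(g=1\) and the front reduces to \(f_2 = 1 - f_1^2\), which is the non-convex trade-off curve that makes ZDT6 a useful stress test.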
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sheikh, H.M., Marcus, P.S. Bayesian optimization for mixed-variable, multi-objective problems. Struct Multidisc Optim 65, 331 (2022). https://doi.org/10.1007/s00158-022-03382-y