Abstract
Exact combinatorial search algorithms have applications in several areas, including computational algebra, AI, and discrete optimization. These problems are compute-intensive and have highly irregular search trees. Most earlier efforts to parallelize these algorithms used a fixed degree of parallelism at runtime. We show that such an approach leads to poor resource utilization, because the parallel runtime efficiency of an irregular search application varies over time. We propose DiGTreeS, a distributed resilient framework for generalized tree search that supports elastic scaling. It features an easy-to-use API for expressing combinatorial search and hides system concerns such as load balancing, fault tolerance, and elastic scaling. We evaluate the DiGTreeS framework under different scaling strategies and show its effectiveness on four representative problems: the Traveling Salesman Problem, 0–1 Knapsack, N-queens, and a generic state space search application.
Data availability
Data and/or code will be made available on request.
Notes
Reactive scaling is a technique in which the elasticity controller reacts to changes in the system and then decides on scaling operations [6].
A task refers to an unexplored portion of the subtree.
The getCurrentThreadUserTime method of ThreadMXBean returns the user-level CPU time of the current thread, in nanoseconds, if CPU time measurement is enabled, and \(-1\) otherwise.
CPLEX is a commercial ILP solver by IBM.
Hybrid scaling is the combination of both upscaling and downscaling.
We calculate the percentage deviation as the ratio of the absolute difference between the current value and the mean value to the mean value, expressed as a percentage, i.e., \(\text {deviation} = \frac{|\text {current\,value} - \text {mean\,value}|}{\text {mean\,value}} \times 100.\)
Execution starts with 10 workers because workers may fail at the very beginning of the execution when the number of workers is small (i.e., fewer than 4).
Proactive scaling is a technique that anticipates future changes in the system and acts before they occur [6].
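Two of the notes above, the ThreadMXBean CPU-time query and the percentage deviation formula, can be illustrated with a minimal Java sketch. The class NoteExamples and its method names are hypothetical and are not part of DiGTreeS; only the java.lang.management calls are standard JDK API. The sketch assumes a positive mean value, and it shows how such measurements might be taken rather than how the framework itself takes them.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class for illustration; not part of DiGTreeS.
final class NoteExamples {

    // User-level CPU time of the calling thread in nanoseconds, or -1 when
    // the JVM cannot measure it or the measurement is switched off.
    static long currentThreadUserNanos() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isCurrentThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled()) {
            return -1L;
        }
        return bean.getCurrentThreadUserTime();
    }

    // Percentage deviation: |current - mean| / mean * 100.
    // Assumes a positive mean (an assumption of this sketch).
    static double percentageDeviation(double current, double mean) {
        if (mean <= 0.0) {
            throw new IllegalArgumentException("mean must be positive");
        }
        return Math.abs(current - mean) / mean * 100.0;
    }

    public static void main(String[] args) {
        long before = currentThreadUserNanos();
        long sink = 0;
        for (int i = 0; i < 20_000_000; i++) {
            sink += i % 7; // busy work so that some CPU time accrues
        }
        long after = currentThreadUserNanos();
        if (before >= 0 && after >= 0) {
            System.out.println("user CPU time (ns): " + (after - before) + " [sink=" + sink + "]");
        } else {
            System.out.println("CPU time measurement unavailable");
        }
        System.out.println("deviation (%): " + percentageDeviation(15.0, 10.0));
    }
}
```

Checking support and enablement before the query avoids an UnsupportedOperationException on JVMs without per-thread CPU timing, and maps both cases to the same \(-1\) sentinel that the note describes.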
References
Paschos VT (2014) Applications of combinatorial optimization. Wiley, Hoboken
Archibald B, Maier P, Stewart R, Trinder P (2019) Implementing yewpar: a framework for parallel tree search. In: Euro-Par 2019: Parallel Processing: 25th International Conference on Parallel and Distributed Computing, Göttingen, Germany, August 26–30, 2019, Proceedings 25. Springer, pp 184–196
Goldreich O (2010) P, NP, and NP-completeness: the basics of computational complexity. Cambridge University Press, Cambridge
Kehrer S, Blochinger W (2020) Equilibrium: an elasticity controller for parallel tree search in the cloud. J Supercomput 76:9211–9245
Yasugi M, Muraoka D, Hiraishi T, Umatani S, Emoto K (2019) Hope: a parallel execution model based on hierarchical omission. In: Proceedings of the 48th International Conference on Parallel Processing, pp 1–11
Rampérez V, Soriano J, Lizcano D, Lara JA (2021) Flas: a combination of proactive and reactive auto-scaling architecture for distributed services. Futur Gener Comput Syst 118:56–72
Haussmann J, Blochinger W, Kuechlin W (2019) Cost-efficient parallel processing of irregularly structured problems in cloud computing environments. Clust Comput 22(3):887–909
Rosa Righi R, Rodrigues VF, Rostirolla G, Costa CA, Roloff E, Navaux POA (2018) A lightweight plug-and-play elasticity service for self-organizing resource provisioning on parallel applications. Futur Gener Comput Syst 78:176–190
Vizel Y, Weissenbacher G, Malik S (2015) Boolean satisfiability solvers and their applications in model checking. Proc IEEE 103(11):2021–2035
Yang J, He Q (2018) Scheduling parallel computations by work stealing: a survey. Int J Parallel Prog 46:173–197
Xie F, Davenport A (2010) Massively parallel constraint programming for supercomputers: challenges and initial results. In: International Conference on Integration of Artificial Intelligence (AI) and Operations Research (OR) Techniques in Constraint Programming. Springer, pp 334–338
Herbst NR, Kounev S, Reussner R (2013) Elasticity in cloud computing: what it is, and what it is not. In: 10th International Conference on Autonomic Computing (ICAC 13), pp 23–27
Hunt P, Konar M, Junqueira FP, Reed B (2010) Zookeeper: wait-free coordination for internet-scale systems. In: USENIX Annual Technical Conference, vol 8
Kreps J, Narkhede N, Rao J et al. (2011) Kafka: a distributed messaging system for log processing. In: Proceedings of the NetDB, vol 11. Athens, Greece, pp 1–7
Shvachko K, Kuang H, Radia S, Chansler R (2010) The hadoop distributed file system. In: 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), pp 1–10. https://doi.org/10.1109/MSST.2010.5496972
Gupta A, Faraboschi P, Gioachin F, Kale LV, Kaufmann R, Lee B-S, March V, Milojicic D, Suen CH (2014) Evaluating and improving the performance and scheduling of hpc applications in cloud. IEEE Trans Cloud Comput 4(3):307–321
Bui P, Rajan D, Abdul-Wahid B, Izaguirre J, Thain D (2011) Work queue+ python: a framework for scalable scientific ensemble applications. In: Workshop on Python for High Performance and Scientific Computing at Sc11
Rosa Righi R, Rodrigues VF, Da Costa CA, Galante G, De Bona LCE, Ferreto T (2015) Autoelastic: automatic resource elasticity for high performance applications in the cloud. IEEE Trans Cloud Comput 4(1):6–19
Archibald B, Maier P, Stewart R, Trinder P, De Beule J (2017) Towards generic scalable parallel combinatorial search. In: Proceedings of the International Workshop on Parallel Symbolic Computation, pp 1–10
Poldner M, Kuchen H (2008) Algorithmic skeletons for branch and bound. In: Software and Data Technologies: First International Conference, ICSOFT 2006, Setúbal, Portugal, September 11–14, 2006, Revised Selected Papers 1. Springer, pp 204–219
Bungart M, Fohry C (2017) A malleable and fault-tolerant task pool framework for x10. In: 2017 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, pp 749–757
Johnson DS, McGeoch LA (1997) The traveling salesman problem: a case study in local optimization. Local Search Comb Optim 1(1):215–310
Salkin HM, De Kluyver CA (1975) The knapsack problem: a survey. Naval Res Logist Q 22(1):127–144
Bell J, Stevens B (2009) A survey of known results and research areas for n-queens. Discret Math 309(1):1–31
Khaund A, Sharma AM, Tiwari A, Garg S, Kailasam S (2023) Rd-fca: a resilient distributed framework for formal concept analysis. J Parall Distrib Comput 179:104710
Archibald B, Maier P, McCreesh C, Stewart R, Trinder P (2018) Replicable parallel branch and bound search. J Parall Distrib Comput 113:92–114
Prim RC (1957) Shortest connection networks and some generalizations. Bell Syst Tech J 36(6):1389–1401
Kizilateş G, Nuriyeva F (2013) On the nearest neighbor algorithms for the traveling salesman problem. In: Advances in Computational Science, Engineering and Information Technology: Proceedings of the Third International Conference on Computational Science, Engineering and Information Technology (CCSEIT-2013), KTO Karatay University, June 7–9, 2013, Konya, Turkey-Volume 1. Springer, pp 111–118
Bersani MM, Bianculli D, Dustdar S, Gambi A, Ghezzi C, Krstić S (2014) Towards the formalization of properties of cloud-based elastic systems. In: Proceedings of the 6th International Workshop on Principles of Engineering Service-Oriented and Cloud Systems, pp 38–47
Pisinger D (2005) Where are the hard knapsack problems? Comput Oper Res 32(9):2271–2284
Zangeneh A, Jadid S, Rahimi-Kian A (2010) Normal boundary intersection and benefit-cost ratio for distributed generation planning. Eur Trans Electr Power 20(2):97–113
Author information
Contributions
MAJ helped in conceptualization, methodology, writing—original draft, writing—reviewing and editing, software. SK contributed to conceptualization, methodology, writing—reviewing and editing. BG and VS were involved in conceptualization and methodology.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Jamal, M.A., Kailasam, S., Goyal, B. et al. DiGTreeS: a distributed resilient framework for generalized tree search. J Supercomput 80, 15006–15037 (2024). https://doi.org/10.1007/s11227-024-06017-9
DOI: https://doi.org/10.1007/s11227-024-06017-9