Abstract
Graphlet enumeration is a fundamental problem in discovering interesting patterns hidden in graphs, with many applications in science, including biology and chemistry. In this paper, we present a novel approach to discovering these patterns with queries in a parallel database system. Our solution relies on an efficient partitioning strategy based on randomized vertex coloring, which guarantees perfect load balancing and accurate (complete and consistent) graphlet enumeration. To the best of our knowledge, our work is the first to provide an abstract and efficient database solution that uses queries to enumerate both 3-vertex and 4-vertex patterns on large graphs.
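To make the idea of partitioning by randomized vertex coloring concrete, the following is a minimal in-memory sketch for the 3-vertex case (triangles) only. It is an illustration under stated assumptions, not the paper's query-based implementation: the function names, the number of colors `k`, the fixed `seed`, and the rule that one partition corresponds to one sorted color triple are all hypothetical choices made for this example. Each vertex receives a random color, each edge is copied to every partition whose color triple covers both endpoint colors, and each partition reports only the triangles whose colors match its own triple, so every triangle is reported exactly once.

```python
import random
from collections import defaultdict
from itertools import combinations_with_replacement


def color_vertices(vertices, k, seed=0):
    """Assign each vertex a random color in 0..k-1 (seed is an assumption for reproducibility)."""
    rng = random.Random(seed)
    return {v: rng.randrange(k) for v in sorted(vertices)}


def partition_edges(edges, color, k):
    """Copy each edge to every sorted color triple that contains both endpoint colors."""
    parts = defaultdict(set)
    triples = list(combinations_with_replacement(range(k), 3))
    for u, v in edges:
        a, b = color[u], color[v]
        for t in triples:
            rest = list(t)
            # The triple must contain the multiset {a, b}, so remove a before testing b.
            if a in rest:
                rest.remove(a)
                if b in rest:
                    parts[t].add((min(u, v), max(u, v)))
    return parts


def local_triangles(part_edges, color, triple):
    """Enumerate triangles inside one partition, keeping only those owned by this triple."""
    adj = defaultdict(set)
    for u, v in part_edges:
        adj[u].add(v)
        adj[v].add(u)
    found = set()
    for u, v in part_edges:
        for w in adj[u] & adj[v]:
            tri = tuple(sorted((u, v, w)))
            # Ownership rule: report only if the triangle's colors equal this triple.
            if tuple(sorted(color[x] for x in tri)) == triple:
                found.add(tri)
    return found


def enumerate_triangles(edges, k=3, seed=0):
    """Run all partitions sequentially; in a parallel system each would run on its own worker."""
    vertices = {x for e in edges for x in e}
    color = color_vertices(vertices, k, seed)
    parts = partition_edges(edges, color, k)
    result = []
    for triple, part in parts.items():
        result.extend(local_triangles(part, color, triple))
    return sorted(result)


if __name__ == "__main__":
    # A 4-clique on vertices 1..4 plus one pendant edge (4, 5).
    sample = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
    print(enumerate_triangles(sample, k=3, seed=1))
```

The ownership check is what keeps the output complete and free of duplicates in this sketch: all three edges of a triangle are present in the partition whose triple equals the triangle's sorted colors, and no other partition reports it. The sketch makes no claim about load balancing, and 4-vertex patterns would need color quadruples and a different local enumeration step.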
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Farouzi, A., Zhou, X., Bellatreche, L., Malki, M., Ordonez, C. (2023). Parallel Pattern Enumeration in Large Graphs. In: Strauss, C., Amagasa, T., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2023. Lecture Notes in Computer Science, vol 14146. Springer, Cham. https://doi.org/10.1007/978-3-031-39847-6_32
DOI: https://doi.org/10.1007/978-3-031-39847-6_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-39846-9
Online ISBN: 978-3-031-39847-6
eBook Packages: Computer Science, Computer Science (R0)