Abstract
Understanding the decisions of tree-based ensembles and the relationships among them is pivotal for interpreting machine learning models. Recent attempts to mitigate the human-in-the-loop interpretation challenge have explored extracting the decision structure underlying the model, taking advantage of graph simplification and path emphasis. However, while these efforts enhance the visualisation experience, they may either produce a visually complex representation or compromise the interpretability of the original ensemble model. To address this challenge, especially in complex scenarios, we introduce the Decision Predicate Graph (DPG), a model-specific tool that provides a global interpretation of the model. DPG is a graph structure that captures the tree-based ensemble model and the details of the learned dataset, preserving the relations among features, logical decisions, and predictions while emphasising insightful points. Leveraging well-known graph-theory concepts, such as centrality and community, DPG offers additional quantitative insights into the model, complementing visualisation techniques, expanding problem-space descriptions, and opening diverse possibilities for extension. Empirical experiments demonstrate the potential of DPG on traditional benchmarks and in complex classification scenarios.
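The core idea can be illustrated concretely. The sketch below (hand-written toy decision paths; not the authors' DPG implementation) merges root-to-leaf paths from two tiny trees into one directed graph whose nodes are logical predicates and predicted classes, then applies standard graph-theoretic measures of the kind the abstract mentions:

```python
import networkx as nx

# Toy root-to-leaf decision paths from two small trees (assumed values,
# for illustration only); each step is a logical predicate or a class.
paths = [
    # tree 1
    ["petal width <= 0.80", "class setosa"],
    ["petal width > 0.80", "petal length <= 4.95", "class versicolor"],
    ["petal width > 0.80", "petal length > 4.95", "class virginica"],
    # tree 2
    ["petal length <= 2.45", "class setosa"],
    ["petal length > 2.45", "petal width <= 1.75", "class versicolor"],
    ["petal length > 2.45", "petal width > 1.75", "class virginica"],
]

G = nx.DiGraph()
for path in paths:
    for a, b in zip(path, path[1:]):
        # identical predicates from different trees merge into one node;
        # edge weights count how often one predicate follows another
        w = G.get_edge_data(a, b, default={"weight": 0})["weight"]
        G.add_edge(a, b, weight=w + 1)

# Betweenness centrality flags predicates that mediate many decision
# paths; label-propagation communities group predicates that tend to
# lead to the same prediction region.
central = nx.betweenness_centrality(G)
communities = list(nx.community.label_propagation_communities(G.to_undirected()))
print(max(central, key=central.get), len(communities))
```

Because identical predicates are shared across trees, the resulting graph exposes structure (central split points, clusters of co-occurring conditions) that individual trees cannot show on their own.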
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Arrighi, L., Pennella, L., Marques Tavares, G., Barbon Junior, S. (2024). Decision Predicate Graphs: Enhancing Interpretability in Tree Ensembles. In: Longo, L., Lapuschkin, S., Seifert, C. (eds) Explainable Artificial Intelligence. xAI 2024. Communications in Computer and Information Science, vol 2154. Springer, Cham. https://doi.org/10.1007/978-3-031-63797-1_16
DOI: https://doi.org/10.1007/978-3-031-63797-1_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63796-4
Online ISBN: 978-3-031-63797-1