
Exploring the interpretability of legal terms in tasks of classification of final decisions in administrative procedures

Published in Quality & Quantity

Abstract

Nowadays, diverse artificial intelligence techniques are applied to analyse datasets in the legal domain. Specifically, several studies aim to predict decisions in order to help the competent authority resolve a given legal process. However, AI-based prediction algorithms are usually black boxes, and explaining why an algorithm predicted a particular label remains challenging. Therefore, this paper proposes a 5-step methodology for analysing legal documents from the agency responsible for resolving administrative sanction procedures related to consumer protection. Our methodology starts with corpus collection, pre-processing, and TF vectorisation. Then, fifteen machine learning and deep learning algorithms are tested, and the best-performing one is selected based on quality metrics. Interpretability is emphasised, with SHAP scores used to explain predictions. The results show that our methodology contributes to understanding the decisive influence of legal terms and their connection to the decisions made by the competent authority. By providing tools for legal professionals to make more informed decisions, develop effective legal strategies, and ensure fairness and transparency in legal decision-making, this methodology has broad implications for legal areas beyond consumer disputes, including administrative procedures such as bankruptcy and unfair competition.
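Two of the steps named above, TF vectorisation and SHAP-based explanation, can be illustrated with a minimal, self-contained sketch. This is not the authors' pipeline (which uses fifteen learned classifiers over a real corpus): the vocabulary, tokens, and additive scoring function below are toy assumptions chosen so that exact Shapley values can be computed by brute force, which is the game-theoretic quantity that SHAP approximates.

```python
from collections import Counter
from itertools import combinations
from math import factorial

def tf_vectorise(tokens, vocabulary):
    """Map a tokenised document to raw term-frequency counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]

def shapley_values(value_fn, features):
    """Exact Shapley values of a set-valued score function.

    Enumerates every coalition, so cost grows as 2^|features|; SHAP exists
    precisely to approximate this efficiently for real models.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                # Standard Shapley weight: |S|! (n - |S| - 1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value_fn(set(coalition) | {f})
                                   - value_fn(set(coalition)))
        phi[f] = total
    return phi

# Hypothetical vocabulary and document (Spanish legal terms, for illustration only).
vocab = ["sancion", "multa", "consumidor"]
doc = ["multa", "consumidor", "multa"]
vec = tf_vectorise(doc, vocab)  # term frequencies: [0, 2, 1]

# Toy additive "classifier score": each present term contributes a fixed weight.
weights = {"sancion": 0.5, "multa": 1.0, "consumidor": -0.25}
score = lambda terms: sum(weights[t] for t in terms)
phi = shapley_values(score, ["multa", "consumidor"])
# For an additive model, each term's Shapley value equals its own weight.
```

Because the toy score is additive, the marginal contribution of a term is the same in every coalition, so its Shapley value collapses to its weight; for the non-additive models in the paper, the coalition average genuinely matters, and the SHAP score attributes the prediction across the legal terms in the document.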



Data Availability

The dataset is available at https://github.com/huvaso/Interpretability_Legal_Domain

Code Availability

For reproducibility purposes, the code is available at https://github.com/huvaso/Interpretability_Legal_Domain

Notes

  1. National Institute for the Defense of Competition and the Protection of Intellectual Property https://www.gob.pe/indecopi


Funding

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Author information


Contributions

The authors contributed equally to the drafting of this document.

Corresponding author

Correspondence to Hugo Alatrista-Salas.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Alcántara Francia, O.A., Nunez-del-Prado, M. & Alatrista-Salas, H. Exploring the interpretability of legal terms in tasks of classification of final decisions in administrative procedures. Qual Quant (2024). https://doi.org/10.1007/s11135-024-01882-1
