GaiusT: supporting the extraction of rights and obligations for regulatory compliance

Zeni, Nicola; Kiyavitskaya, Nadzeya; Mich, Luisa; Cordy, James R.; Mylopoulos, John

doi:10.1007/s00766-013-0181-8

Original Article
Published: 20 September 2013

Volume 20, pages 1–22 (2015)

Requirements Engineering

Nicola Zeni¹,
Nadzeya Kiyavitskaya¹,
Luisa Mich²,
James R. Cordy³ &
John Mylopoulos¹

Abstract

Ensuring compliance of software systems with government regulations, policies, and laws is a complex problem. Generally speaking, solutions to the problem first identify rights and obligations defined in the law and then treat these as requirements for the system under design. This work examines the challenge of developing tool support for extracting such requirements from legal documents. To address this challenge, we have developed a tool called GaiusT. The tool is founded on a framework for textual semantic annotation. It semiautomatically generates elements of requirements models, including actors, rights, and obligations. We present the complexities of annotating prescriptive text, the architecture of GaiusT, and the process by which annotation is accomplished. We also present experimental results from two case studies to illustrate the application of the tool and its effectiveness relative to manual efforts. The first case study is based on the US Health Insurance Portability and Accountability Act, while the second analyzes the Italian accessibility law for information technology instruments.

Notes

Named after Gaius Terentilius Harsa, a plebeian tribune who played an instrumental role in establishing, for the first time in ancient Rome, a formal code of laws through the Twelve Tables (462 BC).
http://wordnet.princeton.edu/.
http://thesaurus.reference.com/.
XMI is the standard language used for representing Unified Modeling Language (UML) models (http://www.omg.org/technology/documents/formal/xmi.htm).
http://www.w3.org/RDF.
http://www.w3.org/OWL.
http://www.magicdraw.com.
http://www.visual-paradigm.com.
http://protege.stanford.edu/.
http://lucene.apache.org/java/docs/.
http://wordnet.princeton.edu/.
http://thesaurus.reference.com/.
http://wikipedia.org/.
http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/.
http://www.codeproject.com/KB/cs/PDFToText.aspx and http://www.codeproject.com/KB/cs/DocToText.aspx.
http://jxml2owl.projects.semwebcentral.org/.
The stem of each keyword is considered.
http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt.
U.S. Public Law 104–191, 110 Stat. (1996).
http://www.whitehouse.gov/omb/memoranda/fy2007/m07-07.pdf; http://www.archives.gov/federal-register/write/handbook/ddh.pdf; http://www.archives.gov/federal-register/write/legal-docs/.
- Recall is a measure of how well the tool performs in finding relevant items \(\frac{TP}{TP+FN}\);
- Precision is a measure of how well the tool performs in not returning irrelevant items \(\frac{TP}{TP+FP}\);
- Fallout is a measure of how quickly precision drops as recall is increased \(\frac{FP}{FP+TN}\);
- Accuracy is a measure of how well the tool identifies relevant items and rejects irrelevant ones \(\frac{TP+TN}{N}\);
- Error is a measure of how much the tool is prone to accept irrelevant items and reject relevant ones \(\frac{FP+FN}{N}\);
- F-measure is a harmonic mean of recall and precision \(\frac{2 \times Recall \times Precision}{Recall+Precision}\)
where TP is the number of items correctly assigned to the category; FP is the number of items incorrectly assigned to the category; FN is the number of items incorrectly rejected from the category; TN is the number of items correctly rejected from the category; and N is the total number of items N = TP + FP + FN + TN.
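The six measures defined above can be computed directly from the four counts. The following is a minimal sketch, not part of GaiusT: the names `Counts`, `_ratio`, and `evaluate` are hypothetical, the sample counts are invented for illustration, and returning 0.0 for a ratio with a zero denominator is an assumption the note does not address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Per-category counts, named as in the note above (hypothetical type)."""
    tp: int  # items correctly assigned to the category
    fp: int  # items incorrectly assigned to the category
    fn: int  # items incorrectly rejected from the category
    tn: int  # items correctly rejected from the category


def _ratio(num: float, den: float) -> float:
    # Assumption: a ratio with a zero denominator is reported as 0.0.
    return num / den if den else 0.0


def evaluate(c: Counts) -> dict[str, float]:
    """Compute the six measures from the formulas in the note."""
    n = c.tp + c.fp + c.fn + c.tn  # N = TP + FP + FN + TN
    recall = _ratio(c.tp, c.tp + c.fn)
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "recall": recall,
        "precision": precision,
        "fallout": _ratio(c.fp, c.fp + c.tn),
        "accuracy": _ratio(c.tp + c.tn, n),
        "error": _ratio(c.fp + c.fn, n),
        "f_measure": _ratio(2 * recall * precision, recall + precision),
    }


if __name__ == "__main__":
    # Invented counts for illustration only; not results from the case studies.
    for name, value in evaluate(Counts(tp=8, fp=2, fn=2, tn=8)).items():
        print(f"{name}: {value:.3f}")
```

Because accuracy and error share the denominator N and their numerators partition all items, the two always sum to 1 whenever N is nonzero, which gives a quick sanity check on any reported pair.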
The complete results and resources are available at https://docs.google.com/file/d/0B7VFCr6GDi-sZE10MWJfRTRkYWc/edit?usp=sharing.
The English version carries the following note: The published text was translated into English by the Information Systems Accessibility Office at CNIPA (National Organism for ICT in Public Administration) with the sole aim of facilitating a better comprehension of it. The translation does not have official status; therefore, the only official text is the one published in the Official Gazette of the Italian Republic, in Italian.
Aspects are abstractions used to modularize cross-cutting concerns in software development. Examples include such concerns as security, distribution, functionality, and real-time constraints.
http://annozilla.mozdev.org/index.html.
http://www.cs.umd.edu/projects/plus/SHOE.
http://www.keeness.net/yawas/index.htm.
http://www.mindswap.org/2005/SMORE.

Acknowledgments

This work has been supported by the ERC advanced grant 267856 “Lucretius: Foundations for Software Evolution” (April 2011–March 2016), http://www.lucretius.eu.

Author information

Authors and Affiliations

Department of Information Engineering and Computer Science, University of Trento, Via Sommarive, 14, 38123, Povo, TN, Italy
Nicola Zeni, Nadzeya Kiyavitskaya & John Mylopoulos
Department of Industrial Engineering, University of Trento, Via Mesiano, 77, 38123, Trento, TN, Italy
Luisa Mich
School of Computing, Queen's University, Kingston, ON, K7L 3N6, Canada
James R. Cordy

Corresponding author

Correspondence to Nicola Zeni.

About this article

Cite this article

Zeni, N., Kiyavitskaya, N., Mich, L. et al. GaiusT: supporting the extraction of rights and obligations for regulatory compliance. Requirements Eng 20, 1–22 (2015). https://doi.org/10.1007/s00766-013-0181-8

Received: 22 November 2012
Accepted: 09 August 2013
Published: 20 September 2013
Issue Date: March 2015
DOI: https://doi.org/10.1007/s00766-013-0181-8
