GaiusT: supporting the extraction of rights and obligations for regulatory compliance

  • Original Article
  • Requirements Engineering

Abstract

Ensuring compliance of software systems with government regulations, policies, and laws is a complex problem. Generally speaking, solutions to the problem first identify rights and obligations defined in the law and then treat these as requirements for the system under design. This work examines the challenge of developing tool support for extracting such requirements from legal documents. To address this challenge, we have developed a tool called GaiusT. The tool is founded on a framework for textual semantic annotation. It semiautomatically generates elements of requirements models, including actors, rights, and obligations. We present the complexities of annotating prescriptive text, the architecture of GaiusT, and the process by which annotation is accomplished. We also present experimental results from two case studies to illustrate the application of the tool and its effectiveness relative to manual efforts. The first case study is based on the US Health Insurance Portability and Accountability Act, while the second analyzes the Italian accessibility law for information technology instruments.

Notes

  1. Named after Gaius Terentilius Harsa, a plebeian tribune who played an instrumental role in establishing, for the first time in ancient Rome, a formal code of laws through the Twelve Tables (462 BC).

  2. http://wordnet.princeton.edu/.

  3. http://thesaurus.reference.com/.

  4. XMI is the standard language used for representing Unified Modeling Language (UML) models (http://www.omg.org/technology/documents/formal/xmi.htm).

  5. http://www.w3.org/RDF.

  6. http://www.w3.org/OWL.

  7. http://www.magicdraw.com.

  8. http://www.visual-paradigm.com.

  9. http://protege.stanford.edu/.

  10. http://lucene.apache.org/java/docs/.

  11. http://wordnet.princeton.edu/.

  12. http://thesaurus.reference.com/.

  13. http://wikipedia.org/.

  14. http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/.

  15. http://www.codeproject.com/KB/cs/PDFToText.aspx and http://www.codeproject.com/KB/cs/DocToText.aspx.

  16. http://jxml2owl.projects.semwebcentral.org/.

  17. The stem of each keyword is considered.

  18. http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt.

  19. U.S. Public Law 104–191, 110 Stat. (1996).

  20. http://www.whitehouse.gov/omb/memoranda/fy2007/m07-07.pdf; http://www.archives.gov/federal-register/write/handbook/ddh.pdf; http://www.archives.gov/federal-register/write/legal-docs/.

    • Recall is a measure of how well the tool performs in finding relevant items \(\frac{TP}{TP+FN}\);

    • Precision is a measure of how well the tool performs in not returning irrelevant items \(\frac{TP}{TP+FP}\);

    • Fallout is a measure of how quickly precision drops as recall is increased \(\frac{FP}{FP+TN}\);

    • Accuracy is a measure of how well the tool identifies relevant items and rejects irrelevant ones \(\frac{TP+TN}{N}\);

    • Error is a measure of how much the tool is prone to accept irrelevant items and reject relevant ones \(\frac{FP+FN}{N}\);

    • F-measure is the harmonic mean of recall and precision \(\frac{2 \times Recall \times Precision}{Recall+Precision}\);

    where TP is the number of items correctly assigned to the category; FP is the number of items incorrectly assigned to the category; FN is the number of items incorrectly rejected from the category; TN is the number of items correctly rejected from the category; and N is the total number of items N = TP + FP + FN + TN.

  21. The complete results and resources are available at https://docs.google.com/file/d/0B7VFCr6GDi-sZE10MWJfRTRkYWc/edit?usp=sharing.

  22. English version reports the following note: The published text was translated in English by the Information Systems Accessibility Office at CNIPA—National Organism for ICT in Public Administration—with the sole aim of facilitating a better comprehension of it. The translation does not have official status; therefore, the only official text is the one published in the Official Gazette of the Italian Republic, in Italian.

  23. Aspects are abstractions used to modularize cross-cutting concerns in software development. Examples include such concerns as security, distribution, functionality, and real-time constraints.

  24. http://annozilla.mozdev.org/index.html.

  25. http://www.cs.umd.edu/projects/plus/SHOE.

  26. http://www.keeness.net/yawas/index.htm.

  27. http://www.mindswap.org/2005/SMORE.
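
The evaluation measures defined in the notes above (recall, precision, fallout, accuracy, error, and F-measure) can be computed directly from the four counts TP, FP, FN, and TN. The following is a minimal sketch of those formulas, not the authors' implementation: the names `Confusion` and `evaluation_measures` are hypothetical, and reporting 0.0 when a denominator is zero is an assumption that the article does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts for one annotation category (hypothetical container)."""
    tp: int  # items correctly assigned to the category
    fp: int  # items incorrectly assigned to the category
    fn: int  # items incorrectly rejected from the category
    tn: int  # items correctly rejected from the category

    @property
    def n(self) -> int:
        # Total number of items: N = TP + FP + FN + TN
        return self.tp + self.fp + self.fn + self.tn


def _ratio(num: float, den: float) -> float:
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    return num / den if den else 0.0


def evaluation_measures(c: Confusion) -> dict[str, float]:
    """Return the six measures from the notes, keyed by name."""
    recall = _ratio(c.tp, c.tp + c.fn)
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "recall": recall,
        "precision": precision,
        "fallout": _ratio(c.fp, c.fp + c.tn),
        "accuracy": _ratio(c.tp + c.tn, c.n),
        "error": _ratio(c.fp + c.fn, c.n),
        "f_measure": _ratio(2 * recall * precision, recall + precision),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only (not results from the article).
    example = Confusion(tp=8, fp=2, fn=2, tn=8)
    for name, value in evaluation_measures(example).items():
        print(f"{name}: {value:.2f}")
```

Because accuracy and error share the denominator N and their numerators together cover all four counts, they sum to 1 whenever N is nonzero, which gives a quick sanity check on any tabulated results.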

Acknowledgments

This work has been supported by the ERC advanced grant 267856 “Lucretius: Foundations for Software Evolution” (unfolding during the period of April 2011–March 2016) http://www.lucretius.eu.

Author information

Corresponding author

Correspondence to Nicola Zeni.

About this article

Cite this article

Zeni, N., Kiyavitskaya, N., Mich, L. et al. GaiusT: supporting the extraction of rights and obligations for regulatory compliance. Requirements Eng 20, 1–22 (2015). https://doi.org/10.1007/s00766-013-0181-8

  • DOI: https://doi.org/10.1007/s00766-013-0181-8
