Abstract
This chapter gives a reasonably comprehensive definition of, and motivation for, the various aspects of the generic XML language, and illustrates them with existing XML dialects, or vocabularies. We describe elements, attributes, child elements, and the hierarchical structure of XML. We explain what it means for an XML document to be “well-formed” and how to identify errors in a document’s structure. We discuss the use of namespaces and end with a brief discussion of validating documents against DTDs and XML Schema. Readers already familiar with all aspects of XML can skip this chapter and go directly to the functions for working with XML in R, which are the subject of Chapters 3, 4, 5, and 6.
Copyright information
© 2014 Springer Science+Business Media New York
About this chapter
Cite this chapter
Nolan, D., Lang, D.T. (2014). An Introduction to XML. In: XML and Web Technologies for Data Sciences with R. Use R!. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7900-0_2
DOI: https://doi.org/10.1007/978-1-4614-7900-0_2
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-7899-7
Online ISBN: 978-1-4614-7900-0
eBook Packages: Mathematics and Statistics (R0)