| XML & DTDs |
XML Anatomy

If you have ever done HTML coding, creating an XML document will seem very familiar. Like HTML, XML is based on SGML (Standard Generalized Markup Language) and designed for use with the Web. If you haven't coded in HTML before, you should find creating HTML documents easy after creating an XML document.

Note: If you are interested in learning HTML, please visit one of our HTML tutorials.

XML documents, at a minimum, are made of two parts: the prolog and the content. The prolog, or head of the document, usually contains the administrative metadata about the rest of the document. It holds information such as the version of XML used, the character set standard used, and the DTD, which is either linked as an external file or declared internally. The content is usually divided into two parts: the structural markup and the content contained in the markup, which is usually plain text.

Let's take a look at a simple prolog for an XML document:

<?xml version="1.0" encoding="iso-8859-1"?>

<?xml declares to a processor that this is where the XML document begins.
version="1.0" declares which recommended version of XML the document should be evaluated in.
encoding="iso-8859-1" identifies the standardized character set that is being used to write the markup and content of the XML.

Note: XML currently has two versions: 1.0 and 1.1. For more information, visit the W3C, the group that developed the XML standard. This tutorial deals primarily with XML version 1.0.

Note: For more information about standard character sets, see http://www.iana.org/assignments/character-sets

The structural markup consists of elements, attributes, and entities; however, this tutorial will primarily focus on elements and attributes.

Elements have a few particular rules:

1. Element names can be any mixture of characters, with a few exceptions. However, element names are case sensitive, unlike HTML.
For instance, <elementname> is different from <ELEMENTNAME>, which is different from <ElementName>. Note: The characters that are excluded from
element names in XML are
2. Elements containing content must have closing and opening tags.

<elementName> (opening) </elementName> (closing)

Note that the closing tag has the same name as the opening tag, but with a forward slash (/) in front of it.

The content within elements can be either elements or character data. If an element has additional elements within it, then it is considered a parent element; those contained within it are called child elements. For example,
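A minimal sketch of such nesting, using the two element names that the next paragraph refers to (the character data inside the child is a placeholder assumption):

```xml
<elementName>
  <!-- anotherElement is nested inside elementName, so it is the child -->
  <anotherElement>character data goes here</anotherElement>
</elementName>
```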
So in this example, <elementName> is the parent element. <anotherElement> is the child of elementName, because it is nested within elementName. Elements can have attributes attached to them in the following format:
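The general shape can be sketched as below. The names attributeName and value are illustrative placeholders, not names defined by this tutorial:

```xml
<!-- the attribute goes inside the opening tag, as name="value" -->
<elementName attributeName="value">content</elementName>
```

The attribute appears only in the opening tag; the closing tag repeats just the element name.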
While attributes can be added to elements in XML, there are a couple of reasons to use attributes sparingly:
We recommend using attributes for information that isn't absolutely necessary for interpreting the document or that has a predefined number of options that will not change in the future.

When using attributes in XML, the value of the attribute must always be contained in quotes. The quotes can be either single or double quotes. For example, the attribute version="1.0" in the opening XML declaration could be written version='1.0' and would be interpreted the same way by the XML parser. However, if the attribute value itself contains quotes, it is necessary to enclose the value in the other style of quotation marks. For example, an attribute name with a value of John "Q." Public would need to be marked up in XML as name='John "Q." Public', with single quotes enclosing the value because double quotes are used within it.

There are some rules regarding the order of opening and closing elements, but those will be covered later in the tutorial. For now, let's try creating a simple XML document. |
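Before moving on, the quoting rules above can be sketched in one small document. The people and person elements and the name attribute are hypothetical names chosen only for illustration:

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<people>
  <!-- double and single quotes are interchangeable delimiters -->
  <person name="John Public">first entry</person>
  <person name='John Public'>second entry</person>
  <!-- the value contains double quotes, so single quotes enclose it -->
  <person name='John "Q." Public'>third entry</person>
</people>
```

A parser should report the same name value for the first two person elements, since only the delimiter differs.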
|