XML & DTDs

Creating a DTD

Why would you want to create a DTD?

A DTD lets you create numerous documents and make sure that the information they contain is comparable. For example, all information about dates goes in tags called <date> rather than <time>, <dates>, <Date> or <DATE>. By creating XML documents that meet a DTD's requirements, you can also share information between institutions.

Here's a "real life" example of using DTDs. The Society of American Archivists and the Library of Congress created Encoded Archival Description (EAD) for the purpose of encoding finding aids. The first version of EAD adhered to SGML standards; however, as XML grew in popularity, newer versions became XML-compliant.

Archival institutions like the Center for American History or the Nettie Lee Benson Latin American Collection have created their finding aids using an EAD DTD. Automatic harvesters can use these finding aids to generate an online catalog of what is available at these archives. For an example of this, see Texas Archival Resources Online (TARO). TARO is a collection of all the finding aids of archival repositories in Texas that are encoded in EAD.

Note: This section shows you how to create an external DTD file. However, DTDs can also be placed internally in an XML document.
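For comparison, here is a rough sketch of what an internal DTD can look like. The declarations sit between the square brackets of the DOCTYPE declaration, in the same file as the markup. The message structure here is deliberately simplified for illustration and is not the DTD built in the rest of this section; the names and text are made up.

```xml
<?xml version="1.0"?>
<!-- The DTD sits between [ and ] inside the DOCTYPE declaration. -->
<!DOCTYPE message [
  <!ELEMENT message (sender, text)>
  <!ELEMENT sender (#PCDATA)>
  <!ELEMENT text (#PCDATA)>
]>
<message>
  <sender>Jane Doe</sender>
  <text>See you at the archive on Friday.</text>
</message>
```

An internal DTD travels with the document, but it cannot be shared across many files the way an external DTD can.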

Rules for Creating DTDs

When creating a DTD, you need to define all the elements and attributes you'll have in the XML documents. So let's create a DTD for our message XML documents.

Here is some syntax to remember when creating DTDs:

Symbol   Meaning                                                  Example
,        AND (in the order listed)                                header (sender, recipient*, date)
|        OR                                                       message (email | letter)
()       grouping; occurs only once unless followed by +, ? or *  (email | letter)
+        must occur at least once                                 (header, subject?, text+)
?        occurs either once or not at all                         header (sender, recipient*, date?)
*        can occur zero or more times                             (sender, recipient*, date)
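To see the occurrence symbols at work, here is a small self-contained sketch that uses an internal DTD so it fits in one file. It declares only a header element, and the sample names are invented. The header has two recipients and no date, which the * and ? symbols both allow.

```xml
<?xml version="1.0"?>
<!DOCTYPE header [
  <!-- sender exactly once, recipient zero or more times, date at most once -->
  <!ELEMENT header (sender, recipient*, date?)>
  <!ELEMENT sender (#PCDATA)>
  <!ELEMENT recipient (#PCDATA)>
  <!ELEMENT date (#PCDATA)>
]>
<header>
  <sender>Jane Doe</sender>
  <recipient>John Smith</recipient>
  <recipient>Ann Lee</recipient>
  <!-- date is omitted here, which the ? permits -->
</header>
```

Leaving out the sender, or adding a second date, would make this document invalid even though it would still be well-formed.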

Elements are declared in the following manner:

<!ELEMENT elementName ( elementParts ) >

Attributes are declared like this:

<!ATTLIST elementName attributeName attributeType attributeDefault >
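As a sketch of how the four parts of an attribute declaration line up, the example below declares two attributes on a simplified email element. The reply attribute matches the one used in this tutorial; the priority attribute and the plain-text content of email are hypothetical and are there only to show a second attribute type (CDATA, meaning free text) and a second kind of default (#IMPLIED, meaning optional with no default value).

```xml
<?xml version="1.0"?>
<!DOCTYPE email [
  <!ELEMENT email (#PCDATA)>
  <!-- elementName = email; attributeName = reply;
       attributeType = (yes | no); attributeDefault = "no" -->
  <!-- priority is a hypothetical attribute: free text, optional -->
  <!ATTLIST email
    reply    (yes | no) "no"
    priority CDATA      #IMPLIED>
]>
<email reply="yes">Please confirm receipt.</email>
```

If the reply attribute were left off the email tag, a validating parser would treat it as reply="no", because that is the declared default.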

So when creating a DTD for our message XML files, we would have something like this:

<!ELEMENT message ( email | letter ) >
<!ELEMENT letter ( letterhead, text ) >
<!ELEMENT email (header, subject?, text+) >
<!ATTLIST letter reply ( yes | no ) "no" >
<!ATTLIST email reply ( yes | no ) "no" >
<!ELEMENT header ( sender, recipient*, date?) >
<!ELEMENT subject ( #PCDATA) >
<!ELEMENT letterhead ( sender, recipient*, date ) >
<!ELEMENT sender ( #PCDATA ) >
<!ELEMENT recipient ( #PCDATA ) >
<!ELEMENT date ( #PCDATA ) >
<!ELEMENT text ( #PCDATA | salutation )* >
<!ELEMENT salutation ( #PCDATA ) >
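Here is a sketch of an XML document that should validate against the DTD above. It assumes the DTD has been saved next to the XML file under the name message.dtd; that file name, the addresses, and the message text are all made up for illustration.

```xml
<?xml version="1.0"?>
<!-- "message.dtd" is an assumed file name for the DTD shown above -->
<!DOCTYPE message SYSTEM "message.dtd">
<message>
  <email reply="yes">
    <header>
      <sender>jane@example.org</sender>
      <recipient>john@example.org</recipient>
      <date>2004-03-15</date>
    </header>
    <subject>Finding aid draft</subject>
    <text><salutation>Hi John,</salutation> the draft is ready for review.</text>
  </email>
</message>
```

The subject element could be dropped, and more text elements could be added, without breaking validity, because email is declared as (header, subject?, text+).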

Explanation of DTD structure


An important thing to remember when making DTDs is that the order of the element parts matters: unless you separate them with |, the elements must appear in your XML in the same order you list them in the declaration. So in a letter, the element <letterhead> must occur before the element <text>.
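The ordering rule can be sketched with a letter. As before, message.dtd is an assumed file name for the DTD in this section, and the names and text are invented.

```xml
<?xml version="1.0"?>
<!DOCTYPE message SYSTEM "message.dtd">
<message>
  <letter>
    <!-- Valid: letterhead comes before text, as declared -->
    <letterhead>
      <sender>Jane Doe</sender>
      <recipient>John Smith</recipient>
      <date>March 15, 2004</date>
    </letterhead>
    <text><salutation>Dear John,</salutation> thank you for your inquiry.</text>
    <!-- Moving the text element above letterhead would leave the
         document well-formed but no longer valid against the DTD. -->
  </letter>
</message>
```

The same applies inside the letterhead: sender, then any recipients, then date, in that order.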

There are some tools that will automatically generate DTDs from XML. HitSoftware, a W3Group member, has created an XML-to-DTD tool. So now that you've created a DTD, how do you validate your XML against it?

© 2004 Jacob Cleary | iSchool | UT Austin | webmaster