Assignment 3 - Markup and Metadata

What is the relationship of markup and metadata in a digital library?

A markup language consists of codes called tags that identify the elements within a text document. These elements may represent the structure of the document (e.g. sections, subsections, headings, paragraphs) or presentation attributes (e.g. page size, fonts, line spacing). Although the tags themselves do not carry specific formatting instructions, they act as access points to which formatting instructions can be applied. For example, placing text inside opening and closing <b> tags

<b>some text here</b>

does not inherently mean the text will be presented in a bold font. Instead, it marks the contents of the tag as a discrete element that can be accessed to apply the appropriate formatting instructions to that element within a particular context.
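
The point can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: ElementCollector, render, and the two rule sets are hypothetical names invented for this example, and only the standard library's html.parser module is used. The same <b> element is given two different presentations depending on the rules supplied by the context.

```python
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Collect (tag, text) pairs so formatting can be decided later."""

    def __init__(self):
        super().__init__()
        self.stack = []    # tags that are currently open
        self.pieces = []   # (innermost open tag or None, text)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        tag = self.stack[-1] if self.stack else None
        self.pieces.append((tag, data))


def render(markup, rules):
    """Apply per-tag formatting rules; text without a rule passes through."""
    parser = ElementCollector()
    parser.feed(markup)
    parser.close()
    return "".join(rules.get(tag, str)(text) for tag, text in parser.pieces)


markup = "Read <b>some text here</b> first."

# Two hypothetical contexts that treat the same <b> element differently.
plain_text_rules = {"b": lambda s: "*" + s + "*"}
shouting_rules = {"b": str.upper}

print(render(markup, plain_text_rules))  # Read *some text here* first.
print(render(markup, shouting_rules))    # Read SOME TEXT HERE first.
```

The markup never changes between the two calls; only the rules attached to the <b> element do.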

Metadata is "structured information that describes the attributes of information packages for the purposes of identification, discovery, and sometimes management" (Taylor, 2004, p.139). By creating structured information elements within an information package, metadata provides access points that allow for controlling administrative, structural, and descriptive functions. A markup language is one way to represent metadata that is both machine-readable and easily understandable by humans.

Figure 1 - Example of metadata represented in XML markup

<!DOCTYPE Archive SYSTEM "">
    <Metadata name="lastmodified">1054758400</Metadata>
    <Metadata name="gsdlsourcefilename">import/lostworld.html</Metadata>
    <Metadata name="gsdldoctype">indexed_doc</Metadata>
    <Metadata name="Language">en</Metadata>
    <Metadata name="Encoding">iso_8859_1</Metadata>
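
To show what "machine-readable" means in practice, the following Python sketch reads the Metadata elements of Figure 1 into a dictionary. It rests on two assumptions that are not in the figure: the elements are wrapped in an <Archive> root element so that the fragment is well-formed, and the DOCTYPE line is left out. The function name read_metadata is hypothetical.

```python
import xml.etree.ElementTree as ET

# The Metadata elements from Figure 1, wrapped in an assumed <Archive>
# root element so the fragment parses as well-formed XML.
FIGURE_1 = """<Archive>
    <Metadata name="lastmodified">1054758400</Metadata>
    <Metadata name="gsdlsourcefilename">import/lostworld.html</Metadata>
    <Metadata name="gsdldoctype">indexed_doc</Metadata>
    <Metadata name="Language">en</Metadata>
    <Metadata name="Encoding">iso_8859_1</Metadata>
</Archive>"""


def read_metadata(xml_text):
    """Return a dict mapping each Metadata name attribute to its text."""
    root = ET.fromstring(xml_text)
    return {el.get("name"): el.text for el in root.iter("Metadata")}


metadata = read_metadata(FIGURE_1)
print(metadata["Language"])            # en
print(metadata["gsdlsourcefilename"])  # import/lostworld.html
```

Because each value is labelled by its name attribute, a program can look up a field directly instead of guessing at its position in the text.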

Digital libraries use metadata, represented with markup language tags, to manage and use digital objects. Metadata can be categorized by its use. Some metadata assists in navigation by providing access to the structured markup (e.g. sections or paragraphs). Other metadata aids in resource discovery by adding searchable fields or categories. Finally, metadata can be used by digital libraries to define policy rules or provide information required for administration and preservation.

How are markup and metadata created in a digital library?

Metadata for a digital library (represented as markup) can be created explicitly by humans or extracted automatically. The main tradeoff between these two methods is typically one of reliability versus cost: metadata created by people tends to be more reliable but more expensive to produce, while automatic extraction is cheaper but typically less reliable. In either case, it is important to have well-defined standards to maintain consistent control of information packages. To this end, many metadata schemas have been developed. These schemas are designed to provide the details required by users within a particular domain. Some examples of metadata schemas include Dublin Core, METS (Metadata Encoding and Transmission Standard), and TEI (Text Encoding Initiative). When these (or other) metadata standards are used, the metadata of information packages can be exposed for harvesting by service providers, which can then create a single point of access to multiple collections.
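
As a rough illustration of how a shared schema makes local metadata usable outside its home collection, the Python sketch below maps two locally named fields (in the style of Figure 1) onto Dublin Core elements. The mapping table LOCAL_TO_DC, the function to_dublin_core, and the <record> wrapper element are assumptions made for this example, not part of any standard; only the Dublin Core element namespace is taken from the schema itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical mapping from local field names to Dublin Core elements.
LOCAL_TO_DC = {
    "Language": "language",
    "gsdlsourcefilename": "source",
}


def to_dublin_core(local_metadata):
    """Build a record holding only the fields that have a mapping."""
    record = ET.Element("record")  # assumed wrapper element
    for name, value in local_metadata.items():
        dc_name = LOCAL_TO_DC.get(name)
        if dc_name is None:
            continue  # no Dublin Core equivalent defined in this sketch
        child = ET.SubElement(record, "{%s}%s" % (DC_NS, dc_name))
        child.text = value
    return record


local = {
    "Language": "en",
    "gsdlsourcefilename": "import/lostworld.html",
    "gsdldoctype": "indexed_doc",
}
record = to_dublin_core(local)
print(ET.tostring(record, encoding="unicode"))
```

A service provider that understands Dublin Core can read the language and source of this record without knowing anything about the local field names, which is what makes a single point of access across collections feasible.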


Taylor, A. (2004). The organization of information. Westport, CT: Libraries Unlimited.
