Archivists need to invent new methods, policies, and standards for conducting archival enterprise and they must do so quickly. This paper outlines standard practices used in archival enterprise, presents arguments for change that are being discussed by archivists, and suggests policy areas which merit action from the archival community. These areas include intellectual property rights, preservation of endangered electronic records, and standards development for locating information and making it available on the Internet.
Archival practice and tradition
Archival practice encompasses understanding records of various types; knowing where they have come from, what they are made of, where they have been, what services they perform, how they can be organized, managed, and preserved; and making them available for use. Archivists not only care for the records of the past but also document the present. The terminology archivists associate with their basic tasks includes appraisal, arrangement and description, preservation, and access. Appraisal refers to determining the value of records, not in financial terms, but in terms of continuing evidential and informational value to the agencies and organizations which created them and to society. Appraisal encompasses the practice of selecting items for retention and the necessary practice of weeding out and destroying other items. We live in an age of abundance but not all records can be saved. Arrangement and description refers to techniques used to establish order and describe the content of the record with care and attention given to the original order developed by the creator in the course of everyday business, termed provenance. Provenance stems from the practice of diplomatics and is connected to the ability to authenticate records, the ability to assure their archival integrity and trustworthiness, and the ability to connect them to their creator. A prime principle in archival enterprise is to honor and preserve the record group as a whole, providing context for future users by maintaining how the originator of records perceived their order and relationship, whether the group is taken from a file drawer in a corporate office or from a site on the World Wide Web. Preservation refers to actions taken by archivists to lengthen the life of the records in order to make them available for use by researchers or other audiences seeking primary information.
Making records available for use refers to reference service and utilization of primary and secondary sources in archive reading rooms and, today, on the Internet. It also refers to outreach activities which announce the availability of collections and materials for use via printed finding aids, and, in an increasing, but still limited number of cases, delivering the records and finding aids across the Internet. Archivists are very much in the business of preserving today’s electronic record in collaboration with other communities of historians, records managers, librarians, and social scientists. These communities are concerned with issues of intellectual property rights, preservation of records, and public policy promoting free public access to records.
People who have written provocatively on these issues include David Bearman, President of the consulting firm Archives and Museum Informatics, Margaret Hedstrom, a leading educator in the field of archival enterprise, and Avra Michelson and Jeff Rothenberg, researchers at the National Archives and Records Administration and the RAND Corporation.
David Bearman advocates a revolution in the use of traditional archival methods. He maintains that archivists can’t possibly know or predict what will be of societal value in the future and that the physical materials we strive to preserve violate the laws of thermodynamics; they will all degrade and be destroyed eventually. He contends that traditional arrangement and description techniques are much too labor intensive and expensive to continue in their present form and that potential users of archives quite simply don’t use them. He advocates that archivists turn their attentions instead to developing models for best practices, defining Internet protocols and reference standards for making records accessible, assuming responsibility and accountability for records as vital evidence, creating highly specific metadata about transaction contexts, and proactively publicizing records or evidence, making them available to new audiences. (Bearman, 1995).
Margaret Hedstrom advocates the identification of approaches to access in the digital age that best satisfy users’ needs and improve the process and results of research at reasonable costs. Building blocks she advocates for integrated access systems include the new standard, Encoded Archival Description Document Type Definition (EAD DTD), which uses Standard Generalized Markup Language (SGML) to produce browsable and searchable on-line finding aids for use on the Internet. She encourages the profession to invent specific strategies and management methods that will protect the integrity of records while enhancing access to their contents, such as the development of metadata (information about information), which will provide knowledge of the relationships among documents, the circumstances that gave rise to the creation of the documents, their use, and the authenticating chain of custody from the originator to the current custodian of the documents. In order to preserve access to software-dependent records, she advocates preserving software as an important intellectual and cultural resource, perhaps by developing a network of software and hardware archival sites that will maintain the ability to read and make available potentially illegible electronic records. She encourages archives to retain free accessibility to the public, but to carefully document the copyright status of holdings and the provisions for requesting permission to impose user fees for personal, scholarly, or educational uses of archives. (Hedstrom, 1997).
Avra Michelson and Jeff Rothenberg refer to the emergence and use of information technology as this century’s most significant development affecting archival practice. They study it from the perspective of information technology’s impact on scholarly practice and recommend that archivists (1) establish a presence on the Internet/NREN, (2) make source materials available for research use on the Internet, (3) create documentation strategies for network-mediated scholarship and the development of research and education networks, (4) develop archival methods suitable for operation with NREN, (5) take user methods and future computational capacity into account in establishing policies on the management of software-dependent records, and (6) recognize and reward initiatives that advance archival management of electronic records and respond to scholarly use of information technology or promote a network-mediated archival practice. (Michelson & Rothenberg, 1992).
This paper asserts that archivists have the opportunity to draw upon ideas promoted by these leaders to promulgate new network-mediated archival practices in three interlaced areas: (1) use of the Internet for making documents available and accessible to wider audiences, (2) use of policies and information technologies to address the difficulties of preservation and migration of data, (3) use of standards for locating and describing information sources and increasing the precision and accuracy of locating information.
In order to use the Internet to make documents widely available, a balance between intellectual property rights and accessibility has to be more clearly drawn. In order to use information technologies for preservation, organizations need to develop policies which provide for archival preservation up front in systems design. In order to make archival information accessible, adoption of standards for locating and describing electronic records needs to take place. Archivists share an important role in the development of policies in these three areas which will lead to the intellectual preservation of the electronic record.
1). Use of the Internet
Deciding what materials are eligible for placement on the Internet challenges the archivist to find a balance between copyright laws and ethical conduct for archivists in protecting people’s rights to their property and the people’s right to access information. Archivists need to lobby for more liberal policies in the copyright arena that will balance the public’s access rights and the proprietors’ rights. An example of the limitations imposed on the day-to-day work of the archivist is described by Jodi L. Allison-Bunnell, who decided to create a microfilm preservation copy of the Katherine Anne Porter papers at the University of Maryland Libraries, where she is an archivist. (Allison-Bunnell, 1995).
This collection of papers is heavily used for research and represents a fragile, deteriorating paper collection that can best be preserved using traditional archival preservation microfilming practices. Archival microfilming results in three copies of the film being made, the preservation copy, the production master, and the use copy. In creating the preservation copy, the University archivists faced the dilemma of violating copyright law, which stipulates one copy can be made for preservation purposes, not three. It is generally agreed that although it is a violation of the copyright law, the three copies required using accepted preservation standards would be permitted provided the materials are deteriorating and the work is not currently in print.
But the archivists also wanted to extend copying of the collection to create a copy for interlibrary loan to the research community. This use constitutes publication and distribution of a work to the public through lending, a right specifically reserved for the copyright holder to protect the holder’s first right to publication. Many of the items contained in archival collections are unpublished works and, under current interpretation of law, it is not legal for archivists to circulate unpublished works through lending. This interpretation of the copyright law made by the Copyright Office protects the authors of unpublished works often found in manuscript collections in the form of correspondence, but it also interferes with the archivist’s basic mission to serve researchers and the public by making available source materials.
Correspondence copying and dissemination poses another set of problems related not just to the intellectual property rights of the letter writer, but also to that writer’s privacy rights. Would microfilming invade the privacy of the correspondents? Even though Katherine Anne Porter gave her collection to the University, the copyright holder of the correspondence is the individual letter writer and copyright restrictions apply to protect those writers. The Archivist Code of Ethics states: "Archivists respect the privacy of individuals who created, or are the subject of, documentary materials of long-term value, especially those who had no voice in the disposition of the materials." (Society of American Archivists, 1992). This need to be cautious has prompted the University of Texas at Austin to offer a database of individuals holding literary rights to the works of some authors. This provides some assistance to archivists and researchers needing to secure permissions in order to follow copyright law. Still, it is difficult for the archivist to act precisely within copyright law and serve the needs of the research community and the public by adhering to the required practice of contacting each letter writer to secure their permission to copy their correspondence.
Preservation microfilming is now the accepted standard for correct preservation of archival records. It is technologically feasible to digitize these materials as well, creating in the process both the accepted microfilm preservation copy and a digitized record for broad dissemination of electronic source material and images on the Internet. The interpretations concerning what materials are legally eligible for publication on the Internet, however, present a barrier for our institutions. Clearly archivists need to proactively work on liberalizing the copyright law in order to fully and legally participate as users of the Internet, particularly in the case of unpublished works. In order to make source materials available on the Internet, policies which provide a balance between the intellectual property rights of individuals and the dissemination and copying of information need to be developed and tested.
2). Use of policies and information technologies to address the difficulties of preservation and migration of data
Information needs to be preserved if it is to be available for use. Traditionally, archivists have used environmental monitoring controls to help extend the life of paper and, when paper documents have become too fragile, have turned to preservation microfilming, migrating the information to another format to preserve it. Stories of the 1960 U.S. Census Bureau data that can only be read on two computers in the world today, and the story of the computerized index to a million Vietnam war records entered on a hybrid motion picture film carrier that can no longer be read, inspire archivists to do more and to do it quickly.
Archival material is often the byproduct of administrative actions and transactions in large organizations. If the evidential value produced using electronic records is not ensured by building controls into the administrative and information systems in use today, law and scholarship will have no use for the information. A guarantee of trustworthiness, a true description of the circumstances of their creation, and a record of their path toward preservation need to be ironclad to provide impartial evidence of actions and provide a reliable account of those actions. Archivists are assigned the responsibility to ensure the integrity and authenticity of the material and preserve its meaning in context. In order to do this, archivists need to participate early on in the design of electronic systems and organizational practice to require that the ability to capture provenance is built into systems. Policies and practices that require elements defining the record’s creator, the record structure, the functions, and the administrative context of the record are essential to the electronic record in order for it to be useful for administrative, legal, and cultural purposes. Controls and policies need to be in place early on in system design and records creation, not just to guarantee authenticity, but also, in today’s environment where electronic information is distributed freely and copied many times over, to guarantee confidentiality and privacy. Archivists are in the position of being able to spell out the requirements for intellectual preservation using version control, digital date stamping, and encryption techniques for dealing with complex object-oriented systems as well as less complex records, and to work collaboratively with information technologists to put these controls in place early in the design process so that complete information can migrate from one system to the next.
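The controls described above can be illustrated with a minimal sketch. It assumes a record is a sequence of bytes and that the creating system captures the contextual elements named in this section (creator, structure, function, administrative context) along with a version number, a date stamp, and a SHA-256 digest. The function names `capture_record` and `is_unaltered` and the field names are hypothetical, chosen only for illustration; a system clock is not a trusted time-stamping service, and a digest detects alteration but does not by itself prove who created the record.

```python
import hashlib
from datetime import datetime, timezone


def capture_record(content: bytes, creator: str, structure: str,
                   function: str, administrative_context: str,
                   version: int = 1) -> dict:
    """Build contextual metadata for a record at the moment of creation."""
    return {
        "creator": creator,
        "structure": structure,
        "function": function,
        "administrative_context": administrative_context,
        "version": version,
        # Date stamp from the local system clock (an assumption; a real
        # system would need a trustworthy time source).
        "captured_at": datetime.now(timezone.utc).isoformat(),
        # Digest of the content, used later to detect alteration.
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def is_unaltered(content: bytes, metadata: dict) -> bool:
    """Return True if the content still matches the digest captured earlier."""
    return hashlib.sha256(content).hexdigest() == metadata["sha256"]


memo = b"Approved: transfer of series 12 to the archives."
metadata = capture_record(
    memo,
    creator="Records Office",            # hypothetical values
    structure="plain-text memorandum",
    function="records disposition",
    administrative_context="annual transfer schedule",
)
print(is_unaltered(memo, metadata))
```

Because the metadata travels with the record, a later custodian can repeat the digest check after each migration without access to the originating system.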
Functional requirements definition work addressing these issues is being explored at the University of Pittsburgh. This work sets out requirements to which organizations are asked to adhere, including being aware of the jurisdictional legal and administrative requirements for recordkeeping, developing policies that promote accountable recordkeeping, assigning organizational responsibility for the records, and implementing formal methods for the management and accuracy of complete documentation to accompany hardware and software. Included also are requirements for creating records, maintaining records, accessing records, exporting records, and redacting records. This work represents a possible model of best practices for business and industry to follow in creating not just a comprehensive corporate memory, but records that are trustworthy. Many archivists have become convinced that part of the solution to preserving the electronic record lies in being involved at the front-end of the records creation process and pre-determining, through appraisal assessments, which business units or governmental records are likely to have enduring and continuing value. The focus on appraisal has now shifted toward broad analysis of organizational functions and business processes that represent records that warrant preservation. Some of the most compelling reasons for working with system designers to fashion policies to manage and preserve electronic records include clear legal requirements to create and retain records, the need for ongoing access to quality records over time, the high degree of risk associated with poor recordkeeping, the high visibility of records offered to the public, the culture of stewardship, and the acceptance of the electronic record as the "official record."
Organizational behaviors that ignore or overlook the evidential and historical significance of electronic records need to change, and policies need to be developed that require the use of structural elements that offer sufficient description to meet not just system design requirements but archival requirements as well -- description of content and identifiers which place the information in context and provide authentication information to accompany the records. If records are not preserved and migrated to new information technologies so that they can be relied upon as authentic, accurate, and complete, they will be useless as evidence in disputes with trading partners, with taxing authorities, and in the courts. The value of these records to researchers and the public will be diminished.
The technical solutions for capturing and saving multidimensional files and complex records demand partnership with researchers and information scientists. The starting point for maintaining electronic records is that they be identifiable as records, but they also need to be legible and then accessible to users. This implies that software and hardware dependencies be acknowledged and solutions found to create software independence, and that appropriate security levels need to be maintained to copy or view a record so that its trustworthiness can be maintained as the data migrates to new systems.
Innovations diffusion and education are important in building solutions to preservation and migration problems challenging archives. Discussions are held on listservs and new research is inspired to build a documentary heritage from today’s electronic records. Archivists are becoming aware that their responsibility includes producing research in information technologies and policy formulation which will enable electronic records to become a part of the archival record accessible to all who want to draw upon it as authoritative information.
Archives have long held the position of being able to assure and authenticate records. Archivists need to continue this tradition by helping to develop security models that are appropriate for use with electronic records and by developing methods of documenting the custodial arrangements used as the electronic record migrates from one format to another over its lifetime of evolving information technologies. Assuring that the electronic record will be legible or useful once it is located is absolutely critical and more problematic. Some dated records are being kept with the hope that a future development will unlock the software and hardware dependencies and help recreate documentation to use the data. A more proactive action aimed at solving the problems of obsolescence is to migrate records to an open standards format, working to ensure that the standards maintain the content and the context of the record or migrating records to newer versions of the software as they are created, again maintaining the authentic content and the context of the original creation and its use. Complex records such as those on the World Wide Web present another challenge. Records published on the Web can have behaviors that include links to other records or embedded programs that act by themselves. What does it take to archive a World Wide Web page containing graphics, audio, and Java applets?
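One way to document custodial arrangements across migrations is a log in which each entry refers to the entry before it, so that a later change to any entry can be detected. The following is a minimal sketch under that assumption, not a description of any existing archival system; the function names `append_migration` and `chain_is_intact` and the field names are hypothetical.

```python
import hashlib
import json


def _entry_hash(entry: dict) -> str:
    """Hash an entry's fields in a stable (sorted-key) serialization."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_migration(log: list, source_format: str, target_format: str,
                     custodian: str, migrated_content: bytes) -> list:
    """Return a new log with one more migration event linked to the last."""
    entry = {
        "source_format": source_format,
        "target_format": target_format,
        "custodian": custodian,
        "content_sha256": hashlib.sha256(migrated_content).hexdigest(),
        "previous_entry_hash": log[-1]["entry_hash"] if log else "",
    }
    entry["entry_hash"] = _entry_hash(entry)  # computed before the key is added
    return log + [entry]


def chain_is_intact(log: list) -> bool:
    """Check that no entry was edited and that the links are unbroken."""
    previous = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["previous_entry_hash"] != previous:
            return False
        if _entry_hash(body) != entry["entry_hash"]:
            return False
        previous = entry["entry_hash"]
    return True


log = []
log = append_migration(log, "proprietary word processor", "SGML",
                       "University Archives", b"<doc>text</doc>")
log = append_migration(log, "SGML", "XML",
                       "University Archives", b"<doc>text</doc>")
print(chain_is_intact(log))
```

The log records only that a migration took place and what the result was; judging whether the content and context of the record survived the migration remains an archival decision.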
A great many questions are not yet answered and a sense of urgency about finding answers for successfully preserving the electronic record is driven by the knowledge that we could lose a large part of our current cultural and social context by failing to take action soon. But even once the problems concerning preservation and migration of the electronic record are brought under control, the equally important problem of locating the records needed for examination remains open for refinement.
3). Use of Standards
Standards are necessary. Fred Stielow, an archivist and information technologist, states that "The historical lessons are clear: Electronic preservation has a chance of success only at the place where standards exist and where we can reasonably project some constancy over time." (Stielow, 1992). Archivists are becoming versed in the value of open systems and the standards issued by the American National Standards Institute, the International Organization for Standardization (ISO), and the International Telegraph and Telephone Consultative Committee (CCITT), and are learning about the publishers’ use of Standard Generalized Markup Language and the Information Resources Dictionary System. Archivists face the daunting task of keeping abreast of the constant evolution of standards through active participation in their formulation while continuing to revamp existing systems in this changing environment.
One new standard evolving directly from archivists themselves is the Encoded Archival Description (EAD) Document Type Definition (DTD). This is a non-proprietary standard for machine readable finding aids created by archivists working with librarians, museum curators, and curators of manuscript repositories to support the use of archival collections. Finding aids are the tools generally constructed by archivists to describe bodies of material and lead researchers to materials of interest. EAD DTD goes beyond providing the information that has been provided by traditional machine readable cataloging records, the MARC record.
The EAD DTD standard accommodates registers and inventories of any length describing the full range of archival holdings, including textual and electronic documents, visual materials, and sound recordings. It supports the conversion of existing finding aids from print, word processing, and data base formats as well as the creation of new finding aids. This standard is designed to endure changing hardware and software platforms because it is based on a platform-independent standard: its markup conforms to SGML (ISO 8879), which is supported by a large number of software products available for a variety of operating systems.
The working group on the EAD DTD has produced clear application guidelines that encourage users barely acquainted with SGML to apply the DTD routinely in their normal work. The encoding scheme does not define or prescribe the intellectual content for finding aids, but it does define content designations. It identifies the essential data elements within traditional paper based finding aids and establishes codes and conventions necessary for capturing and distinguishing information in electronic form. The standard is designed to facilitate interchange and portability across the Internet and will provide a uniformity of presentation for information placed on the Internet. Related or complementary standards such as the Text Encoding Initiative Guidelines for Electronic Text Encoding and Interchange and US MARC formats used by libraries can be employed as appropriate with EAD DTD. As this standard evolves into a widely used and accepted tool in archival enterprise, more information will be delivered in on-line environments about archival holdings.
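To give a sense of what an encoded finding aid looks like, the sketch below builds a very small one with Python's standard library. The element names (`ead`, `eadheader`, `eadid`, `archdesc`, `did`, `unittitle`, `unitdate`, `origination`, `physdesc`, `dsc`, `c01`) follow published versions of EAD, but the structure is heavily simplified and would not validate against the DTD, which requires additional header elements. The output is written as XML rather than SGML, and the collection described is invented for illustration.

```python
import xml.etree.ElementTree as ET


def build_finding_aid(title: str, creator: str, dates: str,
                      extent: str, series_titles: list) -> str:
    """Return a simplified EAD-style finding aid as an XML string."""
    ead = ET.Element("ead")
    header = ET.SubElement(ead, "eadheader")
    ET.SubElement(header, "eadid").text = "example-0001"  # hypothetical identifier

    # Collection-level description: who created the material, what it is,
    # when it dates from, and how much of it there is.
    archdesc = ET.SubElement(ead, "archdesc", level="collection")
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unittitle").text = title
    ET.SubElement(did, "unitdate").text = dates
    ET.SubElement(did, "origination").text = creator
    ET.SubElement(did, "physdesc").text = extent

    # Description of subordinate components, one entry per series.
    dsc = ET.SubElement(archdesc, "dsc")
    for series_title in series_titles:
        component = ET.SubElement(dsc, "c01", level="series")
        component_did = ET.SubElement(component, "did")
        ET.SubElement(component_did, "unittitle").text = series_title

    return ET.tostring(ead, encoding="unicode")


print(build_finding_aid(
    title="Example Family Papers",       # invented collection
    creator="Example family",
    dates="1900-1950",
    extent="12 linear feet",
    series_titles=["Correspondence", "Photographs"],
))
```

The markup names the parts of the finding aid (title, dates, creator, series) without dictating what an archivist writes in them, which is the distinction between content designation and intellectual content drawn above.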
Retrieval aids such as metadata that assist in more precisely identifying records of interest to the user are also needed to make records accessible. With the explosive growth of material being made available on the Internet comes the need to be able to find the material. Locator services such as the Government Information Locator Service are evolving to assist with the problem of locating governmental source material.
Another effort is the creation of metadata records that describe networked resources using a small group of basic elements nicknamed the Dublin Core. Rather than depend upon a small number of abstractors, indexers, and catalogers to create electronic descriptions of electronic information, the Dublin Core effort is aimed at attracting legions of authors, creators, and information providers to describe the resources themselves at the time of creation in order to expand the ability to locate information with precision on the Internet. The Dublin Core is composed of thirteen elements including subject, title, author(s), publisher, other agents such as editors or transcribers, date, object type, form, identifier, relationship to other objects, source, language, and coverage. These elements all represent properties or characteristics of the object that can be determined at the time of creation.
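The idea of authors describing their own resources at the time of creation can be sketched as follows. The thirteen element names are taken from the list above, though their spelling as Python identifiers is an assumption made here. Rendering them as HTML `meta` tags with a `DC.` prefix is one embedding convention that has been used for Dublin Core, offered only as an illustration; the function name `to_meta_tags` and the sample values are hypothetical.

```python
from html import escape

# The thirteen elements listed above, spelled as identifiers for this sketch.
DUBLIN_CORE_ELEMENTS = (
    "subject", "title", "author", "publisher", "other_agent", "date",
    "object_type", "form", "identifier", "relation", "source",
    "language", "coverage",
)


def to_meta_tags(description: dict) -> list:
    """Render an author-supplied description as HTML meta tags.

    Every element is optional, but names outside the element set are
    rejected so that descriptions stay interoperable.
    """
    unknown = set(description) - set(DUBLIN_CORE_ELEMENTS)
    if unknown:
        raise ValueError("not a Dublin Core element: " + ", ".join(sorted(unknown)))
    return [
        '<meta name="DC.{}" content="{}">'.format(
            name, escape(description[name], quote=True))
        for name in DUBLIN_CORE_ELEMENTS
        if name in description
    ]


for tag in to_meta_tags({
    "title": "Annual report of the records office",  # invented values
    "author": "A. Archivist",
    "date": "1997",
    "language": "en",
}):
    print(tag)
```

A publishing tool could present these elements as a short form to the author and embed the resulting tags in the document, so that resource discovery tools find the description alongside the content.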
The Dublin Core encourages publishers and authors to provide metadata in a form that would facilitate automated resource discovery tools in locating sources and information and would also encourage the creation of network publishing tools that contain a template for easily embedding metadata elements at the time of records creation. Dublin Core work is a collaborative venture between government information providers, the library community, archivists, publishers, document vendors, the Internet Engineering Task Force, SGML vendors, and others working on text encoding and standards making. Participants who are reviewing and experimenting with the Dublin Core metadata include, among others, Los Alamos National Laboratory, Indiana University, O’Reilly Associates, Bunyip Systems, Georgia Institute of Technology, Library of Congress, OCLC Internet Resources Cataloging Project, SoftQuad, and Concordia University. Netscape and Microsoft are both backing a newer proposal that builds on the Dublin Core elements using eXtensible Markup Language (XML). This wide range of collaborators and participants share the goal of creating a framework for allowing sites to publish and share not just content, but guidance in finding the content with precision and accuracy.
Conclusion
Archivists need to move quickly to pioneer solutions to these problems before the cultural and social documentary heritage in our present information systems disappears. Current laws and regulations both impede progress and mandate that we develop functional requirements and routines for evaluating, capturing, and preserving electronic data. Failure to do so will be expensive, as illustrated by a Federal Court that cited the White House and the acting Archivist of the United States for civil contempt for failure to protect and preserve the computer records of the Bush and Clinton Administrations. Fines starting at $50,000 and rising to $200,000 a day will last until the Administration takes action to preserve those records. This tangible cost pales in comparison with the cost of not deliberately fostering the principles of archival enterprise in the Internet community even as technological change accelerates.
Three areas have been presented in this paper for action.
(1) Public laws designed to protect privacy and ownership of property must continue to be observed, but the archivists’ participation in their change and formulation needs to be increased to ensure that information of enduring value remains publicly available.
(2) The guiding public policy goals of electronic recordkeeping must include ensuring that records are trustworthy. Archivists need to engage in front-end systems design work and policy development to ensure that those items which represent evidential value can be counted on as authentic and trustworthy, legible, and accessible while at the same time holding private and confidential those materials which are governed by law.
(3) Standards need to be implemented to make information accessible across platforms and with the aid of tools that will add precision and accuracy to the process of locating information. Those who now routinely use the Internet to conduct discussions, disseminate their work, and conduct research expect that archival information will be available to them wherever they are.
The early days of the Internet may appear to future historians as a pre-literate society. Very little of it will remain for study whether of the Internet itself or the societal trends and changes that are occurring because of the Internet, if we do not act to change organizational behaviors and develop policies to protect and preserve archival records along with the new information technologies. Much must be done by archivists in influencing the direction of copyright law, in system design, and in standards development. The end of work is not one of the problems faced by the archival community.
Selected Bibliography
Allison-Bunnell, J. L. (1995). Access in the time of Salinger: Fair use and the papers of Katherine Anne Porter. American Archivist: 58(3), 270-282.
Alschuler, L. (1997). Netscape proposes metadata framework. WebWeek: 3(19), 25.
Barry, R. K. (August, 1996). Development of the Encoded Archival Description Document Type Definition. [On-line]. Available: http://www.loc.gov/ead/eadback.html
Bearman, D. (1995). Archival strategies. American Archivist: 58(4), 380-413.
Duranti, L. (1994). Commentary. American Archivist: 57(1), 36-41.
Hakala, J., Husby, O., & Koch, T. (April, 1996). Warwick framework and Dublin core set provide a comprehensive infrastructure for network resource description. Metadata Workshop II, Warwick, UK, [On-line]. Available: http://www.bibsys.no/warwick.html
Hedstrom, M. (1997). How do we make electronic archives usable and accessible? Documenting the Digital Age Conference, [On-line]. Available: http://dtda.mci.com/prsenta/hedstr02.htm.
Michelson, A. & Rothenberg, J. (1992). Scholarly communication and the information technology: Exploring the impact of changes in the research process on archives. American Archivist: 55(2), 236-315.
Rosenoer, J. (1997). Cyberlaw: The law of the Internet. New York: Springer-Verlag.
Society of American Archivists. (1992). Code of Ethics for Archivists. Chicago: Society of American Archivists. Also: [On-line]. Available: http://www.archivists.org/vision/ethics.html.
Stielow, F. J. (1992). Archival theory and the preservation of electronic media: Opportunities and standards below the cutting edge. American Archivist: 55(2), 332-343.
Weibel, S., Godby, J., Daniel, R., & Miller, E. (1995). OCLC/NCSA Metadata Workshop Report, [On-line]. Available: http://www.oclc.org:5046/oclc/research/conferences/metadata/dublin_core_report.
This page is created and maintained
by Sue Soy ssoy@ischool.utexas.edu
Last Updated 12/11/98
© Copyright 1996 Susan K. Soy
Please feel free
to copy and distribute freely for academic purposes with this notice and attribution.
All other rights reserved.