Introduction to Preservation for Libraries and Archives, LIS 392.P1
Course 44550, Fall 2000
Will Permanence Exist in the Digital Age?
Introduction
Throughout the archival, preservation, and records management world, concerned professionals discuss shared responsibility for the long-term care of digital assets. Our planning documents include statements promoting persistent access paths that will keep project data alive long after projects are completed, will preserve the new voices of the electronic age, and make them accessible over time. We marvel at the thought of terabytes of storage and are perplexed by the enormity of the increasingly complex datastores of digital and electronic information that contain the record of our being. We debate whether the traditional principles of appraisal apply when examining these digital objects and wonder if we can afford the task of caring for our heritage as documented in the electronic record. Auditors come to us with questions that cause us to consider risks such as the business risk of not acquiring materials, the risk of failing to make collections available to users, the risk of not preserving the collections in our archives, and the risk of exposing the collections to mutilation or accidental loss. The auditors are asking these questions in reference to the electronic objects and digital assets in our care and we are not yet prepared to answer their questions.
This paper discusses experiments and studies that explore long-term preservation issues in the digital age. It suggests that we are in the early stages of the effort to establish permanence for digital records and concludes that we have not yet reached the goal of ensuring permanence in the digital age. We are, in fact, still defining permanence and identifying the questions that need to be asked.
Permanence Defined
Archivists and others use the word permanent frequently, but have difficulty explaining what it means. Dictionaries define permanence using phrases like "a state of being permanent."[1] Permanence is related to words such as durability, lasting, abiding, remaining without essential change, and continued existence.[2] James M. O'Toole carefully considered the word in his often-referenced 1989 essay "On the Idea of Permanence." He points out that the word permanence, although frequently used by archivists, does not appear in basic glossaries used by archivists and that the word seems to carry different meanings within the archival profession. Records managers, however, do define permanent records with certainty. Permanent records, according to records management, are those that are kept indefinitely.[3] O'Toole makes the point that over time, permanence has acquired varied and shifting meanings for archivists and that archivists distinguish between (1) the permanence of the information and (2) the permanence of the original form of the material.
William J. Barrow's work studying permanent paper created an awareness in the archival profession of conservation practice and, with the availability of new chemical treatments and programs of study for conservators, the possibility of preserving records so that they are physically permanent became a reality. Along with the practice of conservation came the realization that the cost of creating permanent collections is high and that the task is an enormous one with no end in sight. Professionals began to weigh conservation costs against the worthiness of items, factoring in the subjective "intrinsic value" of an item. A special committee convened by the National Archives tried to define criteria for assessing items deserving of preservation in their original format because of their intrinsic value.[4] Discussion continues. Even today, we lack a consensus definition for the word permanence, and O'Toole's conclusion that the term has little clarity still holds true. No entry appears in "An Archival Glossary for the Millennium,"[5] recently prepared by Diane Vogt-O'Connor of the National Park Service. However, the word is referenced in her definitions of "durable naming conventions" and "archival quality," where she refers to "permanent and durable names and locations for long-lived electronic objects" and uses the descriptors "durable" and "long-lived."
What and how much electronic data can we afford to preserve for the long term? Will it be preserved in terms of (1) the information content or (2) the physical form? Some, like Microsoft's Nathan Myhrvold, believe the answer lies in collecting it all now and relying on future inventions and inventors to sort through these data and make those determinations.[6] Stewart Brand, the author of the Whole Earth Catalog, believes that "Silicon Valley hasn't stepped up to this, because, as they say, there is no business case for archives."[7] Others, like Terry Kuny, express fear that we have lost enormous amounts of digital information already and that we are creating the "digital dark ages" because of our failure to address real needs.[8] Kuny suggests that we take immediate action to perform digital triage, develop rescue operations, learn to cope with multiple document formats, find a method to manage rights and access controls for electronic objects, and work together to do digital preservation as a public good, even if it is a thankless task. He also advocates research studies that address both the information content and format preservation issues.
Three Studies Addressing Permanence of the Digital Record
Mapping the Problem --Time & Bits: Managing Digital Continuity
The Getty Conservation Institute and the Getty Information Institute joined forces to host a conference in 1998 to formulate answers to the provocative question "What are the long-term implications if we rely on current digital technology to preserve our cultural memory?" They invited individuals from the Long Now Foundation, members from both Getty Institutes, and individual technologists, archivists, and futurists to discuss the question. During the conference, this group segmented issues into four broad problem areas: (1) technical profiling, (2) socio-economic factors, (3) organizational contexts, and (4) legal constraints. They pointed out that, increasingly, the value of documents lies in their linked relationships with one another and to other resources and that we need strategies that will persist over time to preserve these links. They discussed the growing practice of protecting intellectual property using licensing agreements that fail to address the right to preserve the information or the right to use it after it has been preserved. Specific short-term suggestions for preservation action from this conference include saving files in the most common file formats (avoiding compression whenever feasible), using standard color bars for image capture, documenting and logging changes to digital objects, and creating and saving as much metadata as possible to describe the object. Long-term strategies need to recognize that bits and atoms are very different and require different preservation and protection strategies. The conference report explains that we are just beginning to understand the differences:
"Unlike print, which has had 500 years to create institutional contexts, digital documents are still in the early stages of innovation. Typically, in the first phase of innovation a new technology will imitate an older one; thus digitized documents are created from printed originals as a means to create new value by expanded access. But new social and organizational context made possible by digital documents are still emerging, only suggested by terms like "virtual community" and "distance education." In this sense, the organizational contexts which are responsible for digital documents remain to be defined and founded."[9]
This group further commented that cultures that pride themselves on the accumulation of knowledge from the past should be cognizant of preserving new knowledge for the future and need to keep focused on the ultimate goal to maintain continuity of content. Best practices, they conclude, will involve preservation of a variety of formats, including hard copy or analog magnetic storage as well as digital data.
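One of the conference's short-term suggestions, documenting and logging changes to digital objects, can be illustrated with a small sketch. The conference report does not prescribe a record layout, so everything here is an assumption made for illustration: the `log_change` helper, the field names, and the use of SHA-256 checksums to identify the object before and after the change.

```python
import hashlib
import json
from datetime import datetime, timezone


def log_change(log, object_id, before, after, agent, note):
    """Append one change record to log and return it.

    before and after are the object's bytes on either side of the change.
    All field names are hypothetical, chosen only for this example.
    """
    entry = {
        "object_id": object_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "note": note,
        "sha256_before": hashlib.sha256(before).hexdigest(),
        "sha256_after": hashlib.sha256(after).hexdigest(),
        "size_before": len(before),
        "size_after": len(after),
    }
    log.append(entry)
    return entry


# Hypothetical object: a text file whose line endings are normalized.
change_log = []
original = b"Minutes of the meeting\r\nItem 1\r\n"
migrated = original.replace(b"\r\n", b"\n")

entry = log_change(
    change_log,
    object_id="minutes-1998-02",
    before=original,
    after=migrated,
    agent="archivist",
    note="normalized line endings during format migration",
)
print(json.dumps(entry, indent=2))
```

Keeping the log as plain, self-describing records follows the same reasoning as the suggestion to prefer common file formats: the documentation should stay readable even if the tool that wrote it does not survive.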
The Internet Archive Experiment
In April of 1996, Brewster Kahle founded a digital Internet library called the Internet Archive. Kahle defines Internet libraries as libraries that store the Internet, and material beyond it, in digital form. He created the Internet Archive to preserve a record of digital materials and provide free, permanent access to this record to researchers, historians, scholars, and members of the general public.[10] Kahle uses the term permanent to mean forever. His Internet Archive is designed to be a vast collection of digital data that is protected from damage and destruction.
The potential threats to the Internet Archive are (1) the consequences of accidents and data degradation and (2) the inaccessibility of data as formats become obsolete. Kahle's shield against accidents is the strategy of maintaining copies of the Archive’s collections at multiple sites. To shield against the degradation of storage media over time, data will be migrated at least every 10 years at the Internet Archive sites. The Internet Archive is also collecting and preserving software and emulators so that future researchers can still read data whose formats have become obsolete and difficult to locate.
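The value of keeping copies at multiple sites can be shown with a small sketch. The sources cited here do not describe how the Internet Archive compares its copies, so the approach below is an assumption: each copy is identified by a SHA-256 checksum, the checksum held by a strict majority of sites is treated as correct, and any differing copy is overwritten. The site names, object name, and the `audit_and_repair` function are hypothetical.

```python
import hashlib
from collections import Counter


def sha256(data: bytes) -> str:
    """Return the SHA-256 checksum of a stored object as hex text."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(sites: dict, name: str) -> list:
    """Compare one object's copies across sites and overwrite the outliers.

    sites maps a site name to a dict of object name -> bytes.
    Returns the names of the sites whose copy was replaced.
    Raises ValueError when no strict majority of copies agree.
    """
    checksums = {site: sha256(store[name]) for site, store in sites.items()}
    winner, votes = Counter(checksums.values()).most_common(1)[0]
    if votes * 2 <= len(sites):
        raise ValueError(f"no majority copy of {name!r}")
    good_copy = next(sites[s][name] for s, c in checksums.items() if c == winner)
    repaired = []
    for site, checksum in checksums.items():
        if checksum != winner:
            sites[site][name] = good_copy
            repaired.append(site)
    return repaired


# Hypothetical three-site archive; site_c holds a damaged copy.
sites = {
    "site_a": {"page-0001": b"<html>election coverage</html>"},
    "site_b": {"page-0001": b"<html>election coverage</html>"},
    "site_c": {"page-0001": b"<html>electi\x00n coverage</html>"},
}

repaired = audit_and_repair(sites, "page-0001")
print(repaired)
```

With only two sites, a disagreement cannot be settled by vote and the function raises an error instead of guessing, which is one reason a third copy is useful.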
The Internet Archive collections include World Wide Web pages, FTP sites, and Usenet bulletin boards. Donors provide digital collections to the Archive and Web collections are acquired using a web crawling robot, software that automatically collects pages from publicly accessible Web servers. The robot examines each page for links to other pages that it can collect. When it finds more links on those pages, it follows them and collects them.
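The link-following behavior described above can be sketched in a few lines. This is not the Internet Archive's crawler; it is a minimal breadth-first illustration that assumes a caller-supplied `fetch` function and uses a hypothetical three-page site held in memory, so that it runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Collect pages breadth-first, starting from seed.

    fetch(url) returns the page's HTML, or None if it cannot be retrieved.
    Returns a dict mapping each collected URL to its HTML.
    """
    collected = {}
    queue = deque([seed])
    seen = {seed}  # URLs already queued, so no page is collected twice
    while queue and len(collected) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        collected[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return collected


# Hypothetical site standing in for publicly accessible Web servers.
SITE = {
    "http://example.org/": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "http://example.org/a.html": '<a href="/b.html">B</a> <a href="/missing.html">gone</a>',
    "http://example.org/b.html": '<a href="/">home</a>',
}

pages = crawl("http://example.org/", SITE.get)
print(sorted(pages))
```

The `seen` set is what keeps the robot from looping when pages link back to one another, and the `max_pages` limit is a simple stand-in for the selection decisions a real collection policy would have to make.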
"Born digital" collections like these are of great interest to librarians, preservationists, and archivists because they pose new and complex issues. Recently, the Library of Congress contracted with the Internet Archive. The two entities, working together, hope to gain experience in long-term curatorial practice while harvesting and archiving the "Web experience" of the historic U.S. national elections of 2000. The contents of nearly 150 sites are being harvested daily to document this election. The intent of the joint project is to use these data to help archivists create appropriate selection policies for digital media, to provide real experience in how to preserve society's cultural artifacts, and to continue the process of collecting and preserving the collective memory of society.
Brewster Kahle uses an almost desperate voice when discussing the need to ensure permanence in the digital age. He states:
"It will take many years before an infrastructure that assures Internet preservation becomes well established--and for questions involving intellectual property issues to resolve themselves. For our part, we feel that it is important to proceed with the collection of the archival material because it can never be recovered in the future. And the opportunity to capture a record of the birth of a new medium will then be lost."[11]
InterPARES Research Study, An International Collaborative Project
The InterPARES Study is a major international research initiative in which archival scholars, computer engineering scholars, national archival institutions, and private industry representatives are collaborating to develop the theoretical and methodological knowledge required for the permanent preservation of authentic records created in electronic systems.[12] The InterPARES Project is based at the University of British Columbia, Canada, under the leadership of Luciana Duranti and Terry Eastwood. This project is notable for several reasons, but is discussed here because of its highly collaborative nature and because the developers believe the long-term preservation of authentic electronic records is centered on the nature and meaning of the record itself. They believe the integral components that identify and authenticate a record have not changed just because the form is now electronic or digital. InterPARES Project leaders use the existing tools of archival science and diplomatics to develop knowledge essential to the permanent preservation of digital assets. The project intends to produce model strategies, policies, and standards that promote long-term preservation of the authentic electronic record.
Terry Eastwood is investigating appraisal. He references Harold Naugler's 1984 work that identified six important factors to consider in appraisal of electronic records:
Eastwood outlines the difficult and weighty issues related to legal factors the appraiser must consider by referencing the Guide for Managing Electronic Records From An Archival Perspective, a publication of the International Council on Archives, Committee on Electronic Records. This 1997 publication lists these legal factors for archivists to consider:
Archivists must convince organization leaders that they have a legitimate role in electronic records management processes and that they need to be included along with other advisors when decisions are made to create new data management systems. The magnitude of these tasks explains in part why so few organizations have instituted appraisal procedures for their electronic records and why best practices are still difficult to locate. It also underscores the value of the InterPARES Project, where appraisal and other areas, such as system design requirements, are being put into practice and evaluated using a wide range of record types and metadata models. The InterPARES Project addresses the issues of long-term accessibility and authentication of electronic records in testbeds that are drawn from actual archival records collections in the United States. Each one of the InterPARES Project groups is generating a long list of important issues for study and research.
Archivists know they need to be present and at work at the start of the electronic records life cycle, and many now argue for a continuum approach to appraisal that includes a re-appraisal process, usually at the point of data migration. Archivists agree, for the most part, that they work in a field that is closely related in principle and practice to records management. Discussion about this entwined and complex relationship causes Eastwood to raise even more questions:
Conclusion
This paper reports on only a fraction of the complex issues and topics discussed in the literature. Threads that weave through current literature suggest that we must jump in, must become proactive, and must interact with users and creators even though we do not yet have all of the answers that we need to ensure that permanence will exist in the digital age. The literature suggests that we must be highly selective in the appraisal process now that 80% of the information created in the workplace is created electronically.[16] Kuny concludes that the challenge our archivists, preservationists, and records professionals face is a sociological one that must be addressed at the "desktop," the point of creation by individuals.
The research needed, the implementation of the practices that will result from the research, and the ongoing maintenance of those practices will require financial support beyond what we can envision today. What we can do now is continue to press the point within our sphere of influence that collective memory is worth the rescue and preservation effort and cost. We can develop guidelines for appraisal and selection that are sensible and easy to follow, press for interim measures to preserve information, work together to create documentation standards, participate in collaborative research, and finally, promote the importance of preservation at every opportunity.
Archivists, preservation professionals, librarians, records managers, and historians understand the value and importance of today's digital assets and the peril we face if we do not preserve these objects for the future. It may take society longer to recognize the value of its collective memory assets. It could take expensive corporate losses before the value of preserving and protecting digital assets can be calculated, extensive litigation before rights management issues are resolved in workable and just ways, and inventions that we have not yet imagined to mold solutions to the very difficult problem of creating permanent, enduring, and long-lived digital age records. Until then, we need to encourage collaborative efforts using all of our various professional orientations to explore and define permanence and identify ways of ensuring it will exist in the digital age.
Footnotes
[1] Merriam-Webster Online Dictionary and Thesaurus. 2000. Available: http://www.m-w.com/ [Accessed 11/11/2000].
[2] Oxford English Dictionary Online. 2nd Edition, 1989. Available: http://dictionary.oed.com/cgi/entry/00175862 [Accessed 10/28/2000].
[3] James M. O'Toole. "On the Idea of Permanence." American Archivist, (52, Winter, 1989), p. 12.
[4] Ibid. p. 22.
[5] Diane Vogt-O'Connor. "An Archival Glossary for the Millennium." Cultural Resources Management (22.2, 1999), p. 46-48. Available: http://www.cr.nps.gov/crm [Accessed 11/05/2000].
[6] Nathan Myhrvold. "Why Preserve the Internet" Documenting the Digital Age Conference. San Francisco, February, 1997.
[7] Dan Gillmor. "Preserving Our History for Future" Mercury Center News. July 1, 2000. Available: http://www.mercurycenter.com/svtech/columns/gillmore/docs/dg070200.htm [Accessed 07/23/2000].
[8] Terry Kuny. "The Digital Dark Ages? Challenges in the Preservation of Electronic Information." Information Preservation News: A Newsletter of the IFLA Core Programme for Preservation and Conservation, (17, May, 1998). Available: http://www.ifla.org/VI/4/news/17-98.htm [Accessed 07/30/2000].
[9] Margaret MacLean and Ben H. Davis, Eds. Time & Bits: Managing Digital Continuity. (Santa Monica, CA, J. Paul Getty Trust 1998), pp. 15-19.
[10] The Internet Archive. http://www.archive.org/ [Accessed 11/11/2000].
[11] Brewster Kahle. "Preserving the Internet." Scientific American, March, 1997, http://www.sciam.com/0397issue/0397kahle.html [Accessed 11/11/2000].
[12] The InterPARES Project. Available: http://www.interpares.org [Accessed 11/11/2000].
[13] Harold Naugler, The Archival Appraisal of Machine Readable Records: A RAMP Study with Guidelines (Paris: UNESCO, 1984), p. 8.
[14] International Council on Archives, Committee on Electronic Records, Guide for Managing Electronic Records From An Archival Perspective (Paris: International Council on Archives, February 1997), p. 13.
[15] Terry Eastwood. Appraisal of Electronic Records: A Review of the Literature in English. (Vancouver, British Columbia: InterPARES Project, Appraisal Task Force), May 30, 2000. p. 12, 15.
[16] Charles Dollar. Authentic Electronic Records: Strategies for Long Term Access. (Chicago: Cohasset Associates), 2000.
Cited and Referenced Works
Conway, Paul. Preservation in the Digital World. Washington, DC: Commission on Preservation and Access, 1996.
De Stefano, Paula. "Digitization for Preservation and Access." In Preservation: Issues and Planning, Eds. Paul Banks and Roberta Pilette. Chicago: American Library Association, 2000.
Dollar, Charles. Authentic Electronic Records: Strategies for Long-Term Access. Chicago: Cohasset Associates, Inc., 2000.
Garrett, John, and Donald Waters. Preserving Digital Information: Report of the Task Force on Archiving Digital Information. Commissioned by the Commission on Preservation and Access and the Research Libraries Group, Inc. Mountain View, CA: Research Libraries Group, Inc., 1996.
Gilliland-Swetland, Anne. Enduring Paradigm, New Opportunities: The Value of the Archival Perspective in the Digital Environment. Washington, DC: Council on Library and Information Resources, 2000.
Gillmor, Dan. "Preserving Our History for Future." Mercury Center News. July 1, 2000. Available: http://www.mercurycenter.com/svtech/columns/gillmor/docs/dg07200.htm [Accessed 11/10/2000].
InterPARES Project: International Research on Permanent Authentic Records in Electronic Systems. Vancouver, British Columbia: University of British Columbia, 2000. Available: http://www.interpares.org [Accessed 11/11/2000]. http://is.gseis.ucla.edu/us-interpares/ [Accessed 11/25/2000].
International Council on Archives, Committee on Electronic Records. Guide for Managing Electronic Records From an Archival Perspective. Paris: International Council on Archives, February, 1997.
Internet Archive. Available: http://www.archive.org [Accessed 11/11/2000].
Kahle, Brewster. "Preserving the Internet." Scientific American. March, 1997. Available: http://www.sciam.com/0397issue/0397kahle.html [Accessed 11/11/2000].
Kuny, Terry. "The Digital Dark Ages? Challenges in the Preservation of Electronic Information." Information Preservation News: A Newsletter of the IFLA Core Programme for Preservation and Conservation 17 (May, 1998). Available: http://www.ifla.org/VI/4/news/17-98.htm [Accessed 07/30/2000].
Lyman, Peter, and Brewster Kahle. "Archiving Digital Cultural Artifacts: Organizing an Agenda for Action." D-Lib Magazine. July/August, 1998. Available: http://www.dlib.org/dlib/july98/07lyman.html [Accessed 11/11/2000].
MacLean, Margaret, and Ben H. Davis. Time & Bits: Managing Digital Continuity. Santa Monica, CA: J. Paul Getty Trust, Getty Conservation Institute, 1998.
Merriam-Webster Online Dictionary and Thesaurus. Available: http://www.m-w.com [Accessed 11/11/2000].
Myhrvold, Nathan. "Why Archive the Internet?" Documenting the Digital Age Conference. Presented by MCI, History Associates, Microsoft, and the National Science Foundation, San Francisco: February, 1997. No longer available at http://dtda.mci.com [Accessed 3/29/1997].
Naugler, Harold. The Archival Appraisal of Machine Readable Records: A RAMP Study With Guidelines. Paris: UNESCO, 1984.
O'Toole, James M. "On the Idea of Permanence." The American Archivist 52 (Winter, 1989): 10-25.
Oxford English Dictionary. 2nd Edition, 1989. Available: http://dictionary.oed.com [Accessed 10/28/2000].
Price, Laura, and Abby Smith. Managing Cultural Assets From a Business Perspective. Washington, DC: Council on Library and Information Resources, 2000.
Society of American Archivists. Basic Principles for Managing Intellectual Property in the Digital Environment: An Archival Perspective. Chicago: Society of American Archivists, 1997.
Vogt-O'Connor, Diane. "An Archival Glossary for the Millennium." Cultural Resources Management 22.2 (1999): 46-52. Available: http://www.cr.nps.gov/crm [Accessed 11/05/2000].
---. "Is the Record of the 20th Century at Risk?" Cultural Resources Management 22.2 (1999). Available: http://www.cr.nps.gov/crm [Accessed 11/05/2000].