Assignment 8 - Preservation Strategies for Digital Objects

Describe the issues involved with various preservation strategies for digital objects.
Are virtual machines the answer?

Just as we are still struggling to define what constitutes a document in the digital context (Buckland, 1997; Levy, 1994), we are also still struggling to determine the best strategies for preserving digital objects. The three primary strategies, refreshing, migration, and emulation, each address a particular piece of the problem, but none of them alone is a panacea for the preservation of digital objects.

Refreshing
The first and simplest strategy for preserving digital objects is refreshing. Refreshing periodically creates redundant copies of digital objects in order to ensure that the data on the storage medium remain valid. Digital objects are fragile and deteriorate rapidly, especially when compared with paper, but they can be reliably copied indefinitely. The ability to create exact duplicates is an advantage over paper, where copying can introduce differences, some of them far from subtle, depending on what is being duplicated and the method employed. However, this addresses only part of the problem. While refreshing can effectively maintain the integrity of the data, the data itself can quickly become meaningless as the technology used to access, format, and present it changes. Unless a digital storage medium is created that is not vulnerable to rapid deterioration, refreshing will remain a necessary component of any preservation strategy, but it is not sufficient by itself.
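
As an illustration only, the refreshing step described above can be sketched in Python: copy the object, then confirm with a checksum that the copy is bit-identical. The function names (sha256_of, refresh) and the choice of SHA-256 are assumptions made for this sketch, not part of any cited system.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, target: Path) -> str:
    """Copy source to target and verify that the copy is bit-identical.

    Hypothetical helper: returns the digest so it can be stored and
    checked again at the next refresh cycle.
    """
    expected = sha256_of(source)
    shutil.copyfile(source, target)
    if sha256_of(target) != expected:
        raise IOError(f"refresh of {source} failed verification")
    return expected
```

Note that the check confirms only that the bits survived the copy. It says nothing about whether any software can still interpret them, which is the limitation discussed above.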

Migration
A closely related strategy that attempts to address the problem of hardware and software obsolescence is migration. Like refreshing, it involves copying the digital objects, but in migration the objects are also translated into the formats of current technology. On a small scale (in both time and size), migration can be an effective strategy, but it has limitations. Introducing new hardware or software means that the copy may no longer be identical to the original. The other major problem is managing the process. As the scope of the documents being preserved grows, so do the number of systems across which they must be migrated and the potential for compatibility issues.
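
A small, hedged example of why a migrated copy may no longer be identical: moving text from a legacy character encoding to a current one changes the stored bytes even when the content survives, and a careless migration can lose content outright. The helper names (migrate_text, content_preserved) are hypothetical, and text encodings stand in here for file formats in general.

```python
def migrate_text(data: bytes, old_encoding: str, new_encoding: str,
                 errors: str = "strict") -> bytes:
    """Translate encoded text from an old encoding into a new one."""
    text = data.decode(old_encoding)
    return text.encode(new_encoding, errors=errors)


def content_preserved(original: bytes, old_encoding: str,
                      migrated: bytes, new_encoding: str) -> bool:
    """Check that both versions decode to the same characters."""
    return original.decode(old_encoding) == migrated.decode(new_encoding)


# A short document stored in a legacy encoding (assumed for illustration).
legacy = "café".encode("latin-1")

# Faithful migration: the bytes change, the content does not.
migrated = migrate_text(legacy, "latin-1", "utf-8")
print(migrated != legacy)
print(content_preserved(legacy, "latin-1", migrated, "utf-8"))

# Lossy migration: the target cannot represent every character.
lossy = migrate_text(legacy, "latin-1", "ascii", errors="replace")
print(content_preserved(legacy, "latin-1", lossy, "ascii"))
```

Because a byte-for-byte comparison is no longer meaningful after migration, each migration needs its own content-level check, which is part of what makes the process hard to manage at scale.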

Emulation (Virtual machines)
One strategy that tries to avoid the complexities of translation introduced by a migration scheme is emulation. Emulation relies on a Universal Virtual Computer (UVC) that is able to decode data independently of hardware or software configurations. This strategy has two main approaches. One approach involves archiving the program used to manipulate the data as well as the data itself. Lorie (2001) correctly points out that this method requires a non-trivial architecture description, is overly burdensome for data archiving, and isolates the data within the emulated environment. The other approach, which Lorie advocates, is to include methods, available to an application within the UVC, that extract the data and metadata and return them in an understandable way.

The data archival approach is an improvement over program archiving in that it greatly simplifies the components that must be emulated and provides a way to transport usable data across systems once it has been extracted. However, there are still considerable obstacles to overcome before this becomes a viable solution. First, additional technical work is needed to demonstrate the validity of the concepts, even beyond Lorie's initial work in this area (2002). Second, hardware manufacturers and software developers would need to agree to perform the necessary work for their respective machines. Until these factors are addressed, emulation as a preservation strategy will remain just a promising idea.
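
To make the data archival idea more concrete, here is a toy sketch in Python. It is not Lorie's UVC specification: the three-instruction machine, the instruction names (READ, EMIT_TEXT, EMIT_INT), and the record layout are all invented for illustration. The point is only that a small decoder program is archived with the bytes, and any future machine that can run the tiny interpreter can recover tagged, usable data.

```python
def run_uvc(program, data):
    """Interpret a decoder program against archived bytes.

    Hypothetical toy machine with one register; returns a list of
    (tag, value) pairs describing the decoded data.
    """
    pos = 0          # read cursor into the archived bytes
    register = b""   # single working register
    output = []
    for op, arg in program:
        if op == "READ":         # load the next `arg` bytes
            register = data[pos:pos + arg]
            pos += arg
        elif op == "EMIT_TEXT":  # emit register as ASCII text, tagged `arg`
            output.append((arg, register.decode("ascii")))
        elif op == "EMIT_INT":   # emit register as a big-endian integer
            output.append((arg, int.from_bytes(register, "big")))
        else:
            raise ValueError(f"unknown instruction: {op}")
    return output


# The archived object: raw bytes whose layout is otherwise undocumented.
archived = b"LORIE" + (2001).to_bytes(2, "big")

# The decoder archived alongside it, written for the toy machine.
decoder = [
    ("READ", 5), ("EMIT_TEXT", "author"),
    ("READ", 2), ("EMIT_INT", "year"),
]

print(run_uvc(decoder, archived))
```

Only the interpreter has to be rewritten for each new platform, which is why this approach depends on agreement about the virtual machine's definition.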

The issues involved in the preservation of digital objects are complex. At a minimum, preservation must involve some strategy to maintain the integrity of the data (refreshing or migration) until a more stable storage medium is discovered. It should also include some method of translating the data into a usable form as technology continues to change rapidly (migration or emulation). Whatever method or combination of methods is chosen, the choice should ultimately be guided by how the data will be used (Levy, 1998).

References

Buckland, M., What is a document? JASIS, 48(9), 1997. pp. 804-809.

Levy, D., Heroic measures: Reflections on the possibility and purpose of digital preservation, Digital Libraries, 1998. pp. 152-161.

Levy, D., Fixed or fluid? Document stability and new media. Proceedings of the European Conference on hypertext technology '94, pp. 24-31. 1994. Edinburgh, Scotland: ACM.

Lorie, R., Long term preservation of digital information, JCDL, 2001. pp. 346-352.

Lorie, R., A methodology and system for preserving digital data, JCDL, 2002. pp. 312-319.
