




Contact Info:

Digital Library Federation
1752 N St NW
Suite 800
Washington DC 20036

Digital Preservation

Building on the work of the Commission on Preservation and Access (CPA), CLIR and the DLF remain committed to maintaining long-term access to the digital intellectual and scholarly record. They have a particular interest in practical initiatives and in research into the most poorly understood areas. This page links to CLIR, DLF, and CPA preservation initiatives, research reports, and related information resources.

Preservation initiatives

The Global Digital Format Registry

Academic institutions are beginning to create digital institutional repositories in which the intellectual capital of a college or university can be preserved for reuse, gathering up not just the articles and books of the completed scholarly endeavor but also the data sets, presentations, and course-related materials that faculty generate. As this process moves forward, it becomes obvious that these institutions also need to save information about the many computer formats in which this mass of material is expressed.

In fall 2002, a small group of Digital Library Federation (DLF) members—spearheaded by Harvard University and the Massachusetts Institute of Technology (MIT)—began the work of designing a central, shared registry of digital formats that all participating institutions may one day contribute to and use.

Libraries, which take naturally to such collaborative work, knew immediately that the need was limited neither to DLF members nor to U.S. institutions. DLF therefore reached out to others in the field. By the time the first face-to-face meeting was held in early 2003, the Format Registry Team had secured interest and representation from the Bibliothèque nationale de France, Harvard University, the Joint Information Systems Committee of the Higher and Further Education Councils in the United Kingdom, JSTOR, the Library of Congress, MIT, the National Archives and Records Administration, the National Archives of Canada, the National Institute of Standards and Technology, New York University, the Online Computer Library Center, the University of Pennsylvania, Stanford University, the British Library, the California Digital Library, the Internet Architecture Board, the Internet Engineering Task Force, the Research Libraries Group, and the Public Record Office in the United Kingdom.

Over the course of two long meetings and a flurry of e-mails, we have made remarkable progress toward the design of a global digital format registry. We have developed examples of how such a registry would be used ("use cases") in order to test the emerging design; decided what constitutes a "format" and what is merely a derivative form of a format; and articulated a series of services that can be built on top of an authoritative central registry. For example, the registry could be used to verify that what one takes into a repository is in fact in the format that the human depositor says it is (better to know this at the point of ingest than to discover much later that a set of files described as TIFF images is actually something quite different). Or the service could tell the user that the format of a file that has just been loaded is unknown to it and therefore needs to be registered (an act that benefits all users of the service).
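The two services just described, checking a depositor's claim at ingest and flagging an unknown format, can be illustrated with a minimal sketch. It is not the registry's design. The in-memory REGISTRY table and the names identify and check_at_ingest are hypothetical, and matching on leading byte signatures is only one simple identification technique, assumed here for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for a format registry: each format name maps
# to the leading byte signatures ("magic numbers") that files of that format
# begin with. A real registry would record far more than this.
REGISTRY: Dict[str, List[bytes]] = {
    "TIFF": [b"II*\x00", b"MM\x00*"],  # little-endian and big-endian headers
    "PNG": [b"\x89PNG\r\n\x1a\n"],
    "PDF": [b"%PDF-"],
}


def identify(data: bytes) -> Optional[str]:
    """Return the registered format whose signature starts the data, or None."""
    for name, signatures in REGISTRY.items():
        if any(data.startswith(sig) for sig in signatures):
            return name
    return None


def check_at_ingest(data: bytes, claimed: str) -> str:
    """Compare the depositor's claimed format with what the bytes indicate."""
    detected = identify(data)
    if detected is None:
        # The format is unknown to the registry and is a candidate for registration.
        return "unknown: format needs to be registered"
    if detected == claimed.upper():
        return "ok"
    return f"mismatch: claimed {claimed}, detected {detected}"
```

In this sketch, a file that begins with a PDF header but is deposited as TIFF is reported as a mismatch at ingest, and bytes that match no signature are reported as unknown so that the format can be registered.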

In August 2003, we presented our emerging design at the International Federation of Library Associations and Institutions conference in Berlin. Attendees strongly supported such a service and offered some valuable feedback on how it must work—and in how many places it must be housed—to be trusted and used on a global scale.

Much work remains to be done to build this service out, to establish a business model to sustain it, to develop a prototype and test it in the real world, and to create the mechanisms to populate and use it. Nonetheless, the work that has been done in the very lively planning stages suggests that we are well on our way to filling a critical gap in our international digital preservation architecture.

Currently, the University of Pennsylvania is developing a prototype registry service to test some design hypotheses for a format registry. Fred (Format Registry Demonstration) allows interested parties to contribute, view, and maintain format information. Fred is not itself intended to be the global format registry, but rather a testbed for ideas on how to design, build, and maintain such a registry. For more information, see Building a robust knowledge base for digital formats (John Mark Ockerbloom, University of Pennsylvania). Further information on the GDFR Initiative and its participants is available at http://hul.harvard.edu/gdfr/.

National Digital Information Infrastructure and Preservation Program

In December 2000, the U.S. Congress directed the Library of Congress to create the National Digital Information Infrastructure and Preservation Program (NDIIPP) and provided up to $100 million for this purpose. The primary goals of NDIIPP are to identify and preserve significant digital content that is at risk; support the improvement of tools, models, and methods for digital preservation; develop a national digital collection and preservation strategy; and establish a network of partners committed to these goals. Since its inception, NDIIPP has worked closely with DLF and its allies to promote mutual digital preservation objectives.

Eight lead institutions, assisted by over two dozen collaborating entities, have joined NDIIPP in developing a national partnership to capture digital materials of cultural and historical value and to improve techniques for preserving digital content over time. Many of these organizations are DLF members. The preservation projects include such topics as establishing national standards to preserve digital public television programs, developing web archiving tools to preserve government and political information, and saving at-risk content related to Southern culture and history.

NDIIPP and the National Science Foundation (NSF) joined forces to establish a Digital Archiving and Long-Term Preservation (DIGARCH) research program, and awarded $3 million among 10 projects researching the long-term management of digital information. Three projects are led by DLF member institutions. DIGARCH funds cutting-edge research in digital preservation with emphasis on digital repository models; tools, technologies, and processes; and organizational, economic, and policy issues. Projects range from preserving complex data types, like oceanographic data from deep-sea submersibles, to planning for preservation requirements; and from automating metadata capture to creating incentives for creators to deposit with archives. All DIGARCH projects are expected to produce study results in one year.

The Section 108 Study Group, named after the section of the U.S. Copyright Act that provides limited exceptions for libraries and archives, is reexamining those exemptions in light of how digital technologies have transformed the ways that copyrighted works are created, disseminated, preserved, archived, and made accessible. Current copyright law does not adequately address many of the issues unique to digital media. The group will recommend how to revise copyright law in order to balance the needs of copyright holders, libraries, and archives.

Preservation of electronic scholarly journals
A practical initiative to identify and build consensus around appropriate archival practices and to facilitate the development of lasting digital archival repositories for electronic scholarly journals. The pages include a web site for a program funded by the Andrew W. Mellon Foundation to plan long-term archival solutions for electronic scholarly journals.

Research reports

Risk Management of Digital Information: A File Format Investigation (June 2000)
This report by Gregory W. Lawrence, William R. Kehoe, Oya Y. Rieger, William H. Walters, and Anne R. Kenney is based on an investigation conducted by Cornell University Library to assess the risks to digital file formats during migration. The report includes a workbook that will help library staff identify potential risks associated with migrating digital information. Each section of the workbook opens with a brief issue summary; this is followed by questions that will guide users in completing a risk assessment. The appendixes also include two case studies for migration: one for image files and the other for numeric files.
Avoiding Technological Quicksand: Finding a Viable Technical Foundation for Digital Preservation (January 1999)
A report by Jeff Rothenberg, computer scientist at the RAND Corporation, documenting and assessing existing models of digital archiving and developing his theories of software emulation.
Into the Future: On the Preservation of Knowledge in the Electronic Age
Film (including accompanying discussion guide and a compendium of other resources) produced by CLIR to inform a variety of publics about issues of preservation in the electronic age, to articulate what might be at stake for our society, and to point to ways that individuals and groups can work together to find solutions to the challenges posed.

Preserving Digital Information, Report of the Task Force on Archiving of Digital Information (May 1996)
Report by Donald Waters and John Garrett recommending specific actions that the Commission on Preservation and Access and the Research Libraries Group, Inc., and other organizations could undertake to help develop reliable systems for preserving access to digital information. A considerable portion of the report explores the nature of "information objects in the digital landscape." The report proposes creation of a distributed structure for collecting digital information resources, protecting their integrity over the long term and retaining them for future use. It concludes that the significant challenges in preserving digital information are not so much organizational or technological as legal and economic.

Related information resources

Preserving Access to Digital Information (PADI)
We are pleased to recommend this subject gateway to digital preservation resources.
