Draft benchmark for digital reproductions of printed books and serial publications
30 July 2001
This document recommends a minimum benchmark for digital reproductions of printed texts and serial publications. It also outlines the importance, rationale, and implications of such a benchmark.
Work defining the benchmark grew out of the DLF's investigation into the need for and functional specification of a service through which libraries could register information about the digitally reformatted book and serial publications they had produced (see http://www.diglib.org/collections/reg/reg.htm).
Although the registry service envisaged by the DLF is not exclusive (it will be able to record information about the large and valuable legacy of digitized books and serials) its existence and use will provide an opportunity to identify and build consensus around minimum characteristics that might be expected generally of a faithful reproduction created from this day forward. The benchmark recommended in this document is intended, in part, to launch that consensus-building process.
The recommendation has been prepared by a working group of the Digital Library Federation (DLF) and is being circulated to DLF member institutions for their review, comment, and ultimate endorsement (a report of the group's work is available from the DLF's website).
The review period will last three months, from 31 July to 31 October 2001, during which time comments should be sent to email@example.com.
Should the recommendation prove acceptable to the DLF membership, the benchmark as revised will be posted on the DLF website and announced to the broader community, together with an indication of the DLF's endorsement.
1. What is a preservation digital master?
A preservation digital master is a digital facsimile that is a faithful rendering of a printed text (including texts with illustrations and rare and early printed texts).
A preservation digital master must include digital page images.
The page images of a preservation digital master will meet or exceed the following minimum characteristics:
Preservation digital masters must have descriptive, structural and administrative metadata, and the metadata must be made available in well-documented formats. Structural metadata must include page level information e.g. as required for page turning and related application software. A minimum list of structural metadata elements is recommended in the appendix.
Preservation digital masters may include machine-readable text as follows:
corrected OCR that is below 99.995% accurate,
corrected text (keyboarded or OCR) that is at or above 99.995% accurate
As well as:
text that is encoded (at any level, e.g. as specified in TEI Text Encoding in Libraries. Guidelines for Best Encoding Practices. Version 1.0, July 30, 1999)
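The 99.995% threshold above is easier to judge as an error budget: it allows at most 5 errors per 100,000 characters, or 1 in 20,000. The following is a minimal sketch of that arithmetic, not part of the benchmark itself. It assumes accuracy is measured per character (the recommendation does not say how accuracy is to be measured), and the function names are hypothetical.

```python
def meets_accuracy_threshold(total_chars: int, error_chars: int) -> bool:
    """Return True if character accuracy is at or above 99.995%.

    Assumption: accuracy = (total_chars - error_chars) / total_chars,
    counted per character. Integer arithmetic is used so that texts
    sitting exactly on the threshold are not misjudged by rounding.
    """
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    if not 0 <= error_chars <= total_chars:
        raise ValueError("error_chars must be between 0 and total_chars")
    # accuracy >= 99.995%  is equivalent to  errors / total <= 5 / 100000
    return error_chars * 100_000 <= total_chars * 5


def corrected_text_category(total_chars: int, error_chars: int) -> str:
    """Place corrected text in one of the two accuracy categories listed above."""
    if meets_accuracy_threshold(total_chars, error_chars):
        return "corrected text at or above 99.995% accurate"
    return "corrected OCR below 99.995% accurate"


if __name__ == "__main__":
    # A hypothetical 400,000-character volume may contain at most 20 errors.
    print(corrected_text_category(400_000, 20))
    print(corrected_text_category(400_000, 21))
```

Under these assumptions, a 400,000-character volume with 20 residual errors sits exactly on the threshold and falls in the higher category, while a 21st error moves it into the lower one.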
2. Why is it important to build consensus around preservation digital masters?
By agreeing to a minimum level benchmark for a preservation digital master, libraries and other organizations can reduce the risk involved in the production and maintenance of digitized texts while inspiring confidence in and encouraging their use.
Because a preservation digital master will be considered by the community as a digital object that is able to meet anticipated current and future needs, an organization creating the preservation digital master can invest in digitization secure in the knowledge that it will not be forced to re-digitize the object at some future date even as production techniques improve.
Users, meanwhile, will develop confidence in preservation digital masters because these have a minimum level of well-known and consistent properties and will support a wide variety of uses (including uses not possible with printed texts).
As access to printed texts shifts increasingly to preservation digital masters and their derivatives, collection managers may begin to investigate alternative means for responsibly and non-redundantly preserving the printed texts (or artifacts) from which they are produced; for example, establishing a network of specialist print repositories.
In particular, by building consensus around the characteristics of preservation digital masters, libraries and other organizations that produce and support access to printed texts will be able more effectively to:
It is important also to be specific about what consensus about preservation digital masters will not, should not, and is not intended to do.
3. Rationale behind recommended benchmarks
3.1. Book illustrations
3.2. Rare and early printed materials
4. Implementation issues
Which benchmark levels are selected and applied in a digitization project (indeed, whether that project digitizes at or above the benchmark level at all) will be determined locally with respect to a number of factors:
How libraries characterize early and rare printed materials will be a local decision based on local collections and collection expertise. The Rare Books and Manuscripts Section of ACRL has issued guidelines on the selection of general collection materials for transfer to special collections that provide useful criteria for determining what constitutes an early or rare printed item:
Books may possess intellectual value, artifactual value, or both. Items with artifactual value include finely printed or bound books, those containing plates, valuable maps, or manuscripts, annotations, drawings or other original art work, including tipped-in photographs, or those published prior to a certain date (e.g., before 1800). Other categories on which there is wide, but not always general agreement, include:
a. fine bindings;
b. early publishers' bindings;
c. extra-illustrated volumes;
d. books with significant provenance;
e. books with decorated endpapers;
f. fine printing;
g. printing on vellum or highly unusual paper;
h. volumes or portfolios containing unbound plates;
i. books with valuable maps or plates;
j. books by local authors of particular note;
k. material requiring security (e.g., books in unusual formats, erotica or materials that are difficult to replace);
l. novels with dust jackets containing important information (e.g., text, illustrative design, and prices).
The rarity and importance of individual books are not always self-evident. Some books, for example, were produced in circumstances which virtually guarantee their rarity (e.g., Confederate imprints). Factors affecting importance and rarity can include the following:
Appendix. Draft list of structural metadata elements that should be required for preservation digital masters