Benchmark for digital reproductions of monographs and serials
As endorsed by the DLF

Original prepared on July 30, 2001
Revised on January 25, 2002

Contents

  1. Introduction
  2. What is a digital master?
  3. Benchmark digital master for a digitally reformatted monograph and serial
  4. Notes

1. Introduction

This document defines a minimum benchmark for digital reproductions of monographs and serials. The case for such a benchmark is made in an article by Greenstein and George that is available in RLG's DigiNews.

Work on the benchmark grew out of the DLF's investigation into the need for and functional specification of a service through which libraries could register information about the monographs and serials that they had digitally reformatted (see http://www.diglib.org/collections/reg/reg.htm).

Although the registry is not exclusive (it will record information about digitally reformatted materials as well as those that are born digital, and about masters that meet agreed benchmarks as well as those that do not), it provides an important opportunity to identify and build consensus around minimum characteristics that might be expected of certain kinds of digital objects.

The benchmark defined in this document refers to digital objects that are created by reformatting printed monographs and serials, and that are intended as faithful reproductions of the underlying source documents.

Companion documents may be developed defining benchmarks for other digital masters - for example, those that may apply to serial publications or books that are born digital.

The benchmark has been prepared and endorsed by the DLF (a report on that work is available from the DLF's website).

2. What is a digital master?

Digital masters are digital objects that are optimally formatted and described with a view to their quality (functionality and use value), persistence (long-term access), and interoperability (e.g. across platforms and software environments).

This document benchmarks digital masters for digitally reformatted monographs and serials. This type of digital master has all of the qualities described above. In addition, it is intended as a faithful rendering of the underlying source document, for example, with respect to its completeness, the quality of the image (including its tonality and color), and the ability to reproduce pages in their correct (that is, their original) sequence. As a faithful rendering, a digital master will support production of a printed page facsimile that is legible when produced at the same size as the original (that is, 1:1).

The benchmark acknowledges that what ultimately constitutes legibility and fidelity is a subjective decision. In part for this reason, the benchmark refers minimally to file formats, compression, and metadata. It does not provide production-level guidance, for example on how to deal with missing pages, how to "clean up" foxing or blemishes on the digital master, or how to select an appropriate dpi for fonts or source pages of different sizes. Such guidance, it is hoped, will evolve through experience and may be attached as companion documentation to this benchmark.

3. Benchmark digital master for a digitally reformatted monograph and serial

A digital master for a reformatted monograph or serial must include digital page images.

At least one version of the digital master's page images will meet or exceed the following minimum characteristics for its image type.

Black and white (may include simple line drawings, de-screened halftones):

600 dpi, 1-bit or bitonal TIFF images[1].

Images must be sized and saved at 1:1 scale to the dimensions of the original page.

Images must be saved uncompressed or with lossless compression (e.g. ITU-T6, LZW, CPC). Where images are compressed they must be made available in the Group-4 format. The images may be dithered up from 400 optical dpi 1-bit images.

Grayscale:

300 dpi, 8-bit grayscale uncompressed TIFF, or lossless compressed image (e.g. JPEG2000).

Images must be sized and saved at 1:1 scale to the dimensions of the original page.

The dpi specification will relate directly to the font-size and page dimensions of the original source document, and to local definitions of legibility and fidelity. In many cases, 400 dpi will be preferred. Where larger pages are concerned (for example, those exceeding 7 inches in the long dimension), the lower dpi specification may be required.

Color:

300 dpi, 24-bit color uncompressed TIFF, or lossless compressed images (e.g. JPEG2000).

Allowed color spaces include RGB, sRGB, PhotoYCC, YCC, CIELab, and CMYK, with RGB and YCC recommended for digital masters. Images must be sized and saved at 1:1 scale to the dimensions of the original page.

The dpi specification will relate directly to the font-size and page dimensions of the original source document, and to local definitions of legibility and fidelity. It may also relate to the perceived artifactual value of the source object or the extent to which its physical characteristics, such as foxing, are perceived as conveying some important information or meaning.
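The minimum characteristics above can be expressed as a simple check. The following is a minimal sketch, not part of the benchmark: the names MINIMUMS, PageImage, expected_pixels and problems are hypothetical, the image properties are assumed to have been read from the file by some other tool, and the two-pixel tolerance on the 1:1 size check is an arbitrary assumption.

```python
from dataclasses import dataclass

# Minimum (dpi, bit depth) per image type, transcribed from the benchmark.
# The dictionary keys are illustrative labels, not terms the benchmark defines.
MINIMUMS = {"bitonal": (600, 1), "grayscale": (300, 8), "color": (300, 24)}


@dataclass
class PageImage:
    kind: str        # "bitonal", "grayscale", or "color"
    dpi: int
    bit_depth: int
    lossless: bool   # True for uncompressed or losslessly compressed files
    width_px: int
    height_px: int


def expected_pixels(width_in, height_in, dpi):
    """Pixel dimensions of a page captured at 1:1 scale."""
    return round(width_in * dpi), round(height_in * dpi)


def problems(image, page_width_in, page_height_in, tolerance_px=2):
    """Return a list of ways the image falls short of the minimums."""
    min_dpi, min_depth = MINIMUMS[image.kind]
    found = []
    if image.dpi < min_dpi:
        found.append(f"resolution {image.dpi} dpi is below {min_dpi} dpi")
    if image.bit_depth < min_depth:
        found.append(f"bit depth {image.bit_depth} is below {min_depth}")
    if not image.lossless:
        found.append("image uses lossy compression")
    # 1:1 scale: pixel size should equal page size (inches) times dpi.
    want_w, want_h = expected_pixels(page_width_in, page_height_in, image.dpi)
    if (abs(image.width_px - want_w) > tolerance_px
            or abs(image.height_px - want_h) > tolerance_px):
        found.append("pixel dimensions do not match the page at 1:1 scale")
    return found
```

For example, a 6 x 9 inch page captured at 600 dpi should measure 3600 x 5400 pixels, so problems(PageImage("bitonal", 600, 1, True, 3600, 5400), 6, 9) returns an empty list. The check deliberately ignores the subjective judgments of legibility and fidelity that the benchmark leaves to local definition.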

Digital masters for digitally reformatted monographs and serials must have descriptive, structural and administrative metadata, and the metadata must be made available in well-documented formats.

Structural metadata should be sufficiently detailed to allow reconstruction of the sequence of the original artifact. A minimum list of structural metadata elements is being developed as a companion to this benchmark.
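As an illustration of what "reconstruction of the sequence" might require in practice, the sketch below orders page records by a sequence number and rejects sets with gaps or duplicates. The record layout (the "sequence" and "file" keys) and the function name are assumptions made for this example only; the benchmark's own list of structural metadata elements is still being developed.

```python
def reconstruct_sequence(pages):
    """Return page image file names in original page order.

    `pages` is a list of dicts with hypothetical keys:
      "sequence": 1-based position of the page in the source volume
      "file":     name of the page image file
    Raises ValueError if the sequence has gaps or duplicates.
    """
    ordered = sorted(pages, key=lambda page: page["sequence"])
    numbers = [page["sequence"] for page in ordered]
    if numbers != list(range(1, len(ordered) + 1)):
        raise ValueError(f"page sequence is incomplete or duplicated: {numbers}")
    return [page["file"] for page in ordered]


records = [
    {"sequence": 2, "file": "0002.tif"},
    {"sequence": 1, "file": "0001.tif"},
    {"sequence": 3, "file": "0003.tif"},
]
print(reconstruct_sequence(records))
```

A real implementation would likely also carry the printed page label (for example, roman-numeral front matter), which this sketch omits.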

Preservation digital masters may include machine-readable (keyboard or OCR) text. That text may be corrected or uncorrected. If it is corrected, the accuracy level will be specified (e.g. as 99.995%). Such text may be encoded (at any level, e.g. as specified in TEI Text Encoding in Libraries. Guidelines for Best Encoding Practices. Version 1.0, July 30, 1999).
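To make a stated accuracy level concrete, it can be converted into an expected number of character errors. The sketch below assumes that accuracy is measured per character, which the benchmark does not specify; the function name and the sample character counts are illustrative.

```python
from decimal import Decimal


def expected_errors(char_count, accuracy_percent):
    """Expected number of incorrect characters at a stated accuracy level.

    `accuracy_percent` is passed as a string (e.g. "99.995") so that the
    decimal value is represented exactly rather than as a binary float.
    """
    error_rate = (Decimal(100) - Decimal(accuracy_percent)) / Decimal(100)
    return error_rate * char_count


# 99.995% accuracy leaves an error rate of 0.005%,
# that is, 5 expected errors per 100,000 characters.
print(expected_errors(100_000, "99.995"))
```

On the same assumption, a volume of 600,000 characters corrected to 99.995% would be expected to contain about 30 errors.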

4. Notes

1. 600 dpi will capture roman scripts down to 6-point type with the microfilm Quality Index (QI) equivalent of 8. Smaller text, and scripts with fine lines, small dots, or other diacritics (such as italics and Arabic), need higher resolution to be captured completely.