Registry of Digital Reproductions of Paper-based Monographs
and Serials
Functional Requirements
Dale Flecker
December, 2001
Introduction
This document offers a functional specification for a registry
that records information about digital reproductions of
monographs and serials. It has been produced as part of the DLF's
work to define requirements for and encourage the development of
such a registry service. An introduction to that work and
additional documentation on it are available from http://www.diglib.org/collections/reg/reg.htm.
Data registered
The Registry must have the ability to record (or, in some
instances, to link to) the following information:
Bibliographic Description. Reproductions should be
described following bibliographic conventions used in
contemporary cataloging "utilities" and integrated library
systems. Given that the original materials are very likely
already cataloged in traditional library format, this should be
an easy and inexpensive process based on copying (without
updating for contemporary practice) existing data.
Precise Holdings. For multi-volume and "continuing"
works such as journals, a description should be recorded in
standard holdings notation of the precise issues and volumes that
have been digitized. If forms of "compressed" holding statements
are used, they MUST be understood to imply that every volume and
issue of the encompassed run has been digitized. If a
"compressed" holding is used for an incomplete run, an
unambiguous indication of incompleteness must be included.
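The rule for compressed statements can be illustrated with a minimal Python sketch. The `CompressedRun` class, its `complete` flag, and the integer volume numbering are assumptions made for illustration only; they do not represent any standard holdings notation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CompressedRun:
    """A hypothetical compressed holdings statement such as 'v.1-v.10'."""
    first_volume: int
    last_volume: int
    complete: bool = True  # False is the explicit indication of incompleteness


def is_digitized(run: CompressedRun, volume: int) -> Optional[bool]:
    """Return True or False when the statement settles the question, else None."""
    if not run.first_volume <= volume <= run.last_volume:
        return False  # outside the run: not covered by this statement
    if run.complete:
        return True  # a complete run implies every enclosed volume
    return None  # incomplete run: the statement alone cannot tell
```

Under these assumptions, a reader of an unflagged run may rely on every enclosed volume, while a flagged run only signals that further checking is needed.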
Information About Use Copy. The following information
should be recorded about the use copy (a network-accessible, but
not necessarily free, copy of every registered object must be
available as a condition for inclusion in the Registry):
- a URL or URN providing a persistent link to the use
copy;
- a textual description or a pointer (such as a URL or a
standard identifier) to a description of the terms and conditions
for access to the use copy;
- the technical format or a pointer (such as a URL or a
standard identifier) to the format, if the materials are not
simply available through a standard web interface.
Information About Archival Master Copy. The following
information should be recorded about the archival master copy
(because persistence is an assumed responsibility of the
registering agency, a Master Copy must be
described):
- a persistent identifier for the master object (this does not
need to be an "actionable" link such as a URL or URN -- just an
unambiguous identifier that the owner will recognize);
- a textual description of the accessibility of the Master Copy
(who can access it and under what terms and conditions);
- a description or a pointer (such as a URL or a standard
identifier) to a description of the technical standards used in
creating the Master Copy (note that this is a key
element - it is expected that materials will be digitized
following many differing practices. In order for other
institutions to rely on a master, they need to be satisfied that
it is of sufficient quality. The expectation is that there will be
standard best practice guidelines created by the digital library
community that libraries can simply "point" to via identifier or
URL when appropriate. When community standards or best practices
are being followed, it is highly desirable that the name or
identifier of such practices be recorded rather than a URL-based
link.);
- the technical format of the Master Copy (again, this is
likely recorded as a pointer to a description of the format
used);
- a description or a pointer to a description (such as a URL or
a standard identifier) of the repository practices being followed
in the storage and maintenance of the Master Copy (as noted,
efforts are currently underway to define best practices in this
area).
If URLs are used to point to external descriptions of practice
in any of the fields above, the recording institution must assume
the responsibility of maintaining the validity of the link.
(NOTE that if the use copy and the archival master copy are the
same, the information should be repeated in both areas.)
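The data elements listed above could be modeled roughly as follows. This is a sketch only: every class and field name is hypothetical, and the check covers just the two elements the text treats as conditions of registration (a link to the use copy and an identifier for the Master Copy).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UseCopy:
    persistent_link: str                    # URL or URN
    access_terms: str                       # text, or a pointer to a description
    technical_format: Optional[str] = None  # only if not a standard web interface


@dataclass
class MasterCopy:
    identifier: str            # need not be an actionable link
    accessibility: str         # who can access it, and on what terms
    creation_standards: str    # description, or name/identifier of a practice
    technical_format: str
    repository_practices: str


@dataclass
class RegistryRecord:
    bibliographic_description: str
    use_copy: UseCopy
    master_copy: MasterCopy
    holdings: Optional[str] = None  # for multi-volume and continuing works


def missing_fields(record: RegistryRecord) -> List[str]:
    """List required elements that are empty (an assumed validation rule)."""
    problems = []
    if not record.use_copy.persistent_link.strip():
        problems.append("use_copy.persistent_link")
    if not record.master_copy.identifier.strip():
        problems.append("master_copy.identifier")
    return problems
```

When the use copy and the Master Copy are the same object, the same values would simply appear in both `UseCopy` and `MasterCopy`.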
Statement of intent to digitize. In order to avoid
unnecessary duplicate conversions, libraries are encouraged to
record their intent to digitize an object as soon as the definite
decision to do so is made. The statement should include the
projected date of completion. The MARC21 583 field can be used to
record this information. A problem with the use of a queuing
mechanism of this sort for microfilming activity has been that
not everything queued has subsequent updates to indicate that the
filming has actually taken place. In order to avoid this
difficulty in the digital registry:
- a name and contact information should be recorded for each
queued item;
- the registry system should send a tickler message for each
item that remains queued more than 90 days after the expected
date of digitization;
- the registry should delete any queuing information more than
a year past the expected date of digitization.
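The queue maintenance rules above can be sketched in Python. The `QueuedItem` class and `queue_action` function are hypothetical names, "a year" is approximated as 365 days, and whether the tickler is sent once or repeatedly is left open because the requirement does not say.

```python
from dataclasses import dataclass
from datetime import date, timedelta

TICKLER_AFTER = timedelta(days=90)
DELETE_AFTER = timedelta(days=365)  # assumption: "a year" taken as 365 days


@dataclass
class QueuedItem:
    title: str
    contact_name: str
    contact_info: str
    expected_completion: date


def queue_action(item: QueuedItem, today: date) -> str:
    """Return 'delete', 'tickler', or 'keep' for a still-queued item."""
    overdue = today - item.expected_completion
    if overdue > DELETE_AFTER:
        return "delete"   # more than a year past the expected date
    if overdue > TICKLER_AFTER:
        return "tickler"  # more than 90 days past: remind the contact
    return "keep"
```

A registry might run such a check on a schedule over all queued items, using the recorded contact information to address the tickler message.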
Information from Multiple Sources
There will be many cases in which more than one institution
will need to register digital copies of the same bibliographic
item. In particular, one can expect that for multi-part items
such as journals, the entire bibliographic item may not be
available from a single institution, and that the record in the
Registry should show that one institution has digitized some
volumes, and another institution other volumes. Likewise there
may be instances in which two institutions digitize the same item
in different formats or to different standards. The Registry
should provide a unified and coherent view of all digital
versions registered.
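One way to picture such a unified view is to group registrations by bibliographic item. The sketch below assumes each registration is a tuple of (bibliographic identifier, institution, holdings, format); the tuple layout, the identifiers, and the function name are illustrative assumptions, not part of the requirements.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (bib_id, institution, holdings, format): a hypothetical registration shape
Registration = Tuple[str, str, str, str]


def unified_view(
    registrations: Iterable[Registration],
) -> Dict[str, List[Tuple[str, str, str]]]:
    """Group every registered digital version under its bibliographic item."""
    grouped = defaultdict(list)
    for bib_id, institution, holdings, fmt in registrations:
        grouped[bib_id].append((institution, holdings, fmt))
    # Sort each item's versions so the combined display is stable.
    return {bib_id: sorted(versions) for bib_id, versions in grouped.items()}
```

With this grouping, a journal digitized in part by two institutions appears as one item with two entries, each showing which volumes it covers and in what format.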
Access
Registered information must be available to users in two
ways:
Interactive search. Records of registered materials must
be interactively searchable through standard information
retrieval queries based on normal MARC bibliographic elements.
This search facility is primarily intended for use by library
staff looking to see if a known item has been digitized. If
registered materials are integrated into a larger bibliographic
file, searching must provide the ability to limit results to
registered digital copies. Because searching is only intended to
support library staff, an interface designed for untrained
general users is not required.
Harvesting. Registered catalog data must be available
for harvesting, supporting at a minimum the Open Archives
Initiative (OAI) protocols. Metadata formats supported should
include both MARC21 and Dublin Core. There is no specific
requirement that sub-setting of registered data during harvesting
via the OAI "set" functions be supported.
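As an illustration of what a harvester would send, the sketch below builds a request URL using the `ListRecords` verb and `metadataPrefix` argument of the OAI Protocol for Metadata Harvesting. The base URL is a placeholder, the function name is hypothetical, and `marc21` as the prefix for MARC21 records is an assumption; `oai_dc` is the protocol's standard prefix for Dublin Core.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str, metadata_prefix: str, set_spec: Optional[str] = None
) -> str:
    """Build an OAI ListRecords request URL for the given metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional: set support is not required here
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint; a real registry would publish its own base URL.
BASE_URL = "https://registry.example.org/oai"
dublin_core_url = list_records_url(BASE_URL, "oai_dc")
marc_url = list_records_url(BASE_URL, "marc21")  # assumed prefix name
```

Because set support is optional under this requirement, a harvester should be prepared to request the full body of registered records.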
Easy accessibility should be provided to the entire
international library community as well as to other institutions
and companies (both commercial and non-commercial) engaged in
digital conversion efforts.
Visibility as entity
It is important that the Registry be visible and have a
recognized name to encourage contribution and use. While it is
highly desirable that registered data be accessible as part of a
larger bibliographic file, some means of identifying the Registry
(including, but not limited to, the ability to access only
registered materials in searching and harvesting as discussed
above) and making its utility visible should be provided.
Input and Maintenance
Data input and maintenance should be available through both
interactive on-line transactions for individual records, and in
batch mode for groups of records. A "derive" function, allowing
the majority of the bibliographic description of materials to be
copied from existing records for paper originals, is highly
desirable. It is expected that the original registering
institution will need to be able to update information in the
Registry after initial input, to record such things as a change
in status from intended to actually digitized, additional volumes
digitized, and changes in the format of master or use copies.
Additionally, other institutions will need to be able to add
information about other digital versions created. The ability to
contribute and maintain data should be easily and readily
available to the entire library community and to other
institutions and companies (both commercial and non-commercial)
engaged in digital conversion efforts.
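A "derive" function of the kind described above might look like the following sketch. Records are represented as plain dictionaries, and the field names `use_copy_link` and `master_identifier` are hypothetical; a real implementation would operate on MARC records.

```python
import copy


def derive(original: dict, use_copy_link: str, master_identifier: str) -> dict:
    """Copy a paper original's description and add digital-copy details."""
    derived = copy.deepcopy(original)  # leave the source record untouched
    derived["use_copy_link"] = use_copy_link
    derived["master_identifier"] = master_identifier
    return derived
```

The deep copy keeps later updates to the registered record, such as additional volumes digitized, from altering the record for the paper original.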