Minimum Criteria for an Archival Repository of Digital Scholarly Journals
Version 1.1
D. Greenstein and D. Marcum 17 April 2000
Introduction
This document sets out the minimum criteria for a digital
archival repository that acts to preserve digital scholarly
publications. It is based closely on the Reference Model for an
Open Archival Information System and modified to reflect the
specific needs of library, publishing, and academic communities.
It also indicates some of the key research issues that are likely
to emerge for those who establish digital archival repositories
that meet these criteria.
The criteria have been agreed by representatives from
university research libraries that have a stated interest in
contributing to the development of such repositories. They are to
be reviewed shortly by a group of publishers (Version 1.2) and finally by representatives
of library licensing consortia. In the meantime, comments are
actively encouraged from the broadest possible community and
should be sent to dlf@clir.org.
Revisions of this document will be maintained on this web
site.
Criterion 1. The digital archival repository that acts to
preserve digital scholarly publications will be a trusted third
party that conforms to minimum requirements agreed by both
scholarly publishers and libraries.
Agreed minimum criteria are essential. Libraries need them to
assure themselves and their patrons that digital content is being
maintained, and to enforce any maintenance requirement on
scholarly publishers. Publishers need them so they may
demonstrate to libraries, and also to their authors, that they are
taking all reasonable measures to ensure persistence of their
publications. Finally, emerging repositories need them as a
blueprint for services, but also as a benchmark against which service
can be measured, validated, and above all trusted by the
libraries and publishers that rely upon them.
The key research question entails the definition of those
criteria. Initial meetings with libraries and publishers are an
essential first step in developing these definitions. Their
refinement is expected to be an iterative process, one that takes
account of experience building, maintaining, and using digital
archival repositories.
Criterion 2. The repository will define its mission with
regard to the needs of scholarly publishers and research
libraries. It will also be explicit about which scholarly
publications it is willing to archive.
This definition will help to focus the repository on the
nature and extent of digital information it will acquire and on
the requirements of the research library as the primary recipient
of any data disseminated by the repository.
Research issues:
- Mission statements that document the scope and nature of
materials a repository aims to collect, the strategy and methods
it adopts for developing its collections (attracting deposits),
and the community of libraries (and other users) it seeks to
serve.
- The development of registries that document which scholarly
publications are archived where (and, implicitly, which are not
archived at all).
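As an illustration only, a registry of this kind can be sketched as a mapping from a publication identifier to the repositories that report holding it. The class name, method names, journal identifiers, and repository names below are all hypothetical and are not drawn from any existing registry service.

```python
from collections import defaultdict


class ArchiveRegistry:
    """Hypothetical registry: which repositories hold which publications."""

    def __init__(self):
        # journal identifier -> set of repository names (all invented)
        self._holdings = defaultdict(set)

    def record(self, journal_id, repository):
        """Note that `repository` archives `journal_id`."""
        self._holdings[journal_id].add(repository)

    def archived_where(self, journal_id):
        """Return the sorted list of repositories holding the journal."""
        return sorted(self._holdings.get(journal_id, set()))

    def not_archived(self, known_journals):
        """Return the journals in `known_journals` that no repository holds."""
        return sorted(j for j in known_journals if not self._holdings.get(j))


registry = ArchiveRegistry()
registry.record("journal-a", "repository-1")
registry.record("journal-a", "repository-2")
registry.record("journal-b", "repository-1")

print(registry.archived_where("journal-a"))
print(registry.not_archived(["journal-a", "journal-b", "journal-c"]))
```

Identifying what is not archived requires an independent list of known publications, which is why `not_archived` takes that list as an argument instead of inferring it from the registry itself.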
Criterion 3. The repository will negotiate and accept
appropriate deposits from scholarly publishers.
The repository will develop criteria to guide consideration of
what publications it is willing to accept. Criteria may include
subject matter, information source, degree of uniqueness or
originality, and the techniques used to represent the information
(e.g., physical media, digital format, representation
information).
Individual negotiations with publishers may result in deposit
agreements between the repository and the data producer. Deposit
agreements may identify the detailed characteristics of the data
(and accompanying metadata) that are deposited, the procedures
for deposit, the respective roles, responsibilities, and rights
of the repository and the data producer with regard to those
data, references to the procedures and protocols by which a
repository will verify the arrival and completeness of the
deposited data, etc.
Research issues:
- Selection criteria used by the repository to review potential
accessions
- Guidelines for depositors that identify preferred or required
data and metadata formats, transmission methods and media,
etc.
- Schedules, licenses and other administrative materials that
surround and govern the deposit process and determine rights and
responsibilities of depositor and repository
- Procedures for verifying the arrival and completeness of
deposited data and metadata
- Adherence by several archives to some common range of data
and/or metadata formats
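One possible way to verify the arrival and completeness of a deposit is to compare the files received against a manifest of checksums supplied by the depositor. The sketch below assumes such a manifest exists and uses SHA-256; the function names, file names, and report layout are hypothetical, and a real deposit agreement would define its own procedure.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_deposit(manifest: dict, received: dict) -> dict:
    """Compare received files against the depositor's manifest.

    manifest: file name -> expected SHA-256 hex digest (assumed format)
    received: file name -> file content as bytes
    """
    missing = sorted(name for name in manifest if name not in received)
    unexpected = sorted(name for name in received if name not in manifest)
    corrupted = sorted(
        name
        for name, digest in manifest.items()
        if name in received and sha256_of(received[name]) != digest
    )
    return {
        "missing": missing,
        "unexpected": unexpected,
        "corrupted": corrupted,
        # Complete means every listed file arrived with a matching digest.
        "complete": not (missing or corrupted),
    }


# Invented example deposit: one file is absent and one was altered in transit.
article = b"<article>example text</article>"
metadata = b"title: Example"
manifest = {
    "article.xml": sha256_of(article),
    "metadata.txt": sha256_of(metadata),
    "figure1.tif": sha256_of(b"image bytes"),
}
received = {"article.xml": article, "metadata.txt": b"title: Exampel"}

report = verify_deposit(manifest, received)
print(report)
```

The same comparison, rerun periodically against stored copies, is one simple form of the integrity checking a repository might adopt after accession.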
Criterion 4. The repository will obtain sufficient control of
deposited information to ensure its long-term preservation.
In this respect, the repository will at a minimum require
perpetual licenses that allow it sufficient control to accession,
describe, manage, even transform deposited data (and accompanying
metadata) for the sake of their preservation. Where repositories
act in association with one another (e.g. to ensure sufficient
redundancy in the preservation process), they may also require
rights allowing them to mirror or deposit data with other
associated archives.
Further, repositories will need to pay attention to whether and
how their rights and responsibilities with regard to any particular
deposit may change through time. For example, where a depositor
ceases to supply its materials to the scholarly community, the
repository must be positioned to supply those materials to
existing licensees (perhaps at a fee).
Research issues:
- Acceptable licenses and licensing principles
- Fuller understanding of how the repository's rights and
responsibilities may change over time
Criterion 5. The repository will follow documented policies
and procedures which ensure that information is preserved against
all reasonable contingencies and which enable the information to be
disseminated as authenticated copies of the original or as
traceable to the original.
Preservation strategies and practices are not right or wrong
but more or less fit for their intended purposes. Nor is any
general theory of digital preservation or data migration likely
to become available anytime soon. Thus data in different formats
may require different strategies and these may need to be worked
out with the data producer (depositor). Documenting how and where
different preservation strategies and practices prove cost
effective and fit for their intended purposes will be a primary
interest of any co-ordinated approach to developing preservation
capacity appropriate to scholarly publisher, research library and
academic communities. Because preservation practices are likely
to vary across repositories, and because we have an interest in
encouraging the development of different practices, we may wish
simply to request that participants in any such co-ordinated
effort agree to document the practices they adopt and disclose
them for community review and evaluation.
Research issues:
- Preservation metadata
- Migration strategies (and their application with specific
data formats)
- Data validation and integrity checking
- Scalable infrastructure
Criterion 6. The repository will make preserved information
available to libraries.
Repositories may support several kinds of interactions with
libraries including: questions to a help desk, requests for
literature, catalog searches, order and order status requests.
Orders may involve an agreement outlining, for example, how (in
what form, on what media, with what metadata) data are
disseminated, the rights and responsibilities of libraries as
users of those data, etc.
Although repositories will need to support access at some
level, those services should not replace the normal operating
services through which digital scholarly publications are
typically made accessible to end users.
Research issues:
- Resource discovery mechanisms
- Access (data dissemination) strategies supported by
archives
- User licenses and how they are enforced
Criterion 7. The repository will ensure that data can be
disseminated to libraries in a renderable form.
At a minimum, libraries should be able to create end-user
services appropriate to the disseminated data and to do so
independently of any assistance from those who initially produced
the data.
Research issues:
- Minimum definition of "renderable form" and implications for
data and metadata format, and transmission method for any data
disseminated by a repository
Criterion 8. Repositories will work as part of a network.
At a minimum, repositories will need to operate as part of a
network to achieve a satisfactory degree of redundancy for their
holdings. Although an appropriate level of redundancy is
difficult to quantify (let alone to mandate), it will ideally
extend for any single data object to three archival sites, at least
one of which is located offshore.
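The redundancy target described above (three archival sites, at least one offshore) can be expressed as a simple check. This is a minimal sketch: the `Site` type, the idea of a "home country" against which offshore is judged, and all site names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical archival site, identified by name and country code."""
    name: str
    country: str


def meets_redundancy_target(sites, home_country, min_sites=3):
    """Return True if the item is held at `min_sites` or more distinct sites,
    at least one of which lies outside `home_country` (treated as offshore)."""
    distinct = set(sites)
    has_offshore = any(site.country != home_country for site in distinct)
    return len(distinct) >= min_sites and has_offshore


# Invented holdings for two data objects.
holdings = {
    "object-1": [Site("site-a", "US"), Site("site-b", "US"), Site("site-c", "GB")],
    "object-2": [Site("site-a", "US"), Site("site-b", "US"), Site("site-d", "US")],
}

for object_id, sites in holdings.items():
    print(object_id, meets_redundancy_target(sites, home_country="US"))
```

Here `object-1` meets the target, while `object-2` has three copies but none offshore. A network-level registry would be a natural place to run such a check.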
A network of repositories offers additional advantages to
libraries and scholarly publishers. Libraries may benefit from
common finding aids, access mechanisms, and registry services
that are supported by a network and allow libraries more
uniformly to identify and gain access to information about
scholarly publications that are preserved in trusted
repositories. Publishers may benefit from having access to a
single repository or group of repositories that specialize in
publications of a particular type and from the cost efficiencies
that emerge from within a network.
Research issues:
Perceived value of:
- standard interfaces between repositories;
- standard deposit licenses and/or user agreements;
- standard methods for data deposit;
- standard methods for data dissemination;
- standard preservation and other metadata;
- standard resource discovery practices;
- standard migration strategies and implementation
procedures;
- standard specifications for physical media;
- standard accreditation of requirement-conformant
archives.
For further information, please consult the following
pages: