Summary of the Distributed Open Digital Library Design Event of the Digital Library Federation
Capital Hilton Hotel, Washington, D.C.
8 October 2003
The following persons attended:
Peter Brantley, director of technology, California Digital Library
Sayeed Choudhury, associate director, Library Digital Programs, The Johns Hopkins University
Richard Detweiler, interim president, Council on Library and Information Resources
Dale Flecker, associate director of the University Library for planning and systems, Harvard University
Richard Lucier, librarian, Dartmouth College
Wendy Lougee, university librarian, University of Minnesota
Clifford Lynch, executive director, Coalition for Networked Information
Carol Mandel, dean of libraries, New York University
Deanna Marcum, associate librarian of Congress for library services, Library of Congress
John Mark Ockerbloom, digital library planner, University of Pennsylvania
Merrilee Proffitt, program officer, Research Libraries Group
David Seaman, director, Digital Library Federation
Clay Shirky, Internet consultant and adjunct professor, Interactive Telecommunications Program, New York University
Abby Smith, director of programs, Council on Library and Information Resources
Winston Tabb, dean of university libraries, The Johns Hopkins University
Karin Wittenborg, university librarian, University of Virginia
Lee Zia, lead program director, National Science, Technology, Engineering, and Mathematics Education Digital Library Program, National Science Foundation.
Also invited but unable to attend were Lorcan Dempsey, vice president for research, OCLC, and Jerry McDonough, digital library development team leader, New York University.
Prior to the meeting, these background readings were distributed:
Digital Library Aggregation Services: A Report to the Digital Library Federation, prepared by Martha L. Brogan, consultant, September 2003
Draft Framework for a Distributed Open Digital Library, prepared by the Digital Library Federation Distributed Library Initiative Committee, 27 May 2003.
Introduction
Ms. Lougee
opened the meeting as co-chairperson and facilitator with Ms.
Marcum. Using slides, she explained that the meeting's purpose
was to refine, for presentation to the Steering Committee of the
Digital Library Federation, a proposal to create a distributed
open digital library (temporarily designated by the acronym
DODL). The library would be "distributed," she said, by
leveraging federated collections, without unsustainable overhead,
and taking federation to the next level; and the library would be
"open" by enabling users to download content into local systems
for integration with other resources and recombinant uses. The
library would be user-focused by providing richer and less
frustrating access to, and interaction with, shared content, and
by making holdings easier to access, manipulate, enrich, and
incorporate into academic and pedagogical practices, services,
and tools. The library also would be library-focused by fostering
interdependence at the data level between libraries, improving
the economics of production, improving service through a highly
functional, layered service model, and allowing integration of
content for customized audiences. In the past, Ms. Lougee
continued, libraries have shared content at a superficial level.
The DODL would promote a large aggregation of content rather than
thematic projects. It would need mechanisms for co-developing
services and tools and would need a framework to incorporate
rights for free and protected resources. The DODL goals would be
to create a framework for shared activity, to catalyze
collaboration among stakeholders, to develop extensible
architecture, to generate, facilitate, and evaluate use, and to
engage potential funding agencies.
Concerns about Participation
Ms.
Wittenborg began the discussion by remarking that, while the DLF
has done wonderful work in such areas as standards development,
the content digitized so far has been a trickle, and scholars are
demanding much more. Noting that not every institution will
participate in the DODL, she urged that believers in it move
forward with it despite obstacles, and that participation not be
limited to DLF member institutions.
Mr. Lucier
agreed, but called for more work on the process of getting
library directors to buy into the DODL concept. Content issues,
which are key, he said, are insufficiently emphasized in the
draft proposal, which now is not compelling. A strong though not
necessarily large core of DLF members needs to move ahead and
achieve some success within a reasonable time.
Mr. Tabb,
noting that institutions may participate in the DODL in different
ways, asked for a better explanation in the document of what
institutions may bring to the DODL and what they are being asked
to agree to do. Institutions that cannot or will not contribute
need reassurance that DLF members do not have to do
so.
Mr. Seaman
reminded the group that, earlier, DLF members reached a general
consensus that, generically, DODL was worth doing, even if some
institutions are not ready to contribute.
In response
to a suggestion by Mr. Lucier that concerns about DODL be
identified, the following emerged:
- Would DLF
member institutions' contributions to the DLF's capital fund, to
be used for the advantage of all, go to the DODL?
- Is
institutional readiness adequate? Can the DODL's developers build
a large collection, present content uniformly, and make sure that
it is stable for long-term preservation?
- Would shared
content and tools be "branded," crediting the creating
institutions, or made available anonymously?
- Would an
integrated library cost institutions more and make them abandon
what they have individually built?
- Is the
necessary technology too much of a moving target?
Observations
followed that the concerns seemed small, but people needed
positive reasons to put them aside; that the concerns are similar
to those that arose when OCLC and RLG began; and that people have
assumed that the things they do on their own could somehow,
magically, come together. Can commitment to federation be
sustained?
Concerned
that no decision would be made, Mr. Tabb called for discussing
how, not whether, to develop the DODL.
Ms. Smith
suggested that concerns could be allayed if the focus of the
DODL's development is on users, on the advancement of learning,
and on convincing libraries that their investments will not be
lost. Consensus was fairly high within the DLF that libraries
could provide better user service by sharing collections, Mr.
Seaman said; and Ms. Mandel remarked that getting support for
building local content is easier if it is seen as contributing to
a larger project. Other participants reiterated that institutions
could take part in phases without being required to make an initial
contribution, but that an educational challenge exists:
institutions need to see clearly what a commitment to the DODL
would mean.
Mr. Lynch
said that making tangible what DODL participation means to
institutions will require recognition that the DODL collection
needs a critical mass or users will not come to it; so consensus
will be needed about subject matter to which institutions will
give priority for digitization. Mr. Flecker expressed uncertainty
about whether the DODL would be a place to come to or, instead, a
facilitator for the use of digital objects in various
services.
User Perspectives
The group
then turned to a session on user perspectives, moderated by Ms.
Smith. She remarked that all libraries serve users, who will look
elsewhere if they do not find credible, authentic, persistent,
and accessible resources. Users are less concerned than libraries
about branding resources. Closer collaboration in the DODL's
development is needed with scholars, though their interest tends
to be limited. They assume that the content they desire is
available and want convenient access to it.
In the
ensuing discussion, observations were made that scholars do not
want to go to multiple sites to find content they need, that they
do not regard building readily accessible digital resources as
part of their job, that much content has been built in response to
individual professors' requests, which makes some scholars feel
that substantial content already is available, and that most tend to
be naïve about what will be required to build a substantial
digital library for the future.
Mr. Lucier
said that the development document needs to be compelling not
to scholars but to DLF members. Scholars already think that
libraries have the capacity for a DODL. We know what we have to
do, he continued, but we don't know where we are with the
technology, what the milestones should be, and which institutions
can lead in developing the technology that will make possible
compelling services of use in making access easier.
Ms. Smith
said that, in service delivery, libraries can assert leadership
and do branding, and the compelling argument is that contributors
of content to the DODL will get back from it access to a larger
pool of material of benefit to their own faculties and
students.
Mr. Brantley
asked how the group was conceiving the kinds of digital content
to be developed. Would the DODL provide scholars with shared
databases as well as manuscript texts and images? Would the DODL
make possible a richer kind of content access?
A discussion
followed about whether, as once proposed, the DODL should start
with content for scholars in the humanities, who are underserved,
but for whom some of the larger collections digitized by DLF
members have been developed. The DODL could focus on some subject
area in great demand, Ms. Wittenborg suggested, then pick a
project and seek expert advice about its feasibility and cost.
Mr. Choudhury observed that technical requirements differ for
different kinds of content formats, and that humanists and
scientists differ in the materials they use. He asked, are we
drafting a call for proposals, to be financed from the DLF
capital fund?
Others
responded that costs cannot now be calculated because the DODL
document does not spell out what the DODL will need, but is more
an identification of the high-level, functional things the DLF
wishes to achieve, a framework for the parts, from which calls
for proposals might be developed. If an attainable next step were
to be identified, then one could determine which institutions
within the DLF were capable of carrying it out, and whether the
capital fund (which Mr. Seaman said contains approximately
$604,000) could be used for such a purpose.
Mr. Tabb
proposed stipulating that the DODL's content scope be the
humanities, as currently defined by the National Endowment for
the Humanities. Others agreed that much humanities content
already has been digitized, that there is user demand for it,
that concentrating on it could be phase one with the
understanding that other fields might be included later, that
much content still would be needed, but that a first project
could be started in the humanities, with specified goals for what
exactly it would provide historians, for example, by the end of
the DODL's first three years.
Service Considerations
At this
point, the group turned to service considerations in a session
moderated by Mr. Shirky. He described work under way for the
National Digital Information Infrastructure and Preservation
Program headed by the Library of Congress. He elucidated the
importance of metadata development for interoperability between
information systems and institutions and identified difficulties
posed by metadata differences, emphasizing differences between
"described data," which are determined by human judgment, and
"derived data," which can be obtained with little human judgment.
He introduced some cautions about expecting to transfer and
preserve data long-term without semantic loss, about the
feasibility of centrally collecting content to store as opposed
to helping originating organizations do it, and about the inverse
relationship between the size of a network and the richness of
metadata that it can handle. He distinguished between back-end
interoperability, in which participating institutions share data,
and front-end interoperability, in which data appears unified to
users rather than as coming from multiple sources. With which kind of
interoperability would the DODL start, or would it explore
both?
Additional
considerations arose in the ensuing discussion. Should the DODL
begin as a research project or a production project, in which the
DLF would take advantage of "low hanging fruit," using already
digitized content? Will tradeoffs be necessary between content
aggregation and presentation in the DODL's development? What are
the minimum, affordable requirements for shareable content? Is
Borrow Direct (a program in which Ivy League universities use
e-mail to provide rapid book borrowing and delivery service) a
useful model? What lines of development have the best chance
currently of producing progress? How can the DODL's technological
needs best be explained to decision makers?
Mr. Shirky
and Mr. Brantley described current difficulties in transmitting
metadata and in verifying the identity of digital objects
transmitted between institutions. Others suggested that the user
needs a "storefront" (like an ATM) for accessing content, with
search capability as the core function, and with some degree of
results presentation so that users could see if they want what
they have found. Mr. Ockerbloom additionally described technical
needs but suggested starting with what digital librarians already
know how to do and adding functionality. Differences of opinion
emerged about whether, in the first phase, the DODL should show
anything to users that they cannot "repurpose locally," and
whether users of the DODL should be allowed to know if content
items exist to which access is denied. Librarians present argued
for "total openness" with scholars, who "want to know what's
there" regardless.
Differences
of opinion also arose about whether the DODL should be seen as
focused on content or on transference. Some thought that simply
developing means of exchanging content was insufficient without a
concept of who and what the content was for, and that the DODL
needed enough content to have meaningful impact on scholarship
and learning. Digitized material now available may not be what
scholars most need. Others thought that collection building could
and would continue, but that the DODL needed to find ways to
transfer and share it more deeply. Some participants spoke of
DODL possibilities as ranging from simply building a catalog of
content available now ("a project of use, not discovery") to
creating an entirely new interface for content sharing, which
would require research to deal with hard technical issues.
Consensus seemed to emerge that whichever focus were paramount,
both content and transference needed development, and that the
time had come to move beyond pilot projects.
What Next?
After lunch,
the discussion resumed with Ms. Marcum as moderator. What, she
asked, is the most productive thing to work on?
Mr. Lucier
suggested that the political process is critical, that the
Steering Committee would need to trust that major issues had been
dealt with and practical steps identified. Three things are
important, he said: (1) we want to move the DLF to a next step of
development so that it remains meaningful; (2) this will result
in a larger body of content of importance to humanities scholars;
and (3) we will create tools to share content so that people can
interact with the content more deeply. Ms. Wittenborg agreed that
this would need explaining to the Steering Committee before the
next DLF meeting.
In the
ensuing discussion, views were expressed that a development
project would be more meaningful than a research project, that
there is a need to share content in the humanities, that
meaningful access to what already has been digitized could be
valuable, and that bodies of content such as the American Memory
collection at the Library of Congress are available to work on.
Technologists in the group, however, expressed concern that the
initial scope of the DODL project seemed to be getting enlarged
and amorphous, and wondered whether institutions are even willing
to share. Mr. Flecker and Mr. Lynch suggested that analysis is
needed of what sharing means; moving objects between
systems raises many questions, particularly if one institution is
trying to speak to another's users; and currently institutions
have no way to match up how they describe their collections.
Also, the degree of difficulty varies with format and academic
field. Sharing, said Mr. Lynch, may need to be more than
cross-institutional delivery; creating a collective resource to
which people can contribute is different from resource
distribution.
Mr. Detweiler
suggested working on some particular body of content to begin
collaborative development. Ms. Mandel called for a series of
explorations of functions such as browsing and presentation that
digital librarians already know need work.
Technological Considerations
The group
then turned to a session on technology, moderated by Mr. Lynch.
He said that he had heard three different things that the DLF
should be doing. (1) The DLF is in a good position to explore,
without great cost, assumptions about what is needed for the DODL
and which aspects the DLF can deal with, including whether needed
tools are in scope for the DLF or should be left to the
marketplace and the academic discipline groups to develop. (2) The DLF
could develop a big, highly functional content system, a
universal catalog, requiring common agreements and resource
choices. (3) The DLF could undertake a digitization program for
systematic collections development. Additionally, DLF libraries
could produce a federated service by coupling their physical
catalogs with a digitize-on-demand service for scholars, who
would request the digitization of materials of use to them. Is
the problem now, Mr. Lynch asked, one of content availability or
of content "find-ability"?
Mr. Shirky
noted that "search-ability" involves issues both of delivery and
of policy. The DODL might start by identifying the simplest thing
that could possibly work, which would be something combining open
policy with ease of delivery. In the ensuing discussion,
participants said that re-purposing content is difficult within
institutions' systems, let alone between them, but that external
connections can encourage internal connections; and that most
digital material used by DLF institutions does not come from them,
so a DODL collection would need to extend beyond them.
Mr. Flecker
said that two levels of development, undertaken in parallel,
would be useful: (1) building up simple access, and (2) exploring
the sharing of objects, format by format and discipline by
discipline. Mr. Lynch suggested that the Coalition for Networked
Information might collaborate with the DLF in exploring "deep
sharing" requirements. Others suggested that workshops could be
valuable to deal with issues in the second developmental area,
that technology issues could be dealt with first and then policy
issues, that exploratory work could start in a discipline with
relatively simple digital objects, and that American history and
literature offered an existing pool of content and an
identifiable body of users. Whatever the approach, Mr. Seaman
said, the need now is not to produce another year of reports but
to learn by doing.
Mr. Tabb
recommended two complementary tracks, one towards creation of a
national digital humanities library covering all of the
humanities, and the second towards creation of a DLF portal for
user access. Ms. Mandel, while recommending that the DODL not be
called a national library, suggested making digital
collections expansion a third track and involving scholar-partners.
Mr. Lucier said that collections development was an integral
component of overall DODL development and that international
partners would be necessary to provide enough content. Ms.
Proffitt described how content is being made available in a free
Web service through the "Cultural Materials" project of the
Research Libraries Group, with which Ms. Mandel suggested
collaboration. Mr. Shirky wondered if a DLF portal would compete
with DLF members' portals, and who would take the calls from DLF
portal users. In response to other questions about whether the
DLF would have to run a portal, Mr. Lynch suggested that a
finding system could be developed without administration by the
DLF itself. Ms. Proffitt suggested that such a system would need
an authoritative storefront.
Alternative
views were expressed about whether to call the DODL a
Humanities Digital Library with a collections component, a
finding system, and partners beyond the DLF, or a Digital
Library for the Humanities, or Digital Resources for the
Humanities. Also debated was whether any such aspiration was
too high for now, misleading people into expecting its completion
soon. Instead, the DLF might embark on an initiative to develop
just a finding system for a digital humanities library, but that
aspiration seemed to some too low. The group in general agreed,
however, that development of collections and a finding system
were linked.
After a brief
discussion of whether to stress content sharing or not in
explaining the DODL concept, suggestions were made to engage one
or more consultants to describe specifications for each part or,
alternatively, to establish implementation committees of DLF
library directors who could bring in technical people or other
consultants. After additional comments about process, Ms. Marcum
and Ms. Lougee, as co-chairs, concluded that the design committee
would use what it had heard at this meeting to decide what to
take to the DLF Steering Committee, and would consult further
with individuals in the group in shaping the proposal. The
meeting was then adjourned.
Respectfully submitted,
Gerald George, recorder
13 October 2003