DIGITAL LIBRARY FEDERATION
SPRING FORUM
SAN DIEGO
APRIL 13-15, 2005
Westin Horton Plaza
910 Broadway Circle
San Diego, CA 92101
(619) 239-2200
PRECONFERENCE: TUESDAY, APRIL 12
9:00am—5:00pm
METS Editorial Board Meeting—by Invitation Only (Library, Lobby Level)
9:00am—5:00pm
The California Digital Library (CDL) American West Project Meeting—by Invitation Only (Coronado, Third Level)
DAY ONE: WEDNESDAY, APRIL 13
10:00am—12:30pm
METS Editorial Board Controlled Vocabulary Meeting—by Invitation Only (Plaza Room B, Second Level)
10:00am—12:00pm
The Digital Library Federation (DLF) OAI Focus Group Meeting—by Invitation Only (Plaza Room C, Second Level)
12:00pm—1:00pm
Registration
(Second Level, Top of Stairs)
1:00pm—2:10pm
Keynote Address: Technology and the Professorate.
Edward Ayers, Dean, College of Arts and Sciences, University of Virginia
(California Ballrooms A and B, Second Level)
2:10pm—2:30pm
Break
(California Foyer, Second Level)
2:30pm—4:00pm
Session 1:
METS PROFILES
(California Ballroom A, Second Level)
Making METS Profiles Machine-Actionable to Support Validation and Facilitate Interoperability.
Corey Keith, Library of Congress
Until now, METS profiles have existed in prose form, guiding the user in creating
conforming METS documents. The METS Profile schema serves as a
standardized container for this information but due to its form
it must be interpreted by a human to derive any benefit. The
Library of Congress has created an XML-based, machine-readable
scheme to express profile requirements. This scheme goes beyond
the grammar-based METS schema to express structural requirements,
relationships between elements, and profile-required metadata
elements. Expressing requirements in this form enables the
development of generic profile-aware tools. The Library of
Congress has developed a prototype of an obvious example of such
a tool: a METS profile validation tool. Profile validation can be
reused within the institution, but more importantly it
facilitates interoperable exchange of METS documents between
institutions by enforcing the profile contract. Senders can
validate their METS documents before dissemination and receivers
can validate before ingestion and be sure of the conformance to
the profile specification. The presentation will also discuss
other uses for machine-actionable METS profiles including
input/editing tools and dissemination systems.
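As a rough illustration of what machine-actionable profile checking could look like, the sketch below tests a METS document against a small table of structural rules. It is not the Library of Congress scheme or its validation tool: the rule table `PROFILE_RULES` and the function `check_profile` are hypothetical names invented for this example, and only the METS namespace and element names come from the METS schema itself.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical profile rules (an assumption for illustration):
# top-level METS element name -> minimum number of occurrences.
PROFILE_RULES = {"dmdSec": 1, "fileSec": 1, "structMap": 1}


def check_profile(mets_xml, rules=PROFILE_RULES):
    """Return a list of human-readable violations (empty if none)."""
    root = ET.fromstring(mets_xml)
    problems = []
    if root.tag != f"{{{METS_NS}}}mets":
        problems.append("root element is not mets:mets")
    for name, minimum in rules.items():
        # Count direct children of the root with this METS element name.
        found = len(root.findall(f"{{{METS_NS}}}{name}"))
        if found < minimum:
            problems.append(f"expected at least {minimum} {name}, found {found}")
    return problems


# A deliberately incomplete sample document: it has no fileSec.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/">
  <dmdSec ID="DMD1"/>
  <structMap><div/></structMap>
</mets>"""

if __name__ == "__main__":
    for problem in check_profile(SAMPLE):
        print(problem)
```

A sender could run such a check before dissemination and a receiver before ingestion; a real profile would also need rules for relationships between elements and required metadata, which this sketch omits.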
Creating METS Profiles Using METS and MODS.
Morgan Cundiff, Library of Congress
The Library of Congress has created a set of METS Profiles for various document types including musical scores and parts, sheet music, phonodiscs,
compact discs, photographs, print materials, recorded events, and
bibliographic records. The profiles were created for use with the
Library's digital library application called "I Hear America
Singing"
(http://www.loc.gov/rr/perform/ihas/),
but are intended to be generally useful to the digital library
community. It is further intended that the profiles be used as a
basis for interoperation between applications and digital
archives. The profiles make use of METS and MODS together to
express both the logical hierarchy and the physical hierarchy of
given object types. Further, the profiles are also intended to
serve as a first step toward the creation of profile-aware
software tools.
2:30pm—4:00pm
Session 2:
REPOSITORIES AND SERVICES
(California Ballroom B, Second Level)
A Technology Analysis of Repositories and Services.
Sayeed Choudhury and Tim DiLauro, Johns Hopkins University
The concept of the institutional repository has gained traction within the digital library community. While this idea provides a useful description
that may facilitate institutional adoption, it may also
oversimplify the complete picture associated with digital library
architecture. Institutions may now be finding that there will be
multiple repositories and applications in the same environment.
At Johns Hopkins University, we are promoting the idea that applications
should access repositories through an abstract,
repository-agnostic layer, rather than through custom
application-to-repository integrations. With funding from the Mellon Foundation,
Johns Hopkins will evaluate repository software and a
range of services. The result of this evaluation will be a set of
best practices, recommendations, and functional requirements for
repositories and applications. This project reflects our belief
that content should reside in multiple repositories external to
applications, so that the same content can be used by several
systems and support multiple services. This concept will be
tested with content that is moved through repositories into
applications as defined against a set of use cases that reflect
various services. Specific examples we are considering include
digital preservation (e.g., Archive Ingest Handling Test),
e-learning (e.g., Sakai), and e-publishing (e.g., Project Muse).
UVA Library Repository Interface and Tool Evaluation.
Leslie Johnston, Director of Digital Access Services, University of Virginia Library
In fall 2004, the University of Virginia Library launched its
Central Digital Repository for its first experimental year. The
Repository includes a digital image collection, electronic
text collection, and EAD Finding Aids for the UVa Library's
Special Collections. The Repository itself was built using
Fedora, a digital library management architecture jointly
developed by the University of Virginia and Cornell University.
The interface was built using Cocoon, XPAT, JavaScript, and Web
Standards-compliant XHTML and CSS. The interface was designed to
accommodate discovery and delivery of objects across collections
and formats—images, texts, finding aids—and provide access
to tools that support use of the collections in research and
instruction. This includes an Image Viewer for on-the-fly
manipulation of images, and a Digital Object Collector Tool for
the creation of personal collection portfolios, slide shows, and
image reserve Web sites. The interface and tools will be briefly
demonstrated. The development of the interface and tools required
an extensive internal design review, where every element on every
screen was scrutinized for consistency and proper functionality
in a number of browsers for Wintel PCs and Macintosh. The
interface is currently undergoing task-based usability testing
with library staff and faculty. A group of six faculty members
are also currently testing both the interface and the tools in
the teaching of six courses, ranging from undergraduate courses
to graduate seminars to design studios. Testing procedures and
examples of test results and changes made to the interface and
tools will be presented.
The Bibliotheca Alexandrina Digital Library: Services and
Repository-Building.
Noha Adly, ICT and ISIS Director, Bibliotheca Alexandrina
This presentation will begin with a brief overview of the
projects Bibliotheca Alexandrina is undertaking towards building
a digital library, including the Internet Archive, the Million
Book project, and the digital preservation of the modern history
of Egypt.
A special focus will be given to one of the new projects, which
is the building of a Digital Assets Repository (DAR) system to
create and maintain the digital library collections. The system
introduces a data model capable of associating the metadata of
different types of resources with the content, such that
searching and retrieval can be done efficiently. Further, it
automates the digitization process as well as the preservation
and archiving of the digitized output. The goal of this project
is to build a digital resources repository to support the
creation, use, and preservation of digital resources as well as
the development of management tools. These tools help the library
to preserve, manage, and share digital assets. The system is based
on evolving standards for easy integration with Web-based
interoperable digital libraries.
4:00pm—4:30pm
Break
(California Foyer, Second Level)
4:30pm—6:00pm
Session 3: INTEGRATING DIGITAL LIBRARIES (California Ballroom A, Second Level)
The DLF Aquifer Initiative: A Progress Report.
Katherine Kott, Aquifer Director, The Digital Library Federation
DLF Aquifer emerged as the re-awakened strategic direction of the Distributed Open Digital Library initiative of the Digital Library Federation in May 2003. According to the original 1995 Digital Library Federation mission statement, the DLF was established to “bring together -- from across the nation and beyond -- digitized materials that will be made accessible to students, scholars, and citizens everywhere, and that document the building and dynamics of America's heritage and cultures.” DLF has progressed towards this strategic goal since its inception through support, coordination and participation in the development of prototypes, proofs of concept and test-beds that will form the foundation of DLF Aquifer. This project briefing will review the status of the DLF initiatives upon which DLF Aquifer is being built and outline the project plan for the coming year. The update will focus on organizing for collaboration, leveraging existing collections and technical developments and defining the DLF Aquifer problem space.
Integrating Digital Libraries: Teaching, Learning, and Publishing in the DART Project.
Gordon Dahlquist, Brian Hoffman, and David Millman, Columbia University
The Digital Anthropology Resources for Teaching (DART) project
integrates the content acquisition and cataloging initiatives of
a federated digital repository with the development of scholarly
publications and the creation of digital tools to facilitate
classroom teaching, a union between the traditional perspectives
of the library and the scholarly publisher. While the focus of
the existing repository is in the field of anthropology, the DART
model presents a practical methodology to combine repository and
publication that is both exportable and discipline-neutral. The
scope of the digital repository is established by area librarians
and scholars, who work with editorial staff to curate content
selection, describe hierarchies, rights, provenance, and other
metadata, and utilize harvesting protocols such as OAI-PMH to
acquire targeted records and resources. The project then employs
postdoctoral teaching Fellows, working within the EPIC publishing
environment with editorial and technical staff, to apply
teaching-related metadata, annotation, text, etc. to repository
material to create self-contained digital teaching tools such as
online syllabi, complex learning objects, and curriculum models.
Because these teaching tools emerge from and retain links back to
the larger DART repository, students are introduced to a specific
context within a given learning object, while remaining free to
examine those same resources within the unconstrained context of
the entire collection. This unique combination puts students into
a relationship where they benefit from the added value of
editorial and pedagogical structure without sacrificing the
unfiltered access to a traditional library collection crucial to
their own independent research. Because these publications and
learning objects emerge from and lead back into the larger
collection, DART offers an environment where undergraduates are
given the ability to make the transition to graduate-level
research methods in a way not available in most (digital or
non-digital) secondary-source learning materials.
4:30pm—6:00pm
Session 4:
FACULTY-LIBRARY COLLABORATIONS IN BUILDING DIGITAL COLLECTIONS
(California Ballroom B, Second Level)
Oya Y. Rieger, Director, Digital Library and Information Technologies, Cornell University Library, Moderator
Leslie Johnston, Director, Digital Access Services, University of Virginia Library: presentation
Ann Lally, Head, Digital Initiatives, University of Washington Libraries
Danielle Mericle, Digitization Coordinator, Digital Consulting and Production Services, Cornell University Library
There are several initiatives among
DLF members to promote library-faculty partnerships in creating
digital collections. The goal of this presentation is to discuss
the technical, financial, organizational, and policy issues
brought up by these collaborations, which are different from
internal digitization projects. Issues of rights, standards,
workflows, and different terminologies and expectations can pose
significant challenges for the participants of such initiatives.
This forum will bring together representatives from three
libraries with faculty initiatives to compare and discuss
experiences and best practices that are emerging in support of
such programs. After a brief introduction by the panel organizer,
there will be presentations by the panelists based on a standard
set of questions. The goal is to offer a structured presentation
to allow comparison of institutional policies and practices on
issues such as service frameworks, financial aspects such as per
image costs, standards implementation, promotion of the
initiatives, lessons learned, rights management challenges,
integration of these collections with internal projects, etc.
7:00pm—10:00pm
Reception
(Garden Pavilion and Terrace, Fourth Level)
DAY TWO: THURSDAY, APRIL 14
8:00am—9:00am
Breakfast
(Garden Pavilion, Fourth Level)
9:00am—10:30am
Session 5:
DIGITAL LIBRARIES, DIGITAL COMMONS, AND THE DIGITAL LIBRARY OF THE COMMONS
(California Ballroom A, Second Level)
Charlotte Hess, Director, Digital Library of the Commons, Indiana University: presentation
Andy Revelle, Library Coordinator, Digital Library of the Commons, Indiana University: presentation
John A. Walsh, Associate Director for Projects and Services, Digital Library Program, Indiana University: presentation
Charlotte Hess: "The Digital Library of the Commons: From Theory to Practice"—
"Commons" are generally thought of as resources jointly shared by
a group of people. In a commons, the groups can be small (the
family refrigerator) or community-level (sidewalks, playgrounds,
libraries, etc.), or very large, at the international and global
levels (deep-sea oceans, the atmosphere, the Internet, and
scientific knowledge). The commons can be well-bounded (community
forests, irrigation systems, libraries); trans-boundary (Danube
River, migrating wildlife, the Internet); or without clear
boundaries (knowledge, the ozone layer). The unifying thread in
all commons resources is that they are jointly used and managed by
groups of varying sizes and interests. Core to all commons are
issues of collective action, equity, and sustainability. The
Digital Library of the Commons (DLC), as a global repository, is
itself a "commons" serving as a gateway to the scholarly
literature on the commons and common-pool resources (CPRs). The
DLC uses open-source software and is OAI-compliant. It contains
over 1,100 full-text articles, conference papers, working papers,
and dissertations. In addition to offering a self-publication
portal, it contains other services, such as an advanced searching
and browsing mechanism, a comprehensive, searchable bibliography,
and a specialized keyword thesaurus. This presentation will focus
on the institutional design of the DLC in order to serve an
international, interdisciplinary community of students, scholars,
practitioners, and policymakers interested in questions of
effective resource management and sustainability.
Andy Revelle: "Uncommon Findings on Users of the Commons"—This presentation represents
an informal assessment of the users and usage of the Digital
Library of the Commons (DLC). The study employs both transaction
log analysis and experiences working with users. It begins by
presenting a profile of the DLC users, who are an international
and interdisciplinary cohort of scholars, development-agency
workers, and others concerned with the myriad of issues related
to common-pool resources and the commons. This geographical and
institutional variety represents a striking difference from the
users of other self-archiving digital library collections, who
tend to share institutional and/or academic affiliation. I
discuss some issues presented by this diverse user group, most
notably low user bandwidth and matters related to academic
terminology and keyword classification. The presentation
continues by discussing system usage. One function of the DLC is
as a repository for papers presented at conferences related to
common-pool resources. We have observed a positive correlation
between the number of hits and user submissions, and the posting
of conference papers. I argue that this represents a possible
solution for self-archiving repositories faced with low numbers
of submissions.
John A. Walsh: "The Technological Growth of the Digital Library of the Commons"—
The Digital Library of the Commons (DLC) has been, in many ways, a
first for the Indiana University Digital Library Program. It was
one of our first partnerships with a research center, in this
case the Workshop in Political Theory and Policy Analysis
(http://www.indiana.edu/~workshop/),
and our first experience with self-archiving technologies. The
DLC was originally and briefly developed on the IBM Content
Manager platform. With the arrival of EPrints (http://www.eprints.org/)—the
first widely implemented, open-source self-archiving platform—we
migrated to the EPrints solution. Since then, we have upgraded
from EPrints 1.x to 2.x and have gradually integrated additional
features and functionality into the EPrints-based site,
including a searchable bibliography on literature of the commons,
a linked keyword thesaurus, and full-text searching of the
EPrints archive. My talk outlines the development of the Digital
Library of the Commons, our experiences working with the EPrints
self-archiving software, and our efforts to integrate additional
features, beyond out-of-the-box EPrints functionality.
9:00am—10:30am
Session 6:
FASTEN YOUR SEATBELTS: WE ARE APPROACHING A PERIOD OF TURBULENCE . . .
(California Ballroom B, Second Level)
Ann Okerson, Associate University Librarian for Collections and Technical Services, Yale University, Moderator
Mark Sandler, Director of Collections, University of Michigan Library: presentation
Joseph Esposito, Portable CEO Consulting: presentation
Bernard Frischer, Director, Institute for Advanced Technology in the Humanities, University of Virginia: presentation
The digital revolution is
so 90s! We have accomplished much but have done so inside a now
stable and predictable paradigm: online resources that look a lot
like their artifactual equivalents, accessed through an OPAC,
searched with search engines that improve their functionality by
working more and more like print systems squeezed through a 1950
issue of Popular Science (making footnotes active links,
incorporating illustrations that turn out to be animated), and
all using computers of a size, shape, and brand of operating
system that we've been familiar with for at least a decade. The
premise of this session is that it's good to be reminded of the
turbulence ahead so that libraries will be prepared to address it
adequately.
Mark Sandler will take us
to the Google Print construction site to think about what happens
when the real science fiction transformation of print in the
spirit of Vannevar Bush happens and every book potentially
becomes available online.
Joseph Esposito will take
us to a world beyond today's electronic journals, those
publications we love to hate, and will imagine for us the
post-journal culture in all its creative glory, while asking the
question, what would we create today if we did not know
journals?
Bernard Frischer will lift
us off the page into the third dimension—a dimension we will
access away from our comfortable desks and traditional monitors.
Your life preserver is under your seat or in the armrest.
10:30am—11:00am
Break
(California Foyer, Second Level)
11:00am—12:30pm
Session 7:
NEW USER SERVICES
(California Ballroom A, Second Level)
In Search of the Single Search Box: Building a "First-step" Library Search Tool (video demo).
Tito Sierra and Steve Morris, North Carolina State University
Libraries are under increasing pressure to provide users with a single search box that provides access to the diverse set of content and services
available through the library website. Neither library catalogs
nor generic Web site search tools meet this need directly.
Metasearch, while promising, is still generally characterized by
slowness, incompleteness in coverage, and confusing result sets.
At NC State, an analysis of library Web site search logs indicated
that a large percentage of user-submitted search terms target
similar classes of content (e.g., database names, journal titles,
library information) to which the library could readily provide a
direct link. Concurrent with implementing a next generation
metasearch tool, NCSU Libraries is developing a new Web site
search tool designed to provide users with quick and comfortable
access to distributed silos of library content. A
"sponsored-links" component of this tool connects users to
relevant high-use library resources and information. A
subject-identification component provides contextual links to subject
resource guides. Integrated results from ancillary local indexes
enable use of the tool as a "first-step" in library search. This
presentation will describe an in-house solution based on open
source tools such as Nutch and SWISH-E. Challenges of current
development will be discussed and future development directions
will be outlined.
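The "sponsored-links" idea described above can be sketched as a lookup that runs ahead of the ordinary site search. This is only an illustration under stated assumptions, not the NCSU implementation: the table `SPONSORED_LINKS`, its example.edu URLs, and the functions `normalize` and `first_step_search` are all hypothetical, and the fallback search is passed in as a plain function rather than a real Nutch or SWISH-E call.

```python
# Hypothetical table of high-use search terms and their direct links.
# In practice such entries might be derived from search-log analysis.
SPONSORED_LINKS = {
    "jstor": "https://library.example.edu/databases/jstor",
    "hours": "https://library.example.edu/hours",
}


def normalize(query):
    """Lowercase the query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def first_step_search(query, fallback):
    """Return a direct link (if any) alongside ordinary search results.

    `fallback` is any callable that takes the raw query and returns a
    list of results; it stands in for the full-text index here.
    """
    link = SPONSORED_LINKS.get(normalize(query))
    return {"sponsored": link, "results": fallback(query)}


if __name__ == "__main__":
    # A stand-in fallback that returns no indexed results.
    print(first_step_search("  JSTOR ", lambda q: []))
```

Exact matching on a normalized query keeps the lookup fast and predictable; a production tool would likely need fuzzier matching for misspellings and variant titles.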
docWORKS/METAe: Automated Conversion of Printed Materials into METS/ALTO Objects
Claus Gravenhorst and Daniel Lanz, CCS Content Conversion Specialists.
11:00am—12:30pm
Session 8: COLLABORATIVE SERVICES
(California Ballroom B, Second Level)
Southern Spaces: A Collaborative Model for Open Source Scholarly
Publishing
Katherine Skinner, Emory University
This presentation
will provide an overview of the intensive collaboration between
librarians and scholars that has produced the peer-reviewed
internet journal, Southern Spaces (www.southernspaces.org). It
will consider the viability and sustainability of the model
Southern Spaces offers of born digital, library-supported
publishing. Recent advances in digital technologies have fostered
new forms of information exchange that have significant
implications for the field of scholarly publishing. Although
various internet publishing models, including born print and born
digital, have been tested by scholarly presses, society presses,
and commercial entities, no institutionalized form of scholarly
e-publishing has emerged to date. Further, although libraries
have subscribed to, and sometimes even hosted, many of these new
e-publication forms, few libraries have participated
collaboratively in the creation and maintenance of internet-based
publications. Emory University's MetaScholar Initiative pioneered
a collaborative model for library-based e-publishing by bringing
together librarians and scholars to design and implement a born
digital, open access publication, Southern Spaces. This
peer-reviewed internet journal and scholarly forum seeks to
expand the potentials of scholarly publication in two seminal
ways. First, it reexamines the relationship between form and
content, pushing scholarship toward new multimedia explorations
of topics that cannot be managed in traditional print formats.
Second, it explores the possibility of the library taking on a
new role as a publishing center for scholarly work. This
presentation will encourage discussion of the feasibility of
fostering and supporting such open access journals as digital
library initiatives.
Creating an Online Library of Map and Geospatial Data: Challenges and Opportunities
Tsering W. Shawa, Princeton University
This presentation will share how we designed a system that allows
us to manage, store, and make scanned maps, aerial photographs,
satellite images, and geospatial data accessible online. The
system was designed using off-the-shelf commercial software
packages such as ESRI's ArcCatalog, ArcIMS, and ArcSDE, Mapping
Science's GeoJP2 Image Server, Encoder and Decoder, Microsoft's
SQL Server database, and Safe Software's SpatialDirect and FME.
The presentation will not only discuss how and why we designed
this special system architecture but also how we developed our
workflows, what standards we used in creating metadata, scanning
maps, and compressing images using JPEG2000 technology, and what
lessons we learned from designing this complex system.
12:30pm—2:30pm
Break for Lunch (Individual Choice)
2:30pm—4:00pm
Session 9: ADVANCES IN SHAREABLE METADATA AND WEB SERVICES
(California Ballroom A, Second Level)
Best Practices for OAI Data Provider Implementations and Shareable Metadata: A
DLF Initiative
Kat Hagedorn, University of Michigan
Sarah L. Shreeves, University of Illinois at Urbana-Champaign
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) has been widely adopted since its inception in 2001; there are currently
over 500 active 2.0 data providers from a wide variety of domains
and institution types. The protocol has demonstrated its
usefulness as a tool to move and aggregate metadata from diverse
institutions. The first phase of the Digital Library Federation's
Aquifer project will include an OAI-based repository for metadata
harvested from participating DLF members.
However, as the protocol has become more widely adopted, several broad areas of concern have surfaced—mainly through the documentation of
service providers—that would benefit from the establishment of
best practices. In the summer of 2004 a group of DLF and NSDL
affiliated OAI data and service providers began work on a set of
best practices for both data and service providers. These cover
implementation practices to be encouraged among data and service
provider groups; communication both among and between data and
service providers; and the optimization of shareable metadata.
This talk will present the completed drafts of two sections of
the OAI Best Practices work—OAI data provider implementations
and shareable metadata—to the DLF membership. We will briefly
discuss the process of creating the best practices, present highlights
from these two sections, and outline next steps in the process,
including the need to operationalize the best practices through
development of tool sets and implementation within commercial and
open source content management systems. We will offer an
opportunity for feedback and discussion.
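For readers less familiar with the protocol, the following minimal sketch shows the shape of an OAI-PMH harvesting exchange: building a ListRecords request and reading record identifiers and the resumption token from a response page. The verb, the `metadataPrefix`, `set`, and `resumptionToken` arguments, and the response namespace come from the OAI-PMH 2.0 specification; the base URL, the sample response, and the function names are assumptions made for illustration, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, token=None):
    """Build a ListRecords request URL for a data provider."""
    if token:
        # A resumptionToken is an exclusive argument: it replaces the others.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.iter(f"{OAI_NS}identifier")]
    token_el = root.find(f".//{OAI_NS}resumptionToken")
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Hypothetical, heavily abbreviated response from an invented provider.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(parse_page(SAMPLE_RESPONSE))
```

A harvester would repeat the request with each returned token until a page arrives with an empty or absent `resumptionToken`.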
Ockham Update.
Jeremy Frumkin, Oregon State University
The OCKHAM Initiative is a collaborative effort to
promote interoperability among digital library services.
Sponsored by the Digital Library Federation, the initiative is
currently working on a funded NSF/NSDL grant to build a digital
library services registry, and a suite of digital library
services. This presentation will consist of an overview of the
initiative, a detailed look at the digital library services
registry, and a close look at the digital library services
developed by the project.
2:30pm—4:00pm
Session 10: LOCKSS AND THE HUMANITIES (California Ballroom B, Second Level)
Tom Robertson, Assistant Director and Technical Manager, LOCKSS Program: presentation
Ann Okerson, Associate University Librarian for Collections and Technical Services, Yale University
Glen Worthy, Stanford University
John Ockerbloom, Digital Librarian Planner, University of Pennsylvania: presentation
Bill Kehoe, Digital Analyst, Cornell University: presentation
In February 2004, 13
institutions met and agreed to collaborate on a project to
collect and preserve important, born digital, freely available
humanities e-journals using the LOCKSS system. These 13
institutions (and others) are contributing time from a technical
person, time from a collection development person, and a LOCKSS
computer:
- Columbia University
- Cornell University
- Harvard University
- Indiana University
- Johns Hopkins University
- Library of Congress
- New York University
- New York Public Library
- Princeton University
- Stanford University
- University of Pennsylvania
- University of Wisconsin
- Yale University
To date the group has
identified hundreds of titles. The quality of these at-risk
e-journals is such that most research libraries would have them,
if they were available on paper. In many cases their use of
multimedia and animation makes paper versions impossible. As
titles from this project are released for preservation, they are
listed at http://lockss.stanford.edu/about/titles.htm
The group has experience with the efficacy and efficiency of
using the LOCKSS system to build born digital, open access,
humanities e-journals collections. Specifically we have
experience: selecting titles; obtaining publisher permission;
developing software; collecting; and preserving the content. This
cooperative collection development model may be applicable to
other subjects with important collections of born digital and
open access content.
The speakers will address, from their perspective: reasons for participation; processes and
procedures to date; and early key insights.
We've chosen to have more
than the usual number of speakers - and to have each person speak
briefly (10 minutes) to underscore the community and
collaborative nature of this work.
- Ann Okerson will address broad strategic collections issues and moderate the panel
- Tom Robertson will address LOCKSS Program technical progress, including OAI and format migration
- Glen Worthy will address collection curatorial issues
- John Ockerbloom will address processes and insights from a technical perspective
- Bill Kehoe will address processes and insights from a technical perspective
4:00pm—4:30pm
Break
(California Foyer, Second Level)
4:30pm—6:00pm
BIRDS OF A FEATHER SESSIONS
1) Digital Library Education
(Coronado, Third Level)
Kristine Brancolini, Digital Library Program, Indiana University
Leigh Estabrook, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
In October 2004 Indiana University and the
University of Illinois at Urbana-Champaign began an IMLS-funded
project to create two research-based curricula at our respective
schools of library and information science to prepare librarians
for work in digital library programs in libraries, archives, and
museums. "Building an Effective Library Curriculum through
Library School and Academic Partnerships" builds upon the
experience of the digital library programs at these universities
and the desire on the part of their library schools to learn from
practitioners. Many library schools offer "digital library"
courses, but how well do library school courses synchronize with
the knowledge and skills actually needed by librarians who work
in digital library programs?
To be successful we must engage in
discussions with librarians from many different digital library
settings. The Digital Library Federation Forum offers a perfect
opportunity for interacting with these librarians. At the spring
forum in San Diego we would like to have a more informal
discussion with others who might be interested in this topic. We
would give a brief overview of our results to date, but the real
purpose would be discussion with librarians from all levels of
experience. Our project also includes paid and unpaid
internships. In addition to discussing education for digital
librarianship in general, we would also like to get a sense of
other institutions that might be willing to supervise interns
from our new program, which will be launched in fall 2005. For
more information about our project, please see the project web
site:
http://lair.indiana.edu/research/dlib/index.php
2) Preservation Metadata (Harbor A, Third Level)
Rebecca Guenther, Library of Congress, Priscilla Caplan, University of Florida, and Brian LaVoie, OCLC.
A discussion of recent advances in the area of preservation metadata. By the time
of the Forum the PREMIS preservation metadata element set and
data dictionary will have been in public circulation for more
than a month. This would be a good time to start a new discussion
about where we are with preservation metadata, how it is being
implemented and managed at different institutions, and what the
next steps should be to move forward in this area. Topics which
might be discussed at the BOF could include, but would certainly
not be limited to, the potential for formal standards-building in
this area; opportunities for collaborative creation and sharing
of preservation metadata across repositories; automated tools;
and the role of registries in supporting maintenance of certain
forms of preservation metadata.
3) OAI Best Practices
(Harbor B, Third Level)
Sarah Shreeves, University of Illinois at Urbana-Champaign
The Open Archives
Initiative Protocol for Metadata Harvesting (OAI-PMH) has been
widely adopted since its inception in 2001; there are currently
over 500 active data providers from a wide variety of domains and
institution types. The protocol has demonstrated its usefulness
as a tool to move and aggregate metadata from diverse
institutions. However, as the protocol has become more widely
adopted, several areas of concern have surfaced that would
benefit from documentation of best practices. This session will
be a discussion of the work of a DLF-convened group to develop
best practices for OAI data and service providers, focusing
on two sections: 1) the implementation of OAI data providers
and 2) shareable metadata. We encourage participants to actively
share their concerns, questions, and ideas on these
guidelines.
4) Digital Imaging
(Balboa, Third Level)
Clay Redding, Princeton University
Roel Muñoz, Princeton University
Digital imaging and related technologies. Topics include:
- equipment
- imaging and metadata workflows
- color management
- bulk and archival storage issues
- digital preservation initiatives
- growing use of JPEG2000
- automated generation of technical metadata such as MIX/Z39.87
- improving communication amongst practitioners
DAY THREE: FRIDAY, APRIL 15
8:00am—9:00am
Breakfast
(Garden Pavilion, Fourth Level)
9:00am—10:30am
Session 11:
THE NATIONAL DIGITAL PRESERVATION PROGRAM (NDIIPP): DLF INSTITUTIONAL PARTICIPATION
(California Ballroom A, Second Level)
Martin Halbert, Emory University, Moderator
Caroline Arms, Library of Congress: presentation
Martin Halbert (Emory): presentation
David Ackerman (NYU)
Suzanne Samuel (CDL): presentation
Steven Morris (NCSU): presentation
Bill Mischo (UIUC): presentation
This panel will include a brief introductory recap of the NDIIPP, as well as brief presentations by DLF institutions participating in the program concerning the goals of the cooperative projects they are
leading.
9:00am—10:30am
Session 12:
DIGITAL LIBRARY GRID INITIATIVES
(California Ballroom B, Second Level)
MacKenzie Smith, MIT, Moderator
Mark Conrad, NARA
Chris Frymann, UC San Diego: presentation
Ray Larson, UC Berkeley SIMS: presentation
Reagan Moore, San Diego Supercomputer Center: presentation
Richard Rodgers, MIT Libraries: presentation
Rob Sanderson, University of Liverpool
The panel will describe several important projects which are
working on applications of the computational grid
(http://www.globus.org/) and the data grid
(http://www.sdsc.edu/srb/) to the digital library and archives
domain. Three presentations will review current projects and
discuss their relevance to the future of digital library
developments:
1. DSpace digital library and data grid integration
2. NARA research prototype persistent archive based on data grids, and
3. Digital Library grid initiative.
Each of these projects is looking at some aspect of digital
library work, either in the discovery process (e.g., Cheshire) or
the digital library arena (e.g., DSpace) or the digital archives
arena (e.g., the persistent archive prototype), and all are working
with SRB and related grid tools to accomplish this. By including
this range of grid-based projects the panel will outline a
roadmap for how grid technology might affect digital library work
over the next five to ten years.
10:30am—11:00am
Break (California Foyer, Second Level)
11:00am—12:30pm
Session 13:
NEW PROVIDER SERVICES (California Ballroom A, Second Level)
METS Navigator: A METS-based Display and Navigation Utility for Multi-Part Digital Objects.
John Walsh, Associate Director for Projects and Services, Indiana University Digital Library Program
Jenn Riley, Metadata Librarian, Indiana University Digital Library Program
Dazhi Jiao, System Analyst/Programmer, Indiana University Digital Library Program
Michelle Dalmau, Interface and Usability Specialist, Indiana University Digital Library Program
The presentation will discuss METS Navigator, a METS-based system for displaying and navigating multi-image digital objects. Using the information in the METS <structMap> elements, METS Navigator builds a hierarchical menu that allows users to navigate to specific
sections of a document, such as title page, specific chapters,
illustrations, etc. METS Navigator also allows simple navigation
to the next, previous, first, and last page image or component
part of a digital object. METS Navigator also makes use of the
descriptive metadata in the METS document to populate the
interface with basic descriptive information about the digital
object. METS Navigator was initially developed by the Indiana
University Digital Library Program for the online display and
navigation of brittle books digitized by the IU Libraries' E.
Lingle Craig Preservation Laboratory. However, realizing the need
for such a tool across a wide range of digital library projects
and applications, we designed the system to be generalizable and
configurable. We have also designed METS Navigator with the goal
of eventual release as a free open source utility for the wider
digital library community. Our presentation will trace the
development of METS Navigator, demonstrate the METS Navigator
system, review METS Navigator configuration options, and outline
plans for future development. METS Navigator is built using Java
and open source Web technologies, including the Apache Struts Web
Application Framework, the Castor Java & XML Data Binding
libraries, and Ant, and runs under a Web application server such
as Apache Tomcat.
CDL's Interface Customization Tools: How One Provider of Digital Library Tools
Enables Service Providers to Skin and Slice Bodies of Content.
Steve Toub, Web Design Manager, The California Digital Library
Among other activities, the California Digital Library (CDL) provides site-building tools to digital libraries. CDL's Interface
Customization Tools were first released in April 2004. At
present, several service providers are using this set of
templates and documentation in conjunction with CDL's XML gateway
to provide branded interfaces for the subset of content they have
submitted to CDL's repository. CDL is expanding this set of
templates and documentation to allow its customers to
"skin and slice" TEI-encoded texts and EAD-encoded finding aids;
this new system works in parallel with the eXtensible Text
Framework (XTF), the new Lucene-based platform for searching and
displaying well-formed XML, which was introduced at the Fall 2004 DLF Forum.
In addition to illustrating how the XSLT-based customization tools work, the presenter will cover lessons learned, current development, and future plans including:
- the ability for non-programmers to work with the system;
- refactoring the display XSLTs to take advantage of a common branding configuration file;
- issues relating to conversion of the repository from a file system to a database;
- the inclusion of JSP in addition to XSLT;
- how to apply interface customization within CDL's metasearch platform; and
- consideration of how generalizable and transferable these tools will be both to future activities at CDL and to others in the community.
11:00am—12:30pm
Session 14: NEW CHALLENGES IN DIGITAL PRESERVATION
(California Ballroom B, Second Level)
DSpace and Web Material: From Preserving Bundled Web Pages to Preserving
Websites as Applications.
Leslie Myrick, Digital Library Programmer/Analyst, New York University
As a partner in the DPP "Web at Risk" Project headed by the California Digital Library, NYU will be examining the feasibility of using DSpace for the ingest, storage, and preservation of websites, and for access to them. DSpace 1.x introduced functionality to ingest and store HTML pages along with any ancillary files (e.g. images, .css) as bundles of bitstreams, with the HTML wrapper nominated as the primary bitstream for display purposes. HTML pages can thus be ingested, stored, displayed and accessed as discrete bundled units, though not necessarily as navigable components of a website. By nominating a website's entry page as the primary bitstream to all other files, on the other hand, entire websites can also be ingested as such and navigated within DSpace. This presentation will offer a preliminary analysis of the changes necessary to make DSpace 2.x and METS fully amenable to website ingest, management, access and navigation when the source of the ingest is a gzipped Heritrix .arc archive. The analysis will include an exploration of the relative strengths and quirks in the data models of various content packaging standards such as METS, IMS-CP, XFDU and DIDL in managing website objects, whether deposited in DSpace or in other repository systems.
Old Wine in New Wineskins: Sustaining Access to and Preserving Legacy Digital
Collections
Joy Paulson, Preservation Librarian, Mann Library, Cornell University
The earliest digital library collections are now more than a decade old. These early collections were often created as part of research and demonstration projects at a time when there were no best
practices or standards for digital library creation. Some of
these collections are no longer available online and some have
disappeared entirely due to the use of proprietary software or
technology that has become outdated. However, a number of these
collections are still available online, although they may not
meet best practices now in place. For example, metadata standards
developed only recently, so many early digital collections
recorded little or no metadata, and many were scanned at
resolutions below 600 dpi. Are these collections worth
maintaining access to and preserving? What types of enhancements
may be necessary to maintain access to or to preserve these
collections? The Core Historical Literature of Agriculture
(CHLA), created at Cornell in a series of projects between 1992
and 2000, will be used as a case study to examine these issues
and the level of resources, staff and financial, necessary to
maintain access to legacy collections and to enhance them for
improved access and preservation.
POST-CONFERENCE
12:30pm—5:30pm
DLF Developers' Forum—by Invitation Only (Santa Fe Room, Second Level)
Linking Public Search Engines to Library Content: A Consideration of Approaches.
1:00pm—6:00pm
DLF OAI Best Practices and IMLS Project Teams Joint Meeting—by Invitation Only (Plaza Room, Second Level)