DLF Fall Forum 2003: Albuquerque

Fall Forum 2003
Albuquerque, New Mexico,
17-19 November, 2003

DLF Forum

Winners

DLF Forum Fellowships For Librarians New To The Profession

Paul Fogel, California Digital Library

Anne Graham, University of Washington

Uri Kolodney, University of Texas, Austin

Jin Ma, Penn State University

Christine Madsen, Harvard University

Program Committee

David Seaman: Digital Library Federation

Dale Flecker: Harvard University

Nancy Hoebelheinrich: Stanford University

Leslie Johnson: University of Virginia

Jerome McDonough: New York University

John Ober: California Digital Library

Jennifer Vinopal: New York University

Pre-Forum

Monday 17 November

9:00-1:00: DLF Developers Forum. The Potter’s Room.

10:00-1:00: E-resources Management Initiative Steering Committee meeting (Tim Jewell, chair). The Fireplace Room.

Fall 2003 Meeting Summary

Forum Schedule

Digital Formats: Factors for Sustainability, Functionality, and Quality. Caroline Arms and Carl Fleischhauer. Office of Strategic Initiatives, Library of Congress.

Digital Object Format Validation. Stephen L. Abrams, Digital Library Program Manager, Harvard University Library

The DLF E-Resource Management Initiative: Project Report. Tim Jewell, University of Washington; Ivy Anderson, Harvard; Adam Chandler, Cornell; Sharon Farb, UCLA; Kimberly Parker, Yale; Angela Riggio, UCLA; Nathan Robertson, Johns Hopkins.

MPEG-21 DIDL, the OAI-PMH, and the OpenURL as building blocks for representing, storing and disseminating complex digital objects. Jeroen Bekaert, Patrick Hochstenbach, Herbert Van de Sompel. Los Alamos National Laboratory, Research Library, Prototyping Team.

Update on the Fedora Open-Source Project. Sandy Payette, Cornell University

LOCKSS Implementation: technology, collections, and access.
Tom Robertson, Technical Manager, LOCKSS Program, Stanford University;
Perry Willett, Head, Library Electronic Text Resource Service, Indiana University;
Martin Halbert, Director for Library Systems, Emory University.

Responding to Digital Data Needs: The DEWI System. Ron Nakao and Chris Bourg, Stanford University Libraries

Building Collections with Greenstone Digital Library Software. Tod A. Olson, Programmer/Analyst Digital Library Development Center, University of Chicago Library

Metadata Tradeoffs in High-Production Digitization Environments. Nancy J. Hoebelheinrich. Metadata Coordinator, Stanford University Libraries / Academic Information Resources (SUL/AIR)

The Union Catalog of Art Images (UCAI): Aggregating and Standardizing Diverse Legacy Metadata. Esme Cowles (Database Developer) and Linda Barnhart (Project Coordinator), Union Catalog of Art Images

Preservation-Worthy Digital Video; or, How to Drive your Library into Chapter 11. Jerry McDonough. New York University

California Digital Library's Digital Preservation Program. Patricia Cruse, Director, Digital Preservation Program, California Digital Library

Digital Scholarship in the Academy: What Scholars Need. Ann Lally, Head, Digital Initiatives, University of Washington Libraries

The Evolution of an Interface from the User Perspective: From End User Testing to a Usage Log Analysis. Sarah Chandler, Cornell University

A rose is a rose by any other name; what's a DODL? Michael Keller, Stanford University

The National Digital Information Infrastructure and Preservation Program. Clay Shirky and Laura Campbell

LibData: a library web management system. Paul F. Bramscher, Shane A. Nackerud, and John T. Butler, University of Minnesota, Twin Cities

NAND: A New Tool for an Old Problem. Charles Blair, Elisabeth Long, and Keith Waclena, The Digital Library Development Center, The University of Chicago Library

NSDL Projects Update
"The OCKHAM Library Network." Martin Halbert, Emory University, with a draft of the OCKHAM reference model, as well as results from an early survey.
"Adding Value to NSDL: A Business Proposition and Service Enhancement." Laine Farley, Director, Digital Library Services, California Digital Library.

Data Mining Library Collection Silos: An Opportunity for Cooperative Collection Management of Print and Electronic Books. Lynn Silipigni Connaway. OCLC Online Computer Library Center, Office of Research.

From aggregation to commerce: the next phase for the RLG Cultural Materials Alliance. Ricky Erway, Digital Resources Manager, RLG

Monday 17 November

1:00-2:00: REGISTRATION. Foyer to Alvarado Room.

2:00-3:30: BREAKOUT SESSION 1: PRESERVATION. Alvarado Room ABC.

Digital Formats: Factors for Sustainability, Functionality, and Quality.

Caroline Arms and Carl Fleischhauer. Office of Strategic Initiatives, Library of Congress.
View presentation

The Library of Congress is drafting a planning framework that identifies and documents digital content formats that are promising (and unpromising) for long-term sustainability. The resulting reference resource is intended to serve staff who evaluate born digital content for selection for the Library's collections and make provisions to sustain that content.

The term format is used broadly in this context, and includes:

· file formats at the level indicated by Windows file extensions or Internet MediaType (aka MIME type)

· versions or subtypes of these that develop through time or are tailored to narrow, specific purposes

· classes of related formats, whose familial characteristics are important

· "wrappers" that must be distinguished in terms of their underlying bitstream structures

· bundling formats that bind together the files or bitstreams comprising a single digital work

The initial investigation has outlined two sets of high-level factors that may be used when choosing formats:

(a) conceptual factors that affect the sustainability of any digital format

(b) factors that relate to quality or functionality (beyond normal rendering) that might be desired for certain categories of content

The shorthand names for the sustainability factors are disclosure, adoption, transparency, self-documentation, external dependencies, and technical protection mechanisms. Quality and functionality factors have been sketched for sound, still images, text, and video -- content categories with which Library staff have experience in the digital realm.

Additional content categories are being added as the investigation continues. The activity is also developing summary descriptions for digital formats, intended to be synergistic with the proposed Digital Format Registry.

Digital Object Format Validation.

Stephen L. Abrams, Digital Library Program Manager, Harvard University Library
View presentation

The concept of representation format, or type, permeates all technical areas of digital repositories. Policy and processing decisions regarding object ingest, storage, access, and preservation are frequently conditioned on a per-format basis. In order to achieve necessary operational efficiencies, repositories need to be able to automate these procedures to the fullest extent possible.

JSTOR and the Harvard University Library are collaborating on a project to develop an extensible framework for format validation: JHOVE (pronounced "jove"), the JSTOR/Harvard Object Validation Environment.

JHOVE provides functions to identify, validate, and characterize digital objects:

Format identification is the process of determining the format to which a digital object conforms, e.g.: "I have a digital object; what format is it?"

Format validation is the process of determining the level of compliance of a digital object to the specification for its purported format: "I have an object purportedly of format F; is it?"

Format characterization is the process of determining the format-specific significant properties of an object of a given format: "I have an object of format F; what are its salient properties?"

JHOVE is a stand-alone, command-line oriented Java application, with an extensible plug-in architecture. In its initial release, JHOVE includes modules for recognizing and validating ASCII and UTF-8 encoded text, TIFF (including popular public profiles, such as TIFF/EP, TIFF/IT, and GeoTIFF), and PDF (including profiles such as PDF/X-1, -1a, -2, and -3).
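The three questions above can be illustrated with a minimal Python sketch. It is not JHOVE's API (JHOVE is a Java application); the function names are hypothetical, and only two formats are handled in a deliberately shallow way: UTF-8 text and the 8-byte TIFF header.

```python
# Hypothetical illustration of identify / validate / characterize.
# This is NOT JHOVE's API; all names here are assumptions made for the example.
import struct

# The first four bytes of a TIFF file: byte-order mark plus the number 42.
TIFF_MAGIC = {b"II*\x00": "little-endian", b"MM\x00*": "big-endian"}


def _is_utf8(data: bytes) -> bool:
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def identify(data: bytes) -> str:
    """Identification: 'I have a digital object; what format is it?'"""
    if data[:4] in TIFF_MAGIC:
        return "TIFF"
    return "UTF-8" if _is_utf8(data) else "unknown"


def validate(data: bytes, fmt: str) -> bool:
    """Validation: 'I have an object purportedly of format F; is it?'"""
    if fmt == "UTF-8":
        return _is_utf8(data)
    if fmt == "TIFF":
        # Shallow check only: magic number, and a first-IFD offset inside the file.
        if len(data) < 8 or data[:4] not in TIFF_MAGIC:
            return False
        order = "<" if data[:2] == b"II" else ">"
        (ifd_offset,) = struct.unpack(order + "I", data[4:8])
        return 8 <= ifd_offset < len(data)
    return False


def characterize(data: bytes, fmt: str) -> dict:
    """Characterization: 'I have an object of format F; what are its salient properties?'"""
    if fmt == "UTF-8":
        text = data.decode("utf-8")
        return {
            "characters": len(text),
            "lines": len(text.splitlines()),
            "ascii_only": all(ord(c) < 128 for c in text),
        }
    if fmt == "TIFF":
        return {"byte_order": TIFF_MAGIC[data[:4]]}
    return {}
```

A repository ingest step could call `identify` first and then condition its policy on the result of `validate`, which is the per-format automation the abstract describes.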

2:00-3:30: BREAKOUT SESSION 2: RESOURCE MANAGEMENT. Alvarado D.

The DLF E-Resource Management Initiative: Project Report.

Tim Jewell, University of Washington; Ivy Anderson, Harvard; Adam Chandler, Cornell; Sharon Farb, UCLA; Kimberly Parker, Yale; Angela Riggio, UCLA; Nathan Robertson, Johns Hopkins.
View presentation

3:30-4:00: Break: Foyer to Alvarado Room.

4:00-5:30: BREAKOUT SESSION 3: ARCHITECTURES. Alvarado Room ABC

MPEG-21 DIDL, the OAI-PMH, and the OpenURL as building blocks for representing, storing and disseminating complex digital objects.

Jeroen Bekaert, Patrick Hochstenbach, Herbert Van de Sompel. Los Alamos National Laboratory, Research Library, Prototyping Team.
View presentation
View handout

Various XML-based approaches aimed at representing so-called complex digital objects have emerged over the last years. The MPEG-21 Digital Item Declaration Language (DIDL) is an XML-packaging specification that, so far, has received little attention in the Digital Library Community. The first part of this presentation will highlight major characteristics of DIDL, and report on research conducted at the LANL Research Library to determine the applicability of DIDL for the representation of complex objects in the LANL repository. The second part of the presentation will discuss a repository architecture under development at LANL, in which DIDL-conformant documents are the unit of storage. The architecture builds on the OAI-PMH, the forthcoming NISO OpenURL Framework Standard, and concepts from the MPEG-21 Digital Item Processing specification to make stored content accessible. While the discussion will be framed in the context of ongoing work at LANL, it is hoped that it will reveal the relevance of some of the presented concepts to other Digital Library efforts.
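As a rough illustration of how OAI-PMH could expose stored documents, the Python sketch below builds a GetRecord request URL. The verb and argument names (`verb`, `identifier`, `metadataPrefix`) come from the OAI-PMH specification; the base URL, the item identifier, and the `didl` metadata prefix are assumptions made for the example and do not describe the LANL repository.

```python
from urllib.parse import urlencode


def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request for one item in one metadata format."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Hypothetical repository and identifier, for illustration only.
url = get_record_url(
    "http://repository.example.org/oai",
    "oai:example.org:item/42",
    "didl",
)
print(url)
```

The response to such a request would carry the packaged object inside the OAI-PMH metadata element, so the unit of storage and the unit of harvesting can be the same document.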

Update on the Fedora Open-Source Project.

Sandy Payette, Cornell University
View presentation

Thanks to a generous grant from the Andrew W. Mellon Foundation, the University of Virginia Library and Cornell Information Science have now made available an open-source version of the Flexible Extensible Digital Object Repository Architecture (Fedora). This presentation is an update on the Fedora project since its initial release on May 15, 2003.

Fedora is an open source digital object repository system that supports both management and delivery of heterogeneous digital content. The system can be used as the foundation for a variety of information management solutions including institutional repositories, preservation management systems, digital asset management systems, content management systems (CMS), and digital libraries.

As a simple use case, Fedora can be used to manage and deliver digital objects that aggregate one or more content streams into a single digital object. For example, a Fedora object can be an aggregation of different resolutions of the same digital image. More interestingly, Fedora provides the building blocks for managing and delivering more complex objects which have services associated with them. These objects can define relationships among content streams and can be configured to deliver one or more specialized views or "representations" of the content streams within the object.

Fedora provides web service interfaces to support digital object management and access. New features of the system include content versioning, object change tracking, time-stamped access requests (i.e., to obtain former views of objects), and a new graphical object editor.
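The idea behind a time-stamped access request can be sketched in a few lines of Python. This is a conceptual illustration only, not Fedora's implementation or API: the `VersionedStream` class and its methods are hypothetical, and versions are assumed to be saved in chronological order.

```python
from bisect import bisect_right
from datetime import datetime


class VersionedStream:
    """Hypothetical content stream that keeps every saved version."""

    def __init__(self):
        self._times = []     # save times, in ascending order
        self._contents = []  # content saved at the matching time

    def save(self, when: datetime, content: str) -> None:
        if self._times and when <= self._times[-1]:
            raise ValueError("versions must be saved in chronological order")
        self._times.append(when)
        self._contents.append(content)

    def as_of(self, when: datetime):
        """Return the content that was current at `when`, or None if none existed yet."""
        index = bisect_right(self._times, when)
        return self._contents[index - 1] if index else None
```

Asking for the stream as of a past date returns the version that was current then, which is what obtaining a former view of an object amounts to.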

This presentation will offer a brief review of the system features, followed by a demo of the latest software release (Fedora 1.2, available October 2003). It will also report on the progress of institutions that have installed Fedora, software download statistics, and future development plans. Fedora is currently available for download at http://www.fedora.info

4:00-5:30: BREAKOUT SESSION 4: PRESERVATION. Alvarado Room D

LOCKSS Implementation: technology, collections, and access.

Tom Robertson, Technical Manager, LOCKSS Program, Stanford University; Perry Willett, Head, Library Electronic Text Resource Service, Indiana University; Martin Halbert, Director for Library Systems, Emory University.

An unintended consequence of the web is that libraries cannot easily collect and preserve e-collections. They lease access to paid content; they merely access freely available content. Libraries are unable to own collections; they are able to offer only very limited services around these collections. Libraries in the web environment are unable to fulfill their traditional role as society's memory organizations. One of the biggest risks to libraries fulfilling their memory role, or to any institution wishing to take responsibility for digital preservation, is a budget cut. The LOCKSS approach tries to prevent content from being lost through budget cuts by dispersing all costs and responsibilities across many institutions. The system's robustness depends upon redundancy of hardware, software, content, and administration.
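Why redundancy of content helps can be shown with a small Python sketch in which several caches compare their copies and a damaged copy is repaired from the majority. This is a simplified illustration under stated assumptions, not the actual LOCKSS polling protocol; the function names and the strict-majority rule are choices made for the example.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Fingerprint a copy so copies can be compared without exchanging them."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(copies: dict) -> list:
    """Repair copies that disagree with the majority.

    `copies` maps a cache name to the bytes it holds and is updated in place.
    Returns the names of the caches that were repaired.
    """
    votes = Counter(digest(content) for content in copies.values())
    winner, count = votes.most_common(1)[0]
    if count * 2 <= len(copies):
        raise ValueError("no majority; cannot decide which copy is good")
    good = next(c for c in copies.values() if digest(c) == winner)
    repaired = []
    for name, content in list(copies.items()):
        if digest(content) != winner:
            copies[name] = good
            repaired.append(name)
    return repaired
```

With three caches, one damaged copy is outvoted and replaced; with only two disagreeing caches the function refuses to guess, which is one reason more participating institutions make the arrangement more robust.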

The panel will present a technical overview and real-world implementation activities: collection development in the humanities, and seamless end-user access to locally stored content.

Tom Robertson, Technical Manager, LOCKSS Program, Stanford University: Tom will outline the LOCKSS technology, the Program's status, and next steps.
View presentation

Perry Willett, Head, Library Electronic Text Resource Service, Indiana University: Perry co-chairs a group selecting American and British literature e-journals for LOCKSS preservation. He will discuss the importance of preserving born-digital humanities materials and the process of obtaining publisher permissions.
View presentation

Martin Halbert, Director for Library Systems, Emory University: Emory is prototyping various methods of configuring LOCKSS caches within institutional networks so readers can access "lockss-cached" materials when the publisher's site is unavailable. Martin will outline experiences and methods to integrate LOCKSS using PAC files, EZ Proxy, and Squid, and describe future work with OpenURLs.
View presentation

6:30-9:00: Reception. The Franciscan Ballroom, Sheraton Old Town.


Tuesday 18 November

8:00-9:00: Continental Breakfast. Foyer to Alvarado Room.

9:00-10:30: BREAKOUT SESSION 5: TOOLS. Alvarado Room ABC.

Responding to Digital Data Needs: The DEWI System.

Ron Nakao and Chris Bourg, Stanford University Libraries
View presentation

Although data has long been an important element of social science research and instruction, the nature of data needs within the social sciences has changed dramatically in recent years. A major trend is the dramatic increase in demand for numeric data by undergraduates who use and manipulate it in their own research. The number of courses that include data intensive assignments has also increased. As technology has developed, researchers are increasingly looking for efficient ways to share data electronically with local and distant colleagues. Finally, researchers and librarians alike are recognizing the need to create electronic archives of available data.

The Data Extraction Web Interface (DEWI) System is a suite of tools for the processing, preservation, and delivery of Stanford's social science numeric data collection. DEWI provides an integrated point of service for data users, by allowing users to browse lists of variables, search for variables of interest, and create custom sub-sets of data which can be downloaded to personal computers in a variety of formats compatible with popular statistical software. In addition, supporting documentation in the form of codebooks, external links, and locally developed guides to the data are available for most data sets.

DEWI can also be used to restrict access to datasets to selected users. This allows us to ingest and serve current data collected by faculty, so that research teams can use DEWI to access, control, and archive their data before releasing it for public use. This feature of DEWI encourages early, and therefore more accurate and complete, creation of metadata and documentation.

In this presentation, we will describe the development of DEWI, discuss how DEWI has been used within the Stanford research and teaching community, and discuss some of the directions--including broader collaborative efforts--we are exploring in the future development of DEWI. We will also discuss how DEWI represents one approach in the search for solutions to the kinds of challenges identified at the 1999 Digital Library Federation Workshop on Social Science Data Archives.

Building Collections with Greenstone Digital Library Software.

Tod A. Olson, Programmer/Analyst Digital Library Development Center, University of Chicago Library
View presentation
View handout
View Powerpoint

The University of Chicago Library recently launched a new digital collection of musical scores, Chopin Early Editions, using emerging standards for building digital collections from reformatted materials. Descriptive metadata were extracted from the OPAC, cross-walked to MODS, and combined with structural metadata to create METS objects. Using XSLT, METS records were converted into the Greenstone Archival Format and loaded into Greenstone Digital Library Software.
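The combination of descriptive and structural metadata described above can be sketched with Python's standard library. This is a minimal illustration, not the Library's actual workflow (which used XSLT and OPAC exports): the input record fields (`title`, `composer`) and the page labels are hypothetical, and the output is a skeletal METS document rather than a schema-validated one.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
MODS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mets", METS)
ET.register_namespace("mods", MODS)


def build_mets(record: dict, page_labels: list) -> ET.Element:
    """Wrap a MODS description and a page-level structure in one METS object."""
    mets = ET.Element(f"{{{METS}}}mets")

    # Descriptive metadata: a MODS record embedded in a METS dmdSec.
    dmd = ET.SubElement(mets, f"{{{METS}}}dmdSec", ID="DMD1")
    wrap = ET.SubElement(dmd, f"{{{METS}}}mdWrap", MDTYPE="MODS")
    xml_data = ET.SubElement(wrap, f"{{{METS}}}xmlData")
    mods = ET.SubElement(xml_data, f"{{{MODS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS}}}title").text = record["title"]
    name = ET.SubElement(mods, f"{{{MODS}}}name")
    ET.SubElement(name, f"{{{MODS}}}namePart").text = record["composer"]

    # Structural metadata: one div per page, in reading order.
    struct_map = ET.SubElement(mets, f"{{{METS}}}structMap", TYPE="physical")
    score = ET.SubElement(struct_map, f"{{{METS}}}div", TYPE="score", DMDID="DMD1")
    for order, label in enumerate(page_labels, start=1):
        ET.SubElement(
            score, f"{{{METS}}}div", TYPE="page", ORDER=str(order), LABEL=label
        )
    return mets


# Hypothetical sample record, for illustration only.
doc = build_mets({"title": "Sample score", "composer": "Sample, Composer"}, ["Cover", "1", "2"])
print(ET.tostring(doc, encoding="unicode"))
```

A document of this shape is what a stylesheet would then transform into a delivery system's own import format.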

Greenstone is a well-established, configurable, and customizable system for creating digital libraries. It accepts documents in a number of common formats and its own internal format. With no customization, Greenstone provides full text searching, metadata-based searching and browsing, support for custom metadata, and hierarchical or page-turning document navigation. Because of Greenstone's great flexibility, and with support from an active and knowledgeable user community, most of the desired interface features for this collection have been implemented.

We have added custom metadata, modified browse and search results displays, and implemented custom document displays and intra-document navigation features. Modifications were accomplished through editing configuration files, modifying display macros, and custom generation of the internal Greenstone document format. More recently, Greenstone includes a tool to aid the creation and administration of collections, including direct editing of document metadata.

9:00-10:30: BREAKOUT SESSION 6: METADATA & PRODUCTION. Alvarado D

Metadata Tradeoffs in High-Production Digitization Environments.

Nancy J. Hoebelheinrich. Metadata Coordinator, Stanford University Libraries / Academic Information Resources (SUL/AIR)
View presentation

A common myth exists, from the Library point of view, that the more metadata that can be captured for a digital object the better -- for purposes of resource identification, selection, rendering, and reconstruction. But how much metadata is really necessary and practicable to capture and/or create in high-production digitization environments for those purposes? What are economically feasible, realistic workflow scenarios that will scale to a faster pace and enable enough metadata to be captured and created to accomplish the goals of a digital repository? What are the tradeoffs between creating more-and-better metadata, and creating a greater number of high quality digital objects? In this presentation, metadata creation and capture will be described and evaluated for two different high production environments at Stanford University Libraries.

SUL/AIR has developed two high-production digitization channels in recent years:

· the onsite capture of the GATT Archive at the World Trade Organization

· the DL1 Laboratory on Stanford campus

For each of these channels, part of the challenge associated with the capture process has been to decide the kind, extent, and feasibility of creating descriptive, technical, and administrative metadata in a collaboration of collection, scanning, cataloging and metadata, IT, and preservation staff. For this presentation, the rationale behind the metadata models used will be discussed. Also discussed will be the roles played by metadata standards, and the tradeoffs made between sound data structure and information capture. Finally, an assessment will be offered of some areas of research and further testing that would prove useful to develop better answers to the questions raised.

The Union Catalog of Art Images (UCAI): Aggregating and Standardizing Diverse Legacy Metadata.

Esme Cowles (Database Developer) and Linda Barnhart (Project Coordinator), Union Catalog of Art Images
View presentation

The Union Catalog of Art Images (UCAI) project, funded by The Andrew W. Mellon Foundation and centered at UCSD, has built a prototype union catalog from three large and extremely diverse sets of legacy metadata for art images. UCAI is a research and development project that is building the technical infrastructure for a shared cataloging resource for the visual resources community to promote the copy cataloging of images. The platform for the prototype is the open source native XML database Xindice, with an open source front end search engine, Lucene. The aggregated dataset, at 675,000 records, may be one of the larger implementations of a native XML database in the digital library community, and includes thumbnail images for approximately 25% of the records.

The speakers will describe the problems uncovered and the software tools developed to map, standardize, and ingest the metadata to build the UCAI prototype. The challenges of clustering and merging redundant metadata will also be described. The speakers will also address the drawbacks and benefits of our experience with this native XML database, and the next steps for project development.

10:30-11:00: Break: Foyer to Alvarado Room.

11:00-12:30: BREAKOUT SESSION 7: PRESERVATION. Alvarado Room ABC.

Preservation-Worthy Digital Video; or, How to Drive your Library into Chapter 11

Jerry McDonough. New York University
View presentation

This session will provide an overview of what NYU has learned so far in trying to establish best practices with regards to archiving digital video, including a brief technical overview of relevant characteristics of digital video, a discussion of the abstract requirements for preservation-worthy digital video, and some discussion of costs for creating and maintaining a large scale digital video archive.

California Digital Library's Digital Preservation Program.

Patricia Cruse, Director, Digital Preservation Program, California Digital Library
View presentation

In partnership with the University of California libraries, the California Digital Library established a digital preservation program to focus on the persistent management of digital information. Since its inception last year, CDL's program has developed so that it leverages the infrastructure available to the California Digital Library. The program is active in three primary areas:

1) identifying methods to preserve and persistently manage e-journals

2) establishing a preservation repository for content created or managed by the UC libraries

3) evaluating methods for gathering and persistently managing web-based materials

Findings from CDL's web-archiving activities, which were funded by The Andrew W. Mellon Foundation, will be presented.

11:00-12:30: BREAKOUT SESSION 8: USER PERSPECTIVE AND ASSESSMENT. Alvarado Room D.

Digital Scholarship in the Academy: What Scholars Need.

Ann Lally, Head, Digital Initiatives, University of Washington Libraries
View presentation

In March, 2003, the University of Washington Libraries held a retreat to discuss digital scholarship – that is, academic work that was not possible prior to the development of digital technology.

This retreat, titled "From Vision to Transformation: New Models of Academic Support for Digital Scholarship", brought together sixty-two scholars, librarians, archivists, museum curators, academic leaders and technologists. Discussions with retreat participants centered around two questions:

· "What are scholars’ needs and wants regarding digital scholarship, collections, and technology?"

· "What strategies should the University of Washington and the UW Libraries take to advance such scholarship and learning?"

The retreat included presentations, facilitated small group discussions, and opportunities for social interactions. Overwhelmingly the retreat attendees pointed to the Library as the logical place to turn for assistance. A wealth of information was collected at this retreat; the issues raised (and the interesting omissions) and the models proposed by faculty will be discussed and will include potential implications for libraries.

The Evolution of an Interface from the User Perspective: From End User Testing to a Usage Log Analysis.

Sarah Chandler, Cornell University
View presentation

“Find Articles/Find Databases/Find e-Journals” is the new system the Cornell University Library (CUL) has implemented to replace the old “e-Reference Collection” system.

Find Articles/Find Databases/Find e-Journals is built using Endeavor’s ENCompass product, Oracle and XSL, whereas the e-Reference Collection was based on a MySQL database and Perl. E-Reference allowed for searching the metadata about both proprietary and unrestricted resources and then connecting directly to each database for searching in its native environment. The new system builds on the existing database search capability with “Find Databases,” adds federated searching at the article level with “Find Articles,” and provides title-level access to the 20,000 or so electronic journals available through the CUL with “Find e-Journals.” In addition, the system adds reference linking with “Find it at Cornell,” which utilizes Endeavor’s LinkFinder Plus.

CUL’s ENCompass project team began working on the new system in earnest in the summer of 2002 and brought the system up live in May of 2003. Beginning in fall, 2002, the team undertook substantial XSL and HTML customization of the web interface and integrated a new authentication system using EZProxy. Through the course of faculty and student focus groups, reference staff feedback, and, most recently, a usage log analysis, the ENCompass team has been able to measure effectiveness of the new system and consider or implement improvements. The presentation will focus on the evolution of the interface and implications the usage log study raises for maximizing system performance in future.

12:30-2:00: Lunch. Franciscan Ballroom.

2:00-3:30: PLENARY. Alvarado Room ABC

Introduction

David Seaman, DLF

A rose is a rose by any other name; what's a DODL?

Michael Keller, Stanford University
View presentation

3:30-4:00: Break: Foyer to Alvarado Room.

4:00-6:00: BIRDS OF A FEATHER SESSIONS

a) Elizabeth A. S. Beaudin (Yale): UNICODE: The Right Tools, but how to use them? Alvarado Room ABC.

The proposed presentation will review the technological decisions made at the outset of an open source electronic union list of Middle Eastern serials. For example, it might be argued that a Microsoft solution was and is available to address the aspects of the project's linguistic needs. Why not use it? Several factors informed the decision, among these: server configuration and experimentation, hardware costs for Middle Eastern partners, and international software accessibility, expertise and popularity. Further, how is Unicode being exploited in our system prototype? How has the tool been put to use to resolve search, retrieval, and display requirements? What challenges remain to solve, for example, when XML is introduced in later years of the project?

b) John Kunze (CDL): Persistent Identifiers: What's left to be done? Alvarado Room D.

The session aims to solicit ideas, needs, and visions that could inform an analysis of how to move library community discussion forward on the creation and maintenance of durable links. As a possible point of departure, attendees are encouraged to read a recent reformulation of the problem that can be found here (PDF, 9 pages):

http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf
(alternate URL: http://bibnum.bnf.fr/ecdl/2003/proceedings.php?f=kunze )

c) Jerry McDonough (NYU): METS Implementers. Potters Room.

An opportunity to discuss a range of METS issues.

d) Perry Willett (Indiana): The Text Encoding Initiative and the revision of the DLF "TEI in Libraries" Guidelines. Weavers Room.

An opportunity to discuss a range of TEI issues.

e) Mackenzie Smith (MIT): Global Digital Format Registry. Fireplace Room.

An opportunity to discuss a range of GDFR issues.

Wednesday 19 November

8:00-9:00: Continental Breakfast: Foyer to Alvarado Room.

9:00-10:30: BREAKOUT SESSION 9: PRESERVATION. Alvarado Room ABC.

The National Digital Information Infrastructure and Preservation Program.

The National Digital Information Infrastructure and Preservation Program (NDIIPP). Laura Campbell.
View presentation

NDIIPP Architectural Model for Federated Digital Preservation. Clay Shirky.
View presentation

9:00-10:30: BREAKOUT SESSION 10: RESOURCE MANAGEMENT. Alvarado D.

LibData: a library web management system.

Paul F. Bramscher, Shane A. Nackerud, and John T. Butler, University of Minnesota, Twin Cities
View presentation

The goal of publishing dynamically-generated library web pages through a system that integrates easy-to-use web authoring tools with a large database of information resources led the University of Minnesota Libraries to build LibData. Like many large research libraries, Minnesota provides access to thousands of resources, both licensed and freely available. Access to these resources is provided in numerous ways, including through its web site, subject pathfinders, and course-customized library web pages.

LibData's master database -- which contains records for resources, services, library locations, staff, and more -- was designed to allow for easier management of these resources and their rapid retrieval and incorporation into the variety of web presentations that librarians create for users. LibData currently offers three distinct page authoring tools useful to both novice and expert librarian users:

· Research QuickStart Subject Builder (for subject pathfinders),

· CourseLib (for course-related pages)

· PageScribe (for free form pages).

These tools are tightly integrated with the main database, making resource management easier to control, and assuring that library users receive predictable and up-to-date information. LibData also features a robust staff management system, user and page statistics, and complete customizability and extensibility. This presentation will discuss the LibData system, its underlying database architecture, the administration interface, and efforts to create an open source distribution of the system. Also discussed will be the extensibility of LibData including work to connect it with enterprise systems such as the campus portal and course management software.

NAND: A New Tool for an Old Problem.

Charles Blair, Elisabeth Long, and Keith Waclena, The Digital Library Development Center, The University of Chicago Library
View presentation

The Digital Library Development Center (DLDC) at the University of Chicago Library has developed a lightweight, versatile tool for searching and browsing collections of data, including but not limited to bibliographic data, via the World Wide Web. This tool, NAND: A Non-Relational Database, has allowed the DLDC both to meet the current needs of its user base and also to anticipate demand in some areas.

NAND has been used on a variety of projects ranging from an e-journals list to an integrated search interface for digital projects to management information for Unix computers. It is a generic tool that addresses a class of data-indexing and presentation needs while allowing customization by relatively non-technical staff to meet individual project requirements.

NAND is a portable single-file executable that combines a back-end indexing component and a CGI-based web interface for searching and browsing. A project can be completely set up with as few as three lines of configuration, but advanced users have recourse to a complete object-oriented programming language within the configuration file. The system supports several types of input data (including delimited fields, CSV, refer, HTML, and electronic mail). The CGI interface supports customizable HTML and record layout; multiple sorts; multi-page hierarchical browsing; and Boolean searches across any number and combination of fields with phrase searching, wildcards, and regular expressions.

10:30-11:00: Break: Foyer to Alvarado Room.

11:00-12:30: BREAKOUT SESSION 11: NATIONAL SCIENCE DIGITAL LIBRARY. Alvarado Room ABC.

NSDL Projects Update.

Martin Halbert, Director for Library Systems, Emory University

Two DLF institutions, Emory University and the CDL, have independently received NSDL grants for similar and potentially complementary work on interoperability with the NSDL.

"The OCKHAM Library Network." Martin Halbert, Emory University, with a draft of the OCKHAM reference model, as well as results from an early survey.
View presentation
"Adding Value to NSDL: A Business Proposition and Service Enhancement." Laine Farley, Director, Digital Library Services, California Digital Library
View presentation

11:00-12:30: BREAKOUT SESSION 12. Alvarado Room D.

Data Mining Library Collection Silos: An Opportunity for Cooperative Collection Management of Print and Electronic Books.

Lynn Silipigni Connaway. OCLC Online Computer Library Center, Office of Research.
View presentation

The OCLC Online Computer Library Center WorldCat database is used to identify print books (p-books) that have an electronic book (e-book) edition and the libraries that hold these materials. An analysis of the bibliographic characteristics of and the geographic holdings for these materials provides empirical data for library decision-making.

Libraries are installing compact shelving, moving lesser-used and older collections to remote storage locations and, increasingly, are digitizing their materials. With digital collections come new challenges, such as usage and cost comparisons of print and electronic resources, digitization and preservation processes, organization, retrieval systems, services, and collection management. By analyzing collection data across institutions and within collections, library decision-makers are able to make collection decisions based on empirical data. An aggregated database of library holdings is required for such an analysis.

This research draws on the OCLC Online Computer Library Center WorldCat database, containing more than 50 million records. WorldCat has not only served as an aggregator of bibliographic data for thirty years, but also identifies almost a billion holding locations for library resources. WorldCat can be used to describe collections bibliographically, as well as geographically. The researchers use WorldCat to identify paper books (p-books) that have an electronic book (e-book) edition. Holding patterns are analyzed by type of library, publisher, date, and subject areas (using the North American Title Count) for all p-books and e-books. A comparison of the characteristics of p-books and e-books documents the development and growth of the transition from the paper library to the digital library. The findings from this research will not only increase our understanding of the current e-book/p-book scenario, but could also be useful in seeking outside funding for a range of library operational issues, such as preservation and digitization of materials and cooperative and individual library collection development and management decisions.

From aggregation to commerce: the next phase for the RLG Cultural Materials Alliance.

Ricky Erway, Digital Resources Manager, RLG
View presentation

The presentation will begin with a brief review of where RLG and the institutions in the Cultural Materials Alliance are today in aggregating their digitized special collections to make them easily accessible for teaching and research. The majority of the presentation will focus on the next phase of the initiative, which has as its goals reaching new audiences, providing broader awareness of and access to the institutions' special collections, and testing the waters of commercial licensing. This will include the results of RLG's investigation of image stock-houses, the plans for open web access via internet search engines, and the results of the deliberations of the Alliance policy advisory group.

12:30-1:00: Closing remarks: David Seaman, DLF. Alvarado Room ABC.