FALL FORUM 2001
What happens when the "display" software we use to show digital collections online becomes so powerful that it begins to transform the online collection itself? Using Geographic Information Systems (GIS) software in this manner offers an interesting example of how digital collections representing cultural objects can be radically reformatted and opened up by software. This talk will explore the use of GIS in enabling analysis and interpretation of digital historical maps both in their object form and as digital collections. GIS offers an example of the use of software not only to search and index digital collections, but also to reformat (through georeferencing) digital historical maps to enable measurement, comparison, and analysis that were not possible with the original digital object. In addition, GIS gives us interesting methods of search visualization comparable to hyperbolic trees and other spatial schemes used to represent collection contents. Increasingly, the line between "born digital" GIS content (databases, vector maps) and scanned digital historical maps is being blurred, as all this information is combined and mixed through 3D, transparency, and data queries. This raises interesting questions about the integrity of the original digital image scan and its relationship to its many new manifestations in the GIS. The potential for rich combinations of this data will be enhanced by the growth of several online GIS repositories, already in formation. Practical implementation of these concepts will be shown through the speaker's experience in building an online collection of historical maps, www.davidrumsey.com, and moving it into the GIS sphere.
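Georeferencing of the kind described above generally works by fitting a transform from scanned-image pixel coordinates to geographic coordinates. A minimal sketch of the simplest case, a six-parameter affine transform of the sort stored in a world file (all parameter values below are hypothetical):

```python
def georeference(px, py, a, b, c, d, e, f):
    """Map a scanned-map pixel (px, py) to geographic coordinates
    (lon, lat) with a six-parameter affine transform: scale,
    rotation/shear, and translation, world-file style."""
    lon = a * px + b * py + c
    lat = d * px + e * py + f
    return lon, lat

# Hypothetical parameters: 0.001 degrees per pixel, no rotation,
# upper-left corner of the scan anchored at (-75.0, 43.0), with
# latitude decreasing as the pixel row number increases.
lon, lat = georeference(100, 200, 0.001, 0.0, -75.0, 0.0, -0.001, 43.0)
```

Real GIS packages fit these six parameters (or a higher-order polynomial) by least squares from user-supplied control points that link features on the scanned map to known ground locations.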
Dale Flecker, Associate Director for Planning and
Systems, Harvard University Library
Anne Kenney, Associate Director, Department of Preservation, Cornell University Library
John Mark Ockerbloom, Digital Library Architect and Planner, University of Pennsylvania Library.
Ann Okerson, Associate University Librarian, Yale University Library
Panel discussion with representatives of Mellon e-journal archiving projects. Panelists will reflect on some of the key challenges arising from their work, including: negotiations with publishers over the terms under which archival holdings can be accessed and used; models for long-term funding of archiving efforts; and developing trust in the integrity and reliability of the archival repository.
Kris Brancolini, Director, Digital
Library Program, Indiana University
Rebecca Graham, Head, Library Computing Services and Director, Digital Library Program, Johns Hopkins University
Elizabeth Shaw, Visiting Lecturer, Department of Library and Information Sciences, University of Pittsburgh.
Recruitment, training, and retention of good staff are real and serious impediments to the digital library's development. The problems are only partly financial. Career paths that intersect with digital library programs are not at all well established; nor are the professional development and reward structures. Schools of library and information science may contribute substantially to solutions but may themselves be constrained, for example, by their curricula, the abilities of their incoming students, even by their academic staff. From their very different perspectives, the three speakers in this session will initiate discussion by attempting to:
Forum participants who attend the session will be encouraged to contribute their own thinking and experience and also to help think about whether, how, and to what extent the area may be an appropriate one in which to move with a DLF initiative.
Ed Pentz, CrossRef
CrossRef is a non-profit organization set up by scholarly publishers in January 2000. Based on collaboration and standards, the CrossRef system enables the automated, large-scale assignment of DOIs to scholarly journal articles, conference proceedings, and books, and the efficient creation of reference links using DOIs. Widespread reference linking is now taking place between CrossRef members and affiliates. With the CrossRef/DOI infrastructure in place, more sophisticated linking models are being developed.
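The linking mechanism rests on pairing a persistent identifier with a resolver: a citation carries a DOI, and the resolver redirects to the publisher's current location for the article. A minimal sketch of forming such a link (the DOI shown is hypothetical; dx.doi.org was the public DOI proxy of the period):

```python
from urllib.parse import quote

DOI_RESOLVER = "http://dx.doi.org/"  # the Handle-based DOI proxy

def doi_link(doi):
    """Turn a DOI such as '10.1000/xyz123' into a resolvable link.
    The suffix may contain characters that need percent-encoding."""
    prefix, _, suffix = doi.partition("/")
    return DOI_RESOLVER + prefix + "/" + quote(suffix, safe="")

link = doi_link("10.1000/xyz123")  # hypothetical DOI
```

Because publishers deposit the identifier, not the URL, with CrossRef, links keep working when content moves: only the resolver's lookup table has to change.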
Eric Van de Velde, California Institute of Technology
A URL takes requesters from a citation to a destination... provided, of course, the URL is still valid. The OpenURL standard, which we are developing under the auspices of NISO, allows the development of high-quality links that feature additional properties, such as:
A few commercial services can provide some of this functionality right now. To encourage the growth of more and better extended-linking services, NISO has put the standardization of OpenURL on the fast track. We are developing a standard that will serve the scholarly-information community immediately and other communities in the long term. The current chaotic web is wonderful in its way. However, within this web infrastructure, we believe there is a need for a high-quality web of vetted information. To bring this to the scholarly-information society as soon as possible, we need the OpenURL standard as a key enabling technology. I will outline the major issues of OpenURL standardization, and I will show how you can get involved.
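In outline, an OpenURL is simply citation metadata carried in the query string of a link aimed at the user's own resolver, which then chooses an appropriate copy. A minimal sketch using OpenURL 0.1-style keys (the resolver address and the citation values are hypothetical):

```python
from urllib.parse import urlencode

def make_openurl(resolver_base, **citation):
    """Build an OpenURL: the citation travels as key-value pairs in
    the query string; the institutional resolver picks the target."""
    return resolver_base + "?" + urlencode(citation)

url = make_openurl(
    "http://resolver.example.edu/menu",  # hypothetical local resolver
    genre="article",
    issn="1234-5679",
    volume="12",
    issue="3",
    spage="101",
)
```

The same citation sent to two institutions' resolvers can yield different destinations, which is exactly the extended-linking behavior a fixed URL cannot provide.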
Dale Flecker, Associate Director for Planning and Systems, Harvard University Library
This paper will discuss results of an experiment involving CrossRef, the DOI, and others to localize reference linking.
David Ruddy, Electronic Publications Specialist, Cornell University Library
With development support from The Andrew W. Mellon Foundation, Project Euclid has as its goal the advancement of affordable scholarly communications in mathematics and statistics. A consortial effort among Cornell University Library, Duke University Press, and several development partners who publish math and statistics literature, Euclid will provide independent publishers with a mechanism for the discovery, display, and distribution of their electronic content. By doing so, we hope to challenge emerging patterns of consolidation in the ownership and dissemination of scholarship, at the same time recognizing and encouraging the value added by learned societies and other publishers of scholarly materials. Project Euclid represents an expanded and more active role for academic libraries in the arena of scholarly communication. This talk will give an overview of the project--its goals, scope, challenges, technical design, and current status. Project Euclid URL: http://projecteuclid.org.
Catherine Candee, Director of Scholarly Communication Initiatives, California Digital Library
eScholarship--a California Digital Library (CDL) initiative to support innovation in scholarly publishing--applies technical and organizational support to faculty-led initiatives to change scholarly communication. In just two years developments in digital technologies and Web publishing, coupled with greater acceptance of non-traditional publishing forms, have given rise to ever more sophisticated scholar-led projects and a corresponding array of unsettled questions concerning copyright, peer-review, and permanence of the scholarly record. Currently, eScholarship is exploring production level dissemination and publication, experimenting with new business models, and enjoying collaborative partnerships with the University of California Press, bepress.com, and several scholarly societies.
Maria Bonn, Head, Scholarly Publishing Office, University of Michigan Library
The University of Michigan Library's Scholarly Publishing Office (SPO) has tools and methods for the electronic publication and distribution of scholarly content. The office supports the traditional constructs of journal and monographic publication in an online environment, as well as publishing scholarly work expressly designed for electronic delivery. SPO exists to develop services that are responsive to the needs of both producers and users, to foster a better economic model for campus publishing, to support local control of intellectual assets, and to create highly functional scholarly resources. SPO is particularly concerned, as this talk will reflect, with building sustainable models and methods that can be shared with other institutions and that bridge the gap between academic self-publishing and large, aggregated, commercial publishing.
Meg Bellinger, President, Preservation Resources,
Robin Dale, Program Officer, Research Libraries Group
This panel discussion will present work resulting from OCLC's and RLG's joint exploration of digital archiving. Beginning with an overview of standardization and best-practice efforts for digital repositories, the session will focus on the development of attributes information and preservation metadata in the digital archives environment. Looking forward, the speakers will discuss certification issues, sample metadata implementations (including embracing other emerging standards such as METS), and the applicability of attributes information to existing digital libraries and emerging digital repositories.
Bernard Reilly, Director, Center for Research Libraries
Changing conditions in the non-profit cultural sector have created a new baseline of expectations for publicly supported institutions. Previously the "contract" between the community and its libraries and museums outlined the basic expectations for those institutions. This contract, albeit often an implied one, derived from statutory obligations of non-profit organizations, the inherent conditions of public funding, and the traditions of philanthropy in the U.S. The contract entailed a stewardship role that involved: long-term preservation and security of cultural materials; continuous and unrestricted availability of those materials to a general audience or to the supporting community; and a general accountability of the organizations to their supporting community.
This stewardship role may not convey to those libraries' and museums' activities in the global, digital environment. New models for service and product delivery, compensation, and funding are creating different, and far higher, audience expectations for service and access. Information and content must be more than just secure and available: it must be delivered to the desktop and presented together with user tools and ancillary resources. Users expect to obtain more than catalog records and other such limited kinds of information from museum and library web sites. They expect real-time responsiveness and information on demand. This involves a tacit "upping of the ante" for libraries and museums, beyond the demands for responsible stewardship. To fulfill such expectations, works of art and collections in themselves offer relatively little help. They are "inert" assets. The content embodied in those collections, i.e., the visual and textual data contained in them, and even the expertise of curatorial and research staff, are the more "liquid" and hence valuable assets. These can be re-aggregated, electronically delivered, licensed, and brokered by the organizations for revenue and other benefits. The high costs of operating in this environment, moreover, are driving organizations toward financially self-sustaining activities and alliances, which in turn reinforce the drift away from stewardship. My presentation will explore the costs of such expectations, and examine some of the larger factors that have driven such changes. I will also suggest terms that might be useful for a new "contract" between cultural heritage organizations and the public.
Barbara Taranto, Digital Library Program, New York Public Library
As we move deeper into the 21st century, it would be best to anticipate the issues of audience and move to understand, strategize and plan for future use of digital materials. Should we do it, can we do it and how can it be done?
In the world of books, one does not often hear of a novel that was horrid, but pages 195 through 200 were brilliant. In the theater occasionally one may hear that the playwright should reconsider his or her choice of career, even though the performance by certain actors was inspired. Not so with film. It is an oddity of the cinematic arts that on the whole a film may be maudlin, tedious, obtuse, cloying, overacted, under-acted or simply bad, and still be memorable, not alone for its shortcomings, but for a single moment when the presentation transcends the story and becomes something altogether different.
So it was when Henry Thomas and Drew Barrymore stalwartly supported their extraterrestrial guest as it sent an electronic mayday out into the cosmos. For the children in the audience there was nothing incredible about ET sending out a distress signal and expecting a response, even if they didn't entirely believe in a little squished chocolate-marshmallow man with superior intelligence. Those children wholeheartedly believed, and continue to believe, in the audience "out there".
As libraries consolidate and continue to increase their special collections, the issues of short-term and long-term audience are becoming heightened. At one time libraries, both circulating and not, and especially research libraries, were reasonably certain about the topography of their audiences. In fact, in many cases it was the prescribed audience, and not collection development per se, that shaped the identity and role of the institution. But the practices of public service that are so familiar within the non-research community are fast becoming the concerns of the specialized libraries.
As we move deeper into the 21st century, it would be best to anticipate the issues of audience and move to understand, strategize and plan for future use of digital materials. Not so much because the technology is changing rapidly, but because the audience is evolving more quickly than libraries may be prepared to accommodate.
Who is the new "us"? And who will that be in five years or ten years? Do we have an obligation to meet the expectations of the new audience given that it is changing so rapidly? Can we afford not to?
Nancy Allen, Dean and Director, Penrose Library, University of Denver
The Colorado Digitization Project has had three years of experience in collaborative approaches to digitization of primary resource material from scientific and heritage organizations of many types. The CDP provided infrastructure (web-based project management tool kit, training, regional scanning centers, metadata creation tools, a finding system, small incentive grants) to encourage all types of cultural heritage institutions to work together. Libraries of many types, museums and historical societies large and small, and archives with various emphases have all partnered to contribute digital resources through a coordinated statewide effort. This presentation will cover many of the cross-organizational issues involved in such collaboration, including museum and library values and culture, models for decision making and collaborative organization, development of standards and best practices that work for all collaborators, and interoperability issues and solutions.
Professor Arms will speak about the vision for a National Science Digital Library for Science, Math, Engineering, and Technical Education, and how that vision is being realized through a major program of the National Science Foundation.
Denise Troll, Associate University Librarian, Carnegie Mellon University
Denise Troll will present the results of an extensive survey of methods used by leading digital libraries to measure the use and usability of online collections and services. The study offers a qualitative, rather than quantitative, look at experiences conducting digital library assessments. The results are not comprehensive or representative of library efforts, but indicative of trends in library practice. The trends identify popular research methods and common problems encountered when using these methods to assess and enhance access to and usability of online collections and services. The most popular research methods are survey questionnaires, focus groups, user protocols, and transaction log analysis. The common problems encountered include difficulty recruiting representative research subjects; difficulty selecting and using the appropriate research method; and difficulty analyzing, interpreting, presenting, and applying the data gathered to strategic planning and decision-making. The survey results indicate that the internal organization of libraries and the skills, preferences, and assumptions of librarians can be the biggest impediments to conducting successful assessments and implementing the findings. The presentation at the DLF Forum will focus on significant, common issues and concerns encountered when conducting assessments using popular research methods, and conclude with suggestions for future research or ways to address these concerns.
Rue Ramirez, Digital Library Services, University of Texas at Austin
Two years ago the University of Texas at Austin received a grant from the Institute of Museum and Library Services (IMLS) to create a website that would provide evaluative techniques and tools to managers of museum and library websites. This presentation will provide a project summary and an overview of the techniques and tools we will be offering on the site.
Thornton Staples, University of Virginia
Carl Lagoze, Cornell University
The presentation will introduce a session devoted to review of different approaches to the development of comprehensive digital library architectures. Lagoze and Staples will discuss their practical work implementing the Fedora repository architecture.
Lorcan Dempsey, Vice President, Research, OCLC
Interoperability is much advocated, less often achieved. As our services are increasingly designed as elements in a distributed environment, so we increasingly recognise the need to architect this environment. This presentation describes a set of activities which have resulted in one definition of such an environment (http://www.dner.ac.uk/dner/). The potential advantages of such an approach are that it gives a framework for discussing interoperability, that it provides a common frame of reference for discussing product and service offerings, that it allows dependencies and roles to be identified, and that it supports a development agenda. Using this framework as a basis, the presentation will discuss interoperability issues and suggest some likely service directions.
Daniel Greenstein, Director, DLF
James M. Jackson Sanborn, Data Services Librarian, and Charley Pennell, Head, Cataloging Department
This presentation will focus on the planned usage of Blue Angel Technology's MetaStar product by the North Carolina State University Libraries. Library users are faced with an often bewildering array of research resources, including the catalog, electronic indexes and databases, spatial and numeric data collections, websites, Special Collections resources, etc. In attempting to provide a simpler yet more useful research gateway, NC State University Libraries have identified the goal of facilitating user-defined cross collection searching. There are a number of possible approaches to accomplish this. One solution that NC State Libraries is investigating centers on the use of MetaStar. MetaStar is a metadata harvesting, management, indexing, and searching tool. Through a modular architecture that uses XML as a connection tool, MetaStar provides the ability to harvest and index metadata from webpages, manage new and existing collections of metadata, and search multiple local and remote collections concurrently. Still in the early research and prototyping stages, we have been designing collections of intranet metadata, spatial and numeric data search tools, and subject specific gateway search interfaces.
Marty Kurth, Head of Cataloguing, Cornell University
A team at Cornell University Library is engaged in an effort to provide access to multiple digital collections as well as its entire online catalog through a single searching-and-browsing interface. By using the ENCompass digital library management system, the team is organizing a domain of digital objects that range in size from the full text of a single pamphlet page to a database of over 100,000 electronic books in order to map those objects into the management system's three-level record structure of collections, containers, and objects. Striving to provide users with intelligible result sets has presented many challenges in three notable areas. First, representing a digital object's relative size within the system as a whole can be done via a collection-specific object record when the objects in a collection are of similar granularity, but objects in variably granular collections resist this approach. Second, presenting navigational paths to users to minimize disorientation by such means as clearly identifying the collections searched and foregrounding the hierarchical nature of the system's record structure inevitably runs into inflexibilities in system architecture. Finally, refining the interface vocabulary to describe systematically the record structure and the relationships between the records in the structure confronts the lack of commonly accepted user terms for digital objects and the relationships among them. The presentation will describe the team's efforts to address these challenges thus far. The Cornell University Library ENCompass Team includes Karen Calhoun (Team Leader), Meryl Brodsky, George Kozak, Marty Kurth, Fred Muratori, David Ruddy, Tom Turner, and Sarah Young.
Elaine Westbrooks, Metadata Librarian, Cornell
Adam Chandler, CTS Information Technology Librarian, Cornell University Library
Vivek Uppal, graduate student, Cornell University
The Cornell University Geospatial Information Repository (CUGIR) is a clearinghouse that provides unrestricted access to geospatial data and metadata, with special emphasis on those natural features relevant to agriculture, ecology, natural resources and human-environment interactions in New York State. CUGIR, like all nodes within the National Spatial Data Infrastructure (NSDI), is available for searching by way of the Z39.50 information retrieval protocol. Implementation complexity, retrieval accuracy, scalability, and performance issues make the Z39.50 architecture problematic within this context. Our presentation will describe a system at the Cornell University Library that bridges the gap between today's NSDI, traditional MARC-based library catalogs and utilities, and the Open Archives Initiative (OAI). The glue binding these different digital objects together is our implementation of Michael Nelson's pioneering work with "buckets." This session will provide an overview of the steps, challenges, and problems involved with mapping and managing complex metadata surrogates across standards and systems. Finally, we will speculate on how the OAI harvesting protocol may be seen as an alternative to the current Z39.50 NSDI system.
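Part of the OAI protocol's appeal over Z39.50 in this setting is its simplicity: harvesting is a handful of stateless HTTP GET requests with XML responses. A minimal sketch of composing such a request (the provider endpoint and set name are hypothetical):

```python
from urllib.parse import urlencode

def oai_request(base_url, verb, **args):
    """Compose an OAI-PMH request URL. The protocol defines six verbs;
    ListRecords with a metadataPrefix retrieves batches of records."""
    return base_url + "?" + urlencode({"verb": verb, **args})

url = oai_request(
    "http://cugir.example.edu/oai",  # hypothetical provider endpoint
    "ListRecords",
    metadataPrefix="oai_dc",
    set="geodata",                   # hypothetical set name
)
```

A harvester issuing this request would page through results via the resumptionToken returned in each XML response and then index the harvested records locally, sidestepping Z39.50's live distributed search.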
Jerome McDonough, New York University
Morgan Cundiff, Library of Congress
MacKenzie Smith, Harvard University;
Rick Beaubien, University of California at Berkeley
METS is a generalized metadata framework developed to encode the structural metadata for objects within a digital library, together with related descriptive and administrative metadata. METS provides for the responsible management and transfer of digital library objects by bundling and storing appropriate metadata along with the digital objects. METS is expressed in XML, which means that METS data is stored according to platform- and software-independent encoding standards, such as UTF-8 (Unicode), ISO-8859-1, etc.
One important application of METS may be as an implementation of the Open Archival Information System (OAIS) reference model: a METS document can function as a Submission Information Package (SIP) for use as a transfer syntax; a Dissemination Information Package (DIP) for display or other applications; and an Archival Information Package (AIP) for storing and managing information internally.
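A skeletal METS document can illustrate the bundling described above: descriptive metadata, a file inventory, and a structural map travel together in one XML package. This sketch builds only the bare outline (the identifiers are hypothetical, and real METS documents carry much more inside each section):

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

def q(tag):
    """Qualify a tag name with the METS namespace."""
    return "{%s}%s" % (METS_NS, tag)

mets = ET.Element(q("mets"))
ET.SubElement(mets, q("dmdSec"), {"ID": "DMD1"})     # descriptive metadata
file_sec = ET.SubElement(mets, q("fileSec"))         # inventory of content files
file_grp = ET.SubElement(file_sec, q("fileGrp"))
ET.SubElement(file_grp, q("file"), {"ID": "FILE1"})
struct_map = ET.SubElement(mets, q("structMap"))     # structural map
div = ET.SubElement(struct_map, q("div"), {"TYPE": "page"})
ET.SubElement(div, q("fptr"), {"FILEID": "FILE1"})   # ties structure to a file

xml = ET.tostring(mets, encoding="unicode")
```

Because everything needed to interpret the object rides inside the one document, the same package can serve as an OAIS SIP on ingest, an AIP in storage, or a DIP on delivery.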
This panel will give the background and an overview of the schema, clarifying the distinction between a metadata framework (such as METS) and metadata wrapper schemes (such as RDF). Panelists from UC Berkeley, the Library of Congress, and Harvard will review their current and planned usage of METS, and will give examples of tools in development for producing and using METS objects. There will be plenty of time for questions and discussion.
Anne Kenney, Director of Programs, Council for Library and Information Resources
Libraries and others are digitizing increasing quantities of printed material for online access without agreement on any desirable level of imaging quality. The DLF is working to identify, and build support for, specifications acceptable as the minimum necessary for digitally reproducing printed books and serial publications with fidelity. Adoption of such benchmarks would help users and libraries both. Users could have more confidence in the fidelity of digital reproductions available to them. And libraries could produce and maintain reproductions with confidence that expensive re-digitization would not become necessary. Digital reproductions meeting at least the benchmarks' minimum specifications would remain viable even as reproduction techniques improved.
Also, because such texts would have well-known, consistent properties, they could support a wide variety of uses (including uses not possible with printed texts). Additionally, agreement on minimum benchmarks for digital reproductions of printed publications is an essential first step for libraries that wish to investigate whether they could manage and preserve print materials more effectively if they relied more heavily on digital reproductions for access. The draft benchmark is currently being reviewed by DLF member libraries who are being asked to endorse it. The guidelines have been shared broadly with the DLF membership, and comments are due back soon. Anne R. Kenney, will describe the work to date, its status, and next steps. (For details, see http://www.diglib.org/standards/draftbmark.htm.)
Steve Chapman, Preservation Librarian for Digital Initiatives, Weissman Preservation Center, Harvard University
This paper will report on the outcomes of a DLF-sponsored meeting of 20 expert imaging practitioners to address the question, "How can we assess the quality of digital images without ambiguity?" Given that libraries and museums have been and will likely continue to use different specifications for image scanning, it is important to consider whether these collections will be interoperable, particularly when federated in initiatives such as ArtSTOR. The forum gave practitioners the opportunity to exchange ideas about what is "good" quality in images and imaging systems, then to prioritize needed tools, applications, and training to meet institutional and collective goals to make digital reproductions of consistent quality and persistent utility.
Robin Dale, Member Programs and Initiatives, Research Libraries Group
The speaker will act as respondent to the two papers presented in this session and place the initiatives they describe in their national and international context.
The University of Illinois has been funded by the Andrew W. Mellon Foundation to research and implement resource discovery services based on the Open Archives Initiative (OAI) Protocol for Metadata Harvesting (PMH). One of seven OAI Metadata Harvesting projects funded by Mellon, the Illinois Metadata Harvesting Service will focus on indexing item-level metadata describing cultural heritage holdings of libraries and museums with a special emphasis on the harvest of Encoded Archival Description (EAD) finding aids and metadata describing cultural heritage content held by CIC libraries and allied archives and museums. A prototype portal allowing search and retrieval of harvested metadata will be developed and tested with end users. A primary objective of the Illinois project is to investigate and document the potential of OAI-based services to reveal and make more accessible "hidden" online scholarly information resources in the cultural heritage domain. We will be investigating a wide-range of issues including those related to search interface design, value-added indexing functions, presentation of search results, transformations between heterogeneous metadata schemas and subject taxonomies, portal usage patterns, and the influence on search success of adherence to metadata creation "best practices." This presentation will describe preliminary system architecture, current project plans, and work done to date. A preliminary harvest of more than 150,000 metadata records from a half dozen OAI Metadata Providers has been completed and will be discussed. Open Source tools and sample implementations for both OAI Provider and OAI Harvesting services have been developed and also will be discussed during this presentation.
Chuck Thomas, Digital Projects Librarian, University of Minnesota Libraries
Since project inception in April 2000, the IMAGES (Image Metadata Aggregation for Enhanced Searching) initiative has been a major component of the University of Minnesota Libraries digital library effort. IMAGES is both a delivery platform for all digital imaging projects within the University Libraries, and a metadata aggregator for the entire campus. The IMAGES model is offered as a strategy for establishing libraries as THE place to find digital content on campuses. The public interface to IMAGES just went live in July 2001. The presentation on IMAGES will cover the main concepts behind the project, including a few unique features not yet found in other sites; lessons and decisions of the design process; selling the concept to a campus community; the intended role of IMAGES in a national context; initial reception by both internal and external audiences; and future plans for growth.
Speaker to be advised
Donald J. Waters, Program Officer, Scholarly Communications, The Andrew W. Mellon Foundation
The Andrew W. Mellon Foundation has supported a wide variety of digital library projects over the last seven years. The speaker will highlight some of the lessons learned in these projects, identify continuing and emerging needs for connecting scholarly resources to ongoing teaching and research, and explore with the audience how digital libraries might best be shaped to meet these needs.
Daniel Greenstein, Director, DLF