FALL FORUM 2001
What happens when the "display" software we use to show digital collections online becomes so powerful that it begins to transform the online collection itself? Using Geographic Information Systems (GIS) software in this manner offers an interesting example of how digital collections representing cultural objects can be radically reformatted and opened up by software. This talk will explore the use of GIS in enabling analysis and interpretation of digital historical maps both in their object form and as digital collections. GIS offers an example of the use of software not only to search and index digital collections, but also to reformat (through georeferencing) digital historical maps to enable measurement, comparison, and analysis that were not possible with the original digital object. In addition, GIS gives us interesting methods of search visualization comparable to hyperbolic trees and other spatial schemes used to represent collection contents. Increasingly, the line between "born digital" GIS content (databases, vector maps) and scanned digital historical maps is being blurred, as all this information is combined and mixed through 3D, transparency, and data queries. This raises interesting questions about the integrity of the original digital image scan and its relationship to its many new manifestations in the GIS. The potential for rich combinations of this data will be enhanced by the growth of several online GIS repositories, already in formation. Practical implementation of these concepts will be shown through the speaker's experience in building an online collection of historical maps, www.davidrumsey.com, and moving it into the GIS sphere.
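Georeferencing of the kind described above generally works by fitting a transform from scanned-image pixel coordinates to geographic coordinates. A minimal sketch of the simplest case, a six-parameter affine transform of the sort stored in a world file (all parameter values below are hypothetical):

```python
def georeference(px, py, a, b, c, d, e, f):
    """Map a scanned-map pixel (px, py) to geographic coordinates
    (lon, lat) with a six-parameter affine transform: scale,
    rotation/shear, and translation, world-file style."""
    lon = a * px + b * py + c
    lat = d * px + e * py + f
    return lon, lat

# Hypothetical parameters: 0.001 degrees per pixel, no rotation,
# upper-left corner of the scan anchored at (-75.0, 43.0), with
# latitude decreasing as the pixel row number increases.
lon, lat = georeference(100, 200, 0.001, 0.0, -75.0, 0.0, -0.001, 43.0)
```

Real GIS packages fit these six parameters (or a higher-order polynomial) by least squares from user-supplied control points that link features on the scanned map to known ground locations.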
Dale Flecker, Associate Director for Planning and
Systems, Harvard University Library
Anne Kenney, Associate Director, Department of Preservation, Cornell University Library
John Mark Ockerbloom, Digital Library Architect and Planner, University of Pennsylvania Library.
Ann Okerson, Associate University Librarian, Yale University Library
Panel discussion with representatives of Mellon e-journal archiving projects. Panelists will reflect on some of the key challenges arising from their work, including: negotiations with publishers over the terms under which archival holdings can be accessed and used; models for long-term funding of archiving efforts; and developing trust in the integrity and reliability of the archival repository.
Kris Brancolini, Director, Digital
Library Program, Indiana University
Rebecca Graham, Head, Library Computing Services and Director, Digital Library Program, Johns Hopkins University
Elizabeth Shaw, Visiting Lecturer, Department of Library and Information Sciences, University of Pittsburgh.
Recruitment, training, and retention of good staff are real and serious impediments to the digital library's development. The problems are only partly financial. Career paths that intersect with digital library programs are not at all well established; nor are the professional development and reward structures. Schools of library and information science may contribute substantially to solutions but may themselves be constrained, for example, by their curricula, the abilities of their incoming students, even by their academic staff. From their very different perspectives, the three speakers in this session will initiate discussion by attempting to:
Forum participants who attend the session will be encouraged to contribute their own thinking and experience and also to help think about whether, how, and to what extent the area may be an appropriate one in which to move with a DLF initiative.
Ed Pentz, CrossRef
CrossRef is a non-profit organization set up by scholarly publishers in January 2000. Based on collaboration and standards, the CrossRef system enables the automated, large-scale assignment of DOIs to scholarly journal articles, conference proceedings, and books, and the efficient creation of reference links using DOIs. Widespread reference linking is now taking place between CrossRef members and affiliates. With the CrossRef/DOI infrastructure in place, more sophisticated linking models are being developed.
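The linking mechanism rests on pairing a persistent identifier with a resolver: a citation carries a DOI, and the resolver redirects to the publisher's current location for the article. A minimal sketch of forming such a link (the DOI shown is hypothetical; dx.doi.org was the public DOI proxy of the period):

```python
from urllib.parse import quote

DOI_RESOLVER = "http://dx.doi.org/"  # the Handle-based DOI proxy

def doi_link(doi):
    """Turn a DOI such as '10.1000/xyz123' into a resolvable link.
    The suffix may contain characters that need percent-encoding."""
    prefix, _, suffix = doi.partition("/")
    return DOI_RESOLVER + prefix + "/" + quote(suffix, safe="")

link = doi_link("10.1000/xyz123")  # hypothetical DOI
```

Because publishers deposit the identifier, not the URL, with CrossRef, links keep working when content moves: only the resolver's lookup table has to change.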
Eric Van de Velde, California Institute of Technology
A URL takes requesters from a citation to a destination... provided, of course, the URL is still valid. The OpenURL standard, which we are developing under the auspices of NISO, allows the development of high-quality links that feature additional properties, such as:
A few commercial services can provide some of this functionality right now. To encourage the growth of more and better extended-linking services, NISO has put the standardization of OpenURL on the fast track. We are developing a standard that will serve the scholarly-information community immediately and other communities in the long term. The current chaotic web is wonderful in its way. However, within this web infrastructure, we believe there is a need for a high-quality web of vetted information. To bring this to the scholarly-information society as soon as possible, we need the OpenURL standard as a key enabling technology. I will outline the major issues of OpenURL standardization, and I will show how you can get involved.
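In outline, an OpenURL is simply citation metadata carried in the query string of a link aimed at the user's own resolver, which then chooses an appropriate copy. A minimal sketch using OpenURL 0.1-style keys (the resolver address and the citation values are hypothetical):

```python
from urllib.parse import urlencode

def make_openurl(resolver_base, **citation):
    """Build an OpenURL: the citation travels as key-value pairs in
    the query string; the institutional resolver picks the target."""
    return resolver_base + "?" + urlencode(citation)

url = make_openurl(
    "http://resolver.example.edu/menu",  # hypothetical local resolver
    genre="article",
    issn="1234-5679",
    volume="12",
    issue="3",
    spage="101",
)
```

The same citation sent to two institutions' resolvers can yield different destinations, which is exactly the extended-linking behavior a fixed URL cannot provide.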
Dale Flecker, Associate Director for Planning and Systems, Harvard University Library
This paper will discuss results of an experiment involving CrossRef, the DOI, and others to localize reference linking.
David Ruddy, Electronic Publications Specialist, Cornell University Library
With development support from The Andrew W. Mellon Foundation, Project Euclid has as its goal the advancement of affordable scholarly communications in mathematics and statistics. A consortial effort among Cornell University Library, Duke University Press, and several development partners who publish math and statistics literature, Euclid will provide independent publishers with a mechanism for the discovery, display, and distribution of their electronic content. By doing so, we hope to challenge emerging patterns of consolidation in the ownership and dissemination of scholarship, at the same time recognizing and encouraging the value added by learned societies and other publishers of scholarly materials. Project Euclid represents an expanded and more active role for academic libraries in the arena of scholarly communication. This talk will give an overview of the project--its goals, scope, challenges, technical design, and current status. Project Euclid URL: http://projecteuclid.org.
Catherine Candee, Director of Scholarly Communication Initiatives, California Digital Library
eScholarship--a California Digital Library (CDL) initiative to support innovation in scholarly publishing--applies technical and organizational support to faculty-led initiatives to change scholarly communication. In just two years developments in digital technologies and Web publishing, coupled with greater acceptance of non-traditional publishing forms, have given rise to ever more sophisticated scholar-led projects and a corresponding array of unsettled questions concerning copyright, peer-review, and permanence of the scholarly record. Currently, eScholarship is exploring production level dissemination and publication, experimenting with new business models, and enjoying collaborative partnerships with the University of California Press, bepress.com, and several scholarly societies.
Maria Bonn, Head, Scholarly Publishing Office, University of Michigan Library
The University of Michigan Library's Scholarly Publishing Office (SPO) has tools and methods for the electronic publication and distribution of scholarly content. The office supports the traditional constructs of journal and monographic publication in an online environment, as well as publishing scholarly work expressly designed for electronic delivery. SPO exists to develop services that are responsive to the needs of both producers and users, to foster a better economic model for campus publishing, to support local control of intellectual assets, and to create highly functional scholarly resources. SPO is particularly concerned, as this talk will reflect, with building sustainable models and methods that can be shared with other institutions and that bridge the gap between academic self-publishing and large, aggregated, commercial publishing.
Meg Bellinger, President, Preservation Resources,
Robin Dale, Program Officer, Research Libraries Group
This panel discussion will present work resulting from OCLC's and RLG's joint exploration of digital archiving. Beginning with an overview of standardization and best-practice efforts for digital repositories, the session will focus on the development of attributes information and preservation metadata in the digital archives environment. Looking forward, the speakers will discuss certification issues, sample metadata implementations (including embracing other emerging standards such as METS), and the applicability of attributes information to existing digital libraries and emerging digital repositories.
Bernard Reilly, Director, Center for Research Libraries
Changing conditions in the non-profit cultural sector have created a new baseline of expectations for publicly supported institutions. Previously the "contract" between the community and its libraries and museums outlined the basic expectations for those institutions. This contract, albeit often an implied one, derived from statutory obligations of non-profit organizations, the inherent conditions of public funding, and the traditions of philanthropy in the U.S. The contract entailed a stewardship role that involved: long-term preservation and security of cultural materials; continuous and unrestricted availability of those materials to a general audience or to the supporting community; and a general accountability of the organizations to their supporting community.
This stewardship role may not convey to those libraries' and museums' activities in the global, digital environment. New models for service and product delivery, compensation, and funding are creating different, and far higher, audience expectations for service and access. Information and content must be more than just secure and available: it must be delivered to the desktop and presented together with user tools and ancillary resources. Users expect to obtain more than catalog records and other such limited kinds of information from museum and library web sites. They expect real-time responsiveness and information on demand. This involves a tacit "upping of the ante" for libraries and museums, beyond the demands for responsible stewardship. To fulfill such expectations, works of art and collections in themselves offer relatively little help. They are "inert" assets. The content embodied in those collections, i.e., the visual and textual data contained in them, and even the expertise of curatorial and research staff, are the more "liquid" and hence valuable assets. These can be re-aggregated, electronically delivered, licensed, and brokered by the organizations for revenue and other benefits. The high costs of operating in this environment, moreover, are driving organizations toward financially self-sustaining activities and alliances, which in turn reinforce the drift away from stewardship. My presentation will explore the costs of such expectations, and examine some of the larger factors that have driven such changes. I will also suggest terms that might be useful for a new "contract" between cultural heritage organizations and the public.
Barbara Taranto, Digital Library Program, New York Public Library
As we move deeper into the 21st century, it would be best to anticipate the issues of audience and move to understand, strategize and plan for future use of digital materials. Should we do it, can we do it and how can it be done?
In the world of books, one does not often hear of a novel that was horrid, but pages 195 through 200 were brilliant. In the theater occasionally one may hear that the playwright should reconsider his or her choice of career, even though the performance by certain actors was inspired. Not so with film. It is an oddity of the cinematic arts that on the whole a film may be maudlin, tedious, obtuse, cloying, overacted, under-acted or simply bad, and still be memorable, not alone for its shortcomings, but for a single moment when the presentation transcends the story and becomes something altogether different.
So it was when Henry Thomas and Drew Barrymore stalwartly supported their extraterrestrial guest as it sent an electronic mayday out into the cosmos. For the children in the audience there was nothing incredible about ET sending out a distress signal and expecting a response, even if they didn't entirely believe in a little squished chocolate-marshmallow man with superior intelligence. Those children wholeheartedly believed, and continue to believe, in the audience "out there".
As libraries consolidate and continue to increase their special collections, the issues of short-term and long-term audience are becoming heightened. At one time libraries, both circulating and not, and especially research libraries, were reasonably certain about the topography of their audiences. In fact, in many cases it was the prescribed audience, and not collection development per se, that shaped the identity and role of the institution. But the practices of public service that are so familiar within the non-research community are fast becoming the concerns of the specialized libraries.
As we move deeper into the 21st century, it would be best to anticipate the issues of audience and move to understand, strategize and plan for future use of digital materials. Not so much because the technology is changing rapidly, but because the audience is evolving more quickly than libraries may be prepared to accommodate.
Who is the new "us"? And who will that be in five years or ten years? Do we have an obligation to meet the expectations of the new audience given that it is changing so rapidly? Can we afford not to?
Nancy Allen, Dean and Director, Penrose Library, University of Denver
The Colorado Digitization Project has had three years of experience in collaborative approaches to digitization of primary resource material from scientific and heritage organizations of many types. The CDP provided infrastructure (web-based project management tool kit, training, regional scanning centers, metadata creation tools, a finding system, small incentive grants) to encourage all types of cultural heritage institutions to work together. Libraries of many types, museums and historical societies large and small, and archives with various emphases have all partnered to contribute digital resources through a coordinated statewide effort. This presentation will cover many of the cross-organizational issues involved in such collaboration, including museum and library values and culture, models for decision making and collaborative organization, development of standards and best practices that work for all collaborators, and interoperability issues and solutions.
Professor Arms will speak about the vision for a National Science Digital Library for Science, Math, Engineering, and Technical Education, and how that vision is being realized through a major program of the National Science Foundation.
Denise Troll, Associate University Librarian, Carnegie Mellon University
Denise Troll will present the results of an extensive survey of methods used by leading digital libraries to measure the use and usability of online collections and services. The study offers a qualitative, rather than quantitative, look at experiences conducting digital library assessments. The results are not comprehensive or representative of library efforts, but indicative of trends in library practice. The trends identify popular research methods and common problems encountered when using these methods to assess and enhance access to and usability of online collections and services. The most popular research methods are survey questionnaires, focus groups, user protocols, and transaction log analysis. The common problems encountered include difficulty recruiting representative research subjects; difficulty selecting and using the appropriate research method; and difficulty analyzing, interpreting, presenting, and applying the data gathered to strategic planning and decision-making. The survey results indicate that the internal organization of libraries and the skills, preferences, and assumptions of librarians can be the biggest impediments to conducting successful assessments and implementing the findings. The presentation at the DLF Forum will focus on significant, common issues and concerns encountered when conducting assessments using popular research methods, and conclude with suggestions for future research or ways to address these concerns.
Rue Ramirez, Digital Library Services, University of Texas at Austin
Two years ago the University of Texas at Austin received a grant from the Institute of Museum and Library Services (IMLS) to create a website that would provide evaluative techniques and tools to managers of museum and library websites. This presentation will provide a project summary and an overview of the techniques and tools we will be offering on the site.
Thornton Staples, University of Virginia
Carl Lagoze, Cornell University
The presentation will introduce a session devoted to review of different approaches to the development of comprehensive digital library architectures. Lagoze and Staples will discuss their practical work implementing the Fedora repository architecture.
Lorcan Dempsey, Vice President, Research, OCLC
Interoperability is much advocated, less often achieved. As our services are increasingly designed as elements in a distributed environment, so we increasingly recognise the need to architect this environment. This presentation describes a set of activities which have resulted in one definition of such an environment (http://www.dner.ac.uk/dner/). The potential advantages of such an approach are that it gives a framework for discussing interoperability, that it provides a common frame of reference for discussing product and service offerings, that it allows dependencies and roles to be identified, and that it supports a development agenda. Using this framework as a basis, the presentation will discuss interoperability issues and suggest some likely service directions.
Daniel Greenstein, Director, DLF
James M. Jackson Sanborn, Data Services Librarian, and Charley Pennell, Head, Cataloging Department
This presentation will focus on the planned usage of Blue Angel Technology's MetaStar product by the North Carolina State University Libraries. Library users are faced with an often bewildering array of research resources, including the catalog, electronic indexes and databases, spatial and numeric data collections, websites, Special Collections resources, etc. In attempting to provide a simpler yet more useful research gateway, NC State University Libraries have identified the goal of facilitating user-defined cross collection searching. There are a number of possible approaches to accomplish this. One solution that NC State Libraries is investigating centers on the use of MetaStar. MetaStar is a metadata harvesting, management, indexing, and searching tool. Through a modular architecture that uses XML as a connection tool, MetaStar provides the ability to harvest and index metadata from webpages, manage new and existing collections of metadata, and search multiple local and remote collections concurrently. Still in the early research and prototyping stages, we have been designing collections of intranet metadata, spatial and numeric data search tools, and subject specific gateway search interfaces.
Marty Kurth, Head of Cataloguing, Cornell University
A team at Cornell University Library is engaged in an effort to provide access to multiple digital collections as well as its entire online catalog through a single searching-and-browsing interface. By using the ENCompass digital library management system, the team is organizing a domain of digital objects that range in size from the full text of a single pamphlet page to a database of over 100,000 electronic books in order to map those objects into the management system's three-level record structure of collections, containers, and objects. Striving to provide users with intelligible result sets has presented many challenges in three notable areas. First, representing a digital object's relative size within the system as a whole can be done via a collection-specific object record when the objects in a collection are of similar granularity, but objects in variably granular collections resist this approach. Second, presenting navigational paths to users to minimize disorientation by such means as clearly identifying the collections searched and foregrounding the hierarchical nature of the system's record structure inevitably runs into inflexibilities in system architecture. Finally, refining the interface vocabulary to describe systematically the record structure and the relationships between the records in the structure confronts the lack of commonly accepted user terms for digital objects and the relationships among them. The presentation will describe the team's efforts to address these challenges thus far. The Cornell University Library ENCompass Team includes Karen Calhoun (Team Leader), Meryl Brodsky, George Kozak, Marty Kurth, Fred Muratori, David Ruddy, Tom Turner, and Sarah Young.
Elaine Westbrooks, Metadata Librarian, Cornell
Adam Chandler, CTS Information Technology Librarian, Cornell University Library
Vivek Uppal, graduate student, Cornell University
The Cornell University Geospatial Information Repository (CUGIR) is a clearinghouse that provides unrestricted access to geospatial data and metadata, with special emphasis on those natural features relevant to agriculture, ecology, natural resources and human-environment interactions in New York State. CUGIR, like all nodes within the National Spatial Data Infrastructure (NSDI), is available for searching by way of the Z39.50 information retrieval protocol. Implementation complexity, retrieval accuracy, scalability, and performance issues make the Z39.50 architecture problematic within this context. Our presentation will describe a system at the Cornell University Library that bridges the gap between today's NSDI, traditional MARC-based library catalogs and utilities, and the Open Archives Initiative (OAI). The glue binding these different digital objects together is our implementation of Michael Nelson's pioneering work with "buckets." This session will provide an overview of the steps, challenges, and problems involved with mapping and managing complex metadata surrogates across standards and systems. Finally, we will speculate on how the OAI harvesting protocol may be seen as an alternative to the current Z39.50 NSDI system.
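Part of the OAI protocol's appeal over Z39.50 in this setting is its simplicity: harvesting is a handful of stateless HTTP GET requests with XML responses. A minimal sketch of composing such a request (the provider endpoint and set name are hypothetical):

```python
from urllib.parse import urlencode

def oai_request(base_url, verb, **args):
    """Compose an OAI-PMH request URL. The protocol defines six verbs;
    ListRecords with a metadataPrefix retrieves batches of records."""
    return base_url + "?" + urlencode({"verb": verb, **args})

url = oai_request(
    "http://cugir.example.edu/oai",  # hypothetical provider endpoint
    "ListRecords",
    metadataPrefix="oai_dc",
    set="geodata",                   # hypothetical set name
)
```

A harvester issuing this request would page through results via the resumptionToken returned in each XML response and then index the harvested records locally, sidestepping Z39.50's live distributed search.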
Jerome McDonough, New York University
Morgan Cundiff, Library of Congress
MacKenzie Smith, Harvard University;
Rick Beaubien, University of California at Berkeley
METS is a generalized metadata framework developed to encode the structural metadata for objects within a digital library, together with related descriptive and administrative metadata. METS provides for the responsible management and transfer of digital library objects by bundling and storing appropriate metadata along with the digital objects. METS is expressed in XML, which means that METS data is stored according to platform- and software-independent encoding standards, such as UTF-8 (Unicode), ISO-8859-1, etc.
One important application of METS may be as an implementation of the Open Archival Information System (OAIS) reference model: a METS document can function as a Submission Information Package (SIP) for use as a transfer syntax; a Dissemination Information Package (DIP) for display or other applications; and an Archival Information Package (AIP) for storing and managing information internally.
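A skeletal METS document can illustrate the bundling described above: descriptive metadata, a file inventory, and a structural map travel together in one XML package. This sketch builds only the bare outline (the identifiers are hypothetical, and real METS documents carry much more inside each section):

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

def q(tag):
    """Qualify a tag name with the METS namespace."""
    return "{%s}%s" % (METS_NS, tag)

mets = ET.Element(q("mets"))
ET.SubElement(mets, q("dmdSec"), {"ID": "DMD1"})     # descriptive metadata
file_sec = ET.SubElement(mets, q("fileSec"))         # inventory of content files
file_grp = ET.SubElement(file_sec, q("fileGrp"))
ET.SubElement(file_grp, q("file"), {"ID": "FILE1"})
struct_map = ET.SubElement(mets, q("structMap"))     # structural map
div = ET.SubElement(struct_map, q("div"), {"TYPE": "page"})
ET.SubElement(div, q("fptr"), {"FILEID": "FILE1"})   # ties structure to a file

xml = ET.tostring(mets, encoding="unicode")
```

Because everything needed to interpret the object rides inside the one document, the same package can serve as an OAIS SIP on ingest, an AIP in storage, or a DIP on delivery.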
This panel will give the background and an overview of the schema, clarifying the distinction between a metadata framework (such as METS) and metadata wrapper schemes (such as RDF). Panelists from UC Berkeley, the Library of Congress, and Harvard will review their current and planned usage of METS, and will give examples of tools in development for producing and using METS objects. There will be plenty of time for questions and discussion.
Anne Kenney, Director of Programs, Council for Library and Information Resources
Libraries and others are digitizing increasing quantities of printed material for online access without agreement on any desirable level of imaging quality. The DLF is working to identify, and build support for, specifications acceptable as the minimum necessary for digitally reproducing printed books and serial publications with fidelity. Adoption of such benchmarks would help users and libraries both. Users could have more confidence in the fidelity of digital reproductions available to them. And libraries could produce and maintain reproductions with confidence that expensive re-digitization would not become necessary. Digital reproductions meeting at least the benchmarks' minimum specifications would remain viable even as reproduction techniques improved.
Also, because such texts would have well-known, consistent properties, they could support a wide variety of uses (including uses not possible with printed texts). Additionally, agreement on minimum benchmarks for digital reproductions of printed publications is an essential first step for libraries that wish to investigate whether they could manage and preserve print materials more effectively if they relied more heavily on digital reproductions for access. The draft benchmark is currently being reviewed by DLF member libraries who are being asked to endorse it. The guidelines have been shared broadly with the DLF membership, and comments are due back soon. Anne R. Kenney, will describe the work to date, its status, and next steps. (For details, see http://www.diglib.org/standards/draftbmark.htm.)
Steve Chapman, Preservation Librarian for Digital Initiatives, Weissman Preservation Center, Harvard University
This paper will report on the outcomes of a DLF-sponsored meeting of 20 expert imaging practitioners to address the question, "How can we assess the quality of digital images without ambiguity?" Given that libraries and museums have been and will likely continue to use different specifications for image scanning, it is important to consider whether these collections will be interoperable, particularly when federated in initiatives such as ArtSTOR. The forum gave practitioners the opportunity to exchange ideas about what is "good" quality in images and imaging systems, then to prioritize needed tools, applications, and training to meet institutional and collective goals to make digital reproductions of consistent quality and persistent utility.
Robin Dale, Member Programs and Initiatives, Research Libraries Group
The speaker will act as respondent to the two papers presented in this session and place the initiatives they describe in their national and international context.
The University of Illinois has been funded by the Andrew W. Mellon Foundation to research and implement resource discovery services based on the Open Archives Initiative (OAI) Protocol for Metadata Harvesting (PMH). One of seven OAI Metadata Harvesting projects funded by Mellon, the Illinois Metadata Harvesting Service will focus on indexing item-level metadata describing cultural heritage holdings of libraries and museums with a special emphasis on the harvest of Encoded Archival Description (EAD) finding aids and metadata describing cultural heritage content held by CIC libraries and allied archives and museums. A prototype portal allowing search and retrieval of harvested metadata will be developed and tested with end users. A primary objective of the Illinois project is to investigate and document the potential of OAI-based services to reveal and make more accessible "hidden" online scholarly information resources in the cultural heritage domain. We will be investigating a wide-range of issues including those related to search interface design, value-added indexing functions, presentation of search results, transformations between heterogeneous metadata schemas and subject taxonomies, portal usage patterns, and the influence on search success of adherence to metadata creation "best practices." This presentation will describe preliminary system architecture, current project plans, and work done to date. A preliminary harvest of more than 150,000 metadata records from a half dozen OAI Metadata Providers has been completed and will be discussed. Open Source tools and sample implementations for both OAI Provider and OAI Harvesting services have been developed and also will be discussed during this presentation.
Chuck Thomas, Digital Projects Librarian, University of Minnesota Libraries
Since project inception in April 2000, the IMAGES (Image Metadata Aggregation for Enhanced Searching) initiative has been a major component of the University of Minnesota Libraries digital library effort. IMAGES is both a delivery platform for all digital imaging projects within the University Libraries, and a metadata aggregator for the entire campus. The IMAGES model is offered as a strategy for establishing libraries as THE place to find digital content on campuses. The public interface to IMAGES just went live in July 2001. The presentation on IMAGES will cover the main concepts behind the project, including a few unique features not yet found in other sites; lessons and decisions of the design process; selling the concept to a campus community; the intended role of IMAGES in a national context; initial reception by both internal and external audiences; and future plans for growth.
Speaker to be advised
Donald J. Waters, Program Officer, Scholarly Communications, The Andrew W. Mellon Foundation
The Andrew W. Mellon Foundation has supported a wide variety of digital library projects over the last seven years. The speaker will highlight some of the lessons learned in these projects, identify continuing and emerging needs for connecting scholarly resources to ongoing teaching and research, and explore with the audience how digital libraries might best be shaped to meet these needs.
Daniel Greenstein, Director, DLF