Harvard University
Report to the Digital Library Federation
January 15, 2001


  • Collections, Services, and Systems
  • Projects and Programs
  • Specific Digital Library Challenges

    I. Collections, Services, and Systems

    The HOLLIS Portal
    Launched at the end of August, HOLLIS is a comprehensive and powerful "portal" web interface that presents a single, organized view of web-accessible resources to the Harvard community. It provides access to union catalogs and more than 1,500 electronic resources. HOLLIS also organizes and provides access to information about Harvard's libraries, allowing users to extend searches to individual library web pages. The portal incorporates a number of recently deployed LDI services and systems, such as VIA (Visual Information Access) and OASIS (Online Archival Search Information System).
    Information is available at: http://hul.harvard.edu/ois/systems/hplus/index.html
    The system is available at: http://lib.harvard.edu/

    OASIS (Online Archival Search Information System)
    A web-based union catalog of archival and manuscript finding aids created by archives and repositories throughout the University. OASIS currently contains 305 finding aids contributed by Harvard libraries, museums, and academic departments. Curators and archivists are now able to contribute to OASIS by using off-the-shelf software (mostly WordPerfect SGML Edition) to convert print finding aids into digital form.
    Information is available at: http://hul.harvard.edu/ois/systems/oasis/
    The system is available at: http://oasis.harvard.edu/

    VIA (Visual Information Access)
    A web-based union catalog describing the photographs, prints, drawings, paintings, and other visual resources held by Harvard libraries, archives, and museums. VIA contains more than 98,000 cataloging records and 19,094 thumbnails, and during its first full year of operation averaged 300 connections and 780 searches per month.
    Information is available at: http://hul.harvard.edu/ois/systems/via/
    The system is available at: http://via.harvard.edu:748/html/VIA.html
    The primary development for VIA this year was the launch of OLIVIA, a catalog maintenance system that supplies VIA with descriptive data and images.
    Information is available at: http://hul.harvard.edu/ois/systems/olivia/

    Nineteenth-Century American Trade Cards
    An LDI-funded project that provides access through VIA to catalog records and images of 1,000 nineteenth-century trade cards selected from the Baker Library Historical Collections. To view this collection, go to VIA at http://via.harvard.edu:748/html/VIA.html, follow the Search VIA link, and enter the phrase "trade cards" in the first search window after "search for."

    Digital Repository Services (DRS)
    DRS provides secure, long-term storage for all categories of digital material. In addition to storage, the DRS provides management services such as archiving and reporting on use, and it serves a delivery function by interacting with catalogs and other delivery applications. The first production release of this system was in October.

    Access Management Service (AMS)
    AMS coordinates several steps necessary for the security of networked resources: user authentication, user profiling, and access protocols specific to categories of resources. The security developed by Harvard's LDI will be available for use by all University library systems and is designed to minimize the need for repeated authentication of users. AMS was first used with the launch of the HOLLIS portal in August to control access to resources.
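
    The three steps described above (authentication, user profiling, and access rules per category of resource) can be illustrated with a minimal sketch. It is not the AMS implementation: the Session class, the ACCESS_RULES table, the affiliation names, and the resource categories are all hypothetical, chosen only to show how one authenticated session might be reused across several access decisions without asking the user to log in again.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Session:
    """Hypothetical result of a single successful authentication."""
    user_id: str
    affiliations: set = field(default_factory=set)  # the user's profile


# Hypothetical rules: resource category -> affiliations allowed to use it.
# None means the category is open to everyone, authenticated or not.
ACCESS_RULES = {
    "public": None,
    "licensed": {"student", "faculty", "staff"},
    "course-reserves": {"student", "faculty"},
}


def may_access(session: Optional[Session], category: str) -> bool:
    """Decide access from the session profile and the category's rule."""
    if category not in ACCESS_RULES:
        return False  # unknown categories are denied by default
    allowed = ACCESS_RULES[category]
    if allowed is None:
        return True
    if session is None:
        return False  # restricted category and no authenticated user
    return bool(session.affiliations & allowed)


# One login, several decisions: the same session object is reused.
session = Session("user-1", {"staff"})
print(may_access(session, "licensed"))         # True
print(may_access(session, "course-reserves"))  # False
print(may_access(None, "public"))              # True
```

    Keeping the rules in one table per category, rather than per resource, mirrors the report's description of "access protocols specific to categories of resources."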

    Electronic Reserves is a web-based service that provides students with online access to course reserves reading materials. The service, piloted in 1999 and expanded in 2000, is currently limited to shorter print materials such as articles and book chapters. Working with the Faculty of Arts and Sciences and the Harvard Divinity School, E-Reserves made available 903 citations for 38 courses in the spring semester and 1070 citations for 38 courses in the fall semester.
    Information is available at: http://hul.harvard.edu/ois/systems/ereserves/
    The system is available at: http://ereserves.harvard.edu/

    II. Projects and Programs

    Library Digital Initiative (LDI)
    LDI, now in its third year, was launched in July 1998 as a five-year program to develop the University's capacity to manage digital information. Information about the program, including annual reports for the first and second years, is available on the web site at:

    Internal Challenge Grant Program
    Through LDI, Harvard University is funding digital projects contributed from throughout the University. The goals of these projects range from digital conversion of analog materials to the development of new systems to deliver material that is "born digital." These projects are helping to prioritize and define the requirements of LDI's infrastructure systems, to test the resulting systems and services, and to contribute content. Descriptions of the projects, along with their URLs, will be provided in the Collections, Services, and Systems section of this report as the projects become available online.

    Advisory Services
    One of the key components of Harvard University's Library Digital Initiative is Advisory Services, which provides expertise and technical assistance to libraries, archives, museums, and research projects throughout the University that are involved in collecting or creating digital resources. Assistance is provided in the following areas:
    • Digital Acquisitions ~ issues of licensing, contracting, and vendor relations.
    • Metadata ~ standards and best-practice guidance on creating data for access to and description of digital materials and for managing digital collections.
    • Reformatting ~ technologies, standards, vendors, and workflow design. Includes both analog-to-digital conversion (digitization) and digital-to-digital conversion (digital processing).

    Harvard University Library participated in a trial of the context-sensitive linking technology called SFX. SFX technology allows institutions to provide tailored linking and navigation between heterogeneous, distributed information systems. Harvard is one of a number of libraries and research centers that are testing the use of SFX to enable linking between different types of electronic resources. The SFX server software was developed by researchers at the University of Ghent (Belgium) and has since been acquired by Ex Libris, a developer of applications for libraries and information centers.

    Harvard University is participating in the alpha testing of Lots of Copies Keep Stuff Safe (LOCKSS). LOCKSS, a project of Stanford University's HighWire Press, preserves access to scientific journals published on the web by maintaining multiple copies at various sites and conducting periodic comparisons among them to ensure that the materials remain consistent and authentic. The system aims to ensure that digital resources are protected through large-scale replication. Testing began in August 2000.
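
    The idea of keeping several copies and comparing them periodically can be illustrated with a minimal sketch. It is not the LOCKSS protocol, whose details this report does not describe: it assumes a simple majority comparison of SHA-256 digests, and the site names and sample content are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return a SHA-256 fingerprint of one stored copy."""
    return hashlib.sha256(content).hexdigest()


def audit_copies(copies: dict) -> tuple:
    """Compare all copies and report which sites disagree with the majority.

    `copies` maps a (hypothetical) site name to the bytes that site holds.
    Returns the majority digest and the sorted list of dissenting sites.
    """
    digests = {site: digest(data) for site, data in copies.items()}
    majority_digest, _count = Counter(digests.values()).most_common(1)[0]
    dissenting = sorted(
        site for site, value in digests.items() if value != majority_digest
    )
    return majority_digest, dissenting


copies = {
    "site-a": b"Volume 1, Issue 1",
    "site-b": b"Volume 1, Issue 1",
    "site-c": b"Volume 1, Issue l",  # simulated damage: "1" became "l"
}
majority, damaged = audit_copies(copies)
print(damaged)  # ['site-c']
```

    A site flagged this way could then be repaired from any copy that matches the majority, which is why the approach depends on having many independent replicas.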

    Virtual Data Center (VDC)
    The Harvard University Library and the Harvard-MIT Data Center, through a grant from the National Science Foundation (NSF) Digital Libraries Initiative Program, and in partnership with the University of Michigan, are developing an open source "digital library in a box" for social sciences data. An alpha version will be launched in early 2001. This release will provide a lightweight digital repository service, a Z39.50 and Open Archives Cataloging Service, integrated online exploratory data analysis, and support for the creation, cataloging, and preservation of distributed virtual collections.

    Mellon Electronic Journal Archiving Project
    The Harvard University Library was recently awarded a planning grant to explore the issues involved in establishing a third-party repository for electronic journals. The Andrew W. Mellon Foundation, with cooperation from the Digital Library Federation and the Council on Library and Information Resources, established an Electronic Journal Archiving program to take steps toward solving the problem of archiving digital scholarly journals. The program awarded a number of planning grants to major research institutions throughout the country, including Harvard University.

    The expected outcome of the grant is a plan for implementing an experimental archive for electronic journals. The plan will include agreements with publishers regarding archival rights and responsibilities, methodologies that the archive would adopt to validate its archival processes, and organizational and business models of the archive. The planning effort will last for one year, throughout 2001.

    III. Specific Digital Library Challenges

    Structural metadata
    Harvard is currently investigating options and standards for capturing and encoding the "structural metadata" of complex digital objects (e.g., books, serial runs, and multimedia documents). We are looking particularly at the Making of America II XML DTD (MOA2) and will be developing software that uses MOA2 for simple page-turning applications during the coming year.
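
    To show what structural metadata contributes to a page-turning application, here is a minimal sketch. It does not use the MOA2 DTD or its element names: the Page and Division classes, the file names, and the sample book are hypothetical stand-ins for whatever an encoded structure would supply, namely the divisions of an object and the order of the page images within them.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    label: str       # printed page label, e.g. "ii" or "12"
    image_file: str  # hypothetical file name of the page image


@dataclass
class Division:
    title: str
    pages: list = field(default_factory=list)


def reading_order(divisions: list) -> list:
    """Flatten the divisions into the single sequence a page turner walks."""
    return [page for division in divisions for page in division.pages]


def turn(pages: list, current: int, step: int) -> int:
    """Return the new page index, clamped to the first and last page."""
    return max(0, min(len(pages) - 1, current + step))


book = [
    Division("Preface", [Page("i", "0001.tif"), Page("ii", "0002.tif")]),
    Division("Chapter 1", [Page("1", "0003.tif"), Page("2", "0004.tif")]),
]
pages = reading_order(book)
position = turn(pages, 1, +1)  # from page "ii" forward into Chapter 1
print(pages[position].label)   # 1
```

    The point of the sketch is that the page labels and the division boundaries live in the metadata, not in the image files, so the same images can be navigated by sequence, by printed page number, or by chapter.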

    Archiving systems
    As we move forward from developing first-generation digital repository services toward more fully functional digital archives, Harvard is investigating architectures for building such archives. We are starting with the OAIS reference model and researching what others have developed to support this model. To advance this work, Harvard has received a grant from the Mellon Foundation to research and propose a digital archive for electronic journals during 2001.

    Methods to measure image quality
    Last year, DLF published the series "Guides to Quality in Visual Resource Imaging." These essays point to the need for acceptable methods and tools to assess digital image quality. Such methods would ensure, for example, that downstream image processing steps (for delivery or migration) would result in little loss of quality. We are working with experienced imaging practitioners to assess whether a common method of quality assessment can be achieved. If it is achievable, we will work to facilitate the development of appropriate tools and techniques.

    Long-Term Storage
    Over the next 6 to 12 months, planning for cost-effective, long-term storage of massive amounts of data will become a large-scale issue for us. As our Digital Repository grows and accepts new formats with very large file sizes, we will no longer be able to store all of the objects on "spinning" media. Currently, all of our objects are stored on Network Attached Storage (NAS, NFS-mounted disks). We believe that we will need to move large files to cheaper, offline media while maintaining a service model with reasonable response times.
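
    The trade-off described above, moving large files to cheaper offline media without hurting response times for material in active use, can be sketched as a simple placement rule. This is an illustration under stated assumptions, not a description of the Digital Repository: the tier names, the 500 MB size limit, and the 180-day idle period are hypothetical values.

```python
from dataclasses import dataclass

ONLINE = "online-disk"     # hypothetical name for the "spinning" tier
OFFLINE = "offline-media"  # hypothetical name for the cheaper tier


@dataclass
class StoredObject:
    name: str
    size_bytes: int
    days_since_last_access: int


def choose_tier(obj: StoredObject,
                size_limit: int = 500 * 1024**2,
                idle_days: int = 180) -> str:
    """Send an object offline only if it is both large and rarely used.

    Both thresholds are assumptions made for illustration.
    """
    if obj.size_bytes >= size_limit and obj.days_since_last_access >= idle_days:
        return OFFLINE
    return ONLINE


objects = [
    StoredObject("page-image.tif", 40 * 1024**2, 400),
    StoredObject("audio-master.wav", 2 * 1024**3, 400),
    StoredObject("video-master.mov", 8 * 1024**3, 3),
]
for obj in objects:
    print(obj.name, choose_tier(obj))
```

    Requiring both conditions keeps large but frequently requested objects on disk, which is one way to preserve reasonable response times while still reducing the cost of storing the largest files.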