DIGITAL LIBRARY FEDERATION SPRING FORUM 2007
PASADENA, CALIFORNIA
APRIL 23 – 25, 2007
The Westin Pasadena Hotel
191 Los Robles Avenue
Pasadena, California 91101
+1-626-304-1442
DAY ONE: Monday, April 23
PRECONFERENCE 8:30 a.m. – 12:30 p.m.
Developers' Forum—open (Fountain III) John A. Kunze, chair (California Digital Library)

The Developers' Forum is intended for technology developers, technical managers, or those representatives that have influence in decision-making at their institutions. This gathering provides dedicated time at the DLF Forum to meet together to share problems, ideas, and solutions of a technical nature.

Developers' Forum Agenda

8:30 a.m. – 12:30 p.m.
DLF Board of Trustees Meeting (Fountain IV)
DLF Aquifer Services Working Group Meeting—for project participants (Arcadia)
DLF Aquifer Metadata Working Group Meeting—for project participants (The Lieshman Boardroom)
10:30 a.m. – 12:30 p.m.
Registration (Fountain Foyer)
11:30 a.m. – 12:30 p.m.
First-time Attendee Orientation (Fountain Ballroom) Barrie Howard (Digital Library Federation)
Keynote Address 1:00 p.m. – 1:30 p.m.
Architectures for Collaboration: Digital Library Directions. (Fountain Ballroom) Peter Brantley (Executive Director, Digital Library Federation)
1:30 p.m. – 2:15 p.m.
Session 1: Content Proliferations: Libraries and Publishers. (Fountain Ballroom) Alison Mackeen, Maria Bonn, Chuck Henry, and Stephen Rhind-Tutt

The development of a wide range of alternative forms for content delivery and re-use, in concert with the continued growth of XML-based media production systems among libraries and publishers, presents new opportunities for libraries to lower impedances in the flow of content across presentation outlets. How can libraries collaborate with publishers to enrich content presentation through content mashups, the derivation of logical and semantic linkages across diverse online content sources, and the efficient indexing of evolving media?

2:15 p.m. – 3:00 p.m.
Session 2: Municipal Harbors: Sustaining Collaborative Preservation. (Fountain Ballroom) Sayeed Choudhury, Abby Smith, and Laura Campbell

Preservation is a community responsibility requiring significant coordination among our institutions. Provisioning adequate distributed storage, establishing community norms for service commitments, developing sustainable funding strategies, and creating policy covenants are all necessary pre-requisites.

Perhaps more fundamentally, institutions will have to produce software toolsets facilitating preservation activities such as curation, content deposit, presentation of inventories for publication in registries facilitating discovery, and the maintenance of file formats. How can digital libraries coordinate the development and sustenance of a rich-enough ecosystem of tools to ensure a low adoption threshold for the initiation of preservation activity across the widest possible number of institutions?

3:00 p.m. – 3:30 p.m.
Break
3:30 p.m. – 4:15 p.m.
Session 3: Public Libraries: Services in Online Scholarly Communities. (Fountain Ballroom) Ben Vershbow and Noah Wittman

Communities of scholars are constructing domain-focused social spaces that encourage collaboration, shared production, annotation, editing, and authorship. They typically use increasingly easy-to-use open source content management systems such as Django and WordPress that enable the sharing of a wide range of content, including images, videos, and texts. Arguably, libraries could be valuable partners in these online spaces, providing infrastructure as well as expertise in content production, discovery, and manipulation. How can libraries explode their registries and services outward to provide both content and technical support?

4:15 p.m. – 5:00 p.m.
Session 4: Google Book Services: Build or Borrow. (Fountain Ballroom) Rick Luce, John Price-Wilkin, and Michael A. Keller

Google has tremendous intellectual capacity. Is it worthwhile for libraries to compete with its services? Or is it better to partner in their definition? Do we need to know how to do these things "just in case" for our own purposes, should Google default on their provision? If that is the case, how can we build effective collaborative ventures as an alternative? Would these ventures permit the exploration of new vertical functionalities, more specific and valuable for the academy, that Google does not provide?

5:00 p.m. – 6:00 p.m.
Session 5: OpenID. (Fountain III) Scott Kveton (JanRain)

A solution to the single sign-on problem has been long in coming. For far too long users have had to deal with too many usernames and too many passwords. The emergence of OpenID has been a surprise to many and a relief to others. Simple single sign-on is here, and it's growing at an amazing rate in the form of OpenID. Come hear Scott Kveton talk about the whys and hows of OpenID and what it's going to mean to the digital library.

Session 6: PANEL: Beyond the IR: Creating a Sustainable Model for Supporting Digital Scholarship. (Fountain II) Todd Grappone, Deborah Holmes-Wong, Bruce Zuckerman, and Marsha Kinder (University of Southern California)

As digital scholarship becomes the norm for faculty members, there is increasing pressure on the university library to store and preserve born-digital content. Over the past few years there has been a leap in born-digital projects and in scholarship conceived entirely for new media.

The USC Libraries' creation of a suite of services known as AIMS (Archiving, Imaging, and Metadata Services) has provided opportunities to collaborate with technology-savvy faculty in creating and preserving faculty research, and to work toward a sustainable campus infrastructure for digital scholarship.

Todd Grappone, University of Southern California Associate Executive Director, Information Development & Management, will discuss the challenges of creating a sustainable business model for funding faculty digital scholarship and its ongoing maintenance and preservation. He will discuss the creation of Archiving, Imaging, and Metadata Services, a suite of services the USC Libraries provides to faculty to assist them as they begin their foray into digital scholarship. He will also cover strategies for creating the infrastructure and for aligning grant funding, institutional support, and community interest so that these projects persist long after the initial grant funding has been spent.

Marsha Kinder, University of Southern California, Associate Vice Provost for Research Advancement in the Humanities, will discuss a recent collaboration with the USC Libraries: the archiving of Russian Modernism, a suite of learning objects created in conjunction with Slavic scholars at USC, the University of California, Berkeley, and the University of Chicago. This Labyrinth Project initiative uses cutting-edge database and animation technologies to create an active and engaging educational experience for students.

Bruce Zuckerman, University of Southern California, University Professor, will discuss collaborations between the USC Libraries and the West Semitic Research Project (WSRP). For the past 20 years WSRP has used advanced photographic and computer imaging techniques to document objects and texts from the ancient world. In doing so it has built a vast collection of images that it makes available to scholars, students, educators, and the general public in a variety of ways, including InscriptiFact, a database designed to allow access via the Internet to high-resolution images of ancient inscriptions from the Near Eastern and Mediterranean Worlds.

Professor Zuckerman will discuss challenges inherent in digitizing ancient writings drawn from numerous collections and countries. He will highlight the ways in which new imaging techniques and other technology are affecting scholarship and our understanding of the ancient world. Drawing from his recent service on the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences, he will underscore the role that libraries must play in helping scholars find and bring together resources in new and intellectually provoking ways.

Deborah Holmes-Wong, project manager/librarian, will address the operational challenges of working with new and existing projects. She will discuss the process of recruiting faculty and evaluating the feasibility of projects, developing budgets, working with faculty on grants, and allocating staffing and other resources to complete projects.
7:00 p.m. – 9:00 p.m.
Evening Reception (The Huntington Library) A special event hosted by the DLF Board of Trustees.
7:00 p.m. – 8:00 p.m.
POSTERS (The Huntington Library)
1) New Tools for Digital Photography: Polynomial Texture Maps and Digital Rollouts. Bruce Zuckerman, Todd Grappone, Marilyn Lundberg, Leta Hunt, and Matt Gainer (University of Southern California) For the past two decades the West Semitic Research and InscriptiFact Projects have been developing methodologies and tools to make the study of ancient texts more accurate, productive, and accessible. Under the direction of Bruce Zuckerman, these research programs have consistently adopted new technologies to help capture, archive, and deliver high-resolution images of ancient inscriptions from the Near Eastern and Mediterranean Worlds. From tuning wavelengths of infrared or ultraviolet light to bring out unseen details from the Dead Sea Scrolls and other manuscripts, to developing customized camera-rigs and lighting to minimize damage and maximize detail when photographing everything from cylinder seals to monumental inscriptions, they have solved complex imaging problems associated with photographing ancient objects.

The USC Libraries Imaging Lab has been working with WSRP and the InscriptiFact team to replace analog tools with their digital equivalents. Along with transitioning from large-format film to high-resolution digital capture, this collaboration has explored two uniquely powerful tools for digitizing ancient artifacts: Polynomial Texture Maps (PTMs, originally developed by Hewlett-Packard Imaging Labs) and Digital Rollouts. PTMs allow users to manipulate the intensity and direction of lighting of images on their screen. In effect, a researcher can use a mouse to change both the quality and direction of light on an object. Digital Rollouts allow users to view 'flattened' digital images of cylindrical objects at extremely high resolution.

This poster will provide an overview of how PTMs and Digital Rollouts are created, and how Digital Libraries can leverage these unique technologies for capturing three-dimensional objects.
2) Update on CCO (Cataloging Cultural Objects) and CDWA (Categories for the Description of Works of Art) Lite XML Schema. Murtha Baca and Karim B. Boughida (The Getty Research Institute) In June 2006, ALA Editions published Cataloging Cultural Objects: A Guide to Describing Cultural Works and Their Images (CCO). This poster session will present and promote this standard for descriptive metadata for works of art, architecture, and material culture, and their visual surrogates. We will also present the CDWA Lite XML schema, a data format standard informed by the CCO rules that is one of the recommended formats for harvesting metadata records via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The session will include a demonstration of CDWA Lite metadata records and images from the Getty Museum and the Getty Research Institute that were harvested by ARTstor and incorporated into its Image Gallery.
3) Soliciting User Response to Aggregated Metadata Portal Design. Michelle Dalmau (Indiana University), Anne Karle-Zenith (University of Michigan), Muriel Foulonneau (Centre de la Communication Scientifique Directe, Centre National de la Recherche Scientifique), Karen Miller (Northwestern University), Timothy W. Cole (University of Illinois at Urbana-Champaign), and Christopher J. Devers (University of Illinois at Urbana-Champaign) The Library of the University of Illinois at Urbana-Champaign (UIUC), in collaboration with the other research libraries of the Committee on Institutional Cooperation (CIC), has created and implemented an OAI-PMH metadata harvesting service (the CIC Search Portal) providing access to diverse information resources held by participating CIC institutions. Content represented in the CIC Search Portal ranges widely in scope, format, and target audience (e.g., spoken word recordings, theses and dissertations, software, videos, images, texts). To better assess the Portal's design and functionality, a multifaceted usability study was administered across three CIC institutions: Indiana University, UIUC, and Michigan State University.

Objectives were to: assess the success of the Web interface in helping users overcome the inherent and often difficult-to-fix problems associated with searching aggregated metadata; assess the viability of the interface design as a basis for a portal that could positively impact research and instruction; and better understand the diverse audience (librarians, teachers, students, scholars, etc.) who could benefit from using this tool, along with their specific information needs. The study built on and leveraged previous studies of OAI-based services conducted by the University of Michigan and UIUC.

The presentation will detail study methodology, execution, and initial findings and will have applicability for future usability research concerning metadata harvesting services.
DAY TWO: Tuesday, April 24
8:00 a.m. – 9:00 a.m.
Breakfast (Fountain III/IV)
9:00 a.m. – 10:30 a.m.
Session 7: Digitizing Collections (Fountain I)
A) A Digitisation Workflow Management System for Massive Digitisation Projects. Noha Adly and Magdy Nagi (Bibliotheca Alexandrina) DAFv2 is a system developed at Bibliotheca Alexandrina, the Library of Alexandria, as part of its institutional Digital Assets Repository (DAR). DAFv2 aims to manage the whole digitization process, including its various phases (scanning, processing, OCR, quality assurance, archiving, and encoding), system users, file movement, and integration with the ILS and the library's digital repository. It allows different workflows to be defined for various types of objects and supports the dynamic evolution of workflows. It provides history tracking of actions and the flexibility to manage multiple projects with diverse materials simultaneously. Moreover, it supports ingesting a job in the middle of the workflow and allows easy integration of the tools that perform the workflow's functions, so the system can be used for projects involving different institutions. DAFv2 will be used in a collaborative project with Stanford and Yale for massive digitization of Arabic books, in which scanning will be performed in each library and other phases will be shared. For instance, BA will perform part of the processing, OCR, and encoding. The system is implemented on an open source platform and is complemented by a web-based system shared among the partners, in which each partner ingests the metadata of its own records; all partners can view the records and add comments, but only the owners can modify them. The system makes it possible to build a unified catalog of the books to be digitized collaboratively while removing duplicates. The database holding the metadata can be centralized or replicated among the partners. Digital objects are ingested into DAFv2 at the appropriate phase and remain in the workflow until they are published.
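As a rough illustration of the phase-based workflow idea in this abstract, the Python sketch below models a workflow as an ordered list of phases, lets a job enter at any phase, and records a history of completed actions. It is not DAFv2's actual design, which the abstract does not detail; the phase names, the `Job` class, and the `ingest` function are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical phase names, loosely based on the phases the abstract lists.
BOOK_WORKFLOW = [
    "scanning", "processing", "ocr",
    "quality_assurance", "archiving", "publishing",
]


@dataclass
class Job:
    """A digitization job moving through an ordered list of phases."""
    job_id: str
    workflow: list
    phase_index: int = 0
    history: list = field(default_factory=list)

    @property
    def phase(self):
        # None once the job has passed the final phase.
        if self.phase_index < len(self.workflow):
            return self.workflow[self.phase_index]
        return None

    def complete_phase(self, actor):
        """Record who finished the current phase, then advance."""
        if self.phase is None:
            raise ValueError("job has already finished its workflow")
        self.history.append((self.phase, actor))
        self.phase_index += 1


def ingest(job_id, workflow, start_phase=None):
    """Create a job, optionally entering in the middle of the workflow."""
    index = 0 if start_phase is None else workflow.index(start_phase)
    return Job(job_id, list(workflow), index)
```

A book scanned at a partner library could, under these assumptions, be ingested with `ingest("book-001", BOOK_WORKFLOW, start_phase="processing")`, so that only the remaining phases are tracked locally.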
B) Digitizing Fire Insurance Maps: The Library of Congress Experience. Colleen Cahill, Julie Sweetkind-Singer, and Paul Rascoe (Library of Congress) Online viewing of the Sanborn Fire Insurance map collection, which is heavily used by genealogists, historians, and urban planners, has long been a goal of the Library of Congress. Both the digitization and the metadata creation for this large collection have presented many challenges. To meet the need for manpower to tackle this mammoth project, the Library is forging cooperative agreements with the University of Texas at Austin and a consortium of libraries in California. Different approaches are being investigated, from borrowing staff to using funds to hire contractors. The panel will discuss not only those agreements and their benefits, but also the parameters the Library has established for the digitization, the arrangement of the data, and the efforts to create a searchable database of the maps that will facilitate online access.
Session 8: Systems Design and Functionality (Fountain II)
A) Search Framework For A Large Digital Records Archives. Quyen Nguyen and Dyung Le (NARA) In order to continue to fulfill its mission in the information technology age, the National Archives and Records Administration (NARA) has made the decision to develop the Electronic Records Archives (ERA) system. The ultimate goal of the ERA system is to make digital records accessible to the public and authorized parties for centuries. Such access should be independent of the technical platforms with which those records were created. According to the Open Archival Information System (OAIS) model, Access to records is one of the main components of an archival system. The Access component allows the Consumer to query the digital records in the Archival Storage, and returns result sets relevant to those queries. From the system design perspective, we propose to implement a flexible framework that would facilitate the use of current search engines and technologies. Multiple search engines or a federation of search engines could be used at the same time, in a manner that is totally transparent to the Consumer. Moreover, the framework should be extensible so that future search techniques could be easily inserted without major redesign of the system. In this paper, we will explore different search technologies that are either currently available or under ongoing research. The goal of the study is to determine which technologies are suitable for the ERA system. One critical piece of the Access component is the Digital Asset Catalog, based on XML technology. We will present the principal data elements of the Digital Asset Catalog, and how its design would allow the overlay of different taxonomies. Topic Maps will also be investigated for potential use within our Search Framework to provide richer indexing, categorization, and discovery.
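One way to picture the extensible framework this abstract proposes is a registry of interchangeable search engines behind a single query method. The Python sketch below is only an illustration under that assumption and does not reflect the ERA design; the `SearchEngine` interface and the `SearchFramework` class are hypothetical names.

```python
from typing import Callable, Dict, List

# A search engine is modeled as any callable that maps a query string to a
# list of record identifiers. This interface is assumed for illustration.
SearchEngine = Callable[[str], List[str]]


class SearchFramework:
    """Fans a query out to registered engines and merges the results."""

    def __init__(self) -> None:
        self._engines: Dict[str, SearchEngine] = {}

    def register(self, name: str, engine: SearchEngine) -> None:
        # New search techniques plug in here without changing callers.
        self._engines[name] = engine

    def search(self, query: str) -> List[str]:
        # Query every engine in registration order, keeping first-seen
        # order and dropping duplicates, so the caller receives one
        # result list rather than one list per engine.
        seen = set()
        merged: List[str] = []
        for engine in self._engines.values():
            for record_id in engine(query):
                if record_id not in seen:
                    seen.add(record_id)
                    merged.append(record_id)
        return merged
```

Because callers only ever see `search`, an engine can be added or swapped through `register` without any change on the consumer side, which is the kind of transparency the abstract describes.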
B) A Single Search Box Interface to the NCSU Libraries, Two Years Later. Tito Sierra (North Carolina State University) At the 2005 DLF Spring Forum we presented a project, then in development, to create a useful and usable single search box interface to the NCSU Libraries. That fall we launched Quick Search, a highly customized library website search tool with enhancements designed to quickly connect users to a variety of library collections, services, and tools. Rather than replace existing dedicated discovery tools such as the catalog, electronic resource search, and subject portals, Quick Search is designed to complement and increase use of these tools by directing users to them via an intuitive single search box interface.

This presentation will discuss the major enhancements to the Quick Search application since launch. The application was designed as a platform for learning about the search needs and behavior of our academic library user community. The presentation will describe the methods implemented for collecting fine-grained clickthrough statistics that have helped us make design and content decisions. It will summarize a year's worth of usage statistics to illuminate how a major research university uses a single search box interface to the library.
10:30 a.m. – 11:00 a.m.
Break (Fountain Foyer)
11:00 a.m. – 12:30 p.m.
Session 9: (Fountain I)
PANEL: Digitizing Newspapers. Richard Boulderstone (British Library), Tom O'Brien (Apex Publishing), and Perry Willett (University of Michigan) Newspapers constitute significant historical documents. Their format, while very familiar, is complex, with articles of irregular sizes spanning multiple columns and pages, and also containing auxiliary photos and charts. Determining the copyright status of any or all of these parts is complicated. Newspapers are generally poorly printed on low quality, oversized paper, and many historical newspapers exist only on microfilm.

All of these factors make digitizing newspapers difficult, and as a result, digital libraries have only recently begun to digitize newspapers in large quantities. Best practices for digitizing newspapers have not been developed and adopted to the same extent that they have for books, journals and images, and a multitude of approaches to digitizing newspapers exist.

This panel will discuss digitizing newspapers from three perspectives: librarian, vendor, and system designer. These perspectives all affect how end users find and use digital newspaper collections.
  • Richard Boulderstone will discuss European newspaper digitization projects, featuring the BL's ambitious project to digitize its 19th-century newspaper collection. He will discuss how digital newspapers might fit into the European digital library program (i2010) and the particular challenges that this pan-European collaboration will encounter.
  • Tom O'Brien will discuss technical aspects of digitizing newspapers, including format and metadata options, workflow considerations, and costs. Apex has digitized many large newspaper collections for commercial publishers and institutions, and recently established a Global Newspaper Initiative to support coordination of the plethora of newspaper projects underway around the world.
  • Perry Willett will discuss U-M's development of a newspaper module in their digital library system DLXS, and the challenges that digital newspapers present to digital library systems designers.
Session 10: Education and Evaluation (Fountain II)
A) Developing a Digital Libraries Education Program. Jerome McDonough and Stacy Kowalczyk (University of Illinois at Urbana-Champaign) Those of us who manage digital libraries know that hiring the right people can be the most critical factor in a successful digital library program, yet the pool of qualified applicants for every position is extremely small. Even those of us fortunate enough to work in academic libraries with allied schools of library and information science are keenly aware of the difficulties in finding recent graduates with broad knowledge of and familiarity with digital library systems and services. Indiana University and the University of Illinois at Urbana-Champaign received a 'Librarian for the 21st Century Program' grant from the Institute of Museum and Library Services to foster collaboration between both universities' schools of library and information science and digital library programs to develop effective curricula for digital librarianship.

This paper will report on our progress to date: the experiences of our fellows and residents, the new courses developed, the issues encountered and our efforts in outreach, both past and future. We hope that this paper will engage the DLF in the discussion of Digital Library Education, and we will invite the audience to comment on developing digital library education programs from their perspective as practitioners. We anticipate that this presentation will take approximately 25 minutes with an additional 15 minutes reserved for questions and comments from the audience.
B) Assessing Scholarly Research Practices and Building New Models of Support. Wendy Pradt Lougee, Cecily Marcus, and Kate McCready (University of Minnesota) What core research behaviors and activities of scholars are shared across disciplines? Where are there distinct needs within a scholarly community? And what forms of digital tools and services support these needs? Since 2005, the University of Minnesota Libraries have been studying these questions while creating a framework to assess the full range of research practices of faculty and graduate students in the Humanities and Social Sciences. The data collected from interviews with faculty members, graduate student focus groups, and a survey of over 1,100 researchers allowed us to create a model for identifying and responding to the needs of scholars.

Despite differences in their fields, all scholars are reporting the increased need for:
  • interdisciplinary materials and research support
  • new ways to support collaboration, and
  • assistance organizing research and collections.
A number of endeavors are now underway within the University Libraries to address these issues (including a replication of the behavioral assessment in the Sciences). One outgrowth of our assessment projects, and a response to CLIR's 2004 Institute on Scholarly Communication, is EthicShare. A collaboration among five institutions led by the University of Minnesota, EthicShare is an online community site for Bioethics/Practical Ethics scholars that seeks to implement new models for collaboration and for sharing resources, knowledge, and technical standards. Another new initiative delivers discipline-specific, component-based views of library resources via the campus portal. Future development work will address the incorporation of social dimensions of online environments.

These projects, and others, help inform how the University Libraries can support the research process. This paper discusses findings from the University Libraries' assessment of scholarly practices in the Humanities, Social Sciences, and Sciences, and the emerging projects that address online environments for interdisciplinary and collaborative scholarship.
12:30 p.m. – 2:00 p.m.
Break for Lunch [Individual choice]
2:00 p.m. – 3:30 p.m.
Session 11: (Fountain I)
PANEL: Second Life: An Exploration of Libraries and Education. Rachel Gollub (Stanford University) and Jeremy Kemp (Sloodle) Second Life is a popular three-dimensional, interactive virtual world with over three million registered users. Academic institutions and libraries have a growing presence in Second Life, as they explore the possibilities of expanding libraries and education into the virtual realm. Linden Lab's release of the source code to the open source community has removed some of the barriers to widespread adoption of the system, and has increased academic interest. The visual and social aspects of Second Life make it an ideal setting for a number of creative endeavors, and hold the possibility of providing a step in the evolution of the future of education and information. Join representatives from the Alliance Library System, the New Media Consortium, Sloodle, and Linden Lab as we discuss current and future projects in the academic space, and open the floor for general discussion of Second Life.
Session 12: User Interfaces and Search (Fountain II)
A) Chronology Matters: Problems and Approaches to Exploring Digital Collections by Date. Joseph Dalton (New York Public Library) A new interface has recently been developed at the New York Public Library (NYPL) which will allow users to explore NYPL Digital Gallery collections by date. While the development of date-based queries would seem to present a basic technical hurdle — analogous to the SQL 'order by' statement — there are some hidden and not-so-hidden obstacles to providing a user-friendly, data-driven interface for exploring digital items by historical era, years, or other dates across a disparate set of digital collections. While some of these challenges may result from the way input date text varies over successive generations of catalogs and catalogers, other challenges we encountered included defining date spans at the appropriate object level, assigning respective date ranges, parsing dates into linked queries for any digital item, and addressing digital objects which may contain no date. Other problems arise when deciding which date should be queried: from a user's perspective, would the significant date on which to sort be the year a statue was erected, the life-dates of the statue's subject, the statue's creation date, the sculptor's life-dates, the depicted date of the statue's object, or the date on which the photograph of the statue was taken? This presentation will also discuss some of the approaches which were considered over the course of this project.
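To make the date-normalization problem in this abstract concrete, the Python sketch below maps a few free-text date styles to inclusive year ranges that can then be sorted or queried. It is not NYPL's implementation; the function name `parse_date_range` and the input patterns it handles ("1890-1895", "1920s", "ca. 1895") are assumptions chosen for illustration, and real catalog data would need many more cases.

```python
import re
from typing import Optional, Tuple


def parse_date_range(text: str) -> Optional[Tuple[int, int]]:
    """Map a free-text catalog date to an inclusive (start_year, end_year).

    Returns None when no year is recognized, so that undated items can be
    handled explicitly instead of being silently mis-sorted.
    """
    cleaned = text.strip().lower()

    # Explicit span such as "1890-1895".
    match = re.search(r"\b(\d{4})\s*-\s*(\d{4})\b", cleaned)
    if match:
        return int(match.group(1)), int(match.group(2))

    # Decade such as "1920s", expanded to 1920 through 1929.
    match = re.search(r"\b(\d{3})0s\b", cleaned)
    if match:
        start = int(match.group(1)) * 10
        return start, start + 9

    # Single year, possibly qualified, such as "ca. 1895".
    match = re.search(r"\b(\d{4})\b", cleaned)
    if match:
        year = int(match.group(1))
        return year, year

    return None
```

Storing a start and end year for every item lets a single range query cover exact years, decades, and spans alike, while items that return `None` can be routed to a separate "undated" view.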
B) Subject Metadata Enrichment using Statistical Topic Models. David Newman (University of California, Irvine) and Kat Hagedorn (University of Michigan) Creating a collection of metadata records from disparate and diverse sources often results in uneven, unreliable and variable quality subject metadata. Having uniform, consistent and enriched subject metadata allows users to more easily discover material, browse the collection, and limit keyword search results by subject. We demonstrate how statistical topic models are useful for subject metadata enrichment. We describe some of the challenges of metadata enrichment on a huge scale (10 million metadata records from 700 repositories in the OAIster Digital Library) when the metadata is highly heterogeneous (metadata about images and text, both cultural heritage and scientific literature). We show how to improve the quality of the enriched metadata, using both manual and statistical modeling techniques. Finally, we discuss some of the challenges of the production environment, and demonstrate the value of the enriched metadata in a prototype portal.
3:30 p.m. – 3:45 p.m.
Break (Fountain Foyer)
Open Discussions Session A 3:45 p.m. – 4:45 p.m.
1) The Once and Future Catalog: Finding and Getting in Next Generation Library Information Systems. (Fountain I) Karen Calhoun and Martin Kurth (Cornell University) At the ALA Midwinter meeting in Seattle, the Voyager Ivies Plus user group (Columbia, Cornell, Yale, Princeton, the Library of Congress, UCLA, and the University of Pennsylvania) convened to discuss their integrated library system (ILS) and the future of the online catalog. More and more initiatives (e.g., NC State's Endeca catalog, AquaBrowser, the State of Georgia's Evergreen) support an improved library user experience (e.g., a Google-like interface, faceted searching).

Innovative Interfaces and Ex Libris have launched development projects to introduce new catalog interfaces (Encore and Primo) that may be able to interoperate with other ILS back ends. Other libraries are exploring with OCLC the possibility of exposing WorldCat (presumably, with scoping) as their catalog interface, interacting with an ILS on the back end.

The result of all this progress is that many individuals and organizations are now grappling with the issue of enabling an ILS to interoperate with a different front end supporting discovery. Our group proposes a BOF session at the DLF Spring Forum to consider the benefits and feasibility of creating a working group to develop best practices and/or functional requirements for connecting the necessary systems and services to complete the discovery-to-delivery value chain, allowing users to more easily find and get (or connect to) library holdings in all formats. Would a new working group (structured along the lines of the DLF-sponsored Electronic Resource Management Initiative) yield community-wide progress on the problem of connecting disparate systems for discovery and delivery of library resources?
2) Survey of Preservation Systems. (Fountain II) Karim B. Boughida and Sally Hubbard (The Getty Research Institute) The authors are chairing an internal digital preservation task group that will propose and manage an anticipated trusted digital repository. They will first discuss an internal report on digital preservation and trusted repositories, and then share the results of a survey sent to DLF members and other targeted organizations on which digital preservation systems and practices are currently in use.
4:45 p.m. – 5:00 p.m.
Break (Fountain Foyer)
Open Discussions Session B 5:00 p.m. – 6:00 p.m.
1) Emergent Agendas. (Fountain I) TBA Ever been to FOO (Friends of O'Reilly) Camp? Delivered a lightning talk? Come and participate in the Forum's new Emergent Agenda Lab and help give shape to the experiment.
2) Aquifer Metadata Working Group Open Discussion. (Fountain II) Jenn Riley (Indiana University) The Aquifer Metadata Working Group (MWG) has begun a number of new activities since the recent release of the Digital Library Federation / Aquifer Implementation Guidelines for Sharable Metadata. These activities include informal and formal evaluation of conformance of MODS records harvested by Aquifer to the guidelines, definition of 'levels of adoption' of the guidelines that Aquifer participants can use, and investigation of additional tools needed to assist institutions preparing their metadata for participation in the Aquifer initiative. This session will provide an opportunity for the working group to share its goals and progress with interested DLF attendees, and to gain additional information from participants on the effectiveness of the MODS guidelines and the tools needed by partner institutions to implement the guidelines. In addition, open discussion will be encouraged regarding the role of formal conformance to metadata best practices in the Aquifer project, and in the greater digital library landscape.
3) Sharing Copyright Information: Opportunities for Collaboration. (Fountain III) John Mark Ockerbloom (University of Pennsylvania) Clearing copyright is essential for digital usage of most content from the past century, but tracking down copyright holders, or determining whether particular content is copyrighted at all, is now far too difficult and expensive. Moreover, as libraries expand their reach into larger-scale digital collections and tools, copyright research and clearance is increasingly redundant as well. This session is an opportunity for interested parties to discuss possibilities of building communities and systems for sharing and searching information about copyrights and their holders, to enable easier clearance, digitization, and reuse of content. Useful information can include copyright registrations and renewals, contact information on copyright holders, and findings from copyright research, including those relevant to "orphan works" or section 108 determinations. Technical, organizational, and legislative possibilities are all fair game for discussion.
DAY THREE: Wednesday, April 25
8:00 a.m. – 9:00 a.m.
Breakfast (Fountain III/IV)
9:00 a.m. – 10:30 a.m.
Session 13: Collaboration (Fountain I)
A) The Planets Approach to Digital Preservation. Adam Farquhar (British Library) The Planets project (www.planets-project.eu) brings together European National Libraries and Archives, leading research institutions, and technology companies to address the challenge of preserving long-term access to digital cultural and scientific knowledge. The four-year project is co-funded by the European Commission under the Information Society Technologies priority of Framework Programme 6, Call 5 (FP6 Call 5).

To achieve these objectives, Planets will:
  • Develop Preservation Planning services that will empower organisations to define, evaluate, and execute preservation plans. The plans will reflect the organisation's preservation policies, as well as the content in its collections, and the way the content is used.
  • Develop methodologies, tools and services for the Characterisation of digital objects in order to identify the best preservation plans.
  • Evaluate existing tools and services to support Preservation Actions and the development of innovative solutions based on the integration of existing tools and on the design and implementation of new tools where an unfulfilled requirement can be demonstrated.
  • Establish a Preservation Test Bed to provide a consistent and coherent evidence-base for the objective evaluation of different preservation protocols, tools and services and for the validation of the effectiveness of preservation plans.
  • Implement an Interoperability Framework, within which the deliverables from each of Planets' sub-projects can be seamlessly integrated with each other in a distributed service network.
In this session we will discuss some of the key business and technical challenges that Planets is addressing, as well as details of our approach.
B) Discovering Connections: Subject Maps for Browsing Aggregated Collections. John Mark Ockerbloom (University of Pennsylvania) As digital collections become more widely shared and aggregated, it becomes increasingly challenging for researchers to browse across these collections to find conceptually related items. The organizational schemes of each collection are not likely to be generalizable. However, many collections draw on common information infrastructures, such as the Library of Congress Subject Headings, to describe their items.

Making such infrastructure the basis of subject browsing, however, is difficult, because of the complexity of the infrastructure, and the resultant tendency for different institutions to choose different terms or levels of detail in their cataloging. (Some may choose different descriptive ontologies altogether.)

In this talk, I describe the use of subject maps (interactive clusters of related subject terms displayed alongside lists of the items they describe) to enable multi-dimensional, yet focused, subject-based exploration across multiple collections. The clustering and navigation options, and quick cross-referencing, are designed to make it easy to find items that might be assigned similar but not identical subject headings in the same visual groupings, or just a click or two away. The maps start with relationships explicitly defined in LCSH, but also draw on collection usage, facet analysis, lexical and geographic analysis, and adjustments for local needs and usage. I will discuss the techniques we have used to make these connections in different collections, including ones not originally cataloged under LCSH, and will compare them with other approaches used in recent digital library systems.

More information about subject maps can be found at http://labs.library.upenn.edu/subjectmaps/
Session 14: Enabling Digital Scholarship (Fountain II)
A) Ode to the TEI Independent Header: Encoding Serials with the TEI. Melanie Schlosser and Michelle Dalmau (Indiana University) The Indiana University Digital Library Program received a Library Services and Technology Act (LSTA) grant to digitize and encode a 101-year run of a scholarly journal known as The Indiana Magazine of History (IMH). The journal features historical articles, critical essays, research notes, annotated primary documents, reviews, and notices. We decided to encode at the issue level in order to maintain the conceptual integrity of the print journal. In order to represent the richness of the content, however, we needed a way to capture 'article-level' metadata.

The Text Encoding Initiative (TEI) guidelines and community of practice offered a number of potential methods for representing article-level bibliographic metadata, including TEI Corpus, Metadata Object Description Schema (MODS), and article-level TEI documents that link to the parent issue. After exploring these and other options, the Independent Header eventually emerged as the best way to encode a complex serial. The auxiliary schema for the Independent Header (IHS) was developed to allow the exchange of bibliographic metadata for text collections to support the creation of indices and other aggregations. The creators of the schema did not envision the use of the Independent Header in serials encoding, but this method has a number of advantages.

Using the IHS in this fashion allows the TEI to function as the authoritative metadata source for the document, and allows the encoder to faithfully represent the issue-based structure of the original without compromising the unique identity of each article. It also supports our larger goal of interoperability with other text collections. Since the IHS is part of the TEI standard (P4 and earlier), the encoder does not have to extend or modify the DTD. This not only simplifies documentation needs for management and preservation, but also allows for easier reuse of content and integration with other collections. So why is the Independent Header no longer supported by the soon-to-be-released P5 version of the TEI? Our talk will provide an overview of the challenges we faced in determining the best encoding approach for The Indiana Magazine of History, with particular emphasis on our use of the Independent Header.

We will explore the advantages and disadvantages of the IHS and consider alternatives to it in light of P5 and our current infrastructure. We will present survey findings and testimonials of how others have used or are using the IHS, and how current usage informs our own practices and version P5 of the TEI standard. Lastly, we will explore the 'anxiety' of customization, an activity well supported and highly encouraged by the TEI; how the digital library and digital humanities communities generally differ in their views toward customization; and how customization impacts integration and interoperability of digital content.
B) Sakaibrary: Bridging Course Management and Digital Libraries. Jon W. Dunn (Indiana University) and Gaurav Bhatnagar (University of Michigan) Course management systems are becoming central to teaching and learning activity on university campuses; they frequently serve as the primary mechanism for faculty to provide students with scholarly information and resources, and in turn, for students to access such materials. At the same time, libraries are spending an ever-increasing portion of their budgets on online resources. However, in today's online world, it is very difficult for faculty and students to access these digital library resources from within the course management environment due to numerous technical and organizational challenges.

The Sakaibrary project at Indiana University and the University of Michigan, funded in part by a grant from The Andrew W. Mellon Foundation, seeks to address some of these challenges within the framework of the Sakai community source collaboration and learning system. This talk will discuss the goals, accomplishments, and technical architecture of the Sakaibrary project, provide a demonstration of the tools that we have developed to date, and discuss some of the challenges, both technical and organizational, that we have encountered in working across traditional boundaries of library services.
10:30 a.m. – 11:00 a.m.
Break (Fountain Foyer)
11:00 a.m. – 12:30 p.m.
Session 15: Preservation (Fountain I)
A) The Ultimate Registry Service Component. Andreas Stanescu (OCLC Online Computer Library Center, Inc.) Collections of metadata and collections of digital content are receiving enormous amounts of attention because increasingly more data is needed for accessing information, finding the correct digital objects out of a multitude of possibilities, gathering search results based on users' context, or simply exposing valuable information long held in backend office applications or buried in inaccessible databases. Registries are becoming a more frequent solution or part of a solution in these scenarios.

The Global Digital Format Registry project is developing a highly customizable registry implementation that is registry data-agnostic and can therefore become the ultimate registry implementation.

This presentation will demonstrate a registry capable of completely managing entire collections of metadata and content records, using standard access protocols, a standard query language and a well-defined URL naming scheme. This component will dynamically adapt its indexing, searching, loading and harvesting capabilities based solely on the XML Schema (XSD) which describes the metadata records in the collection.

Out of the box, this registry software will immediately create a registry web service capable of servicing SRU/SRW queries, OAI-PMH harvests, access to metadata and content records, access to record history, Atom/RSS subscription and notification services, and a highly dynamic AJAX user interface. The user interface can dynamically reconfigure itself based on the record schema with no developer intervention. Many tasks associated with creating, maintaining, and enhancing a registry can likewise be done with no developer intervention, including defining new record collections and adding indexes when the schema changes. Other tasks may require some developer intervention, such as adapting an existing database or an existing service to fit within the registry.
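As a rough illustration of what servicing SRU queries and OAI-PMH harvests means on the client side, the following Python sketch builds the two kinds of request URL. The request parameters (operation, version, query, maximumRecords for SRU; verb, metadataPrefix, from for OAI-PMH) come from the respective protocol specifications. The host name, the /sru and /oai paths, and the dc.title index are assumptions made for the example and are not taken from the registry described in this abstract.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical registry endpoint; the abstract does not give a real URL.
BASE_URL = "https://registry.example.org/formats"


def sru_search_url(cql_query, max_records=10):
    """Build an SRU 1.1 searchRetrieve request URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    # The "/sru" path is an assumption for illustration only.
    return f"{BASE_URL}/sru?{urlencode(params)}"


def oai_list_records_url(metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL, optionally date-limited."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # e.g. "2007-01-01" for selective harvesting
    # The "/oai" path is an assumption for illustration only.
    return f"{BASE_URL}/oai?{urlencode(params)}"


def query_params(url):
    """Return the decoded query parameters of a URL as a dict of lists."""
    return parse_qs(urlsplit(url).query)


if __name__ == "__main__":
    print(sru_search_url('dc.title = "pdf"'))
    print(oai_list_records_url(from_date="2007-01-01"))
```

A harvester would issue the OAI-PMH request repeatedly, following resumption tokens, while an interactive client would send SRU requests one query at a time; both details are left out of this sketch.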
B) Microsoft Office OpenXML. Adam Farquhar (British Library) and Jean Paoli The Office OpenXML file format standard was approved as an ECMA-International standard on Dec 7, 2006. The work to standardize OpenXML was carried out by ECMA-International's Technical Committee 45 (TC45), which included representatives from Apple, Barclays Capital, BP, The British Library, Essilor, Intel, Microsoft, NextPage, Novell, Statoil, Toshiba, and the United States Library of Congress. OpenXML was designed from the start to be capable of faithfully representing the pre-existing corpus of word-processing documents, presentations, and spreadsheets that are encoded in binary formats defined by Microsoft Corporation. The standardization process consisted of mirroring in XML the capabilities required to represent the existing corpus, extending them, providing detailed documentation, and enabling interoperability. In 2006, more than 400 million users generated documents in the binary formats, with estimates exceeding 40 billion documents and billions more being created each year.

During this session, we will discuss how Office OpenXML was designed to preserve the financial and intellectual investment in legacy documents (both existing and new) and meet the needs of long-term preservation. We will outline the process that led to its creation, and highlight some of the significant features of its design. We will discuss how OpenXML fits into the space of existing document format standards. Lastly, we will address some of the myths that have arisen since the standard was established.
Session 16: Interoperability (Fountain II)
A) DLF Aquifer in Context. Katherine Kott (DLF Aquifer), Jennifer Vinopal (New York University), and Jenn Riley (Indiana University) This session will focus on placing the Digital Library Federation's Aquifer initiative within a broader community context. We will explore the way in which Aquifer can be viewed as an instance of the OAI Object Re-Use and Exchange (ORE) Initiative and describe how the Aquifer Services Working Group is using the Services Framework to model Aquifer services. The presenters will offer a brief project status report, including how collections should be prepared for inclusion in Aquifer and the level of support DLF member libraries can expect when they contribute collections as a follow-on to the OAI training DLF has developed. The Aquifer Metadata and Technology/Architecture working groups will each host open discussion sessions with more detailed technical information about linking levels of metadata to services, metadata validation and enhancement exploration, and Aquifer technical interoperability models (assetActions) compared to other interoperability schemas such as unAPI.
B) OAI-ORE. Michael Nelson (Old Dominion University), Herbert Van de Sompel (Los Alamos National Laboratory), and Carl Lagoze (Cornell University) YouTube, Flickr, del.icio.us, blogs, message boards and other "Web 2.0" related technologies are indicative of the contemporary web experience. There is a growing interest in appropriating these tools and modalities to support the scholarly communication process. This begins with leveraging the intrinsic value of scholarly digital objects beyond the borders of the hosting repository. There are numerous examples of the need to re-use objects across repositories in scholarly communication. These include citation, preservation, virtual collections of distributed objects, and the progression of units of scholarly communication through the registration-certification-awareness-archiving chain.

The last several years have brought about numerous open source repository systems and their associated communities. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) has been the initial catalyst for repository interoperability. However, there is now a rising interest in repositories no longer being static components in a scholarly communication system that merely archive digital objects deposited by scholars. Rather, they can become building blocks of a global scholarly communication federation in which each individual digital object can be the ore that fuels a variety of applications. Both the interest in this type of federation and the insights gained thus far are sufficiently strong to move beyond prototypes and to support an effort to formally specify this next level of interoperability across repositories. Through the support of the Mellon Foundation, a two-year international initiative to define this interoperability fabric started in October 2006. The effort is in the context of the Open Archives Initiative, and is named Object Re-Use and Exchange (ORE). OAI-ORE is intended to be a complement to OAI-PMH.

OAI-ORE is coordinated by Carl Lagoze and Herbert Van de Sompel, and consists of international experts on Advisory, Technical and Liaison Committees. The Technical Committee held its first meeting in January 2007 and began its initial work to develop, identify, and profile extensible standards and protocols to allow repositories, agents, and services to interoperate in the context of use and reuse of compound digital objects beyond the boundaries of the holding repositories. In this presentation, we will give an overview of the current activities, including: defining the problem of compound documents within the web architecture, enumerating and exploring several use cases, and identifying likely adopters of OAI-ORE.

More information about ORE can be found at: http://www.openarchives.org/.
12:30 p.m.
Adjourn
POST-CONFERENCE 1:00 p.m. – 4:00 p.m.
Second Life: An Exploration of Libraries and Education (Fountain III)

Second Life is a popular three-dimensional, interactive virtual world with over three million registered users. Academic institutions and libraries have a growing presence in Second Life, as they explore the possibilities of expanding libraries and education into the virtual realm. Linden Lab's release of the source code to the open source community has removed some of the barriers to widespread adoption of the system, and has increased academic interest. The visual and social aspects of Second Life make it an ideal setting for a number of creative endeavors, and hold the possibility of providing a step in the evolution of the future of education and information. Join representatives from the Alliance Library System, the New Media Consortium, Sloodle, and Linden Lab as we discuss current and future projects in the academic space, and open the floor for general discussion of Second Life.
