DLF Spring Forum 2009 Program

DAY TWO: Tuesday, May 5

8:00 a.m. – 9:00 a.m.

Breakfast

9:00 a.m. – 10:30 a.m.

Session 2

A) DLF Aquifer: The Final Story Tom Habing and Susan Harum, both University of Illinois at Urbana-Champaign; Kat Hagedorn, University of Michigan; and Katherine Kott, Stanford University [PRESENTATION] This session will report on assessment of American Social History Online, including use by students, faculty and graduate students, observations about the collections included, the features and functions of the web site, and the integration with Zotero. The presenters will also demonstrate the MODS explorer and discuss results of testing for compliance of incoming metadata records with the Guidelines for MODS Implementation developed by the Aquifer metadata working group. As DLF Aquifer comes to an end, the presenters will share ideas for best practices in collaboration learned during their work together.

B) MediaCommons 2.0 Mark Reilly [PRESENTATION] and Brian Hoffman [PRESENTATION], both New York University

In the 2007 / 2008 academic year, NYU Libraries entered a partnership with the Institute for the Future of the Book (http://www.futureofthebook.org) and became the technical stewards of, among other things, the scholarly communication network MediaCommons (http://mediacommons.futureofthebook.org). In the 2008 / 2009 academic year, NYU Libraries designed and released MediaCommons "2.0" using the open source Drupal content management framework. This presentation will be both a case study of the scholarly communication network as a form of electronic publishing and data curation, and a demonstration of the robust editorial and curatorial features available in Drupal. We will discuss the design process, the choice of Drupal as a content management framework, and the ways in which serving as host of MediaCommons may affect the Library's DNA both as a repository of materials and as a service point for scholars. Finally, we will preview the features of the new user profile system currently in development as part of an NEH Digital Humanities start-up grant.

Session 3

A) Making Your Digitized Books Discoverable Zoe Chao, Myung-Ja Han, and Tim Cole, all University of Illinois at Urbana-Champaign [PRESENTATION]

In the Fall of 2006, the University of Illinois Library at Urbana-Champaign began a partnership with the Open Content Alliance (OCA) as part of a broader large-scale digitization initiative. In concert with this new partnership, a primarily automated workflow was established to maximize online access and exposure for Illinois books digitized by OCA. This work paralleled efforts to enhance visibility of other locally digitized collections. Multiple approaches were developed to increase the visibility of digital surrogates: disseminating metadata about digitized content via a range of systems such as our local OPAC, OCLC WorldCat, our local institutional repository, the locally-developed Illinois Harvest Portal, and an OAI-PMH Data Provider; creating splash pages for digitized volumes at the item level and serial level; and providing unique, persistent handle URIs for each book digitized from Illinois collections. The workflow was also adapted to link print resources indexed in the Library OPAC with the publicly-accessible digital copies of books digitized by OCA or deposited in HathiTrust. Allowances were made in the workflow for specialized processing of some resources to meet users' needs in certain scholarly contexts, for example implementing METS Navigator (a METS-based page turner application developed by Indiana University) for our digitized brittle books and for our Project Unica collection (a rare books digitization project). This presentation will discuss workflow strategies and methods used to provide access to digitized content, and will describe motivating use case scenarios.

B) Negative the New Positive Melitte Buchman, New York University [PRESENTATION] Libraries and archives have important primary source material that exists only as black and white and color negatives. Many archives have resisted digitizing and creating access to these materials. Some have digitized the materials with disappointing results. Unlike most archival material, negatives are not the "thing of record" but have a relationship much like the score (the negative) to the performance (the print or positive digital file). At NYU Libraries we have had the opportunity with several of our negative collections, including Tamiment Library and Robert F. Wagner Labor Archive's Abraham Lincoln Brigade, to explore this paradigm for black and white negatives and to expand this discovery process into color negative holdings. Sensitivity to the different nature of negatives and new tools available such as DNG and 16-bit workflow tools have changed our ability to convert these hidden assets into preservation masters and access copies. As film stock ages, fades and becomes embrittled, it is important to start processing these collections and opening the contents, in a respectful and visually acceptable way, to the light of day. Our surprising finding is that this can be done in a systematic and efficient way.

C) Linking Resources in the Humanities: Using OpenURL to Cite Canonical Works David Ruddy, Cornell University [PRESENTATION] A joint effort by Cornell University Library and Cornell faculty in Classics is exploring the extension of OpenURL to provide system-independent linking between citations of Classical literature and an increasing array of available online resources and services in Classics. Textual references in Classics are commonly to FRBR work-level entities (e.g., Ovid, Amores), independent of any particular edition or translation. Online abstracting and indexing services, as well as Classical scholarship available online, contain many such citations. At the same time, there are an increasing number of Classical texts and resources available online. This project has explored the advantages and challenges of building links among such resources using OpenURL. Such linking, which could be extended to other domains, will allow more seamless movement from scholarly resources to original texts and translations, improving digital services in the humanities. This project has been supported by a recent planning grant from The Andrew W. Mellon Foundation to the American Philological Society. To date, project work has focused on the creation of a canonical citation OpenURL metadata format, a strategy to address implementation challenges, and a prototype of a Classical literature knowledge base and linking system. By design, the metadata format, implementation scheme, and knowledge base structure are independent of Classics and can be deployed in other domains that frequently cite texts independent of editions or translations. The project also demonstrates how knowledge bases may be chained together to provide enhanced services to users, a model which may have wider application within the OpenURL community.
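As a rough illustration of the work-level linking idea described in this abstract, the following Python sketch builds and parses a key/encoded-value OpenURL for a citation such as Ovid, Amores. It is not the project's metadata format: the resolver address, the format identifier, and the `rft.author`, `rft.work`, and `rft.passage` keys are all hypothetical placeholders. Only the `url_ver` value comes from the OpenURL 1.0 standard.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical resolver address and format identifier, for illustration only.
RESOLVER_BASE = "https://resolver.example.org/openurl"
CANONICAL_FORMAT = "info:ofi/fmt:kev:mtx:canonical_cit"  # placeholder, not a registered format


def build_canonical_openurl(author, work, passage=None):
    """Return a key/encoded-value OpenURL for a work-level citation."""
    params = [
        ("url_ver", "Z39.88-2004"),         # OpenURL 1.0 version tag
        ("rft_val_fmt", CANONICAL_FORMAT),  # hypothetical metadata format
        ("rft.author", author),             # hypothetical key
        ("rft.work", work),                 # hypothetical key
    ]
    if passage is not None:
        params.append(("rft.passage", passage))  # hypothetical key
    return RESOLVER_BASE + "?" + urlencode(params)


def parse_canonical_openurl(url):
    """Recover the citation fields, as a resolver might before a lookup."""
    query = parse_qs(urlsplit(url).query)
    return {
        "author": query["rft.author"][0],
        "work": query["rft.work"][0],
        "passage": query.get("rft.passage", [None])[0],
    }


# Cite the work itself, with no edition or translation implied.
amores_link = build_canonical_openurl("Ovid", "Amores", passage="1.1.1")
```

Because the link names the work rather than a particular edition, a resolver is free to decide which text or translation to offer the reader.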

10:30 a.m. – 11:00 a.m.

Break

11:00 a.m. – 12:30 p.m.

Session 4

A) EthicShare: Building an Inter-Institutional Scholarly Research Community Kate McCready and Chad Fennell, both University of Minnesota [PRESENTATION]

The University of Minnesota is developing EthicShare, an open virtual research community for scholars in the field of bioethics, funded by the Mellon Foundation. The site's development process has resulted in an environment, accessible by scholars regardless of institution, which combines a repository of sources (traditional scholarly materials, relevant popular press resources and other related content) with social, collaborative tools and features.

Thus far, scholars have adopted social networking services only selectively, yet these environments offer opportunities to enable collaboration and, potentially, advance new forms of scholarship. But, while possibilities exist, there are many barriers involved in the creation of inter-institutional research sites. Copyright and licensing issues surround the citations and full text of the aggregated scholarly resources. The current behaviors and motivations of bioethics scholars, especially those surrounding the tenure and promotion process, inhibit the adoption of social spaces. Additionally, their unfamiliarity with social networking technologies requires education and policy creation.

We will discuss the extensive assessment of bioethics scholars that determined both the appropriate content and the collaborative features we incorporated into the site. Our strategy, an iterative process of assessment and development, will be explored. Finally, we will outline the development of the Drupal-based tools and features within the EthicShare site that ensure scholars from all institutions can participate. This will include the Link Resolver module built to blend network-provisioned data from the WorldCat Registry service with locally stored user configuration information.

B) Whose Stuff Is It, Anyway? A Study of Copyright Statements on DLF-Member Digital Library Collections Melanie Schlosser, Ohio State University [PRESENTATION]

Copyright is a tricky subject for digital libraries. As libraries, we want to promote the rights of users and educate the public. As digitizers, we rely on fair use and the public domain to create and share our collections. As owners of digital content, we want to protect our investment, receive credit for our work, and prevent commercial (ab)use of our materials. Attaching copyright statements to our digital collections is one of the easiest ways for us to achieve these goals. Unfortunately, digital library literature is silent on the subject, and consensus has yet to emerge on what these statements should say, where they should be placed, or even what they should be called. As a result, practice varies widely within and between institutions, and statements are often misleading, inaccurate, or nonexistent.

This presentation will share the results of a study of copyright statements attached to digital collections created by DLF member institutions. The questions posed by the study include: Do such statements exist? What kind of content do they include? Do they provide an accurate representation of the copyright status of the items? The study also addresses how well we are fulfilling our obligation to educate our users on their rights under copyright law, including fair use and unrestricted use of public domain materials. The presentation will conclude with recommendations on how to address the problems highlighted by this study and develop best practices for copyright statements attached to digital collections.

Session 5

A) Repository Interoperability and Preservation: The Hub and Spoke Framework Thomas Habing, Myung-Ja Han, Patricia Hswe, William Ingram, and Robert Manaster, all University of Illinois at Urbana-Champaign [PRESENTATION] The Hub and Spoke (HandS) Project, one of four UIUC-based technical architecture projects funded by the Library of Congress' NDIIPP program, proposes a paper on our framework for repository interoperability and preservation. The paper will have four parts. The first part will introduce the framework. The second part will provide an overview of our METS profiles and examples of HandS preservation packages. The third will be a demonstration of our workflow manager client application, supporting submission and retrieval of digital packages between the repositories in our system, and a discussion of the technical protocols involved in the HandS workflow cycle, such as our "Lightweight Repository Create, Retrieve, Update, and Delete Service" (LRCRUD) and the Simple Web-service Offering Repository Deposit (SWORD). The final part will address the metadata interoperability layer; in particular, we will present and analyze the challenges we faced in crosswalking MODS metadata to the Scholarly Works Application Profile (SWAP), the FRBR-based metadata format used by SWORD.

B) Student Research on the University and in the Institutional Repository Sarah Shreeves, University of Illinois at Urbana-Champaign [PRESENTATION] EUI, the Ethnography of the University Initiative (http://www.eui.uiuc.edu/), is an innovative program at the University of Illinois that offers students the opportunity to conduct original ethnographic and archival research and archive it for future students to build upon. EUI supports faculty in their efforts to bring the research discovery process into the classroom and works with IDEALS, the University's institutional repository, which maintains a permanent online archive of student research. In this session I'll describe EUI and discuss the roles of participants, give examples of student learning and research enabled by EUI and IDEALS, and reflect on what we've learned from the initiative.

C) TIPR: Toward Interoperable Preservation Repositories Joseph Pawletko, New York University; Priscilla Caplan, Florida Center for Library Automation; and Bill Kehoe, Cornell University [PRESENTATION] The task of preserving our digital heritage for future generations far exceeds the capacity of any government or institution. Responsibility must be distributed across a number of stewardship organizations running heterogeneous and geographically dispersed digital preservation repositories. For reasons of redundancy, succession planning and software migration, these repositories must be able to exchange copies of archived information packages with each other. Practical repository-to-repository transfer will require a common, standards-based transfer format capable of transporting rich preservation metadata as well as digital objects, and repository systems must be capable of exporting and importing information packages utilizing this format.

Birds-of-a-Feather Sessions 2:30 p.m. – 3:30 p.m.

A. TEI Text Encoding in Libraries Michelle Dalmau, Indiana University; Melanie Schlosser, Ohio State University; Kevin S Hawkins, University of Michigan

The Text Encoding Initiative Guidelines for Electronic Text Encoding and Interchange (TEI), first published in 1994, quickly became _the_ standard for encoding literary texts. The TEI was widely adopted by libraries for its promise of discoverability, interoperability, and preservation of electronic texts, but the TEI's monolithic nature inspired the codification of library-specific practice. Since 1999, libraries have relied on the TEI Text Encoding in Libraries Guidelines for Best Encoding Practices (https://www.diglib.org/standards/tei.htm) to steer their work with encoded texts. In April 2008, the TEI in Libraries special interest group (SIG) and the DLF-sponsored TEI Task Force partnered to update the Guidelines. The revision was prompted by the release of P5, the newest version of the TEI, and the desire to create a true library-centric customization not constrained by the TEI Lite schema.

The revised Guidelines contain updated versions of the widely adopted encoding 'levels' - from fully automated conversion to content analysis and scholarly encoding. They also contain a substantially revised section on the TEI Header, designed to support interoperability between text collections and the use of complementary metadata schemas such as MARC and MODS. The new Guidelines also reflect an organizational shift. Originally authored by the DLF-sponsored TEI Task Force, the current revision work is a partnership between members of the Task Force and the TEI Libraries SIG. As a result of this partnership, responsibility for the Guidelines will migrate to the SIG, allowing closer work with the TEI Consortium as a whole, and a stronger basis for advocating for the needs of libraries in future TEI releases.

If you work with encoded text or simply want to learn more, please join us for the TEI Text Encoding in Libraries birds of a feather session. We will provide an overview of the Guidelines and the principles that governed our revisions. We will also seek feedback on the work we have done so far and solicit input for future planned revisions.

B. Exposing library resources and services in course management systems Jon Dunn, Indiana University

As online teaching and learning activities on our campuses move increasingly into course management systems (CMS) such as Blackboard, Sakai, Angel, Desire2Learn, and Moodle, many libraries and librarians are looking at how best to expose their collections, services, and expertise in these environments, and there have been several presentations at past DLF Forums on this topic. These efforts often involve work across instructional technology, reference, and library systems groups, and can pose challenges related to both technical issues and organizational culture.

As one example of such an effort, the Sakaibrary project, a joint effort between Indiana University and the University of Michigan with past support from the Andrew W. Mellon Foundation, has implemented functionality in the open source Sakai course management system to allow faculty and students to create and share reading and reference lists and to search for licensed and open full text resources using library metasearch tools and Google Scholar. The project has also created a prototype tool to allow librarians and faculty to create 'research guides' within Sakai to provide focused access to resources and services relevant to a particular course, discipline, or research task.

The purpose of this BOF is to get together librarians and technologists who are working on or interested in CMS-library integration issues to learn from each other through informal discussion and sharing of use cases, experiences, plans, and ideas.

4:00 p.m. – 5:00 p.m.

Lightning Talks

5:30 p.m. – 7:30 p.m.

North Carolina State University Reception

Vice Provost and Director of Libraries Susan K. Nutter invites you to a cocktail reception May 5, 2009, 5:30-7:30 p.m. in the D. H. Hill Library Special Collections Reading Room on the campus of North Carolina State University.

RSVP by noon on Friday, May 1st, to Terry Hill at 919-515-7188 or terry_hill "at" ncsu "dot" edu

Transportation will be provided to and from the Raleigh Marriott City Center. The first bus will depart from the hotel at 5:15 and the second bus will depart at 5:45.

This event is sponsored by The Friends of the Library of North Carolina State University.

DAY THREE: Wednesday, May 6

8:00 a.m. – 9:00 a.m.

Breakfast

9:00 a.m. – 10:30 a.m.

Session 6

A) Data Storage Solutions for Digital Preservation: Balancing Costs, Complexity, and Fault-Tolerance Jacob Farmer, Cambridge Computer [PRESENTATION] This session offers a simple framework for evaluating data storage solutions for long-term digital storage. We start with the premise that there is no one-size-fits-all data storage solution, and that each institution's needs are potentially different from those of its peers. Next we observe that the data storage industry is very confusing. There are hundreds of vendors each offering unique twists on disk, tape, and optical media. They all have their reference accounts. They all have polished sales presentations that make their products sound like the best. They all offer special terms to win you as a client. How do you identify what really matters for your specific needs and avoid some common pitfalls?

B) Enabling Collection Interoperability and Preservation Using iRODS and the OAI-PMH Jewel Ward, University of North Carolina, Chapel Hill [PRESENTATION]

Digital librarians' efforts to preserve digital collections for future use include archiving the digital objects and associated metadata into preservation repositories. The integrated Rule Oriented Data System (iRODS) provides for the enforcement of preservation policies such as replication of collections so that there is no single point of failure. iRODS combines information models from the digital library community with archivists' preservation models and archival storage technology from the data grid domain.

The speaker will discuss federated preservation data grids and demonstrate a prototype application from an ongoing project to use the OAI-PMH to transfer digital objects and metadata from the Odum Digital Archive at UNC-CH into the National Archives and Records Administration (NARA) Transcontinental Persistent Archive Prototype (TPAP) preservation grid, an extension of which uses iRODS. The transfers implement community policies for the generation of descriptive metadata, choice of semantics, and access permissions.

The success of this proof-of-concept means that any digital library or archive that is an OAI-PMH-compliant Data Provider can upload their collections into the preservation data grid, and demonstrates that the iRODS software is flexible enough to provide not only "dark" archival storage and enforcement of domain specific policies, but dissemination and data transfer as well.
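To make the harvesting step concrete, here is a minimal Python sketch of the OAI-PMH side of such a transfer: building a ListRecords request and reading record identifiers and the resumption token from a response. It is not the project's code and does not touch iRODS. The base URL and the sample response are invented for illustration, and the sketch parses an inline string rather than contacting a server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH Data Provider."""
    if resumption_token:
        # In the protocol, resumptionToken is an exclusive argument.
        query = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        query = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(query)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    path = ".//oai:record/oai:header/oai:identifier"
    identifiers = [el.text for el in root.findall(path, OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Invented sample response; a real one would also carry metadata payloads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

identifiers, next_token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("https://archive.example.org/oai", resumption_token=next_token)
```

A harvester would repeat the request with each returned token until none is returned, then hand the collected records to whatever ingest step the preservation grid provides.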

Session 7

A) Name This! Automating Metadata Extraction through a Named Entity Recognition Tool Jean Godby, OCLC; Patricia Hswe, University of Illinois at Urbana-Champaign; Judith Klavans, University of Maryland [PRESENTATION]

The Extracting Metadata for Preservation (EMP) Project, funded by the National Digital Information Infrastructure and Preservation (NDIIPP) Program, addresses the ongoing challenge of identifying proper names to improve authority control in metadata creation and extraction, as well as accuracy in end-user information access via web-based search and retrieval. As a collaboration among the University of Illinois at Urbana-Champaign, OCLC, and the University of Maryland, EMP researchers bring multidisciplinary perspectives from the library, computer science, and linguistics communities to the problem of high-quality identification and disambiguation of names.

This presentation reports on three activities. First, we describe an open-source name extractor tool developed by computational linguists at Illinois, configured with a plug-in interface that lowers barriers of access to state-of-the-art research tools. Second, we demonstrate the use of this tool by integrating it into two applications developed at the collaborating institutions: summary views of FRBR-ized MARC records hosted at OCLC and metadata generated by CLiMB (Computational Linguistics for Metadata Building) at Maryland. Finally, we describe the results of evaluation that compares the output of EMP with previously available solutions.

This research will be of interest to those who develop search interfaces, metadata creation tools, institutional repositories, and applications requiring names management.

B) Library of Congress Controlled Vocabularies as Linked Data Clay Redding, Library of Congress [PRESENTATION]

Historically, controlled vocabularies maintained and provided by the Library of Congress have proven difficult to access and process. Some vocabularies, like the LC Name Authority File and LC Subject Headings, have required substantial payment to simply access the data. Others, while freely available, have only been provided within simple lists that lack web addressability for the values within the vocabulary. Both approaches required human intervention to make use of the data.

In Spring 2009, the Library of Congress will launch a new service called id.loc.gov to expose its controlled vocabularies and the values within them as first-class web resources. To drive this application, LC primarily uses Simple Knowledge Organization System (SKOS) Resource Description Framework (RDF) metadata. Following the principles of Representational State Transfer and the Linked Data movement, each vocabulary and every value within will be addressable via a dereferenceable HTTP URI to provide the machine readability that has long been requested by developers. This new functionality will allow users to tie LC vocabularies and individual terms directly into their metadata. These URIs will allow for HTTP content negotiation so that machines and human users alike can access a suitable format of the data. If the data at hand doesn't suit the needs of the developer, one can perform custom queries using the site's SPARQL endpoint. Alternatively, one can download the data for free in numerous RDF formats and process it locally.
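HTTP content negotiation, as mentioned in this abstract, can be sketched in Python as follows: the same URI is requested with different Accept headers, and the server chooses the representation. The term URI below is a hypothetical placeholder, since the abstract does not give the service's URI pattern, and the media types are common choices rather than a confirmed list of what the service offers. The requests are only constructed here, not sent.

```python
from urllib.request import Request

# Hypothetical term URI; the real URI pattern is not stated in the abstract.
EXAMPLE_TERM_URI = "http://id.loc.gov/example-vocabulary/example-term"


def negotiated_request(uri, media_type):
    """Build a GET request that asks for one representation of a resource."""
    return Request(uri, headers={"Accept": media_type})


# One URI, two representations: RDF for machines, HTML for people.
rdf_request = negotiated_request(EXAMPLE_TERM_URI, "application/rdf+xml")
html_request = negotiated_request(EXAMPLE_TERM_URI, "text/html")
```

Passing either object to `urllib.request.urlopen` would send the request. Keeping one URI per term means metadata records can store a single identifier regardless of who or what later follows it.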

10:30 a.m. – 11:00 a.m.

Break

11:00 a.m. – 12:15 p.m.

Session 8

A) Variations: Implementing an Open Source Digital Music Library System Jon Dunn and Mark Notess, both Indiana University [PRESENTATION]

In February 2009, Indiana University released Variations, an open source software package that helps libraries provide online access to streaming audio and scanned score images for teaching, learning, and research, with support from a grant from the Institute of Museum and Library Services. The Variations system, currently in use at Indiana University and four other institutions, provides a repository for storing audio files and score images, tools to assist library staff in ingesting audio and score content, and end-user tools for delivery, annotation, and pedagogical use of music content. A key feature for libraries is a flexible access control and authentication system, which allows libraries to integrate with existing local authentication and authorization systems and to set up access rules based on their own local institutional policies. The system, written primarily in Java and distributed under a BSD-style license, makes use of a number of other open source tools, including Sun's MySQL database and Apple's Darwin Streaming Server. Currently, the end user tools are provided as part of a Java Swing desktop client for Windows and Mac OS X, but the most commonly used tools are being ported to browser-based Web applications.

In this presentation, we will provide an overview of Variations functionality and system architecture and discuss what is required to bring up and support the system at an institution. We will also talk about options currently under review for ongoing support and development of the system and raise some general issues about sustaining discipline-focused open source software.

B) Reimagining the Library Facebook Application Joseph Ryan and Josh Boyer, both North Carolina State University [PRESENTATION] Many library Facebook applications display existing library web content, such as course guides or catalog searches, within Facebook's application space. This presentation will provide an overview of these existing applications, and will contrast the functionality provided by these applications with published research on students' motivations and goals for using Facebook. Following this overview, the presenter will explore an alternative strategy to developing a library presence in Facebook by means of a case study of the NCSU Libraries Activity Wall. Still in development, the Activity Wall is an application designed to help students meet up in physical library space for planned or ad hoc activities, such as group study sessions. The application provides an "at a glance" view of all activity in the library, activity by the user's friends, and useful library information, such as hours and study room availability. Users can join activities created by others, or can simply broadcast their own activity to the community. Focus group feedback and analysis of Facebook user motivations will be discussed in the presentation.

12:15 p.m.

Adjourn

POST-CONFERENCE 1:00 p.m. – 5:00 p.m.

TEI Task Group

POST-CONFERENCE: Thursday, May 7 8:30 a.m. – 5:30 p.m.

METS Editorial Board Meeting

ORGANIZATIONAL MEETING: Friday, May 8 8:30 a.m. – 5:30 p.m.

Developing Integrative Practices: METS & DDI, TEI, EAD, FGDC

By invitation only.

Contact Nancy Hoebelheinrich at nhoebel "at" stanford "dot" edu if interested.
