University of Pennsylvania
Report to the Digital Library Federation
October 15, 2000
TABLE OF CONTENTS
Collections, Services, and Systems
Projects and Programs
Specific Digital Library Challenges
- The Schoenberg Center for Electronic Text and Image (SCETI)
SCETI provides the scholarly community with web access to virtual facsimiles of original texts, documents, and sources from Penn's collections. SCETI now includes eleven specialized collections that provide digital views of printed books, manuscripts, photographs, artwork, maps, broadsides, ephemera, and recorded sound. Founded in 1996, SCETI continues to introduce new methods for producing and presenting digital resources. Recent additions to SCETI include new text facsimiles for the Furness Shakespeare library, and accompanying multimedia teaching aids that demonstrate the technique and interpretation of Shakespeare's works. SCETI has also recently added several digitized versions of medieval manuscripts (including illuminations) to its Lawrence J. Schoenberg digital collection.
- The Oxford University Press History Books Project
The Oxford University Press History Books project is making newly issued
scholarly history books available electronically to the Penn community. The
project is studying digital book use and its impact on teaching, learning,
and the economics of publishing. Currently over 100 books are available,
with thousands expected by the end of the 5-year study. Descriptions
of the project, its catalog, and three sample books, are available
for the general public to browse.
- Freedman Archive
The Freedman Archive site is an example of the integration of digital and
nondigital resources, and of the migration of digital technologies we hope to
support on a larger scale in the years to come. Users can visit the
Freedman collection web site to browse and search a multilingual catalog of
over 27,000 recordings of Jewish music. A few of these recordings
are available as digital sound samples; the others can be listened to
offline at the Archive. The digital catalog, using dBase IV tables with
customized character encoding, has been migrated to web-searchable forms
with standard Unicode encoding for Yiddish and Hebrew text.
We expect to reuse the tools used in this migration for other projects.
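As a rough illustration of this kind of migration, the sketch below converts
bytes in a made-up single-byte encoding to Unicode. The mapping table and byte
values are hypothetical; the customized encoding actually used in the dBase IV
tables is not described in this report.

```python
# Hypothetical mapping from a custom single-byte encoding to Unicode.
# The byte values are invented for illustration only.
CUSTOM_TO_UNICODE = {
    0x80: "\u05D0",  # HEBREW LETTER ALEF
    0x81: "\u05D1",  # HEBREW LETTER BET
    0x82: "\u05D2",  # HEBREW LETTER GIMEL
}


def to_unicode(raw: bytes) -> str:
    """Decode bytes: ASCII passes through, mapped bytes become Hebrew letters."""
    chars = []
    for b in raw:
        if b < 0x80:
            chars.append(chr(b))
        elif b in CUSTOM_TO_UNICODE:
            chars.append(CUSTOM_TO_UNICODE[b])
        else:
            chars.append("\uFFFD")  # replacement character for unmapped bytes
    return "".join(chars)
```

Marking unmapped bytes with the replacement character, rather than dropping
them, keeps problem records visible for later review.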
- A Celebration of Women Writers
The Penn Library is also hosting A Celebration of Women Writers,
a collection of electronic text transcriptions (many with illustrations)
edited by volunteer Mary Mark Ockerbloom. This sister site to The On-Line
Books Page has digitally republished over 130 books by women on-line, free for
all to read. The Celebration's publications cover English-language
women's writings in all genres, but especially emphasized are fiction
and poetry, children's literature, Canadian authors, and personal accounts
of important historical periods and figures. The site also includes a
browsable database of other writings freely available on the Internet
by and about women writers.
- Cross-collection Searching
A cross-collection search service allows users to find digital resources
in any of dozens of databases, without having to seek out each individual
database, and go through each one's unique user interface. Our first
version of this service is now available as QuickSearch, implemented using
software provided by OCLC. Eventually we hope to have more powerful
cross-collection searches, through this and other tools, with
more databases, more flexible interfaces, and the ability to refine
searches to specific user needs.
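The idea of one query running against many databases can be sketched as
follows. This is only an illustration with in-memory lists standing in for
databases; it does not reflect how QuickSearch or the OCLC software works, and
the function and collection names are hypothetical.

```python
def cross_search(query, collections):
    """Run one query against every collection and tag each hit with its source."""
    results = []
    needle = query.lower()
    for name, records in collections.items():
        for record in records:
            if needle in record.lower():
                results.append((name, record))
    return results
```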
- Electronic Journals and Databases Search Tools
An electronic journal search tool, made public this fall, gives patrons
a much more powerful discovery tool for our electronic journals than
our previous static Web pages provided.
Since we now subscribe to over 3000 such journals, finding journals
that are relevant to a particular field of study can be difficult.
The search system allows patrons to search:
- by known title or keyword in title
- by community of interest (see below) or by community clusters
- by journal publishing associations like the Association for Computing Machinery or vendor-supplied aggregations like Journals@OVID
- by format of journal articles - full text, page images or not full text
- by access restrictions (useful for non-Penn users who want to see which electronic journals they can look at with no subscription)
- or alphabetically by title
A similar tool is now under development for electronic databases, using
similar searching criteria.
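The search criteria listed above can be pictured as optional filters over a
set of journal records. The sketch below is a minimal illustration under that
assumption; the record fields and function names are hypothetical and are not
taken from the actual search tool.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    title: str
    communities: set = field(default_factory=set)
    provider: str = ""
    fmt: str = "full text"   # "full text", "page images", or "not full text"
    restricted: bool = True  # True if a subscription is required


def search(journals, keyword=None, community=None, provider=None,
           fmt=None, unrestricted_only=False):
    """Return titles matching every criterion that was supplied, sorted A-Z."""
    hits = []
    for j in journals:
        if keyword and keyword.lower() not in j.title.lower():
            continue
        if community and community not in j.communities:
            continue
        if provider and provider != j.provider:
            continue
        if fmt and fmt != j.fmt:
            continue
        if unrestricted_only and j.restricted:
            continue
        hits.append(j.title)
    return sorted(hits)
```

Calling search with no criteria returns the full alphabetical title list,
which corresponds to the last option in the list above.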
- Electronic Reserves
Electronic reserves were introduced as a pilot service in the 1999-2000
academic year, for the delivery of course reserve materials over the local
network in digital form. We will be supporting electronic reserves on a regular
basis starting in the 2000-2001 academic year. The service includes in-house
rapid scanning facilities, integration with the Franklin catalog and with
on-line courseware, and access control to ensure that we stay within
- New Library Materials
Our New Materials notification service, currently under user testing
as "New Books Plus", lets users find recently acquired materials according
to a variety of criteria (including name, topic, language,
format, and date of acquisition of the materials). Users can also view
a complete listing of recent acquisitions.
- Try This Out!
"Try This Out!" is Penn's prototypes page, introduced this summer.
Library users can go to this page to get access to services still
in development that may be of use to them. They can also give feedback
to us and suggest ways to improve the prototypes. Library staff
have access to a similar page for projects that we want to test
internally but not yet release for public use.
- The On-Line Books Page
Penn's Library web site now hosts The On-Line Books Page, the Web's
oldest Internet-wide index of free-to-read books, edited by John
Mark Ockerbloom of the Library. The On-Line Books Page provides a
searchable index of over 12,000 freely accessible digital books in
English now available on the Internet. It also has links to major
electronic text archives, and posts information on how individuals
and groups can create more on-line books to add to the growing
collection of freely-readable books on the Internet.
- Persistent References
Our persistent references work allows us to use more stable, high-level
references to digital resources than the fragile URLs of the World Wide Web.
We have installed the new version of CNRI's Handle service, and plan to
introduce Handles to identify, and help locate and describe, many of our
locally managed resources. We are developing software to facilitate the
maintenance of locally defined Handles. In the future, we also hope to
support persistent references at the citation level (that is,
dynamically resolved references based on descriptive information like
title, author, and publication details) as well as the opaque, one-to-one
identifiers provided by Handles.
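The benefit of an opaque, one-to-one identifier can be shown with a toy lookup
table: references cite the identifier, and only the table changes when a
resource moves. This sketch does not use the CNRI Handle software or its
protocol; the class and the identifiers passed to it are hypothetical.

```python
class HandleTable:
    """Toy lookup table: opaque identifier -> current location."""

    def __init__(self):
        self._locations = {}

    def register(self, handle, url):
        # Also used to update the location of a resource that has moved.
        self._locations[handle] = url

    def resolve(self, handle):
        # Raises KeyError if the identifier is unknown.
        return self._locations[handle]
```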
- The Typed Object Model (TOM)
The Typed Object Model allows us to describe the structure and behavior of a
wide variety of data formats and information services. The system was
originally developed at Carnegie Mellon, where it still drives a popular
web-based conversion service. At the Penn Library, we have released the
core TOM software as open-source. We plan to use it to document our own
data formats and services, assist in data format migration and other
conversions, and provide uniform application-level interfaces to
heterogeneous data services.
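One way to picture format migration is as a search for a chain of single-step
conversions between formats. The sketch below illustrates that idea only; it
is not TOM's interface, and the converter table and format names are
hypothetical.

```python
from collections import deque

# Hypothetical single-step converters: source format -> reachable formats.
CONVERTERS = {
    "dbase4": ["csv"],
    "csv": ["xml"],
    "xml": ["html"],
}


def conversion_path(src, dst, converters=CONVERTERS):
    """Breadth-first search for the shortest chain of conversions."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in converters.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of converters reaches dst
```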
- Citation Linking
We are embarking on a citation linking project that aims to let readers
find literature cited in scholarly works by clicking on citations in the
document's bibliography and footnotes. By using citations directly, instead
of relying on the opaque document identifiers assigned by some other
reference linking systems, we hope to develop tools that cover a wider range
of scholarly literature, and give users more options in finding cited
works and related resources. As part of the project, we hope to implement
a context-sensitive system that automatically identifies and parses
citations in ordinary digital monographs, and embeds links to services
that can then return digital documents, catalog records, or other
resources related to the citation. (We may use third-party software for
some of these components.) We believe that a powerful citation service
may make research significantly more productive.
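A minimal sketch of the parse-then-link step, assuming citations of the simple
form "Author. Title. Publisher, Year." Real bibliographies vary far more than
this pattern allows. The resolver address is a placeholder, and the function
name and query field names are hypothetical.

```python
import re
from urllib.parse import urlencode

# Assumed citation shape: "Author. Title. Publisher, Year."
CITATION = re.compile(
    r"^(?P<author>[^.]+)\.\s+(?P<title>[^.]+)\.\s+.*?(?P<year>\d{4})\.?$"
)


def citation_link(text, base="https://resolver.example.org/find"):
    """Parse a citation string and build a query link from its fields."""
    m = CITATION.match(text.strip())
    if m is None:
        return None  # citation did not fit the assumed shape
    fields = {k: v.strip() for k, v in m.groupdict().items()}
    return base + "?" + urlencode(fields)
```

Because the link carries descriptive fields rather than an opaque identifier,
the receiving service is free to return a digital copy, a catalog record, or
related resources.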
- Communities of Interest Initiative
Communities of interest, built around scholarly disciplines or
interdisciplinary collaboration, can form important focuses for
library development. We have defined a set of communities of interest
at Penn, and are now using them to develop services that allow users
to locate resources relevant to particular communities of interest.
(For example, our new materials and electronic journals services allow
searching for materials of interest to specific communities.)
We also are considering developing services to help people in
these communities collaborate, and select especially relevant resources.
- Digital Images
The digital images project is a system to manage and deliver digital images for
use in teaching and research. It uses enhanced MARC records, XML tools for
managing collection metadata, software to browse and search image
collections, and a flexible delivery system that allows viewing all, or
selected parts, of images at various resolutions and detail. (Parts of
this project are supported by third-party software such as MrSID from
LizardTech.) The system is being implemented initially for a Fine Arts
slide collection, and we hope to use it for other types of collections as well.
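Viewing a selected part of an image at a lower resolution comes down to
scaling the requested region's coordinates. The sketch below assumes each
resolution level halves the image dimensions; that convention and the function
name are assumptions for illustration, not a description of MrSID or of the
delivery system itself.

```python
def region_at_level(x, y, width, height, level):
    """Scale a full-resolution region to a reduced level (each level halves)."""
    scale = 2 ** level
    left, top = x // scale, y // scale
    # Ceiling division so the scaled region still covers the requested area.
    right = -(-(x + width) // scale)
    bottom = -(-(y + height) // scale)
    return left, top, right - left, bottom - top
```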
- English Renaissance in Context (ERIC)
English Renaissance in Context (ERIC) is a three-year, NEH-funded
project to create a web site presenting ways in which Shakespeare's plays
can be taught using digital facsimiles of original sources and documents. It
is a collaborative project involving the School of Arts and Sciences (SAS) at
Penn and the Penn Library's Schoenberg Center for Electronic Text & Image
(SCETI). It has two distinct components: a set of self-paced
tutorials that raise a variety of issues for students, and an
introduction to the printing and publishing context of the English
Renaissance. ERIC is part of a larger collaborative effort between
SAS and the Library to create a major archive of digital facsimiles
relating to the English Renaissance, one of the areas of particular
strength both among the faculty and in the Library. A completed prototype
will be available by summer 2001.
http://www.library.upenn.edu/etext/collections/furness/eric (Flash player required)
- Franklin to Web
The Franklin to Web project is standardizing metadata for many of our
digital resources, encoding the metadata as MARC records that are filed
in our Franklin catalog, and making the metadata searchable and browsable
via the Web. Although encoded as MARC, the scope of our metadata goes well
beyond ordinary catalog records, and is much more easily searched and
filtered than static Web page listings. We are starting by migrating
electronic journal and database records. (See the services section
above for more details.) We will also eventually incorporate digital
slide images and selected Web sites as well.
- Museum On-Line Project
MESL allows members of the Cornell community to view a collection of nearly 9,000 images selected from seven museums and institutions across the United States.
- The Information Base
The Information Base project is designing interrelated repositories of digital
documents and metadata on a common set of principles, supporting a wide
range of information formats, use of information in multiple contexts, and
long-term preservation. There are several notable aspects to the design:
- A "life cycle" model that describes how digital resources can be managed from their initial acquisition, preserved for long-term access, and evaluated.
- A centralized network storage facility (hosted by a recently acquired terabyte-scale disk array) which simplifies access, backup, and integrity checking.
- Standards for processes, data formats and metadata that reflect the best practices of digital libraries, and that we adapt and extend for our local environment in projects such as Franklin-to-Web.
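The integrity checking mentioned for the storage facility can be illustrated
by comparing each stored object's current digest against a digest recorded at
acquisition. This is a sketch under assumed conventions; the choice of SHA-256
and the function names are illustrative and are not taken from the Information
Base design.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return a hex digest that serves as the object's fingerprint."""
    return hashlib.sha256(data).hexdigest()


def find_damaged(stored, recorded):
    """Return names whose current digest differs from the recorded one."""
    return sorted(
        name for name, data in stored.items()
        if digest(data) != recorded.get(name)
    )
```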
- Selective Dissemination of Information (SDI)
We are starting to plan and propose services for selective dissemination of
information. We mean to provide views of Library resources that are
customized for particular users and groups, and subscription services
for informing users of new information they may be interested in.
Early aspects of the project may include bundled, customized versions of our
new materials service (see above) and journal contents and abstract
- Visualization of Information
One difficulty in finding electronic resources is that plain-text browser
windows only let users examine a small amount of information at any given
time. We can see much more at one time through nondigital technologies, such
as topically-arranged open stacks. Now digital technologies are being
developed to make large quantities of digital resources intelligibly browsable,
by transforming indexes of metadata into more intuitive graphical designs and
layouts. We are considering partnerships with selected developers of these
technologies, to test them in our own library services, and determine the most
effective uses of this technology. At the same time, we are investigating
ways of making our existing text-based indexes more effective in letting
users browse, discover, and view our resources.
Specific Digital Library Challenges
- Further populating our information base, so our digital resources and
metadata can be more easily searched and presented in different ways.
- Transitioning from fragile URLs to more robust persistent identifiers.
- Designing and experimenting with archives for long-term
preservation of digital documents.
- Implementing services for selective dissemination of information,
developing experimental citation linking services, and improving
cross-collection search services.
- Implementing searchable, archival-quality collections of digital images.
- Supporting geographic information and related datasets.
- Strengthening the open-source library software community, by
releasing software we develop, working with other developers to
improve their software, and publicizing the benefits of open-source
© 2000 Council on Library and Information Resources