Indiana University Digital Library Program
Report to the Digital Library Federation
July 15, 2000
VARIATIONS is a digital library project that provides access to over 6600 titles of near CD-quality digital audio to users at computer workstations in the Cook Music Library, located in the Simon Music Library and Recital Center at Indiana University Bloomington. VARIATIONS serves both as a useful system for the faculty and students of the IU School of Music and a testbed for multimedia digital libraries at Indiana University. VARIATIONS has been in operation since April 1996 and presently provides access to an average of 500 sound recordings per day for library users. The database is growing by approximately 75 hours of music per week. Selections from a broad range of musical content, including operas, songs, instrumental music, jazz, rock, and world music, make up the VARIATIONS database.
The following statistics will provide a sense of VARIATIONS activity during the past six months:
- Number of sound files added or updated: 562
- Number of sound recording titles added or updated: 285
- Average number of sound file accesses per day: 673 (for Spring 2000 semester)
- Total number of sound files in system: 12,142
- Total number of sound recording titles in system: 6650
- Total number of hours of sound in system: approx. 7400
During the past year, digitized scores have been added to the VARIATIONS Web site. Since January, Music Library staff have completed 1075 pages of 19 scores. In addition, they have digitized about 40 scores (many of which are much larger, averaging 100-150 pages each) that are awaiting HTML markup. Information about this project and some of the scores are accessible on the VARIATIONS Web site.
- Victorian Women Writers Project
The goal of the Victorian Women Writers Project is to produce highly accurate transcriptions of works by British women writers of the 19th century, encoded using the Standard Generalized Markup Language (SGML). The works, selected with the assistance of the Advisory Board, will include anthologies, novels, political pamphlets, religious tracts, children's books, and volumes of poetry and verse drama. Considerable attention will be given to the accuracy and completeness of the texts, and to accurate bibliographical descriptions of them.
Texts are encoded according to the Text Encoding Initiative (TEI) Guidelines, using the TEILite.DTD (version 1.6). Each text includes a header describing fully the source text, the editorial decisions, and the resulting computer file. The texts are freely available through the World Wide Web.
Between January and June 2000, project staff added 13 texts (3000 pages, 4.3 MB), bringing the total to 183 texts. There were 178,000 hits on texts in this period.
In late December 1999, General Editor Perry Willett introduced a new search feature, which was enhanced in May 2000. Over 12,500 searches were performed from January through June 2000.
- The Hoagy Carmichael Collection
In October 1998, we received two-year funding from the Institute of Museum and Library Services to digitize and preserve the Indiana University Hoagy Carmichael collection housed in the Archives of Traditional Music, the University Archives, and the Lilly Library. November 22, 1999, was the Centennial of Hoagy Carmichael's birth in Bloomington, Indiana. Although the project to catalog, digitize, and preserve the artifacts continues through September 30, 2000, the Carmichael Project team wanted to provide a sample of our work to date by early November -- to join in the celebration of Hoagy's birth.
On November 5, 1999, we launched the Hoagy Carmichael Web site, http://www.dlib.indiana.edu/collections/hoagy/, with complete finding aids for items from the Archives of Traditional Music (ATM) and as much digital content as we had completed by that time. As of June 2000, this content includes 1,016 music scores, 757 lyric sheets, 1,039 photographs, 222 pieces of correspondence, four scrapbooks, 623 sound files, and 122 images of personal effects. Users can search a complete inventory of the ATM's Carmichael Collection and access selected digital objects and supplemental research information, such as genealogy. For the general user, the Web site includes a QuickTime VR virtual tour of the Hoagy Carmichael Room in the Archives of Traditional Music, highlights from the collection, and background information on Carmichael. Users can browse the various collections or search for specific information. A more complete progress report was published in the June 2000 issue of the Internet journal First Monday: http://www.firstmonday.org/issues/issue5_6/brancolini/
- Letopis' Zhurnal'nykh Statei
In 1999, the Indiana University Digital Library Program received a United States Department of Education Title VI Technology Program grant to digitize and offer on the Web a twenty-year portion of the Letopis' Zhurnal'nykh Statei (1956-1975), a serial publication that indexes Soviet periodicals from 1926 to the present. It covers more than 1,700 journals, series, and continuing publications of academies, universities, and research institutes in the fields of humanities, natural sciences, and the social sciences, and it also covers the popular periodical literature. Once digitized and made available on the World Wide Web, it will provide access to the periodical literature for an essential time in modern Russian history, beginning with the period of the Khrushchev "Thaw" following the 20th CPSU Congress and continuing through the first half of the so-called Brezhnev "Period of Stagnation".
Digitization of the pages has been outsourced; the vendor is Northern Micrographics. The first third of the originals (approximately 80,000 pages) has been digitized and converted to text using the Optical Character Recognition (OCR) software FineReader. One year's worth (approx. 6000 pages) of the Letopis has been proofread by our five editors/encoders, who are native Russian speakers. The second third of the originals is currently at the digitization vendor's facility and is due back at Indiana University on July 15 to begin the OCR process. The final third is being prepared for shipment to the digitization vendor on August 1. Thus we are on schedule to complete the digitization phase of the project by the end of the first year of the grant (November 2000).
The Document Type Definition has gone through several revisions and is now nearing final form. Problems encountered thus far have been mainly associated with the lack of full Unicode support in many XML editors and search engines. We have found an XML editor that, while not ideal, will suffice. We have also found an XML search engine that we believe will satisfy the needs of the project. A developer's copy of the proposed XML search engine has been ordered from the software company. The parts of the Letopis that were missing from Indiana University's collection have been acquired, so the finished database will be a complete run for the years covered (1956-1975). The Project Manager, Andy Spencer, presented a paper on the project at the "Managing the Digital Future of Libraries" conference in Moscow, Russia, in April. The text of the paper can be found at http://www.dlib.indiana.edu/collections/letopis/Moscow_Conference_paper.html.
- Wright American Fiction Project: 1851-1875
Nine libraries in the Committee on Institutional Cooperation (CIC) have agreed to cooperate on a three-year project to digitize the novels listed in Lyle Wright's bibliography, American Fiction 1851-1875. Wright selected a total of 2,832 titles in adult fiction, including "novels, novelettes, romances, short stories, tall tales, tract-like tales, allegories, and fictitious biographies and travels, in prose" (from the introduction), and inventoried 18 American libraries for holdings. This compilation is part of his three-volume set listing American fiction from 1774 through 1900, and is still considered the best bibliography of American adult fiction of the 18th and 19th centuries.
The project will digitize microfilmed versions of the texts, and the page images will be converted to text files using Optical Character Recognition (OCR) software. The unedited text files will be searchable, with the digitized page images displayed. Staff members in the partner libraries will begin proofreading and editing the texts in Summer 2000 and will encode them using SGML, following the Text Encoding Initiative (TEI) Guidelines and the Guidelines for Best Encoding Practices. The cooperative process of proofreading, editing, and encoding, shared among the nine partner libraries, will take three years.
A total of 405 reels of microfilm will be digitized at a rate of approximately 22 reels per month. The first shipment was sent on April 15 and the last shipment will be sent on July 15, 2001. The first shipment, with 75 reels of microfilm, is due back on June 15. On July 11-12, 2000, Perry Willett from Indiana University and Chris Powell from the University of Michigan will conduct a workshop at the University of Illinois at Chicago to train librarians from the nine institutions in SGML encoding and use of the Text Encoding Initiative Guidelines and the Guidelines for Best Encoding Practices.
- Frank M. Hohenberger Photograph Collection
The Hohenberger Collection, dating from 1917-1960 and housed in the Lilly Library, consists primarily of photographs by Frank Michael Hohenberger, 1876-1963, Brown County photographer and newspaperman. The photograph collection totals 8300 prints and 9400 negatives. Also present in the collection is a small amount of correspondence, a lengthy "diary" by Hohenberger describing many of his photographic tours and processes, and copies of his "Down in the Hills O' Brown County" articles. The total collection is just over 18,700 items.
The first 800 photographs in the Frank M. Hohenberger Photograph Collection have been digitized. A Web site for the project offers the complete finding aid, the images, and contextual information about Hohenberger and Brown County. The contextual information includes a biographical essay about Hohenberger written by former Lilly librarian Cecil K. Byrd for an Indiana University Press book of Hohenberger photos and a contemporary article about Hohenberger published in the American Magazine in 1933. We are also adding subject descriptors to the images and creating categories for browsing in order to increase the consistency of subject searching. The existing finding aid is based upon the photographer's inventory; it does not employ controlled vocabulary, which makes reliable retrieval difficult.
- U.S. Steel Photograph Collection
In mid-June 1999 we began digitizing the U.S. Steel Photograph Collection, consisting of 1,900 photographs from the Calumet Regional Archive, located on the Northwest campus of Indiana University in Gary. These photographs cover the period 1906 to 1941, representing the founding and early years of Gary, Indiana, and U.S. Steel. In June 2000 we received funding from LSTA to finish digitizing the photographs, add searchable text, and create online learning activities for students in grades 4-12. The grant-funded portion of the project will begin September 1, 2000.
© 2000 Council on Library and Information Resources