Libraries have now been involved in the creation of digital library systems for more than a decade. In some cases they have developed a large number of projects. For example, Cornell University has developed Project Euclid; the Core Historical Literature of Agriculture; Historical Math Monographs; the USDA Economics, Statistics and Market Information System; the Samuel J. May Anti-Slavery Collection; the Making of America project; and the HEARTH (Home Economics Archive: Research, Tradition, History) collection. Rather than re-inventing descriptive metadata for each one of these systems, many of which include digital representations of materials the library has in its print collections, the descriptive metadata used to describe those print materials—MARC metadata from the library catalog—is repurposed and used to describe the digital representations.

The MARC metadata is converted, repurposed, or transformed (take your pick) into a wide variety of metadata schemes, depending on the particular system in question. It takes some planning to put these transformations into place. First, librarians determine how to map one metadata scheme into another. As I’ve said, the mapping for one scheme is often different from that for another. For example, the HEARTH project dealt with materials that had been cataloged according to both pre-AACR and AACR cataloging rules. As a result, some descriptive metadata records recorded the edition statement in the field established for it (MARC 250) and some in a more generic notes field (5XX). Librarians worked with the IT person to ensure that edition statements would be pulled from the most appropriate field. In the Samuel J. May Anti-Slavery Collection, the date of publication for the pamphlets was recorded in MARC 008/07-10 because the MARC records used for the May Collection cataloged the original source documents, not the digital versions. These characters were mapped to the publication date in the metadata record in the digital system. For the Historical Math Monograph Collection, on the other hand, catalogers had created catalog records for the digital versions of the original print monographs. The 008/07-10 field contained the date for the original print monograph, and the 008/11-14 field contained the date for the digital version, so information for this field was mapped to the publication date in the metadata record in the digital system. All this is just planning for how metadata transformations will be done.
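Collection-specific decisions of this kind can be written down as small mapping functions. What follows is a minimal sketch, not the code used for any of the projects named here. The record layout (a dict from MARC tag to a list of strings), the function names, and the sample record are all hypothetical, and the fallback rule for finding an edition statement in a 5XX note is an assumed heuristic.

```python
# Hypothetical sketch of collection-specific MARC mapping decisions.
# A record is modeled as a dict: MARC tag -> list of field strings.

DATE_1 = slice(7, 11)    # character positions 008/07-10
DATE_2 = slice(11, 15)   # character positions 008/11-14


def publication_date(record, date_slice):
    """Return the four date characters chosen for this collection, or None."""
    fields_008 = record.get("008") or [""]
    value = fields_008[0][date_slice].strip()
    return value or None


def edition_statement(record):
    """Prefer MARC 250; otherwise fall back to a 5XX note mentioning an edition."""
    for value in record.get("250", []):
        if value.strip():
            return value.strip()
    # Assumed heuristic: older records may carry the edition in a note field.
    for tag, values in record.items():
        if tag.startswith("5"):
            for value in values:
                text = value.lower()
                if "edition" in text or " ed." in text:
                    return value.strip()
    return None


# Invented sample record: position 6 is a fill character, not interpreted here.
sample = {
    "008": ["040101|18552003" + " " * 25],
    "500": ["Second edition, revised."],
}

print(publication_date(sample, DATE_1))
print(publication_date(sample, DATE_2))
print(edition_statement(sample))
```

Passing the date positions in as a parameter keeps the per-collection choice (008/07-10 for one collection, 008/11-14 for another) in configuration rather than in the function itself.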

As you can see, we’ve been talking about different kinds of metadata mappings and transformations that have been implemented in different digital library projects. With a dispersed library system and staff like Cornell has, these tools need to be broadly available to library staff rather than centralized in a single library unit or on a single staff computer. They should be gathered together in one place accessible to all library staff. Not only should the tools of metadata mapping and transformation be recorded and made available, but so should documentation of the intellectual work: the decision processes that arrived at these tools and how they have been implemented for specific collections. This whole process is part of what could be called a metadata management design.

A metadata management design would:

1. Promote the sharing and reuse of tools

2. Recognize that library staff are users too, without forgetting the end user. If library staff are able to use a service that helps them reuse metadata mappings and transformations, they will be able to create new digital library projects more efficiently, and those projects will, in turn, serve the end user.

3. Improve operational activities by making it easier to access better managed metadata

4. Reduce the risk that metadata generation and transformation processes will be lost as an organization’s structure, staff, and activities change over time.

This notion of metadata management is not new. Indeed, there are a number of reasons for an organization to efficiently manage any of its resources; these include:

1. An organization must optimize its resources because it can’t afford an unlimited supply of everything

2. An organization must share and leverage a resource as much as possible to maximize its value and minimize its cost

3. An organization must anticipate a resource’s requirements and meet them proactively, and it can’t effectively do this unless its resources are well-managed

4. For all these reasons, organizations must manage their resources carefully to ensure that they are used prudently, efficiently, effectively, and securely

The discipline of data resource management views data as an important organizational resource that should be treated like any other resource. According to the Data Management Association, data resource management “facilitates the stewardship of data” and recognizes that data (or metadata) is an important asset to an organization and is made more valuable through “planning, communication, control, coordination, and management.”

There are a number of benefits of applying data resource management principles to library operations.

Culturally, pulling metadata tools into one place will help to improve communication between librarians, because they will know what each other is doing. In addition, because many people who have previously been working on digital library projects will be involved in pulling this disparate metadata together, many of those in the organization will be part of an activity that helps the organization succeed. Lastly, people involved in pulling metadata together into one place will begin to see this library resource as one that is shared throughout the organization.

Functionally, creating a metadata management design will help library staff to more easily locate metadata tools generated in other parts of the library, thus minimizing the creation of redundant metadata tools. This in turn will result in more productivity among metadata and IT practitioners, which will in turn result in faster development of new metadata applications. In addition, the new applications that are developed will be more easily integrated into the existing metadata environment. All this will result in increased organizational success.

Before a library begins creating a metadata management scheme, it must first bring together metadata managers to discuss the costs and benefits of creating such a scheme and identify all those involved in the creation of metadata tools. Next, staff need to document or inventory the existing metadata relationships and processes; after all, they may be located in different places and managed by different staff. This documentation would record:

1) Authoritative versions of metadata

2) Locations of metadata files (server, filename, identifier)

3) Those responsible for the metadata

4) Backups for the metadata

5) Metadata standards in use

6) Metadata content relationships

7) Maintenance protocols

8) Storage instances

9) Display occurrences

10) Mapping schemes

11) Transformation and processing applications

Next, this documentation would be used to develop a metadata management design that would incorporate and coordinate all these tools.
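One way to picture a single inventory entry is as a structured record whose fields follow the list above. The sketch below is only an illustration: the class name, the field names, and the sample values are hypothetical, and only a subset of the documented items is modeled. The `missing_fields` helper shows how such records could be used to flag gaps in the documentation.

```python
from dataclasses import dataclass, fields


@dataclass
class InventoryEntry:
    """One documented metadata tool or file (hypothetical structure)."""
    name: str
    location: str = ""              # server, filename, identifier
    responsible: str = ""           # person or unit responsible
    backup: str = ""                # where the backup lives
    standard: str = ""              # metadata standard in use
    mapping_scheme: str = ""        # mapping applied, if any
    maintenance_protocol: str = ""  # how the item is kept up to date


def missing_fields(entry):
    """List the field names that have not been documented yet."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name)]


# Invented sample values for illustration only.
entry = InventoryEntry(
    name="Example collection extract script",
    location="example-server:/scripts/extract",
    responsible="Metadata unit",
    standard="MARC 21",
)

print(missing_fields(entry))
```

Running a check like this across every entry gives a quick picture of which tools still lack a named owner, a backup, or a maintenance protocol.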

Getting specific, a metadata inventory for MARC metadata projects such as the May Anti-Slavery Collection could include:

“(1) The MARC bibliographic metadata, both content and content designations, as stored in the library management system.

“(2) The extract script or tool that selects and extracts the MARC bibliographic metadata for the project.

“(3) The file that is the product of the extract in (2).

“(4) The collection-specific MARC mapping used for the project.

“(5) The XML metadata collection and storage scheme.

“(6) The transformation script that creates an XML file – meeting the specifications of the metadata store scheme (5) – and populates the descriptive metadata of this file with MARC metadata elements from the extract (3), following the MARC mapping specified for this project (4).

“(7) The XML file that is the product of the transformation in (6).

“(8) The transformation script that generates a TEI Lite file by taking metadata from the XML metadata storage files in (7) and integrating it with page-level optical character recognition data.

“(9) The TEI Lite file that is the product of the script in (8).

“(10) The DTD used to validate the files in (9).

“(11) The project metadata as stored in the digital collection delivery system after the TEI Lite XML file is ingested.”
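Item (6) is the point where the mapping in (4) is applied to the extract in (3). A rough sketch of that step follows, under stated assumptions: the extract is modeled as a plain dict, the mapping as a dict from target element names to source keys, and the element names (`record`, `descriptiveMetadata`, `title`, and so on) and source keys are invented for illustration rather than taken from the actual storage scheme in (5).

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping (item 4): target XML element -> key in the extract.
MAPPING = {
    "title": "245a",
    "creator": "100a",
    "date": "008_date1",
}


def transform(extract, mapping):
    """Build an XML record (item 7) from one extracted MARC record (item 3)."""
    root = ET.Element("record")
    desc = ET.SubElement(root, "descriptiveMetadata")
    for element_name, source_key in mapping.items():
        value = extract.get(source_key)
        if value:  # skip elements the source record does not supply
            ET.SubElement(desc, element_name).text = value
    return root


# Invented sample extract; it has no creator, so no creator element is written.
extract = {"245a": "An example pamphlet title", "008_date1": "1855"}
xml_text = ET.tostring(transform(extract, MAPPING), encoding="unicode")
print(xml_text)
```

Keeping the mapping as data, separate from the transformation function, is what makes the same script reusable across collections whose mappings differ.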

Once a library has completed such an inventory, other steps must be taken towards the implementation of a metadata management design:

(1) Build library-wide consensus regarding metadata element decisions and generalized mappings.

(2) Organize meetings with stakeholders to discuss the costs and benefits of creating a MARC metadata repurposing design for the library. Metadata practitioners will want to use these discussions to demonstrate to their colleagues the value of stewardship in making tools and resource files more broadly accessible. Stakeholders’ meetings will present opportunities to begin articulating the roles and responsibilities of metadata staff and information technology staff in metadata management.

(3) Begin to develop reusable transformation tools that would help to make such a metadata design logical.

(4) Move from discussions of a MARC repurposing design to discussions of creating a library-wide metadata management design. Stakeholders in library-wide metadata management design will likely be more numerous than those interested in MARC repurposing design because library metadata activities typically extend beyond MARC repurposing.

(5) Investigate the costs and benefits of taking the creation of a library-wide metadata management design yet further by investigating the creation of a metadata management repository of mapping schematics, transformation tools, data files, and other metadata resources. Creating a metadata repository would involve treating metadata components as persistent digital objects with persistent identifiers and descriptive metadata in order to facilitate their discovery and retrieval through a digital content delivery system. Building searchable metadata repositories would make it easier for libraries to share their metadata mapping and transformation resources with each other.

Michael will now talk about what such a repository might look like.

There are three pedals on the floor, two levers on the steering column, and one floor lever to the left of the driver. The floor lever is neutral while in the upright position, second gear when in the forward position while the leftmost pedal is not depressed, and emergency brake when all the way back.

The leftmost pedal is first gear while depressed, second gear if the floor lever is forward when released. The middle pedal is reverse gear and the rightmost pedal is the brake. The right lever on the steering column is the gas, and the other lever is the spark advance. Confused? Once you drive for a month or so, it gets easy, but the controls are far from orthogonal. If you get into trouble, you can just stomp on all three pedals and that will stop you pretty quick. Doing this causes the bands in the transmission to lock up the drive train. The best thing to remember while driving is to plan ahead.