Preserving the Whole:
A Two-Track Approach to
Rescuing Social Science Data and Metadata
by Ann Green, JoAnn Dionne, and Martin Dennis
Digital Library Federation
Council on Library and Information Resources
Preserving the Whole appears as the second publication of the Digital Library Federation and reflects the Federation's interests both in advancing the state of the art of social science data archives and in building the infrastructure necessary for the long-term maintenance of digital information. The paper is especially valuable as a meticulously detailed case study of migration as a preservation strategy. It explores the options available for migrating both data stored in a technically obsolete format and their associated documentation stored on paper, which may itself be rapidly deteriorating. The obsolete data format known as column binary was born in the same era of creatively parsimonious coding techniques that have given rise to the widely publicized Year 2000 (Y2K) computer problems.
Beyond its contributions to our understanding of migration as a particular strategy for the long-term maintenance of digital information, Preserving the Whole also provides more general lessons. A remarkable finding of this study is that the column binary format, although technically obsolete, is so well documented that numerous options exist not just for migrating column binary files to other formats, but also for reading them in their native format. Moreover, the authors make the important observation that data sets will be indecipherable, and so will not survive at all, regardless of the file format in which they are stored, unless an effort is also made to preserve their codebooks. A codebook is the essential documentation that relates the numeric data to meaningful fields and values of information.
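To make the role of a codebook concrete, the following is a minimal Python sketch of how a codebook turns a raw numeric record into labeled values. Everything in it is an assumption made for illustration: the field names, column positions, value labels, and the `decode_record` helper are hypothetical and are not taken from the report or from any real data set. It uses a simple fixed-width character record, not the column binary format itself.

```python
# Hypothetical codebook: field names, positions, and labels are invented
# for illustration and do not come from any real study.
CODEBOOK = {
    "sex":    {"start": 0, "width": 1, "labels": {"1": "male", "2": "female"}},
    "age":    {"start": 1, "width": 2, "labels": None},  # stored as a literal number
    "region": {"start": 3, "width": 1, "labels": {"1": "north", "2": "south"}},
}


def decode_record(record, codebook):
    """Map a fixed-width record string to field names and value labels."""
    decoded = {}
    for name, spec in codebook.items():
        # Slice the raw characters for this field out of the record.
        raw = record[spec["start"]:spec["start"] + spec["width"]]
        labels = spec["labels"]
        # Use the label if the codebook defines one; otherwise keep the raw value.
        decoded[name] = labels.get(raw, raw) if labels else raw
    return decoded


if __name__ == "__main__":
    print(decode_record("2341", CODEBOOK))
```

Without the `CODEBOOK` mapping, the record `"2341"` is just four digits. That is the sense in which a data file cannot be interpreted once its codebook is lost, whatever format the file itself is stored in.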