American West Project OAI Harvest Architecture

California Digital Library

Issues

• Sustainability???

• With AmWest, we have topically enriched metadata records, but no clear process at this point for reharvesting or adding new materials - YIKES!

• Clustering → Classifying on ingest?

  - CDL Experiment (fingers crossed): Can we build a classifier into our ingest routine for harvested sets to filter records matching 1+ of the 147 topics/word bags we’re calling “The American West”?

  - Maybe at some point we need to re-cluster to test the validity of those topic/word bags and see if additional valid topics can be surfaced?
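
A minimal sketch of what "classifying on ingest" could look like, assuming each topic is represented as a plain set of words and a harvested record is a dictionary of text fields. Everything named here is hypothetical: the two sample word bags (the real set would be the 147 derived from clustering), the field names, and the `min_hits` threshold are illustrative assumptions, not the CDL ingest routine.

```python
import re
from typing import Dict, Iterable, List, Set

# Hypothetical word bags for illustration only; the project's actual
# 147 topics/word bags are not reproduced here.
TOPIC_WORD_BAGS: Dict[str, Set[str]] = {
    "mining": {"gold", "silver", "mine", "miner", "prospector"},
    "railroads": {"railroad", "locomotive", "depot", "transcontinental"},
}

# Assumed metadata fields to inspect on each harvested record.
TEXT_FIELDS = ("title", "subject", "description")


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its distinct alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def matching_topics(record: Dict[str, str], min_hits: int = 2) -> List[str]:
    """Return topics whose word bag shares at least min_hits words with the record."""
    words = tokenize(" ".join(record.get(field, "") for field in TEXT_FIELDS))
    return sorted(
        topic
        for topic, bag in TOPIC_WORD_BAGS.items()
        if len(words & bag) >= min_hits
    )


def filter_on_ingest(records: Iterable[Dict[str, str]], min_hits: int = 2) -> List[dict]:
    """Keep only records matching one or more topics, tagging each with its topics."""
    kept = []
    for record in records:
        topics = matching_topics(record, min_hits)
        if topics:
            kept.append({**record, "topics": topics})
    return kept


harvested = [
    {"title": "Gold miner at a silver mine, Nevada", "subject": "Mining"},
    {"title": "Portrait of a Boston banker"},
    {"title": "Locomotive at the depot"},
]
accepted = filter_on_ingest(harvested)
```

Records matching no topic are dropped here; a real routine might instead queue them for review, since those rejects are one source of evidence for the re-clustering question above.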