California Digital Library
Clustering :: American West Project
Experimentation
—Enough metadata in OAI records to get good results?
—Explore process/workload
—Harvested approx. 360K records from OAI sets likely relevant to AmWest, from partners and other data providers
—Did 7 topic model runs on this prototype “collection”
•Used dc:title, dc:description, dc:subject only from harvested records
“For real” clustering
—Approx. 240K records from AmWest partners only
—Did 4 topic model runs using the same DC elements noted above
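A minimal sketch of how the three Dublin Core elements used in both rounds (dc:title, dc:description, dc:subject) might be pulled from a harvested oai_dc record and joined into one text string for topic modeling. This is not the project's actual pipeline: the function name `record_to_text` and the sample record are hypothetical, and only the two namespace URIs come from the OAI-PMH and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace used inside oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"
# The only elements fed to the topic model, per the slide.
FIELDS = ("title", "description", "subject")


def record_to_text(record_xml: str) -> str:
    """Join dc:title, dc:description and dc:subject values into one string.

    Hypothetical helper: all other elements (dc:creator, dc:date, ...) are ignored.
    """
    root = ET.fromstring(record_xml)
    parts = []
    for field in FIELDS:
        # Elements may repeat (e.g. several dc:subject values), so collect all.
        for el in root.iter(f"{{{DC_NS}}}{field}"):
            if el.text and el.text.strip():
                parts.append(el.text.strip())
    return " ".join(parts)


# Invented sample record, for illustration only.
SAMPLE = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Map of the Oregon Trail</dc:title>
  <dc:creator>Unknown</dc:creator>
  <dc:description>Hand-drawn route map.</dc:description>
  <dc:subject>Overland journeys</dc:subject>
  <dc:subject>Maps</dc:subject>
</oai_dc:dc>"""

if __name__ == "__main__":
    print(record_to_text(SAMPLE))
```

Values are grouped by element in the order listed in `FIELDS`, not in document order; for a bag-of-words topic model that ordering should not matter.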