Clustering vs. classification
• Clustering is the main focus
  - Huge amount of data
  - Needed a tool to “find the topic”
  - Preferably a disjunctive tool (one that can place a file under more than one topic)
• Classification is a secondary focus
  - Have a potential classification (UM’s browse)
  - Marrying it to the current system is nigh on impossible
Clustering algorithms divide a data set into natural groups (clusters). Instances in the same cluster are similar to each other in that they share certain properties.
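
To make "natural groups" concrete, here is a minimal sketch using k-means, chosen only because it is short; it is not the tool these slides refer to. The function name `kmeans`, the sample points, and the starting centroids are all made up for illustration. Note that k-means puts each instance in exactly one cluster, so it does not meet the stated preference for placing a file under more than one topic.

```python
from math import dist  # Euclidean distance, Python 3.8+

def kmeans(points, centroids, iterations=10):
    """Illustrative k-means: assign points to the nearest centroid, re-center, repeat."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    return clusters, centroids

# Hypothetical data: two visibly separate groups of 2-D instances.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
clusters, centers = kmeans(points, centroids=[points[0], points[3]])
```

With this data the first three points end up in one cluster and the last three in the other: instances in the same cluster are close to each other, which is the "similar properties" idea expressed as distance.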