Clustering Algorithms
Different clustering algorithms use the same bag-of-words representation, but produce different results.
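As a rough illustration of the shared input, the sketch below builds a bag-of-words count matrix from raw strings. The function name `bag_of_words` and the whitespace tokenizer are assumptions made for this example, not something the slide specifies.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocab, rows): rows[d][t] counts how often vocab[t] occurs in docs[d].

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split, which real pipelines usually refine.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # Fixed, sorted vocabulary so every document maps to the same columns.
    vocab = sorted({word for tokens in tokenized for word in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[word] for word in vocab])
    return vocab, rows


vocab, rows = bag_of_words(["dog bites man", "man bites dog", "the dog sat"])
print(vocab)
for row in rows:
    print(row)
```

Word order is discarded, so the first two documents get identical rows. Every algorithm on this slide starts from a matrix of this kind.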
NMF, Latent Semantic Analysis (LSA)
- Tells you which documents are semantically close
- Tells you which words are close to which documents (helps with synonymy in IR)
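A minimal sketch of the LSA idea, assuming NumPy is available: a truncated SVD of the document-term count matrix places documents and words in one low-dimensional space, where cosine similarity can be compared. The toy corpus, the helper names `lsa` and `cosine`, and the choice of rank 2 are all illustrative assumptions. NMF would factor the same matrix under a non-negativity constraint instead.

```python
import numpy as np


def lsa(doc_term_counts, rank):
    """Project documents and terms into a shared rank-dimensional space.

    Hypothetical helper: rows of the input are documents, columns are terms.
    """
    A = np.asarray(doc_term_counts, dtype=float)
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    doc_coords = U[:, :rank] * S[:rank]      # one row per document
    term_coords = Vt[:rank].T * S[:rank]     # one row per term
    return doc_coords, term_coords


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Made-up corpus: "car" and "automobile" never share a document,
# but both co-occur with "engine".
vocab = ["automobile", "car", "engine", "flower", "garden", "petal"]
X = [
    [0, 1, 1, 0, 0, 0],  # "car engine"
    [1, 0, 1, 0, 0, 0],  # "automobile engine"
    [0, 0, 0, 1, 0, 1],  # "flower petal"
    [0, 0, 0, 1, 1, 0],  # "flower garden"
]
docs, terms = lsa(X, rank=2)
print("car vs automobile:", round(cosine(terms[1], terms[0]), 3))
print("doc 0 vs word 'automobile':", round(cosine(docs[0], terms[0]), 3))
print("doc 0 vs word 'flower':", round(cosine(docs[0], terms[3]), 3))
```

In the raw count matrix the columns for "car" and "automobile" are orthogonal. In the latent space they end up close together, and the document "car engine" is close to the word "automobile" even though it never contains it. This is the sense in which LSA helps with synonymy.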
Probabilistic Clustering (e.g. Topic Model)
- Determines the topics addressed in the collection
- Assigns multiple topics to each document
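The slide does not say which topic model is meant. As one common instance, the sketch below uses latent Dirichlet allocation fitted by collapsed Gibbs sampling. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy word-id documents are all assumptions chosen for illustration. The point is the shape of the output: each document gets a distribution over topics rather than a single label.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids.
    Returns theta, where theta[d][k] estimates the share of topic k in document d.
    """
    rng = random.Random(seed)
    # z[d][i] is the topic currently assigned to token i of document d.
    z = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return [
        [(count + alpha) / (len(doc) + num_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


# Made-up corpus over a 6-word vocabulary; the middle document mixes both themes.
toy_docs = [[0, 1, 2, 0, 1], [0, 1, 3, 4, 5], [3, 4, 5, 5, 3]]
theta = lda_gibbs(toy_docs, num_topics=2, vocab_size=6)
for row in theta:
    print([round(p, 2) for p in row])
```

Each row of `theta` sums to 1 and can put weight on several topics at once, which is what "assigns multiple topics to each document" refers to.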
K-Means or Hierarchical Clustering
- Groups the documents into K clusters (topics)
- Assigns one topic to each document
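For contrast, here is a small pure-Python sketch of K-Means (Lloyd's algorithm) on bag-of-words count vectors. The function names, the toy vectors, and the farthest-first initialization are assumptions made so that the run is deterministic; practical implementations typically use randomized seeding such as k-means++.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100):
    """Return (labels, centroids); labels[i] is the single cluster of points[i]."""
    # Farthest-first initialization (an illustrative, deterministic choice).
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iters):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up count vectors: two sports-like and two politics-like documents.
toy_vectors = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 1, 2],
]
labels, centroids = kmeans(toy_vectors, k=2)
print(labels)
```

Unlike the topic-model output, `labels` holds exactly one cluster index per document, so a document that mixes two themes is still forced into a single cluster.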
Environment