How Automated is Automated?
Automated subject indexing requires significant human input. But once it is set up, one can easily index millions of documents.
Not Automated
- creating preprocessing rules
- creating stopword list
- iteratively running clustering
- identifying junk topics
- interpreting and hand-labeling topics
- choosing number of topics
- mapping topics to a standard set of subject headings
- figuring out when to re-cluster
- integrating automatically generated topics into existing systems

Automated
- preprocessing
- clustering
- classification
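
To make the split concrete, here is a minimal Python sketch of the automated side (preprocessing and classification) running on top of the human-supplied pieces. It is an illustration only, not the system described in these slides: the stopword list, the topic labels, and the topic word sets are all hypothetical stand-ins for what a person would curate after a clustering run, and the overlap-count classifier is a deliberately simple substitute for a real topic model.

```python
import re
from collections import Counter

# Hypothetical stopword list; in practice this is curated by hand.
STOPWORDS = {"the", "a", "of", "and", "in", "for", "to", "is", "on"}

# Hypothetical hand-labeled topics: label -> top words from a clustering run.
TOPICS = {
    "Machine learning": {"learning", "model", "training", "neural", "data"},
    "Astronomy": {"star", "galaxy", "telescope", "planet", "orbit"},
}

def preprocess(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def classify(text):
    """Return the label whose topic words occur most often in the text, or None."""
    counts = Counter(preprocess(text))
    scores = {label: sum(counts[w] for w in words) for label, words in TOPICS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

docs = [
    "Training a neural model on the data",
    "The telescope observed a distant galaxy",
]
labels = [classify(d) for d in docs]
```

Once the stopwords and labeled topics exist, the loop over `docs` needs no further human input, which is why the same setup can be applied to millions of documents.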
Lessons Learned