What we need
- Running the tool locally, with a local WSDL instance, would save lots (and lots) of time
- Better set names… does this mean a better algorithm?
- Ability to cluster by any criteria, not just topic, i.e., a post-processing module
- Disjunctive clustering, meaning clustering filenames rather than files, so as not to hog storage
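
As a rough illustration of the last two wishes, here is a minimal sketch of what a post-processing step over filenames might look like. It is not the tool's actual design: `cluster_filenames`, `by_extension`, and `by_directory` are hypothetical names, and letting one filename land in more than one cluster is an assumption about what is wanted. Only filename strings are stored, so no file contents are copied.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Callable, Dict, Iterable, List


def cluster_filenames(
    filenames: Iterable[str],
    criterion: Callable[[str], Iterable[str]],
) -> Dict[str, List[str]]:
    """Group filenames under every label the criterion returns for them.

    Hypothetical helper: only the filename strings are kept, never the files.
    """
    clusters: Dict[str, List[str]] = defaultdict(list)
    for name in filenames:
        for label in criterion(name):
            clusters[label].append(name)
    return dict(clusters)


def by_extension(name: str) -> List[str]:
    """Example criterion: one label per file, its lowercased extension."""
    return [PurePosixPath(name).suffix.lower() or "(none)"]


def by_directory(name: str) -> List[str]:
    """Example criterion: one label per enclosing directory (may be several)."""
    return list(PurePosixPath(name).parent.parts)


if __name__ == "__main__":
    names = ["a/b/x.txt", "a/y.TXT", "z"]
    print(cluster_filenames(names, by_extension))
    print(cluster_filenames(names, by_directory))
```

Any other criterion (date, author, size bucket) would plug in the same way, as a function from a filename to one or more labels.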