Tuesday, 4/11/06
Halbert - DLF 2006 Spring
Focused Crawling
• "Focused crawling system" (FCS)
• Developed by Aaron Krowne and Saurabh Pathak, in cooperation with Donna Bergmark
• Built on Heritrix
• Purpose: efficient, topic-driven discovery of web resources
• "Focused" with a classifier (BOW)
• Based on BOW module for Heritrix
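
The general idea of focusing a crawl with a classifier can be sketched as follows. This is only an illustration under stated assumptions, not the FCS code and not the Heritrix or BOW API: the topic vocabulary, the `relevance` scorer, the `focused_crawl` function, the threshold, and the in-memory `PAGES` graph (which stands in for real HTTP fetches) are all hypothetical. A simple term-frequency score plays the role of the classifier.

```python
import heapq
import re
from collections import Counter

# Hypothetical topic vocabulary; a real classifier would be trained on examples.
TOPIC_TERMS = {"crawler", "crawling", "archive", "web"}


def relevance(text, topic_terms=TOPIC_TERMS):
    """Return the fraction of word tokens that belong to the topic vocabulary."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    hits = sum(counts[term] for term in topic_terms)
    return hits / len(words)


def focused_crawl(pages, seeds, threshold=0.2, limit=10):
    """Crawl `pages` (url -> (text, outlinks)), expanding only on-topic pages.

    Links inherit the score of the page that linked to them, so links found
    on highly relevant pages are visited first.
    """
    frontier = [(-1.0, url) for url in seeds]  # negated score gives a max-heap
    heapq.heapify(frontier)
    seen = set(seeds)
    accepted = []
    while frontier and len(accepted) < limit:
        _, url = heapq.heappop(frontier)
        if url not in pages:
            continue  # stands in for a failed fetch
        text, links = pages[url]
        score = relevance(text)
        if score < threshold:
            continue  # off-topic: drop the page and do not follow its links
        accepted.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    return accepted


# Toy web graph used in place of live fetching (illustrative data only).
PAGES = {
    "seed": ("web crawling archive", ["a", "b"]),
    "a": ("web crawler for archive collections", ["c"]),
    "b": ("cooking recipes and gardening tips", ["d"]),
    "c": ("crawler frontier design", []),
    "d": ("web archive crawling", []),
}

if __name__ == "__main__":
    print(focused_crawl(PAGES, ["seed"]))
```

In this toy graph, page `d` is on-topic but is reachable only through the off-topic page `b`, so the crawl never visits it. That is the trade-off of pruning on a relevance score: fewer wasted fetches, at the cost of missing resources behind off-topic pages.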