user-configurable relevance ranking, and we still need to do more research to
figure out if/how we can improve the algorithms that we have in place now.
|To create relevance
ranking in Endeca, we, the customer, select from a variety of available
modules, ordering them based on their importance in determining relevancy.
Different indexes can (and do) have different strategies: the Keyword Anywhere
and ISBN/ISSN indexes search very different types of data with different
goals. Because Keyword Anywhere searches nearly everything, we've spent the
most effort considering relevance for this search.
|Some of the modules
available in Endeca rank results based on phrase matching, field weighting,
term frequency, or static ordering of the value in a particular field in the
records. The modules work in a tiebreaker fashion: all results are first
ordered according to the first module, then any ties are broken by the second
module, and so on down the line.
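
The tiebreaker behavior described above can be sketched as a sort on a tuple of module scores. This is only an illustration, not Endeca's implementation or configuration syntax: the two scoring functions, the record layout, and all names below are hypothetical stand-ins for "phrase matching" and "frequency of term".

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
# Hypothetical module type: takes a record and a query, returns a score
# where a higher number means "more relevant".
Module = Callable[[Record, str], int]


def phrase_module(record: Record, query: str) -> int:
    """Score 1 if the exact phrase appears in any field, else 0."""
    phrase = query.lower()
    return int(any(phrase in value.lower() for value in record.values()))


def frequency_module(record: Record, query: str) -> int:
    """Count how often the query terms occur across all fields."""
    terms = query.lower().split()
    words = " ".join(record.values()).lower().split()
    return sum(words.count(term) for term in terms)


def rank(records: List[Record], query: str, modules: List[Module]) -> List[Record]:
    """Order records by the first module; later modules only break ties."""
    # Tuples compare element by element, so the second score is consulted
    # only when the first scores are equal. Scores are negated so that
    # higher scores sort first.
    return sorted(records, key=lambda r: tuple(-m(r, query) for m in modules))


records = [
    {"title": "Fishing in the deep blue sea", "author": "Jones"},
    {"title": "Deep sea fishing", "author": "Smith"},
    {"title": "Deep sea fishing and deep sea diving", "author": "Lee"},
]
ranked = rank(records, "deep sea", [phrase_module, frequency_module])
```

Here the two records containing the exact phrase come first, and the frequency score decides the order between them.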
|At NCSU, we emphasize:
|Original query (no thesaurus, stemming, spell correction)
is most relevant
|Exact phrase the user entered is more relevant than terms occurring
separately in the field
|Find the most relevant field that contains the search term(s). We
have complete control over ordering the fields. For instance, we emphasize
Title matches as more relevant than Author matches which are more relevant
than Table of Contents matches. We still need to study the effects of
changing this ordering to improve relevance ranking.
|How many fields in the record contain the terms? The more
fields, the more relevant the record.
7 total modules for Keyword Anywhere
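
Two of the emphases listed above (field ordering and the number of matching fields) can be sketched as a sort key. This is a rough illustration under stated assumptions, not the NCSU configuration: the field names, the whitespace tokenization, and the "any term matches" rule are all hypothetical, and the remaining modules are not modeled.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]

# Hypothetical field names, listed from most to least relevant
# (Title over Author over Table of Contents).
FIELD_PRIORITY = ["title", "author", "toc"]


def field_contains(record: Record, field: str, terms: List[str]) -> bool:
    """True if any query term appears as a word in the given field."""
    words = record.get(field, "").lower().split()
    return any(term in words for term in terms)


def best_field_rank(record: Record, terms: List[str]) -> int:
    """Position of the highest-priority matching field (0 is best)."""
    for position, field in enumerate(FIELD_PRIORITY):
        if field_contains(record, field, terms):
            return position
    return len(FIELD_PRIORITY)  # no field matched


def fields_matched(record: Record, terms: List[str]) -> int:
    """Number of fields in the record that contain a query term."""
    return sum(field_contains(record, f, terms) for f in FIELD_PRIORITY)


def sort_key(record: Record, query: str) -> Tuple[int, int]:
    """Best field first; more matching fields breaks ties."""
    terms = query.lower().split()
    return (best_field_rank(record, terms), -fields_matched(record, terms))


catalog = [
    {"title": "Origin of species", "author": "Darwin", "toc": "variation"},
    {"title": "Darwin", "author": "Browne", "toc": "early life"},
    {"title": "Darwin letters", "author": "Darwin", "toc": "letters"},
]
ranked = sorted(catalog, key=lambda r: sort_key(r, "darwin"))
```

Changing the order of `FIELD_PRIORITY` is the kind of experiment the notes mention as still needing study.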