Digital Library Federation Forum, April 12, 2006
Unbundling the ILS: Deploying an e-commerce catalog search solution
Relevance defined
•Relevance ranking in Endeca – select from a variety of modules and order them based on importance.
•Relevance most important in Keyword Anywhere - searches all fields.
•At NCSU…
1.Original query term(s) (no thesaurus, stemming, spell correction)
2.Exact phrase match
3.Field ranking (Title higher than Author higher than Table of Contents)
4.Number of fields that contain term(s) …
Endeca offers user-configurable relevance ranking, and we still need to do more research to figure out if/how we can improve the algorithms that we have in place now.
To create relevance ranking in Endeca, we, the customer, select from a variety of available modules and order them by their importance in determining relevance.
Different search indexes can (and do) have different strategies – the Keyword Anywhere and ISBN/ISSN indexes search very different types of data with different goals. Because Keyword Anywhere searches nearly everything, we’ve spent the most effort considering relevance for this search.
Some of the modules available in Endeca rank results based on phrase matching, field weighting, term frequency, or a static ordering of the values in a particular field in the records. The modules work in tiebreaker fashion: all results are first ordered according to the first module, then any ties are broken by the second module, and so on down the line.
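The tiebreaker behavior described above can be sketched as a sort on a tuple of module scores. This is only an illustration, not Endeca's API: the module functions, the `rank` helper, and the sample records are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Module = Callable[[Record, str], int]  # hypothetical: higher score = more relevant


def phrase_module(record: Record, query: str) -> int:
    """Score 1 if the exact query phrase appears in any field, else 0."""
    q = query.lower()
    return int(any(q in value.lower() for value in record.values()))


def frequency_module(record: Record, query: str) -> int:
    """Count occurrences of the query terms across all fields."""
    terms = query.lower().split()
    return sum(value.lower().split().count(t)
               for value in record.values() for t in terms)


def rank(records: List[Record], query: str, modules: List[Module]) -> List[Record]:
    # Tuples compare left to right, so a later module only matters
    # when every earlier module produced a tie.
    return sorted(records,
                  key=lambda r: tuple(m(r, query) for m in modules),
                  reverse=True)


# Made-up sample data for illustration only.
records = [
    {"title": "Sea sea sea deep", "notes": ""},
    {"title": "Deep sea", "notes": ""},
    {"title": "Deep sea fishing", "notes": "sea"},
]
phrase_first = rank(records, "deep sea", [phrase_module, frequency_module])
frequency_first = rank(records, "deep sea", [frequency_module, phrase_module])
```

With phrase matching first, the two records containing the exact phrase outrank the record that has more term occurrences; swapping the module order puts the high-frequency record on top.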
At NCSU, we emphasize:
•Original query (no thesaurus, stemming, spell correction) is most relevant
•Exact phrase the user entered is more relevant than terms occurring separately in the field
•Find the most relevant field that contains the search term(s). We have complete control over ordering the fields. For instance, we emphasize Title matches as more relevant than Author matches, which are more relevant than Table of Contents matches. We still need to study the effects of changing this ordering to improve relevance ranking.
•How many fields in the record contain the terms? The more fields, the more relevant the record.
•And on…7 total modules for Keyword Anywhere
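As a rough sketch of how three of the criteria above (exact phrase, field ranking, number of matching fields) might combine, the following builds a sort key per record. It is an assumption-laden illustration rather than the NCSU configuration: the field names, the `FIELD_PRIORITY` list, and the helper functions are hypothetical, and the original-query criterion and the remaining modules are left out.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]

# Hypothetical field names, most relevant first (Title > Author > Table of Contents).
FIELD_PRIORITY = ["title", "author", "toc"]


def field_contains(record: Record, field: str, terms: List[str]) -> bool:
    """True if any query term appears as a word in the given field."""
    words = record.get(field, "").lower().split()
    return any(t in words for t in terms)


def field_rank(record: Record, terms: List[str]) -> int:
    """Position of the best-ranked matching field (lower is better)."""
    for position, field in enumerate(FIELD_PRIORITY):
        if field_contains(record, field, terms):
            return position
    return len(FIELD_PRIORITY)


def sort_key(record: Record, query: str) -> Tuple[bool, int, int]:
    q = query.lower()
    terms = q.split()
    has_phrase = any(q in record.get(f, "").lower() for f in FIELD_PRIORITY)
    matched = sum(field_contains(record, f, terms) for f in FIELD_PRIORITY)
    # Ascending sort: phrase matches first, then best field, then most fields.
    return (not has_phrase, field_rank(record, terms), -matched)


# Made-up sample records for illustration only.
catalog = [
    {"id": "r1", "title": "Moby Dick", "author": "Melville", "toc": ""},
    {"id": "r2", "title": "Whaling history", "author": "", "toc": "Moby Dick in context"},
    {"id": "r3", "title": "Moby Dick", "author": "", "toc": "Moby Dick chapter list"},
    {"id": "r4", "title": "Dick and Moby", "author": "", "toc": ""},
]
ordered = sorted(catalog, key=lambda r: sort_key(r, "moby dick"))
```

Here the record matching the phrase in both Title and Table of Contents comes first, a Title-only phrase match follows, a Table of Contents phrase match comes third, and the record with the terms out of order comes last.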