Tips for using the CDB ht://Dig Search Interface


The CDB search interface is implemented with the ht://Dig package; its htsearch program executes the search. See the description of the htsearch method for a general summary of how documents are matched and ranked. The CDB search interface is configured to use the following search algorithm:

exact:1 substring:0.7 synonyms:0.5 endings:0.1

This means that exact matches of the search word(s) are given the most weight, followed by words that contain a search word as a substring, and then by synonyms of a search word. The least weight is given to matches that share the root of a search word but have a different ending.
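The weighting can be illustrated with a minimal Python sketch. This is not the ht://Dig implementation: the synonym table, the list of endings, and the function name match_weight are hypothetical, and the sketch assumes that a substring match means a document word containing the search word. It only shows how the four weights rank different kinds of match.

```python
# Weights copied from the configured search algorithm line.
WEIGHTS = {"exact": 1.0, "substring": 0.7, "synonyms": 0.5, "endings": 0.1}

# Hypothetical toy data; a real installation has its own synonym and endings databases.
SYNONYMS = {"temperature": {"heat"}}
ENDINGS = ("s", "es", "ed", "ing")


def root(word):
    """Crude stemming for illustration: strip one known ending, if present."""
    for ending in ENDINGS:
        if word.endswith(ending) and len(word) > len(ending) + 2:
            return word[: -len(ending)]
    return word


def match_weight(query_word, doc_word):
    """Return the weight of the strongest way doc_word matches query_word, or 0.0."""
    q, d = query_word.lower(), doc_word.lower()
    if d == q:
        return WEIGHTS["exact"]
    if q in d:  # assumption: the document word contains the search word
        return WEIGHTS["substring"]
    if d in SYNONYMS.get(q, set()):
        return WEIGHTS["synonyms"]
    if root(q) == root(d):
        return WEIGHTS["endings"]
    return 0.0


if __name__ == "__main__":
    for query, doc in [("CDF", "CDF"), ("CDF", "NCDF"),
                       ("temperature", "heat"), ("heated", "heating")]:
        print(query, doc, match_weight(query, doc))
```

Under these assumptions, a document word that matches exactly always outranks one that matches only as a substring, a synonym, or a different ending of the same root.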

The search matches and output can be modified using the menus and checkboxes on the search interface. These are described below.


Match
  • All: Retrieves pointers to documents that contain all of the strings entered in the search field. For example, searching on the phrase island temperature will retrieve all documents containing the word island and the word temperature, but not documents containing only the word island or only the word temperature.
  • Any: Retrieves pointers to documents that contain any of the strings entered in the search field. For example, searching on the phrase island temperature will retrieve all documents containing the word island (even if they don't contain the word temperature), all documents containing the word temperature (even if they don't contain the word island), as well as documents containing both of the words island and temperature.
  • Boolean: Boolean searches require at least two search words/strings that are joined by a boolean qualifier (and, or, not). For example, "wind not rain" is acceptable, but "not rain" on its own is unacceptable.
    • Not: indicates exclusion from the search. Documents that contain the element following the "not" are left out of the results.
    • And: similar to All (see above). The search results will consist of only those documents which satisfy both elements joined by the "and."
    • Or: similar to Any (see above). Only one of the elements in the "or" phrase must be satisfied.
    • ( ): groups boolean phrases together. Statements within the innermost set of parentheses will be satisfied first, then the next innermost, etc.
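The three Match modes can be sketched in Python as follows. This is an illustration under stated assumptions, not the htsearch implementation: each document is reduced to a set of lowercase words, the function names are hypothetical, and operators outside parentheses are applied strictly left to right, which may differ from the precedence htsearch uses.

```python
import re


def match_all(query, doc_words):
    """All: every search word must appear in the document."""
    return all(word in doc_words for word in query.lower().split())


def match_any(query, doc_words):
    """Any: at least one search word must appear in the document."""
    return any(word in doc_words for word in query.lower().split())


def match_boolean(query, doc_words):
    """Boolean: words joined by and/or/not, with optional parentheses."""
    tokens = re.findall(r"\(|\)|[^\s()]+", query.lower())
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("missing search word")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = expression()  # innermost parentheses are resolved first
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if token in ("and", "or", "not", ")"):
            raise ValueError("expected a search word, found " + repr(token))
        return token in doc_words

    def expression():
        nonlocal pos
        value = operand()
        while pos < len(tokens) and tokens[pos] in ("and", "or", "not"):
            op = tokens[pos]
            pos += 1
            right = operand()
            if op == "and":
                value = value and right
            elif op == "or":
                value = value or right
            else:  # "not" joins two elements: the left must match, the right must not
                value = value and not right
        return value

    result = expression()
    if pos != len(tokens):
        raise ValueError("unexpected token " + repr(tokens[pos]))
    return result


if __name__ == "__main__":
    doc = {"island", "temperature"}
    print(match_all("island temperature", doc))
    print(match_any("island wind", doc))
    print(match_boolean("(island or wind) and temperature", doc))
```

In this sketch a query that starts with a qualifier, such as "not rain", raises a ValueError, mirroring the rule that a boolean search needs two elements joined by a qualifier.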
Output format
  • Long: Displays the title of each retrieved document along with a brief document content description.
  • Short: Displays only the title of the retrieved documents.
Show [ ] matches per page
Use this menu to select the number of results that you wish to see displayed on each page.
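The menu choices described so far can also be pictured as a query string sent to htsearch. The sketch below assumes that the CDB form passes the standard htsearch CGI inputs (words, method, format, matchesperpage) unchanged; the endpoint URL and the function name build_search_url are hypothetical, and the actual CDB form may use different field names or values.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the real location of the htsearch CGI program.
HTSEARCH_URL = "https://example.org/cgi-bin/htsearch"

# Assumed mapping from the menu labels to htsearch input values.
METHODS = {"All": "and", "Any": "or", "Boolean": "boolean"}
FORMATS = {"Long": "builtin-long", "Short": "builtin-short"}


def build_search_url(words, match="All", output="Long", per_page=10):
    """Build a search URL from the Match, Output format, and matches-per-page choices."""
    params = {
        "words": words,
        "method": METHODS[match],
        "format": FORMATS[output],
        "matchesperpage": per_page,
    }
    return HTSEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("island temperature", match="Any", output="Short", per_page=25))
```

Building the URL this way makes it easy to bookmark or script a search, provided the assumed parameter names match what the CDB form actually submits.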
Sort

Still looking for information on...
  • searching for a specific phrase instead of individual words (i.e., specifying words in order)
  • controlling "wildcard effect" when entering something like "CDF" -- how to indicate that you don't want "NCDF"?