The collection is typically static with no significant change in content after the initial indexing run. Documents can be of any size and of different formats, such as HTML, PDF, or Microsoft Word. These documents are stored in a document table. Searching is enabled by first indexing the document collection.
Queries usually consist of words or phrases. Application users can specify logical combinations of words and phrases using operators such as OR and AND. Other query operations can be used to improve the search results, such as stemming, proximity searching, and wildcarding.
An important factor for this type of application is retrieving documents relevant to a query while retrieving as few non-relevant documents as possible. The most relevant documents must be ranked high in the result list.
The queries for this type of application are best served with a CONTEXT index on your document table. To query this index, the application uses the SQL CONTAINS operator in the WHERE clause of a SELECT statement.
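As a minimal sketch of this pattern, the following statements create a document table, build a CONTEXT index on it, and run a few queries. The table name docs, the column names, and the index name docs_text_idx are hypothetical and chosen only for illustration. The search terms are likewise arbitrary.

```sql
-- Hypothetical document table; all names here are illustrative assumptions.
CREATE TABLE docs (
  id       NUMBER PRIMARY KEY,
  title    VARCHAR2(200),
  doc_text CLOB
);

-- Build a CONTEXT index on the column that holds the document content.
CREATE INDEX docs_text_idx ON docs (doc_text)
  INDEXTYPE IS CTXSYS.CONTEXT;

-- Logical combination of terms with AND and OR.
-- The label 1 links CONTAINS to SCORE(1), so the most relevant rows sort first.
SELECT SCORE(1) AS relevance, id, title
FROM   docs
WHERE  CONTAINS(doc_text, 'oracle AND (database OR server)', 1) > 0
ORDER  BY SCORE(1) DESC;

-- Stemming: $ matches words that share a stem with the given term.
SELECT id, title FROM docs WHERE CONTAINS(doc_text, '$index') > 0;

-- Proximity: both terms must occur within a span of 10 words.
SELECT id, title FROM docs WHERE CONTAINS(doc_text, 'NEAR((ranking, relevance), 10)') > 0;

-- Wildcard: % matches any sequence of characters.
SELECT id, title FROM docs WHERE CONTAINS(doc_text, 'retriev%') > 0;
```

CONTAINS returns a relevance score for each row, so the comparison with 0 keeps only the documents that match. Ordering by SCORE in descending order places the most relevant documents at the top of the result list.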
Figure 1-1 Overview of Text Query Application