About Document Collection Applications

In a document collection application, the collection is typically static, with no significant change in content after the initial indexing run. Documents can be of any size and in different formats, such as HTML, PDF, or Microsoft Word. These documents are stored in a document table. To enable searching, you first index the document collection.
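As an illustration, a document table might look like the following minimal sketch. The table name docs and its columns are hypothetical, not taken from this guide. A BLOB column is assumed here so that binary formats such as PDF and Microsoft Word can be stored alongside HTML.

```sql
-- Hypothetical document table; all names are illustrative assumptions.
CREATE TABLE docs (
  id    NUMBER PRIMARY KEY,   -- unique document identifier
  title VARCHAR2(200),        -- short label to show in result lists
  text  BLOB                  -- document content (HTML, PDF, Word, ...)
);
```

Each row holds one document. A CLOB or VARCHAR2 column could be used instead if the collection contains only plain text or HTML.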

Queries usually consist of words or phrases. Application users can specify logical combinations of words and phrases using operators such as OR and AND. Other query operations can be used to improve the search results, such as stemming, proximity searching, and wildcarding.
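The following queries sketch how these operations might be written. They assume a hypothetical table docs with columns id and text, and a CONTEXT index on the text column. The search terms are illustrative only.

```sql
-- Assumes a hypothetical table docs(id, text) with a CONTEXT index on text.

-- Logical AND: both words must appear in the document.
SELECT id FROM docs WHERE CONTAINS(text, 'oracle AND database') > 0;

-- Logical OR: either the phrase or the single word may appear.
SELECT id FROM docs WHERE CONTAINS(text, '(document collection) OR library') > 0;

-- Stemming: $ expands the term to words that share its linguistic root.
SELECT id FROM docs WHERE CONTAINS(text, '$index') > 0;

-- Proximity: both terms must occur within a span of 10 words.
SELECT id FROM docs WHERE CONTAINS(text, 'NEAR((document, index), 10)') > 0;

-- Wildcard: % matches any sequence of characters.
SELECT id FROM docs WHERE CONTAINS(text, 'retriev%') > 0;
```

The terms that stemming matches depend on the language and stemmer configured for the index, so results for the $ operator can vary between installations.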

An important factor for this type of application is retrieving the documents relevant to a query while retrieving as few non-relevant documents as possible. The most relevant documents must rank high in the result list.

The queries for this type of application are best served with a CONTEXT index on your document table. To query this index, the application uses the SQL CONTAINS operator in the WHERE clause of a SELECT statement.
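A minimal sketch of this pattern follows. It assumes a hypothetical document table docs with columns id, title, and text. The index name and the search term are likewise illustrative.

```sql
-- Assumes a hypothetical table docs(id, title, text); all names are illustrative.

-- Build the CONTEXT index on the document column.
CREATE INDEX docs_text_idx ON docs (text)
  INDEXTYPE IS CTXSYS.CONTEXT;

-- Query with CONTAINS in the WHERE clause. The label 1 ties the
-- SCORE call to this CONTAINS call so that results can be ordered
-- with the most relevant documents first.
SELECT SCORE(1) AS relevance, id, title
FROM   docs
WHERE  CONTAINS(text, 'oracle', 1) > 0
ORDER  BY SCORE(1) DESC;
```

Because the collection is largely static, the index is typically built once after loading. If documents are added later, the index must be synchronized before the new rows become searchable, for example with CTX_DDL.SYNC_INDEX.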

Figure 1-1 Overview of Text Query Application
