About Classification of a Document

Documents are classified according to predefined rules. Each rule selects documents for a category. For instance, a query rule of 'presidential elections' might select documents for a category about politics.

Oracle Text provides several types of classification. One type is simple, or rule-based, classification, discussed here, in which you create both the document categories and the rules for categorizing documents. With supervised classification, Oracle Text derives the rules from a set of training documents that you provide. With clustering, Oracle Text does all the work for you, deriving both rules and categories.

See Also:

"Overview of Document Classification" for more information on classification

To create a simple classification application for document content using Oracle Text, you create rules. Rules are essentially a table of queries that categorize document content. You index these rules in a CTXRULE index. To classify an incoming stream of text, use the MATCHES operator in the WHERE clause of a SELECT statement. See Figure 2-2 for the general flow of a classification application.

Figure 2-2 Overview of a Document Classification Application
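
The flow described above can be sketched in SQL roughly as follows. This is a minimal illustration, not a definitive implementation: the table name news_categories, its columns, the sample rules, and the sample document text are all hypothetical. Only the CTXRULE index type and the MATCHES operator come from the text.

```sql
-- Hypothetical rules table: each row pairs a category with a query rule.
CREATE TABLE news_categories (
  category_id NUMBER PRIMARY KEY,
  category    VARCHAR2(30),
  query       VARCHAR2(2000)
);

-- Sample rules (illustrative only).
INSERT INTO news_categories (category_id, category, query)
  VALUES (1, 'politics', 'presidential elections');
INSERT INTO news_categories (category_id, category, query)
  VALUES (2, 'sports', 'baseball OR soccer');
COMMIT;

-- Index the rules in a CTXRULE index.
CREATE INDEX news_categories_idx ON news_categories (query)
  INDEXTYPE IS CTXSYS.CTXRULE;

-- Classify one incoming document: MATCHES returns a value greater
-- than 0 for each rule whose query the document text satisfies.
SELECT category
  FROM news_categories
 WHERE MATCHES(query,
         'Turnout was high in the presidential elections this year.') > 0;
```

In an application, the literal document text would typically be supplied through a bind variable, and the returned categories would be stored alongside the document.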
