Oracle Text supplies a knowledge base for English and French. The supplied knowledge base contains the information used to perform theme analysis. Theme analysis includes theme indexing, ABOUT queries, and theme extraction with the CTX_DOC package.
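As a minimal sketch of theme indexing followed by an ABOUT query, the statements below assume a hypothetical table news_docs, a hypothetical index news_docs_idx, and a hypothetical lexer preference my_theme_lexer. None of these names come from this guide. The BASIC_LEXER attributes INDEX_THEMES and THEME_LANGUAGE are set explicitly here only to make the dependency on the knowledge base visible.

```sql
-- Hypothetical table; the names are illustrative assumptions.
CREATE TABLE news_docs (
  id   NUMBER PRIMARY KEY,
  text CLOB
);

-- Lexer preference with theme indexing turned on for English.
BEGIN
  ctx_ddl.create_preference('my_theme_lexer', 'BASIC_LEXER');
  ctx_ddl.set_attribute('my_theme_lexer', 'INDEX_THEMES', 'YES');
  ctx_ddl.set_attribute('my_theme_lexer', 'THEME_LANGUAGE', 'ENGLISH');
END;
/

-- CONTEXT index that stores themes derived from the knowledge base.
CREATE INDEX news_docs_idx ON news_docs (text)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('LEXER my_theme_lexer');

-- ABOUT query: return documents whose themes relate to the concept "politics".
SELECT id, SCORE(1) AS relevance
  FROM news_docs
 WHERE CONTAINS(text, 'ABOUT(politics)', 1) > 0
 ORDER BY SCORE(1) DESC;
```

With a theme component in the index, the ABOUT operator can match documents on concepts rather than only on the literal query word.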
The knowledge base is a hierarchical tree of concepts and categories. It has six main branches:
Science and technology
Business and economics
Government and military
Social environment
Geography
Abstract ideas and concepts
See Also: Oracle Text Reference for the breakdown of the category hierarchy
The supplied knowledge base is like a thesaurus in that it is hierarchical and contains broader term, narrower term, and related term information. You can therefore improve the accuracy of theme analysis by augmenting the knowledge base with an industry-specific thesaurus whose new terms are linked to existing terms.
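To illustrate linking new terms to existing terms, here is a sketch of a small thesaurus import file of the kind the ctxload utility loads before the ctxkbtc utility compiles it into the knowledge base. The terms arthroscopy and orthopedic surgery are hypothetical. The file also assumes that the top-level branch is named exactly "science and technology" in the knowledge base, so check the category hierarchy in Oracle Text Reference before relying on that name. Each term starts in the first column, and its relationships are indented beneath it, with BT marking a broader term.

```text
arthroscopy
 BT orthopedic surgery
orthopedic surgery
 BT science and technology
```

Here the new term orthopedic surgery is attached to an existing branch, and arthroscopy is attached beneath it, so both terms inherit a position in the concept hierarchy.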
You can also extend theme functionality to other languages by compiling a language-specific thesaurus into a knowledge base.
Knowledge bases can be in any single-byte character set. Supplied knowledge bases are in WE8ISO8859P1. You can store an extended knowledge base in another character set such as US7ASCII.
This section contains the following topics.