A stopword is a word that is not indexed. Stopwords are usually low-information words in a given language, such as "this" and "that" in English.
By default, Oracle Text provides a list of stopwords, called a stoplist, for indexing a given language. You can modify this list or create your own with the CTX_DDL package. You specify the stoplist in the parameter string of CREATE INDEX.
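As a minimal sketch of creating a stoplist and attaching it to an index, the following assumes a table named docs with a text column named text, and a user who is allowed to execute CTX_DDL. The names my_stoplist, docs, and docs_idx are hypothetical and chosen only for illustration.

```sql
-- Hypothetical names: my_stoplist, docs, text, docs_idx.
BEGIN
  -- Create a new basic (single-language) stoplist.
  CTX_DDL.CREATE_STOPLIST('my_stoplist', 'BASIC_STOPLIST');

  -- Add the words that should not be indexed.
  CTX_DDL.ADD_STOPWORD('my_stoplist', 'this');
  CTX_DDL.ADD_STOPWORD('my_stoplist', 'that');
END;
/

-- Name the stoplist in the parameter string of CREATE INDEX.
CREATE INDEX docs_idx ON docs(text)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('STOPLIST my_stoplist');
```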
A stoptheme is a word that is prevented from being theme-indexed or from contributing to a theme. You can add stopthemes with the CTX_DDL package.
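For illustration, a stoptheme can be added with CTX_DDL.ADD_STOPTHEME. This sketch assumes that a stoplist named my_stoplist (a hypothetical name) already exists. The theme banking is only an example value.

```sql
-- Assumes a stoplist named my_stoplist (hypothetical) already exists.
BEGIN
  -- Prevent "banking" from being theme-indexed or contributing to a theme.
  CTX_DDL.ADD_STOPTHEME('my_stoplist', 'banking');
END;
/
```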
You can search document themes with the ABOUT operator. You can retrieve document themes programmatically with the CTX_DOC PL/SQL package.
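The two approaches might look like the following sketch. It assumes a table docs with columns id and text, a CONTEXT index named docs_idx built with theme indexing enabled, and a document whose primary key is 1. All of these names and values are hypothetical.

```sql
-- Search document themes with the ABOUT operator.
SELECT id
FROM   docs
WHERE  CONTAINS(text, 'ABOUT(politics)') > 0;

-- Retrieve the themes of one document into an in-memory table.
-- Run SET SERVEROUTPUT ON first to see the output in SQL*Plus.
DECLARE
  the_themes CTX_DOC.THEME_TAB;
BEGIN
  -- '1' is the text key of the document (assumed here to be its primary key).
  CTX_DOC.THEMES('docs_idx', '1', the_themes);

  FOR i IN 1 .. the_themes.COUNT LOOP
    DBMS_OUTPUT.PUT_LINE(the_themes(i).theme || ': ' || the_themes(i).weight);
  END LOOP;
END;
/
```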
At query time, the language of the query is inherited from the query template, or from the session language (if no language is specified through the query template).
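As a sketch, the query language can be set explicitly through the lang attribute of the query template's textquery element. The table and column names are hypothetical. The ALTER SESSION statement shows one way the session language is commonly set, on the assumption that NLS_LANGUAGE is the setting that applies here.

```sql
-- Language taken from the query template.
SELECT id
FROM   docs
WHERE  CONTAINS(text,
         '<query>
            <textquery lang="ENGLISH" grammar="CONTEXT">ABOUT(politics)</textquery>
          </query>') > 0;

-- No language in the template: the session language is used instead.
-- Assumption: the session language is the NLS_LANGUAGE setting.
ALTER SESSION SET NLS_LANGUAGE = 'ENGLISH';

SELECT id
FROM   docs
WHERE  CONTAINS(text, 'ABOUT(politics)') > 0;
```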
You can also create multi-language stoplists to hold language-specific stopwords. A multi-language stoplist is useful when you use the MULTI_LEXER to index a table that contains documents in different languages, such as English, German, and Japanese.
At index creation, the language column of each document is examined, and only the stopwords for that language are eliminated. At query time, the session language setting determines the active stopwords, just as it determines the active lexer when you use the multi-lexer.
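A multi-language setup could be sketched as follows. It assumes a table docs_ml with a text column named text and a language column named lang. The preference, stoplist, table, and index names are all hypothetical, and only English and German are shown for brevity.

```sql
-- Hypothetical names: multi_stop, english_lexer, german_lexer,
-- global_lexer, docs_ml, lang, docs_ml_idx.
BEGIN
  -- Multi-language stoplist: each stopword is tagged with its language.
  CTX_DDL.CREATE_STOPLIST('multi_stop', 'MULTI_STOPLIST');
  CTX_DDL.ADD_STOPWORD('multi_stop', 'die', 'german');
  CTX_DDL.ADD_STOPWORD('multi_stop', 'or',  'english');

  -- Multi-lexer with one sub-lexer per language and a required default.
  CTX_DDL.CREATE_PREFERENCE('english_lexer', 'BASIC_LEXER');
  CTX_DDL.CREATE_PREFERENCE('german_lexer',  'BASIC_LEXER');
  CTX_DDL.CREATE_PREFERENCE('global_lexer',  'MULTI_LEXER');
  CTX_DDL.ADD_SUB_LEXER('global_lexer', 'default', 'english_lexer');
  CTX_DDL.ADD_SUB_LEXER('global_lexer', 'german',  'german_lexer');
END;
/

-- The language column tells the index which sub-lexer and which
-- stopwords apply to each row.
CREATE INDEX docs_ml_idx ON docs_ml(text)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('LEXER global_lexer LANGUAGE COLUMN lang STOPLIST multi_stop');
```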