Oracle Data Mining supports unstructured text within columns of VARCHAR2, CHAR, CLOB, BLOB, and BFILE, as described in Table 7-1.
Table 7-1 Column Data Types That May Contain Unstructured Text
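Data Type | Description |
---|---|
BFILE and BLOB | Oracle Data Mining interprets BLOB and BFILE as text only if you identify the columns as text when you create the model. If you do not identify the columns as text, then CREATE_MODEL returns an error. See "Configuring a Text Attribute". |
CLOB | Oracle Data Mining interprets CLOB as text. |
CHAR | Oracle Data Mining interprets CHAR as categorical by default. If you identify the columns as text when you create the model, then Oracle Data Mining interprets them as text. |
VARCHAR2 | Oracle Data Mining interprets VARCHAR2 with data length > 4000 as text. Oracle Data Mining interprets VARCHAR2 with data length <= 4000 as categorical by default. If you identify these columns as text when you create the model, then Oracle Data Mining interprets them as text. |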
The settings described in Table 7-2 control the term extraction process for text attributes in a model. Instructions for specifying model settings are in "Specifying Model Settings".
Table 7-2 Model Settings for Text
Setting Name | Data Type | Setting Value | Description |
---|---|---|---|
ODMS_TEXT_POLICY_NAME | VARCHAR2(4000) | Name of an Oracle Text policy object created with CTX_DDL.CREATE_POLICY | Affects how individual tokens are extracted from unstructured text. See "Creating a Text Policy". |
ODMS_TEXT_MAX_FEATURES | INTEGER | 1 <= value <= 100000 | Maximum number of features to use from the document set (across all documents of each text column) passed to CREATE_MODEL. Default is 3000. |
A model can include one or more text attributes. A model with text attributes can also include categorical and numerical attributes.
To create a model that includes text attributes:
1. Create an Oracle Text policy object, as described in "Creating a Text Policy".
2. Specify the model configuration settings that are described in Table 7-2.
3. Specify which columns should be treated as text and, optionally, provide text transformation instructions for individual attributes. See "Configuring a Text Attribute".
4. Pass the model settings and text transformation instructions to DBMS_DATA_MINING.CREATE_MODEL. See "Embedding Transformations in a Model".
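
The steps above can be sketched in PL/SQL roughly as follows. This is a minimal illustration, not a definitive implementation. The policy name (my_text_policy), settings table (my_text_settings), build table (my_build_data), model name (my_text_model), and the columns cust_id, affinity_card, and comments are all hypothetical names chosen for the example. It also assumes that the user has the privileges needed to call CTX_DDL and that default Oracle Text preferences are acceptable for the policy.

```sql
-- Step 1: create an Oracle Text policy with default preferences.
-- The policy name is a hypothetical placeholder.
BEGIN
  ctx_ddl.create_policy('my_text_policy');
END;
/

-- Step 2: record the text-related model settings in a settings table.
-- The table name is a hypothetical placeholder.
CREATE TABLE my_text_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000)
);

BEGIN
  INSERT INTO my_text_settings (setting_name, setting_value)
    VALUES (dbms_data_mining.odms_text_policy_name, 'my_text_policy');
  -- Optional: lower the feature limit from its default of 3000.
  INSERT INTO my_text_settings (setting_name, setting_value)
    VALUES (dbms_data_mining.odms_text_max_features, '1000');
  COMMIT;
END;
/

-- Steps 3 and 4: mark the COMMENTS column as text and build the model.
-- Table, column, and model names are assumptions for illustration only.
DECLARE
  xformlist dbms_data_mining_transform.TRANSFORM_LIST;
BEGIN
  -- Arguments: list, attribute name, attribute subname, expression,
  -- reverse expression, attribute specification.
  dbms_data_mining_transform.SET_TRANSFORM(
    xformlist, 'comments', NULL, 'comments', NULL, 'TEXT');

  dbms_data_mining.create_model(
    model_name          => 'my_text_model',
    mining_function     => dbms_data_mining.classification,
    data_table_name     => 'my_build_data',
    case_id_column_name => 'cust_id',
    target_column_name  => 'affinity_card',
    settings_table_name => 'my_text_settings',
    xform_list          => xformlist);
END;
/
```

Because no algorithm setting is supplied in this sketch, the model is built with the default algorithm for the classification mining function. Any other columns in the build table are treated as ordinary categorical or numerical attributes alongside the text attribute.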
Note:
All algorithms except O-Cluster can support columns of unstructured text.
The use of unstructured text is not recommended for association rules (Apriori).