Choosing the Mining Function

The mining function is a required argument to the CREATE_MODEL procedure. A data mining function specifies a class of problems that can be modeled and solved.

Data mining functions implement either supervised or unsupervised learning. Supervised learning uses a set of independent attributes to predict the value of a dependent attribute or target. Unsupervised learning does not distinguish between dependent and independent attributes. Supervised functions are predictive. Unsupervised functions are descriptive.

Note:

In data mining terminology, a function is a general type of problem to be solved by a given approach to data mining. In SQL terminology, a function is an operator that returns a value.

In Oracle Data Mining documentation, the term function, or mining function, refers to a data mining function; the term SQL function, or SQL Data Mining function, refers to a SQL function for scoring (applying data mining models).

You can specify any of the values in Table 5-3 for the mining_function parameter to CREATE_MODEL.
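As a minimal sketch of how the mining function is passed, the following PL/SQL block calls CREATE_MODEL in the DBMS_DATA_MINING package with the CLASSIFICATION function. The model name, build table, and column names (MY_CLASS_MODEL, CUSTOMERS_BUILD, CUST_ID, RESPONDED) are hypothetical placeholders, not objects defined in this documentation. Because no settings table is supplied, the model would be built with the default algorithm for the chosen function.

```sql
-- Hypothetical object names: MY_CLASS_MODEL, CUSTOMERS_BUILD, CUST_ID, RESPONDED.
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'MY_CLASS_MODEL',
    -- The mining function is required; the package exposes one constant
    -- for each value listed in the table of mining model functions.
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'CUSTOMERS_BUILD',
    case_id_column_name => 'CUST_ID',
    -- Supervised functions need a target column.
    target_column_name  => 'RESPONDED');
END;
/
```

Substituting another constant, such as DBMS_DATA_MINING.CLUSTERING, would select a different function. Unsupervised functions take no target, so target_column_name would be omitted in that case.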


Table 5-3 Mining Model Functions

Mining_Function Value Description

ASSOCIATION

Association is a descriptive mining function. An association model identifies relationships (association rules) and the probability of their occurrence within a data set.

Association models use the Apriori algorithm.

ATTRIBUTE_IMPORTANCE

Attribute Importance is a predictive mining function. An attribute importance model identifies the relative importance of attributes in predicting a given outcome.

Attribute Importance models use the Minimum Description Length algorithm.

CLASSIFICATION

Classification is a predictive mining function. A classification model uses historical data to predict a categorical target.

Classification models can use Naive Bayes, Decision Tree, Logistic Regression, or Support Vector Machines. The default is Naive Bayes.

The classification function can also be used for anomaly detection. In this case, the Support Vector Machines (SVM) algorithm is used with a null target (One-Class SVM).

CLUSTERING

Clustering is a descriptive mining function. A clustering model identifies natural groupings within a data set.

Clustering models can use k-Means, O-Cluster, or Expectation Maximization. The default is k-Means.

FEATURE_EXTRACTION

Feature Extraction is a descriptive mining function. A feature extraction model creates a set of optimized attributes.

Feature extraction models can use Non-Negative Matrix Factorization or Singular Value Decomposition (which can also be used for Principal Component Analysis). The default is Non-Negative Matrix Factorization.

REGRESSION

Regression is a predictive mining function. A regression model uses historical data to predict a numerical target.

Regression models can use Support Vector Machines or Linear Regression. The default is Support Vector Machines.
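The non-default algorithms listed in the table are typically selected through a settings table passed to CREATE_MODEL. The sketch below illustrates this for the anomaly detection case described under CLASSIFICATION: the SVM algorithm is named in a settings table and the target is null. The object names (MY_SVM_SETTINGS, MY_ANOMALY_MODEL, CUSTOMERS_BUILD, CUST_ID) are hypothetical. The settings table layout and the ALGO_NAME and ALGO_SUPPORT_VECTOR_MACHINES constants come from the DBMS_DATA_MINING package rather than from this section, so treat the details as an illustration and check them against the package reference for your release.

```sql
-- Hypothetical settings table; the two-column layout follows the
-- DBMS_DATA_MINING convention of setting name and setting value.
CREATE TABLE my_svm_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000));

BEGIN
  -- Override the default classification algorithm (Naive Bayes) with SVM.
  INSERT INTO my_svm_settings (setting_name, setting_value)
  VALUES (DBMS_DATA_MINING.ALGO_NAME,
          DBMS_DATA_MINING.ALGO_SUPPORT_VECTOR_MACHINES);
  COMMIT;

  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'MY_ANOMALY_MODEL',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'CUSTOMERS_BUILD',
    case_id_column_name => 'CUST_ID',
    -- A null target with SVM requests a One-Class SVM (anomaly detection).
    target_column_name  => NULL,
    settings_table_name => 'MY_SVM_SETTINGS');
END;
/

-- Confirm which function and algorithm the model was built with.
SELECT model_name, mining_function, algorithm
FROM   user_mining_models
WHERE  model_name = 'MY_ANOMALY_MODEL';
```

The same pattern would apply to the other functions: keep the mining_function value and change only the ALGO_NAME setting to choose among the algorithms that function supports.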


See Also:

Oracle Data Mining Concepts for an introduction to mining functions