Changes in This Release for Oracle Data Mining Concepts

This preface lists changes in Oracle Data Mining Concepts.

Changes in Oracle Data Mining 12c Release 1 (12.1)

The following are changes in Oracle Data Mining Concepts for Oracle Database 12c Release 1 (12.1).

New Features

The following features are new in this release:

  • New clustering algorithm: Expectation Maximization

    In addition to enhanced k-Means and O-Cluster, Oracle Data Mining now supports Expectation Maximization, a probabilistic clustering algorithm that creates a density model of the data. The density model allows for an improved approach to combining data originating in different domains (for example, sales transactions and customer demographics, or structured data and text or other unstructured data).

    Because of the probabilistic nature of Expectation Maximization, its cluster assignment probabilities may be more reliable than those produced by k-Means or O-Cluster. Also, the Expectation Maximization algorithm automatically determines the optimal number of clusters needed to model the data.

    See Chapter 12, "Expectation Maximization".
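
    To make the idea of probabilistic cluster assignment concrete, the following is a minimal sketch of the general Expectation Maximization technique for a one-dimensional Gaussian mixture. It is not Oracle Data Mining's implementation: the function names, the sample data, and the starting means are hypothetical, and the number of components is fixed by the caller here, whereas the product determines it automatically.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_em_1d(data, initial_means, n_iter=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture with EM (illustrative only).

    Assumption: the number of components equals len(initial_means).
    Returns (weights, means, variances, responsibilities).
    """
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: probability that each component generated each point.
        resp = []
        for x in data:
            dens = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300  # guard against an empty component
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # keep the variance strictly positive
    return weights, means, variances, resp


# Hypothetical sample: two well-separated groups of values.
data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8, 5.1]
weights, means, variances, resp = fit_em_1d(data, initial_means=[0.0, 6.0])
```

    Each row of resp holds the assignment probabilities of one data point across the components, which is the kind of soft assignment that a density model makes available.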

  • New feature extraction algorithm: Singular Value Decomposition with Principal Component Analysis

    In addition to Non-Negative Matrix Factorization, Oracle Data Mining now supports Singular Value Decomposition and Principal Component Analysis, two powerful feature extraction methods that use orthogonal linear projections to capture the underlying variance of the data. Principal Component Analysis is implemented as a special scoring method for the Singular Value Decomposition algorithm.

    Singular Value Decomposition and Principal Component Analysis scale well to very large data sizes (both rows and attributes), and they have a powerful data compression capability. With the introduction of these new methods, Oracle Data Mining extends its feature extraction capabilities to new contexts involving time series, unstructured data, and very large numerical data sets (for example, data from sensors such as Radio Frequency Identification).

    See Chapter 19, "Singular Value Decomposition".
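
    The notion of a linear projection that captures variance can be sketched as follows. This example finds the first principal component by power iteration on the covariance matrix, a simple stand-in technique chosen for brevity; it is not Singular Value Decomposition itself and not the algorithm used by Oracle Data Mining. The function name and the sample rows are hypothetical.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (direction, variance, means) for the first principal component.

    Assumptions: rows is a non-empty list of equal-length numeric lists,
    the data has nonzero variance, and the starting vector is not
    orthogonal to the leading eigenvector.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # unit-length starting vector
    for _ in range(n_iter):
        # Multiply by the covariance matrix, then renormalize.
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance captured along v is the quadratic form v' * cov * v.
    variance = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
    return v, variance, means


# Hypothetical sample: points lying exactly on the line y = 2x.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, variance, means = first_principal_component(rows)

# Project each row onto the component: one number replaces two attributes.
scores = [sum((r[j] - means[j]) * direction[j] for j in range(len(r))) for r in rows]
```

    Because the sample points lie on a single line, one component carries all of the variance, so the one-dimensional scores describe the two-attribute data without loss. That is the compression effect in miniature.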

  • Generalized Linear Models enhanced to support feature selection and creation

    Generalized Linear Models provide great transparency, which may be achieved at the expense of accuracy. With the introduction of a feature selection and creation capability, Generalized Linear Models can maintain a high degree of accuracy without sacrificing transparency (the ability to explain the predictions made by the model).

    Feature selection is the process of selecting the most meaningful attributes. Feature creation is the process of combining attributes into features. With feature selection, Generalized Linear Models can be created with fewer predictors, leading to smaller models and faster scoring. With feature creation, Generalized Linear Models use non-linear terms (up to cubic terms), leading to more powerful models and increased transparency.

    See Chapter 13, "Generalized Linear Models".
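
    As an illustration of what combining attributes into non-linear terms up to cubic terms can look like, the sketch below enumerates every product of one, two, or three attributes for a single row. It shows the general idea only; how Oracle Data Mining generates and selects candidate features is not described here, and the function name, attribute names, and naming scheme for the terms are hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod


def create_polynomial_features(row, max_degree=3):
    """Build candidate terms of degree 1..max_degree from a row of attributes.

    row maps attribute names to numeric values. Each candidate is the
    product of 1 to max_degree attributes (repeats allowed), keyed by
    the attribute names joined with '*'.
    """
    names = sorted(row)
    features = {}
    for degree in range(1, max_degree + 1):
        for combo in combinations_with_replacement(names, degree):
            features["*".join(combo)] = prod(row[name] for name in combo)
    return features


# Hypothetical row with two numeric attributes.
candidates = create_polynomial_features({"age": 2.0, "income": 3.0})
```

    A feature selection step would then keep only the most meaningful of these candidates, which is what keeps the resulting model small.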

  • Significant enhancements in text mining

    These enhancements greatly simplify the data mining process (model build, deployment, and scoring) when unstructured text data is present in the input.

    See "Text Data". (See Oracle Data Mining User's Guide for details.)

  • Prediction details expanded

    The PREDICTION_DETAILS function now supports all predictive algorithms and returns more details about the predictors. New functions, CLUSTER_DETAILS and FEATURE_DETAILS, are introduced.

    See "In-Database Scoring" for information about the Data Mining SQL functions. (See Oracle Database SQL Language Reference for details.)

  • Dynamic scoring

    The Data Mining SQL functions now support an analytic clause for scoring data dynamically without a pre-defined model.

    See "In-Database Scoring" for information about the Data Mining SQL functions. (See Oracle Database SQL Language Reference for details.)

Desupported Features

The following features are no longer supported by Oracle. See Oracle Database Upgrade Guide for a complete list of desupported features in this release.

  • Oracle Data Mining Java API

  • Adaptive Bayes Network (ABN) algorithm

Other Changes

The following are additional changes in Oracle Data Mining Concepts for 12c Release 1 (12.1):