The query in Example 6-8 returns the attributes that are most important in predicting the age of customer 100010. The prediction functions apply a GLM regression model named GLMR_SH_Regr_sample to the data selected from mining_data_apply_v.
Example 6-8 Prediction Details for Regression
SELECT cust_id,
PREDICTION(GLMR_SH_Regr_sample USING *) pr,
PREDICTION_DETAILS(GLMR_SH_Regr_sample USING *) pd
FROM mining_data_apply_v
WHERE CUST_ID = 100010;
CUST_ID PR PD
------- ----- -----------
100010 25.45 <Details algorithm="Generalized Linear Model">
<Attribute name="FLAT_PANEL_MONITOR" actualValue="1" weight=".025" rank="1"/>
<Attribute name="OCCUPATION" actualValue="Crafts" weight=".019" rank="2"/>
<Attribute name="AFFINITY_CARD" actualValue="0" weight=".01" rank="3"/>
<Attribute name="OS_DOC_SET_KANJI" actualValue="0" weight="0" rank="4"/>
<Attribute name="BOOKKEEPING_APPLICATION" actualValue="1" weight="-.004" rank="5"/>
</Details>
The query in Example 6-9 returns the customers who work in Tech Support and are likely to use an affinity card (with more than 85% probability). The prediction functions apply an SVM classification model named svmc_sh_clas_sample. to the data selected from mining_data_apply_v. The query includes the prediction details, which show that education is the most important predictor.
Example 6-9 Prediction Details for Classification
SELECT cust_id, PREDICTION_DETAILS(svmc_sh_clas_sample, 1 USING *) PD
FROM mining_data_apply_v
WHERE PREDICTION_PROBABILITY(svmc_sh_clas_sample, 1 USING *) > 0.85
AND occupation = 'TechSup'
ORDER BY cust_id;
CUST_ID PD
------- ---------------------------------------------------------------------------------------
100029 <Details algorithm="Support Vector Machines" class="1">
<Attribute name="EDUCATION" actualValue="Assoc-A" weight=".199" rank="1"/>
<Attribute name="CUST_INCOME_LEVEL" actualValue="I: 170\,000 - 189\,999" weight=".044"
rank="2"/>
<Attribute name="HOME_THEATER_PACKAGE" actualValue="1" weight=".028" rank="3"/>
<Attribute name="BULK_PACK_DISKETTES" actualValue="1" weight=".024" rank="4"/>
<Attribute name="BOOKKEEPING_APPLICATION" actualValue="1" weight=".022" rank="5"/>
</Details>
100378 <Details algorithm="Support Vector Machines" class="1">
<Attribute name="EDUCATION" actualValue="Assoc-A" weight=".21" rank="1"/>
<Attribute name="CUST_INCOME_LEVEL" actualValue="B: 30\,000 - 49\,999" weight=".047"
rank="2"/>
<Attribute name="FLAT_PANEL_MONITOR" actualValue="0" weight=".043" rank="3"/>
<Attribute name="HOME_THEATER_PACKAGE" actualValue="1" weight=".03" rank="4"/>
<Attribute name="BOOKKEEPING_APPLICATION" actualValue="1" weight=".023" rank="5"/>
</Details>
100508 <Details algorithm="Support Vector Machines" class="1">
<Attribute name="EDUCATION" actualValue="Bach." weight=".19" rank="1"/>
<Attribute name="CUST_INCOME_LEVEL" actualValue="L: 300\,000 and above" weight=".046"
rank="2"/>
<Attribute name="HOME_THEATER_PACKAGE" actualValue="1" weight=".031" rank="3"/>
<Attribute name="BULK_PACK_DISKETTES" actualValue="1" weight=".026" rank="4"/>
<Attribute name="BOOKKEEPING_APPLICATION" actualValue="1" weight=".024" rank="5"/>
</Details>
100980 <Details algorithm="Support Vector Machines" class="1">
<Attribute name="EDUCATION" actualValue="Assoc-A" weight=".19" rank="1"/>
<Attribute name="FLAT_PANEL_MONITOR" actualValue="0" weight=".038" rank="2"/>
<Attribute name="HOME_THEATER_PACKAGE" actualValue="1" weight=".026" rank="3"/>
<Attribute name="BULK_PACK_DISKETTES" actualValue="1" weight=".022" rank="4"/>
<Attribute name="BOOKKEEPING_APPLICATION" actualValue="1" weight=".02" rank="5"/>
</Details>
The query in Example 6-10 returns the two customers that differ the most from the rest of the customers. The prediction functions apply an anomaly detection model named SVMO_SH_Clas_sample to the data selected from mining_data_apply_v. Anomaly detection uses a one-class SVM classifier.
Example 6-10 Prediction Details for Anomaly Detection
SELECT cust_id, pd FROM
(SELECT cust_id,
PREDICTION_DETAILS(SVMO_SH_Clas_sample, 0 USING *) pd,
RANK() OVER (ORDER BY prediction_probability(
SVMO_SH_Clas_sample, 0 USING *) DESC, cust_id) rnk
FROM mining_data_one_class_v)
WHERE rnk <= 2
ORDER BY rnk;
CUST_ID PD
---------- -----------------------------------------------------------------------------------
102366 <Details algorithm="Support Vector Machines" class="0">
<Attribute name="COUNTRY_NAME" actualValue="United Kingdom" weight=".078" rank="1"/>
<Attribute name="CUST_MARITAL_STATUS" actualValue="Divorc." weight=".027" rank="2"/>
<Attribute name="CUST_GENDER" actualValue="F" weight=".01" rank="3"/>
<Attribute name="HOUSEHOLD_SIZE" actualValue="9+" weight=".009" rank="4"/>
<Attribute name="AGE" actualValue="28" weight=".006" rank="5"/>
</Details>
101790 <Details algorithm="Support Vector Machines" class="0">
<Attribute name="COUNTRY_NAME" actualValue="Canada" weight=".068" rank="1"/>
<Attribute name="HOUSEHOLD_SIZE" actualValue="4-5" weight=".018" rank="2"/>
<Attribute name="EDUCATION" actualValue="7th-8th" weight=".015" rank="3"/>
<Attribute name="CUST_GENDER" actualValue="F" weight=".013" rank="4"/>
<Attribute name="AGE" actualValue="38" weight=".001" rank="5"/>
</Details>