For the most likely cluster assignments of customer 100955 (probability of assignment > 20%), the query in Example 6-6 produces the five attributes that have the most impact for each of the likely clusters. The clustering functions apply an Expectation Maximization model named em_sh_clus_sample
to the data selected from mining_data_apply_v
. The "5" specified in CLUSTER_DETAILS
is not required, because five attributes are returned by default.
SELECT S.cluster_id, probability prob, CLUSTER_DETAILS(em_sh_clus_sample, S.cluster_id, 5 USING T.*) det FROM (SELECT v.*, CLUSTER_SET(em_sh_clus_sample, NULL, 0.2 USING *) pset FROM mining_data_apply_v v WHERE cust_id = 100955) T, TABLE(T.pset) S ORDER BY 2 DESC; CLUSTER_ID PROB DET ---------- ----- ---------------------------------------------------------------------------- 14 .6761 <Details algorithm="Expectation Maximization" cluster="14"> <Attribute name="AGE" actualValue="51" weight=".676" rank="1"/> <Attribute name="HOME_THEATER_PACKAGE" actualValue="1" weight=".557" rank="2"/> <Attribute name="FLAT_PANEL_MONITOR" actualValue="0" weight=".412" rank="3"/> <Attribute name="Y_BOX_GAMES" actualValue="0" weight=".171" rank="4"/> <Attribute name="BOOKKEEPING_APPLICATION"actualValue="1" weight="-.003" rank="5"/> </Details> 3 .3227 <Details algorithm="Expectation Maximization" cluster="3"> <Attribute name="YRS_RESIDENCE" actualValue="3" weight=".323" rank="1"/> <Attribute name="BULK_PACK_DISKETTES" actualValue="1" weight=".265" rank="2"/> <Attribute name="EDUCATION" actualValue="HS-grad" weight=".172" rank="3"/> <Attribute name="AFFINITY_CARD" actualValue="0" weight=".125" rank="4"/> <Attribute name="OCCUPATION" actualValue="Crafts" weight=".055" rank="5"/> </Details>