SVM-Based Supervised Classification Example

The following example uses SVM-based classification. It uses essentially the same steps as the Decision Tree example. Some differences between the examples are as follows:

In this example, we set the SVM_CLASSIFIER preference with CTX_DDL.CREATE_PREFERENCE rather than setting it in CTX_CLS.TRAIN. (You can do it either way.)
In this example, our category table includes category descriptions, unlike the category table in the Decision Tree example. (You can do it either way.)
CTX_CLS.TRAIN takes fewer arguments than in the Decision Tree example, as rules are opaque to the user.

Perform the following steps to create a SVM-based supervised classification.

Create and populate the training document table

create table doc (id number primary key, text varchar2(2000));
insert into doc values(1,'1 2 3 4 5 6');
insert into doc values(2,'3 4 7 8 9 0');
insert into doc values(3,'a b c d e f');
insert into doc values(4,'g h i j k l m n o p q r');
insert into doc values(5,'g h i j k s t u v w x y z');

Create and populate the category table

create table testcategory (
        doc_id number, 
        cat_id number, 
        cat_name varchar2(100)
         );
insert into testcategory values (1,1,'number');
insert into testcategory values (2,1,'number');
insert into testcategory values (3,2,'letter');
insert into testcategory values (4,2,'letter');
insert into testcategory values (5,2,'letter');

Create the CONTEXT index on the document table

In this case, we create the index without populating.
```
create index docx on doc(text) indextype is ctxsys.context 
       parameters('nopopulate');
```

Set the SVM_CLASSIFIER

This can also be done in CTX.CLS_TRAIN.

exec ctx_ddl.create_preference('my_classifier','SVM_CLASSIFIER'); 
exec ctx_ddl.set_attribute('my_classifier','MAX_FEATURES','100');

Create the result (rule) table

create table restab (
  cat_id number,
  type number(3) not null,
  rule blob
 );

Perform the training

exec ctx_cls.train('docx', 'id','testcategory','doc_id','cat_id',
     'restab','my_classifier');

Create a CTXRULE index on the rules table

exec ctx_ddl.create_preference('my_filter','NULL_FILTER');
create index restabx on restab (rule) 
       indextype is ctxsys.ctxrule 
       parameters ('filter my_filter classifier my_classifier');

Now we can classify two unknown documents:

select cat_id, match_score(1) from restab 
       where matches(rule, '4 5 6',1)>50;

select cat_id, match_score(1) from restab 
       where matches(rule, 'f h j',1)>50;

drop table doc;
drop table testcategory;
drop table restab;
exec ctx_ddl.drop_preference('my_classifier');
exec ctx_ddl.drop_preference('my_filter');