A MATCHES
query finds all rows in a query table that match a given document. Assuming that a table querytable
has a CTXRULE
index associated with it, enter the following query:
SELECT classification FROM querytable WHERE MATCHES(query_string,:doc_text) > 0;
Note the bind variable :doc_text
which contains the document CLOB
to be classified.
Combining everything into a simple example:
create table queries ( query_id number, query_string varchar2(80) ); insert into queries values (1, 'oracle'); insert into queries values (2, 'larry or ellison'); insert into queries values (3, 'oracle and text'); insert into queries values (4, 'market share'); create index queryx on queries(query_string) indextype is ctxsys.ctxrule; select query_id from queries where matches(query_string, 'Oracle announced that its market share in databases increased over the last year.')>0
This query will return queries 1 (the word oracle appears in the document) and 4 (the phrase market share appears in the document), but not 2 (neither the word larry nor the word ellison appears, and not 3 (there is no text in the document, so it does not match the query).
Note that, in this example, the document was passed in as a string for simplicity. Typically, your document would be passed in a bind variable.
The document text used in a matches query can be VARCHAR2
or CLOB
. It does not accept BLOB
input, so you cannot match filtered documents directly. Instead, you must filter the binary content to CLOB
using the AUTO_FILTER
filter. For the following example, we make two assumptions: one, that the document data is in the bind variable :doc_blob
; and, two, that we have already defined a policy, my_policy
, that CTX_DOC.POLICY_FILTER
can use. For example:
declare doc_text clob; begin -- create a temporary CLOB to hold the document text doc_text := dbms_lob.createtemporary(doc_text, TRUE, DBMS_LOB.SESSION); -- create a simple policy for this example ctx_ddl.create_preference(preference_name => 'fast_filter', object_name => 'AUTO_FILTER'); ctx_ddl.set_attribute(preference_name => 'fast_filter', attribute_name => 'OUTPUT_FORMATTING', attribute_value => 'FALSE'); ctx_ddl.create_policy(policy_name => 'my_policy', filter => 'fast_filter); -- call ctx_doc.policy_filter to filter the BLOB to CLOB data ctx_doc.policy_filter('my_policy', :doc_blob, doc_text, FALSE); -- now do the matches query using the CLOB version for c1 in (select * from queries where matches(query_string, doc_text)>0) loop -- do what you need to do here end loop; dbms_lob.freetemporary(doc_text); end;
The procedure CTX_DOC.POLICY_FILTER
filters the BLOB
into the CLOB
data, because you need to get the text into a CLOB
to enter a MATCHES
query. It takes as one argument the name of a policy you have already created with CTX_DDL.CREATE_POLICY
.
Oracle Text Reference for information on CTX_DOC.POLICY_FILTER
If your file is text in the database character set, then you can create a BFILE
and load it to a CLOB
using the function DBMS_LOB.LOADFROMFILE
, or you can use UTL_FILE
to read the file into a temp CLOB
locator.
If your file needs AUTO_FILTER
filtering, then you can load the file into a BLOB
instead, and call CTX_DOC.POLICY_FILTER
, as previously shown.
Classifying Documents in Oracle Text for more extended classification examples