Oracle® Database Semantic Technologies Developer's Guide 11g Release 2 (11.2) Part Number E11828-10
The SEM_RDFCTX package contains subprograms (functions and procedures) to manage extractor policies and semantic indexes created for documents. To use the subprograms in this chapter, you should understand the conceptual and usage information in Chapter 4, "Semantic Indexing for Documents".
This chapter provides reference information about the subprograms, listed in alphabetical order.
SEM_RDFCTX.CREATE_POLICY(
policy_name IN VARCHAR2,
extractor mdsys.rdfctx_extractor,
preferences sys.XMLType DEFAULT NULL);
or
SEM_RDFCTX.CREATE_POLICY(
policy_name IN VARCHAR2,
base_policy IN VARCHAR2,
user_models mdsys.rdf_models default null);
Creates an extractor policy. (The first format is for a base policy; the second format is for a policy that is dependent on a base policy.)
policy_name: Name of the extractor policy.
extractor: An instance of a subtype of the RDFCTX_EXTRACTOR type that encapsulates the extraction logic for the information extractor.
preferences: Any preferences associated with the policy.
base_policy: Base extractor policy for a dependent policy.
user_models: List of user models for a dependent policy.
An extractor policy created using this procedure determines the characteristics of a semantic index that is created using the policy. Each extractor policy refers to an instance of an extractor type, either directly or indirectly. An extractor policy with a direct reference to an extractor type instance can be used to compose other extractor policies that include additional RDF models for ontologies.
The value assigned to the extractor parameter must be an instance of a direct or indirect subtype of type mdsys.rdfctx_extractor.
The RDF models specified in the user_models parameter must be accessible to the user that is creating the policy.
The preferences specified for an extractor policy determine the type of repository used for the documents to be indexed and other relevant information. For more information, see Section 4.8, "Indexing External Documents".
The following example creates an extractor policy using the gatenlp_extractor extractor type, which is included with the Oracle Database support for semantic indexing.
begin
  sem_rdfctx.create_policy (
    policy_name => 'SEM_EXTR',
    extractor   => mdsys.gatenlp_extractor());
end;
/
The following example creates a dependent policy for the previously created extractor policy, and it adds the user-defined RDF model geo_ontology to the dependent policy.
begin
  sem_rdfctx.create_policy (
    policy_name => 'SEM_EXTR_PLUS_GEOONT',
    base_policy => 'SEM_EXTR',
    user_models => SEM_MODELS ('geo_ontology'));
end;
/
SEM_RDFCTX.DROP_POLICY(
policy_name IN VARCHAR2);
Deletes (drops) an unused extractor policy.
policy_name: Name of the extractor policy.
An exception is generated if the specified policy is being used for a semantic index for documents or if a dependent extractor policy exists for the specified policy.
The following example drops the SEM_EXTR_PLUS_GEOONT extractor policy.
begin
  sem_rdfctx.drop_policy (policy_name => 'SEM_EXTR_PLUS_GEOONT');
end;
/
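Because a policy cannot be dropped while a dependent policy still refers to it, a base policy and its dependent policy are dropped in dependency order. The following is a minimal sketch, assuming that the SEM_EXTR and SEM_EXTR_PLUS_GEOONT policies from the CREATE_POLICY examples both exist and that neither is used by a semantic index:

```sql
begin
  -- Drop the dependent policy first; otherwise dropping SEM_EXTR raises an exception.
  sem_rdfctx.drop_policy (policy_name => 'SEM_EXTR_PLUS_GEOONT');
  -- The base policy no longer has dependents and can now be dropped.
  sem_rdfctx.drop_policy (policy_name => 'SEM_EXTR');
end;
/
```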
SEM_RDFCTX.MAINTAIN_TRIPLES(
index_name IN VARCHAR2,
where_clause IN VARCHAR2,
rdfxml_content sys.XMLType,
policy_name IN VARCHAR2 DEFAULT NULL,
action IN VARCHAR2 DEFAULT 'ADD');
Adds one or more triples to graphs that contain information extracted from specific documents.
index_name: Name of the semantic index for documents.
where_clause: A SQL predicate (WHERE clause text without the WHERE keyword) on the table in which the documents are stored, to identify the rows for which to maintain the index.
rdfxml_content: Triples, in the form of an RDF/XML document, to be added to the individual graphs corresponding to the documents.
policy_name: Name of the extractor policy. If policy_name is null (the default), the triples are added to the information extracted by the default (or the only) extractor policy for the index; if you specify a policy name, the triples are added to the information extracted by that policy.
action: Type of maintenance operation to perform on the triples. The only value currently supported is ADD (the default), which adds the triples that are specified in the rdfxml_content parameter.
The information extracted from the semantically indexed documents may be incomplete and lacking in proper context. This procedure enables a domain expert to add triples to individual graphs pertaining to specific semantically indexed documents, so that all subsequent SEM_CONTAINS queries can consider these triples in their document search criteria.
This procedure accepts the index name and WHERE clause text to identify the specific documents to be annotated with the additional triples. For example, the where_clause might be specified as a simple predicate involving numeric data, such as 'docId IN (1,2,3)'.
The following example annotates a specific document covered by the semantic index ArticleIndex, adding triples to the corresponding individual graph.
begin
  sem_rdfctx.maintain_triples(
    index_name     => 'ArticleIndex',
    where_clause   => 'docid = 15',
    rdfxml_content => sys.xmltype(
      '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
                xmlns:pred="http://myorg.com/pred/">
         <rdf:Description rdf:about="http://newscorp.com/Org/ExampleCorp">
           <pred:hasShortName
               rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
             Example
           </pred:hasShortName>
         </rdf:Description>
       </rdf:RDF>'));
end;
/
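The optional policy_name and action parameters can be combined with a predicate that selects several documents. The following is a sketch only: it assumes that the ArticleIndex index was created with an extractor policy named CITY_EXTR and that rows with docid values 1, 2, and 3 exist. The triple content is illustrative.

```sql
begin
  sem_rdfctx.maintain_triples(
    index_name     => 'ArticleIndex',
    -- Predicate only, without the WHERE keyword; selects three documents.
    where_clause   => 'docid IN (1,2,3)',
    rdfxml_content => sys.xmltype(
      '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                xmlns:pred="http://myorg.com/pred/">
         <rdf:Description rdf:about="http://newscorp.com/Org/ExampleCorp">
           <pred:hasShortName rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Example</pred:hasShortName>
         </rdf:Description>
       </rdf:RDF>'),
    -- Assumed policy name; the triples are added to the information extracted by this policy.
    policy_name    => 'CITY_EXTR',
    -- ADD is the only supported action and is also the default.
    action         => 'ADD');
end;
/
```

The same triples are added to the individual graph of each document that matches the predicate.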
SEM_RDFCTX.SET_DEFAULT_POLICY(
index_name IN VARCHAR2,
policy_name IN VARCHAR2);
Sets the default extractor policy for a semantic index that is configured with multiple extractor policies.
index_name: Name of the semantic index for documents.
policy_name: Name of the extractor policy to be used as the default extractor policy for the specified semantic index. Must be one of the extractor policies listed in the PARAMETERS clause of the CREATE INDEX statement that created index_name.
When you create a semantic index for documents, you can specify multiple extractor policies as a space-separated list of names in the PARAMETERS clause of the CREATE INDEX statement. As explained in Section 4.3, "Semantically Indexing Documents", the first policy from this list is used as the default extractor policy for all SEM_CONTAINS queries that do not identify an extractor policy by name. You can use the SEM_RDFCTX.SET_DEFAULT_POLICY procedure to set a different default policy for the index.
The following example sets CITY_EXTR as the default extractor policy for the ArticleIndex index.
begin
  sem_rdfctx.set_default_policy (
    index_name  => 'ArticleIndex',
    policy_name => 'CITY_EXTR');
end;
/
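For context, an index with more than one extractor policy might be created and then reconfigured as sketched below. The table name Newsfeed and column name article are hypothetical, the SEM_EXTR and CITY_EXTR policies are assumed to exist already, and the MDSYS.SEMCONTEXT index type is the one described in Chapter 4; verify the CREATE INDEX syntax there before relying on it.

```sql
-- Hypothetical table and column. SEM_EXTR is listed first, so it is the
-- initial default policy for SEM_CONTAINS queries that name no policy.
CREATE INDEX ArticleIndex ON Newsfeed (article)
  INDEXTYPE IS mdsys.SemContext
  PARAMETERS ('SEM_EXTR CITY_EXTR');

begin
  -- Make the second listed policy the default instead.
  sem_rdfctx.set_default_policy (
    index_name  => 'ArticleIndex',
    policy_name => 'CITY_EXTR');
end;
/
```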
SEM_RDFCTX.SET_EXTRACTOR_PARAM(
param_key IN VARCHAR2,
param_value IN VARCHAR2,
param_desc IN VARCHAR2);
Configures the Oracle Database semantic indexing support to work with external information extractors, such as Calais and GATE.
param_key: Key for the parameter to be set.
param_value: Value for the parameter to be set.
param_desc: Short description for the parameter to be set.
You must have the SYSDBA role to use this procedure.
To work with the Calais extractor type (see Section 4.9), you must specify values for the following parameters:
CALAIS_WS_ENDPOINT: Web service end point for Calais.
CALAIS_KEY: License key for Calais.
CALAIS_WS_SOAPACTION: SOAP action for the Calais Web service.
To work with the General Architecture for Text Engineering (GATE) extractor type (see Section 4.10), you must specify values for the following parameters:
GATE_NLP_HOST: Host for the GATE NLP Listener.
GATE_NLP_PORT: Port for the GATE NLP Listener.
In addition to these parameters, you may need to specify a value for the HTTP_PROXY parameter to work with information extractors or index documents that are outside the firewall.
A database instance has only one set of values for these parameters, and they are used for all semantic indexes that use the corresponding information extractor. You can use this procedure if you need to change the existing values of any of the parameters.
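As an illustration, the two GATE parameters might be set as in the following sketch. The host name and port number are hypothetical placeholders, and the description strings are free text. The arguments are passed positionally in the order key, value, description, and the block must be run by a suitably privileged user, as noted above.

```sql
begin
  -- Arguments: parameter key, parameter value, short description.
  sem_rdfctx.set_extractor_param (
    'GATE_NLP_HOST',
    'gatehost.example.com',          -- hypothetical host name
    'Host for the GATE NLP Listener');
  sem_rdfctx.set_extractor_param (
    'GATE_NLP_PORT',
    '7687',                          -- hypothetical port number
    'Port for the GATE NLP Listener');
end;
/
```

Calling the procedure again with the same key replaces the stored value, which then applies to every semantic index that uses that extractor.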
For examples, see the following sections: