The default control of access to the Oracle Database semantic data store is at the model level: the owner of a model can grant select, delete, and insert privileges on the model to other users by granting appropriate privileges on the view named RDFM_<model_name>. However, for applications with stringent security requirements, you can enforce a fine-grained access control mechanism by using the Oracle Label Security option of Oracle Database:
Oracle Label Security (OLS) for RDF data allows sensitivity labels to be associated with individual triples stored in an RDF model. For each query, access to specific triples is granted by comparing their labels with the user's session labels. Furthermore, a minimum sensitivity label for all triples describing a specific resource or all triples defined with a specific predicate can be enforced by assigning a sensitivity label directly to the resource or the predicate, respectively.
For information about using OLS, see Oracle Label Security Administrator's Guide.
Deprecation Notice:
Effective with Oracle Database Release 12c (12.1), Virtual Private Database (VPD) support in RDF Semantic Graph is deprecated for providing fine-grained access control, and will be removed in an upcoming major release. (Meanwhile, Appendix C contains information about this deprecated support.) You should not develop new RDF Semantic Graph applications that depend on VPD, and you should transition existing RDF Semantic Graph applications that depend on VPD to use Oracle Label Security (OLS) instead.
For more information, see My Oracle Support Note 1468273.1.
Oracle Label Security (OLS) for RDF data provides two options for securing semantic data:
Triple-level security (explained in Section 5.1), which is highly recommended for its performance and ease of use
Resource-level security (explained in Section 5.2), which is generally not recommended
To specify an option, use the SEM_RDFSA.APPLY_OLS_POLICY procedure with the appropriate rdfsa_options parameter value.
To switch from one option to the other, remove the existing policy by using the SEM_RDFSA.REMOVE_OLS_POLICY procedure, and then apply the new policy by using the SEM_RDFSA.APPLY_OLS_POLICY procedure with the appropriate rdfsa_options parameter value.
The triple-level security option provides a thin layer of RDF-specific capabilities on top of the Oracle Database native support for label security. This option provides better performance and is easier to use than the resource-level security (described in Section 5.2), especially for performing inference while using OLS. The main difference is that with triple-level security there is no need to assign labels, explicitly or implicitly, to individual triple resources (subjects, properties, objects).
To use triple-level security, specify SEM_RDFSA.TRIPLE_LEVEL_ONLY as the rdfsa_options parameter value when you execute the SEM_RDFSA.APPLY_OLS_POLICY procedure. For example:
EXECUTE sem_rdfsa.apply_ols_policy('defense', SEM_RDFSA.TRIPLE_LEVEL_ONLY);
Do not specify any of the other available parameters for the SEM_RDFSA.APPLY_OLS_POLICY procedure.
When you use triple-level security, OLS is applied to each semantic model in the network. That is, label security is applied to the relevant internal tables and to all the application tables; there is no need to manually apply policies to the application tables of existing semantic models. However, if you need to create additional models after applying the OLS policy, you must use the SEM_OLS.APPLY_POLICY_TO_APP_TAB procedure to apply OLS to the application table before creating the model. Similarly, if you have dropped a semantic model and you no longer need to protect the application table, you can use the SEM_OLS.REMOVE_POLICY_FROM_APP_TAB procedure. (These procedures are described in Chapter 12.)
With triple-level security, duplicate triples with different labels can be inserted in the semantic model. (Such duplicates are not allowed with resource-level security.) For example, assume that you have a triple with a very sensitive label, such as:
(<urn:X>,<urn:P>,<urn:Y>, "TOPSECRET")
This does not prevent a low-privileged (UNCLASSIFIED) user from inserting the triple (<urn:X>,<urn:P>,<urn:Y>, "UNCLASSIFIED"). Because SPARQL and SEM_MATCH do not return label information, a query will return both rows (assuming the user has appropriate privileges), and it will not be easy to distinguish between the TOPSECRET and UNCLASSIFIED triples.
To filter out such low-security triples when querying the semantic models, you can use one or more of the following options with SEM_MATCH:
POLICY_NAME specifies the OLS policy name.
MIN_LABEL specifies the minimum label for triples that are included in the query.
In other words, every triple whose label is strictly dominated by MIN_LABEL is excluded from the query. For example, to filter out the "UNCLASSIFIED" triple, you could use the following query (assuming the OLS policy name is DEFENSE and that the query user has read privileges over UNCLASSIFIED and TOPSECRET triples):
SELECT s, p, y
  FROM table(sem_match('{?s ?p ?y}', sem_models('TEST'), null, null, null, null,
                       'MIN_LABEL=TOPSECRET POLICY_NAME=DEFENSE'));
Note that the filtering in the preceding example occurs in addition to the security checks performed by the native OLS software.
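The interaction between the native OLS check and MIN_LABEL can be pictured with a small conceptual model. The following Python sketch is not Oracle code: it assumes level-only labels ranked by made-up numbers, and the names LEVELS, visible, and apply_min_label are hypothetical, chosen only for illustration.

```python
# Conceptual model only: level-only labels, ranked by assumed numbers.
LEVELS = {"UNCLASSIFIED": 1, "SECRET": 2, "TOPSECRET": 3}

# The duplicate triples from the text, each stored with its own label.
triples = [
    ("<urn:X>", "<urn:P>", "<urn:Y>", "TOPSECRET"),
    ("<urn:X>", "<urn:P>", "<urn:Y>", "UNCLASSIFIED"),
]

def visible(rows, session_label):
    """Rows whose label is dominated by the session label (the OLS read check)."""
    return [r for r in rows if LEVELS[r[3]] <= LEVELS[session_label]]

def apply_min_label(rows, min_label):
    """Drop rows whose label is strictly dominated by min_label."""
    return [r for r in rows if LEVELS[r[3]] >= LEVELS[min_label]]

seen = visible(triples, "TOPSECRET")            # both duplicates are readable
filtered = apply_min_label(seen, "TOPSECRET")   # only the TOPSECRET row remains
print(len(seen), len(filtered))
```

In this model the two filters are independent: the first reflects what the session is allowed to read, and the second narrows that result from below.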
After a triple has been inserted, you can view and update the label information through the CTXT1 column in the application table for the semantic model (assuming that you have the WRITEUP and WRITEDOWN privileges to modify the labels).
There are no restrictions on who can perform inference or bulk loading with triple-level security; all of the inferred or bulk loaded triples are inserted with the user's session row label. Note that you can change the session labels by using the SA_UTL package. (For more information, see Oracle Label Security Administrator's Guide.)
When triple-level security is turned on for RDF data stored in Oracle Database, asserted facts are tagged with data labels to enforce mandatory access control. In addition, when a user invokes the forward-chaining based inference function through the SEM_APIS.CREATE_ENTAILMENT procedure, the newly inferred relationships will be tagged with the current row label (SA_UTL.NUMERIC_ROW_LABEL).
These newly inferred relationships are derived solely based on the information that the user is allowed to access. These relationships do, however, share the same data label. This is understandable because a SEM_APIS.CREATE_ENTAILMENT call can be viewed as a three-step process: read operation, followed by a logical inference computation, followed by a write operation. The read operation gathers information upon which inference computation is based, and it is restricted by access privileges, the user's label, and the data labels; the logical inference computation step is purely mathematical; and the final write of inferred information into the entailed graph is no different from the same user asserting some new facts (which happen to be calculated by the previous step).
Having all inferred assertions tagged with a single label is sufficient if a user only owns a single label. It is, however, not fine-grained enough when there are multiple labels owned by the same user, which is a common situation in a multitenancy setup.
For example, assume a user sets its user label and data label as TopSecret, invokes SEM_APIS.CREATE_ENTAILMENT, switches to a weaker label named Secret, and finally performs a SPARQL query. The query will not be able to see any of those newly inferred relationships because they were all tagged with the TopSecret label. However, if the user switches back to the TopSecret label, now every single inferred relationship is visible. It is "all or nothing" (that is, all visible or nothing visible) as far as inferred relationships are concerned.
When multiple labels are available for use by a given user, you normally want to assign different labels to different inferred relationships. There are two ways to achieve this goal: invoking SEM_APIS.CREATE_ENTAILMENT multiple times, or using ladder-based inference (LBI).
Ladder-based inference, effective with Oracle Database 12c Release 1 (12.1), is probably the simpler and more convenient of the two approaches.
Invoking SEM_APIS.CREATE_ENTAILMENT Multiple Times
Assume a security policy named DEFENSE, a user named SCOTT, and a sequence of user labels Label1, Label2,..., Labeln owned by SCOTT. The following call by SCOTT sets the label as Label1, runs the inference for the first time, and tags the newly inferred triples with Label1:
EXECUTE sa_utl.set_label('defense',char_to_label('defense','Label1'));
EXECUTE sa_utl.set_row_label('defense',char_to_label('defense','Label1'));
EXECUTE sem_apis.create_entailment('inf', sem_models('contracts'), sem_rulebases('owlprime'), SEM_APIS.REACH_CLOSURE, null,'');
Now, SCOTT switches the label to Label2, runs the inference a second time, and tags the newly inferred triples with Label2. Obviously, if Label2 is dominated by Label1, then no new triples will be inferred because Label2 cannot see anything beyond what Label1 is allowed to see. If Label2 is not dominated by Label1, the read step of the inference process will probably see a different set of triples, and consequently the inference call can produce some new triples, which will in turn be tagged with Label2.
For the purpose of this example, assume the following condition holds true: for any 1 <= i < j <= n, Labelj is not dominated by Labeli.
EXECUTE sa_utl.set_label('defense',char_to_label('defense','Label2'));
EXECUTE sa_utl.set_row_label('defense',char_to_label('defense','Label2'));
EXECUTE sem_apis.create_entailment('inf', sem_models('contracts'), sem_rulebases('owlprime'), SEM_APIS.REACH_CLOSURE, null, 'ENTAIL_ANYWAY=T');
SCOTT continues the preceding actions using the rest of the labels in the label sequence: Label1, Label2, ..., Labeln. The last step will be as follows:
EXECUTE sa_utl.set_label('defense',char_to_label('defense','Labeln'));
EXECUTE sa_utl.set_row_label('defense',char_to_label('defense','Labeln'));
EXECUTE sem_apis.create_entailment('inf', sem_models('contracts'), sem_rulebases('owlprime'), SEM_APIS.REACH_CLOSURE, null, 'ENTAIL_ANYWAY=T');
After all these actions are performed, the inference graph probably consists of triples tagged with various different labels.
Using Ladder-Based Inference (LBI)
Basically, ladder-based inference (LBI) wraps in one API call all the actions described in the Invoking SEM_APIS.CREATE_ENTAILMENT Multiple Times approach. Visually, those actions are like climbing up a ladder. When proceeding from one label to the next, more asserted facts become visible or accessible (assuming the new label is not dominated by any of the previous ones), and therefore new relationships can be inferred.
The syntax to invoke LBI is shown in the following example.
EXECUTE sem_apis.create_entailment('inf', sem_models('contracts'), sem_rulebases('owlprime'), SEM_APIS.REACH_CLOSURE, null, null, ols_ladder_inf_lbl_seq=>'numericLabel1 numericLabel2 numericLabel3 numericLabel4' );
The parameter ols_ladder_inf_lbl_seq specifies a sequence of labels. This sequence is provided as a list of numeric labels delimited by spaces. When using LBI, it is a good practice to arrange the sequence of labels so that weaker labels are put before stronger labels. This will reduce the size of the inferred graph. (If labels do not dominate each other, they can be specified in any order.)
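The read, infer, and write cycle that LBI repeats for each label can be sketched as a toy model. The Python below is not how Oracle implements LBI; it assumes that a larger numeric label dominates a smaller one, uses a single rdfs:subPropertyOf rule in place of a rulebase, and the function names infer and ladder_inference are hypothetical.

```python
def infer(visible):
    """One rule: (s p o) and (p rdfs:subPropertyOf q) imply (s q o)."""
    sub = {(s, o) for s, p, o in visible if p == "rdfs:subPropertyOf"}
    return {(s, q, o) for s, p, o in visible for p2, q in sub if p == p2}

def ladder_inference(asserted, ladder):
    """asserted: list of (triple, label); ladder: labels in the order used."""
    inferred = []  # list of (triple, label)
    for label in ladder:
        # Read step: only rows dominated by the current label are visible.
        visible = {t for t, l in asserted + inferred if l <= label}
        # Inference step, repeated until no new triples appear.
        while True:
            new = infer(visible) - visible
            if not new:
                break
            # Write step: tag the new triples with the current label.
            inferred += [(t, label) for t in sorted(new)]
            visible |= new
    return inferred

asserted = [
    (("urn:A", "urn:expenseReportAmount", "100"), 1000),
    (("urn:expenseReportAmount", "rdfs:subPropertyOf", "urn:projExp"), 1000),
    (("urn:B", "urn:expenseReportAmount", "200"), 1500),
]
weak_first = ladder_inference(asserted, [1000, 1500])
strong_first = ladder_inference(asserted, [1500, 1000])
print(len(weak_first), len(strong_first))
```

With the weaker label first, each inferred triple is written once. With the stronger label first, the triple about urn:A is written at label 1500 and then again at label 1000, because the later, weaker step cannot see the earlier result. In this model, that is why ordering weaker labels first yields a smaller inferred graph.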
This section presents an extended example illustrating how to apply OLS triple-level security to semantic data. It assumes that OLS has been configured and enabled. The examples are very simplified, and do not reflect recommended practices regarding user names and passwords.
Unless otherwise indicated, perform the steps while connected AS SYSDBA.
Perform some necessary setup steps.
As SYSDBA, create database users named A, B, and C.
create user a identified by <password-for-a>;
grant connect, unlimited tablespace, resource to a;
create user b identified by <password-for-b>;
grant connect, unlimited tablespace, resource to b;
create user c identified by <password-for-c>;
grant connect, unlimited tablespace, resource to c;
As SYSDBA, create a security administrator and grant privileges.
CREATE USER fgac_admin identified by <password-for-fgac_admin>;
GRANT connect, unlimited tablespace,resource to fgac_admin;
GRANT SELECT ON mdsys.rdf_link$ to fgac_admin;
GRANT EXECUTE ON sa_components TO fgac_admin;
GRANT EXECUTE ON sa_user_admin TO fgac_admin;
GRANT EXECUTE ON sa_label_admin TO fgac_admin;
GRANT EXECUTE ON sa_policy_admin TO fgac_admin;
GRANT EXECUTE ON sa_sysdba to fgac_admin;
GRANT EXECUTE ON TO_LBAC_DATA_LABEL to fgac_admin;
GRANT lbac_dba to fgac_admin;
Connect as the security administrator and create a policy named defense.
CONNECT fgac_admin/<password-for-fgac_admin>
EXECUTE SA_SYSDBA.CREATE_POLICY('defense','ctxt1');
Create three security levels. (For simplicity, compartments and groups are omitted.)
EXECUTE SA_COMPONENTS.CREATE_LEVEL('defense',3000,'TS','TOP SECRET');
EXECUTE SA_COMPONENTS.CREATE_LEVEL('defense',2000,'SE','SECRET');
EXECUTE SA_COMPONENTS.CREATE_LEVEL('defense',1000,'UN','UNCLASSIFIED');
Create three labels.
EXECUTE SA_LABEL_ADMIN.CREATE_LABEL('defense',1000,'UN');
EXECUTE SA_LABEL_ADMIN.CREATE_LABEL('defense',1500,'SE');
EXECUTE SA_LABEL_ADMIN.CREATE_LABEL('defense',3100,'TS');
Assign labels and privileges.
EXECUTE SA_USER_ADMIN.SET_USER_LABELS('defense','A','UN');
EXECUTE SA_USER_ADMIN.SET_USER_LABELS('defense','B','SE');
EXECUTE SA_USER_ADMIN.SET_USER_LABELS('defense','C','TS');
EXECUTE SA_USER_ADMIN.SET_USER_LABELS('defense','fgac_admin','TS');
EXECUTE SA_USER_ADMIN.SET_USER_PRIVS('defense','FGAC_ADMIN', 'full');
Create a semantic model.
Create a model and share it with some other users.
CONNECT a/<password-for-a>
CREATE TABLE project_tpl (triple sdo_rdf_triple_s) compress for oltp;
EXECUTE sem_apis.create_sem_model('project', 'project_tpl', 'triple');
GRANT select on mdsys.rdfm_project to B;
GRANT select on mdsys.rdfm_project to C;
GRANT select, insert, update, delete on project_tpl to B, C;
Ensure that the bulk loading API can be executed.
GRANT insert on project_tpl to mdsys;
Apply the OLS policy for RDF.
CONNECT fgac_admin/<password-for-fgac_admin>
BEGIN
  sem_rdfsa.apply_ols_policy('defense', sem_rdfsa.TRIPLE_LEVEL_ONLY);
END;
/
Note that the application table now has an extra column named CTXT1:
CONNECT a/<password-for-a>
DESCRIBE project_tpl;
 Name                                      Null?    Type
 ----------------------------------------- -------- --------------------------
 TRIPLE                                             PUBLIC.SDO_RDF_TRIPLE_S
 CTXT1                                              NUMBER(10)
Add data to the semantic model.
-- User A uses incremental APIs to add semantic data
connect a/<password-for-a>
INSERT INTO project_tpl(triple) values (sdo_rdf_triple_s('project','<urn:A>','<urn:hasManager>','<urn:B>'));
INSERT INTO project_tpl(triple) values (sdo_rdf_triple_s('project','<urn:B>','<urn:hasManager>','<urn:C>'));
INSERT INTO project_tpl(triple) values (sdo_rdf_triple_s('project','<urn:A>','<urn:expenseReportAmount>','"100"'));
INSERT INTO project_tpl(triple) values (sdo_rdf_triple_s('project','<urn:expenseReportAmount>','rdfs:subPropertyOf','<urn:projExp>'));
COMMIT;

-- User B uses bulk API to add semantic data
connect b/<password-for-b>
CREATE TABLE project_stab(RDF$STC_GRAPH varchar2(4000), RDF$STC_sub varchar2(4000), RDF$STC_pred varchar2(4000), RDF$STC_obj varchar2(4000)) compress;
GRANT select on project_stab to mdsys;
-- For simplicity, data types are omitted.
INSERT INTO project_stab values(null, '<urn:B>','<urn:expenseReportAmount>','"200"');
INSERT INTO project_stab values(null, '<urn:proj1>','<urn:deadline>','"2012-12-25"');
EXECUTE sem_apis.bulk_load_from_staging_table('project','b','project_stab');

-- As User B, check the contents in the application table
connect b/<password-for-b>
SELECT * from a.project_tpl order by ctxt1;
SDO_RDF_TRIPLE_S(8.5963E+18, 7, 1.4711E+18, 2.0676E+18, 8.5963E+18) 1000
SDO_RDF_TRIPLE_S(5.1676E+18, 7, 8.5963E+18, 2.0676E+18, 5.1676E+18) 1000
SDO_RDF_TRIPLE_S(2.3688E+18, 7, 1.4711E+18, 4.6588E+18, 2.3688E+18) 1000
SDO_RDF_TRIPLE_S(7.6823E+18, 7, 4.6588E+18, 1.1911E+18, 7.6823E+18) 1000
SDO_RDF_TRIPLE_S(6.6322E+18, 7, 8.5963E+18, 4.6588E+18, 6.6322E+18) 1500
SDO_RDF_TRIPLE_S(8.4800E+18, 7, 6.2294E+18, 5.4118E+18, 8.4800E+18) 1500
6 rows selected.
SELECT count(1) from mdsys.rdfm_project;
6

-- As User A, check the contents in the application table
-- As expected, A can only see 4 triples
SQL> conn a/<password-for-a>
SQL> select * from a.project_tpl order by ctxt1;
SDO_RDF_TRIPLE_S(8.5963E+18, 7, 1.4711E+18, 2.0676E+18, 8.5963E+18) 1000
SDO_RDF_TRIPLE_S(5.1676E+18, 7, 8.5963E+18, 2.0676E+18, 5.1676E+18) 1000
SDO_RDF_TRIPLE_S(2.3688E+18, 7, 1.4711E+18, 4.6588E+18, 2.3688E+18) 1000
SDO_RDF_TRIPLE_S(7.6823E+18, 7, 4.6588E+18, 1.1911E+18, 7.6823E+18) 1000
SQL> select count(1) from mdsys.rdfm_project;
4

-- User C uses incremental APIs to add semantic data including 2 quads
connect c/<password-for-c>
INSERT INTO a.project_tpl(triple) values (sdo_rdf_triple_s('project','<urn:C>','<urn:expenseReportAmount>','"400"'));
INSERT INTO a.project_tpl(triple) values (sdo_rdf_triple_s('project','<urn:proj1>','<urn:hasBudget>','"10000"'));
INSERT INTO a.project_tpl(triple) values (sdo_rdf_triple_s('project:<urn:proj2>','<urn:proj2>','<urn:hasBudget>','"20000"'));
INSERT INTO a.project_tpl(triple) values (sdo_rdf_triple_s('project:<urn:proj2>','<urn:proj2>','<urn:dependsOn>','<urn:proj1>'));
COMMIT;
Query the data as different users using the default label.
-- Now as user A, B, C, execute the following query
select lpad(nvl(g, ' '), 20) || ' ' || s || ' ' || p || ' ' || o
  from table(sem_match('{ graph ?g { ?s ?p ?o }}', sem_models('project'), null, null, null, null, 'GRAPH_MATCH_UNNAMED=T' ))
 order by g, s, p, o;

connect a/<password-for-a>
-- Repeat the preceding query
SQL> /
urn:A urn:expenseReportAmount 100
urn:A urn:hasManager urn:B
urn:B urn:hasManager urn:C
urn:expenseReportAmount http://www.w3.org/2000/01/rdf-schema#subPropertyOf urn:projExp

SQL> connect b/<password-for-b>
SQL> /
urn:A urn:expenseReportAmount 100
urn:A urn:hasManager urn:B
urn:B urn:expenseReportAmount 200
urn:B urn:hasManager urn:C
urn:expenseReportAmount http://www.w3.org/2000/01/rdf-schema#subPropertyOf urn:projExp
urn:proj1 urn:deadline 2012-12-25

SQL> connect c/<password-for-c>
SQL> /
urn:proj2 urn:proj2 urn:dependsOn urn:proj1
urn:proj2 urn:proj2 urn:hasBudget 20000
urn:A urn:expenseReportAmount 100
urn:A urn:hasManager urn:B
urn:B urn:expenseReportAmount 200
urn:B urn:hasManager urn:C
urn:C urn:expenseReportAmount 400
urn:expenseReportAmount http://www.w3.org/2000/01/rdf-schema#subPropertyOf urn:projExp
urn:proj1 urn:deadline 2012-12-25
urn:proj1 urn:hasBudget 10000
As expected, different users (with different labels) can see different sets of triples in the project RDF graph.
Query the same data as a single user using different labels.
When user C works with the SE ("Secret") label instead of the default TS label, the same query used in the preceding step produces just 6 matches:
urn:A urn:expenseReportAmount 100
urn:A urn:hasManager urn:B
urn:B urn:expenseReportAmount 200
urn:B urn:hasManager urn:C
urn:expenseReportAmount http://www.w3.org/2000/01/rdf-schema#subPropertyOf urn:projExp
urn:proj1 urn:deadline 2012-12-25
6 rows selected.
If user C picks the weakest label ("unclassified"), then user C sees even fewer triples:
exec sa_utl.set_label('defense',char_to_label('defense','UN'));
exec sa_utl.set_row_label('defense',char_to_label('defense','UN'));
The same query used in the preceding step produces just 4 matches:
urn:A urn:expenseReportAmount 100
urn:A urn:hasManager urn:B
urn:B urn:hasManager urn:C
urn:expenseReportAmount http://www.w3.org/2000/01/rdf-schema#subPropertyOf urn:projExp
If user C wants to run the query only against triples and quads whose data label dominates "Secret":
-- First set the label back
exec sa_utl.set_label('defense',char_to_label('defense','TS'));
exec sa_utl.set_row_label('defense',char_to_label('defense','TS'));
select lpad(nvl(g, ' '), 20) || ' ' || s || ' ' || p || ' ' || o
  from table(sem_match('{ graph ?g { ?s ?p ?o }}', sem_models('project'), null, null, null, null, 'MIN_LABEL=SE POLICY_NAME=DEFENSE GRAPH_MATCH_UNNAMED=T' ))
 order by g, s, p, o;
The query response excludes those assertions made by user A:
urn:proj2 urn:proj2 urn:dependsOn urn:proj1
urn:proj2 urn:proj2 urn:hasBudget 20000
urn:B urn:expenseReportAmount 200
urn:C urn:expenseReportAmount 400
urn:proj1 urn:deadline 2012-12-25
urn:proj1 urn:hasBudget 10000
6 rows selected.
The same query can be executed as User A. However, no matches are returned, as expected.
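The match counts in this walkthrough follow directly from the row labels. As a quick cross-check, the following Python sketch (a model, not Oracle code) assumes that comparing the numeric label tags stands in for label dominance, which holds here only because the three labels are level-only and their tags are ordered like their levels; the function name matches is hypothetical.

```python
# Label tags from the CREATE_LABEL calls in the example.
UN, SE, TS = 1000, 1500, 3100

# Row labels of the ten triples and quads: 4 from A, 2 from B, 4 from C.
row_labels = [UN] * 4 + [SE] * 2 + [TS] * 4

def matches(session_label, min_label=None):
    """Count rows readable at session_label, optionally applying MIN_LABEL."""
    rows = [l for l in row_labels if l <= session_label]
    if min_label is not None:
        rows = [l for l in rows if l >= min_label]
    return len(rows)

print(matches(UN), matches(SE), matches(TS))   # default labels of A, B, C
print(matches(TS, SE), matches(UN, SE))        # MIN_LABEL=SE as C, then as A
```

The first line prints 4 6 10 and the second prints 6 0, matching the query results shown for users A, B, and C and for the MIN_LABEL=SE query.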
You can delete semantic data when OLS is enabled for RDF. In the following example, assume that SEM_RDFSA.APPLY_OLS_POLICY has been executed successfully, and that the same user setup and label designs are used as in the preceding example.
-- First, create a test table as user A and grant access to users B and C
connect a/<password-for-a>
create table test_tpl (triple sdo_rdf_triple_s) compress for oltp;
grant select, insert, update, delete on test_tpl to B, C;

-- The following will fail with an error message
-- "Error while creating triggers: If OLS
-- is enabled, you have to apply table policy
-- before creating an OLS-enabled model"
-- EXECUTE sem_apis.create_sem_model('test', 'test_tpl', 'triple');

-- You need to run this API first
connect fgac_admin/<password-for-fgac_admin>
EXECUTE sem_ols.apply_policy_to_app_tab('defense', 'A', 'TEST_TPL');

-- Now model creation (after OLS policy has been applied) can go through
connect a/<password-for-a>
EXECUTE sem_apis.create_sem_model('test', 'test_tpl', 'triple');
-- Grant access to the model view, which exists only after the model is created
grant select on mdsys.rdfm_test to B,C;

-- Add a triple as User A
INSERT INTO test_tpl(triple) values (sdo_rdf_triple_s('test','<urn:A>','<urn:p>','<urn:B>'));
COMMIT;

-- Add the same triple as User B
connect b/<password-for-b>
INSERT INTO a.test_tpl(triple) values (sdo_rdf_triple_s('test','<urn:A>','<urn:p>','<urn:B>'));
COMMIT;

-- Now User B can see both triples in the application table as well as the model view
set numwidth 20
SELECT * from a.test_tpl;
SDO_RDF_TRIPLE_S(8596269297967065604, 19, 1471072612573670395, 2812185635207236178, 8596269297967065604) 1000
SDO_RDF_TRIPLE_S(8596269297967065604, 19, 1471072612573670395, 2812185635207236178, 8596269297967065604) 1500
SELECT count(1) from mdsys.rdfm_test;
2

-- User A can only see one triple due to A's label assignment, as expected.
connect a/<password-for-a>
SELECT * from a.test_tpl;
SDO_RDF_TRIPLE_S(8596269297967065604, 19, 1471072612573670395, 2812185635207236178, 8596269297967065604) 1000
SELECT count(1) from mdsys.rdfm_test;
1

-- User A issues a delete to remove A's assertions
SQL> delete from a.test_tpl;
1 row deleted.
COMMIT;
Commit complete.

-- Now user A has no assertions left.
SELECT * from a.test_tpl;
no rows selected
SELECT count(1) from mdsys.rdfm_test;
0

-- Note that the preceding delete does not affect the same assertion made by B.
connect b/<password-for-b>
SELECT * from a.test_tpl;
SDO_RDF_TRIPLE_S(8596269297967065604, 19, 1471072612573670395, 2812185635207236178, 8596269297967065604) 1500
SELECT count(1) from mdsys.rdfm_test;
1

-- User B can remove this assertion using a DELETE statement.
-- The following DELETE statement uses the oracle_orardf_res2vid function
-- to narrow down the scope to triples with a particular subject.
DELETE FROM a.test_tpl app_tab where app_tab.triple.rdf_s_id = sdo_sem_inference.oracle_orardf_res2vid('<urn:A>');
1 row deleted.
Note:
Oracle recommends that you generally use triple-level security rather than resource-level security. Triple-level security is described in Section 5.1.
The resource-level security option enables you to assign one or more security labels that define a security level for table rows. Conceptually, a table in a relational data model can be mapped to an equivalent RDF graph. Specifically, a row in a relational table can be mapped to a set of triples, each asserting some facts about a specific subject. In this scenario, the subject represents the primary key for the row and each non-key column-value combination from the row is mapped to a predicate-object value combination for the corresponding triples.
A row in a relational data model is identified by its key, and OLS, as a row-level access control mechanism, effectively restricts access to the values associated with the key. With this conceptual mapping between relational and RDF data models, restricting access to a row in a relational table is equivalent to restricting access to a subgraph involving a specific subject. In a model that supports sensitivity labels for each triple, this is enforced by applying the same label to all the triples involving the given subject. However, you can also achieve greater flexibility by allowing the individual triples to have different labels, while maintaining a minimum bound for all the labels.
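The row-to-triples mapping described above can be made concrete with a short sketch. The Python below is illustrative only: the table name, column names, and the URI scheme built by row_to_triples are hypothetical and are not part of any Oracle API.

```python
def row_to_triples(table, key_column, row):
    """Map one relational row to triples: key -> subject, columns -> predicates."""
    subject = f"<urn:{table}:{row[key_column]}>"
    return [
        (subject, f"<urn:{table}#{col}>", value)
        for col, value in row.items()
        if col != key_column and value is not None
    ]

# Hypothetical row; the key column identifies the subject.
row = {"contract_id": "projectHLS", "owned_by": "Dept1", "contract_value": 100000}
triples = row_to_triples("contract", "contract_id", row)

# Row-level protection corresponds to one label on every triple of the subject.
labeled = [(t, "SECRET:HLS:US") for t in triples]
for item in labeled:
    print(item)
```

Each non-key column becomes one triple about the same subject, so labeling the row is the same as labeling that subject's whole subgraph. Allowing individual triples to carry different labels, subject to a minimum bound, is the extra flexibility the text describes.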
OLS support for RDF data employs a multilevel approach in which sensitivity labels associated with the triple components (subject, predicate, object) collectively form a minimum bound for the sensitivity label for the triple. With this approach, a data sensitivity label associated with an RDF resource (used as subject, predicate, or object) restricts unauthorized users from accessing any triples involving the resource and from creating new triples with the resource. For example, projectHLS as a subject may have a minimum sensitivity label, which ensures that all triples describing this subject have a sensitivity label that at least covers the label for projectHLS. Additionally, hasContractValue as a predicate may have a higher sensitivity label; and when this predicate is used with projectHLS to form a triple, that triple minimally has a label that covers both the subject and the predicate labels, as in the following example:
Triple 1: <http://www.myorg.com/contract/projectHLS> :ownedBy <http://www.myorg.com/department/Dept1> Triple 2: <http://www.myorg.com/contract/projectHLS> :hasContractValue "100000"^^xsd:integer
Sensitivity labels are associated with the RDF resources (URIs) based on the position in which they appear in a triple. For example, the same RDF resource may appear in different positions (subject, predicate, or object) in different triples. Three unique labels can be assigned to each resource, so that the appropriate label is used to determine the label for a triple based on the position of the resource in the triple. You can choose the specific resource positions to be secured in a database instance when you apply an OLS policy to the RDF repository. You can secure subjects, objects, predicates, or any combination, as explained in separate sections to follow. The following example applies an OLS policy named defense to the RDF repository and allows sensitivity labels to be associated with RDF subjects and predicates.
begin
  sem_rdfsa.apply_ols_policy(
    policy_name => 'defense',
    rdfsa_options => sem_rdfsa.SECURE_SUBJECT +
                     sem_rdfsa.SECURE_PREDICATE);
end;
/
The same RDF resource can appear in both the subject and object positions (and sometimes even as the predicate), and such a resource can have distinct sensitivity labels based on its position. A triple using the resource at a specific position should have a label that covers the label corresponding to the resource's position. In such cases, the triple can be asserted or accessed only by the users with labels that cover the triple and the resource labels.
In a specific RDF repository, security based on data classification techniques can be turned on for subjects, predicates, objects, or a combination of these. This ensures that all the triples added to the repository automatically conform to the label relationships described above.
An RDF resource (typically a URI) appears in the subject position of a triple when an assertion is made about the resource. In this case, a sensitivity label associated with the resource has the following characteristics:
The label represents the minimum sensitivity label for any triple using the resource as a subject. In other words, the sensitivity label for the triple should dominate or cover the label for the subject.
The label for a newly added triple is initialized to the user initial row label or is generated using the label function, if one is specified. Such operations are successful only if the triple's label dominates the label associated with the triple's subject.
Only a user with an access label that dominates the subject's label and the triple's label can read the triple.
By default, the sensitivity label for a subject is derived from the user environment when an RDF resource is used in the subject position of a triple for the first time. The default sensitivity label in this case is set to the user's initial row label (the default that is assigned to all rows inserted by the user).
However, you can categorize an RDF resource as a subject and assign a sensitivity label to it even before it is used in a triple. The following example assigns a sensitivity label named SECRET:HLS:US to the projectHLS resource, thereby restricting the users who are able to define new triples about this resource and who are able to access existing triples with this resource as the subject:
begin
sem_rdfsa.set_resource_label(
model_name => 'contracts',
resource_uri => '<http://www.myorg.com/contract/projectHLS>',
label_string => 'SECRET:HLS:US',
resource_pos => 'S');
end;
/
An RDF predicate defines the relationship between a subject and an object. You can use sensitivity labels associated with RDF predicates to restrict access to specific types of relationships with all subjects.
RDF predicates are analogous to columns in a relational table, and the ability to restrict access to specific predicates is equivalent to securing relational data at the column level. As in the case of securing the subject, the predicate's sensitivity label creates a minimum bound for any triples using this predicate. It is also the minimum authorization that a user must possess to define a triple with the predicate or to access a triple with the predicate.
The following example assigns the label HSECRET:FIN (in this scenario, a label that is Highly Secret and that also belongs to the Finance department) to triples with the hasContractValue predicate, to ensure that only a user with such clearance can define the triple or access it:
begin
  sem_rdfsa.set_predicate_label(
    model_name => 'contracts',
    predicate => '<http://www.myorg.com/pred/hasContractValue>',
    label_string => 'HSECRET:FIN');
end;
/
You can secure predicates in combination with subjects. In such cases, the triples using a subject and a predicate are guaranteed to have a sensitivity label that at least covers the labels for both the subject and the predicate. Extending the preceding example, if projectHLS as a subject is secured with label SECRET:HLS:US and if hasContractValue as a predicate is secured with label HSECRET:FIN, a triple assigning a monetary value for projectHLS should at least have a label HSECRET:HLS,FIN:US. Effectively, a user's label must dominate this triple's label to be able to define or access the triple.
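How the two labels combine into HSECRET:HLS,FIN:US can be worked through with a simplified model. The Python sketch below is not the OLS algorithm: the level ranking in LEVEL_RANK is an assumption, the helper names are hypothetical, and groups are merged by plain union, which reproduces the example above but glosses over the full OLS rules for groups.

```python
LEVEL_RANK = {"UNCLASSIFIED": 1, "SECRET": 2, "HSECRET": 3}  # assumed ordering

def parse(label):
    """Split 'LEVEL:COMP1,COMP2:GROUP1,GROUP2' into level, compartments, groups."""
    level, comps, groups = (label.split(":") + ["", ""])[:3]
    split = lambda s: [x for x in s.split(",") if x]
    return level, split(comps), split(groups)

def merge(xs, ys):
    """Order-preserving union of two lists."""
    return xs + [y for y in ys if y not in xs]

def covering_label(a, b):
    """Smallest label (in this simplified model) that covers both a and b."""
    la, ca, ga = parse(a)
    lb, cb, gb = parse(b)
    level = la if LEVEL_RANK[la] >= LEVEL_RANK[lb] else lb
    parts = [level, ",".join(merge(ca, cb)), ",".join(merge(ga, gb))]
    return ":".join(parts).rstrip(":")

subject_label = "SECRET:HLS:US"     # label on projectHLS as a subject
predicate_label = "HSECRET:FIN"     # label on hasContractValue as a predicate
print(covering_label(subject_label, predicate_label))
```

The result takes the higher of the two levels and the union of the compartments and groups, which yields HSECRET:HLS,FIN:US for the labels used in the text.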
An RDF triple can have a URI or a literal in its object position. The URI in the object position of a triple represents some resource. You can secure a resource in the object position by associating a sensitivity label with it, to restrict the ability to use the resource as an object in triples.
Typically, a resource (URI or non-literal) appearing in the object position of a triple may itself be described using additional RDF statements. Effectively, an RDF resource in the object position could appear in the subject position in some other triples. When the RDF resources are secured at the object position without explicit sensitivity labels, the label associated with the same resource in the subject position is used as the default label for the object.
The RDF data model allows for the specification of declarative rules, which make it possible to infer the presence of RDF statements that are not explicitly added to the repository. The following shows a simple declarative rule that captures this logic: projects can be owned by departments, departments have Vice Presidents, and in such cases the project leader is by default the Vice President of the department that owns the project.
RuleID                -> projectLedBy
Antecedent Expression -> (?proj :ownedBy ?dept) (?dept :hasVP ?person)
Consequent Expression -> (?proj :isLedBy ?person)
An RDF rule uses some explicitly asserted triples as well as previously inferred triples as antecedents, and infers one or more consequent triples. Traditionally, the inference process is executed as an offline operation to pregenerate all the inferred triples and to make them available for subsequent query operations.
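To make the rule semantics concrete, the following Python sketch applies the projectLedBy rule to a handful of triples by simple forward chaining. It is a conceptual illustration only, not how the database performs inference; the function name and the sample resources (:projectHLS, :deptDefense, :alice, and so on) are hypothetical.

```python
def apply_project_led_by(triples):
    """Apply (?proj :ownedBy ?dept) (?dept :hasVP ?person) -> (?proj :isLedBy ?person).

    Repeats until no new triple is produced, which mirrors how a rule engine
    can reuse previously inferred triples as antecedents.
    """
    known = set(triples)
    inferred = set()
    changed = True
    while changed:
        changed = False
        for proj, p1, dept in list(known):
            if p1 != ":ownedBy":
                continue
            for dept2, p2, person in list(known):
                if p2 == ":hasVP" and dept2 == dept:
                    new_triple = (proj, ":isLedBy", person)
                    if new_triple not in known:
                        known.add(new_triple)
                        inferred.add(new_triple)
                        changed = True
    return inferred


asserted = {
    (":projectHLS", ":ownedBy", ":deptDefense"),
    (":deptDefense", ":hasVP", ":alice"),
    (":projectX", ":ownedBy", ":deptRnD"),  # no VP asserted, so nothing is inferred
}
inferred = apply_project_led_by(asserted)
print(inferred)
```

Only :projectHLS gains an :isLedBy triple here, because :deptRnD has no :hasVP statement to satisfy the second antecedent.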
When the underlying RDF graph is secured using OLS, any additional data inferred from the graph should also be secured to avoid exposing the data to unauthorized users. Additionally, the inference process should run with higher privileges, specifically with full access to data, in order to ensure completeness.
OLS support for RDF data offers techniques to generate sensitivity labels for inferred triples based on labels associated with one or more RDF artifacts. It provides label generation techniques that you can invoke at the time of inference. Additionally, it provides an extensibility framework, which allows an extensible implementation to receive a set of possible labels for a specific triple and determine the most appropriate sensitivity label for the triple based on some application-specific logic. The techniques that you can use for generating the labels for inferred triples include the following (each technique, except for Use Antecedent Labels, is associated with a SEM_RDFSA package constant):
Use Rule Label (SEM_RDFSA.LABELGEN_RULE): An inferred triple is directly generated by a specific rule, and it may be indirectly dependent on other rules through its antecedents. Each rule may have a sensitivity label, which is used as the sensitivity label for all the triples directly inferred by the rule.
Use Subject Label (SEM_RDFSA.LABELGEN_SUBJECT): Derives the label for the inferred triple by considering any sensitivity labels associated with the subject in the new triple. Each inferred triple has a subject, which could in turn be a subject, predicate, or object in any of the triple's antecedents. When such RDF resources are secured, the subject in the newly inferred triple may have one or more labels associated with it. With the Use Subject Label technique, the label for the inferred triple is set to the unique label associated with the RDF resource. When more than one label exists for the resource, you can implement the extensible logic to determine the most relevant label for the new triple.
Use Predicate Label (SEM_RDFSA.LABELGEN_PREDICATE): Derives the label for the inferred triple by considering any sensitivity labels associated with the predicate in the new triple. Each inferred triple has a predicate, which could in turn be a subject, predicate, or object in any of the triple's antecedents. When such RDF resources are secured, the predicate in the newly inferred triple may have one or more labels associated with it. With the Use Predicate Label technique, the label for the inferred triple is set to the unique label associated with the RDF resource. When more than one label exists for the resource, you can implement the extensible logic to determine the most relevant label for the new triple.
Use Object Label (SEM_RDFSA.LABELGEN_OBJECT): Derives the label for the inferred triple by considering any sensitivity labels associated with the object in the new triple. Each inferred triple has an object, which could in turn be a subject, predicate, or object in any of the triple's antecedents. When such RDF resources are secured, the object in the newly inferred triple may have one or more labels associated with it. With the Use Object Label technique, the label for the inferred triple is set to the unique label associated with the RDF resource. When more than one label exists for the resource, you can implement the extensible logic to determine the most relevant label for the new triple.
Use Dominating Label (SEM_RDFSA.LABELGEN_DOMINATING): Each inferred triple minimally has four direct components: subject, predicate, object, and the rule that produced the triple. With the Use Dominating Label technique, at the time of inference the label generator computes the most dominating of the sensitivity labels associated with each of the components and assigns it as the sensitivity label for the inferred triple. Exception labels are assigned when a clear dominating relationship cannot be established between the various labels.
Use Antecedent Labels: In addition to the four direct components for each inferred triple (subject, predicate, object, and the rule that produced the triple), a triple may have one or more antecedent triples, which are instrumental in deducing the new triple. With the Use Antecedent Labels technique, the labels for all the antecedent triples are considered, and conflict resolution criteria are implemented to determine the most appropriate label for the new triple. Since an inferred triple may be dependent on other inferred triples, a strict order is followed while generating the labels for all the inferred triples.
The Use Antecedent Labels technique requires that you use a custom label generator. For information about creating and using a custom label generator, see Section 5.2.5.
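The Use Dominating Label behavior described in the preceding list can be sketched conceptually as follows. This Python model is an illustration under stated assumptions, not Oracle's implementation: labels are represented as (level rank, compartment set) pairs, groups are ignored, and the function names are hypothetical. It shows the selection of a label that dominates all the others, and the fallback to an exception label when no such label exists.

```python
EXCEPTION_LABEL = -1  # marker for "no clear dominating label"


def dominates(a, b):
    """a dominates b when its level is at least as high and its
    compartments are a superset of b's (groups are ignored here)."""
    return a[0] >= b[0] and a[1] >= b[1]


def find_dominating(labels):
    """Return the label from the list that dominates every label in it,
    or EXCEPTION_LABEL when the labels are not comparable."""
    for candidate in labels:
        if all(dominates(candidate, other) for other in labels):
            return candidate
    return EXCEPTION_LABEL


# Component labels of one inferred triple: subject, predicate, object, rule.
subject_label = (2, frozenset({"HLS"}))
predicate_label = (3, frozenset({"HLS", "FIN"}))
object_label = (2, frozenset())
rule_label = (1, frozenset())

chosen = find_dominating([subject_label, predicate_label, object_label, rule_label])
print(chosen)

# Neither label covers the other's compartment, so no dominating label exists.
conflict = find_dominating([(3, frozenset({"FIN"})), (2, frozenset({"HLS"}))])
print(conflict)
```

In the second call the two labels are incomparable, which corresponds to the case where an exception label is assigned to the inferred triple.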
The following example creates an entailment (rules index) for the contracts data using a specific rulebase. This operation can only be performed by a user with FULL access privilege with the OLS policy applied to the RDF repository. In this case, the labels generated for the inferred triples are based on the labels associated with their predicates, as indicated by the use of the SEM_RDFSA.LABELGEN_PREDICATE package constant in the label_gen parameter.
begin
  sem_apis.create_entailment(
         index_name_in => 'contracts_inf',
         models_in     => SDO_RDF_Models('contracts'),
         rulebases_in  => SDO_RDF_Rulebases('contracts_rb'),
         options       => 'USER_RULES=T',
         label_gen     => sem_rdfsa.LABELGEN_PREDICATE);
end;
/
When the predefined or extensible label generation implementation cannot compute a unique label to be applied to an inferred triple, an exception label is set for the triple. Such triples are not accessible by any user other than the user with full access to RDF data (also the user initiating the inference process). The triples with exception labels are clearly marked, so that a privileged user can access them and apply meaningful labels manually. After the sensitivity labels are applied to inferred triples, only users with compatible labels can access these triples. The following example updates the sensitivity label for triples for which an exception label was set:
update mdsys.rdfi_contracts_inf
   set ctxt1 = char_to_label('defense', 'SECRET:HLS:US')
 where ctxt1 = -1;
Inferred triples accessed through generated labels might not be the same as the conceptual triples inferred directly from the user-accessible triples and rules. The labels generated using system-defined or custom implementations cannot be guaranteed to be precise. See the information about Fine-Grained Access Control (OLS and VPD) Considerations in the Usage Notes for the SEM_APIS.CREATE_ENTAILMENT procedure in Chapter 11 for details.
The MDSYS.RDFSA_LABELGEN type is used to apply appropriate label generator logic at the time of index creation; however, you can also extend this type to implement a custom label generator and generate labels based on application logic. The label generator is specified using the label_gen parameter with the SEM_APIS.CREATE_ENTAILMENT procedure. To use a system-defined label generator, specify a SEM_RDFSA package constant, as explained in Section 5.2.4; to use a custom label generator, you must implement a custom label generator type and specify an instance of that type instead of a SEM_RDFSA package constant.
To create a custom label generator type, you must have the UNDER privilege on the RDFSA_LABELGEN type. In addition, to create an index for RDF data, you must have the EXECUTE privilege on this type. The following example grants these privileges to a user named RDF_ADMIN:
GRANT under, execute ON mdsys.rdfsa_labelgen TO rdf_admin;
The custom label generator type must implement a constructor, which should set the dependent resources and specify the getNumericLabel method to return the label computed from the information passed in, as shown in the following example:
CREATE OR REPLACE TYPE CustomSPORALabel UNDER mdsys.rdfsa_labelgen (
  constructor function CustomSPORALabel return self as result,
  overriding member function getNumericLabel (
                                 subject   rdfsa_resource,
                                 predicate rdfsa_resource,
                                 object    rdfsa_resource,
                                 rule      rdfsa_resource,
                                 anteced   rdfsa_resource)
        return number);
The label generator constructor uses a set of constants defined in the SEM_RDFSA package to indicate the list of resources on which the label generator relies. The dependent resources are identified as an inferred triple's subject, its predicate, its object, the rule that produced the triple, and its antecedents. A custom label generator can rely on any subset of these resources for generating the labels, and you can specify this in its constructor by using the constants defined in the SEM_RDFSA package: USE_SUBJECT_LABEL, USE_PREDICATE_LABEL, USE_OBJECT_LABEL, USE_RULE_LABEL, USE_ANTCED_LABEL.
Example 5-1 creates the type body, specifying the constructor function and the getNumericLabel member function. (Application-specific logic is not included in this example.)
Example 5-1 Creating a Custom Label Generator Type
CREATE OR REPLACE TYPE BODY CustomSPORALabel AS

  constructor function CustomSPORALabel return self as result as
  begin
    self.setDepResources(sem_rdfsa.USE_SUBJECT_LABEL+
                         sem_rdfsa.USE_PREDICATE_LABEL+
                         sem_rdfsa.USE_OBJECT_LABEL+
                         sem_rdfsa.USE_RULE_LABEL+
                         sem_rdfsa.USE_ANTECED_LABELS);
    return;
  end CustomSPORALabel;

  overriding member function getNumericLabel (
                                 subject   rdfsa_resource,
                                 predicate rdfsa_resource,
                                 object    rdfsa_resource,
                                 rule      rdfsa_resource,
                                 anteced   rdfsa_resource)
        return number as
    labellst  mdsys.int_array := mdsys.int_array();
  begin
    -- Find dominating label of S P O R A
    -- Application-specific logic for computing the triple label
    -- Copy over all labels to labellst
    for li in 1 .. subject.getLabelCount() loop
      labellst.extend;
      labellst(labellst.COUNT) := subject.getLabel(li);
    end loop;
    -- Copy over other labels as well
    -- Find a dominating of all the labels. Generates -1 if no
    -- dominating label within the set
    return self.findDominatingOf(labellst);
  end getNumericLabel;

end CustomSPORALabel;
/
In Example 5-1, the sample label generator implementation uses all the resources contributing to the inferred triple for generating a sensitivity label for the triple. Thus, the constructor uses the setDepResources method defined in the superclass to set all its dependent components. The list of dependent resources set with this step determines the exact list of values passed to the label generating routine.
The getNumericLabel method is the label generation routine that has one argument for each resource that an inferred triple may depend on. Some arguments may be null values if the corresponding dependent resource is not set in the constructor implementation.
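The relationship between the dependent resources set in the constructor and the arguments received by the label generation routine can be pictured with the following Python sketch. It is a conceptual model only: the bit values assigned to the constants are hypothetical (the actual SEM_RDFSA constant values are defined by Oracle), and build_arguments is an illustrative helper, not part of any Oracle API. It assumes the constants behave as distinct flags, as suggested by the way Example 5-1 adds them together.

```python
# Hypothetical flag values, chosen only so that each constant is a distinct bit.
USE_SUBJECT_LABEL = 1
USE_PREDICATE_LABEL = 2
USE_OBJECT_LABEL = 4
USE_RULE_LABEL = 8
USE_ANTECED_LABELS = 16

ARGUMENT_FLAGS = [
    ("subject", USE_SUBJECT_LABEL),
    ("predicate", USE_PREDICATE_LABEL),
    ("object", USE_OBJECT_LABEL),
    ("rule", USE_RULE_LABEL),
    ("anteced", USE_ANTECED_LABELS),
]


def build_arguments(dep_resources, available):
    """Return the five arguments for a label generation call.

    A resource that was not selected in dep_resources is passed as None,
    mirroring the null arguments described in the text.
    """
    return {
        name: (available.get(name) if dep_resources & flag else None)
        for name, flag in ARGUMENT_FLAGS
    }


# A generator that depends only on the subject and predicate labels.
dep_resources = USE_SUBJECT_LABEL + USE_PREDICATE_LABEL
available = {
    "subject": ["SECRET:HLS:US"],
    "predicate": ["HSECRET:FIN"],
    "object": ["PUBLIC"],
    "rule": ["SECRET"],
    "anteced": ["SECRET:HLS"],
}
args = build_arguments(dep_resources, available)
print(args)
```

A generator written this way must be prepared for None in the positions it did not request, which is why the constructor and the label generation routine need to agree on the set of dependent resources.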
The label generator implementation can make use of a general-purpose static routine defined in the RDFSA_LABELGEN type to find a dominating label for a given set of labels. A set of labels is passed in an instance of the MDSYS.INT_ARRAY type, and the method finds a dominating label among them. If no such label exists, an exception label of -1 is returned.
After you have implemented the custom label generator type, you can use the custom label generator for inferred data by assigning an instance of this type to the label_gen parameter in the SEM_APIS.CREATE_ENTAILMENT procedure, as shown in the following example:
begin
sem_apis.create_entailment(
index_name_in => 'contracts_rdfsinf',
models_in => SDO_RDF_Models('contracts'),
rulebases_in => SDO_RDF_Rulebases('RDFS'),
options => '',
label_gen => CustomSPORALabel());
end;
/
The MDSYS.RDFOLS_SECURE_RESOURCE view contains information about resources secured with Oracle Label Security (OLS) policies and the sensitivity labels associated with these resources.
Select privileges on this view can be granted to appropriate users. To view the resources associated with a specific model, you must also have select privileges on the model (or the corresponding RDFM_model-name view).
The MDSYS.RDFOLS_SECURE_RESOURCE view contains the columns shown in Table 5-1.
Table 5-1 MDSYS.RDFOLS_SECURE_RESOURCE View Columns
Column Name | Data Type | Description |
---|---|---|
MODEL_NAME | VARCHAR2(25) | Name of the model. |
MODEL_ID | NUMBER | Internal identifier for the model. |
RESOURCE_ID | NUMBER | Internal identifier for the resource; to be joined with MDSYS.RDF_VALUE$.VALUE_ID column for information about the resource. |
RESOURCE_TYPE | VARCHAR2(16) | One of the following string values to indicate the resource type for which the label is assigned: |
CTXT1 | NUMBER | Sensitivity label assigned to the resource. |