This chapter explains the fundamentals of XPath rewrite for object-relational storage of XMLType
data in Oracle XML DB. It details the rewriting of XPath-expression arguments to various SQL functions.
This chapter contains these topics:
See Also:
"Performance Tuning for XQuery"Oracle XML DB can often optimize queries that use XPath expressions — for example, queries involving SQL functions such as XMLQuery
, XMLTable
, and XMLExists
, which take XPath (XQuery) expressions as arguments. The XPath expression is, in effect, evaluated against the XML document without ever constructing the XML document in memory.
This optimization is called XPath rewrite. It is a proper subset of XML query optimization, which also involves optimization of XQuery expressions, such as FLWOR expressions, that are not XPath expressions. XPath rewrite also enables indexes, if present on the column, to be used in query evaluation by the Optimizer.
The XPath expressions that can be rewritten by Oracle XML DB are a proper subset of those that are supported by Oracle XML DB. Whenever you can do so without losing functionality, use XPath expressions that can be rewritten.
XPath rewrite can occur in these contexts (or combinations thereof):
When XMLType
data is stored in an object-relational column or table or when an XMLType
view is built on relational data.
When you use an XMLIndex
index.
See Also:
"XMLIndex"This chapter covers the first case: rewriting queries that use object-relational XML data or XMLType
views. The XMLType
views can be XML schema-based or not. Object-relational storage of XMLType
data is always XML schema-based. Examples in this chapter are related to XML schema-based tables.
When XPath rewrite is possible for object-relational XML data, the database optimizer can derive an execution plan based on conventional relational algebra. This in turn means that Oracle XML DB can leverage all of the features of the database and ensure that SQL statements containing XQuery and XPath expressions are executed in a highly performant and efficient manner. There is little overhead with this rewriting, so Oracle XML DB executes XQuery-based and XPath-based queries at near-relational speed.
In certain cases, XPath rewrite is not possible. This typically occurs when there is no SQL equivalent of the XPath expression. In this situation, Oracle XML DB performs a functional evaluation of the XPath expressions, which is generally more costly, especially if the number of documents to be processed is large.
Example 19-1 illustrates XPath rewrite for a simple query that uses an XPath expression.
SELECT po.OBJECT_VALUE FROM purchaseorder po WHERE XMLCast(XMLQuery('$p/PurchaseOrder/Requestor' PASSING po.OBJECT_VALUE AS "p" RETURNING CONTENT) AS VARCHAR2(128)) = 'Sarah J. Bell';
The XMLCast(XMLQuery...))
expression here is rewritten to the underlying relational column that stores the requestor information for the purchase order. The query is rewritten to something like the following:Foot 1
SELECT OBJECT_VALUE FROM purchaseorder p WHERE CAST (p."XMLDATA"."REQUESTOR" AS VARCHAR2(128)) = 'Sarah J. Bell';
Table 19-1 describes some XPath expressions that are rewritten during XPath rewrite.
Table 19-1 Sample of XPath Expressions that Are Rewritten to Underlying SQL Constructs
XPath Expression for Translation | Description |
---|---|
Simple XPath expressions (expressions with
|
Involves traversals over object type attributes only, where the attributes are simple scalar or object types themselves. |
Collection traversal expressions:
|
Involves traversal of collection expressions. The only axes supported are child and attribute axes. Collection traversal is not supported if the SQL function is used during a |
Predicates:
|
Predicates in the XPath are rewritten into SQL predicates. |
List index (positional predicate):
|
Indexes are rewritten to access the nth item in a collection. |
Wildcard traversals:
|
If the wildcard can be translated to one or more simple XPath expressions, then it is rewritten. |
Descendant axis (XML schema-based data only), without recursion:
|
Similar to a wildcard expression. The |
Descendant axis (XML schema-based data only), with recursion:
|
The
|
XPath functions |
Some XPath functions are rewritten. These functions include |
See Also:
"Performance Tuning for XQuery" for information about rewrite of XQuery expressionsXPath expressions that involve elements stored out of line can be automatically rewritten. The rewritten query involves a join with the out-of-line table. Example 19-2 shows such a query.
Example 19-2 XPath Rewrite for an Out-Of-Line Table
SELECT XMLCast(XMLQuery('declare namespace x = "http://www.oracle.com/emp.xsd"; (: :) /x:Employee/Name' PASSING OBJECT_VALUE RETURNING CONTENT) AS VARCHAR2(20)) FROM emp_tab WHERE XMLExists('declare namespace x = "http://www.oracle.com/emp.xsd"; (: :) /x:Employee/Addr[City="San Francisco"]' PASSING OBJECT_VALUE); XMLCAST(XMLQUERY(... -------------------- Abe Bee Eve Fong George Hu Iris Jones Karl Luomo Marina Namur Omar Pinano Quincy Roberts 8 rows selected.
The XQuery expression here is rewritten to a SQL EXISTS
subquery that queries table addr_tab
, joining it with table emp_tab
using the object identifier column in addr_tab
. The optimizer uses full table scans of tables emp_tab
and addr_tab
. If there are many entries in the addr_tab
, then you can try to make this query more efficient by creating an index on the city, as shown in Example 19-3. An explain-plan fragment for the same query as in Example 19-2 shows that the city index is picked up.
Example 19-3 Using an Index with an Out-Of-Line Table
CREATE INDEX addr_city_idx
ON addr_tab (extractValue(OBJECT_VALUE, '/Addr/City'));
| 2 | TABLE ACCESS BY INDEX ROWID| ADDR_TAB | 1 | 2012 | 1 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | ADDR_CITY_IDX | 1 | | 1 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | EMP_TAB | 16 | 32464 | 2 (0)| 00:00:01 |
Note:
When gathering statistics for the optimizer on anXMLType
table that is stored object-relationally, Oracle recommends that you gather statistics on all of the tables defined by the XML schema, that is, all of the tables in USER_XML_TABLES
. You can use procedure DBMS_STATS.gather_schema_stats
to do this, or use DBMS_STATS.gather_table_stats
on each such table. This informs the optimizer about all of the dependent tables that are used to store the XMLType
data.This section presents some guidelines for using execution plans to do the following, for queries that use XPath expressions:
Analyze query execution, to determine whether XPath rewrite occurs.
Optimize query execution, by using secondary indexes.
Use these guidelines together, taking all that apply into consideration.
As is true also for the rest of this chapter, this section is applicable only to XMLType
data that is stored object-relationally.
XPath rewrite for object-relational storage means that a query that selects XML fragments defined by an XPath expression is rewritten to a SQL SELECT
statement on the underlying object-relational tables and columns. These underlying tables can include out-of-line tables.
You can use PL/SQL procedure DBMS_XMLSTORAGE_MANAGE.XPath2TabColMapping
to find the names of the underlying tables and columns that correspond to a given XPath expression.
See Also:
Oracle Database PL/SQL Packages and Types Reference for information about procedure XPath2TabColMapping
The execution plan of a query that has been rewritten refers to the object-relational tables and columns that underlie the queried XMLType
data.
The names of the underlying tables can be meaningful to you, if they are derived from XML element or attribute names or if the governing XML schema explicitly names them by using annotation xdb:defaultTable
. Otherwise, these names are system-generated and have no obvious meaning. In particular, they do not reflect the corresponding XML element or attribute names. Also, some system-generated columns are hidden. You do not see them if you use the SQL describe
command. They nevertheless show up in execution plans.
The plan of a query that has not been rewritten shows only the base table names, and it typically refers to user-level XML functions, such as XMLExists
. Look for this difference to determine whether a query has been optimized. The XML function name shown in an execution plan is actually the internal name (for example, XMLEXISTS2
), which is sometimes slightly different from the user-level name.
Example 19-4 shows the kind of execution plan output that is generated when Oracle XML DB cannot perform XPath rewrite. The plan here is for a query that uses SQL/XML function XMLExists
. The corresponding internal function XMLExists2
appears in the plan output, indicating that the query is not rewritten.
Example 19-4 Execution Plan Generated When XPath Rewrite Does Not Occur
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(XMLEXISTS2('$p/PurchaseOrder[User="SBELL"]' PASSING BY VALUE
SYS_MAKEXML('61687B202644E297E040578C8A175C1D',4215,"PO"."XMLEXTRA","PO"."X
MLDATA") AS "p")=1)
In this situation, Oracle XML DB constructs a pre-filtered result set based on any other conditions specified in the query WHERE
clause. It then filters the rows in this potential result set to determine which rows belong in the result set. The filtering is performed by constructing a DOM on each document and performing a functional evaluation using the methods defined by the DOM API to determine whether or not each document is a member of the result set.
When designing an XML schema, use annotation xdb:defaultTable
to name the underlying tables that correspond to elements that you select in queries where performance is important. This lets you easily recognize them in an execution plan, indicating by their presence or absence whether the query has been rewritten.
For collection tables, there is no corresponding XML schema annotation. To give user-friendly names to your collection tables you must first register the XML schema. Then you can use PL/SQL procedure DBMS_XMLSTORAGE_MANAGE.renameCollectionTable
to rename the tables that were created during registration, which have system-generated names.
See Also:
Oracle Database PL/SQL Packages and Types Reference for information about procedurerenameCollectionTable
A query resulting from XPath rewrite sometimes includes a SQL predicate (WHERE
clause). This can happen even if the original query does not use an XPath predicate, and it can happen even if the original query does not have a SQL WHERE
clause.
When this happens, you can sometimes improve performance by creating an index on the column that is targeted by the SQL predicate, or by creating an index on a function application to that column. Example 19-1 illustrates XPath rewrite for a query that includes a WHERE
clause. Example 19-5 shows the predicate information from an execution plan for this query.
Example 19-5 Analyzing an Execution Plan to Determine a Column to Index
Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(CAST("PURCHASEORDER"."SYS_NC00021$" AS VARCHAR2(128))='Sarah J. Bell' AND SYS_CHECKACL("ACLOID","OWNERID",xmltype('<privilege xmlns="http://xmlns.oracle.com/xdb/acl.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.oracle.com/xdb/acl.xsd http://xmlns.oracle.com/xdb/acl.xsd DAV:http://xmlns.oracle.com/xdb/dav.xsd "><read-properties/><read-contents/></privilege>'))=1)
The predicate information indicates that the expression XMLCast(XMLQuery...))
is rewritten to an application of SQL function cast
to the underlying relational column that stores the requestor information for the purchase order, SYS_NC0021$
. This column name is system-generated. The execution plan refers to this system-generated name, in spite of the fact that the governing XML schema uses annotation SQLName
to name this column REQUESTOR
.
Because these two names (user-defined and system-generated) refer to the same column, you can create a B-tree index on this column using either name. Alternatively, you can use the extractValue
shortcut to create the index, by specifying an XPath expression that targets the purchase-order requestor data.
You can obtain the names of the underlying table and columns that correspond to a given XPath expression using procedure DBMS_XMLSTORAGE_MANAGE.XPath2TabColMapping
. Example 19-6 illustrates this for the XPath expression /PurchaseOrder/Requestor
used in the WHERE
clause of Example 19-1.
Example 19-6 Using DBMS_XMLSTORAGE_MANAGE.XPATH2TABCOLMAPPING
SELECT DBMS_XMLSTORAGE_MANAGE.XPath2TabColMapping(USER, 'PURCHASEORDER', '', '/PurchaseOrder/Requestor', '') FROM DUAL; DBMS_XMLSTORAGE_MANAGE.XPath2TabColMapping(US --------------------------------------------- <Result> <Mapping TableName="PURCHASEORDER" ColumnName="SYS_NC00021$"/> </Result>
If you provide an XPath expression that contains a wildcard or a descendent axis then multiple tables and columns might be selected. In that case procedure XPath2TabColMapping
returns multiple <Mapping>
elements, one for each table-column pair.
You can then use the table and column names retrieved this way in a CREATE INDEX
statement to create an index that corresponds to the XPath expression. Example 19-7 shows three equivalent ways to create a B-tree index on the predicate-targeted column.
Example 19-7 Creating an Index on a Column Targeted by a Predicate
CREATE INDEX requestor_index ON purchaseorder ("SYS_NC00021$"); CREATE INDEX requestor_index ON purchaseorder ("XMLDATA"."REQUESTOR"); CREATE INDEX requestor_index ON purchaseorder (extractvalue(OBJECT_VALUE, '/PurchaseOrder/Requestor'));
However, for this particular query it makes sense to create a function-based index, using a functional expression that matches the one in the rewritten query. Example 19-8 illustrates this.
Example 19-8 Creating a Function-Based Index for a Column Targeted by a Predicate
CREATE INDEX requestor_index ON purchaseorder (cast("XMLDATA"."REQUESTOR" AS VARCHAR2(128)));
Example 19-9 shows an execution plan that indicates that the index is picked up.
Example 19-9 Execution Plan Showing that Index Is Picked Up
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 524 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| PURCHASEORDER | 1 | 524 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | REQUESTOR_INDEX | 1 | | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SYS_CHECKACL("ACLOID","OWNERID",xmltype('<privilege
xmlns="http://xmlns.oracle.com/xdb/acl.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://xmlns.oracle.com/xdb/acl.xsd
http://xmlns.oracle.com/xdb/acl.xsd
DAV:http://xmlns.oracle.com/xdb/dav.xsd">
<read-properties/><read-contents/></privilege>'))=1)
2 - access(CAST("SYS_NC00021$" AS VARCHAR2(128))='Sarah J. Bell')
In the particular case of this query, the original functional expression applies XMLCast
to XMLQuery
to target a singleton element, Requestor
. This is a special case, where you can as a shortcut use such a functional expression directly in the CREATE INDEX
statement. That statement is rewritten to create an index on the underlying scalar data. Example 19-10, which targets an XPath expression, thus has the same effect as Example 19-8, which targets the corresponding object-relational column.
Example 19-10 Creating a Function-Based Index for a Column Targeted by a Predicate
CREATE INDEX requestor_index ON purchaseorder po (XMLCast(XMLQuery('$p/PurchaseOrder/Requestor' PASSING po.OBJECT_VALUE AS "p" RETURNING CONTENT) AS VARCHAR2(128)));
See Also:
"Indexing Non-Repeating Text Nodes or Attribute Values" for information about using the shortcut of XMLCast
applied to XMLQuery
and the extractValue
shortcut to index singleton data
Oracle Database PL/SQL Packages and Types Reference for information about procedure XPath2TabColMapping
If a collection is stored as an ordered collection table or an XMLType
instance, then you can directly access members of the collection. Each member of the collection becomes a row in a table, so you can access it directly with SQL.
You can often improve performance by indexing such collection members. You do this by creating a composite index on (a) the object attribute that corresponds to the collection XML element or its attribute and (b) pseudocolumn NESTED_TABLE_ID
.
Example 19-11 shows the execution plan for a query to find the Reference
elements in documents that contain an order for part number 717951002372 (Part
element with an Id
attribute of value 717951002372
). The collection of LineItem
elements is stored as rows in the ordered collection table lineitem_table
.
Note:
Example 19-11 does not use thepurchaseorder
table from sample database schema OE
. It uses a purchaseorder
table that uses an ordered collection table (OCT) named lineitem_table
for the collection element LineItem
.Example 19-11 Execution Plan for a Selection of Collection Elements
SELECT XMLCast(XMLQuery('$p/PurchaseOrder/Reference' PASSING OBJECT_VALUE AS "p" RETURNING CONTENT) AS VARCHAR2(4000)) "Reference" FROM purchaseorder WHERE XMLExists('$p/PurchaseOrder/LineItems/LineItem/Part[@Id="717951002372"]' PASSING OBJECT_VALUE AS "p"); ------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 21 | 2352 | 20 (10)| 00:00:01 | |* 1 | HASH JOIN RIGHT SEMI | | 21 | 2352 | 20 (10)| 00:00:01 | | 2 | JOIN FILTER CREATE | :BF0000 | 22 | 880 | 14 (8)| 00:00:01 | |* 3 | TABLE ACCESS FULL | LINEITEM_TABLE | 22 | 880 | 14 (8)| 00:00:01 | | 4 | JOIN FILTER USE | :BF0000 | 132 | 9504 | 5 (0)| 00:00:01 | |* 5 | TABLE ACCESS FULL | PURCHASEORDER | 132 | 9504 | 5 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("NESTED_TABLE_ID"="PURCHASEORDER"."SYS_NC0003400035$") 3 - filter("SYS_NC00011$"='717951002372') 5 - filter(SYS_OP_BLOOM_FILTER(:BF0000,"PURCHASEORDER","SYS_NC0003400035$"))
The execution plan shows a full scan of ordered collection table lineitem_table
. This could be acceptable if there were only a few hundred documents in the purchaseorder
table, but it would be unacceptable if there were thousands or millions of documents in the table.
To improve the performance of such a query, you can create an index that provides direct access to pseudocolumn NESTED_TABLE_ID
, given the value of attribute Id
. Unfortunately, Oracle XML DB does not allow indexes on collections to be created using XPath expressions directly. To create the index, you must understand the structure of the SQL object that is used to manage the LineItem
elements. Given this information, you can create the required index using conventional object-relational SQL.
In this case, element LineItem
is stored as an instance of object type lineitem_t
. Element Part
is stored as an instance of SQL data type part_t
. XML attribute Id
is mapped to object attribute part_number
. Given this information, you can create a composite index on attribute part_number
and pseudocolumn NESTED_TABLE_ID
, as shown in Example 19-12. This index provides direct access to those purchase-order documents that have LineItem
elements that reference the required part.
Footnote Legend
Footnote 1: This example uses sample database schemaOE
and its table purchaseorder
. The XML schema for this table is annotated with attribute SQLName
to specify SQL object attribute names such as REQUESTOR
— see Example A-2. Without such annotations, this example would use p."XMLDATA"."
Requestor
"
, not p."XMLDATA"."
.REQUESTOR
"
.