Parser interfaces include: Parser exceptions, Validator, Parser, DOMParser, and SAXParser.
This chapter contains the following sections:
Table 5-1 summarizes the datatypes of the Parser package.
Table 5-1 Summary of Datatypes; Parser Package
Datatype | Description |
---|---|
Parser implementation of exceptions. |
|
Defines parser identifiers. |
|
Defines type of node. |
|
Defines validator identifiers. |
Parser implementation of exceptions.
typedef enum ParserExceptionCode { PARSER_UNDEFINED_ERR = 0, PARSER_VALIDATION_ERR = 1, PARSER_VALIDATOR_ERR = 2, PARSER_BAD_ISOURCE_ERR = 3, PARSER_CONTEXT_ERR = 4, PARSER_PARAMETER_ERR = 5, PARSER_PARSE_ERR = 6, PARSER_SAXHANDLER_SET_ERR = 7, PARSER_VALIDATOR_SET_ERR = 8 } ParserExceptionCode;
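When logging failures it can help to turn a ParserExceptionCode into readable text. The following is a minimal sketch: the enum is repeated from the typedef above so the block compiles on its own, while `parserCodeName` and `parserCodeNameExample` are hypothetical helpers and not part of the Oracle API.

```cpp
#include <cassert>
#include <string>

// Repeated from the typedef above so that this sketch is self-contained.
typedef enum ParserExceptionCode {
    PARSER_UNDEFINED_ERR = 0,
    PARSER_VALIDATION_ERR = 1,
    PARSER_VALIDATOR_ERR = 2,
    PARSER_BAD_ISOURCE_ERR = 3,
    PARSER_CONTEXT_ERR = 4,
    PARSER_PARAMETER_ERR = 5,
    PARSER_PARSE_ERR = 6,
    PARSER_SAXHANDLER_SET_ERR = 7,
    PARSER_VALIDATOR_SET_ERR = 8
} ParserExceptionCode;

// Hypothetical helper (not part of the Oracle API): returns the enumerator's name.
inline std::string parserCodeName(ParserExceptionCode code) {
    switch (code) {
        case PARSER_UNDEFINED_ERR:      return "PARSER_UNDEFINED_ERR";
        case PARSER_VALIDATION_ERR:     return "PARSER_VALIDATION_ERR";
        case PARSER_VALIDATOR_ERR:      return "PARSER_VALIDATOR_ERR";
        case PARSER_BAD_ISOURCE_ERR:    return "PARSER_BAD_ISOURCE_ERR";
        case PARSER_CONTEXT_ERR:        return "PARSER_CONTEXT_ERR";
        case PARSER_PARAMETER_ERR:      return "PARSER_PARAMETER_ERR";
        case PARSER_PARSE_ERR:          return "PARSER_PARSE_ERR";
        case PARSER_SAXHANDLER_SET_ERR: return "PARSER_SAXHANDLER_SET_ERR";
        case PARSER_VALIDATOR_SET_ERR:  return "PARSER_VALIDATOR_SET_ERR";
    }
    return "UNKNOWN_PARSER_CODE";  // value outside the documented set
}

// Usage sketch.
inline void parserCodeNameExample() {
    assert(parserCodeName(PARSER_VALIDATION_ERR) == "PARSER_VALIDATION_ERR");
}
```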
Defines parser identifiers.
typedef enum DOMParserIdType { DOMParCXml = 1 } DOMParserIdType;
typedef enum CompareHowCode { START_TO_START = 0, START_TO_END = 1, END_TO_END = 2, END_TO_START = 3 } CompareHowCode;
Table 5-2 summarizes the methods available through the DOMParser interface.
Table 5-2 Summary of DOMParser Methods; Parser Package
Function | Summary |
---|---|
Returns parser's XML context (allocation and encodings). |
|
Get parser id. |
|
Parse the document. |
|
Parse DTD document. |
|
Parse and validate the document. |
|
Set the validator for this parser. |
Each parser object is allocated and executed in a particular Oracle XML context. This member function returns a pointer to this context.
virtual Context* getContext() const = 0;
(Context*)
pointer to parser's context
Parses the document and returns the root node of the document tree.
virtual DocumentRef< Node>* parse( InputSource* isrc_ptr, boolean DTDvalidate = FALSE, DocumentTypeRef< Node>* dtd_ptr = NULL, boolean no_mod = FALSE, DOMImplementation< Node>* impl_ptr = NULL) throw (ParserException) = 0;
Parameter | Description |
---|---|
isrc_ptr | input source |
DTDvalidate | TRUE if validated by DTD |
dtd_ptr | DTD reference |
no_mod | TRUE if no modifications allowed |
impl_ptr | optional DOMImplementation pointer |
(DocumentRef)
document tree
Parse DTD document.
virtual DocumentRef< Node>* parseDTD( InputSource* isrc_ptr, boolean no_mod = FALSE, DOMImplementation< Node>* impl_ptr = NULL) throw (ParserException) = 0;
Parameter | Description |
---|---|
isrc_ptr | input source |
no_mod | TRUE if no modifications allowed |
impl_ptr | optional DOMImplementation pointer |
(DocumentRef)
DTD document tree
Parses and validates the document. Sets the validator if the corresponding parameter is not NULL.
virtual DocumentRef< Node>* parseSchVal( InputSource* src_par, boolean no_mod = FALSE, DOMImplementation< Node>* impl_ptr = NULL, SchemaValidator< Node>* tor_ptr = NULL) throw (ParserException) = 0;
Parameter | Description |
---|---|
src_par | input source |
no_mod | TRUE if no modifications allowed |
impl_ptr | optional DOMImplementation pointer |
tor_ptr | schema validator |
(DocumentRef)
document tree
Table 5-3 summarizes the methods available through the GParser interface.
Table 5-3 Summary of GParser Methods; Parser Package
Function | Summary |
---|---|
setWarnDuplicateEntity | Specifies if multiple entity declarations result in a warning. |
getBaseURI | Returns the base URI for the document. |
getDiscardWhitespaces | Checks if whitespaces between elements are discarded. |
getExpandCharRefs | Checks if character references are expanded. |
getSchemaLocation | Gets the schema location for this document. |
getStopOnWarning | Checks if document processing stops on warnings. |
getWarnDuplicateEntity | Checks if multiple entity declarations cause a warning. |
setBaseURI | Sets the base URI for the document. |
setDiscardWhitespaces | Sets if formatting whitespaces should be discarded. |
setExpandCharRefs | Sets if character references are expanded. |
setSchemaLocation | Sets the schema location for this document. |
setStopOnWarning | Sets if document processing stops on warnings. |
Specifies if entities that are declared more than once will cause warnings to be issued.
void setWarnDuplicateEntity( boolean par_bool);
Parameter | Description |
---|---|
par_bool | TRUE if multiple entity declarations cause a warning |
Returns the base URI for the document. Usually only documents loaded from a URI automatically have a base URI. Documents loaded from other sources (stdin, buffer, and so on) do not naturally have a base URI, but one may have been set for them using setBaseURI, for the purpose of resolving relative URIs in inclusion.
oratext* getBaseURI() const;
(oratext *)
current document's base URI, or NULL
Checks if formatting whitespaces between elements, such as newlines and indentation in input documents, are discarded. By default, all input characters are preserved.
boolean getDiscardWhitespaces() const;
(boolean)
TRUE if whitespace between elements is discarded
Checks if character references are expanded in the DOM data. By default, character references are replaced by the characters they represent. However, when a document is saved, those character references do not reappear. To ensure they survive a load and save cycle, they should not be expanded.
boolean getExpandCharRefs() const;
(boolean)
TRUE if character references are expanded
Gets schema location for this document. It is used to figure out the optimal layout when loading documents into a database.
oratext* getSchemaLocation() const;
(oratext*)
schema location
When TRUE is returned, warnings are treated the same as errors and cause parsing, validation, and so on, to stop immediately. By default, warnings are issued but the processing continues.
boolean getStopOnWarning() const;
(boolean)
TRUE if document processing stops on warnings
Checks if entities that are declared more than once cause warnings to be issued.
boolean getWarnDuplicateEntity() const;
(boolean)
TRUE if multiple entity declarations cause a warning
Sets the base URI for the document. Usually only documents loaded from a URI automatically have a base URI. Documents loaded from other sources (stdin, buffer, and so on) do not naturally have a base URI, but this function can set one for them, for the purpose of resolving relative URIs in inclusion.
void setBaseURI( oratext* par);
Parameter | Description |
---|---|
par | base URI |
Sets if formatting whitespaces between elements (newlines and indentation) in input documents are discarded. By default, all input characters are preserved.
void setDiscardWhitespaces( boolean par_bool);
Parameter | Description |
---|---|
par_bool | TRUE if whitespaces should be discarded |
Sets if character references should be expanded in the DOM data. Ordinarily, character references are replaced by the characters they represent. However, when a document is saved, those character references do not reappear. To ensure they survive a load and save cycle, do not expand them.
void setExpandCharRefs( boolean par_bool);
Parameter | Description |
---|---|
par_bool | TRUE if character references should be expanded |
Sets schema location for this document. It is used to figure out the optimal layout when loading documents into a database.
void setSchemaLocation( oratext* par);
Parameter | Description |
---|---|
par | schema location |
When TRUE is set, warnings are treated the same as errors and cause parsing, validation, and so on, to stop immediately. By default, warnings are issued but the processing continues.
void setStopOnWarning( boolean par_bool);
Parameter | Description |
---|---|
par_bool | TRUE if document processing should stop on warnings |
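The setters and getters above are usually applied together before a parse. The sketch below shows one way to group them. `DemoGParser` is a hypothetical stand-in that mirrors only two of the documented accessor pairs (with `bool` in place of Oracle's `boolean`) so the block compiles without the Oracle headers. `configureStrict` is likewise an illustrative helper, and the default for the duplicate-entity warning is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for a parser exposing GParser-style accessors.
class DemoGParser {
public:
    void setStopOnWarning(bool v) { stopOnWarning_ = v; }
    bool getStopOnWarning() const { return stopOnWarning_; }
    void setWarnDuplicateEntity(bool v) { warnDuplicateEntity_ = v; }
    bool getWarnDuplicateEntity() const { return warnDuplicateEntity_; }
private:
    bool stopOnWarning_ = false;        // per the text: processing continues on warnings
    bool warnDuplicateEntity_ = false;  // assumed default, not stated in the text
};

// Illustrative helper: treat warnings as errors and flag duplicate entities.
// Works with any type that provides the two setters.
template <typename ParserT>
void configureStrict(ParserT& parser) {
    parser.setStopOnWarning(true);
    parser.setWarnDuplicateEntity(true);
}

// Usage sketch.
inline void configureStrictExample() {
    DemoGParser parser;
    configureStrict(parser);
    assert(parser.getStopOnWarning());
    assert(parser.getWarnDuplicateEntity());
}
```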
Table 5-4 summarizes the methods available through the ParserException interface.
Table 5-4 Summary of ParserException Methods; Parser Package
Function | Summary |
---|---|
getCode | Gets the Oracle XML error code embedded in the exception. |
getMesLang | Gets the current language (encoding) of error messages. |
getMessage | Gets the Oracle XML error message. |
getParserCode | Gets the parser exception code embedded in the exception. |
Virtual member function inherited from XmlException.
virtual unsigned getCode() const = 0;
(unsigned)
numeric error code (0 on success)
Virtual member function inherited from XmlException.
virtual oratext* getMesLang() const = 0;
(oratext*)
Current language (encoding) of error messages
Virtual member function inherited from XmlException.
virtual oratext* getMessage() const = 0;
(oratext *)
Error message
This virtual member function defines a prototype for implementation-defined member functions that return the parser and validator exception codes, defined in ParserExceptionCode, for exceptional situations that occur during execution.
virtual ParserExceptionCode getParserCode() const = 0;
(ParserExceptionCode)
exception code
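The accessors above can be combined into a single diagnostic line. This is a sketch under several assumptions: `oratext` is taken to be `unsigned char`, the message is assumed to be NUL-terminated in a single-byte encoding, and `DemoParserException` is a hypothetical stand-in (with a made-up error number) used so the block compiles without the Oracle headers. `describeParserError` is an illustrative helper, not an Oracle function.

```cpp
#include <cassert>
#include <sstream>
#include <string>

typedef unsigned char oratext;  // assumption about the Oracle typedef

// Hypothetical stand-in exposing three of the accessors in Table 5-4.
struct DemoParserException {
    unsigned code;
    int parserCode;
    std::string text;
    unsigned getCode() const { return code; }
    int getParserCode() const { return parserCode; }
    const oratext* getMessage() const {
        return reinterpret_cast<const oratext*>(text.c_str());
    }
};

// Illustrative helper: formats any exception type offering these accessors.
template <typename ExceptionT>
std::string describeParserError(const ExceptionT& e) {
    std::ostringstream out;
    out << "XML error " << e.getCode()
        << " (parser code " << static_cast<int>(e.getParserCode()) << ")";
    const oratext* msg = e.getMessage();
    if (msg != nullptr) {
        // Assumes a NUL-terminated, single-byte encoded message.
        out << ": " << reinterpret_cast<const char*>(msg);
    }
    return out.str();
}

// Usage sketch; 201 is an invented error number.
inline void describeParserErrorExample() {
    DemoParserException e{201u, 6, "unexpected end of input"};
    assert(describeParserError(e) ==
           "XML error 201 (parser code 6): unexpected end of input");
}
```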
Table 5-5 summarizes the methods available through the SAXHandler interface.
Table 5-5 Summary of SAXHandler Methods; Parser Package
Function | Summary |
---|---|
Receive notification of CDATA. |
|
Receive notification of an XML declaration. |
|
Receive notification of attribute's declaration. |
|
Receive notification of character data. |
|
Receive notification of a comment. |
|
Receive notification of element's declaration. |
|
Receive notification of the end of the document. |
|
Receive notification of element's end. |
|
Receive notification of a notation declaration. |
|
Receive notification of a parsed entity declaration. |
|
Receive notification of a processing instruction. |
|
Receive notification of the start of the document. |
|
Receive notification of element's start. |
|
Receive namespace aware notification of element's start. |
|
Receive notification of an unparsed entity declaration. |
|
Receive notification of whitespace characters. |
This event handles CDATA, as distinct from Text. The data will be in the data encoding, and the returned length is in characters, not bytes. This is an Oracle extension.
virtual void CDATA( oratext* data, ub4 size) = 0;
Parameter | Description |
---|---|
data | pointer to CDATA |
size | size of CDATA |
This event marks an XML declaration (XMLDecl). The startDocument event is always first; this event will be the second event. The encoding flag says whether an encoding was specified. For the standalone flag, -1 is returned if it was not specified, otherwise 0 for FALSE and 1 for TRUE. This member function is an Oracle extension.
virtual void XMLDecl( oratext* version, boolean is_encoding, sword standalone) = 0;
Parameter | Description |
---|---|
version | version string from XMLDecl |
is_encoding | whether encoding was specified |
standalone | value of the standalone flag |
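Because the standalone argument has three states rather than two, a handler may want to decode it explicitly. The following sketch assumes `sword` is a plain signed `int`. `standaloneToString` is a hypothetical helper that could be called from an XMLDecl implementation.

```cpp
#include <cassert>
#include <string>

typedef int sword;  // assumption about the Oracle typedef

// Hypothetical helper: decodes the tri-state standalone flag described above.
inline std::string standaloneToString(sword standalone) {
    if (standalone < 0) {
        return "unspecified";  // -1: no standalone attribute in the declaration
    }
    return standalone == 0 ? "no" : "yes";  // 0 for FALSE, 1 for TRUE
}

// Usage sketch.
inline void standaloneToStringExample() {
    assert(standaloneToString(-1) == "unspecified");
}
```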
This event marks an attribute declaration in the DTD. It is an Oracle extension, not in the SAX standard.
virtual void attributeDecl( oratext* attr_name, oratext *name, oratext *content) = 0;
Parameter | Description |
---|---|
attr_name | |
name | |
content | body of attribute declaration |
This event marks character data.
virtual void characters( oratext* ch, ub4 size) = 0;
Parameter | Description |
---|---|
ch | pointer to data |
size | length of data |
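Character data and CDATA arrive as a pointer plus a length, so a handler typically copies the data before returning. The sketch below accumulates both kinds of text. `TextCollector` is hypothetical and does not derive from the real SAXHandler interface, which would require implementing every pure virtual method. The typedefs and the single-byte encoding (one character per byte) are assumptions.

```cpp
#include <cassert>
#include <string>

typedef unsigned char oratext;  // assumption about the Oracle typedef
typedef unsigned int ub4;       // assumption: 32-bit unsigned length

// Hypothetical collector mirroring the characters and CDATA signatures.
class TextCollector {
public:
    void characters(oratext* ch, ub4 size) { append(ch, size); }
    void CDATA(oratext* data, ub4 size) { append(data, size); }
    const std::string& text() const { return text_; }
private:
    void append(const oratext* p, ub4 n) {
        // Assumes a single-byte data encoding, so n characters are n bytes.
        if (p != nullptr) {
            text_.append(reinterpret_cast<const char*>(p), n);
        }
    }
    std::string text_;
};

// Usage sketch.
inline void textCollectorExample() {
    TextCollector collector;
    oratext first[] = "Hello, ";
    oratext second[] = "world";
    collector.characters(first, 7);
    collector.CDATA(second, 5);
    assert(collector.text() == "Hello, world");
}
```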
This event marks a comment in the XML document. The comment's data will be in the data encoding. It is an Oracle extension, not in the SAX standard.
virtual void comment( oratext* data) = 0;
Parameter | Description |
---|---|
data | comment's data |
This event marks an element declaration in the DTD. It is an Oracle extension, not in the SAX standard.
virtual void elementDecl( oratext *name, oratext *content) = 0;
Parameter | Description |
---|---|
name | element's name |
content | element's content |
This event marks the end of an element. The name is the tagName of the element (which may be a qualified name for namespace-aware elements) and is in the data encoding.
virtual void endElement( oratext* name) = 0;
This event marks the declaration of a notation in the DTD. The notation's name, public ID, and system ID will all be in the data encoding. Both IDs are optional and may be NULL.
virtual void notationDecl( oratext* name, oratext* public_id, oratext* system_id) = 0;
Parameter | Description |
---|---|
name | notation's name |
public_id | notation's public Id |
system_id | notation's system Id |
Marks a parsed entity declaration in the DTD. The parsed entity's name, public ID, system ID, and notation name will all be in the data encoding. This is an Oracle extension.
virtual void parsedEntityDecl( oratext* name, oratext* value, oratext* public_id, oratext* system_id, boolean general) = 0;
Parameter | Description |
---|---|
name | entity's name |
value | entity's value if internal |
public_id | entity's public Id |
system_id | entity's system Id |
general | whether a general entity (FALSE if parameter entity) |
This event marks a processing instruction. The PI's target and data will be in the data encoding. There is always a target, but the data may be NULL.
virtual void processingInstruction( oratext* target, oratext* data) = 0;
Parameter | Description |
---|---|
target | PI's target |
data | PI's data |
This event marks the start of an element.
virtual void startElement( oratext* name, NodeListRef< Node>* attrs_ptr) = 0;
Parameter | Description |
---|---|
name | element's name |
attrs_ptr | list of element's attributes |
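Start and end events arrive in document order, so a handler can reconstruct the current element path with a stack. The following is a sketch only: `ElementPathTracker` is hypothetical, omits the attribute-list parameter of startElement for brevity, assumes `oratext` is `unsigned char`, and assumes names are NUL-terminated single-byte strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef unsigned char oratext;  // assumption about the Oracle typedef

// Hypothetical tracker; method names echo the SAX events documented here.
class ElementPathTracker {
public:
    void startElement(oratext* name) { stack_.push_back(toString(name)); }
    void endElement(oratext* name) {
        // Pop only when the closing name matches the innermost open element.
        if (!stack_.empty() && stack_.back() == toString(name)) {
            stack_.pop_back();
        }
    }
    std::size_t depth() const { return stack_.size(); }
    std::string path() const {
        std::string result;
        for (const std::string& part : stack_) {
            result += '/';
            result += part;
        }
        return result;
    }
private:
    static std::string toString(const oratext* s) {
        return s ? std::string(reinterpret_cast<const char*>(s)) : std::string();
    }
    std::vector<std::string> stack_;
};

// Usage sketch.
inline void elementPathTrackerExample() {
    ElementPathTracker tracker;
    oratext root[] = "doc";
    oratext child[] = "item";
    tracker.startElement(root);
    tracker.startElement(child);
    assert(tracker.path() == "/doc/item");
    tracker.endElement(child);
    assert(tracker.path() == "/doc");
}
```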
This event marks the start of an element. Note this is the new SAX 2 namespace-aware version. The element's qualified name, local name, and namespace URI will be in the data encoding, as are all the attribute parts.
virtual void startElementNS( oratext* qname, oratext* local, oratext* ns_URI, NodeListRef< Node>* attrs_ptr) = 0;
Parameter | Description |
---|---|
qname | element's qualified name |
local | element's namespace local name |
ns_URI | element's namespace URI |
attrs_ptr | NodeList of element's attributes |
Marks an unparsed entity declaration in the DTD. The unparsed entity's name, public ID, system ID, and notation name will all be in the data encoding.
virtual void unparsedEntityDecl( oratext* name, oratext* public_id, oratext* system_id, oratext* notation_name) = 0;
Parameter | Description |
---|---|
name | entity's name |
public_id | entity's public Id |
system_id | entity's system Id |
notation_name | entity's notation name |
Table 5-6 summarizes the methods available through the SAXParser interface.
Table 5-6 Summary of SAXParser Methods; Parser Package
Function | Summary |
---|---|
Returns parser's XML context (allocation and encodings). |
|
Returns parser Id. |
|
Parse the document. |
|
Parse the DTD. |
|
Set SAX handler. |
Each parser object is allocated and executed in a particular Oracle XML context. This member function returns a pointer to this context.
virtual Context* getContext() const = 0;
(Context*)
pointer to parser's context
Returns the parser id.
virtual SAXParserIdType getParserId() const = 0;
(SAXParserIdType)
Parser Id
Parses a document.
virtual void parse( InputSource* src_ptr, boolean DTDvalidate = FALSE, SAXHandlerRoot* hdlr_ptr = NULL) throw (ParserException) = 0;
Parameter | Description |
---|---|
src_ptr | input source |
DTDvalidate | TRUE to validate against the DTD |
hdlr_ptr | SAX handler pointer |
Table 5-7 summarizes the methods available through the SchemaValidator interface.
Table 5-7 Summary of SchemaValidator Methods; Parser Package
Function | Summary |
---|---|
getSchemaList | Returns the schema list. |
getValidatorId | Gets the validator identifier. |
loadSchema | Loads a schema document. |
unloadSchema | Unloads a schema document. |
Returns only the size of the list of loaded schema documents if "list" is NULL. If "list" is not NULL, a list of URL pointers is returned in the user-provided pointer buffer. Note that it is the user's responsibility to provide a buffer that is large enough.
virtual ub4 getSchemaList( oratext **list) const = 0;
Parameter | Description |
---|---|
list | address of a pointer buffer |
(ub4)
list size; the list of loaded schemas is returned through list (I/O parameter)
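The NULL and non-NULL behaviours described above suggest a two-call pattern: ask for the size first, then pass a buffer of that size. The sketch below illustrates it under stated assumptions. `DemoValidator` is a hypothetical stand-in with the same getSchemaList signature. The typedefs are assumed. The sketch also assumes the size returned for a NULL argument equals the number of pointers written on the second call. `loadedSchemas` is an illustrative helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef unsigned char oratext;  // assumption about the Oracle typedef
typedef unsigned int ub4;       // assumption: 32-bit unsigned count

// Hypothetical stand-in with the documented getSchemaList signature.
class DemoValidator {
public:
    explicit DemoValidator(std::vector<std::string> uris) : uris_(std::move(uris)) {}
    ub4 getSchemaList(oratext** list) const {
        if (list != nullptr) {
            for (std::size_t i = 0; i < uris_.size(); ++i) {
                list[i] = reinterpret_cast<oratext*>(const_cast<char*>(uris_[i].c_str()));
            }
        }
        return static_cast<ub4>(uris_.size());
    }
private:
    std::vector<std::string> uris_;
};

// Illustrative helper: copies the loaded schema URLs into owned strings.
template <typename ValidatorT>
std::vector<std::string> loadedSchemas(const ValidatorT& validator) {
    const ub4 count = validator.getSchemaList(nullptr);  // first call: size only
    std::vector<oratext*> buffer(count, nullptr);        // caller-provided buffer
    if (count > 0) {
        validator.getSchemaList(buffer.data());          // second call fills it
    }
    std::vector<std::string> result;
    for (oratext* p : buffer) {
        if (p != nullptr) {
            result.emplace_back(reinterpret_cast<const char*>(p));
        }
    }
    return result;
}

// Usage sketch with invented schema names.
inline void loadedSchemasExample() {
    DemoValidator validator(std::vector<std::string>{"a.xsd", "b.xsd"});
    assert(loadedSchemas(validator).size() == 2);
}
```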
Get the validator identifier corresponding to the implementation of this validator object.
virtual SchValidatorIdType getValidatorId() const = 0;
(SchValidatorIdType)
validator identifier
Load up a schema document to be used in the next validation session. Throws an exception in the case of an error.
virtual void loadSchema( oratext* schema_URI) throw (ParserException) = 0;
Parameter | Description |
---|---|
schema_URI | URL of a schema document; compiler encoding |
Unloads a schema document and all its descendants (included or imported in a nested manner) from the validator. All previously loaded schema documents remain loaded until they are unloaded. To unload all loaded schema documents, set schema_URI to NULL. Throws an exception in the case of an error.
virtual void unloadSchema( oratext* schema_URI) throw (ParserException) = 0;
Parameter | Description |
---|---|
schema_URI | URL of a schema document; compiler encoding |
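Since a loaded schema stays loaded until it is explicitly unloaded, pairing loadSchema with unloadSchema in a scope guard is one way to avoid leaving schemas behind. This is a sketch, not an Oracle facility: `ScopedSchema` and `DemoSchemaStore` are hypothetical, `oratext` is assumed to be `unsigned char`, and the stand-in only counts URIs instead of loading anything.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

typedef unsigned char oratext;  // assumption about the Oracle typedef

// Hypothetical stand-in mirroring the loadSchema and unloadSchema signatures.
class DemoSchemaStore {
public:
    void loadSchema(oratext* uri) { loaded_.insert(toString(uri)); }
    void unloadSchema(oratext* uri) {
        if (uri == nullptr) {
            loaded_.clear();  // NULL unloads every schema, as described above
        } else {
            loaded_.erase(toString(uri));
        }
    }
    std::size_t loadedCount() const { return loaded_.size(); }
private:
    static std::string toString(const oratext* s) {
        return s ? std::string(reinterpret_cast<const char*>(s)) : std::string();
    }
    std::set<std::string> loaded_;
};

// Illustrative guard: loads on construction, unloads on destruction.
template <typename ValidatorT>
class ScopedSchema {
public:
    ScopedSchema(ValidatorT& validator, oratext* uri)
        : validator_(validator), uri_(uri) {
        validator_.loadSchema(uri_);
    }
    ~ScopedSchema() {
        try {
            validator_.unloadSchema(uri_);
        } catch (...) {
            // Destructors should not let exceptions escape.
        }
    }
    ScopedSchema(const ScopedSchema&) = delete;
    ScopedSchema& operator=(const ScopedSchema&) = delete;
private:
    ValidatorT& validator_;
    oratext* uri_;
};

// Usage sketch with an invented schema name.
inline void scopedSchemaExample() {
    DemoSchemaStore store;
    oratext uri[] = "po.xsd";
    {
        ScopedSchema<DemoSchemaStore> guard(store, uri);
        assert(store.loadedCount() == 1);
    }
    assert(store.loadedCount() == 0);
}
```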