Create a Section Group

Section searching is enabled by defining section groups. You use one of the system-defined section groups to create an instance of a section group. Choose a section group appropriate for your document collection.

You use section groups to specify the type of document set you have and implicitly indicate the tag structure. For instance, to index HTML tagged documents, you use the HTML_SECTION_GROUP. Likewise, to index XML tagged documents, you can use the XML_SECTION_GROUP.

Table 8-1 lists the different types of section groups you can use:


Table 8-1 Types of Section Groups

Section Group Preference Description

NULL_SECTION_GROUP

This is the default. Use this group type when you define no sections or when you define only SENTENCE or PARAGRAPH sections.

BASIC_SECTION_GROUP

Use this group type for defining sections where the start and end tags are of the form <A> and </A>.

Note: This group type dopes not support input such as unbalanced parentheses, comments tags, and attributes. Use HTML_SECTION_GROUP for this type of input.

HTML_SECTION_GROUP

Use this group type for indexing HTML documents and for defining sections in HTML documents.

XML_SECTION_GROUP

Use this group type for indexing XML documents and for defining sections in XML documents.

AUTO_SECTION_GROUP

Use this group type to automatically create a zone section for each start-tag/end-tag pair in an XML document. The section names derived from XML tags are case-sensitive as in XML.

Attribute sections are created automatically for XML tags that have attributes. Attribute sections are named in the form tag@attribute.

Stop sections, empty tags, processing instructions, and comments are not indexed.

The following limitations apply to automatic section groups:

  • You cannot add zone, field or special sections to an automatic section group.

  • Automatic sectioning does not index XML document types (root elements.) However, you can define stop-sections with document type.

  • The length of the indexed tags including prefix and namespace cannot exceed 64 bytes. Tags longer than this are not indexed.

PATH_SECTION_GROUP

Use this group type to index XML documents. Behaves like the AUTO_SECTION_GROUP.

The difference is that with this section group you can do path searching with the INPATH and HASPATH operators. Queries are also case-sensitive for tag and attribute names.

NEWS_SECTION_GROUP

Use this group for defining sections in newsgroup formatted documents according to RFC 1036.


Note:

Documents sent to the HTML, XML, AUTO and PATH sectioners must begin with \s*<, where \s* represents zero or more whitespace characters. Otherwise, the document is treated as a plaintext document, and no sections are recognized.

You use the CTX_DDL package to create section groups and define sections as part of section groups. For example, to index HTML documents, create a section group with HTML_SECTION_GROUP:

begin
ctx_ddl.create_section_group('htmgroup', 'HTML_SECTION_GROUP');
end;