Because the system can index most document formats, including HTML, PDF, Microsoft Word, and plain text, you can load any supported type into the text column.
When the text column holds mixed formats, you can optionally include a format column to help with filtering during indexing. The format column lets you specify whether each document is binary (formatted) or text (non-formatted, such as HTML). If you mix HTML and XML documents in one index, you might not be able to configure the index to your needs; you cannot prevent stylesheet information from being added to the index.
See Oracle Text Reference for more information about the supported document formats.
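The format column described above can be sketched as follows. This is a minimal illustration, not a definitive setup: the table name docs, the column names id, fmt, and text, and the index name docs_idx are all hypothetical, and the sketch assumes the format column is named in the index parameter string with FORMAT COLUMN and that each row stores TEXT or BINARY as its value.

```sql
-- Hypothetical table; docs, id, fmt, and text are illustrative names only.
CREATE TABLE docs (
  id   NUMBER PRIMARY KEY,
  fmt  VARCHAR2(10),  -- assumed values: 'TEXT' or 'BINARY', one per row
  text BLOB           -- text column holding documents of mixed formats
);

-- Name the format column in the parameter string so that filtering
-- can distinguish binary (formatted) rows from text rows.
CREATE INDEX docs_idx ON docs (text)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('FORMAT COLUMN fmt');
```

Confirm the exact parameter syntax and the permitted format column values in Oracle Text Reference before relying on this sketch.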