When you monitor performance, the most important measurement is the elapsed time for a load. Other important measurements are CPU usage, memory usage, and I/O rates.
You can alter performance by increasing or decreasing the degree of parallelism. The degree of parallelism indicates the number of access drivers that can be started to process the data files. The degree of parallelism enables you to choose on a scale between slower load with little resource usage and faster load with all resources utilized. The access driver cannot automatically tune itself, because it cannot determine how many resources you want to dedicate to the access driver.
An additional consideration is that the access drivers use large I/O buffers for better performance (you can use the READSIZE
clause in the access parameters to specify the size of the buffers). On databases with shared servers, all memory used by the access drivers comes out of the system global area (SGA). For this reason, you should be careful when using external tables on shared servers.
Performance can also sometimes be increased with use of date cache functionality. By using the date cache to specify the number of unique dates anticipated during the load, you can reduce the number of date conversions done when many duplicate date or timestamp values are present in the input data. The date cache functionality provided by external tables is identical to the date cache functionality provided by SQL*Loader. See "DATE_CACHE" for a detailed description.
In addition to changing the degree of parallelism and using the date cache to improve performance, consider the following information:
Fixed-length records are processed faster than records terminated by a string.
Fixed-length fields are processed faster than delimited fields.
Single-byte character sets are the fastest to process.
Fixed-width character sets are faster to process than varying-width character sets.
Byte-length semantics for varying-width character sets are faster to process than character-length semantics.
Single-character delimiters for record terminators and field delimiters are faster to process than multicharacter delimiters.
Having the character set in the data file match the character set of the database is faster than a character set conversion.
Having data types in the data file match the data types in the database is faster than data type conversion.
Not writing rejected rows to a reject file is faster because of the reduced overhead.
Condition clauses (including WHEN
, NULLIF
, and DEFAULTIF
) slow down processing.
The access driver takes advantage of multithreading to streamline the work as much as possible.