Data Pump is designed to fully use all available resources to maximize throughput and minimize elapsed job time. For this to happen, a system must be well balanced across CPU, memory, and I/O. In addition, standard performance tuning principles apply. For example, for maximum performance you should ensure that the files that are members of a dump file set reside on separate disks, because the dump files are written and read in parallel. Also, the disks should not be the same ones on which the source or target tablespaces reside.
Any performance tuning activity involves making trade-offs between performance and resource consumption.
The following topics are discussed in this section:

- Controlling resource consumption
- Effect of compression and encryption on performance
- Memory considerations when exporting and importing statistics
The Data Pump Export and Import utilities let you dynamically increase and decrease resource consumption for each job. This is done using the Data Pump PARALLEL parameter to specify a degree of parallelism for the job. For maximum throughput, do not set PARALLEL to much more than twice the number of CPUs (two workers for each CPU).
As you increase the degree of parallelism, CPU usage, memory consumption, and I/O bandwidth usage also increase. You must ensure that adequate amounts of these resources are available. If necessary, you can distribute files across different disk devices or channels to get the needed I/O bandwidth.
To maximize parallelism, you must supply at least one file for each degree of parallelism. The simplest way of doing this is to use substitution variables in your file names (for example, file%u.dmp). However, depending upon your disk setup (for example, simple, non-striped disks), you might not want to put all dump files on one device. In this case, it is best to specify multiple file names using substitution variables, with each in a separate directory resolving to a separate disk. Even with fast CPUs and fast disks, the path between the CPU and the disk may be the constraining factor in the degree of parallelism that can be sustained.
The Data Pump PARALLEL parameter is valid only in the Enterprise Edition of Oracle Database 11g or later.
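For example, the following Export command is a sketch of this approach (the hr schema and the dpump_dir1 and dpump_dir2 directory objects are illustrative names that you would replace with your own). It starts a job with four worker processes and uses the %u substitution variable to distribute the dump files across two directory objects, each of which should resolve to a separate disk:

> expdp hr SCHEMAS=hr PARALLEL=4 DUMPFILE=dpump_dir1:hr%u.dmp,dpump_dir2:hr%u.dmp LOGFILE=dpump_dir1:hr_exp.log

As each new file is needed, Data Pump rotates through the listed file specifications, so this command can supply at least one file for each of the four degrees of parallelism.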
The Data Pump parameters related to compression and encryption can have a positive effect on performance, particularly for jobs performed in network mode. But be aware that they can also degrade performance, because of the additional CPU resources required to transform the raw data. There are trade-offs on both sides.
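For example, the following Export command (again a sketch with illustrative schema and directory names) compresses both data and metadata as the dump file is written, reducing the amount of I/O at the cost of additional CPU time. Note that COMPRESSION=ALL requires the Oracle Advanced Compression option:

> expdp hr SCHEMAS=hr DIRECTORY=dpump_dir1 DUMPFILE=hr_comp.dmp COMPRESSION=ALL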
Data Pump Export dump files that are created with a release prior to 12.1, and that contain large amounts of statistics data, can cause an import operation to use large amounts of memory. To avoid running out of memory during the import operation, be sure to allocate enough memory before beginning the import. The exact amount of memory needed will depend upon how much data you are importing, the platform you are using, and other variables unique to your configuration.
One way to avoid this problem altogether is to set the Data Pump EXCLUDE=STATISTICS parameter on either the export or import operation. You can then use the DBMS_STATS PL/SQL package to regenerate the statistics on the target database after the import has completed.
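For example, the following sketch (the hr schema and the dpump_dir1 directory object are illustrative names) imports a dump file while skipping its statistics, and then regenerates the statistics on the target database:

> impdp hr SCHEMAS=hr DIRECTORY=dpump_dir1 DUMPFILE=hr.dmp EXCLUDE=STATISTICS

SQL> EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'HR');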
See Also:

- Oracle Database SQL Tuning Guide for information about manual statistics collection using the DBMS_STATS PL/SQL package
- Oracle Database PL/SQL Packages and Types Reference for more information about the DBMS_STATS PL/SQL package
- The Data Pump Export EXCLUDE parameter
- The Data Pump Import EXCLUDE parameter