Faster export with Data Pump parallel option

October 31, 2010  |  dba

Exporting Oracle database faster with parallel option is an excelent feature of oracle Datapump. Datapump export has become the defacto export tool on Oracle 10g (exp has been deprecated). Datapump has command line utils (expdp/impdp) and a pl/sql API (dbms_datapump).

Nowadays DBA’s work with databases with gigabytes or terabytes of data. Exporting data can be a very time consuming job when we needed to export lots of data in a short period of time. Datapump EXPDP has the hability to export data as compressed format, thus achieving faster write times, but this will use more processor time. Datapump also has the ability to open several parallel write channels to improve the overall export time.

Today i’ll explain how parallel parameter works. Parallel allows you to launch several dump processes, thus exporting data much faster. You can specify the numbers of processes you want to launch by setting parallel=n. You can specify the filenames of each processes individually or the filename can contain a substitution variable (%U), which implies that multiple files may be generated. The substitution variable is expanded in the resulting filenames into a 2-digit, fixed-width, incrementing integer starting at 01 and ending at 99.

If you name your files individually (I usually only use this when exporting to different disks), the value you specify for integer “n” should be less than, or equal to, the number of files in the dump file set. Because each active worker process or I/O server process writes exclusively to one file at a time, an insufficient number of files can have adverse effects. Some of the worker processes will be idle while waiting for files, thereby degrading the overall performance of the job.

In this small example I will show how to export data using 4 channels.

expdp user=bigdb parallel=4 DUMPFILE=expdata%U.dmp

This command will create 4 files in the default datapump export directory and this export will then be imported by using the %U substitution variable again.

To import the datapump in parallel we can use the following example:

impdp parallel=4 DUMPFILE=expdata%U.dmp

Here is a general guideline for what should be considered when using the PARALLEL parameter:
1 – Set the degree of parallelism to two times the number of CPUs, then tune from there.
2 – To Export, the PARALLEL parameter value should be less than or equal to the number of dump files.
3 – To Import, the PARALLEL parameter value should not be much larger than the numbers of files in the dumpset.

Happy Hallooooween!

1 Comment

  1. 1 – Set the degree of parallelism to two times the number of CPUs, then tune from there.

    I need to tune from up or from down?
    I’m running a expdp, but it’s stays on an object SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS

Leave a Reply