Paul Guerin
Oracle Developers
Published in
8 min readMar 10, 2019

--

You take the high road, and I’ll take the unload

Sometimes you need to take the direct path less travelled.

An example of a simple bulk load is the Create Table As Select (CTAS) command as follows:

CREATE TABLE &tbl_1
AS
SELECT * FROM &tbl_2;

However the simple bulk load is not the best choice for either:

  • migrating an application to another database.
  • migrating a new feature to an application in another database.

Instead, the datapump export and import utilities are an excellent choice for migrating a whole schema, or just several tables and indexes.

Using the datapump export, the data is unloaded to a dump file set, and then the datapump import loads the data to the destination database using either the conventional path, or the direct path.

The direct path offers the ability to replace conventional redo records with invalidation redo records. The invalidation redo records mark a range of blocks as logically corrupt, but reduce the overhead of the load to improve insert performance.

The performance of a direct path load that generates invalidation redo records will depend on many factors.

However, what performance benefits are possible to achieve using a direct path load that generates invalidation redo records?

Initial setup

Before a load can be assessed for performance, we first need the datapump export to unload data to create a dump file set.

Creating a dump file set can be as easy as the following:

> expdp system parfile=d:\exportdp.table.index.par

Once the dump file set is complete, there is a prerequisite configuration to ensure the direct path load can generate invalidation redo records.

To generate invalidation redo records, the forced logging configuration needs to be disabled at the database level.

Disabling the forced logging of redo records is performed as follows:

ALTER DATABASE NO FORCE LOGGING;Database altered.SQL>
SELECT force_logging FROM v$database;FORCE_LOGGING
——————————---
NO
SQL>

Now that forced logging is disabled, we have the opportunity to maximise load performance.

A direct path load

The following datapump import statement uses the direct path as the default access method:

> impdp system parfile=d:\importdp.par

The head of the logfile of the datapump import is below:

Import: Release 12.2.0.1.0 — Production on Sun Oct 21 01:40:24 2018Copyright © 1982, 2018, Oracle and/or its affiliates. All rights reserved.
Password:
Connected to: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0–64bit Production
Database Directory Object has defaulted to: “DATA_PUMP_DIR”.
Master table “SYSTEM”.”IMPDP” successfully loaded/unloaded
Starting “SYSTEM”.”IMPDP”: system/******** parfile=d:importdp.par
Processing object type SCHEMA_EXPORT/USER
..

Now switch from command line mode, to interactive mode with a ctrl+c, then exit interactive mode:

Export> exit_client

The datapump import will continue to load in the background.

From an SQLPLUS client, it’s easy to prove that the access method of the load is the direct path by querying the transaction statistics:

col name format a40
SELECT name, status, noundo, used_ublk, log_io, phy_io
FROM v$transaction;
NAME STATUS NOU USED_UBLK LOG_IO PHY_IO
—————————————————------------- ------ --- --------- ------ ------
ACTIVE NO 1 13103 12990
SQL> /NAME STATUS NOU USED_UBLK LOG_IO PHY_IO
—————————————————------------- ------ --- --------- ------ ------
ACTIVE NO 1 40640 40398
SQL> /NAME STATUS NOU USED_UBLK LOG_IO PHY_IO
—————————————————------------- ------ --- --------- ------ ------
ACTIVE NO 1 64784 64258
SQL>

The transaction has several thousand I/O but few used undo blocks, and this behavior is consistent with the direct path access method.

The forced logging was disabled earlier, but because the objects being migrated are still in logging mode, we expect the direct path load to generate normal redo records:

col name format a30
SELECT sid, name, TRUNC(value/1024/1024,1) "MB"
FROM v$sesstat ss
INNER JOIN v$statname sn
ON (ss.statistic# = sn.statistic#)
WHERE name IN ('redo size for direct writes', —- direct path only
'redo size') -— conventional plus direct path
AND sid IN (SELECT sid
FROM v$session
WHERE program='impdp.exe' or module like 'Data Pump%')
ORDER BY sid, name;
SID NAME MB
—-— -—-—-—---------------—------ ------
20 redo size .3
20 redo size for direct writes .1
23 redo size .8
23 redo size for direct writes 0
151 redo size 244.1
151 redo size for direct writes 242.2
SQL> /SID NAME MB
—-— ———————————----------------- ------
20 redo size .3
20 redo size for direct writes .1
23 redo size .8
23 redo size for direct writes 0
151 redo size 1143.1
151 redo size for direct writes 1141.1
SQL> /SID NAME MB
--— —---------------—-—————————- ------
20 redo size .4
20 redo size for direct writes .1
23 redo size .8
23 redo size for direct writes 0
151 redo size 1618.5
151 redo size for direct writes 1616.1
SQL>

The session statistics show that the ‘redo size for direct writes’ is steadily increasing, and this suggests that normal redo records are being created for the direct path load.

For the generation of many redo records, we expect many switches of the online redo logs:

-— recent activity of the online redo logs
col 1st_time format a15
SELECT SEQUENCE#,
(sysdate-FIRST_TIME) day(1) to second (0) "1st_time",
FIRST_CHANGE#,
NEXT_CHANGE#
FROM v$log_history
WHERE NUMTODSINTERVAL(sysdate-first_time,'DAY')
< NUMTODSINTERVAL(20,'MINUTE');
SEQUENCE# 1st_time FIRST_CHANGE# NEXT_CHANGE#
————————— —————------ ------------- ------------
779 +0 00:20:01 7718053 7718210
780 +0 00:19:06 7718210 7718366
781 +0 00:18:14 7718366 7718552
782 +0 00:17:40 7718552 7718726
783 +0 00:17:02 7718726 7718882
784 +0 00:16:27 7718882 7719653
785 +0 00:15:27 7719653 7719870
786 +0 00:14:23 7719870 7720156
SQL>

There are many switches of the online redo logs during the time of the direct path load, and the reason is because the objects created are in logging mode.

The SQL Developer client can also show redo record generation, and undo utilization, through the instance viewer of the DBA menu.

The upper right-hand-side of the display shows the undo utilization at only 2.8%, which is low. The lower left-hand-side shows the rate of redo record generation as high as 5.5 MB/sec.

After the direct path load is finished, if there were any invalidation redo records created, they will be recorded:

SELECT UNRECOVERABLE_CHANGE#,
TO_CHAR(UNRECOVERABLE_TIME, 'Dy ddMonyyyy hh24:mi:ss') timestamp
FROM V$DATAFILE
WHERE UNRECOVERABLE_CHANGE# > 0
AND ts# IN (SELECT ts# FROM v$tablespace
WHERE name=UPPER('&tblspc'))
AND NUMTODSINTERVAL(sysdate-UNRECOVERABLE_TIME,'DAY')
< NUMTODSINTERVAL(20,'MINUTE');
no rows selectedSQL>

So certainly normal redo records were generated instead of invalidation redo records.

The tail of the logfile of the datapump import will also show the benchmark performance of the datapump import:

..
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Job “SYSTEM”.”IMPDP” completed with 0 error(s) at Sun Oct 21 23:32:03 2018 elapsed 0 00:08:13

So the datapump import has taken over 8 minutes for a direct path load with the generation of normal redo records.

A direct path load using the TRANSFORM clause

New for Oracle 12c, is the ability to generate invalidation redo records, even if all the objects loaded are in logging mode.

To generate invalidation redo records, the TRANSFORM clause with DISABLE_ARCHIVE_LOGGING:Y is specified.

> impdp system transform=disable_archive_logging:y parfile=d:\importdp.par

The value DISABLE_ARCHIVE_LOGGING:Y means invalidation redo records are generated for all tables and indexes in the load.

The head of the logfile of the datapump import is below:

Import: Release 12.2.0.1.0 — Production on Mon Oct 22 00:11:26 2018Copyright © 1982, 2018, Oracle and/or its affiliates. All rights reserved.
Password:
Connected to: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0–64bit Production
Database Directory Object has defaulted to: “DATA_PUMP_DIR”.
Master table “SYSTEM”.”IMPDP” successfully loaded/unloaded
Starting “SYSTEM”.”IMPDP”: system/******** transform=disable_archive_logging:y parfile=d:importdp.par
Processing object type SCHEMA_EXPORT/USER
..

Now switch from command line mode, to interactive mode with a ctrl+c, then exit interactive mode:

Export> exit_client

The datapump import will continue to load in the background.

From an SQLPLUS client, it’s easy to prove that the access method of the load is the direct path by querying the transaction statistics:

col name format a40
SELECT name, status, noundo, used_ublk, log_io, phy_io
FROM v$transaction;
NAME STATUS NOU USED_UBLK LOG_IO PHY_IO
—————————————————------------- ------ --- --------- ------ ------
ACTIVE NO 1 61 75269
SQL> /NAME STATUS NOU USED_UBLK LOG_IO PHY_IO
—————————————————------------- ------ --- --------- ------ ------
ACTIVE NO 1 6854 107258
SQL> /NAME STATUS NOU USED_UBLK LOG_IO PHY_IO
—————————————————------------- ------ --- --------- ------ ------
ACTIVE NO 1 16203 127761
SQL>

As for the previous load, the transaction statistics show there are few undo blocks used, so the direct path access method is being used.

The question is — how many redo records are generated this time?

col name format a30
SELECT sid, name, TRUNC(value/1024/1024,1) "MB"
FROM v$sesstat ss
INNER JOIN v$statname sn
ON (ss.statistic# = sn.statistic#)
WHERE name IN ('redo size for direct writes', -— direct path only
'redo size') -— conventional plus direct path
AND sid IN (SELECT sid
FROM v$session
WHERE program='impdp.exe' or module like 'Data Pump%')
ORDER BY sid, name;
SID NAME MB
—-— -—-—-—---------------—------ ------
132 redo size 2.4
132 redo size for direct writes .8
133 redo size .8
133 redo size for direct writes 0
385 redo size .4
385 redo size for direct writes .1
SQL> /SID NAME MB
—-— -—-—-—---------------—------ ------
132 redo size 3.1
132 redo size for direct writes .8
133 redo size .8
133 redo size for direct writes 0
385 redo size .4
385 redo size for direct writes .1
SQL> /SID NAME MB
—-— -—-—-—---------------—------ ------
132 redo size 3.1
132 redo size for direct writes .8
133 redo size .8
133 redo size for direct writes 0
385 redo size .4
385 redo size for direct writes .1
SQL>

For the previous load, the session statistics showed that the ‘redo size for direct writes’ was steadily increasing, but now this is not the case.

The ‘redo size for direct writes’ metric is static.

Are there many switches of the online redo logs?

—- recent activity of the online redo logs
col 1st_time format a15
SELECT SEQUENCE#,
(sysdate-FIRST_TIME) day(1) to second (0) "1st_time",
FIRST_CHANGE#,
NEXT_CHANGE#
FROM v$log_history
WHERE NUMTODSINTERVAL(sysdate-first_time,'DAY')
< NUMTODSINTERVAL(20,'MINUTE');
no rows selectedSQL>

So the load has small ‘redo size for direct writes’, and no switching of online redo logs, and this suggests that invalidation redo records are being generated.

The SQL Developer client will also show redo record generation:

The undo utilization is still low (2.8%), and the graph of the redo record generation shows the rate is 45kB/sec or lower.

The rate of redo record generation at 45kB/sec is much lower than the previous load (maximum of 5.5 MB/sec).

After the direct path load is finished, if there were any invalidation redo records created, they will be recorded:

SELECT UNRECOVERABLE_CHANGE#,
TO_CHAR(UNRECOVERABLE_TIME, 'Dy ddMonyyyy hh24:mi:ss') timestamp
FROM V$DATAFILE
WHERE UNRECOVERABLE_CHANGE# > 0
AND ts# IN (SELECT ts# FROM v$tablespace
WHERE name = UPPER('&tblspc'))
AND NUMTODSINTERVAL(sysdate-UNRECOVERABLE_TIME,'DAY') < NUMTODSINTERVAL(20,'MINUTE');
UNRECOVERABLE_CHANGE# TIMESTAMP
————————————--------— ———-------------------
7727610 Mon 22Oct2018 00:12:49
SQL>

For the last load there were no invalidation redo records, but now there is a record in recent history.

With less redo records generated than before, there should be an improved load performance.

The tail of the logfile of the datapump import will show the improvement of the load performance:

..
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Job “SYSTEM”.”IMPDP” completed with 0 error(s) at Mon Oct 22 00:12:49 2018 elapsed 0 00:01:17

So previously the elapsed time of the datapump import took over 8 minutes, but now the same load is just over 1 minute!

In Oracle 12c, using the TRANSFORM clause for the datapump import, it’s never been easier to load objects at the highest levels of Oracle database performance.

See Also

The options for the TRANSFORM clause are here:

https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sutil/datapump-import-utility.html#GUID-64FB67BD-EB67-4F50-A4D2-5D34518E6BDB

About the Author

Paul Guerin is an international consultant that specialises in Oracle database performance. Paul is based from a global delivery center in South East Asia, but has clients from Australia, Europe, Asia, and North America. Moreover, he has presented at some of the world’s leading Oracle conferences, including Oracle Open World 2013. His work has also been featured in the IOUG Best Practices Tip Booklet of 2015, and also in the Oracle Technology Network (OTN). He is a DBA OCP, and continues to be a participant of the Oracle ACE program.

--

--