Where to check Data Guard gap?

At work, we had a discussion with well-known colleagues, Luca Canali and Ludovico Caldara, about where we check that Data Guard recovery works as expected without gap. Several views can be queried, depending on the context. Here are a few comments about them.

v$database

This is my preferred because it relies on the actual state of the database, whatever the recovery process is:

SQL> select scn_to_timestamp(current_scn) 
from v$database;
SCN_TO_TIMESTAMP(CURRENT_SCN)
----------------------------------------------------------
22-JAN-19 03.08.32.000000000 PM

This reads the current System Change number (DICUR_SCN from X$KCCDI) and maps it to a timestamp (using the mapping SMON_SCN_TIME table).

However, relying on a function which is available only when the database is opened, this is easy only with Active Data Guard. When the database is in mount state, you will get ORA-00904: “SCN_TO_TIMESTAMP”: invalid identifier.

Note that I’ve also seen cases where, in case of gap, the SMON_SCN_TIME was not up-to-date and I got ORA-08181: specified number is not a valid system change number. Then this is not for automatic monitoring.

Without Active Data Guard, you need to do the SCN to timestamp conversion on the primary. Or read the SCN from the datafiles, but this is not the latest apply but the lastest checkpointed:

SQL> alter session set nls_date_format='dd-mm-yy hh24:mi:ss';
Session altered.
SQL> select max(checkpoint_time) from v$datafile_header;
MAX(CHECKPOINT_TI
-----------------
26-01-19 17:45:04

Reading from V$DATABASE does not rely on Data Guard and then is also available when the MRP is not started and also in Standard Edition non-managed standby.

gv$recovery_progress

This is Luca’s favored one (see is adg.sql script among many other interesting ones in https://github.com/LucaCanali/Oracle_DBA_scripts):

SQL> select inst_id, max(timestamp) 
from gv$recovery_progress group by inst_id;
INST_ID MAX(TIMESTAMP)
---------- --------------------
1 22-JAN-2019 15:08:51

Where does this information come from? If you look at the execution plan you will see that it reads X$KSULOP which is the X$ that is behind V$SESSION_LONGOPS. You can get the same timestamp from it:

SQL> alter session set nls_date_format='dd-mm-yy hh24:mi:ss';
Session altered.
SQL> select inst_id,opname,timestamp from gv$session_longops 
where opname='Media Recovery' and target_desc='Last Applied Redo';
INST_ID OPNAME               TIMESTAMP
---------- -------------------- -----------------
1 Media Recovery 26-01-19 18:56:39
1 Media Recovery 26-01-19 19:40:35

As this information comes from what the MRP (Managed Recovery Process) logs, this view is available only when the recovery is running (APPLY-ON).

v$managed_standby;

Talking about what is logged by MRP, Ludo goes to the MRP status in v$managed_standby to see the sequence and block# increase. This is very interesting as we can compare the remaining work to do, from what is received by RFS:

19:59:46 SQL> select inst_id,process,status,client_process,thread#,sequence#,block#,blocks from gv$managed_standby;
I PROCESS   STATUS       CLIENT_P THREAD# SEQUENCE# BLOCK# BLOCKS
- --------- ------------ -------- ------- --------- ------ ------
1 ARCH CLOSING ARCH 1 34 32768 481
1 DGRD ALLOCATED N/A 0 0 0 0
1 DGRD ALLOCATED N/A 0 0 0 0
1 ARCH CLOSING ARCH 1 41 28672 2046
1 ARCH CLOSING ARCH 1 37 30720 1804
1 ARCH CLOSING ARCH 1 40 26624 1608
1 RFS IDLE Archival 1 0 0 0
1 RFS IDLE UNKNOWN 0 0 0 0
1 RFS IDLE LGWR 1 42 121 1
1 RFS IDLE UNKNOWN 0 0 0 0
1 MRP0 APPLYING_LOG N/A 1 42 121 40960
11 rows selected.
19:59:50 SQL> /
I PROCESS   STATUS       CLIENT_P THREAD# SEQUENCE# BLOCK# BLOCKS
- --------- ------------ -------- ------- --------- ------ ------
1 ARCH CLOSING ARCH 1 34 32768 481
1 DGRD ALLOCATED N/A 0 0 0 0
1 DGRD ALLOCATED N/A 0 0 0 0
1 ARCH CLOSING ARCH 1 41 28672 2046
1 ARCH CLOSING ARCH 1 37 30720 1804
1 ARCH CLOSING ARCH 1 40 26624 1608
1 RFS IDLE Archival 1 0 0 0
1 RFS IDLE UNKNOWN 0 0 0 0
1 RFS IDLE LGWR 1 42 124 1
1 RFS IDLE UNKNOWN 0 0 0 0
1 MRP0 APPLYING_LOG N/A 1 42 124 40960
11 rows selected.

v$archived_log

V$ARCHIVED_LOG has an ‘APPLIED’ flag, but it is not really helpful here as it does not consider the real-time apply. In the following screenshot the changes up to 20:35:46 in sequence# 55 have been applied but V$ARCHIVED_LOG shows sequence 52 as not applied:

This view is about archived logs. But before being archived, the redo stream is received to the standby logs.

v$standby_log

With real-time apply (RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE) the redo is applied as soon as it is received in the standby redo logs. The gap should be small and is visible in v$managed_standby (number of blocks between RFS from LGWR and MRP apply). I’ve seen some monitoring queries on V$STANDBY_LOG. The idea is to read the actual state of the transport, in the same idea that when I read V$DATABASE for the actual state of apply, without relying on what is logged by the processes:

SQL> select max(last_time) 
from gv$standby_log;
MAX(LAST_TIME)
--------------------
22-JAN-2019 15:08:55

However, this query reads X$KCCSL which is not very efficient as it reads the standby redo log files. This can be long when they are large and full:

So… be careful with this one.

v$dataguard_stats

The DG Broker ‘show database’ displays the gap information. This comes from V$DATAGUARD_STATS as Data Guard checks the state at regular interval and stores the latest here:

SQL> select name||' '||value ||' '|| unit
||' computed at '||time_computed
from v$dataguard_stats;
NAME||''||VALUE||''||UNIT||'COMPUTEDAT'||TIME_COMPUTED
------------------------------------------------------
transport lag +00 00:00:00 day(2) to second(0) interval computed at 01/27/2019 22:08:33
apply lag +00 00:00:01 day(2) to second(0) interval computed at 01/27/2019 22:08:33
apply finish time +00 00:00:06.493 day(2) to second(3) interval computed at 01/27/2019 22:08:33

You must always check when the value was calculated (TIME_COMPUTED) and may add this to gap to estimate the gap from the current time, as with DGMGRL:

Role:               PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 1 second ago)
Apply Lag: 1 second (computed 1 second ago)
Average Apply Rate: 0 Byte/s
Real Time Query: OFF
Instance(s):
CDB2