Oracle Data Guard Fast-Start Failover: Increase Database Availability

Sourav Biswas
Airtel Digital
Published in
4 min readMar 18, 2023

Introduction

This document will guide you on how to configure Oracle Data Guard Fast-Start Failover (FSFO) using a physical standby database. FSFO provide substantial gains in high availability and disaster recovery preparedness on Cloud-based systems as well as global distributed data centers. This feature allows failover a failed primary database automatically to a predetermined standby database. Thereby increasing the availability of the database by eliminating the need for DBA involvement as part of the failover process.

Why FSFO?

In a typical failover scenario, the DBA receives a notification about failed primary database. Then, the DBA logs into the system and execute the required commands to failover the database and if possible reinstate that failed primary as a standby. This entire episode will take around 10-15 minutes at best, hoping, a DBA is monitoring every move of the standby. On the other hand, FSFO augments the broker to perform these tasks automatically within couple of minutes based on predefined parameters.

What happens in background?

The key component to this feature is a process called Observer which monitors the availability of the primary database. It is configured on a system (a third server, other than Primary and Standby) running the Oracle Data Guard configuration. As soon as it detects the primary database (all instances in an Oracle RAC environment) has become unavailable or unable to establish connection to the primary database, it will issue a failover after waiting for predefined threshold period. After failover process, the Oracle Data Guard Broker will attempt to reinstate the failed primary database as a standby when connectivity is re-established.

Note: Fast-Start Failover will not be initiated when a database is shut down using normal, transactional, or immediate options.

Relationship of Primary and Standby Databases and the Observer

Configurable conditions for Fast-Start Failover:-

  • Datafile Offline
  • Corrupted Dictionary
  • Corrupted Controlfile
  • Inaccessible Log File
  • Stuck Archiver Failover
  • Application Induced Failover (This is manual failover invoked by application by calling dbms_dg.initiate_fs_failover function)

Steps to configure FSFO

Assumption:

  • 2 node Primary DB
  • 2 node Physical Standby
  • Data Guard Broker is configured

Create new VM or provision new BM for hosting Observer process

Install Oracle Binaries on new VM/BM

Modify “tnsnames.ora” on Observer server

CDB =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = srv78-scan.airlab.com)(PORT = 1531))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = cdb.airlab.com)))

cdb_serv910 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = srv910-scan.airlab.com)(PORT = 1531)))
(CONNECT_DATA =
(SERVICE_NAME = cdb_serv910.airlab.com)))

Create wallet keys on Observer server

### Create wallet
[oracle@oem wallet]$ mkstore -wrl $ORACLE_BASE/wallet/ -create
Oracle Secret Store Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright © 2004, 2021, Oracle and/or its affiliates. All rights reserved.
Enter password:
Enter password again:

### Create credentials - Primary | Credential name should match tns alias
[oracle@oem wallet]$ mkstore -wrl $ORACLE_BASE/wallet/ -createCredential 'cdb' sys Airtel_123
Oracle Secret Store Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright © 2004, 2021, Oracle and/or its affiliates. All rights reserved.
Enter wallet password:

### Create credentials - Standby | Credential name should match tns alias
[oracle@oem wallet]$ mkstore -wrl $ORACLE_BASE/wallet/ -createCredential 'cdb_serv910' sys Airtel_123
Oracle Secret Store Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright © 2004, 2021, Oracle and/or its affiliates. All rights reserved.
Enter wallet password:

### In our sample environment we have multiple observer process running from same Observer Server.
### So we see multiple credentials. Find below list of multiple credentials:-
[oracle@oem wallet]$ mkstore -wrl $ORACLE_BASE/wallet/ -listCredential
Oracle Secret Store Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright © 2004, 2021, Oracle and/or its affiliates. All rights reserved.
Enter wallet password:
List credential (index: connect_string username)
4: cdb sys
3: cdb_serv910 sys
2: lion56 sys
1: lion sys

Update sqlnet.ora on Oberser server. DIRECTORY should use wallet location created in above step

[oracle@oem ~]$ cat $TNS_ADMIN/sqlnet.ora
NAMES.DIRECTORY_PATH= (TNSNAMES, ONAMES, HOSTNAME,EZCONNECT)
WALLET_LOCATION=(SOURCE=(METHOD=FILE)(METHOD_DATA=(DIRECTORY=/d01/app/oracle/wallet/)))
SQLNET.WALLET_OVERRIDE=TRUE

Enable Flashback on primary

SQL> alter system set db_recovery_file_dest_size=10G scope=both sid='*';
System altered.

SQL> alter system set db_recovery_file_dest='+DATA' scope=both sid='*';
System altered.

SQL> alter database flashback on;
Database altered.

Enable Flashback on Standby

SQL> alter database recover managed standby database cancel;
Database altered.

SQL> alter system set db_recovery_file_dest_size=10G scope=both sid='*';
System altered.

SQL> alter system set db_recovery_file_dest='+DATA' scope=both sid='*';
System altered.

SQL> alter database flashback on;
Database altered.

SQL> alter database recover managed standby database nodelay disconnect;
Database altered.

Set DGConnectIdentifier & ObserverConnectIdentifier for both database. This connect identified should match tns alias.

DGMGRL> edit database cdb_serv910 set property DGConnectIdentifier='cdb_serv910';
Property "dgconnectidentifier" updated

DGMGRL> edit database cdb set property DGConnectIdentifier='cdb';
Property "dgconnectidentifier" updated

DGMGRL> edit database cdb_serv910 set property ObserverConnectIdentifier='cdb_serv910';
Property "observerconnectidentifier" updated

DGMGRL> edit database cdb set property ObserverConnectIdentifier='cdb';
Property "observerconnectidentifier" updated

Enable Fast-Start Failover

DGMGRL> enable fast_start failover
Enabled in Potential Data Loss Mode.

DGMGRL> show fast_start failover
Fast-Start Failover: Enabled in Potential Data Loss Mode
Protection Mode: MaxPerformance
Lag Limit: 30 seconds
Threshold: 30 seconds
Active Target: cdb_serv910
Potential Targets: "cdb_serv910"
cdb_serv910 valid
Observer: (none)
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Observer Reconnect: (none)
Observer Override: FALSE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Write Errors YES
Oracle Error Conditions:
(none)

Start Observer process from Observer server

DGMGRL> start observer cdb_observer in background file is /d01/app/oracle/product/19c/network/admin/fsfo_cdb.dat logfile is /d01/app/oracle/product/19c/network/admin/observer_cdb.log connect identifier is cdb
Connected to "cdb"
Submitted command "START OBSERVER" using connect identifier "cdb"
DGMGRL> DGMGRL for Linux: Release 19.0.0.0.0 - Production on Mon Nov 28 14:13:13 2022
Version 19.15.0.0.0
Copyright © 1982, 2019, Oracle and/or its affiliates. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected to "cdb"
Connected as SYSDBA.
Succeeded in opening the observer file "/d01/app/oracle/product/19c/network/admin/fsfo_cdb.dat".
Observer 'cdb_observer' started
The observer log file is '/d01/app/oracle/product/19c/network/admin/observer_cdb.log'.

--

--