Setting up TAF with Data Guard: A new ORA-16456 is coming!

Hi there

Normally, all of us have already setup TAF for client connection in a dataguard configuration.
We always do a simple check with a switch-over to show that a select continue to be executed on the new primary after the switch-over is completed right?

But, during my configuration test, I have a strange behavior while running select during a switch-over test.

The configuration:

Configuration Single Instance on Grid Infrastructure 12.1.2.0 PSU JULY
ODA1 srvodap01,srvodap03
ODA2 srvodap02,srvodap04

Database is 11.2.0.4
Standby created DG config done

DGMGRL> show configuration
 
Configuration - jiraprd
 
  Protection Mode: MaxPerformance
  Databases:
    jiraprd - Primary database
    jirastb - Physical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS

On Primary

srvctl add database -d JIRAPRD -o /u01/app/oracle/product/11.2.0.4/dbhome_1 -x srvodap01 -p /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/spfileJIRAPRD.ora -c single -r PRIMARY -s open

srvctl modify database -d JIRAPRD -y automatic -j "/u01/app/oracle/oradata/datastore,/u02/app/oracle/oradata/datastore,/u01/app/oracle/fast_recovery_area/datastore"

 

oracle@srvodap01:~/ [JIRAPRD] srvctl config database -d JIRAPRD
Database unique name: JIRAPRD
Database name:
Oracle home: /u01/app/oracle/product/11.2.0.4/dbhome_1
Oracle user: oracle
Spfile: /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/spfileJIRAPRD.ora
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: JIRAPRD
Database instance: JIRAPRD
Disk Groups:
Mount point paths: /u01/app/oracle/oradata/datastore,/u02/app/oracle/oradata/datastore,/u01/app/oracle/fast_recovery_area/datastore
Services: JIRAPRD_RW
Type: SINGLE
Database is administrator managed

On Standby

srvctl add database -d JIRASTB -i JIRAPRD -n JIRAPRD -o /u01/app/oracle/product/11.2.0.4/dbhome_1 -x srvodap02 -p /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/spfileJIRAPRD.ora -c single -r PHYSICAL_STANDBY -s mount

srvctl modify database -d JIRASTB -y automatic -j "/u01/app/oracle/oradata/datastore,/u02/app/oracle/oradata/datastore,/u01/app/oracle/fast_recovery_area/datastore" -s open

 

oracle@srvodap02:~/ [JIRAPRD]  srvctl config database -d JIRASTB
Database unique name: JIRASTB
Database name: JIRAPRD
Oracle home: /u01/app/oracle/product/11.2.0.4/dbhome_1
Oracle user: oracle
Spfile: /u01/app/oracle/product/11.2.0.4/dbhome_1/dbs/spfileJIRAPRD.ora
Domain:
Start options: open
Stop options: immediate
Database role: PHYSICAL_STANDBY
Management policy: AUTOMATIC
Server pools: JIRASTB
Database instance: JIRAPRD
Disk Groups:
Mount point paths: /u01/app/oracle/oradata/datastore,/u02/app/oracle/oradata/datastore,/u01/app/oracle/fast_recovery_area/datastore
Services: JIRAPRD_RW
Type: SINGLE
Database is administrator managed

Services
On primary

srvctl remove service -d JIRAPRD -s JIRAPRD_RW

srvctl add service -d JIRAPRD -s JIRAPRD_RW  -l PRIMARY -y AUTOMATIC -P basic -e select -m BASIC -z 200 -w 1

 

oracle@srvodap01:~/ [JIRAPRD] srvctl config service -d JIRAPRD -s JIRAPRD_RW
Service name: JIRAPRD_RW
Service is enabled
Server pool: JIRAPRD
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: BASIC
TAF failover retries: 200
TAF failover delay: 1
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Preferred instances: JIRAPRD
Available instances:

On Standby

srvctl remove service -d JIRASTB -s JIRAPRD_RW

srvctl add service -d JIRASTB -s JIRAPRD_RW  -l PRIMARY -y AUTOMATIC -P basic -e select -m BASIC -z 200 -w 1

 

oracle@srvodap02:~/ [JIRAPRD] srvctl config service -d JIRASTB -s JIRAPRD_RW
Service name: JIRAPRD_RW
Service is enabled
Server pool: JIRAPRD
Cardinality: 1
Disconnect: false
Service role: PHYSICAL_STANDBY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: BASIC
TAF failover retries: 200
TAF failover delay: 1
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Preferred instances: JIRAPRD
Available instances:

Tnsnames.ora entry

JIRAPRD =
(DESCRIPTION_LIST=
   (LOAD_BALANCE=off)(FAILOVER=on)
   (DESCRIPTION=
     (ADDRESS_LIST=
       (ADDRESS=(PROTOCOL=TCP)(HOST=srvodap01-vip)(PORT=1521))
       (ADDRESS=(PROTOCOL=TCP)(HOST=srvodap02-vip)(PORT=1521))
     )(CONNECT_DATA=(SERVICE_NAME=JIRAPRD_RW))
   )
)

lsnrctl status

...
Service "JIRAPRD_RW" has 1 instance(s).
  Instance "JIRAPRD", status READY, has 1 handler(s) for this service...
...

 

sqlplus dgtst/xxxxx@JIRAPRD
select * from dba_objects;
...
 
In the same time
 

dgmgrl
switchover to jirastb;
...

The select gives the following.

select * from dba_objects
*
ERROR at line 1:
<span style="color: #ff0000;"><strong>ORA-16456</strong></span>: switchover to standby in progress or completed

after few seconds, re-running the command works.

What the!!!

Oracle support says:

This looks to be expected Behavior with "Switch-over to" since the database is "shutdown immediate" and not "shutdown abort" like in "Fail-over to" and expected with OCI and TAF connections

First, this is the first time I see this error and googling gives almost an empty list...
Second, no way the primary is stopped with a "shutdown immediate" but with a "shutdown abort" initiated by DMON process.

Then after few exchanges with the support engineer, this is his statement:

Since you are performing switch-over, primary gets shutdown ( and changes its role as STANDBY) and standby is brought up as Primary-Role The Oracle Data Guard Broker uses Fast Application Notification (FAN) to send notifications to clients when a fail-over occurs. Integrated Oracle clients automatically fail-over connections and applications can mask the failure from end-users.
Due to shutdown of primary and ROLE change happens, above ORA-16456 is expected .

When this has appears as I have tested this many times with no problems at all? The answer is, from July PSU!
So before July PSU, the select works during switch-over and after July PSU it's not working anymore.
Same apply with the Oct PSU...

Might be the same with 12.2

Conclusion:

TAF has changed during switch-over. Does it make sense, well I would say maybe as now the client get notified with a real error telling that a switch-over is happening...

No I am quiet upset on the absence of communication.

Cheers
jko

 

The following two tabs change content below.

Jacques

I am Oracle Certified Master 11g & 12c database architect with significant experience in heterogeneous environments, and strong ability to lead complex and critical projects requiring multiple technical implementations. at Trivadis SA

Leave a Reply

Your email address will not be published. Required fields are marked *