Wednesday, May 16, 2012

DB2 LUW HADR old primary reintegration

An overview of "Reintegrating a database after a takeover operation" is given in the official documentation.
The steps (procedure) for reintegrating a database are described in the same document.


I was recently asked what would be returned if a customer attempted to start the old-Primary as a Standby while the log streams were different (i.e., the data was not consistent).

Well, even if the old primary is started as a standby with the command "start hadr on db <dbName> as standby" and the log stream on the old-Primary is different (ahead of the original standby, which is now the new primary), the user still gets a successful return code: "DB20000I  The START HADR ON DATABASE command completed successfully". Internally, however, the pairing fails and the new standby database shuts down. As the official document says, "Monitor the standby states to ensure that the reintegration of the new standby is successful, meaning that it stays online and proceeds with the normal state transition. You can do this using the GET SNAPSHOT FOR DATABASE command or the db2pd tool. If necessary, you can check the administration log file db2diag.log to find out the status of the database." So the successful return code alone proves nothing; the user has to monitor the standby state and the db2diag.log.
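Since the command's return code cannot be trusted here, part of the monitoring can be scripted. The following is only a minimal sketch, not an IBM tool: it assumes you have already read the db2diag.log contents into a string, and the function name `find_reintegration_failures` and the marker list are my own choices, taken from the messages quoted in this post.

```python
import re

# Marker strings copied from the db2diag.log excerpts in this post.
# This list is an assumption about what is worth flagging, not an
# exhaustive catalogue of HADR failure messages.
FAILURE_MARKERS = (
    "ADM12500E",
    "Pair validation rejected by primary",
    "HDR_ZRC_VALIDATION_REJECT",
)

# A record starts with a timestamp such as
# "2012-05-16-08.26.33.400961-420" at the beginning of a line.
RECORD_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d+)\s",
    re.MULTILINE,
)


def find_reintegration_failures(log_text):
    """Return (timestamp, marker) pairs for records containing a marker."""
    starts = list(RECORD_START.finditer(log_text))
    hits = []
    for i, match in enumerate(starts):
        # A record runs until the next timestamp line (or end of text).
        end = starts[i + 1].start() if i + 1 < len(starts) else len(log_text)
        record = log_text[match.start():end]
        for marker in FAILURE_MARKERS:
            if marker in record:
                hits.append((match.group(1), marker))
    return hits
```

An empty result does not prove that reintegration succeeded; it only means none of these particular markers were found, so the standby state still needs to be checked.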

The following messages are written to the db2diag.log. Notice the message "Standby is ahead of Primary" at the end of the new-primary's log below.


 

db2diag.log messages on old-Primary coming up as new standby:
----------------------------------------------------------------------------------------------

2012-05-16-08.26.33.400961-420 E203001E656         LEVEL: Error
PID     : 7425                 TID  : 47494469773632PROC : db2sysc
INSTANCE: kkchinta             NODE : 000
EDUID   : 344                  EDUNAME: db2hadrs (SSD)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduS, probe:20870
MESSAGE : ADM12500E  The HADR standby database cannot be made consistent with
          the primary database. The log stream of the standby database is
          incompatible with that of the primary database. To use this database
          as a standby, it must be recreated from a backup image or split
          mirror of the primary database.


2012-05-16-08.26.33.416153-420 I203658E360         LEVEL: Warning
PID     : 7425                 TID  : 47494469773632PROC : db2sysc
INSTANCE: kkchinta             NODE : 000
EDUID   : 344                  EDUNAME: db2hadrs (SSD)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduS, probe:20870
MESSAGE : HADR: Pair validation rejected by primary.

2012-05-16-08.26.33.423572-420 I204019E424         LEVEL: Error
PID     : 7425                 TID  : 47494469773632PROC : db2sysc
INSTANCE: kkchinta             NODE : 000
EDUID   : 344                  EDUNAME: db2hadrs (SSD)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduS, probe:20880
RETCODE : ZRC=0x87800145=-2021654203=HDR_ZRC_VALIDATION_REJECT
          "HADR shuts down due to validation rejection"
 

db2diag.log messages on new-primary
--------------------------------------------------
2012-05-16-08.26.33.379656-420 I657118E373         LEVEL: Warning
PID     : 14075                TID  : 47572295084352PROC : db2sysc
INSTANCE: kkchinta             NODE : 000
EDUID   : 330                  EDUNAME: db2hadrp (SSD)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduP, probe:20482
MESSAGE : Old primary requesting rejoining HADR pair as a standby

2012-05-16-08.26.33.385077-420 E657492E656         LEVEL: Error
PID     : 14075                TID  : 47572295084352PROC : db2sysc
INSTANCE: kkchinta             NODE : 000
EDUID   : 330                  EDUNAME: db2hadrp (SSD)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduP, probe:20485
MESSAGE : ADM12500E  The HADR standby database cannot be made consistent with
          the primary database. The log stream of the standby database is
          incompatible with that of the primary database. To use this database
          as a standby, it must be recreated from a backup image or split
          mirror of the primary database.

2012-05-16-08.26.33.396369-420 I658149E431         LEVEL: Warning
PID     : 14075                TID  : 47572295084352PROC : db2sysc
INSTANCE: kkchinta             NODE : 000
EDUID   : 330                  EDUNAME: db2hadrp (SSD)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduP, probe:20485
MESSAGE : HADR Pair validation failed. Standby is ahead of Primary.  Standby
          LSO: 40857824 Primary LSO: 38510302
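
To see how far the old primary had run ahead, the two LSO values in that last message can be pulled out and subtracted. This is a small sketch under my own assumptions: the helper name `parse_lso_gap` is hypothetical, the pattern is based only on the message text shown above, and reading the difference as a distance in the log stream assumes the LSO is an offset into that stream.

```python
import re


def parse_lso_gap(message):
    """Return (standby_lso, primary_lso, gap) parsed from the message, or None."""
    # The message wraps across lines in db2diag.log, so collapse all
    # whitespace runs before matching.
    flat = " ".join(message.split())
    match = re.search(r"Standby LSO: (\d+) Primary LSO: (\d+)", flat)
    if match is None:
        return None
    standby = int(match.group(1))
    primary = int(match.group(2))
    # A positive gap means the standby is ahead of the primary.
    return standby, primary, standby - primary


message = """HADR Pair validation failed. Standby is ahead of Primary.  Standby
          LSO: 40857824 Primary LSO: 38510302"""
print(parse_lso_gap(message))  # (40857824, 38510302, 2347522)
```

For the values logged here, the old primary is ahead by 40857824 - 38510302 = 2347522, which is why the new primary rejects the pairing.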