WCCILdata - REDU/WARNING - DataManager, recoveryTimeoutExpired, Recovery timeout expired, aborting recovery and restarting data manager

The following log message can appear during project startup in a redundant system when the recovery of the database has failed.

The log message is written to the PVSS_II.log file.

WCCILdata (0), 2014.09.24 10:31:14.121, REDU, WARNING, 54, Unexpected state, DataManager, recoveryTimeoutExpired, Recovery timeout expired, aborting recovery and restarting data manager

The log message with symbolic names:

WCCILdata (0), <TIMESTAMP>, REDU, WARNING, 54, Unexpected state, DataManager, recoveryTimeoutExpired, Recovery timeout expired, aborting recovery and restarting data manager

The log message is written when the allowed time is exceeded on the system that is starting up and is therefore performing the passive recovery.

  • The maximum time for the recovery of the database is specified with the following config entry in the [data] section of the config.redu file (the value is specified in seconds):

passiveRecoveryTimeout = 1800

  • If the timeout is reached, check what led to it. Possible causes are a slow network, a hard disk with insufficient read/write performance, or a large amount of data that must be copied.
  • If you want to change the timeout, change it in the config.redu file of your project.
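
As a sketch, the entry sits in the [data] section of the config.redu file as shown below. The value 3600 (one hour) is only an illustrative choice, not a recommendation, and the comment line assumes the usual # comment syntax of the config files:

```
[data]
# Maximum time in seconds allowed for the passive recovery (example value)
passiveRecoveryTimeout = 3600
```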

When the recovery is started, you will normally see the following block of log messages. If the recovery times out, the block ends with the timeout message:

WCCILdata (0), <TIMESTAMP>, REDU, INFO, 0, , Sending recovery request to other replica
WCCILdata (0), <TIMESTAMP>, REDU, INFO, 0, , Recovery request accepted, sending file list request
WCCILdata (0), <TIMESTAMP>, REDU, INFO, 0, , File transfer request sent
WCCILdata (0), <TIMESTAMP>, REDU, WARNING, 54, Unexpected state, DataManager, recoveryTimeoutExpired, Recovery timeout expired, aborting recovery and restarting data manager
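
To see how long a recovery attempt actually ran before it was aborted, the timestamps of the request and the timeout message can be compared. The following Python sketch does this for the lines of a log file. It assumes that the timestamp is the second comma-separated field and has the format of the example above (2014.09.24 10:31:14.121); the function names are hypothetical helpers, not part of the product.

```python
from datetime import datetime

# Assumed timestamp layout, taken from the example line "2014.09.24 10:31:14.121".
TS_FORMAT = "%Y.%m.%d %H:%M:%S.%f"

REQUEST_SENT = "Sending recovery request to other replica"
TIMEOUT = "recoveryTimeoutExpired"


def parse_timestamp(line):
    """Return the timestamp in the second comma-separated field, or None."""
    fields = line.split(", ")
    if len(fields) < 2:
        return None
    try:
        return datetime.strptime(fields[1].strip(), TS_FORMAT)
    except ValueError:
        return None


def seconds_until_timeout(lines):
    """Return the seconds between each recovery request and its timeout warning."""
    durations = []
    started = None
    for line in lines:
        ts = parse_timestamp(line)
        if ts is None:
            continue  # not a log line with a readable timestamp
        if REQUEST_SENT in line:
            started = ts  # a recovery attempt begins here
        elif TIMEOUT in line and started is not None:
            durations.append((ts - started).total_seconds())
            started = None
    return durations
```

A possible use is `seconds_until_timeout(open("PVSS_II.log", errors="replace"))`, where the path is a placeholder for the log file of your project. A result close to the configured passiveRecoveryTimeout suggests that the copy itself took too long.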

In rare cases the recovery request is not answered correctly by the running project on the other server of the redundant system. You will then see the following block of log messages. The two messages are 2 minutes apart.

WCCILdata (0), <TIMESTAMP>, REDU, INFO, 0, , Sending recovery request to other replica
WCCILdata (0), <TIMESTAMP>, REDU, WARNING, 54, Unexpected state, DataManager, recoveryTimeoutExpired, Recovery timeout expired, aborting recovery and restarting data manager

In this case, changing the config entry has no effect, because this 2-minute timeout is hardcoded in the source code.

If this happens, restart and start the recovery again. Normally the restart solves the problem.
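
The two situations described in this entry can be told apart by what precedes the timeout warning in the log. The following Python sketch classifies each timeout warning accordingly. It relies only on the message texts quoted in this entry; the function name and the labels are hypothetical and chosen for illustration.

```python
# Message fragments copied from the log excerpts in this entry.
REQUEST_SENT = "Sending recovery request to other replica"
REQUEST_ACCEPTED = "Recovery request accepted"
TIMEOUT = "recoveryTimeoutExpired"


def classify_recovery_timeouts(lines):
    """Return one label per timeout warning found in the given log lines.

    "unanswered_request": the request was sent but never accepted
        (the case with the fixed 2-minute timeout).
    "transfer_timeout": the request was accepted before the warning,
        so the configured passiveRecoveryTimeout applies.
    "unknown": no recovery request was seen before the warning.
    """
    labels = []
    state = None  # None, "requested" or "accepted"
    for line in lines:
        if "WCCILdata" not in line:
            continue  # only data manager messages are relevant
        if REQUEST_SENT in line:
            state = "requested"
        elif REQUEST_ACCEPTED in line:
            state = "accepted"
        elif TIMEOUT in line:
            if state == "accepted":
                labels.append("transfer_timeout")
            elif state == "requested":
                labels.append("unanswered_request")
            else:
                labels.append("unknown")
            state = None  # the data manager restarts after the warning
    return labels
```

For a warning labelled "unanswered_request", raising passiveRecoveryTimeout would not help; for "transfer_timeout", network, disk performance and data volume are the things to check.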

The following FAQ entry describes how to check the hardware performance that affects the recovery:

winccoa.com