PVSS00event - REDU/SEVERE - EventManager, redPeerGot, We got connection to redundant peer, but we are not buffering msgs

Discussions about product bugs & problems!
Note: This is no replacement for the Official ETM Support!
5 posts • Page 1 of 1
tpjctrl
Posts:145
Joined: Tue May 08, 2018 10:30 am


Post by tpjctrl »

Noticed this error today when doing failovers:

https://www.winccoa.com/knowledge-base/ ... we-ar.html

I have two WinCC OA servers running as VMs, let's call them A and B, with A selected as preferred. They run on two separate physical servers, can ping each other without issues using both IP and name, and their times are synced. I've observed the following:

- If I shut down A, B takes over and all looks good, but when A is powered on again, the project on server B is killed via the command mentioned in the link above (issued by server A). After a while B restarts its managers and both servers seem to run fine.

- If I shut down B, A carries on and all looks good, but when B is powered on again, the project on server A is killed via the command mentioned in the link above (issued by server A). After a while B restarts its managers and both servers seem to run fine.

The link says to check network connections, but those seem fine. I had a ping to B running on server A while B was being powered off and on, and there were no glitches at all after server B started up.

Any ideas what might be causing this?

leoknipp
Posts:2928
Joined: Tue Aug 24, 2010 7:28 pm

Re: PVSS00event - REDU/SEVERE - EventManager, redPeerGot, We got connection to redundant peer, but we are not buffering

Post by leoknipp »

If you get this log message, you have to check why the connection between the servers was closed beforehand.
The log message is written when the servers reconnect; because they are not in recovery mode, one server needs to be restarted.

Please have a look at the PVSS_II.log file and search for log messages written by the Event and/or the Redu manager.

Best Regards
Leopold Knipp
Senior Support Specialist

tpjctrl
Posts:145
Joined: Tue May 08, 2018 10:30 am

Re: PVSS00event - REDU/SEVERE - EventManager, redPeerGot, We got connection to redundant peer, but we are not buffering

Post by tpjctrl »

As usual, huge thanks for your reply Leopold!

I've actually noticed that there seems to be a time difference between the servers. I'm seeing entries in the log which state that there's currently a 47 sec time difference between A and B. Could that be linked to this error? The OS / Windows times are spot on on both servers (all OSs are synced using NTP), so I'm not sure why WinCC OA thinks there's such a large difference.

I'll have a look at the PVSS_II.log.
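
A rough sketch of that log search, in Python. It only does plain substring matching; the keywords are taken from the message quoted in the thread title, and the script name and command-line usage are assumptions for illustration, not part of WinCC OA.

```python
import sys
from typing import Iterable, List, Sequence

# Keywords copied from the message quoted in this thread.
# Extend the tuple if your log uses other manager names.
KEYWORDS = ("REDU", "EventManager", "redPeerGot")


def filter_redu_lines(lines: Iterable[str],
                      keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Return the lines containing at least one keyword (case-sensitive)."""
    return [line.rstrip("\n")
            for line in lines
            if any(k in line for k in keywords)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python filter_log.py <path to PVSS_II.log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for hit in filter_redu_lines(fh):
            print(hit)
```

Reading the matching lines in order, together with the lines around them, is usually more useful than looking at a single entry.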

leoknipp
Posts:2928
Joined: Tue Aug 24, 2010 7:28 pm

Re: PVSS00event - REDU/SEVERE - EventManager, redPeerGot, We got connection to redundant peer, but we are not buffering

Post by leoknipp »

If there really is a time difference between the servers, it can be a possible cause of the connection problems.
When the time difference is too high, the connection between the redundant servers cannot be established.
For the log messages which refer to the time difference, you also have to check when each message was written. Most of the time you have to read several log messages in combination to find the cause of a log entry.

Best Regards
Leopold Knipp
Senior Support Specialist

tpjctrl
Posts:145
Joined: Tue May 08, 2018 10:30 am

Re: PVSS00event - REDU/SEVERE - EventManager, redPeerGot, We got connection to redundant peer, but we are not buffering

Post by tpjctrl »

I've had a look at the PVSS_II.log. There's not much in there apart from REDU messages saying that msgs are not buffered and that server B is being shut down. We have, however, noticed that when server B starts up, its time is off by around 50 sec, most likely because the host on which this VM runs is also off by 50 sec. After correcting this, the problem went away, i.e. when B starts up it's no longer shut down by server A.
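
For anyone wanting to check for this kind of skew, here is a minimal Python sketch of the standard NTP-style offset calculation from four timestamps. How the timestamps are collected from the peer is left out, and the `limit_s` threshold is a made-up parameter; the tolerance WinCC OA actually applies is not stated in this thread.

```python
def clock_offset(t0: float, t1: float, t2: float, t3: float) -> float:
    """Estimate the peer clock offset in seconds (positive: peer is ahead).

    t0: local time when the request was sent
    t1: peer time when the request was received
    t2: peer time when the reply was sent
    t3: local time when the reply was received
    """
    return ((t1 - t0) + (t2 - t3)) / 2.0


def skew_too_large(offset_s: float, limit_s: float) -> bool:
    """True if the absolute offset exceeds a caller-chosen limit."""
    return abs(offset_s) > limit_s


# Example with invented timestamps: the peer clock runs about 50 s ahead.
offset = clock_offset(100.00, 150.01, 150.02, 100.03)
print(f"estimated offset: {offset:.2f} s")
print("skew too large:", skew_too_large(offset, limit_s=5.0))
```

Averaging the two one-way differences cancels the network delay as long as it is roughly the same in both directions, so the estimate reflects the clock difference rather than latency.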
