Hello,
I've created a CTRL extension that connects to an AMQP broker and subscribes to a given queue.
An event named 'onMessage' is raised every time a message is consumed from an AMQP queue, and inside this event startFunc is called as below:
int onMessage(AMQPMessage *msg)
{
  // wait on a condition variable until StartFuncDoneCB() notifies, so that startFunc is called sequentially
  // some message processing
  // read some globals to prepare args for the function to be started on the WinCC OA side (queue name, ip, host etc. and the received message)
  // prepare args for startFunc (callbackArgs)
  callbackArgs->append(new TextVar(msgData));
  callbackArgs->append(new TextVar(AMQPObj->getHost().c_str()));
  callbackArgs->append(new TextVar(AMQPObj->getPort()));
  callbackArgs->append(new TextVar(queueName.c_str()));
  // end of prepare args
  // read currentScript (CtrlScript) from globals; it is registered when 'subscribe' is first called from the CTRL extension
  CtrlScript *currentScript = queueArgs.getScript();
  StartFuncDoneCB *done = new StartFuncDoneCB();
  ExecReturn ret = currentScript->startFunc(cbName, callbackArgs.get(), done);
}
For testing purposes, when I send messages to the subscribed queue too frequently (e.g. every 100 ms), I get an exception from the DLL libctrlv316 in my CTRL extension.
The exception is thrown simply because the library tries to operate on a null pointer.
As a workaround, I've added a wait on the condition variable before making the next startFunc call. I'm sure the calls to startFunc are made sequentially as planned, because my debug messages show that each call is made only after the DoneCB of the previous one has been called.
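The sequential gating described above could look roughly like the sketch below. It is only an illustration under stated assumptions: SerialGate and maxConcurrentCalls are hypothetical names, no WinCC OA types are used, and a short-lived worker thread stands in for the DoneCB notification.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper (not part of the WinCC OA API): lets only one
// call be in flight at a time.
class SerialGate
{
public:
  // Blocks until the previous call has been released, then marks the gate busy.
  void acquire()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    idle_.wait(lock, [this] { return !busy_; });
    busy_ = true;
  }

  // Intended to be called from the completion callback of the previous call.
  void release()
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(busy_); // release() without a matching acquire() is a logic error
      busy_ = false;
    }
    idle_.notify_one();
  }

private:
  std::mutex mutex_;
  std::condition_variable idle_;
  bool busy_ = false;
};

// Stand-in for the real flow: every "message" acquires the gate, and a worker
// thread simulates the asynchronous completion before releasing it.
// Returns the highest number of calls that were in flight at the same time.
int maxConcurrentCalls(int messageCount)
{
  SerialGate gate;
  std::mutex statsMutex;
  int inFlight = 0;
  int maxInFlight = 0;
  std::vector<std::thread> completions;

  for (int i = 0; i < messageCount; ++i)
  {
    gate.acquire(); // corresponds to the wait at the top of onMessage
    {
      std::lock_guard<std::mutex> lock(statsMutex);
      ++inFlight;
      if (inFlight > maxInFlight)
        maxInFlight = inFlight;
    }
    completions.emplace_back([&] {
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
      {
        std::lock_guard<std::mutex> lock(statsMutex);
        --inFlight;
      }
      gate.release(); // corresponds to the notify from the done callback
    });
  }

  for (auto &t : completions)
    t.join();
  return maxInFlight;
}
```

A gate like this only guarantees ordering. It does not change which thread makes the call, so it cannot rule out a problem that depends on the calling thread.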
I'm fairly sure this is a timing issue, because when I activate my debug flags with -dbg 10, the processing time increases and the crash no longer occurs on my PC, which of course is not a reliable fix.
As I do not have symbol information for libctrlv316, I cannot debug into it and see which pointer is null.
I would expect that such an exception is handled in this library, so that it would at least not cause a crash on the CTRL Manager making the call to the Ctrl extension.
I wonder if I'm violating some interface contract in my use of CtrlScript, as I do not use any other object such as CtrlModule.
I simply initialize it as:
CtrlScript* currentScript = param.thread->getScript();
Another interesting point: I've created a test function that mocks this startFunc behaviour with hard-coded parameters (the same parameters that I'd receive from the AMQP queue), and it does not cause the exception under the same conditions.
One difference is that in this function the CtrlScript pointer and the callback arguments are not read from the globals but prepared inside the function.
That is still odd, though, because the CtrlScript pointer is initialized only once, inside the subscribe function, and is never initialized again.
I'm not familiar with the behaviour of this CtrlScript object, but I'm sure that my code does not modify it between two consecutive calls.
Can you please tell me whether the CtrlScript pointer might, for example, be re-initialized between two startFunc calls?
If this is the cause of the exception, can you please give me a hint whether there is a standard, recommended way of using CtrlScript that prevents such a race condition?
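One possibility that nothing in this thread confirms is that 'onMessage' runs on a thread owned by the AMQP client, while the CtrlScript object is meant to be used from the manager's own thread. A minimal sketch of a hand-off that would rule out cross-thread access, assuming that is the case: PendingMessages and drainedAfterProducers are hypothetical names, and no WinCC OA types are involved.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the WinCC OA API): consumer threads only
// store the payload; a single owning thread later takes everything out and
// is the only one that would touch the script object.
class PendingMessages
{
public:
  // Safe to call from any thread, e.g. from an AMQP consumer callback.
  void push(std::string msg)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    queue_.push_back(std::move(msg));
  }

  // Called from the single owning thread: returns all stored messages in
  // arrival order and leaves the queue empty.
  std::vector<std::string> drain()
  {
    std::vector<std::string> taken;
    std::lock_guard<std::mutex> lock(mutex_);
    taken.swap(queue_);
    return taken;
  }

private:
  std::mutex mutex_;
  std::vector<std::string> queue_;
};

// Demonstration: several producer threads push concurrently, then the calling
// thread drains once. Returns the number of messages that were handed over.
std::size_t drainedAfterProducers(int producerCount, int perProducer)
{
  assert(producerCount >= 0 && perProducer >= 0);
  PendingMessages pending;
  std::vector<std::thread> producers;

  for (int p = 0; p < producerCount; ++p)
  {
    producers.emplace_back([&pending, p, perProducer] {
      for (int i = 0; i < perProducer; ++i)
        pending.push("msg " + std::to_string(p) + "/" + std::to_string(i));
    });
  }

  for (auto &t : producers)
    t.join();
  return pending.drain().size();
}
```

With such a hand-off, the consumer callback never dereferences the script pointer itself. Whether the CTRL manager offers a suitable place to call drain() from its own thread is something the vendor documentation or support would have to confirm.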
I hope I have described the issue clearly enough. I can also provide some code snippets if you are interested.
I'm looking forward to your suggestions.
Edit: A very basic flowchart and a screenshot of the exception while debugging are attached.
Timing issue causing exception in libctrlv316 when calling CtrlScript->StartFunc()
- kilianvp
- Posts:443
- Joined: Fri Jan 16, 2015 10:29 am
Re: Timing issue causing exception in libctrlv316 when calling CtrlScript->StartFunc()
I would expect that such an exception is handled in this library, so that it would at least not cause a crash on the CTRL Manager making the call to the Ctrl extension.
No, you always have to take care of all exceptions yourself.
- Sabragan
- Posts:12
- Joined: Fri Feb 01, 2019 4:56 pm
Re: Timing issue causing exception in libctrlv316 when calling CtrlScript->StartFunc()
kilianvp wrote: ↑ Fri May 29, 2020 10:29 am
I would expect that such an exception is handled in this library, so that it would at least not cause a crash on the CTRL Manager making the call to the Ctrl extension.
no you always have to take care of all exceptions by yourself.
Thank you for your response.
Unfortunately, I cannot take care of this exception myself, because it is thrown by the library libctrlv316.dll, over which I have no control. It is probably (I'm only guessing) thrown because some pointer inside the CtrlScript object is set to nullptr. That object is not managed by me alone, because I guess the interpreter is still initializing it between startFunc calls.
- kilianvp
- Posts:443
- Joined: Fri Jan 16, 2015 10:29 am
Re: Timing issue causing exception in libctrlv316 when calling CtrlScript->StartFunc()
If you think it is a bug, report it to Siemens Support!
- Sabragan
- Posts:12
- Joined: Fri Feb 01, 2019 4:56 pm
Re: Timing issue causing exception in libctrlv316 when calling CtrlScript->StartFunc()
kilianvp wrote: ↑ Fri May 29, 2020 1:37 pm
if you think it is a bug, report it to Siemens Support!
Thank you for the suggestion.
I still suspect that I'm misusing CtrlScript, as there is no clear documentation about it. That is why I'm asking people who know the internals of this library to give me an idea about the root cause.
If it becomes obvious that this is a bug, I can of course report it to Siemens Support.
- Sabragan
- Posts:12
- Joined: Fri Feb 01, 2019 4:56 pm
Re: Timing issue causing exception in libctrlv316 when calling CtrlScript->StartFunc()
Are there any other opinions?
- gschijndel
- Posts:376
- Joined: Tue Jan 15, 2019 3:12 pm
Re: Timing issue causing exception in libctrlv316 when calling CtrlScript->StartFunc()
Yes, try it with a 'CtrlModule'. I have only used 'DoneCB' for single callbacks (asynchronous functions).
- Sabragan
- Posts:12
- Joined: Fri Feb 01, 2019 4:56 pm
Re: Timing issue causing exception in libctrlv316 when calling CtrlScript->StartFunc()
gschijndel wrote: ↑ Tue Jun 02, 2020 11:30 am
Yes, try it with a 'CtrlModule'. I have only used 'DoneCB' for single callbacks (asynchronous functions).
I could not find any documentation regarding the use of CtrlModule.
Do you have an example for it?
- Sabragan
- Posts:12
- Joined: Fri Feb 01, 2019 4:56 pm
Re: Timing issue causing exception in libctrlv316 when calling CtrlScript->StartFunc()
As I've seen that the view count has reached 1200, I'd like to give a short status update.
Unfortunately, I still do not have a solution for this. I've created an issue with Siemens Support and am waiting for a response.
If anyone has experienced the same issue, I'm still open to your suggestions here.