Timing bug in the System Management panel and best practice for avoiding it

2 posts • Page 1 of 1
v3gard
Posts:9
Joined: Tue Aug 23, 2016 7:27 am


Post by v3gard »

Sorry for the wall of text below. I felt it was necessary to explain my problem properly as I am in fact pointing out a bug in the System Management panel (that probably never occurs unless dpGet becomes atomically fast :))

In order to improve my understanding of how OOP is used to dynamically generate buttons in a LayoutManagement controlled panel in OA, I reverse engineered the System Management panel and analyzed each line of code, stripping out all lines not relevant for my goal which was to learn the relation between panels and object instances.

Before I modified any of the files, I made copies of the following files and renamed them accordingly (note that I deliberately ignored the Breadcrumb part in the System Management panel to simplify things):
  • classes/Tile.ctl -> classes/NETile/NETile.ctl
  • classes/TileScreen.ctl -> classes/NETile/NETileScreen.ctl
  • vision/SystemMgm/sysMgm_iconview_icon.pnl -> vision/NETile/NETile_overview_icon.pnl
  • vision/SystemMgm/sysMgm_GridPanel.pnl -> vision/NETile/NETile_GridPanel.pnl
After that, I renamed Tile and TileScreen to NETile and NETileScreen (both the files and the class names). I also created a brand new panel (NEGridTest.pnl) and imported the necessary code (scopelib, init) and panel objects from vision/SysMgm.pnl into it. This was required for creating instances of the aforementioned classes (NETile, NETileScreen), for drawing the buttons dynamically, and for handling the #Clicked trigger event with uiConnect(). As previously mentioned, I deliberately left out the breadcrumb part to simplify things; the resize code was not relevant at this point and is therefore not included either.

Then I started removing all references to para and the sysMgm library and replaced the information with local variables. Instead of getting the icon and label text from a data point like the original SysMgm panel, I simply modified the NETile.ctl class to also store this information as properties and adjusted the sysMgm_iconview_icon.pnl (now known as NETile_overview_icon.pnl) to get the required information from NETile instead of a DP.

Eventually, I got everything up and running without any use of datapoints. I still had one line of code that kept me from removing the reference to the #SysMgm library, and that was smDbLoadRoot(dps). The weird thing here is that the buttons would not be drawn unless that line of code was included and run as the panel was started. If I commented it out, a call to getShape() in NETile.ctl failed to execute. This is unchanged from Tile.ctl:

Code:

grid = getShape(moduleName + "." + panelName + ":" + layout);
The error message received was:

Code:

Line: 178, In "getShape()": Object "NEGridTest_Module.panel:GRID" does not exist
This did not initially make any sense. Why would a call to smDbLoadRoot(dps) affect getShape(), and why was the panel reference found only when that function had been run? After discussing the problem with a colleague and inspecting what smDbLoadRoot actually does (some dpGet/dpSet calls and more), we discovered that the only relation between these functions and the error message was timing! When I replaced smDbLoadRoot(dps); with delay(0,10); the error vanished and the buttons appeared.

Since we are using multiple panel objects spread over three different files, this sort of makes sense. The keyword here is synchronization. The panels need to know about each other at some point, since dynamically placed buttons need to be dynamically addressed in a CONTROL++ function that runs outside of the panel. Failure to do so causes getShape() to fail because the object has not been synchronized to VISION yet. At this point, getShape() could not get a reference to an object whose name had been modified in CONTROL at runtime (NEGridTest_Module.panel:GRID, originally called SysMgmt_Module.panel:GRID). With a forced 10 ms delay, however, getShape() ran successfully.

My question in all of this is: what is considered best practice for solving this timing issue in CONTROL? The current solution with delay is ugly and more of a hack than anything else. I am interested in discussing the proper way of addressing this issue. Obviously a mutex of sorts is required. Is there a built-in function in CONTROL for knowing when a dynamically created panel object is ready for processing (i.e. retrievable by getShape)? Can it be solved using the synchronized keyword on certain classes, as discussed in qthelp://wincc_oa/doc/OOP/oop_basics.htm, or is there a better way? I feel that using a triggerEvent with a uiConnect to a callback for solving this is a bit excessive.
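One less brittle alternative to a fixed delay would be to poll for the object with a timeout, so the script continues as soon as the object is available and fails loudly if it never shows up. The sketch below is written in Python rather than CONTROL, purely to illustrate the pattern. The names wait_until, FakeUi and shape_exists are hypothetical stand-ins; in CONTROL the predicate would have to be some existence check for the shape (shapeExists() looks like a candidate, but that is an untested assumption here).

```python
import time


def wait_until(predicate, timeout_s=1.0, interval_s=0.01):
    """Poll predicate() until it returns True or timeout_s elapses.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Hypothetical stand-in for the UI: the "shape" only becomes
# visible after a few polls, mimicking the synchronization lag.
class FakeUi:
    def __init__(self, ready_after_polls):
        self.polls = 0
        self.ready_after_polls = ready_after_polls

    def shape_exists(self, name):
        self.polls += 1
        return self.polls > self.ready_after_polls


ui = FakeUi(ready_after_polls=3)

# Succeeds on the fourth poll instead of waiting a fixed, guessed delay.
found = wait_until(lambda: ui.shape_exists("NEGridTest_Module.panel:GRID"))

# A condition that never becomes true ends in a timeout, not a hang.
never = wait_until(lambda: False, timeout_s=0.05)
```

This still polls, so it is not a true readiness notification, but the wait adapts to the actual lag and a missing object becomes an explicit failure case rather than a getShape() error.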

I have attached my code for those interested. Simply extract the zip file in your project folder, restart GEDI and run NEGridTest.pnl. Note that resizing is not properly handled yet. https://www.winccoa.com/fileadmin/image ... OPTest.zip
Attachments
OOPTest.zip

fmulder
Posts:330
Joined: Wed Feb 03, 2010 9:46 am

Re: Timing bug in the System Management panel and best practice for avoiding it

Post by fmulder »

We've run into these kinds of timing issues many times. They are often quite hard to find and difficult to fix.

We recently had the following issue:
* We had one callback that shows some cameras on a video wall
* A piece of scripting cleans up the videowall when you logout

The following happened:
* A menu item was chosen to logout and a script function started clearing the videowall
* At the same time the callback triggered and decided to show the cameras.
* This callback was terminated 'in the middle' because the panel was closed
* Effect -> Most cameras were 'cleared' but the callback managed to show some of them (before being terminated)

I might believe that I'm a good programmer, but still... it took me a day to find the timing bug (that I personally 'wrote').

I'm still looking for a good solution

What I do quite often is use a callback to 'channel' commands. Example: we always have one dpConnect() that takes care of the screen. You do a dpSet() to tell the system what screen configuration you want. Let's say that you want to open a process overview and then someone quickly says 'Logout'. You probably know that a callback is always executed completely before the next callback starts. You also know that the callbacks are always executed in the right order.
This means that the command 'logout' will always be executed after the command 'floor plan'. Result -> the screen is opened in the right manner.

Bad :
One script is building up your screen by opening several panels while the menu item 'Logout' is closing several panels. If you quickly select 'Logout' while the screen is still building up, there's a good chance that 'Logout' tries to close some screens that are in the middle of opening.

Good:
Use the callback. This will guarantee that 'Logout' is executed only after 'Open screens' has completely finished!
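The 'channel' idea can be sketched outside of CONTROL as a single worker that takes commands from a queue and runs each one to completion before starting the next. This is a Python illustration of the pattern, not of the dpConnect() API; CommandChannel, open_screens and logout are hypothetical names, and the queue plays the role of the data point that receives the dpSet() commands.

```python
import queue
import threading


class CommandChannel:
    """Run submitted commands one at a time, in submission order."""

    def __init__(self):
        self._commands = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, command):
        # Comparable to the dpSet(): only records what should happen.
        self._commands.put(command)

    def _run(self):
        # Comparable to the single callback: one command at a time.
        while True:
            command = self._commands.get()
            try:
                command()  # runs to completion before the next one starts
            finally:
                self._commands.task_done()

    def wait_idle(self):
        # Block until every submitted command has finished.
        self._commands.join()


log = []
panels = ("overview", "alarms", "trends")


def open_screens():
    for panel in panels:
        log.append("open " + panel)


def logout():
    for panel in panels:
        log.append("close " + panel)


channel = CommandChannel()
channel.submit(open_screens)
channel.submit(logout)  # requested right away, but runs after open_screens
channel.wait_idle()
```

Because only the worker touches the screen, 'logout' can never interleave with a half-finished 'open screens', no matter how quickly the two commands are requested.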

Hope this helps. Interested to see what others can add here.

Share the fun
Frenk Mulder
