One of our customers has Lookout 6.0.2 communicating with several CMI Scada Packs over wireless modems using Modbus RTU protocol. Suddenly last Friday night he got comm fail alarm pages on all but one site. The Modbus statistics window revealed that Lookout had stopped polling all sites except the one site that stayed in communications. The only activity was the one site.
He watched it poll this way for 2 hours, never returning to the normal polling cycle.
I had him exit and re-launch Lookout and polling returned to normal and has been that way since late Friday night.
This system has been in service for several years with very nearly 100% good communications since we deployed it, up until now.
It was obviously a Lookout failure, since the only action taken was to exit and re-launch Lookout.
Isn't there a watchdog timer for this sort of error?
Is this a failure of the Modbus cbx object? (We are using the plain vanilla Modbus driver)
Foote Control Systems
I have run into a similar issue before, with Modbus and with a custom driver. The Modbus issue was never resolved; we simply rebuilt the process on the same machine and it has worked fine since.
The custom driver was a weird issue. If we had certain data members on a popup window, the driver would stop working after a few hours, and it was all of the driver objects, not just one. The driver works completely fine in all other systems. We changed the popup to a regular full-size window and the issue stopped.
Mike, thanks for responding.
I saw this behavior in V3.8, 4.5, not in 5.1 but now in 6.0.2. Thought it was fixed...
This makes Lookout unusable in remote places where there may be no supervision. In places where you have operators on hand, the fix seems to be to exit Lookout and relaunch.
But getting customers to do this has been quite difficult.
This is really starting to scare me since in one case a main reservoir went critically low and NI doesn't even seem interested enough to respond. This was a municipal water system btw.
Same for us, water/wastewater. We had installed a new software update for the client and gone to lunch when the operator we were with got a call: communications had failed.
Roger, sorry for the late response. Could you have the customer enable the Diagnostic file on the serial port? Select Option->Serial ports, enable the Diagnostic file setting for the communication port, enter a file name such as d:\log.txt, and check the Value in HEX and timestamp options. The communication will be logged to the file. Then, after this problem happens again, upload the log file to ftp://ftp.ni.com/incoming.
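Once a hex log exists, one quick sanity check is whether the logged frames carry a valid Modbus RTU CRC, which helps separate radio corruption from a driver that has simply stopped sending. The sketch below is only an illustration: it assumes each frame has already been copied out of the log as a string of hex bytes, since the exact layout of Lookout's diagnostic file is not shown in this thread. The function names are made up for the example.

```python
def modbus_crc16(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def frame_is_valid(hex_text: str) -> bool:
    """Check the trailing CRC of one RTU frame given as hex text.

    Assumption: hex_text holds exactly one frame, e.g. "01 03 00 00 00 01 84 0A".
    """
    frame = bytes.fromhex(hex_text)
    if len(frame) < 4:  # address + function + 2 CRC bytes at minimum
        return False
    expected = modbus_crc16(frame[:-2])
    received = frame[-2] | (frame[-1] << 8)  # CRC is sent low byte first
    return expected == received


if __name__ == "__main__":
    # Read one holding register from slave 1, starting at address 0.
    print(frame_is_valid("01 03 00 00 00 01 84 0A"))
```

A log full of valid requests with no replies points at the radio or RTU side; a log where requests to most sites simply stop points back at the polling side.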
If possible, please also upload the process file. I want to have a look at the process.
It is probably a bug in the Modbus driver. The driver is supposed to retry after a communication failure.
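On the watchdog question raised at the start of the thread, nothing here confirms that Lookout provides one for this case. As a rough sketch of the idea only, the code below tracks the time of the last good poll per site and reports any site that has gone quiet for too long. It is generic Python, not Lookout's API; the class name, method names, and the 60-second timeout are all hypothetical.

```python
import time
from typing import Dict, Iterable, List, Optional


class PollWatchdog:
    """Flags sites whose last good poll is older than a timeout."""

    def __init__(self, sites: Iterable[str], stall_after_s: float,
                 now: Optional[float] = None) -> None:
        start = time.monotonic() if now is None else now
        self.stall_after_s = stall_after_s
        # Every site starts out as "just seen" so nothing alarms at startup.
        self.last_good: Dict[str, float] = {site: start for site in sites}

    def record_good_poll(self, site: str, now: Optional[float] = None) -> None:
        """Call whenever a site answers a poll correctly."""
        self.last_good[site] = time.monotonic() if now is None else now

    def stalled_sites(self, now: Optional[float] = None) -> List[str]:
        """Return the sites with no good poll inside the timeout window."""
        current = time.monotonic() if now is None else now
        return sorted(site for site, seen in self.last_good.items()
                      if current - seen > self.stall_after_s)


if __name__ == "__main__":
    wd = PollWatchdog(["site1", "site2", "site3"], stall_after_s=60, now=0)
    wd.record_good_poll("site1", now=100)   # only one site keeps answering
    print(wd.stalled_sites(now=120))
```

In the failure described in the first post, every site except the one still being polled would show up in the stalled list. What to do at that point (raise an alarm, restart the polling process) is a separate decision and is left out of the sketch.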
Doing the serial port diagnostics won't help here because we cannot let this happen again. I advised the customers to exit Lookout, reboot the PCs and relaunch Lookout once a week. One customer is a small municipality with a couple thousand connections; the other is a large supplier of agricultural water to growers. In the second case an overflow means damage to some very $$$$ real estate.
It would be better for NI to set up a polling system and test it. It could take months to see this behavior so you will need a dedicated setup.
RTUs are ScadaPack P1s connected to wireless 900 MHz ISM band FHSS modems. PC is connected to master radio in a point-to-multipoint configuration.
Serial is 9600 8N1. Polling is set to 00:00:01.
We have around 450 I/O controlling 50 pumps at 6 sites. (This is the ag setup)
Polling every 1 second? We have a custom driver, and when the buffer for poll requests gets too large (lots of sites waiting) it can cause a lockup of communications. It is possible the Modbus driver does the same.
I administer a system similar to the one you described with roughly 80 remote sites (you might be surprisingly familiar with it). We would have communication failures similar to yours when assigning a fixed time using the ‘pollrate’ data member. I set up a sequencer object to trigger the ‘poll now’ data member(s) at a fixed rate instead. The modbus object poll queue doesn’t seem to do well with large numbers of sites, or rapid poll rates. On a side note, have you ever tried the DNP object with your ScadaPacks? I’m not suggesting it as a solution; I’m wondering if DNP works.
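To illustrate why a fixed poll rate can pile up requests while a sequencer cannot, here is a toy model of a single serial port. It is a sketch under stated assumptions and says nothing about how Lookout's Modbus object actually queues requests: every site re-queues one request on a fixed timer, the port handles one transaction at a time, and the 250 ms and 100 ms transaction times are invented for the example.

```python
def queue_backlog(n_sites: int, poll_interval_ms: int,
                  transaction_ms: int, duration_ms: int) -> int:
    """Requests still waiting after duration_ms with fixed-rate polling.

    Toy model: all sites queue one request every poll_interval_ms, and the
    port serves one request per transaction_ms. Stepped in 1 ms ticks.
    """
    queued = 0
    port_free_at = 0
    for t in range(duration_ms):
        if t % poll_interval_ms == 0:
            queued += n_sites          # timers fire regardless of backlog
        if queued and t >= port_free_at:
            queued -= 1                # start the next transaction
            port_free_at = t + transaction_ms
    return queued


def sequenced_scan_ms(n_sites: int, transaction_ms: int) -> int:
    """Scan time when a sequencer triggers one site at a time.

    Only one request is ever outstanding, so a slow link stretches the
    scan instead of growing a queue.
    """
    return n_sites * transaction_ms


if __name__ == "__main__":
    print(queue_backlog(6, 1000, 250, 10_000))  # port slower than the timers
    print(queue_backlog(6, 1000, 100, 10_000))  # port keeps up
    print(sequenced_scan_ms(6, 250))
```

In the model, the backlog grows without limit whenever the sites ask for more transactions per second than the port can complete, which fits the behavior described above with large site counts or rapid poll rates.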
Hey Mr Domer
If it's the one I think it is, I designed and built the RTU system and repeater system after the "Packaged" system became obsolete.
That was back when 1200 baud was fast! I guess it was compared to the 300 baud limit on the old stuff. Probably been re-built a couple of times since.
Thanks for the sequencer suggestion. I will have to think on this since we are pushing the limits for polling... Some ScadaPacks in this system have to do up to 25 Modbus transactions per site... We have serious I/O in this system, lots of which is in floats. Haven't tried DNP yet, but I am getting pressure from some distributors to go ClearSCADA and use DNP to thin out redundant polling. Thing is, this system has worked flawlessly for years and now suddenly we have this issue, so I am not ready to throw in the towel yet, and I doubt the board would go for the money to implement it. We are adding another 8-pump site this October after the canals are off line but before they are drained for winter. This is so we can test the system before the frost protection season for the almond crop.
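For a feel of how close to the limit this is, a rough wire-time estimate can be made from figures already in the thread: 9600 8N1, 6 sites, and up to 25 transactions per site. The sketch below assumes every transaction is a function 03 read (8-byte request, 5 + 2N byte response) of 10 registers, which is a guess since the actual block sizes are not given, and it ignores radio keying and RTU turnaround unless a value is passed for `turnaround_ms`.

```python
BITS_PER_CHAR = 10     # 8N1: 1 start + 8 data + 1 stop bit
REQUEST_BYTES = 8      # address, function, start(2), count(2), CRC(2)
SILENT_CHARS = 3.5     # minimum RTU gap between frames, in character times


def char_time_ms(baud: int) -> float:
    """Time on the wire for one character at the given baud rate."""
    return 1000.0 * BITS_PER_CHAR / baud


def transaction_ms(baud: int, registers: int,
                   turnaround_ms: float = 0.0) -> float:
    """Request + response + two silent gaps for one function 03 read."""
    response_bytes = 5 + 2 * registers  # addr, func, byte count, data, CRC(2)
    chars = REQUEST_BYTES + response_bytes + 2 * SILENT_CHARS
    return chars * char_time_ms(baud) + turnaround_ms


def scan_ms(baud: int, sites: int, transactions_per_site: int,
            registers: int, turnaround_ms: float = 0.0) -> float:
    """Time to poll every site once, one transaction after another."""
    return sites * transactions_per_site * transaction_ms(
        baud, registers, turnaround_ms)


if __name__ == "__main__":
    # registers=10 is an assumed block size, not a figure from this thread.
    print(transaction_ms(9600, 10))
    print(scan_ms(9600, sites=6, transactions_per_site=25, registers=10))
```

Under those assumptions a transaction takes roughly 42 ms, so a site with 25 transactions needs about a second of wire time by itself, and a worst-case scan of all 6 sites runs a little over 6 seconds before any radio latency is added.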
If NI can't fix this I will probably just insist that my clients reboot every week. I am fortunate that these are not unmanned districts. The only problem is that each of them is at least a 2-hour drive from any one point, so I need to keep callbacks to a minimum, which would be zero callbacks if it weren't for software screwups.