One Watchdog for each Timed Loop?

ivto · ‎11-21-2007

My LabVIEW code has 14 timed loops that control 14 channels, which are used to cycle (charge, then discharge) batteries.

The problem is that any channel can lock up at random. When that happens, the batteries attached to that channel are drained completely, which has damaged the batteries and some hardware. While debugging, I found that adding some file I/O to the code makes the problem happen every time, and I think an unknown error occurs at the analog read VI (no error pops up) that causes the timed loop to hang. The RT controller also communicates with the host PC. From time to time, heavy traffic may cause the system to retry sending messages.

So my theory is that occasional heavy Ethernet traffic causes some unknown error in the analog read, and that crashes the timed loop.

1. Is my theory possible?
2. When the code in a timed loop crashes, is it true that the main program and the other timed loops can still run?
3. I understand that only one watchdog is available in the RT system. Is there any workaround so I can put one watchdog on each timed loop? A standalone watchdog loop could not catch this problem because the main program was still running.

Thanks much for your help.

My system config: PXI-1031, PXI-8184, PXI-6221, LV 8.2, LV RT 8.2.

RavensFan · ‎11-21-2007

ivto wrote:

1. Is my theory possible?
2. When the code in a timed loop crashes, is it true that the main program and the other timed loops can still run?
3. I understand that only one watchdog is available in the RT system. Is there any workaround so I can put one watchdog on each timed loop? A standalone watchdog loop could not catch this problem because the main program was still running.

1. I'd say anything is possible.

2. Yes. Perhaps a loop is frozen, waiting on some condition that is never met. Parallel loops that are independent of that loop could still run.

3. I think you could create a standalone watchdog loop in software, which I describe below. It relies on the rest of the program still running even if one loop stops.

A screenshot of the code or the actual VI that is running on the RT controller would be helpful to look at. When you say that the timed loop hangs, do you mean it actually stops iterating? Or is it a case where an error wire is fed into a shift register of a loop and is propagated back to the beginning of the loop on each iteration, preventing the VIs and functions within the loop from operating normally? I often see recommendations that the error wire be fed into a loop's shift register, but I think this can be a bad idea. For example, one random error on a serial communication or DAQ function would persist forever and prevent the subsequent loop iterations from behaving properly. If the loops iterate, but it is an error problem, then adding code to handle particular types of errors would help.
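To make the shift-register concern concrete, here is a rough text-language analogy in Python, since a block diagram can't be pasted as text. It is only a sketch: `process`, `run_loop`, and `clear_after_handling` are made-up names, and the "skip the work if an error came in" behavior is an assumption modeled on how most VIs pass an incoming error through without executing.

```python
def process(reading):
    """Hypothetical stand-in for a DAQ or serial call that can fail."""
    if reading < 0:
        raise ValueError(f"bad reading: {reading}")
    return reading * 2


def run_loop(readings, clear_after_handling):
    """Run one 'loop' over readings, carrying an error between iterations.

    carried_error plays the role of the error wire in a shift register:
    if it is set when an iteration starts, the work is skipped.
    """
    carried_error = None
    outputs = []
    for reading in readings:
        if carried_error is not None:
            # Incoming error: the call is skipped, like a VI with error in set.
            outputs.append(None)
        else:
            try:
                outputs.append(process(reading))
            except ValueError as exc:
                carried_error = exc
                outputs.append(None)
        if clear_after_handling:
            # Error handled (logged, retried, ...) and cleared before the
            # next iteration, so one glitch does not poison the loop.
            carried_error = None
    return outputs


sticky = run_loop([1, -1, 2, 3], clear_after_handling=False)
handled = run_loop([1, -1, 2, 3], clear_after_handling=True)
```

With the error left in the "shift register", every iteration after the bad reading is skipped, even though the loop keeps iterating. With the error cleared after handling, only the bad iteration is lost.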

In the event that the loops truly hang (i.e. stop iterating), I think there could be a way to do a software watchdog for each of your timed loops. Have each loop update an indicator, perhaps just the loop iteration number. Have a 15th loop that monitors those 14 indicators (or an array of indicators, or a functional global variable, ...). If an indicator doesn't get updated in a reasonable amount of time, then have that special code run to safely shut down whatever processes and hardware would be damaged by the fact that the loop hung.
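The idea can be sketched in a text language like this. This is Python rather than LabVIEW, and the names (`HeartbeatWatchdog`, `report`, `stalled`, `timeout_s`) are invented for illustration; in the real application the 15th loop would call something like `stalled()` periodically and run the safe-shutdown code for any channel it returns.

```python
import time


class HeartbeatWatchdog:
    """Tracks when each worker loop last changed its iteration count."""

    def __init__(self, n_loops, timeout_s, clock=time.monotonic):
        self._clock = clock          # injectable so the logic can be tested
        self._timeout_s = timeout_s
        now = clock()
        self._last_count = [-1] * n_loops    # last iteration number seen
        self._last_change = [now] * n_loops  # when that number last changed

    def report(self, loop_id, iteration):
        """Called by worker loop `loop_id` once per iteration."""
        if iteration != self._last_count[loop_id]:
            self._last_count[loop_id] = iteration
            self._last_change[loop_id] = self._clock()

    def stalled(self):
        """Return the ids of loops whose count has not changed in time."""
        now = self._clock()
        return [
            loop_id
            for loop_id, changed_at in enumerate(self._last_change)
            if now - changed_at > self._timeout_s
        ]


# Small demonstration with a fake clock: loop 1 stops reporting.
fake_time = [0.0]
watchdog = HeartbeatWatchdog(3, timeout_s=2.0, clock=lambda: fake_time[0])
for loop_id in range(3):
    watchdog.report(loop_id, 0)
fake_time[0] = 1.5
watchdog.report(0, 1)
watchdog.report(2, 1)
fake_time[0] = 3.0
hung_loops = watchdog.stalled()
```

The timeout should be comfortably longer than the slowest loop period, otherwise a healthy loop will be flagged. Like the approach described above, this only helps if the monitoring loop itself keeps running.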

ivto · ‎11-26-2007

If I put a "write text file.vi" to log a message before the "DAQmx Read vi" and one to log after it, I only saw the label before the DAQmx Read after a lock-up. So it either took forever to run the DAQmx read like you mentioned, or the timed loop just crashed.

I am going to try your suggestion by adding a software watchdog. It would be very helpful if you could send me some sample code. 😉

Why would the DAQmx Read behave like this? Is there anything I can do to prevent it from happening?

In general, what is a preferred way to debug a RT system?

Thanks much for your help, Ravens Fan.

RavensFan · ‎11-26-2007

ivto wrote:
If I put a "write text file.vi" to log a message before the "DAQmx Read vi" and one to log after it, I only saw the label before the DAQmx Read after a lock-up. So it either took forever to run the DAQmx read like you mentioned, or the timed loop just crashed.

Why would the DAQmx Read behave like this? Is there anything I can do to prevent it from happening?

Hard to say what may be causing that. You may want to post your code. If it was something like waiting for a thousand pulses that don't occur, or waiting for a trigger level that doesn't occur, that would make sense. You may want to put a timeout on the DAQmx read function. Look for a timeout error and handle that in your code.
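As a rough text-language illustration of "put a timeout on the read and handle the timeout error", here is a Python sketch. It does not use the DAQmx API: a `queue.Queue` stands in for the acquisition source, and `ReadTimeout`, `read_sample`, `control_step`, and `on_timeout` are hypothetical names.

```python
import queue


class ReadTimeout(Exception):
    """Raised when no sample arrives within the allowed time."""


def read_sample(source, timeout_s):
    """Blocking read that gives up after timeout_s seconds."""
    try:
        return source.get(timeout=timeout_s)
    except queue.Empty:
        raise ReadTimeout(f"no sample within {timeout_s} s") from None


def control_step(source, timeout_s, on_timeout):
    """One loop iteration: read a sample, or run the safe action on timeout."""
    try:
        return read_sample(source, timeout_s)
    except ReadTimeout:
        on_timeout()  # e.g. open the relay / stop discharging this channel
        return None


# Demonstration: the first read succeeds, the second times out.
samples = queue.Queue()
samples.put(3.7)
events = []
first = control_step(samples, 0.05, lambda: events.append("safe shutdown"))
second = control_step(samples, 0.05, lambda: events.append("safe shutdown"))
```

The point is that the loop gets control back after a bounded time and can put the channel into a safe state, instead of waiting indefinitely on a read that never completes.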

ivto wrote:

I am going to try your suggestion by adding a software watchdog. It would be very helpful if you could send me some sample code. 😉

I'm attaching a VI of the concept (LV 8.2.1). This only acts on one loop. To expand to 14 loops, I would recommend creating an array of clusters, so that you could iterate through each of the loops with a single set of code. Rather than using an indicator (which you could hide on screen) as a "variable" to hold the loop iteration, you could set things up as a functional global variable or action engine to pass the loop iteration into the monitoring loop. Of course, this assumes that nothing would crash the monitoring loop, and that you will be able to issue the commands from this loop to set off a warning, shut down the test, or do whatever is necessary to prevent the problems you are having now.
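For readers who have not seen a functional global variable or action engine, the pattern looks roughly like this in a text language. This Python sketch is not a translation of the attached VI; `Action`, `IterationStore`, and the parameter names are made up for illustration. One object holds the state for all loops, and every access goes through a single entry point that takes an action selector.

```python
import threading
from enum import Enum


class Action(Enum):
    INIT = 1      # size the store for n_loops and zero the counts
    REPORT = 2    # a worker loop records its current iteration number
    READ_ALL = 3  # the monitoring loop takes a snapshot of all counts


class IterationStore:
    """Single shared store of iteration counts, one slot per loop."""

    def __init__(self):
        self._lock = threading.Lock()  # only one caller runs at a time
        self._counts = []

    def call(self, action, loop_id=0, iteration=0, n_loops=0):
        with self._lock:
            if action is Action.INIT:
                self._counts = [0] * n_loops
            elif action is Action.REPORT:
                self._counts[loop_id] = iteration
            # Every action returns a copy, so callers never share the list.
            return list(self._counts)


store = IterationStore()
store.call(Action.INIT, n_loops=14)
store.call(Action.REPORT, loop_id=3, iteration=42)
snapshot = store.call(Action.READ_ALL)
```

The monitoring loop can then walk the snapshot with one set of code for all 14 channels and compare each count against the value it saw on the previous pass.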

ivto wrote:

In general, what is a preferred way to debug a RT system?

I don't know enough about debugging RT systems to comment on this. What you have done so far with the log file was a good technique for discovering where in your code the problem is.

Message Edited by Ravens Fan on 11-26-2007 03:57 PM

ivto · ‎11-28-2007

Your software watchdog loop works very nicely! It helps me a lot. Thanks!
