04-29-2008 02:02 PM
I have an issue that I would like some help with. I have a
status monitoring program that will stop working after random periods of time.
There are four main parts to the program. There is a DAQ loop
which acquires 16 analog channels at 100 kS/s as 16-bit integers. One channel is
a 5 V signal that acts as a trigger. The data is collected continuously in 50 ms
blocks and passed via a named queue to a data handler program. The data handler
uses the trigger signal to build data blocks that contain an integer number of
triggered blocks. The trigger rate runs from 0 to 1 kHz. The data handler will
only build blocks up to 1.5 seconds long (667 mHz). The triggered data blocks are as
long as 1.5 seconds, but only as short as 50 ms. When the trigger rate is high
enough, the blocks are built to contain multiple triggers in a 50 ms block. The
data handler also builds 1 second data blocks that are untriggered. The output
of the data handler is two queues: one for triggered data and one for 1 second
timed data. The DAQ and data handler are built in such a way that they have to
run together, but they will run indefinitely and use minimal processing (~5% on
a 2.4 GHz Core2Duo). The data handler also sends out notifications when it outputs
data blocks.
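Roughly, the structure is like the following Python sketch. This is only a structural analogy, since the real code is LabVIEW G; the simulated acquisition, the names, and the thread layout here are illustrative, not my actual VIs.

import queue
import threading
import time

BLOCK_MS = 50                           # the DAQ produces one block every 50 ms
daq_queue: queue.Queue = queue.Queue()  # the named queue between DAQ and handler

def daq_loop(stop: threading.Event) -> None:
    # Producer: push fixed-size blocks; no analysis happens in this loop.
    while not stop.is_set():
        time.sleep(BLOCK_MS / 1000)     # stand-in for the hardware read
        block = [0] * (16 * 5000)       # 16 ch x 100 kS/s x 50 ms of int16 samples
        daq_queue.put(block)

def data_handler(stop: threading.Event, triggered_q: queue.Queue,
                 timed_q: queue.Queue) -> None:
    # Consumer: dequeue DAQ blocks, assemble output blocks, republish them.
    while not stop.is_set():
        try:
            block = daq_queue.get(timeout=0.5)
        except queue.Empty:
            continue                    # no data yet; loop and re-check stop
        triggered_q.put(block)          # trigger-alignment logic elided here
        timed_q.put(block)              # 1 second block assembly elided here

stop = threading.Event()
triggered_q, timed_q = queue.Queue(), queue.Queue()
threading.Thread(target=daq_loop, args=(stop,), daemon=True).start()
threading.Thread(target=data_handler, args=(stop, triggered_q, timed_q), daemon=True).start()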
This data feeds independently operating analysis routines
(individually called subroutines) that push their results to a UI loop using
references. The structure of the analysis subroutines is all very similar: a
combination of Wait on Notification and Wait (ms) controls the timing, and then
Preview Queue Element is used to get the data. The data is processed and the
results are passed to the UI via a reference. There is one analysis routine
that does processing that many other functions use. It also sends out its own
notifier to let the functions that need its data know there is new data. Since its
results are small (a 5x7 DBL array) and I use notifiers to prevent race
conditions, I use a global variable for them. The UI is very large; it covers a 1920x1200
screen and a 1600x1200 screen. There are 9 analysis processing subroutines that
run and one program for monitoring the sub-programs. In total, there are 15
sub-programs running. I was very careful in the construction of my programs, so
every array is preallocated and index values are used instead of reallocating
space. All buffer allocations occur at the initialization of the subroutines
instead of happening when sub-VIs are called. When the program is running,
CPU utilization runs between 30% and 45%. Also, memory use is stable and does not
increase as the program runs.
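Each analysis subroutine follows roughly this pattern. Again, this is a hedged Python analogy: threading.Event stands in for the LabVIEW notifier, a lock-protected "latest" slot stands in for Preview Queue Element (read without consuming), and the placeholder analysis and names are made up.

import threading
import time

new_data = threading.Event()   # the notifier: set by the data handler
latest_lock = threading.Lock()
latest_block: list = []        # most recent block, previewed rather than dequeued

def analysis_loop(stop: threading.Event, rate_hz: float = 4.0) -> None:
    while not stop.is_set():
        if not new_data.wait(timeout=1.0):   # Wait on Notification
            continue                         # timed out; re-check the stop flag
        new_data.clear()
        with latest_lock:                    # preview: copy without removing
            block = list(latest_block)
        _results = sum(block)                # stand-in for the real analysis
        time.sleep(1.0 / rate_hz)            # Wait (ms): throttle below 20 Hz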
The problem I have is that the program will go from 40%
utilization to 100% after some random period of time, and this will cause the
DAQ buffer to overflow and the loop to crash. Without DAQ, the processing is
pointless, though the functions are still running; they are just waiting for
new data. This problem will occur anywhere from 2 minutes to 14 hours after the
program is started. Sometimes the usage jumps to 100% for just a few seconds
and drops back before the DAQ buffer overflows, and then things just keep right on
going. I have run the profiler, and when this happens, it seems to be a
different sub-VI each time that has an exceptionally long run time. It's
something to do with everything running together, because this doesn't
happen with just the DAQ and data handler, or with the DAQ, the data handler, and any one
or two subroutines.
The DAQ and data handler have no ties to any of the other
functions. All their output is via queues, notifiers, or globals. The queues
are handled internally, and the subroutines are not required to dequeue to keep
the queues from filling up. Also, the globals are loop counters and boolean
status values; no data arrays are being passed. The notifier is simply a boolean
as well.
Any thoughts will be greatly appreciated. I'd like to get
this stable so I can focus on improving the analysis rather than on stability
issues. I will be as helpful as I can if you need any more info for troubleshooting; I may have been as clear as mud about how this thing works.
Thanks,
Chris
04-29-2008 02:24 PM - edited 04-29-2008 02:25 PM
Is this part of a Dean Koontz novel???
Posting the VI might help... even though it sounds like a rather large, complicated program...
Preview Queue Element won't remove the element from the queue, so eventually you will have memory problems.
If the CPU is being pegged, then you might be missing a wait function somewhere.
The Wait (ms) and Wait on Notification in the same loop seems redundant... but who knows... maybe there's a good reason for it...
Are there other programs running, such as virus scans??? They can always be a problem...
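To illustrate the missing-wait point with a generic (non-LabVIEW) Python sketch: a loop with no wait spins one core flat-out, while even a tiny sleep drops it to near zero...

import time

def poll_no_wait(flag: dict) -> None:
    while not flag["done"]:    # free-running loop: pegs one core at ~100%
        pass

def poll_with_wait(flag: dict) -> None:
    while not flag["done"]:
        time.sleep(0.010)      # even a 10 ms wait drops usage to near 0%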
04-29-2008 02:36 PM
Sorry for the long post. There are tons of posts where
people just say they are having a problem, but give no details. To respond
though:
There would be a lot of work involved with posting the VI. I can post parts, but
not the whole thing, for various reasons.
The data handler dequeues the data from the DAQ queue. It also dequeues
elements from its own data queues before enqueueing new ones. I have to do it
this way since I have multiple independent functions that need data from the
same output queues.
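In other words, each output queue only ever holds the newest block, roughly like this single-element queue sketch in Python. The peek below pokes at queue.Queue internals purely for illustration; the real code uses LabVIEW queue primitives, and the names here are made up.

import queue

latest_q: queue.Queue = queue.Queue(maxsize=1)

def publish(block) -> None:
    # Producer side: dequeue the stale element (if any) before enqueueing,
    # so the put below can never block. Safe with a single producer because
    # the readers only preview; they never dequeue.
    try:
        latest_q.get_nowait()
    except queue.Empty:
        pass
    latest_q.put_nowait(block)

def preview():
    # Consumer side: read the current element without removing it.
    with latest_q.mutex:       # peek via the underlying deque
        return latest_q.queue[0] if latest_q.queue else None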
I use both the Wait (ms) and the Wait on Notification because the data handler runs
at 20 Hz, but most of the sub-functions run at slower rates (1-5 Hz). This way
those functions run on the most recent data, but at a slower rate. It usually
goes Wait on Notification >> perform analysis >> Wait for 250 ms.
There may be a virus scanner on the system. I will check. Other than that,
though, LabVIEW is the only thing running. It's a dedicated status monitoring
system.
04-29-2008 02:42 PM
COsiecki wrote:
I use both the Wait (ms) and the Wait on Notification because the data handler runs at 20 Hz, but most of the sub-functions run at slower rates (1-5 Hz). This way those functions run on the most recent data, but at a slower rate. It usually goes Wait on Notification >> perform analysis >> Wait for 250 ms.
The Wait on Notification will wait as long as it needs to, though...
15 tasks is quite a few, but I don't think it should really be a problem...
When you first start things going, how much CPU time does each one soak up? Does the task that monitors the subtasks have the potential to lock up in any manner? What exactly is it doing? I assume you have made sure any and all loops have at least a small ms wait in them?
04-29-2008 03:23 PM
There are a couple of things you might want to check out:
1. Is your machine controlled by an IT organization? If so, they may be sneaking things into the system that cause 100% utilization.
2. I remember reading somewhere that LabVIEW uses a garbage collector (I could be wrong about this). If this is true, the peaks you are seeing could be a garbage collection phase.
3. You did not mention whether the crashes are relatively uniform in time. If they are, you might want to try adding or removing memory on your computer to see if it affects the time between crashes.