LabVIEW


Unknown lockup cause

I have an issue that I would like some help with. I have a status monitoring program that will stop working after random periods of time.
    There are 4 main parts to the program. There is a DAQ loop which acquires 16 analog channels at 100 kS/s as 16-bit integers. One channel is a 5 V signal that acts as a trigger. The data is collected continuously in 50 ms blocks and passed via a named queue to a data handler program. The data handler uses the trigger signal to build data blocks that contain an integer number of triggered blocks. The trigger rate runs from 0 to 1 kHz. The data handler will only build blocks up to 1.5 seconds long (0.667 Hz). The triggered data blocks can be as long as 1.5 seconds, but as short as 50 ms. When the trigger rate is high enough, the blocks are built to contain multiple triggers in a 50 ms block. The data handler also builds 1 second data blocks that are untriggered. The output of the data handler is two queues: one for triggered data and one for 1 second timed data. The DAQ and data handler are built in such a way that they have to run together, but they will run indefinitely and use minimal processing (~5% on a 2.4 GHz Core2Duo). The data handler also sends out notifications when it outputs data blocks.
    This data feeds independently operating analysis routines (individually called subroutines) that push their results to a UI loop using references. The structure of the analysis subroutines is very similar: a combination of Wait on Notification and Wait (ms) controls timing, and then Preview Queue Element is used to get data. The data is processed and the results are passed to the UI via a reference. There is one analysis routine that does processing that many other functions use. It also sends out its own notifier to let the functions that need its data know there is new data. Since its results are small (a 5x7 DBL array) and I use notifiers to prevent race conditions, I use a global variable. The UI is very large; it covers a 1920x1200 screen and a 1600x1200 screen. There are 9 analysis processing subroutines that run and one program for monitoring the sub-programs. In total, there are 15 sub-programs running. I was very careful in the construction of my programs, so every array is preallocated and index values are used instead of reallocating space. All buffer allocations occur at the initialization of the subroutines instead of happening when sub-VIs are called. When the program is running, CPU utilization runs between 30-45%. Also, memory use is stable and does not increase as the program runs.
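    Since the actual code is LabVIEW G and hard to paste here, below is a very rough text mock-up of how the DAQ loop, data handler, and notifier fit together. It is Python-flavored pseudocode only; all of the names are invented and the block-building details are omitted:

```python
# Rough mock-up only -- the real program is LabVIEW G; every name here is invented.
import queue
import threading
import time

daq_queue = queue.Queue()           # named queue: DAQ loop -> data handler
triggered_queue = queue.Queue()     # output queue 1: triggered data blocks
timed_queue = queue.Queue()         # output queue 2: 1 s timed data blocks
new_data = threading.Condition()    # stands in for the "new data" notifier

def acquire_50ms_block():
    """Placeholder for the DAQmx read: 16 channels x 5000 samples as I16."""
    time.sleep(0.05)
    return [0] * (16 * 5000)

def daq_loop():
    """Acquire continuously and pass 50 ms blocks to the data handler."""
    while True:
        daq_queue.put(acquire_50ms_block())

def data_handler():
    """Build triggered blocks (50 ms to 1.5 s) and 1 s timed blocks, then notify."""
    while True:
        raw = daq_queue.get()          # dequeue from the DAQ queue
        triggered_queue.put(raw)       # trigger-based block building and
        timed_queue.put(raw)           # stale-element removal omitted here
        with new_data:
            new_data.notify_all()      # "Send Notification" to the analysis loops

threading.Thread(target=daq_loop, daemon=True).start()
threading.Thread(target=data_handler, daemon=True).start()
```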
    The problem I have is that the program will go from 40% utilization to 100% after some random period of time, and this will cause the DAQ buffer to overflow and the loop to crash. Without DAQ, the processing is pointless, though the functions are still running; they are just waiting for new data. This problem will occur anywhere from 2 minutes to 14 hours after the program is started. Sometimes the usage jumps to 100% for just a few seconds and drops back before the DAQ buffer overflows, and then things just keep right on going. I have run the profiler, and when this happens it seems to be a different sub-VI each time that has an exceptionally long run time. It's something to do with everything running together, because this doesn't happen with just the DAQ and data handler, or with the DAQ, data handler, and any one or two sub-routines.
    The DAQ and data handler have no ties to any of the other functions. All their output is either queues, notifiers, or globals. The queues are handled internally, and the sub-routines are not required to dequeue to prevent the queues from filling up. Also, the globals are loop counters and boolean status values; no data arrays are being passed. The notifier is simply a boolean as well.
    Any thoughts will be greatly appreciated. I'd like to get this stable so I can focus on improving the analysis, rather than stability issues. I will be as helpful as I can if you need any more info for troubleshooting. I may have been as clear as mud about how this thing works.
Thanks,
Chris


   

Message 1 of 10

Is this part of a Dean Koontz novel???

 

Posting the VI might help... even though it sounds like a rather large, complicated program...

Preview Queue Element won't remove the element from the queue, so eventually you will have memory problems.

If the CPU is being pegged, then you might be missing a wait function somewhere.

The Wait (ms) and Wait on Notification in the same loop seems redundant... but who knows... maybe there's a good reason for it...

Are there other programs running, such as virus scans??? They can always be a problem...



Message Edited by TWGomez on 04-29-2008 02:25 PM
________________________________________________________

Use the rating system, otherwise it's useless; and please don't forget to tip your waiters!
using LV 2010 SP 1, Windows 7
________________________________________________________
Message 2 of 10

Sorry for the long post. There are tons of posts where people just say they are having a problem, but give no details. To respond though:

There would be a lot of work involved with posting the vi. I can do parts, but not the whole for various reasons.

The data handler dequeues the data from the DAQ queue. It also dequeues elements from its own data queues before enqueueing new ones. I have to do it this way since I have multiple independent functions that need data from the same output queues.
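
In the same rough pseudocode as above (invented names, not my actual VI), what the handler does with each of its output queues is basically:

```python
import queue

def publish(out_q: queue.Queue, new_block) -> None:
    """Keep the output queue at one most-recent block so Preview always sees fresh data."""
    try:
        out_q.get_nowait()        # the handler itself removes the stale block
    except queue.Empty:
        pass                      # queue was already empty; nothing to remove
    out_q.put(new_block)          # enqueue the new block for readers to preview
```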

I use both the Wait (ms) and Wait on Notification because the data handler runs at 20 Hz, but most of the sub functions run at slower rates (1-5 Hz). This way those functions run on the most recent data, but at a slower rate. It usually goes Wait on Notification >> Perform analysis >> Wait for 250 ms.
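
Again as rough Python-flavored pseudocode (not my actual VI, names invented), each analysis subroutine looks roughly like:

```python
import queue
import threading
import time

def analysis_loop(notifier: threading.Condition, data_q: queue.Queue, analyze) -> None:
    """One analysis subroutine: run on the newest data, but at a slower rate."""
    while True:
        with notifier:
            notifier.wait()           # Wait on Notification: a new block exists
        if not data_q.empty():
            block = data_q.queue[0]   # peek, like Preview Queue Element
                                      # (not strictly thread-safe; illustration only)
            analyze(block)            # results go to the UI via a reference
        time.sleep(0.25)              # Wait (ms): throttle this loop to ~4 Hz
```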

There may be a virus scanner on the system; I will check. Other than that though, LabVIEW is the only thing running. It's a dedicated status monitoring system.

Message 3 of 10


COsiecki wrote:

I use both the Wait (ms) and Wait on Notification because the data handler runs at 20 Hz, but most of the sub functions run at slower rates (1-5 Hz). This way those functions run on the most recent data, but at a slower rate. It usually goes Wait on Notification >> Perform analysis >> Wait for 250 ms.



The Wait on Notification will wait as long as it needs to though....

 

15 tasks is quite a few, but I don't think it should really be a problem...

When you first start things going, how much CPU time does each one soak up? Does the task that monitors the subtasks have the potential to lock up in any manner? What exactly is it doing? I assume you have made sure any and all loops have at least a small ms wait in them?

 


Message 4 of 10
If you put a non-negative timeout on the Wait on Notification, then it will not wait forever (could a notification be missing?). Just check the Timed out? output to determine whether a valid notification was sent. It might be possible to use the timeouts for some or all of your delays in those places where you use notifiers or queues to transfer the data.
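
As a rough sketch (Python-flavored pseudocode, not LabVIEW; the names are invented), the idea is:

```python
import threading

def wait_for_new_data(notifier: threading.Condition, timeout_s: float = 1.0) -> bool:
    """Wait on Notification with a finite timeout; False means 'Timed out?' was TRUE."""
    with notifier:
        return notifier.wait(timeout=timeout_s)

# Inside an analysis loop:
#   if not wait_for_new_data(new_data):
#       handle the timeout (log it, check whether the producer is still alive)
#   else:
#       preview the newest block and process it as usual
```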

Lynn
Message 5 of 10
There are some functions that don't need to run on every block of data, so I just wait while some of the data goes unused by that particular function. Once the wait is over, it will get the notifier on the next new block of data.

I have some startup delays so that everything isn't starting at once. Usage peaks at 70% or so, but for less than a second. Each of my sub routines writes its iteration counter value to a global. They also set a boolean true when they start and false when they stop. The task that monitors them reads the global counter values and status booleans every 500 ms.
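
As a rough sketch (same pseudocode style as before, names invented), the monitor task is basically:

```python
import time

# Globals written by each subroutine: an iteration counter and a running flag.
iteration_counts = {"analysis_1": 0, "analysis_2": 0}
running_flags = {"analysis_1": True, "analysis_2": True}

def monitor_loop():
    """Every 500 ms, record how far each loop has run and whether it is still running."""
    while True:
        for name, count in iteration_counts.items():
            print(f"{name}: iteration {count}, running={running_flags[name]}")
        time.sleep(0.5)     # check the globals every 500 ms
```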

I'll verify about the waits, but I am pretty sure they do.
Message 6 of 10
What does the function do though... It checks the iteration globals, but for what purpose? Does it do something if they don't all match?
Message 7 of 10

There are a couple of things you might want to check out:

1.  Is your machine controlled by an IT organization? If so, they may be sneaking things onto the system that cause 100% utilization.

2.  I remember reading somewhere that LabVIEW uses a garbage collector (I could be wrong about this). If this is true, the peaks you are seeing could be a garbage collection phase.

3.  You did not mention whether the crashes are relatively uniform in time. If they are, you might want to try adding or removing memory on your computer to see if it affects the time between crashes.

Message 8 of 10
The function just tells me how many times the loops have run and whether or not they are still running. It doesn't do anything else. I can use the rate of a program and the number of times it ran to tell me how long it had been running when it stopped. Since things run independently, I wouldn't have any other way of knowing what stopped when.
Message 9 of 10
This computer is not connected to any network. The crashes are not uniform in time. They happen anywhere from a few minutes to many hours after starting the program. Memory usage reported by Windows never goes above 750 MB. The system has 2 GB.
Message 10 of 10