real time crash

Good afternoon.

 

I have implemented a set of software on a cRIO (NI-9074) that controls units under test, acquires analog data from 85 channels, analyzes the data for failure conditions, and creates a data file each time that a failure event occurs.

 

The control portion of the system turns lighting ballasts on and off and signals the ballasts to switch between normal (full) power and dim.  The dim control for some products uses relay outputs.  For other products the dim control is performed using RS-232 communication.

 

The data acquisition portion of the system samples the analog channels at 10 Hz and "inserts" the data into a ring buffer of 20,000 elements.  This portion also determines when the electrical current is outside of the specified range.  When the electrical current goes outside of the specified range, an event is triggered to extract the data related to the unit that failed from the ring buffer and write it to a binary file.
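
For readers who do not have the LabVIEW code in front of them, the ring-buffer flow described above can be sketched in Python. This is only an illustration under assumptions, not the poster's implementation: the `(channel, value)` sample layout, the class and function names, and the little-endian double file format are all hypothetical.

```python
from collections import deque
import struct

BUFFER_SIZE = 20_000  # ring buffer length mentioned in the post


class RingBuffer:
    """Fixed-size buffer that discards the oldest sample once it is full."""

    def __init__(self, size=BUFFER_SIZE):
        # deque with maxlen drops items from the left automatically
        self._samples = deque(maxlen=size)

    def insert(self, channel, value):
        # Hypothetical layout: one (channel, value) pair per sample
        self._samples.append((channel, value))

    def extract(self, channel):
        # Return the buffered values for one unit, oldest first
        return [v for ch, v in self._samples if ch == channel]


def out_of_range(value, low, high):
    """True when the measured current is outside [low, high]."""
    return not (low <= value <= high)


def write_binary(path, values):
    """Write the values as little-endian doubles (assumed file format)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}d", *values))
```

On a failure event, the acquisition loop would call `extract()` for the failed channel and hand the result to `write_binary()`, ideally from a separate, lower-priority task so that sampling is not delayed.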

 

The software causes the cRIO to crash after several hours (typically 4 to 8 hours).  The OS crashes, not just the software that I developed.  I know this because I cannot connect to the system using NI MAX, Distributed System Manager, or a web browser.  I am working to isolate the section of code that is causing the problem.  I would like to think that I write perfect code :-).  I have verified that I close every resource that I open.  I log any errors to a file, and the software is not reporting any errors.

I have added functions that report the system resources (CPU load and free physical memory) a few times per second when they are over-utilized.  The CPU load goes to 100% for around 4 seconds each time a failure event data file is written to the compact flash.  Is this a potential problem?  If so, what is the recommended approach for writing large blocks of data to a file in a way that reduces CPU load?
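
If the CPU spike during the file write does turn out to matter, one general technique is to split the write into smaller chunks and yield briefly between them, so other tasks get a chance to run. The Python sketch below shows the idea only; the function name, chunk size, and pause length are assumptions, and in LabVIEW the equivalent would be a loop with a wait running at a lower priority than the acquisition loop.

```python
import time


def write_in_chunks(path, data, chunk_size=64 * 1024, pause_s=0.01):
    """Write bytes to path in chunks, sleeping between chunks.

    chunk_size and pause_s are illustrative values, not recommendations.
    Returns the number of bytes written.
    """
    written = 0
    with open(path, "wb") as f:
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            f.write(chunk)
            written += len(chunk)
            # Give other tasks a chance to run before the next chunk
            time.sleep(pause_s)
    return written
```

The trade-off is that the total write takes longer, so the event data must remain valid in memory (not be overwritten by new samples) until the write completes.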

 

I am interested to know what would happen if, for example, I try to access an element of an array that has not been defined.  I have built and run a test VI to see what happens when I try to retrieve an element at an index beyond the size of an array and to replace a subset at a location beyond the size of an array.  It doesn't seem to break anything.  Does anyone know of catastrophic behavior resulting from providing an invalid array index?
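
For reference, the documented LabVIEW behavior matches what the test VI showed: Index Array returns the default value of the element type when the index is out of range, and Replace Array Subset returns the array unchanged (it never resizes the array). The Python sketch below emulates those semantics to make the risk visible; the function names are made up for this illustration.

```python
def index_array(arr, index, default=0.0):
    """Emulate Index Array: out-of-range index yields the type default."""
    if 0 <= index < len(arr):
        return arr[index]
    return default


def replace_array_subset(arr, index, new_values):
    """Emulate Replace Array Subset: never resizes the array."""
    if not 0 <= index < len(arr):
        # Out-of-range start index: nothing is replaced
        return list(arr)
    result = list(arr)
    # Values that would run past the end of the array are dropped
    end = min(index + len(new_values), len(arr))
    result[index:end] = new_values[:end - index]
    return result
```

So an invalid index does not raise an error or corrupt memory; the danger is silent bad data, such as a current reading of 0 that is then treated as a real measurement.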

 

I am executing the software in ways that exercise only portions of the system to isolate the problem. 

 

Does anyone have suggestions for ways to debug such a system?

 

Thanks for any suggestions,

Hamilton Woods

Message 1 of 7

 

I am reporting CPU load and free physical memory.  The CPU load goes to 100% for around 4 seconds each time a failure event data file is written to the compact flash.  Is this a potential problem?  If so, what is the recommended approach for writing large blocks of data to a file in a way that reduces CPU load? 

  


I wouldn't expect any issues from briefly using 100% of the CPU and that certainly shouldn't cause a crash. What sort of free physical memory numbers are you seeing?

 


 

Does anyone have suggestions for ways to debug such a system?

 


If you can connect to console out on the cRIO, there may be some crash output there that will help debug the issue. Console out uses the serial port, and I noted that you're using RS-232 in some cases, so hopefully the crash will reproduce with just the relays?

http://digital.ni.com/public.nsf/allkb/354A5124E6A667988625701B004A77CD has some information on how to set this up. I recommend using PuTTY. I don't think the WIF console will show what we need to see, but it might be worth a shot.

Message 2 of 7

I'd also add: make sure your scrollback buffer is large (at least 10,000 lines) to avoid losing any information.
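
An alternative to relying on scrollback is to log the console output straight to a file, so nothing is lost during a run of several hours. Below is a minimal Python sketch that works on any byte stream with a `readline()` method. Feeding it from a serial port would need the third-party pyserial package; the port name and baud rate in the comment are placeholders, not values confirmed in this thread.

```python
from datetime import datetime


def log_console(stream, log_file, now=datetime.now):
    """Copy lines from stream to log_file with a timestamp on each line.

    Stops when readline() returns b"" (end of stream, or a read timeout
    on a serial port). Returns the number of lines logged.
    """
    count = 0
    for raw in iter(stream.readline, b""):
        text = raw.decode("ascii", errors="replace").rstrip("\r\n")
        log_file.write(f"{now().isoformat()} {text}\n")
        log_file.flush()  # keep the file current in case the PC side fails too
        count += 1
    return count


# Hypothetical usage with pyserial (assumed installed; settings are placeholders):
#   import serial
#   with serial.Serial("COM1", 9600, timeout=None) as port, \
#           open("crio_console.log", "a") as log:
#       log_console(port, log)
```

The timestamps make it possible to line up the last console messages with the error log and resource reports written by the application.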

Message 3 of 7

Physical free memory is 37.8 MB of 124.5 MB total. 

 

I have reconfigured to use the built-in COM port for Console Out.  I am now running the application to see what output gets produced.  What is the WIF console?

 

Thanks,

Hamilton Woods

Message 4 of 7

Assuming it's stable, that much physical memory should be plenty.
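
One way to check the "stable" part is to fit a line through the logged free-memory readings and look at the slope. The Python sketch below assumes the readings are available as `(seconds, megabytes)` pairs, which is a hypothetical format; the function names are also made up for this example.

```python
def memory_slope(samples):
    """Least-squares slope of free memory (MB) versus time (s)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one time value")
    return num / den


def hours_until_exhausted(samples):
    """Estimate hours until free memory reaches zero, or None if not falling."""
    slope = memory_slope(samples)
    if slope >= 0:
        return None
    _, last_free = samples[-1]
    return (last_free / -slope) / 3600.0
```

A steadily negative slope that projects to zero within the 4 to 8 hour window would point toward a memory leak; a flat trend would make memory exhaustion an unlikely cause.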

 

The WIF is a web interface that's available for cRIO which, among other things, provides access to the console. The COM port is better in this case since we can't guarantee that any errors will get written to the web interface in the event of a crash.

Message 5 of 7

Update:

 

I have seen some errors related to shared variables not being available in some of the runs that led to a cRIO crash.  I am beginning to suspect that there is something amiss in my shared variables.

 

I understand that network access is suspended when CPU load is high.  Does anyone know if the shared variable engine is affected by high CPU load?

 

Thanks,

Hamilton Woods

Message 6 of 7

I could see the data transfer being briefly interrupted, especially if whatever is pegging the CPU is high priority.

 

What kind of errors are you seeing?

 

Sebastian

Message 7 of 7