LabVIEW


RT program stops running after ~13 hrs. Strange errors in log.

I have an RT application that is essentially a datalogger for a 7852R board. The FPGA grabs the data at some interval (20 ms, 10 ms, 20 ms...) and passes the data to LabVIEW through a DMA FIFO. The RT system passes the data to a client running Windows over TCP/IP. I did a long-run test over the weekend and the program seemed to crash after 13 hours. As far as I can tell, the RT program just stopped working. For the test, I had disconnected the host computer (right-click the remote computer in the project and select Disconnect). When I reconnected, it looked like the program on the RT box was just waiting to be run. The only way for the RT program to stop programmatically is with a stop button on the front panel, so I'm not sure why it just stopped. I checked my error log and this is what I got:

 

 2/20/09 6:54:38.924 PM

source\mgcore\MemoryManager.cpp(137) : DWarnInternal: Memory error 2 in DSSetHandleSize

$Id: //labview/branches/Saturn/dev/source/mgcore/MemoryManager.cpp#16 $

0x01EFEB1F - <unknown> + 0

0x01EFF27E - <unknown> + 0

0x01E4A00A - <unknown> + 0

0x188A01AF - <unknown> + 0

0x18F2B2C0 - <unknown> + 0

0x01D3A232 - <unknown> + 0

0x0008E0EC - <unknown> + 0

 

And that repeats about 50 times. Oddly, I'm not sure the timestamp matches the time of the crash (~6 AM the next day according to the log files on the client).

 

I'm running an extended test again, this time with "retain wire values" enabled. I was just hoping someone had an idea.

CLED (2016)
Message 1 of 9

You are probably running out of memory. Monitor your memory usage while the app is running; if it is steadily climbing, it will only be a matter of time before the crash.
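
To make "steadily climbing" concrete, here is a minimal Python sketch (not LabVIEW, and not anything from the RT System Manager) that fits a straight line to periodic memory readings and estimates how long until memory would be exhausted. The function names, the `(hours, percent_used)` sample format, and the 100% limit are all assumptions made for illustration.

```python
def memory_trend(samples):
    """Least-squares slope, in percent per hour, of (hours, percent_used) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must cover more than one point in time")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t


def hours_until_full(samples, limit=100.0):
    """Estimate hours until usage reaches `limit`; None if usage is not growing."""
    slope = memory_trend(samples)
    if slope <= 0:
        return None  # flat or shrinking usage: no leak visible in these samples
    _, last_m = samples[-1]
    return (limit - last_m) / slope


# Hypothetical readings: (hours since start, percent of memory in use)
leaking = [(0, 30.0), (1, 35.0), (2, 40.0)]
stable = [(0, 34.0), (1, 34.0), (2, 34.0)]
print(hours_until_full(leaking))
print(hours_until_full(stable))
```

A flat trend over an hour or two does not rule out a leak that only shows up under a rare condition, so it is worth logging the readings for the whole run.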

 

Those logs are generally in UTC.

 

Have fun!

 

Ben

Retired Senior Automation Systems Architect with Data Science Automation, LabVIEW Champion, Knight of NI and Prepper
Message 2 of 9

RE: UTC... I would expect it to be 8 hours ahead, not 12 behind. Oh well, maybe the system clock is off.

 

Running out of memory seems possible. I'm watching the memory right now with the RT System Manager and it seems stable at ~34% for the last hour. I'll keep an eye on it. I'm wondering if maybe I should do garbage collection once every 4 hours or something. 

 

Message Edited by InfiniteNothing on 02-23-2009 11:48 AM
CLED (2016)
Message 3 of 9

InfiniteNothing wrote:

I'm wondering if maybe I should do garbage collection once every 4 hours or something. 


You could just break up the file into several smaller files if memory is an issue.

I know it's a pain in the neck, but it's an option.

Just come up with some simple logic. For example, start a new file:
- after __ time
- after __ points
- when the user clicks a 'new file' button
- etc.

 

Just throwing an option out there
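
As a rough illustration of the rotation logic listed above, here is a small Python sketch (not LabVIEW). `RotationPolicy`, `file_name`, and the thresholds are hypothetical names and values chosen for the example, not part of any NI API.

```python
from dataclasses import dataclass


@dataclass
class RotationPolicy:
    """Decides when to close the current log file and start a new one."""
    max_seconds: float  # start a new file after this much time
    max_points: int     # ...or after this many data points

    def should_rotate(self, elapsed_seconds, points_written, user_requested=False):
        # Any one of the three conditions is enough to start a new file.
        return (
            user_requested
            or elapsed_seconds >= self.max_seconds
            or points_written >= self.max_points
        )


def file_name(prefix, index):
    """Numbered file name so consecutive files sort in order."""
    return f"{prefix}_{index:04d}.dat"


policy = RotationPolicy(max_seconds=3600, max_points=100_000)
index = 0
if policy.should_rotate(elapsed_seconds=4000, points_written=50_000):
    index += 1
print(file_name("run", index))
```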

Message Edited by Cory K on 02-23-2009 02:16 PM
Cory K
Message 4 of 9

There is no file on the RT system. The RT should just grab the data and send it. There are some fancy queues, and some arrays on shift registers, where memory might (but shouldn't) accumulate. Actually, I've been watching the program run (since about the start of this thread) and it's dead solid at 34% memory. There might be some case, though, where the memory swells (say, when the client disconnects uncleanly).
Message Edited by InfiniteNothing on 02-23-2009 12:35 PM
CLED (2016)
Message 5 of 9

InfiniteNothing,

 

Has your program crashed again since your last post?  If this is still happening, will you attach your error log so we can take a look at what's failing?  Thanks,

ColeR
Field Engineer
Message 6 of 9

Oddly, no. There was no memory increase and no crash over 16 hours yesterday. I wanted to keep running the test but I needed to use the computer for other things. I only made minor changes. Nothing that I think would fix the problem. I'm going to run again tonight.

 

Is there a way for me to send you my error log privately? I'm worried about sensitive info buried in the XML section.

 

My biggest worry is that this problem only occurs when I'm not looking. That is, if I turn trace on or run the program in the development environment it behaves differently than the executable environment. 

Message Edited by InfiniteNothing on 02-24-2009 02:29 PM
CLED (2016)
Message 7 of 9

InfiniteNothing,

 

Give me a call on our support line, 1-866-ASK-MYNI (1-866-275-6964) with this service request number,  1340693.  It'll route you to me and we can walk through how to get your error log to me.

ColeR
Field Engineer
Message 8 of 9

I made some changes to the code to try to avoid memory-full situations: making the queues time out when they are full, putting a more aggressive timeout on the TCP/IP sends, and changing the way I handle the FPGA FIFOs to be more CPU friendly. It seems to have solved the problem. I ran for over 3 days straight and had 0 problems (and a 3 GB log file).
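
For readers who want the general pattern in text form, here is a minimal Python sketch of the two ideas (a bounded queue whose writes time out, and a TCP send with a timeout). It is an analogy, not the actual LabVIEW code; the function names, queue size, and timeout values are assumptions.

```python
import queue
import socket


def enqueue_sample(q, sample, timeout_s=0.05):
    """Try to enqueue; on a full queue, give up after timeout_s instead of blocking forever."""
    try:
        q.put(sample, timeout=timeout_s)
        return True
    except queue.Full:
        return False  # caller decides whether to drop the sample or count an overflow


def send_with_timeout(sock, payload, timeout_s=2.0):
    """Send payload; report failure if the peer stalls or the connection is broken."""
    try:
        sock.settimeout(timeout_s)
        sock.sendall(payload)
        return True
    except OSError:  # socket.timeout is a subclass of OSError
        return False


# A bounded queue caps memory use even if the consumer (the TCP sender) stalls.
q = queue.Queue(maxsize=2)
results = [enqueue_sample(q, i) for i in range(3)]
print(results, q.qsize())
```

The point of both timeouts is the same: when the client stops reading, the producer side fails fast and bounded, rather than letting buffers grow until memory runs out.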
Message Edited by InfiniteNothing on 03-02-2009 09:49 AM
CLED (2016)
Message 9 of 9