Memory Errors when running Embedded Runtime vs usual LabVIEW EXE Runtime

I am running into a strange issue that I am not sure how to troubleshoot or debug. I have a Main VI containing a static VI that launches dynamic VIs.

I need to run headless, so I need to use the Embedded Runtime Engine.

First I wanted to make sure my code worked properly when compiled as a normal LabVIEW EXE. I wanted to test long-term running of the EXE: no memory leaks, and the ability to shut down and load the dynamic VIs as needed. This worked great. I see no memory leaks after running for 24 hours.

The DAQmx-based instrument uses a User Event Structure: the "N samples acquired into buffer" event triggers the Read VI. The data is then sent to a consumer Queue and, if there is a client connection, sent out over TCP. The data is converted from 2D to 1D interleaved in the consumer, then flattened to string with Size = F. The dequeue and data conversion happen whether there is a client connection or not, but the TCP Write VI is not called if there are no clients.
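
For anyone writing a client against this stream, the conversion described above can be sketched in Python. This is only an illustration under stated assumptions, not the LabVIEW code itself: it assumes each row of the 2D array is one channel, that the samples are 64-bit floats, that "Size = F" means no length header is prepended, and that the bytes are big-endian (LabVIEW's default byte order for Flatten To String). The function names are hypothetical.

```python
import struct


def interleave(channels):
    """Turn [[ch0...], [ch1...]] into [ch0[0], ch1[0], ch0[1], ch1[1], ...]."""
    n_samples = len(channels[0])
    if any(len(ch) != n_samples for ch in channels):
        raise ValueError("all channels must hold the same number of samples")
    return [ch[i] for i in range(n_samples) for ch in channels]


def flatten_no_size(samples):
    """Pack samples as big-endian float64 with no length prefix (assumed layout)."""
    return struct.pack(f">{len(samples)}d", *samples)


def unflatten(payload, n_channels):
    """Client side: rebuild the per-channel lists from one received packet."""
    count = len(payload) // 8  # 8 bytes per float64
    flat = struct.unpack(f">{count}d", payload[: count * 8])
    return [list(flat[c::n_channels]) for c in range(n_channels)]


if __name__ == "__main__":
    block = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
    packet = flatten_no_size(interleave(block))
    print(len(packet), unflatten(packet, 2))
```

Because no size is prepended under these assumptions, the client has to know the channel count and samples per packet in advance in order to split the TCP byte stream into packets.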

 

The dynamic VIs and the support VIs they need (i.e. DAQmx VIs) are located in a Source Distribution. In both EXE versions, the Dynamic Launcher loads the dynamic VIs from the Source Distribution.

The issue appears when I compile my Main.vi into a Shared Object and then call that VI from a C executable so that it runs in the Embedded LabVIEW Runtime. The system loads the dynamic VIs as designed. The DAQ instrument loads and launches properly. The data is passed over the Producer/Consumer Queue just fine. The data connection to the TCP client works properly.

For both tests I use the same client and just leave it connected to stream the TCP data.

But the version launched by the C EXE and running in the Embedded Runtime Engine crashes after about 3 hours with a "Double Free or Corrupt" memory error. I am not calling any kind of memory release function in my code, of course. I am not building an array or concatenating any strings in the DAQmx VI. The Consumer Queue loop keeps up just fine in both cases. The conversion from 2D to 1D interleaved and the flatten to string together complete in 1/1000 of the time between data packets arriving from the Producer Loop.

 

CentOS 7.6 running Gnome

LabVIEW 2019 SP1 (x64 of course)

LabVIEW Runtime (for normally built LabVIEW EXE)

LabVIEW Embedded Runtime (for the Main.vi in the .SO and launched by C exe)

DAQmx is up to date

 

Normal LabVIEW EXE version (this version requires an X GUI in order to run):

[Image: Normal LabVIEW Runtime.png]

Embedded LabVIEW Runtime version:

[Image: Embedded LabVIEW Runtime.png]

I am getting NO LabVIEW generated errors in my code. I am running using Syslog and each VI logs any error that might occur.

 

In my /var/log/messages file I will see the memory fault called out - 

LabVIEW caught an Error 

Double free or corrupt

Then the code will be aborted.

A memory dump appears to happen - but I do not know how to interpret this.

 

NI - what is the difference between the two run-time engines that would cause the same VIs to behave differently?

Is DAQmx fully tested in the Embedded LabVIEW runtime?  That's one possible source of the issue I can think of...

 

Also, occasionally in the Embedded Runtime version, when I stop the DAQmx VI I get a "Double Free or Corrupt" (fasttop) error. All my opened references are closed in the proper order, and I get no LabVIEW errors when closing my Queue ref, stopping my DAQmx task, releasing the User Event, etc. as the VI shuts down.

I get no unloading error when using the regular LabVIEW EXE/Runtime.

Ryan Vallieu CLA, CLED
Senior Systems Analyst II
NASA Ames Research Center
Message 1 of 4

OK - there seem to be two issues.

I changed the code to remove a read of the DVR that was storing the DAQmx task. The read was a parallel-access read, with the DVR not being written back into the right-side node.

When triggered by the DAQmx Event, the User Event Structure would write the DAQmx Task and the Number of Elements into the DVR. The State Machine would then transition to the Read Data state, where the DAQmx task and Num Elements would be read from the DVR (of Variant, used as an attribute LUT).

To test whether the read of the DVR of the Variant LUT was the issue, I moved the Read Data state into the User Event Structure.

This removed both the write into the DVR and the parallel-access read from the DVR in the Read case. Now only the DAQmx Read and the publish via Queue are in the User Event case for the DAQmx Event. The read is also a malleable VI, and I previously had trouble with a malleable VI not loading properly in the dynamically launched VIs, so maybe that contributed.

 

This is the code that I skipped calling in the Read Data case. It pulled out the cluster of DAQmx Task and Number of Points. It ran fine for 3 hours but seems to have been causing the SIGSEGV faults after that time. The only proof I have is that, once I skipped this call, the code ran for 12 hours with no SIGSEGV fault.

[Image: RVallieu_0-1638294262574.png]

The write that was in the User Event Structure could also have been the issue, I suppose, although I had added code to skip the write if the DAQmx Ref and sampleInterval were the same (which they should be), so the read seems the more likely culprit:

[Image: RVallieu_1-1638294492676.png]

In the C EXE form the code ran for 12 hours, up from 3 hours. That seemingly removed the SIGSEGV fault. I thought to try this because I saw a /var/log/messages entry about a Variant writing empty (or some such verbiage) around the prior crash, where there was the SIGSEGV.

 

Again - none of this happens in the regular LabVIEW Runtime, only the Embedded.

 

This time the error captured by the system was a SIGBUS fault.

 

FileManager1: Reason:

FileManager1: LabVIEW caught a fatal signal

FileManager1: 19.0.1f1 - Received SIGBUS
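
Entries like the ones quoted in this thread can be pulled out of a long log with a small script, which makes it easier to line crashes up against run time. The sketch below is a rough illustration: the two patterns are assumptions based only on the lines quoted here ("Received SIG..." and "Double free or corrupt"), and the real log may word things differently.

```python
import re

# Patterns are guesses based on the log lines quoted in this thread.
FATAL_PATTERNS = [
    re.compile(r"Received (SIG[A-Z]+)"),
    re.compile(r"(double free or corrupt\w*)", re.IGNORECASE),
]


def find_fatal_events(lines):
    """Return (line_number, matched_text) for each line that looks fatal."""
    events = []
    for number, line in enumerate(lines, start=1):
        for pattern in FATAL_PATTERNS:
            match = pattern.search(line)
            if match:
                events.append((number, match.group(1)))
                break  # one event per line is enough
    return events


if __name__ == "__main__":
    sample = [
        "FileManager1: LabVIEW caught a fatal signal",
        "FileManager1: 19.0.1f1 - Received SIGBUS",
    ]
    for number, text in find_fatal_events(sample):
        print(number, text)
```

To scan the real file, pass the open file object for /var/log/messages to find_fatal_events; reading that file usually requires root.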

 

After this the program still appeared to be running, according to the Resource Manager. Memory usage had not increased over the whole 12 hours, so there isn't a memory leak.
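
On a headless box, the same memory check can be scripted instead of watching a GUI monitor. A minimal sketch, assuming a Linux /proc filesystem and that resident set size (the VmRSS field of /proc/<pid>/status) is a reasonable stand-in for "memory used"; the function names are hypothetical.

```python
import re


def parse_vm_rss_kib(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid):
    """Read the current resident set size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_rss_kib(handle.read())


def rss_growth_kib(samples):
    """Difference between the last and first sample of a run."""
    return samples[-1] - samples[0]
```

Calling read_rss_kib once a minute and logging the value gives a trend over a multi-hour run. A flat trend argues against a leak, but it says nothing about heap corruption, which can occur without any growth in memory use.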

 

SIGBUS suggests a memory addressing issue on a read.

I'm not sure how to debug whether the problem is caused by the DAQmx read or by the Queue used to pass data from the producer to the consumer. I suppose I can cut out the Queue and not write the data anywhere after reading it from DAQmx, which might narrow it down further.

Something is different between the normal compiled LabVIEW EXE and the version compiled into a .SO and called from the C EXE, as I have had no issue running this in the GUI version of the Runtime.

 

I suppose one way around this is to run the GUI based EXE Runtime, but then I would need to assign an X Display to the EXE on start-up - which might be possible. https://knowledge.ni.com/KnowledgeArticleDetails?id=kA00Z0000019RYlSAM&l=en-GB

Message 2 of 4

I ran another validating experiment last night.

I started the regular LabVIEW Runtime EXE version from a terminal and assigned a display to it using: DISPLAY=:0 ./EXE_name

This version has again run for 24 hours with no issues, no crashes, and no memory leaks. I was able to stop and close the dynamic DAQ instrument and then reload it as designed with no crashing. I'd rather not have to install a GUI on the system that this will be deployed onto for production use, but I guess that is one solution if we can't figure out what is happening with the Embedded Runtime.

Message 3 of 4

I've run the normal LabVIEW EXE runtime for over 4 days and have been able to shut down and reload the instrument with no errors.

The C EXE that calls the LabVIEW Main from the .SO to run in the Embedded Runtime will only run for up to about 24 hours.

I ran it last night and found this morning that it was still running, but when I issued the shutdown command the C EXE crashed with an 'abrt-hook-ccpp' SIGABRT crash.
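
One way to record exactly which signal ends each run, without relying on the system crash reporter, is to start the C executable from a small supervisor script. This is a sketch only: it relies on the documented behaviour of Python's subprocess module, where a negative return code on POSIX means the child was terminated by that signal number. The function names are hypothetical.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a signal name or exit code."""
    if returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        return signal.Signals(-returncode).name
    return f"exit code {returncode}"


def run_and_report(command):
    """Run a command to completion and describe how it ended."""
    completed = subprocess.run(command)
    return describe_exit(completed.returncode)


if __name__ == "__main__":
    # Pass the launcher and its arguments on the command line.
    print(run_and_report(sys.argv[1:]))
```

Logging this result with a timestamp after every run would show whether the failures are consistently SIGSEGV, SIGBUS, or SIGABRT, and how long each run lasted before failing.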

 

The only thing I really haven't done yet is transition away from LVCLASS in the VI: instead of dynamic LVClass loading, just use regular nodes and forgo the HAL aspect of the code.

The other option to try is going to LV2020.

Message 4 of 4