

release queue memory after consuming large arrays on RT/cRIO

Solved!
Go to solution

I'm working on an application, running on a cRIO, that receives files from the network (up to about 7 MB per file) and transfers the data to the FPGA, which writes it out synchronized with an external signal. The files are never written to the cRIO disk; they're streamed directly from memory to a DMA channel. I may receive files faster than I write them to the digital outputs, so I put the incoming files into a queue. Each file is simply a large array of bytes, and the queue holds arrays of bytes. Each time I finish transmitting one array to the digital outputs, I dequeue the next one.
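Since LabVIEW is graphical, here's a rough text-language analogy (in Python) of the producer/consumer arrangement described above: one loop enqueues each received file as a whole byte array, and a second loop dequeues one array at a time and hands it off, standing in for the DMA write. The names and sizes here are illustrative stand-ins, not the actual application code.

```python
import queue
import threading

file_queue = queue.Queue()

def network_loop(files):
    # Producer: each received file goes into the queue as one byte array.
    for data in files:
        file_queue.put(data)
    file_queue.put(None)  # sentinel: no more files

def output_loop(results):
    # Consumer: dequeue the next file only after the previous one has
    # been fully handed to the (simulated) DMA channel.
    while True:
        data = file_queue.get()
        if data is None:
            break
        results.append(len(data))  # stand-in for streaming to the DMA channel

received = [b"\x00" * 1024, b"\x01" * 2048]  # tiny stand-ins for ~7 MB files
written = []
consumer = threading.Thread(target=output_loop, args=(written,))
consumer.start()
network_loop(received)
consumer.join()
```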

 

My issue is that in some situations I receive enough files to fill the cRIO's memory, and that memory doesn't get released as the items are dequeued. According to other threads on this forum, this is normal behavior. However, I can't find a way to force the queue to release that memory, and in some cases this prevents me from downloading any more files - I get error 2 ("Out of memory") when I unflatten the incoming TCP stream into an array. I've tried adding code that flushes the queue, enqueues empty arrays, and then dequeues the empty arrays, but none of that seems to free the memory. Anyone have thoughts?

 

This is LabVIEW 2012SP1, running on a cRIO-9076 with VxWorks as the operating system.

Message 1 of 5
Solution
Accepted by topic author nathand

The only way I know of to release queue memory is to destroy the queue. But depending on your architecture, this might not be possible.


Message 2 of 5

You may have to take Crossrulz's suggestion: destroy the queue and get the remaining elements, create a new queue, and shove the elements back into it. I have to say, the more I use cRIOs, the more I find their limitations and have to spend more time optimizing than I would like. Have you had the same experience?
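The drain-and-recreate pattern suggested here can be sketched as follows, again in Python as an analogy (in LabVIEW this would correspond to Flush Queue / Release Queue followed by Obtain Queue; the helper name is hypothetical). The point is that the old queue, with its grown internal buffer, becomes unreferenced and reclaimable, while any not-yet-consumed files carry over to the fresh queue.

```python
import queue

def recreate_queue(old_q):
    # Drain whatever the consumer hasn't gotten to yet.
    remaining = []
    while True:
        try:
            remaining.append(old_q.get_nowait())
        except queue.Empty:
            break
    # Fresh queue with no grown backing storage; re-enqueue the leftovers.
    new_q = queue.Queue()
    for item in remaining:
        new_q.put(item)
    return new_q

q = queue.Queue()
for i in range(3):
    q.put(bytes([i]))
q = recreate_queue(q)  # old queue is now unreferenced and collectable
```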

Message 3 of 5

Thanks for the comments. I'm not completely happy with the solution, but I was able to restructure my code so that I can destroy the queue, by stuffing the queue reference in a notifier and reading the notifier first, then dequeueing.
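One possible reading of that notifier arrangement, sketched in Python (the `Notifier` class here is a hypothetical stand-in for LabVIEW's notifier, a lock-protected latest-value slot): the consumer first reads the notifier to get the current queue reference, then dequeues from it. That lets the producer destroy a memory-bloated queue, create a fresh one, and publish the new reference without the consumer holding a stale queue.

```python
import queue
import threading

class Notifier:
    # Minimal latest-value notifier analogy.
    def __init__(self):
        self._lock = threading.Lock()
        self._value = None
    def send(self, value):
        with self._lock:
            self._value = value
    def read(self):
        with self._lock:
            return self._value

notifier = Notifier()
q1 = queue.Queue()
q1.put(b"first")
notifier.send(q1)

# Producer decides to reclaim memory: build a new queue and re-publish it.
q2 = queue.Queue()
q2.put(b"second")
notifier.send(q2)
q1 = None  # old, bloated queue is now unreferenced

current = notifier.read()  # consumer reads the notifier first...
item = current.get()       # ...then dequeues from whichever queue is current
```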


@for(imstuck) wrote:

I have to say, the more I use cRIOs, the more I'm finding their limitations and have to spend more time optimizing than I would like. Have you had the same experience?


That hasn't been my experience, but I like optimizing and especially enjoy programming FPGAs in LabVIEW. I also first worked with a cRIO 8 or 9 years ago - shortly after they were introduced, if I'm not mistaken - and between improvements to technology and improvements to LabVIEW, the new one I'm using now seems a lot less limited. I've previously filled my FPGA and needed to rework it, but this is the first time I've run into a non-FPGA limitation of the cRIO that required a complicated workaround. The cRIO is also a lot less limiting (although also much more expensive) and easier to debug than the processors for which I'd been doing embedded C programming recently, so maybe it depends on your point of comparison.

Message 4 of 5

One more comment - I'm finding that often optimizing simplifies my code. For example, in my current application, I'm reading files over the network and then writing them to the FPGA over a DMA channel. Initially I was reading from the network in chunks, converting each chunk from a string to a U32 array, and inserting it into a larger pre-allocated array. When I finished reading the file, I put the array in a queue (as described before), and in another loop I dequeued the array and sent it to the DMA channel, again in small chunks.

 

I'm now getting much better performance by reading the entire file in a single TCP Read (7 MB at once) and enqueueing the string. In the other loop I dequeue the string, convert it to a U32 array, and write the entire array to the DMA buffer in a single shot. This requires a much larger DMA buffer but simplifies the code substantially. I no longer need to poll to see if there's space available in the DMA buffer and, if so, write more elements; instead, I just wait for the FPGA code to raise an interrupt signaling that the DMA buffer is empty, and write the next element from the queue to it. This works in my application because I get a break between each set of data that I need to transfer to the FPGA.
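The difference between the two read strategies can be sketched like this, with `io.BytesIO` standing in for the TCP connection and tiny sizes standing in for the ~7 MB files (this is an analogy of the structure, not the actual LabVIEW code). Both produce the same bytes; the single-shot version just makes one read call and leaves the buffering to the OS.

```python
import io

FILE_SIZE = 16  # stand-in for ~7 MB
CHUNK = 4       # stand-in for the original small chunk size

def read_chunked(conn):
    # Original approach: read small chunks and insert each into a
    # pre-allocated array at the right offset.
    out = bytearray(FILE_SIZE)
    for offset in range(0, FILE_SIZE, CHUNK):
        out[offset:offset + CHUNK] = conn.read(CHUNK)
    return bytes(out)

def read_single(conn):
    # Revised approach: one read for the whole file, handled in a
    # single shot downstream.
    return conn.read(FILE_SIZE)

payload = bytes(range(FILE_SIZE))
chunked = read_chunked(io.BytesIO(payload))
single = read_single(io.BytesIO(payload))
```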

 

By eliminating loops and reading and writing large amounts of data at once, I'm letting the operating system handle most of the details in the background. I can now transfer data 3x as fast as my initial code, nearly saturating the 100 Mbit network link, and my code is simpler (and, I think, more memory-efficient), so I'm happy.

Message 5 of 5