I'm writing a custom data buffer because queues won't cut it. I need to write one element in, then read multiple elements out, but only flush them from the buffer if they've been successfully processed.
My elements are strings, so my buffer is a preallocated array of strings in a DVR with an int for the next write position.
How does LabVIEW store an array of strings in memory? I assume it's one continuous memory block of pointers to strings. The strings can be anywhere else in memory.
Thanks for any insight you have on this; hopefully I described the questions well enough!
1) It depends on whether that wire goes anywhere else. If not, it is simply stored in the Queue, DVR or whatever. If the wire branches and LabVIEW cannot schedule the operations in a way that avoids a copy (if other sinks stomp on or modify the string, or if it crosses a diagram structure or subVI boundary), then it has to create a copy anyway.
2) No trivial answer here. LabVIEW has subarrays, which are essentially aliases into the real buffer. So if you do a Split Array it MIGHT decide to create two subarrays that point into the original array buffer, each just defining an offset and length into that buffer. But as soon as something needs to be done with those subarrays that goes beyond reading elements, the subarray has to be turned into a real array, and that involves a buffer copy.
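To make the alias idea concrete, here is a rough sketch in Python (a textual stand-in, since LabVIEW is graphical; the names and layout are purely illustrative, not LabVIEW's actual internals):

```python
# Sketch of the subarray idea: a "view" is just (parent, offset, length),
# so splitting is O(1); an element copy happens only when the view has to
# become an independent, owned array. Illustrative only.

class SubArrayView:
    def __init__(self, parent, offset, length):
        self.parent = parent      # reference to the original buffer
        self.offset = offset
        self.length = length

    def materialize(self):
        # The buffer copy: turn the alias into a real array.
        return self.parent[self.offset:self.offset + self.length]

def split_array(arr, at):
    # No element copies here, just two descriptors into the same buffer.
    return (SubArrayView(arr, 0, at),
            SubArrayView(arr, at, len(arr) - at))

buf = ["a", "b", "c", "d"]
left, right = split_array(buf, 1)
assert right.materialize() == ["b", "c", "d"]  # copy happens only here
```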
3) LabVIEW never does garbage collection. A buffer that is no longer needed (referenced) is deallocated right away. However, the underlying memory manager has its own lazy deallocation mechanism, which means the memory might technically be marked as available again but not returned to the system for other applications to use. And it may not even be used immediately for a new allocation request in LabVIEW, depending on the size of the "free" memory block and what is requested.
Thanks @rolfk, do you mean that LabVIEW will never release memory? Like if I briefly use some large chunk of memory (1GB) then deallocate it, my application will hold that until it closes? If so, that's not very good - any way to work around this?
If that huge chunk of memory is used in a subVI, you can add a Request Deallocation node to it, so the memory gets freed once the subVI ends.
You should be careful with that, because in a typical scenario that subVI will be called again with similar memory requirements and it would be much cheaper to reuse the allocated memory instead of needing to allocate from scratch over and over.
We probably need to see some code, because words alone are ambiguous. Then we can give more targeted advice to improve your code.
Good point, here you go!
This is just a mock-up of course. The Write and Read are both called from loops running at different rates. I have some more logic in the Read/Write to prevent over/underflows. This buffer doesn't wrap around like a classic FIFO; instead it grows from 0 to N until data is successfully processed, then it resets back to 0.
I pre-allocate the buffer because I don't want LabVIEW to keep copying that array as I append elements and grow its size. If this is a non-issue, I could just append the New Elements, and write a 0-length buffer upon successful completion. But I think re-copying the array will be inefficient.
So I'm curious about the best way to blank out the buffer after a successful Read without burning too much CPU making unnecessary memory copies.
I've seen the explicit Request Deallocation function, but can't tell what it would actually free. In a design such as mine (spread across subVIs), where would I need to call it to ensure the memory of the blanked-out, successfully processed elements is released back to the OS?
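For reference, here is the grow-then-reset behavior I'm describing, sketched as pseudocode (Python as a textual stand-in; the names are hypothetical and the real logic lives in the Write/Read subVIs around the DVR):

```python
# Minimal sketch of the described buffer: preallocated storage, a write
# index that grows 0..N, a read that returns everything unread, and a
# flush that only happens after the caller reports success.

class GrowResetBuffer:
    def __init__(self, capacity):
        self.storage = [""] * capacity   # preallocated once
        self.count = 0                   # next write position

    def write(self, element):
        if self.count == len(self.storage):
            raise BufferError("overflow")
        self.storage[self.count] = element   # replace in place, no growth
        self.count += 1

    def read(self):
        # Return the filled portion; nothing is removed yet.
        return self.storage[:self.count]

    def flush(self):
        # Called only after successful processing: reset the index,
        # keep the allocation for reuse.
        self.count = 0

buf = GrowResetBuffer(4)
buf.write("x"); buf.write("y")
pending = buf.read()
assert pending == ["x", "y"]
buf.flush()                  # processing succeeded -> drop them
assert buf.read() == []
```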
Well, it is difficult to tell from a picture, a VI would be preferred. (also you should make sure to pick "large" when inserting images)
Code raises a lot of questions. Are there any other parameters known? (What is a typical buffer size? Are all strings fixed size? Why would you resize by swapping in an empty string array if the adding part assumes a known size? I assume you are not closing the DVR reference in parallel to the upper code, etc.)
Here's a larger view of the code. I tried uploading the VI too, but the site keeps giving me an error, not sure why.
This isn't the actual code I'm running, it's just all the important pieces in one place so we can see it instead of me badly trying to describe the code.
No, I'm not killing the DVR immediately after creating it, that's just for show on the diagram. I originally mocked up with a flat sequence, but took it out so the diagram would be more square and fit the forums better.
The other parameters are not known. This is for a lower-level library that I will be using in maybe a dozen places. The buffer-size and element length are both variable depending on who calls the library.
I tried swapping in the empty array because the In Place Split/Replace docs say it will automatically pad the array so the output length doesn't change. This is the area I'm specifically looking for input, so if you know better ways to blank out an array I'd like to hear them. The goal is to blank the array in place so I don't need to keep allocating buffer-sized chunks of memory every time I read.
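In text form, the trade-off I'm weighing looks roughly like this (Python as a stand-in for the LabVIEW diagram; `blank_by_swap` and `blank_in_place` are made-up names):

```python
# Two ways to "blank" a preallocated buffer of strings. Swapping in a
# fresh empty array allocates a buffer-sized chunk every time; over-
# writing the processed slots in place keeps the original allocation.

def blank_by_swap(buf):
    return [""] * len(buf)          # new allocation on every read

def blank_in_place(buf, n_processed):
    for i in range(n_processed):    # touch only the processed slots
        buf[i] = ""
    return buf

buf = ["aaa", "bbb", "ccc"]
same = blank_in_place(buf, 2)
assert same is buf                  # same allocation, no new array
assert buf == ["", "", "ccc"]
```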
Why do you think you need to make your own memory manager? What do you think queues/arrays can't handle? Trying to outsmart LabVIEW's handling most often fails. It sounds to me like you want a fixed-size queue with a Lossy Enqueue.
Believe me, I'd strongly prefer to use a built-in feature of LabVIEW and not have to think about this! 😄
I need to write one element in, then read multiple elements out, but only flush them from the buffer if they've been successfully processed.
The closest existing pattern is to Get Queue Status and return all the values so I can process them. If the processing was successful, then I need to call Dequeue repeatedly to clear each of those elements from the queue.
But, I believe Get Queue Status will make a copy of the contents if you use them. Since my queue size may be large, and the string elements may be large, this is a large unnecessary copy. I am trying to join the subarray directly so it only needs to copy those large elements once (into the big string that gets processed).
Between the one-at-a-time dequeue and the large unnecessary copy, I've decided I need to upgrade from queues (I was previously using Flush Queue and accepting the lost elements if processing ever failed).
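The queue pattern I'm describing, sketched as pseudocode (a Python deque stands in for the LabVIEW queue; this is just the shape of the logic, not a claim about the queue primitives' internals):

```python
# Stand-in for the pattern: peek at everything (Get Queue Status with
# "return elements" makes a copy), process it, then dequeue one element
# at a time only if processing succeeded.
from collections import deque

def peek_all(q):
    return list(q)                  # this is the copy I'm objecting to

def flush_processed(q, n):
    for _ in range(n):              # one-at-a-time dequeue
        q.popleft()

q = deque(["line1", "line2", "line3"])
snapshot = peek_all(q)
joined = "\n".join(snapshot)        # the one big string that gets processed
processed_ok = True                 # pretend processing succeeded
if processed_ok:
    flush_processed(q, len(snapshot))
assert len(q) == 0
```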
Um, the code as shown isn't going to populate the buffer. You start with an empty array of strings, and trying to replace an element of an empty array is just gonna result in an empty array still.
Once you fix that, your IPE Array split/replace functions aren't gonna work like you want either. You only manipulate the front end of the buffer, the tail end is untouched. When you try to replace the front end with your empty array, I think you *want* that to cause the former tail end to shift forward and become the front end, expecting LabVIEW to add padding at the tail end to exactly fill the buffer. But that isn't how LabVIEW actually does it. If padding is needed, LabVIEW is going to do the padding at the front end so that each chunk you replace at the right side of the IPE has the same length as what you had split out at the left side.
You *do* declare the "lines-in-buffer" to be 0, so subsequent adds will start overwriting the front end padding. But you'll never process the original tail end values. You may eventually overwrite them and then they're lost for good. And later some of your new tail end values will get lost the same way. Over and over again.
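A rough model of that padding behavior (illustrative Python only, not LabVIEW code; it just shows why the tail never shifts forward):

```python
# Replacing the split-out front chunk with a shorter array gets padded
# at the FRONT so the chunk length matches what was split out. The tail
# end stays exactly where it was, so it never shifts forward.

def ipe_replace_front(buf, split_at, replacement):
    front = replacement[:split_at]
    # Pad at the front end so the replaced chunk is the same length
    # as the chunk that was split out on the left side of the IPE.
    front = [""] * (split_at - len(front)) + front
    return front + buf[split_at:]

buf = ["a", "b", "c", "d"]
out = ipe_replace_front(buf, 2, [])   # try to "blank" the front 2 slots
assert out == ["", "", "c", "d"]      # tail did NOT move forward
```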
Further, with all the concern about memory and processing efficiency, why turn your array of individual strings into a single delimited string inside this core code? That'll have to cause new memory allocation some of the time. Can't your processing function be reworked to operate on the array?
So far, I think your cure is worse than the disease. Fix it up and make it function correctly. Then I'd suggest you make a 2nd version that's queue-based for comparison.
Then build up benchmarking code that grows very large arrays full of very long strings and run each method for a significant duration (but only one at a time). Initially at least 10 seconds, but once you're sure you're fully debugged, give it much longer runs. Maybe something like 30 minutes or more? See how they each do in terms of both speed and memory usage.
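A minimal shape for such a benchmark harness, sketched in Python as a stand-in (the real thing would be a LabVIEW VI with long runs and memory monitoring; names and durations here are placeholders):

```python
# Run one variant alone for a fixed wall-clock duration and count
# iterations, so the two implementations can be compared on throughput.
import time

def bench(fn, seconds=0.1):
    end = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < end:
        fn()
        iterations += 1
    return iterations

def variant_a():
    # Placeholder workload: grow an array of longish strings.
    buf = []
    for _ in range(100):
        buf.append("x" * 256)

it = bench(variant_a)
assert it > 0
```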