Memory hog/timing problems

cathie · ‎07-31-2013

Can someone tell me what is so difficult about the attached subVI?

I think I've stripped it down as far as I can. The Profiler tells me that it takes 47-70 ms to run, but it gets new data every 40 ms AND there's other stuff going on in the bigger picture.

I've included the little sub-subVI, which I thought was the culprit. Profiler tells me it only takes 16 ms by itself, and doesn't even mention it when running the caller.

Is it just because of the big cluster? That's my shift register for the whole shebang. Is it more efficient to unbundle before calling the subVI? If yes, why?

Sorry if this is a bit rambly...I've been fighting this for a while.

The_Seeker · ‎07-31-2013

Hi cathie,

It appears that your memory and timing issues are related to memory management -- specifically the Build Array function in the subVI (although, depending on the size of your arrays, the unbundle and rebundle in the caller may also be contributing).

Ideally, you should allocate all of the memory you need for the array first and use the Replace Array Subset function, rather than building the array by concatenation (as you are doing in the subVI).

Nesting your array inside of the cluster may not be the best idea either, if the array is changing size, or frequently replaced. Each call to this VI will create copies, which can fragment memory, and slow things down. (The larger the arrays, the more of a problem this becomes.)

If it makes sense to your application, you could try removing the array from the cluster, building it into a functional global (FG) which maintains the array in a shift register. The FG could act as an array manager module, allocating the array at the start of execution, and replacing elements as needed during the application run. (If the array needs to change size, allocate the maximum expected size at system start-up, and maintain a pointer in a shift register that points to the end of the useful data.)
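Since a block diagram can't be pasted as text, here is a rough Python sketch of the "allocate once, replace elements, keep a pointer to the end of the useful data" idea described above. The class and method names (`PreallocatedBuffer`, `append`, `valid`) are made up for illustration and do not correspond to any LabVIEW API; treat it as an analogy, not as how LabVIEW implements arrays.

```python
from array import array


class PreallocatedBuffer:
    """Fixed-capacity buffer: allocate once, then replace elements in place."""

    def __init__(self, capacity: int) -> None:
        # Single allocation up front (the analogue of Initialize Array at start-up).
        self._data = array("d", [0.0]) * capacity
        # "Pointer" to the end of the useful data: index one past the last valid sample.
        self._count = 0

    def append(self, value: float) -> None:
        if self._count >= len(self._data):
            raise OverflowError("buffer full; preallocate a larger capacity")
        # Overwrite an existing slot; the underlying storage never grows.
        self._data[self._count] = value
        self._count += 1

    def valid(self) -> memoryview:
        # A view of the useful portion only; no copy of the data is made.
        return memoryview(self._data)[: self._count]


buf = PreallocatedBuffer(4)
buf.append(1.5)
buf.append(2.5)
print(buf.valid().tolist())
```

The point of the design is that the storage size is fixed after start-up, so adding a sample is a single element write rather than a reallocation plus copy of the whole array.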

There are some useful Knowledge Base articles about 'best practices' for memory management. Search ni.com for these and take a look to get some more ideas.

Good luck with your application!

-- Dave

www.movimed.com - Custom Imaging Solutions

Todd_Lesher · ‎07-31-2013

Nice documentation.

Since you're sure Array OUT will always be the same size as Array IN, I was going to suggest the In Place Element Structure - but I'm not having much luck, so far. Have you tried Delete From Array instead of Array Subset?

nathand · ‎07-31-2013

How large are the arrays you're using here? They would need to be fairly large before this would become problematic.

A couple of quick things to try: set all of the inputs of the subVI to "required" instead of "recommended." Disable debugging in the execution properties of the subVI. Change the subVI priority to "subroutine." Make sure the subVI front panel isn't open during execution. See if those help. If they do, then the subVI is the problem. If they don't, then the problem is somewhere else in your code.

cathie · ‎08-01-2013

Thanks everybody.

I've always shied away from Globals, as they seem dangerous to me. I looked up a bit about "functional globals," and it seems to me that it's almost what I'm already doing...this subVI that I showed is in one frame of a queued state machine, in which all the data and control variables are passed around in a big shift register. Everything is initialized, and never changes size. My previous attempt at a "rotate&replace" (attached) was even more of a pig: it used a for loop to get all the columns through.

Anyway, I put the data arrays (75000*9*dbl, 2500*5*dbl, 20*4*dbl) in a global variable, removed them from the shift register, and now my program flies like the wind. Now I just have to fight all the other dragons.

The_Seeker · ‎08-01-2013

Hi cathie,

Glad to hear you have made some progress!

Based on your comments, it sounds like you are using a native LabVIEW Global Variable. While this has apparently improved your situation, you may still want to consider the additional benefits of using a Functional Global.

Native LabVIEW Globals tend to get a bad rap for two reasons:

1) if you are not careful, they can be prone to race conditions

2) a data copy is typically made every time and at every location that the LV global is referenced.

Functional Globals don't necessarily guarantee freedom from all race conditions; however, when properly implemented (with forced data/execution dependencies), race conditions are not usually a problem. Perhaps more importantly, data copies can be kept to an absolute minimum.

Based on your comments, it sounds like you were originally maintaining your large data structure in a shift register in the top-level loop of your application. Any time you wanted to make a change to any element of the data, you had to unbundle from the cluster, make the change, and rebundle. In order to unbundle, the entire cluster must be copied to the subVI, unbundled, manipulated, rebundled, and passed back to the top-level. Throughout all of these transactions, several copies of the cluster must be made. (Apparently, this was the cause of your slowdown.)
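To put a rough number on what a copy costs, here is a small Python calculation using the array sizes quoted earlier in the thread (75000*9, 2500*5 and 20*4 DBL) and the 40 ms update period. The figure of three copies per iteration is purely an assumption for illustration; the real count depends on the diagram. The calculation also ignores array headers and any other cluster elements.

```python
BYTES_PER_DBL = 8  # a LabVIEW DBL is a 64-bit double

# Dimensions of the three DBL arrays quoted in the thread.
array_shapes = [(75000, 9), (2500, 5), (20, 4)]


def payload_bytes(shapes):
    """Raw data size of the arrays, excluding headers and other cluster fields."""
    return sum(rows * cols * BYTES_PER_DBL for rows, cols in shapes)


total = payload_bytes(array_shapes)

ITERATIONS_PER_SECOND = 25   # one update every 40 ms
COPIES_PER_ITERATION = 3     # ASSUMPTION for illustration only

bytes_per_second = total * COPIES_PER_ITERATION * ITERATIONS_PER_SECOND

print(total)             # 5500640 bytes, roughly 5.5 MB per full copy
print(bytes_per_second)  # 412548000 bytes/s under the assumed copy count
```

Under that assumption, the loop would be moving on the order of 400 MB of data per second just for copies, which gives a sense of why avoiding them matters at a 40 ms period.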

While it might appear on the surface that a Functional Global is essentially the same as what you were doing with a top-level shift register, there is a fundamental difference. The data is stored in a single place, and the need for data copies is drastically reduced.

The Functional Global maintains a single copy of the large data structure in its shift register. When intelligently implemented, a Functional Global can become much more than a simple data repository. Implemented as a state machine, it can become a multi-state data management module, performing all data manipulation operations internally. Data passed into and out of the FG can be limited to only the pertinent subset of the entire large cluster that is relevant to the calling VI. The unbundling and processing can be handled internally by the various states of the FG (preferably using "in-place" memory operations), resulting in optimal memory conservation.

Anyway, it sounds like the LV Globals are solving your problem for now, so that's good news! Still, you might want to give FG memory management components/modules a try... once you have experienced the simplicity and power, you will never go back!

One final tip: It is convenient to use an enumerated control as a means to select the various processing states of your FG/data module. Just be sure to save the enum as a Type Def so you can add additional cases as your application grows. (The other option is to use strings to specify the FG states... but this is a matter for another thread, and I certainly don't want to revive that Holy War...!)
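For readers who think better in text, here is a loose Python analogy of an enum-driven Functional Global. The closure variable stands in for the uninitialized shift register, and the `Action` enum stands in for the type-defined enum selector. All names here (`Action`, `data_fg`, the keyword parameters) are hypothetical, and the sketch assumes a single caller at a time; it does not model the serialization a non-reentrant VI provides.

```python
from enum import Enum, auto


class Action(Enum):
    """Stand-in for the type-defined enum that selects the FG state."""
    INIT = auto()
    REPLACE = auto()
    READ_SUBSET = auto()


def _make_fg():
    store = []  # plays the role of the FG's shift register

    def fg(action, *, size=0, index=0, values=(), count=0):
        nonlocal store
        if action is Action.INIT:
            # Allocate the full array once.
            store = [0.0] * size
            return None
        if action is Action.REPLACE:
            if index < 0 or index + len(values) > len(store):
                raise IndexError("replacement would fall outside the array")
            # Same-length slice assignment: elements are overwritten in place.
            store[index:index + len(values)] = values
            return None
        if action is Action.READ_SUBSET:
            # Only the requested subset leaves the FG, not the whole array.
            return store[index:index + count]
        raise ValueError(f"unhandled action: {action}")

    return fg


data_fg = _make_fg()
data_fg(Action.INIT, size=5)
data_fg(Action.REPLACE, index=1, values=[1.0, 2.0])
print(data_fg(Action.READ_SUBSET, index=0, count=3))
```

Callers only ever pass in the new values and get back the subset they asked for; the large array itself never crosses the boundary.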

Best of luck with your application!

-- Dave

www.movimed.com - Custom Imaging Solutions

nathand · ‎08-01-2013

cathie wrote:

Anyway, I put the data arrays (75000*9*dbl, 2500*5*dbl, 20*4*dbl) in a global variable, removed them from the shift register, and now my program flies like the wind. Now I just have to fight all the other dragons.

If you're still interested in understanding why the original approach didn't work, we'd need to see more of your higher-level code to understand how that shift register was being used. If you were doing something that forced unnecessary copies to be made, that would make it slow. It's possible that your original problem could have been solved with an in-place element structure instead of the bundle/unbundle. Also, did you try any of the simple suggestions about your subVI (making terminals required, disable debugging, etc)? Did they help?

Mark_Yedinak · ‎08-01-2013

If you have the cluster wired through your state machine, you can consider replacing it with a DVR to the cluster. Initialize it once as you do now, but rather than having a large cluster passing through your state machine and possibly getting copied, you have a reference to it. Copying the reference will not impact anything. This will also guarantee that you will not make any copies of your large data set. This is not necessarily true when using a FG, since you generally have the cluster wired in and out of the FG.
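As a text analogy for the DVR suggestion, the Python sketch below passes around a small handle while the data stays in one place, and access goes through a locked block, loosely mirroring how a DVR is read and modified inside an In Place Element Structure. `DataRef` and `in_place` are invented names for illustration only, not LabVIEW functions.

```python
import threading
from contextlib import contextmanager


class DataRef:
    """Handle to a single shared data set; copying the handle copies no data."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    @contextmanager
    def in_place(self):
        # One accessor at a time while the block is active.
        with self._lock:
            yield self._value


ref = DataRef({"samples": [0.0] * 8})
alias = ref  # a second "wire" carrying the same reference, not a data copy

with alias.in_place() as data:
    data["samples"][0] = 1.0  # modifies the one shared data set

with ref.in_place() as data:
    print(data["samples"][0])
```

The trade-off raised elsewhere in this thread still applies: every access pays for the lock, so it would be worth benchmarking against the alternatives before committing to one.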

Mark Yedinak
Certified LabVIEW Architect
LabVIEW Champion

"Does anyone know where the love of God goes when the waves turn the minutes to hours?"
Wreck of the Edmund Fitzgerald - Gordon Lightfoot

The_Seeker · ‎08-01-2013

Interesting idea Mark. It proves once again that in the LV world, there is "more than one way to skin a cat" (as the saying goes...)

While your solution would definitely improve the memory copy issue, I would want to do some benchmarking before concluding that this would be truly quicker than a "correctly implemented" FG. Updating by reference can be substantially slower than directly updating a diagram node. (With front panel controls and indicators, it can be orders of magnitude slower.)

To a previous point made in one of my earlier responses, you don't need to make copies of the entire cluster when passing data into and out of the FG. This is a key point. It is true that temporary copies of the arrays potentially need to be passed into and out of the FG, but depending on the requirements, and how you structure things, the arrays could potentially be constructed inside the FG using in-place functions. (Of course, the appropriateness of this would depend on several factors pertaining to the application that are not known to anyone in this thread other than cathie.)

The primary benefit of the FG is a single data storage location with minimal copies, and internal data management to boot. It's also easier when troubleshooting to probe data in a wire than it is to trace a reference.

Still, I would be curious to see how your solution compares to the "tried and true", well-implemented FG -- at least from a performance standpoint.

Cheers,

-- Dave

www.movimed.com - Custom Imaging Solutions
