Dr. Damien's Development - Memory Allocation in Loops

DFGray · ‎02-19-2010

One of the surest ways to destroy the performance of an application is to repeatedly allocate or reallocate memory in a loop. This commonly occurs with such primitives as Build Array, Insert Into Array, Delete From Array, and Concatenate Strings. Any primitive which causes an array or string to change its size will run the memory manager. In addition to causing execution speed issues, repeatedly using the memory manager can also fragment memory, causing low memory problems to become worse. There are essentially two cases — adding elements to an array (or string), and removing elements.

Adding elements to an array in a loop is usually encountered when elements are added on a conditional basis. For example, if you are logging and only want to save points when they are an amount different from the previous point, you may conditionally add points. The easy way to do this is to use the Build Array primitive to simply add the point to an array residing in a shift register. The example SimpleAddItems.vi shows this approach. This example creates an array of U8s which are randomly generated to be from 0–7 with no ‘5’s. Unzip the example file to the directory of your choice and run this VI.

A better strategy for this type of problem is to manage the growth of the array so that it minimizes the amount of times the memory manager is called. This is done by maintaining more space in the array than is actually needed and replacing elements instead of adding them. The standard strategy of doubling the amount of extra space every time you need more is fairly efficient. Open the example ManagedAddItems.vi and run it. Note that it takes about half the time to run as the previous example.

Managing the array requires keeping track of three itmes — the current index, the size or maximum index, and the next size increment. At each data addition, the array size is checked to ensure it is large enough to add the point. If it is not, it is resized by adding the next increment to the current size. The next increment is then doubled for the next addition. The element is then replaced in the array. When the loop is finished, the array is resized to the current valid data.

Deleting elements from an array in a loop can cause much worse problems. Open the example SimpleRemoveItems.vi. This VI generates a random array of points, then deletes any which are equal to five. Run this VI. It takes about two orders of magnitude longer to execute that the original simple example. This sort of operation is common when transforming data from one type to another. One example of this is replacing escape sequences in strings with the actual control characters (two bytes to one byte operation). The simple implementation deletes each element from the array as they occur.

The solution to this problem is to use a bucket brigade algorithm to move elements in the array, then trim the array to the new size at the end. This reduces the memory manager operations to one. Load BucketBrigadeRemoveItems.vi and run it. Notice that it is three orders of magnitude faster than the simple case. The bucket brigade algorithm is fairly simple. Elements of the array are shifted down into a new position to replace deleted elements, then the array is trimmed to the proper length at the end. It requires one extra piece of information — the current working index.

As a final note, all of these examples do exactly the same thing — they randomly generation one million points with a range of zero to seven then remove the fives. This should result in an array of about 875 thousand points (one eighth of the points are removed). However, there are orders of magnitude difference in the performance, mostly due to memory manager use. The performance leader, the bucket brigade algorithm, uses the memory manager once.

Examples are in LabVIEW 8.2.1.

Message Edited by DFGray on 02-19-2010 03:04 PM

Darin.K · ‎02-19-2010

Fabulous. You learn this stuff, use it, forget it, then revert to old habits.

Benchmarking is always a rather fun sport, and since you conveniently provide some benchmark ready code with the Bucket Brigade I played a little and shaved almost another factor of 2 off of the execution time. (20-24 msec to 11-14 msec)

Not sure how much each helped.

1. Ditch array size and index array for autoindexing (probably modest gains at best)

2. Ditch comparison to 5 and wire value straight to case structure. Removes redundant comparison.

3. Allow self replacement, dump shift register and inner case structure. Definitely wins in the end. You stop on average 7 self replacements at a cost of 100000 comparisons.

My guess is dumping comparisons gets most of the gains. Great fun. I try to use LV for more serious computing jobs, and tips like yours to shave execution time help immensely.

altenbach · ‎02-19-2010

Darin.K wrote:
I played a little and shaved almost another factor of 2 off of the execution time. (20-24 msec to 11-14 msec)

Hey, I had exactly the same idea. 😄

Goes from ~25ms to ~13ms. but when we disable debugging, it goes down to ~8ms. More than 3x of the original.

This is one of my standard templates. You can find many instances here in the forum. 😄

Message Edited by altenbach on 02-19-2010 02:08 PM

LabVIEW Champion.

Darin.K · ‎02-19-2010

@altenbach wrote:

Hey, I had exactly the same idea. 😄

Well then channeling my 'inner altenbach' definitely worked. And disabling debugging is something that myself and probably many others forget about way too often. There should be a button for that right on the BD.

Darren · ‎02-19-2010

Darin.K wrote:

And disabling debugging is something that myself and probably many others forget about way too often. There should be a button for that right on the BD.

That's a really good idea. A toolbar button for enable/disable debugging...it could go next to all the other debugging buttons. If you post that in the Idea Exchange, I'll vote for it.

Darin.K · ‎02-19-2010

Darren wrote:
Darin.K wrote:
And disabling debugging is something that myself and probably many others forget about way too often. There should be a button for that right on the BD.
That's a really good idea. A toolbar button for enable/disable debugging...it could go next to all the other debugging buttons. If you post that in the Idea Exchange, I'll vote for it.

Look for that idea here.

altenbach · ‎02-19-2010

There some other interesting observations, it always gets slower if we have more than two choices in a case structure.

For example if we make one case "5,255" instead of "5", it gets 50% slower overall, even though the case 255 never applies in our example and the result remains the same.

LabVIEW Champion.

Darin.K · ‎02-19-2010

altenbach wrote:
There some other interesting observations, it always gets slower if we have more than two choices in a case structure.

For example if we make one case "5,255" instead of "5", it gets 50% slower overall, even though the case 255 never applies in our example and the result remains the same.

That is very interesting. It doesn't seem to scale, it just takes the hit when going from 2 to 3+ cases.

altenbach · ‎02-19-2010

I investigated that in detail back in 2002 when working on the bit twiddling challenge. Keeping a case structure to only two simple cases is always fastest. Adding more choices slows things down.

In the given case, we could add a second case structure around the existing case structure where we select between default (containing current code) and 255 in the second case and the penalty is now only ~5% compared to the 50% increase in execution time with the "5,255" shared case. Both options produce the same result. The one with more code (extra case structure) wins! 😉

Suddenly, constructs like this make some sense (just kidding!) 😄

LabVIEW Champion.

Darin.K · ‎02-19-2010

The one with more code (extra case structure) wins! 😉

Suddenly, constructs like this make some sense (just kidding!) 😄

I was afraid of that. Sometimes the cure is worse than the disease.

LabVIEW

Dr. Damien's Development - Memory Allocation in Loops

Dr. Damien's Development - Memory Allocation in Loops

Re: Dr. Damien's Development - Memory Allocation in Loops

Re: Dr. Damien's Development - Memory Allocation in Loops

Re: Dr. Damien's Development - Memory Allocation in Loops

Re: Dr. Damien's Development - Memory Allocation in Loops

Re: Dr. Damien's Development - Memory Allocation in Loops

Re: Dr. Damien's Development - Memory Allocation in Loops

Re: Dr. Damien's Development - Memory Allocation in Loops

Re: Dr. Damien's Development - Memory Allocation in Loops

Re: Dr. Damien's Development - Memory Allocation in Loops