LabVIEW


Memory resize with change 0 - does it slow the calculation?


I am adding two arrays with 1e6 elements each; the addition loops 1000 times. When I use the buffer allocation tool, it shows a buffer allocation at one of the inputs of the Add node. When I trace the performance, it reports 1000 memory resizes with a change of 0. The addition takes what is, in my opinion, a very long time: 6 msec on a 2.1 GHz i5 processor. Is this related to the memory resize operation, or is it a normal processing time? If it is caused by the memory resize, how can I avoid it?

 Clip_2.png

Message 1 of 20

@muh1 wrote:

I am adding two arrays with 1e6 elements each; the addition loops 1000 times.


Perhaps you should give a better example of exactly what your issue is.  With this example, I see that you do not need a loop at all since the same value will always be going into the indicator.  Are you always adding the same array with itself?  If so, why not just multiply by 2?


Message 2 of 20
  • This is a bad example and does not prove anything.
  • As has been mentioned, the loop will actually be eliminated by the compiler once you disable debugging, because the result is always the same.
  • It also makes no sense at all to have an indicator in such a tight loop. Updating the indicator with 1M elements as fast as the computer allows is probably one of the most expensive operations in your code.
  • How are you measuring the execution time?
Message 3 of 20

This example is quite adequate to demonstrate the point: memory allocation and execution time. Is there anything wrong with it in this respect? Do you have an answer to my question for this particular example?

 

In reality I add two matrices, one of which comes from an outer product and the other from a shift register. Does it matter?

Message 4 of 20

It proves that there is a memory resize and that the execution time is some 6 msec. It is not intended to prove anything else. However, the real-life example has approximately the same execution time.

 

Actually, you are wrong: disabling debugging does not eliminate the loop, and the execution time remains essentially the same.

 

Indicators are not updated in each operation unless they are set to synchronous display (in which case the execution time goes up by a factor of 4 or so), so this is not a problem, really.

 

Flat sequence structure with tick count before and after. 

 

Just to make sure, I've modified the circuit slightly: 

 Clip_3.png

Surprisingly, the buffer allocation dot at the Add input has disappeared, but the time did not change much. It is probably the normal calculation time for this processor, after all. The memory allocation of LabVIEW remains a mystery, though.

Message 5 of 20

What is a circuit? What is the point of the shift register? If I see words and things like that I tend to question everything else.


@muh1 wrote:

Flat sequence structure with tick count before and after. 


You need to be significantly more detailed. What is before the structure, what is after, what is in each frame? What are the debugging settings?


@muh1 wrote:

Indicators are not updated in each operation unless they are set to synchronous display (in which case the execution time goes up by a factor of 4 or so), so this is not a problem, really.


Yes, FP updates are asynchronous by default, but the transfer buffer still needs to be written with each iteration, so you are doing significantly more work by having the terminal inside the loop. You are still creating way too much overhead.

Message 6 of 20

>> buffer allocation dot at the Add input has disappeared

 

It did not disappear; it moved to the shift register. This and the previous example need at least 2 arrays to store data: the original and the result. Technically, the Add operation uses 3 (input data 1, input data 2, and the result), but it puts the result into one of the possible input spaces without allocating all 3.

The display indicator is independent of these and needs separate space, which is why you have a dot on the indicator.
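
The buffer reuse described above can be illustrated outside LabVIEW. The following Python sketch is only an analogy and says nothing about LabVIEW internals; `add_new_buffer` and `add_in_place` are hypothetical helper names. The first allocates a third array for the result, while the second writes the result into the space of one input, so only two arrays exist.

```python
from array import array


def add_new_buffer(a, b):
    """Return a + b in a freshly allocated array (three buffers in total)."""
    return array("d", (x + y for x, y in zip(a, b)))


def add_in_place(a, b):
    """Overwrite a with a + b (two buffers in total) and return a."""
    for i, y in enumerate(b):
        a[i] += y
    return a


a = array("d", [1.0, 2.0, 3.0])
b = array("d", [10.0, 20.0, 30.0])

c = add_new_buffer(a, b)   # a and b are untouched; c is a new buffer
d = add_in_place(a, b)     # d is the same object as a; no new buffer
```

In both cases the same number of additions is performed; only the number of allocated buffers differs.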

Message 7 of 20

@muh1 wrote:

This example is quite adequate to demonstrate the point: memory allocation and execution time. Is there anything wrong with it in this respect?


Yes, there are several things. One is that the compiler is smarter than you think. Besides constant folding, it will also cache results and do other things that make this an invalid test, on top of UI indicators being updated during execution. Generally, randomized data and dedicated timing code are needed, along with other settings, to get an accurate gauge of performance.
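
The general shape of such a measurement can be sketched in Python. This is only an illustration of the pattern (random inputs, a timed region that contains nothing but the operation, several repeats, and a result that is actually used), not LabVIEW code; `time_add` is a hypothetical helper name, and the array length and repeat count are arbitrary choices.

```python
import random
import time


def time_add(n, repeats=5):
    """Time element-wise addition of two random lists of length n.

    Returns (best_seconds, checksum). Data generation and any output
    happen outside the timed region.
    """
    a = [random.random() for _ in range(n)]
    b = [random.random() for _ in range(n)]
    best = float("inf")
    checksum = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        result = [x + y for x, y in zip(a, b)]
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        checksum += result[-1]  # use the result so the work is observable
    return best, checksum


best, checksum = time_add(100_000)
print(f"best of 5 runs: {best * 1e3:.3f} ms")
```

Reporting the best of several runs reduces the influence of one-off disturbances such as other processes or first-run allocation.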

Message 8 of 20

You see a test circuit, maximally simplified.

 

Tick count before the execution of the loop, tick count after. Arranged in a flat sequence.

 

I've moved the terminal out of the loop in the latest iteration. Not much change.

 

Let me ask you a simple question: do you have an answer to my original one or not? Let me reiterate, it is simple enough: does a memory resize with change 0 slow down execution or not? Yes/no. Secondary questions: why is there a buffer reallocation at the input of an Add of two arrays of the same size? Is 4 msec a reasonable time for adding two arrays of 1e6 complex dbl numbers? I prefer not to pointlessly discuss the diagram, which I introduced just to illustrate the questions. The questions are self-contained.
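
As a rough plausibility check on the 4 msec figure, the amount of memory the addition has to touch can be estimated. A complex double occupies 16 bytes, so each 1e6-element array is 16 MB, and an add reads two arrays and writes one. The sketch below turns that into a time estimate; the 10 GB/s sustained bandwidth is purely an assumption for illustration (it is not a measured value for the i5 in question), and `estimated_add_time` is a hypothetical helper name.

```python
BYTES_PER_COMPLEX_DOUBLE = 16          # two 8-byte doubles (real, imaginary)
ELEMENTS = 1_000_000                   # array length from the question
ASSUMED_BANDWIDTH_BYTES_PER_S = 10e9   # ASSUMPTION: ~10 GB/s; measure your own machine


def estimated_add_time(elements, bandwidth):
    """Estimate the time of an element-wise add limited only by memory traffic."""
    array_bytes = elements * BYTES_PER_COMPLEX_DOUBLE
    traffic = 3 * array_bytes  # read both inputs, write one result
    return traffic / bandwidth


est = estimated_add_time(ELEMENTS, ASSUMED_BANDWIDTH_BYTES_PER_S)
print(f"{est * 1e3:.1f} ms")  # prints "4.8 ms" under the assumed bandwidth
```

Under that assumption, a few milliseconds per addition is about what memory traffic alone would cost, whether or not a resize is reported. A different real bandwidth would shift the estimate proportionally.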

Message 9 of 20

I've posed questions which are not directly limited or related to this illustration. If the answer to those particular questions is "Depends", I can elaborate more on the illustration. If not, I see no point in doing so.

 

Regarding optimisation: my problem is execution being too slow (or so I think; maybe it is normal). Can optimisation slow down the execution? If not, why mention it? I am not trying to precisely measure the performance. I am trying to see whether it is way too slow for some reason. The indicator in the loop is a possibility, of course, but no, it is not the cause. For example, the profiler reports the VI "CM Add" from the NI library taking 6 msec to add two 1000x1000 matrices. That is how I noticed the problem and then wrote this maximally simple example.

Message 10 of 20