Looking for faster strategies - 1D smoothing of 2D array

dpak · ‎08-26-2020

Hi All,

Wondering if anyone has any strategies they would recommend for speeding up a 1D log smooth of each row in a 2D array - each row needs to be smoothed with a moving point average of the log(n) of the array.

The algorithm itself is not mine - I am simply trying to implement it in LabVIEW. I may well end up modifying it - but I'd like to get the original working first.

I've attached a vi containing some example data. The size of the 2D array will be variable. The example in this file (~50k x 41) is slightly smaller than the average - simply because of the size of the data. Normally, the arrays are of the order of ~100k x 60 and the largest arrays are quite big - 200k x 200.

This particular subvi is part of a larger algorithm that iteratively processes the 2D array towards an optimum. So, this smoothing subvi may be called up to 100 to 200 times for a single result.

It's mainly the section in the sequence structure that I am trying to optimize here. (The sequence is only there to allow me to roughly time the offending loop at this development stage.)

Maybe there is even a loopless alternative?

But, if anyone has any ideas for speeding this up, then that would be great.

Thank in advance,

David

altenbach · ‎08-26-2020

Do you have a link describing the algorithm?

LabVIEW Champion.

dpak · ‎08-26-2020

Hi,

This is how this stage has been described to me: "The smoothing operation is a logarithmic mean filter of width 𝑗."

For every row (f) in the 2D array F:

Where s is the smoothed array and a, j and i are integers relating to the index in the array.

The original algorithm coerced zero values in f to 10^(-12) : to avoid errors from calculating ln(0). I have replaced this with the increment and decrement functions. I did try both versions.

The current version works correctly and the overall algorithm, including this subvi, successfully optimizes test arrays - so I do not think that I have translated the algorithm incorrectly.

I have added the rescaling function so that the integrals of s and f are the same.

Essentially, I think the maths is fine. Just a bit slow. I'm wondering if there is a better implementation, loaded with cunning and arcane knowledge?

FireFist-Redhawk · ‎08-26-2020

Save your VI in 2017 and upload.

Redhawk
Test Engineer at Moog Inc.

Saying "Thanks that fixed it" or "Thanks that answers my question" and not giving a Kudo or Marked Solution, is like telling your waiter they did a great job and not leaving a tip. Please, tip your waiters.

altenbach · ‎08-26-2020

Yes, I noticed the flanking -1, +1 which are not in the formula.

Here's a quick attempt (~2-4x faster on my Ryzen, depending on half-width. It also avoids your weird wrapping, which seems undesirable. I am sure there is a lot of slack left...)

Results are quite similar, but without the wrapping around

(Yes, you never want to parallelize a loop inside a parallelized loop!)

LabVIEW Champion.

altenbach · ‎08-26-2020

Also note that your first loop would be a couple of times faster if you put the NANcheck inside the inner loop and don't allocate that huge boolean array.

LabVIEW Champion.

altenbach · ‎08-26-2020

@altenbach wrote:

Here's a quick attempt (~2-4x faster on my Ryzen, depending on half-width. It also avoids your weird wrapping, which seems undesirable. I am sure there is a lot of slack left...)

Curious is the effect of parallelization. WIthout parallelization, mine about doubles in time (~99ms, i.e. ends up competitive to your parallelized version :D), but yours gets 4x slower without parallelization (~400ms!) ....

(This is on a 8 core AMD Ryzen 1700x , other processor families will probably differ)

LabVIEW Champion.

altenbach · ‎08-26-2020

Another curious observation is the slowness of taking the "+1& ln(x)" on the 2D array. If we include that in the timing of your code, it slows down 50% (95ms -> 150ms! in your code as shown). OTOH, if we do it on each 1D array inside the parallel FOR loop, it only costs about 2ms total (42ms->44ms) in my code as shown. One reason is that inside the loop it will be parallelized, but there is probably more to that.

LabVIEW Champion.

dpak · ‎08-26-2020

Thanks Altenbach - that is awesome.

That is a lot faster. Tomorrow morning (I'm about to go to bed as I'll be up at toddler o'clock!) I'll upload a version with all of these implemented. (Done them already - but need to tidy up) in an earlier version of LV.

Kind regards,

David

GerdW · ‎08-27-2020

Hi Christian,

there's a "ln(x+1)" primitive - did you benchmark that one too?

Best regards,
GerdW

using LV2016/2019/2021 on Win10/11+cRIO, TestStand2016/2019

LabVIEW

Looking for faster strategies - 1D smoothing of 2D array

Looking for faster strategies - 1D smoothing of 2D array

Re: Looking for faster strategies - 1D smoothing of 2D array

Re: Looking for faster strategies - 1D smoothing of 2D array

Re: Looking for faster strategies - 1D smoothing of 2D array

Re: Looking for faster strategies - 1D smoothing of 2D array

Re: Looking for faster strategies - 1D smoothing of 2D array

Re: Looking for faster strategies - 1D smoothing of 2D array

Re: Looking for faster strategies - 1D smoothing of 2D array

Re: Looking for faster strategies - 1D smoothing of 2D array

Re: Looking for faster strategies - 1D smoothing of 2D array