LabVIEW

cancel
Showing results for 
Search instead for 
Did you mean: 

Looking for faster strategies - 1D smoothing of 2D array

Solved!
Go to solution

Hi All,

 

Wondering if anyone has any strategies they would recommend for speeding up a 1D log smooth of each row in a 2D array - each row needs to be smoothed with a moving point average of the log(n) of the array.

 

The algorithm itself is not mine - I am simply trying to implement it in LabVIEW.  I may well end up modifying it - but I'd like to get the original working first.

 

I've attached a vi containing some example data.  The size of the 2D array will be variable.  The example in this file (~50k x 41) is slightly smaller than the average - simply because of the size of the data.  Normally, the arrays are of the order of ~100k x 60 and the largest arrays are quite big - 200k x 200.

 

This particular subvi is part of a larger algorithm that iteratively processes the 2D array towards an optimum.  So, this smoothing subvi may be called up to 100 to 200 times for a single result.

 

It's mainly the section in the sequence structure that I am trying to optimize here.  (The sequence is only there to allow me to roughly time the offending loop at this development stage.)

 

Maybe there is even a loopless alternative?

 

But, if anyone has any ideas for speeding this up, then that would be great.

 

Thank in advance,

 

David

0 Kudos
Message 1 of 19
(1,878 Views)

Do you have a link describing the algorithm?

0 Kudos
Message 2 of 19
(1,840 Views)

Hi,

 

This is how this stage has been described to me: "The smoothing operation is a logarithmic mean filter of width 𝑗."

 

For every row (f) in the 2D array F:

 

log mean eqn per row.png

Where s is the smoothed array and a, j and i are integers relating to the index in the array.

 

The original algorithm coerced zero values in f to 10^(-12) : to avoid errors from calculating ln(0).  I have replaced this with the increment and decrement functions.  I did try both versions.

 

The current version works correctly and the overall algorithm, including this subvi, successfully optimizes test arrays - so I do not think that I have translated the algorithm incorrectly. 

 

I have added the rescaling function so that the integrals of s and f are the same.

 

Essentially, I think the maths is fine.  Just a bit slow.  I'm wondering if there is a better implementation, loaded with cunning and arcane knowledge?

0 Kudos
Message 3 of 19
(1,833 Views)

Save your VI in 2017 and upload.

Redhawk
Test Engineer at Moog Inc.

Saying "Thanks that fixed it" or "Thanks that answers my question" and not giving a Kudo or Marked Solution, is like telling your waiter they did a great job and not leaving a tip. Please, tip your waiters.

0 Kudos
Message 4 of 19
(1,826 Views)
Solution
Accepted by topic author dpak

Yes, I noticed the flanking -1, +1 which are not in the formula.

 

Here's a quick attempt (~2-4x faster on my Ryzen, depending on half-width. It also avoids your weird wrapping, which seems undesirable. I am sure there is a lot of slack left...)

 

altenbach_0-1598471685128.png

 

 

Results are quite similar, but without the wrapping around

 

altenbach_0-1598471559927.png

 

 

 

(Yes, you never want to parallelize a loop inside a parallelized loop!)

 

 

 

 

 

 

Message 5 of 19
(1,819 Views)

Also note that your first loop would be a couple of times faster if you put the NANcheck inside the inner loop and don't allocate that huge boolean array.

 

altenbach_0-1598472574355.png

 

Message 6 of 19
(1,811 Views)

@altenbach wrote:

Here's a quick attempt (~2-4x faster on my Ryzen, depending on half-width. It also avoids your weird wrapping, which seems undesirable. I am sure there is a lot of slack left...)

 

altenbach_0-1598471685128.png

 


Curious is the effect of parallelization. WIthout parallelization, mine about doubles in time (~99ms, i.e. ends up competitive to your parallelized version :D), but yours gets 4x slower without parallelization (~400ms!) ....

 

(This is on a 8 core AMD Ryzen 1700x , other processor families will probably differ)

0 Kudos
Message 7 of 19
(1,799 Views)

Another curious observation is the slowness of taking the "+1& ln(x)" on the 2D array. If we include that in the timing of your code, it slows down 50% (95ms -> 150ms! in your code as shown). OTOH, if we do it on each 1D array inside the parallel FOR loop, it only costs about 2ms total (42ms->44ms) in my code as shown. One reason is that inside the loop it will be parallelized, but there is probably more to that.

 

altenbach_0-1598475442573.png

 

Message 8 of 19
(1,790 Views)

Thanks Altenbach - that is awesome.

 

That is a lot faster.  Tomorrow morning (I'm about to go to bed as I'll be up at toddler o'clock!) I'll upload a version with all of these implemented. (Done them already - but need to tidy up) in an earlier version of LV.

 

Kind regards,

 

David

0 Kudos
Message 9 of 19
(1,779 Views)

Hi Christian,

 

there's a "ln(x+1)" primitive - did you benchmark that one too?

Best regards,
GerdW


using LV2016/2019/2021 on Win10/11+cRIO, TestStand2016/2019
0 Kudos
Message 10 of 19
(1,754 Views)