Well, this thing is getting interesting!
If you compare Mike Porter's solution, you'll notice that it is at least 10x
faster than the other versions discussed here. Apparently, growing an array
(to an initially undetermined size) at the loop boundary is MUCH cheaper
than doing the same with the "Build Array" function, as in B&C Duffey's version. I
was not fully aware of that. I wonder if something could be more
optimized...
Of course I am always up to the challenge and I made a modification that
beats Mike's version by another 10% (on a 1-million-point random array, on a
classic Pentium), YMMV. The VI is similar to Alexander's idea, but instead
of growing the array, you initialize the blue array to the left of the shift
register with a worst-case (= size of input) array, use "Replace Array
Element" instead of "Build Array" inside the case, and, after the loop, you
just strip it back to the final size with "Array Subset".
This pre-allocates the array in one step and throws away the unused tail
later.
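
For readers without the picture, here is a rough text-language sketch of the two patterns, written in Python since a LabVIEW diagram cannot be pasted as text. It is not the actual VI: the function names, the keep() test and the choice to collect indices are all assumptions for illustration. The counter plays the role of the second value carried in the shift register.

```python
def collect_by_growing(data, keep):
    """Grow the result one element at a time (the "Build Array" pattern)."""
    out = []
    for i, x in enumerate(data):
        if keep(x):              # keep() is a hypothetical test, not from the VI
            out = out + [i]      # builds a new, longer array on every hit
    return out


def collect_preallocated(data, keep):
    """Allocate the worst case once, overwrite in place, trim at the end."""
    out = [0] * len(data)        # worst case: every element qualifies
    count = 0                    # number of slots filled so far
    for i, x in enumerate(data):
        if keep(x):
            out[count] = i       # "Replace Array Element"
            count += 1
    return out[:count]           # "Array Subset": drop the unused tail


if __name__ == "__main__":
    sample = [3, -1, 4, -1, 5]
    print(collect_preallocated(sample, lambda x: x > 0))
```

Both functions return the same result; only the allocation pattern differs. The sketch shows the structure only, and timings in Python will not mirror the LabVIEW numbers quoted above.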
The speed difference is truly amazing! Could anyone comment on this?
Nobody has yet complained about attachments, so I added a picture of the
"business part". The other case just has the two blues wired through. (It's
not fully debugged or tested for "edge effects", and may need another (+1) or
(-1) in some places.)
So... who can build an even faster version? (No CIN allowed!)
Cheers
Christian
[Attachment new-2.gif, see below]