LabVIEW

Parallelism with Shift Register

Hi Team,

When a For Loop is configured with parallelism and a shift register is used, LabVIEW does not allow the VI to run and the run arrow appears broken. However, when an increment function is added inside the same loop, the broken arrow disappears and LabVIEW allows the VI to execute.

This behavior seems to occur only with the increment function, while other functions still result in a broken arrow.

Could you please help clarify how this is possible? I have attached the VI snippet for your reference.

Parallelism.png

 

Alwin_Capsys_0-1770966979927.png

 

Message 1 of 7

Hi Alwin,

 

NI has made some improvements to FOR loop parallelism since it was introduced…

 

See the LabVIEW help for some possible exclusions of the parallelism feature!

Best regards,
GerdW


using LV2016/2019/2021 on Win10/11+cRIO, TestStand2016/2019
Message 2 of 7

As Gerd said, the LabVIEW compiler can, under some conditions, determine that the order of execution has no influence on the result. In your case, the final value is independent of the order of the iterations. There are other "patterns" that are recognized, but just replacing the +1 with a -1 breaks it.
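Since a block diagram cannot be pasted as text, here is a rough Python analogy of why the +1 pattern is safe. It is only a sketch of the idea and makes no claim about how the LabVIEW compiler actually implements it. The function names `serial_count` and `split_count` are made up for illustration. Each "worker" keeps a private counter, and the partial counts are added up at the end, so the result does not depend on how the iterations are distributed.

```python
def serial_count(n, start=0):
    """Ordinary loop: the accumulator plays the role of the shift register."""
    acc = start
    for _ in range(n):
        acc += 1
    return acc


def split_count(n, workers, start=0):
    """Hypothetical parallel version: every worker counts its own share
    of the iterations, then the partial counts are combined."""
    base, extra = divmod(n, workers)
    partials = []
    for w in range(workers):
        iterations = base + (1 if w < extra else 0)  # this worker's share
        local = 0                                    # private counter
        for _ in range(iterations):
            local += 1
        partials.append(local)
    # Addition is associative and commutative, so the order in which the
    # partial counts are combined does not matter.
    return start + sum(partials)


print(serial_count(10), split_count(10, workers=4))
```

Mathematically the same argument holds for -1, which is why the broken arrow in that case is a limitation of the pattern recognition and not of the math.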

 

There are definitely other holes. For example, this (dumb!) code should be parallelizable but breaks the VI, even though we can tell the invariant result just by looking at it. 😄

 

altenbach_0-1770969677710.png

 

 

 

Of course, parallelizing such simple loops will actually slow you down. I assume that your real code is a bit more complex, because your current code could be replaced by a single scalar multiplication (or even by just taking the value wired to N). It is even possible that the various compiler optimization steps already do that, eliminating the loop entirely.

 

Message 3 of 7

Hi,

This section does not clearly specify which functions or palettes are compatible with parallel execution and which are not.

Message 4 of 7

Hi,

I didn’t fully understand your explanation. Could you please elaborate on the logic behind this?

If possible, could you explain it with an example using the same code mentioned earlier? Specifically, I would like to understand why the increment operation works correctly with a shift register even when parallelism is enabled, but the decrement operation does not.

Looking forward to your clarification.

Message 5 of 7

Hi Alwin,

 


@Alwin_Capsys wrote:

Specifically, I would like to understand why the increment operation works correctly with a shift register even when parallelism is enabled, but the decrement operation does not.


Because the compiler recognizes this special pattern (increment) and allows parallelism, but it does not recognize the corresponding pattern with the decrement…

 


@Alwin_Capsys wrote:

Could you please elaborate on the logic behind this?

NI implemented some special patterns, but is cautious ("on the safe side") about anything else. (The "increment" use case is probably needed more often than "decrement"…)

Generic rule: code inside a parallelized FOR loop should not depend on data in shift registers (i.e., it should not depend on the results of previous iterations)!
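To make the generic rule concrete, here is a small Python analogy (not LabVIEW code, and the function names are invented for this sketch). `run_independent` computes each result from the iteration index alone, so any execution order gives the same answer. `run_carried` feeds the previous result into the next iteration, like a shift register that is actually read inside the loop, so the answer changes with the order.

```python
def run_independent(order):
    """Each iteration uses only its own index i: safe in any order."""
    out = {}
    for i in order:
        out[i] = i * i
    return [out[i] for i in sorted(out)]


def run_carried(order, start=0):
    """Each iteration needs the previous result (shift-register style)."""
    acc = start
    for i in order:
        acc = 2 * acc + i  # depends on what the "previous" iteration left
    return acc


forward = [0, 1, 2, 3]
backward = [3, 2, 1, 0]

print(run_independent(forward) == run_independent(backward))  # True
print(run_carried(forward), run_carried(backward))            # 11 34
```

A parallel loop gives no guarantee about which iteration runs "before" another, so the second kind of loop has no well-defined result.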

Best regards,
GerdW


Message 6 of 7

@Alwin_Capsys wrote:

Specifically, I would like to understand why the increment operation works correctly with a shift register even when parallelism is enabled, but the decrement operation does not.


The LabVIEW compiler is a highly advanced work of art and engineering (start reading here), and many of its decisions cannot be captured in a simple sheet of easy-to-understand rules. Note that while LabVIEW has direct control of the DFIR optimizations, the LLVM steps can do more magic later. (The degree of optimization depends on the code complexity, to keep compile times reasonable. For example, while subVI inlining eliminates call overhead, the added code complexity of the caller could push the code over a threshold, resulting in fewer optimizations.)

Even with 30+ years of LabVIEW experience, I typically code up to 5 (or more) alternative versions of performance-critical innermost code and benchmark them extensively as a function of data size to explore time and memory complexity. (Even benchmarking is a tricky endeavor, and many people make mistakes and get false results!) One typical fallacy is measuring a faster speed with the parallel loop when the difference is caused only by the fact that the parallel loop does not allow debugging; once you disable debugging globally, the parallel FOR loop might actually be slower. There is a cost associated with splitting the task and reassembling the results. The parallel FOR loop wins only if this overhead is small compared to the work done by the loop code (example).
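As a text-language illustration of that benchmarking habit, here is a minimal Python sketch that times a trivial loop both serially and split into chunks. Everything in it (`tiny_work`, `chunked`, `best_of`, the chunk sizes) is an assumption made for the example, and Python threads behave very differently from LabVIEW's parallel loop instances, so the absolute numbers say nothing about LabVIEW. The point is only the method: keep the two versions equivalent, repeat the measurement, and vary the data size.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def tiny_work(i):
    """Deliberately trivial loop body."""
    return i * 2


def serial(n):
    return [tiny_work(i) for i in range(n)]


def chunked(n, workers=4):
    """Split the index range into chunks, process them in a thread pool,
    and reassemble the partial results in order."""
    size = max(1, (n + workers - 1) // workers)
    ranges = [range(lo, min(lo + size, n)) for lo in range(0, n, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda r: [tiny_work(i) for i in r], ranges)
        return [x for part in parts for x in part]


def best_of(fn, *args, repeats=5):
    """Return the fastest of several runs to reduce timing noise."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best


for n in (1_000, 100_000):
    assert chunked(n) == serial(n)  # compare only equivalent code
    print(n, f"serial {best_of(serial, n):.6f} s",
          f"chunked {best_of(chunked, n):.6f} s")
```

Whether the chunked version wins depends entirely on how much work each iteration does relative to the cost of splitting and reassembling, which is exactly why it has to be measured.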

 

I have seen situations where inserting a simple "always copy" in the right place sped code up several times. Often, picking the correct algorithm is more important than blindly over-parallelizing. As an example, my Tikhonov regularization switches between two very different algorithms on the fly depending on the shape of the matrix.

 

The parallel FOR loop is a relatively new addition to the LabVIEW repertoire (LabVIEW 2009) and is unique in many ways. You must understand that the iterations no longer run in order of [i]; they can run in any order, and several can run at the same time. A shift register has well-defined data from the "previous" iteration [i-1], but the term "previous" no longer has any meaning once you parallelize.

 

You will see that as soon as you tap into your shift register data, your parallel loop will most likely break again.

Have a look at the following seemingly equivalent constructs. 

 

altenbach_1-1771002575811.png

Since each replacement has a predictable index (left), the order of replacement does not matter. The orange shift register is allowed because the compiler recognizes that pattern. This is no longer true on the right, and you would get a different result every time if the compiler allowed you to do that.
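The image is not reproduced here, so the following Python analogy is only a guess at the kind of difference it shows, with invented function names. In the first function the target slot comes from the iteration index, so every iteration writes to its own predictable place. In the second, the slot comes from a counter carried between iterations, so the array content depends on which iteration happens to run first.

```python
def replace_by_iteration_index(n, order):
    """Slot is determined by i: each iteration owns exactly one element."""
    out = [None] * n
    for i in order:
        out[i] = i * 10
    return out


def replace_by_carried_index(n, order):
    """Slot is determined by a carried counter (assumed variant):
    where a value lands depends on execution order."""
    out = [None] * n
    k = 0  # carried from one iteration to the next
    for i in order:
        out[k] = i * 10
        k += 1
    return out


in_order = [0, 1, 2, 3]
scrambled = [2, 0, 3, 1]  # one possible parallel execution order

print(replace_by_iteration_index(4, in_order))   # [0, 10, 20, 30]
print(replace_by_iteration_index(4, scrambled))  # [0, 10, 20, 30]
print(replace_by_carried_index(4, in_order))     # [0, 10, 20, 30]
print(replace_by_carried_index(4, scrambled))    # [20, 0, 30, 10]
```

Run in order, both look equivalent, which is why such constructs seem interchangeable until the iteration order is no longer guaranteed.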

Message 7 of 7