LVOOP vs. Reentrancy vs. Parallel Execution


Hi there,

 

I'm a bit confused. I'm trying to implement an object-based "test sequencer" (no TestStand until now) on a quad-core CPU. My basic idea was to define a test step as a class, instantiate up to four objects, and call Execute VIs operating on these objects in parallel:

 

Parallel Call.jpg

 

This is what the Execute VI does:

Execute.jpg

 

What I see playing with reentrancy settings of the Execute method is:

Execute NOT reentrant --> ~53900ms run time @ ~27% cpu load (--> only one core)

Execute reentrant (shared clones, since it has dynamic dispatch terminals) --> ~38400ms run time @ 100% cpu load (all four cores)

 

My interpretation is that when switching from a single core to four cores, run time only drops by about 30% (a speedup of roughly 1.4x). Is this possible, or am I missing something fundamentally important?
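For reference, the two run times above can be turned into a speedup figure. The snippet below is only a back-of-the-envelope sketch in Python. The Amdahl's law step assumes the non-reentrant run is effectively serial and that both runs do identical work, which the measurements here don't confirm.

```python
# Run times reported in the post above (milliseconds).
t_serial_ms = 53900.0    # Execute NOT reentrant, roughly one core busy
t_parallel_ms = 38400.0  # Execute reentrant, all four cores busy
cores = 4

speedup = t_serial_ms / t_parallel_ms          # how many times faster
time_saved = 1.0 - t_parallel_ms / t_serial_ms # fraction of run time removed
efficiency = speedup / cores                   # 1.0 would be perfect scaling

# Amdahl's law: speedup = 1 / ((1 - p) + p / n), solved for p.
# Assumption: p is the fraction of the work that actually ran in parallel.
parallel_fraction = (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)

print(f"speedup:           {speedup:.2f}x")
print(f"run time saved:    {time_saved:.1%}")
print(f"efficiency:        {efficiency:.1%}")
print(f"parallel fraction: {parallel_fraction:.2f}")
```

Under those assumptions the numbers come out at a speedup of about 1.40x, about 35% parallel efficiency. That is far below what four independent CPU-bound tasks would normally achieve, so something is likely serializing the calls or adding overhead.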

 

Thanks for your comments!

 

Oli 

 

PS: running on LV2011

 

Message 1 of 18

I might be wrong, but is it because you are sharing clones? I have a feeling that LabVIEW will still only allow one of the VIs to execute at a time, but you get a little bit of advantage because it is always already in memory (I might be wrong here). However, I guess that doesn't explain why it is using all 4 cores, unless it allows the cores to wait for the VI to become free in this mode(?).

I guess you could check this out if you select the non-dynamic nodes option, although presumably you wanted this for some other reason?

Message 2 of 18

Thanks for the hint!

For this simple case, I set the terminals to static and went for the Pre-Allocate option. Performance gain was ~3% compared to the dynamic version (which is what I'd like to use). So I must be doing something else wrong ;-)

Message 3 of 18

Do you have an event structure where all 4 elements are going against the same event?

Message 4 of 18

OK, sorry for the bad suggestion. Another thought: perhaps the 'in place' node doesn't like to be run in multiple places (I haven't used it before myself). Maybe try replacing it with the read out node and read in clusters; that might work(?).

I say this because, although I haven't used the in-place structure, I have seen this behaviour before when I had a subVI which wasn't set for reentrancy.

Message 5 of 18

It is possible that the square root operator itself is not reentrant and that is your bottleneck. I would also recommend that you put a Wait 0 in your loop. This allows the scheduler to see if other stuff is ready to run. As written, once execution is inside your loop, the scheduler cannot do anything else.



Mark Yedinak
Certified LabVIEW Architect
LabVIEW Champion

"Does anyone know where the love of God goes when the waves turn the minutes to hours?"
Wreck of the Edmund Fitzgerald - Gordon Lightfoot
Message 6 of 18

@wideofthemark wrote:

I might be wrong, but is it because you are sharing clones? I have a feeling that LabVIEW will still only allow one of the VIs to execute at a time, but you get a little bit of advantage because it is always already in memory (I might be wrong here). However, I guess that doesn't explain why it is using all 4 cores, unless it allows the cores to wait for the VI to become free in this mode(?).


Unfortunately your choice of screen name is accurate here 😉

You've misunderstood reentrancy and shared clones. Reentrant VIs can always run at the same time. The difference is whether LabVIEW pre-allocates copies, or creates new copies on demand, of the reentrant subVI. If you have a reentrant VI that's called from 20 different places in your code, but you know that at most only three of them will ever need to execute at the same time, then shared clones will be more memory efficient. The first call to the subVI will allocate one copy. The next time that subVI is called, if there is not already an idle copy in memory, a new clone will be created and run. If you pre-allocate clones, 20 copies will be created when the top-level VI starts, but most of the time they'll be sitting in memory unused. In this case, pre-allocate clones makes more sense because you know you want 4 copies, but even with shared clones they should run in parallel.
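To make the memory argument concrete, here is a toy Python model of the two allocation strategies. It is only an analogy: `ClonePool`, `simulate`, and the batch-style scheduling are made up for illustration and say nothing about how LabVIEW implements clones internally.

```python
class ClonePool:
    """Toy model of the clones of one reentrant VI (hypothetical, not LabVIEW's API)."""

    def __init__(self, preallocate=0):
        # Pre-allocation creates every clone up front; shared clones start empty.
        self.idle = [object() for _ in range(preallocate)]
        self.allocated = preallocate

    def acquire(self):
        # Reuse an idle clone if one exists, otherwise create a new one on demand.
        if self.idle:
            return self.idle.pop()
        self.allocated += 1
        return object()

    def release(self, clone):
        # A finished call hands its clone back so a later call can reuse it.
        self.idle.append(clone)


def simulate(pool, call_sites, max_concurrent):
    """Run `call_sites` calls with at most `max_concurrent` active at once."""
    for start in range(0, call_sites, max_concurrent):
        batch_size = min(max_concurrent, call_sites - start)
        running = [pool.acquire() for _ in range(batch_size)]
        for clone in running:
            pool.release(clone)
    return pool.allocated


# 20 call sites, never more than 3 running at the same time.
shared_count = simulate(ClonePool(), call_sites=20, max_concurrent=3)
prealloc_count = simulate(ClonePool(preallocate=20), call_sites=20, max_concurrent=3)

print("shared clones allocated:       ", shared_count)
print("pre-allocated clones allocated:", prealloc_count)
```

In this model the shared pool never grows beyond the peak concurrency (3 clones), while the pre-allocated variant holds all 20 from the start. Either way, calls in the same batch run on separate clones, which is the point about both modes allowing parallel execution.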

 

There is one situation where you might want to pre-allocate clones even if you don't expect multiple copies to run in parallel: if you use an uninitialized shift register inside the reentrant VI, pre-allocated clones will always use the same shift register value in any given instance, whereas with a shared clone you don't know which instance you'll get so you won't know what value is in the shift register.
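The shift register point can be sketched the same way. This is again a hypothetical Python analogy, with `Clone` standing in for one copy of the reentrant VI and `shift_register` for the state an uninitialized shift register keeps between calls. The hand-out order for the shared case is an arbitrary assumption, since in practice you don't control it.

```python
class Clone:
    """Toy stand-in for one clone of a reentrant VI (not a LabVIEW API)."""

    def __init__(self):
        # Plays the role of an uninitialized shift register: it keeps its
        # value from one call of this clone to the next.
        self.shift_register = 0

    def run(self):
        self.shift_register += 1
        return self.shift_register


# Pre-allocated: each call site owns exactly one clone, so its state is predictable.
site_a, site_b = Clone(), Clone()
prealloc_a = [site_a.run() for _ in range(3)]  # call site A, three calls
prealloc_b = [site_b.run()]                    # call site B, one call

# Shared: call site A gets whichever clone happens to be idle.
pool = [Clone(), Clone()]
handout_order = [0, 1, 0]  # assumed order, chosen only for illustration
shared_a = [pool[i].run() for i in handout_order]

print("pre-allocated, site A:", prealloc_a)
print("pre-allocated, site B:", prealloc_b)
print("shared, site A:       ", shared_a)
```

With pre-allocated clones, site A sees its own counter go 1, 2, 3. With shared clones the values depend on which clone was handed out, so the same three calls here return 1, 1, 2.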

 

You might experiment with the threadconfig utility or the INI file tokens that allow you to adjust the number of threads that are allocated (the threadconfig utility might just be a nicer interface to the INI tokens, I haven't checked).  Also, for benchmarking purposes you may want to disable debugging, and you should probably run the subVIs in some thread other than the user interface thread.

Message 7 of 18

Fair enough. That makes it clearer.

I have to say that I mostly use either reentrant + preallocate or not reentrant at all. Mostly when I've needed something to run fast in a couple of places it's a small bit of code, so I haven't worried much about memory. But worth bearing in mind for the future.

 

JP

Message 8 of 18

Q:

 

Is there any difference if you use 4 separate class constants instead of a single constant feeding all?

 

Just curious,

 

Ben

Retired Senior Automation Systems Architect with Data Science Automation, LabVIEW Champion, Knight of NI, and Prepper
Message 9 of 18

The object constant can be thought of as the class constructor; it's the bit which requests memory. When the wire is split into the 4 class VIs, 4 deep copy operations are performed which make separate instances. I suppose the only notable difference might be the compiler knowing to claim 4 copies now, or 1 copy now and 3 more later on.
_____________________________
- Cheers, Ed
Message 10 of 18