Replace array subset in parallel for loop without increasing memory

I'm still curious about this topic. Did anyone else in this thread try the change I mentioned back in msg #42? I haven't studied it rigorously, but it seemed to run a little faster when I auto-concatenated the ordered array instead of doing in-place array element swaps.

 

If it *is* faster, it likely comes at the cost of a bigger memory footprint.

 

I'm also still in the "good intentions" stage of checking out and modifying the genetic algorithm example I linked in msg #35.

 

 

-Kevin P

Message 51 of 53

@Kevin_Price wrote:

If it *is* faster, it likely comes at the cost of a bigger memory footprint.

 


You'll probably gain a (very) small memory advantage by doing the summing in a shift register instead of creating the array just to take the sum later, even without the conditional terminal. (OTOH, maybe "Add Array Elements" can use SIMD instructions?) I probably won't have time to dig deeper into this, but back then I tried a few other things and none were faster. Yes, I was pleasantly surprised by the speed of the concatenating tunnel, considering that the segments have different lengths and the parallel FOR loop creates the sections out of order. Since the segments are similar in length, I even made them the same length using a 2D array padded to the longest row, and it did not improve the speed. There is plenty more to investigate, but yes, maybe a better algorithm is the solution. Do we really need the absolute best path, or is it sufficient to be within 10% of the best? (Also, the current algorithm does not even give the "shortest path" anyway! That would take forever :D)
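For readers who don't have the VI open, here is a rough text-language sketch of the concatenating-tunnel idea: segments of different lengths are processed in parallel, may finish out of order, and are still joined in their original order. It is Python, not LabVIEW, and only loosely analogous. `process_segment`, `concatenate_in_order`, and the sample data are made-up names for illustration, not anything from the actual code in this thread.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import chain


def process_segment(segment):
    # Hypothetical per-segment work (an assumption): reverse the segment.
    return segment[::-1]


def concatenate_in_order(segments, workers=4):
    # Executor.map yields results in input order even when the workers
    # finish out of order, so the flattened output is deterministic.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(process_segment, segments)
        # Flattening allocates a new output list: this is the extra
        # memory that the concatenating approach pays for.
        return list(chain.from_iterable(results))


# Segments of different lengths, as in the discussion above.
segments = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
ordered = concatenate_in_order(segments)
```

The output is a freshly allocated array, which matches the suspicion earlier in the thread that any speed gain comes with a bigger memory footprint.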

Message 52 of 53

One thing that allows the instances of a parallel FOR loop to access the same array is a DVR (data value reference). Note that the IPE (DVR) structure protects against concurrent access, but since this is a very small critical section, it does not stall the various parallel instances much.

 

However, even the following modification only gives a marginal improvement (<5%) over my old code. (The * box contains the code as before, not shown for simplicity.)

 

ParallelDVR.png

(Note that you need to be sure that the sections don't overlap, else the result will be unpredictable because the order of replacement cannot be guaranteed.)
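Since the diagram is only available as an image, here is a rough text-language analogue of the same pattern, written in Python rather than LabVIEW. A lock plays the role of the IPE (DVR) structure, the shared list plays the role of the array behind the DVR, and each worker replaces its own non-overlapping section. All names (`replace_subsets_in_place`, `compute`, the sample sections) are assumptions for illustration. This is a sketch of the idea, not a translation of the actual VI.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def replace_subsets_in_place(data, sections, compute, workers=4):
    """Replace non-overlapping [start, stop) sections of `data` in parallel."""
    lock = threading.Lock()  # stands in for the IPE (DVR) protection

    def worker(section):
        start, stop = section
        # Short critical section: copy out only this worker's section.
        with lock:
            chunk = data[start:stop]
        # The expensive work runs outside the lock, on the private copy.
        new_chunk = compute(chunk)
        if len(new_chunk) != stop - start:
            raise ValueError("replacement must keep the section length")
        # Short critical section: write the section back in place.
        with lock:
            data[start:stop] = new_chunk

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so that worker exceptions are raised here.
        list(pool.map(worker, sections))
    return data


# Hypothetical example: sort each section in descending order.
data = list(range(10))
sections = [(0, 3), (3, 7), (7, 10)]  # must not overlap
replace_subsets_in_place(data, sections, lambda c: sorted(c, reverse=True))
```

Because the lock is held only for the copy-out and the write-back, the workers mostly run without waiting on each other. If two sections overlapped, the final contents would depend on which worker wrote last, which is the unpredictability mentioned in the note above.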
Message 53 of 53