01-08-2011 05:24 AM - edited 01-08-2011 05:26 AM
Well, I think I've managed it:
This now does the same test in 17 ms!!! Taking the dog for a walk worked wonders; it came to me in a flash. The relative row only needs to be done on the first iteration, as all other iterations have to be in the same row. On the first iteration I calculate both the row and column relative data; for the remainder I only do the column. As I initialise the array full of 0's, I can ignore the row for all other iterations, as it always equals 0.
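Since the VIs themselves aren't reproduced in this post, here is a rough text-language sketch of the idea (Python; all names are hypothetical, and the exact "offset relative to the previous 1" convention is my guess from the description):

```python
# Hypothetical sketch: list each 1 in a 2D array as a (row, column)
# offset relative to the previous 1 found, scanning row-major.
# The output is preallocated and zero-filled, so the row offset only
# needs to be written for the first hit in each row -- for later hits
# in the same row it is 0, which the initialisation already provides.
def relative_coords(matrix):
    n_hits = sum(v for row in matrix for v in row)
    out = [[0, 0] for _ in range(n_hits)]    # preallocated, all zeros
    k = 0                                    # next free output slot
    prev_r = prev_c = 0
    for r, row in enumerate(matrix):
        first_in_row = True                  # the "case structure"
        for c, v in enumerate(row):
            if v:
                if first_in_row:
                    out[k][0] = r - prev_r   # row offset: once per row
                    first_in_row = False
                out[k][1] = c - prev_c       # column offset: every hit
                prev_r, prev_c = r, c
                k += 1
    return out
```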
Please, if you beat this time with your 2009 version, don't tell me. Ignorance will be bliss.
Is this the same adjustment you made, or is there yet another improvement I missed?
Rgs,
Lucither.
01-08-2011 12:12 PM
Good morning. Hey, you are still working on this! 🙂
@Lucither wrote:
On the first iteration I calculate both the row and column relative data; for the remainder I only do the column. As I initialise the array full of 0's, I can ignore the row for all other iterations, as it always equals 0.
Is this the same adjustment you made, or is there yet another improvement I missed?
The idea is good, but the bulk of the speedup is due to something else that you probably did by accident. (obscure clue ;))
Still, the implementation of your idea is not optimal, because you can remove that case structure again and process the row data in the outer loop, right? That should give you another couple of ms.
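In the same hypothetical Python sketch as above, hoisting the row handling into the outer loop looks roughly like this; the provisional write is simply overwritten by a later row if the current row turns out to contain no 1's:

```python
# Same hypothetical sketch, with the row handling moved to the outer
# loop: the row offset for the *next* hit is written provisionally at
# the start of each row and overwritten by a later row if this row has
# no 1's. The inner loop then deals only with column offsets.
def relative_coords(matrix):
    n_hits = sum(v for row in matrix for v in row)
    out = [[0, 0] for _ in range(n_hits)]
    k = 0
    prev_r = prev_c = 0
    for r, row in enumerate(matrix):
        if k < n_hits:                       # skip once all hits stored
            out[k][0] = r - prev_r           # provisional row offset
        hit_in_row = False
        for c, v in enumerate(row):
            if v:
                out[k][1] = c - prev_c       # column offset only
                prev_c = c
                k += 1
                hit_in_row = True
        if hit_in_row:
            prev_r = r
    return out
```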
As I said, the idea "in general" is a good one, because I just used it over a cup of coffee to bring mine down to ~6 ms. 😮 😄
(my times are with debugging disabled; make sure you do that too for another ms or two)
I'll try it in 2009 later....
Did you ever try to benchmark the original two VIs (chained together) with the current 101x dataset? I gave up after 10 minutes or so. I am just curious what kind of speedup we have achieved so far; it might be a record. 😉
01-09-2011 01:32 AM - edited 01-09-2011 01:35 AM
Last time, I think. I made the improvements you pointed out (removing the inner structure and doing the row on the outer loop was minimal time-wise, but better style). After removing debugging I was getting 10 ms. Here is my FINAL version:
Although I never highlighted it in my last post, I was aware that the main reason the code became quicker is that we no longer build an internal 1D (row, column) array element to insert into the output array; we simply insert the individual values.
When the above benchmarks are run separately, the bottom loop is 8x faster. Just the fact that we are building the 1D array to insert slows the process down a lot.
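For anyone following along without the VIs, the structural difference is roughly the following (a Python sketch with made-up names; the absolute timings in Python won't mirror LabVIEW's 8x, but the per-element allocation is the same idea):

```python
import timeit

# 10,000 hypothetical (row, column) pairs to store in the output.
coords = [(i // 100, i % 100) for i in range(10_000)]
out = [[0, 0] for _ in range(len(coords))]   # preallocated output

def with_build():
    # Analogue of Build Array + Replace Array Subset: a fresh
    # two-element array is constructed for every hit before insertion.
    for k, (r, c) in enumerate(coords):
        out[k] = [r, c]                      # new allocation each time

def with_scalars():
    # Analogue of two Replace Array Element calls: the values are
    # written straight into the preallocated output, no new array.
    for k, (r, c) in enumerate(coords):
        out[k][0] = r
        out[k][1] = c

print(timeit.timeit(with_build, number=100))
print(timeit.timeit(with_scalars, number=100))
```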
Just for your interest, I benchmarked the OP's original code as you suggested, one VI feeding the other. I truly left it to run for an hour and it still wasn't finished, so I went down town to look at possible laptops to buy; 10 minutes after I came back it finished. A total of 2 hours, 35 minutes and 30 seconds (9,330,751 ms)!!!

If we use my best of 10 ms in LabVIEW 2009, we get an improvement of ~933,075x. If we use your best of 6 ms, we get ~1.555 MILLION times faster!!! I have always known that using Build Array functions is bad and that care must be taken when handling arrays, but I must admit this is a shock even to me: carefully rewriting a small section of code can be the difference between 6 ms and 2.5 hours!! If you ever wanted to highlight the importance of this to someone, this would be a great example.
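Checking the arithmetic from those numbers:

```python
total_ms = 9_330_751          # measured runtime of the original code
print(total_ms / 3_600_000)   # ~2.59 hours, i.e. 2 h 35 min 30 s
print(total_ms / 10)          # 933_075x speedup at my best of 10 ms
print(total_ms / 6)           # ~1_555_125x speedup at your best of 6 ms
```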
I think the lesson here is that if you need to use the 'Get Date/Time' function to do your benchmarking, you need to rethink how you're doing it!
Anyway, this has been fun (you're right, I am very sad). It has made someone who was already aware of these problems even more so.
Thanks for playing along.
Rgs,
Lucither.
01-10-2011 03:33 AM
@Lucither wrote:
Last time i think, I made the improvements you pointed out (removing the inner structure and doing the row on the outer loop was minimal timewise but better style), After removing debugging i was getting 10ms, here is my FINAL version:
OK, I spent a few more minutes looking at all this.
On the default data, my fastest version is still about 30% faster than yours. 🙂
If I have time, I'll clean my version up and post it tomorrow night.
Your version has a bug:
If the input matrix is sparse and there are rows with no 1's at all, your code generates incorrect results, because your row increment can never be larger than 1 (except for the first element). This is not a problem with the current input data, but it is still a fundamental flaw that must be corrected.
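In sketch form (Python, hypothetical names, since the VI isn't shown here): a scheme that bumps the row offset by 1 whenever the scan enters a new row caps the delta at 1, whereas deriving it from the actual row indices handles empty rows correctly:

```python
# The row offset must come from the difference of the actual matching
# row indices, not from "add 1 when a new row starts", which can never
# produce a value greater than 1.
def relative_rows(matrix):
    deltas, prev_r = [], 0
    for r, row in enumerate(matrix):
        if any(row):                     # row contains at least one 1
            deltas.append(r - prev_r)    # can be 2+ across empty rows
            prev_r = r
    return deltas

# Middle row has no 1's: correct row offsets are [0, 2]; a +1-per-row
# scheme would wrongly yield [0, 1].
print(relative_rows([[1, 0],
                     [0, 0],
                     [1, 0]]))           # -> [0, 2]
```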
01-10-2011 04:05 AM
Your version has a bug:
If the input matrix is sparse and there are rows with no 1's at all, your code generates incorrect results, because your row increment can never be larger than 1 (except for the first element). This is not a problem with the current input data, but it is still a fundamental flaw that must be corrected.
Ah, of course, thanks for pointing that out. I have made the adjustment:
With this mod I'm still getting 10 ms. Is yours 30% faster running in 2009?
I will be very interested to see your code. I am struggling to think how, if you are running it in 2009, you are still squeezing out an extra 30% improvement! Saying that, though, I was surprised when I improved on my 300 ms.
Rgs,
Lucither.