Feel free to ask questions below or start a more general discussion.
Part I: Benchmarking (by Ed Dickens)
Zip file containing the slide presentation as a PDF with notes and the examples (LabVIEW 2015)
Please let me know of any mistakes. Thanks!
Part II: Code optimization (by Christian Altenbach)
A zip file containing the slides (as PDF with notes) and example programs (LabVIEW 2015) has been posted.
The slide notes contain additional detailed background information and links to relevant web resources.
They also contain suggestions to further explore the interesting issues.
Video of Introduction (Darin Kinion) and part I (Ed Dickens) (22 minutes)
Video of part II (Christian Altenbach) (44 minutes)
A question from the audience wondered about the efficiency of the "rotate array" array function.
Certain LabVIEW functions can operate on the array without actually touching the data in memory. This is transparent to the programmer and there is no easy way to tell if it is happening, except for impressive processing speed when benchmarking.
For example "reverse array" can just mark the array to be indexed from the back or "transpose 2D array" could just mark the array to be processed with the indices swapped. Similarly, "rotate array" could just mark as rotated, giving the start index, allowing calculation of the indices of the rotated elements without actually rotating the data, etc.
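As a rough illustration of the "mark it, don't move it" idea, here is a minimal Python sketch. It is only an analogy and not how LabVIEW implements it internally. The class name `RotatedView` is hypothetical, and the rotation direction (positive n moves element 0 to index n) is an assumption:

```python
class RotatedView:
    """Read-only view of a list rotated by n places, without moving any data.

    Hypothetical illustration only; not LabVIEW's actual implementation.
    Assumes positive n moves element 0 of the input to index n of the output.
    """

    def __init__(self, data, n):
        self._data = data  # shared reference, no copy is made
        # Index in the original data that becomes element 0 of the view.
        self._start = (-n) % len(data) if data else 0

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):
        if not 0 <= i < len(self._data):
            raise IndexError(i)
        # The "rotation" is just index arithmetic done at read time.
        return self._data[(self._start + i) % len(self._data)]
```

Creating the view costs the same no matter how large the array is, which is the kind of behavior that shows up as "impressive processing speed" in a benchmark. The price is a little index arithmetic on every later read.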
I am not aware of a complete list of such compiler optimizations, but "reverse" and "transpose" have been mentioned before. After talking to some people, it seems that "rotate array" also has such optimizations, and benchmarking seems to confirm that.
So, thanks for bringing this up. It is certainly an important point.
!!Thank you!! to all involved with this session! It's a topic that takes up a fair bit of my time! (I have a fully functional product in the field, but need to free up memory and CPU for planned and unplanned future feature additions, improvements etc.)
I agree with the (oft repeated) notion to TEST and benchmark code, but boy oh boy can it get challenging to properly set up and execute a valid benchmark, especially if you are testing/benchmarking for a cRIO RT target that is low on memory AND CPU resources! Personally, I found that I went from mostly ignorant of the 'magic' of compiler optimizations, to super-paranoid about the capabilities of the compiler optimizer, and then, after reading and digesting "NI LabVIEW Compiler: Under the Hood", I'm down to a (un?)healthy dose of caution. I'm particularly suspicious of 'dead-code removal' and 'loop-invariant' optimizations. On an RT target running compiled code (where the front panel is removed), what really constitutes 'dead' or unused code vs 'used' code? If the code I test results in an array, I'm paranoid enough that I randomly index out an element of that array and write it to a file on disk, to make sure the compiler can't 'optimize' out my results on an RT target that otherwise would only show the array in a front panel indicator after the end of the benchmark, on a front panel that no longer exists at that. Is that taking it too far? I don't know, and I don't know if that might change with the next LabVIEW/compiler update.
Everyone's needs are different. I tend to spend a lot of time optimizing for memory allocation because it is the most limiting factor on our product (closely followed by CPU), and because memory leaks show up much faster in testing if significant portions of your code only allocate once.
While closing references and not building arrays in infinite loops are important, also realize that nobody is perfect, not even NI, and sometimes there are leaks in NI primitives and NI functions too! If your code is 'memory noisy' it can take weeks of effort to find that slow but steady 4-byte-per-occurrence leak amidst the noise of constant and changing memory operations!
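For what it's worth, the "consume one random element" pattern described above can be sketched in text form like this. Python is used only because a LabVIEW diagram can't be pasted as text; CPython does not remove dead code, so this shows the shape of the harness rather than a needed workaround. The names `benchmark` and `sink` are hypothetical, and `func` is assumed to return a non-empty sequence:

```python
import random
import time


def benchmark(func, arg, sink, repeats=5):
    """Time func(arg) and force one result element to be observed.

    sink is any callable taking one value, e.g. a function that appends
    the value to a file on disk. Returns the best (minimum) time in seconds.
    """
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        result = func(arg)
        elapsed = time.perf_counter() - t0
        best = min(best, elapsed)
        # Consume a randomly chosen element outside the timed region, so the
        # output is observably used without the write skewing the timing.
        sink(result[random.randrange(len(result))])
    return best
```

Picking the element at random is the point: the index is not known ahead of time, so no single element can be singled out as the only one that needs computing.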
Everyone's needs are different. I tend to spend a lot of time optimizing for memory allocation because it is the most limiting factor on our product (closely followed by CPU)
I find that memory optimization often leads to CPU optimization as well. Not always, but most of the time. YMMV. So I often concentrate on the memory allocations first as well.
It may not matter to most, but multiply is much faster than divide, at least on PowerPC (some cRIOs) and for DBL/SGL data types. If you find yourself doing divisions on arrays of data, consider replacing them with an equivalent multiply operation, especially if the denominator is constant (or constant for each element in the array).
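A small sketch of that transformation, written in Python purely for illustration (the function names are hypothetical, and this says nothing about actual timings on a cRIO):

```python
def scale_by_division(values, denominator):
    """One floating-point division per element."""
    return [v / denominator for v in values]


def scale_by_reciprocal(values, denominator):
    """One division up front, then one multiplication per element."""
    reciprocal = 1.0 / denominator  # computed once, outside the loop
    return [v * reciprocal for v in values]
```

One caveat: the two versions are not always bit-identical. The reciprocal is rounded once and the product is rounded again, so results can differ in the last bit unless the denominator is a power of two. Whether that matters depends on the application.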
Wow! I just stumbled onto this site (though I certainly heard several of the presentations!).
Two comments/questions (remember, you asked for it ...)
- Christian's Slide 9 -- how do you turn on Array Buffer Allocation Dots? That looks really neat, but I've not seen this before (and looking it up in LabVIEW Help didn't, forgive the pun, help).
Tools->Profile->Show Buffer Allocations. A dialog will pop up allowing you to decide which type of buffer allocations to show.
Thanks, Tim. I was just reading the LabVIEW Compiler Tutorial, which also mentioned this feature (and many other "hidden gems"), and was coming back to answer my own question ...