01-20-2014 05:08 AM
Altenbach makes the following statement:
"A case structure with more than two cases always introduces an extra penalty, "
In this thread:
I can't find the detailed evidence of this.
Can someone point me towards an article/thread which explains it?
Thanks.
01-20-2014 05:16 AM
01-20-2014 11:03 AM
This was an observation from more than 10 years ago (made during the bit twiddling challenge) and is probably irrelevant for most practical purposes.
Case structures are very fast, and any small difference depending on the number of cases is irrelevant for all practical purposes.
01-20-2014 11:46 AM
Interesting academic discussion, but in real life it would be far more cumbersome to code without one. Potentially more mistake-prone, too.
01-20-2014 12:27 PM
@altenbach wrote:
Case structures are very fast, and any small difference depending on the number of cases is irrelevant for all practical purposes.
Clearly that statement was simply there to troll me. When you are trying to deal with a firehose of "Big Analog Data", effects such as this are anything but irrelevant. They make the difference between LV getting the job done (at least with code that you can quickly understand) and saying "oh well, looks like I need the speed of C here."
I recently wrote some code to select between a set of three-character codes. During unit testing I passed in repeating streams of each code, and one of them executed in roughly half the time of the others (including the default, none-of-the-above). Clearly there is a "preferred" case, and it has nothing to do with the display order. Oddly enough, it was the middle one lexicographically, so I imagine a bit of a binary decision tree going on under the hood. And there does seem to be a real added "cost": black-box benchmarking shows the expected scaling with comparison cost (string length etc.). Something jumps when you move beyond two cases, and then the expected scaling resumes.
It is not that the Case Structure is slow, it is that there is some unusual behavior when you go from two cases to more than two cases. Nesting is faster in this case, especially if you have a priori knowledge of the statistics of the input so you can check in order of probability. If you read about the history of C++ development (the language itself), one of the tenets is that anything added to the language itself must be more efficient than something which can be written in the language. In this case, nesting is faster than multiple cases in a single structure so there is something which needs to be fixed IMO. In C you have exquisite control over the order of comparisons and the short-circuiting, we should have that in LV as well. That could simply mean the compiler respects the order of the cases (and pins the default to the rear).
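As a rough text-language sketch of the difference (hypothetical integer codes and probabilities, and C rather than LV):

```c
/* Flat multi-way branch: the compiler decides how to dispatch
 * (jump table, balanced binary search, ...); the source order of
 * the cases generally does not dictate the comparison order. */
int classify_switch(int code)
{
    switch (code) {
        case 100: return 1;
        case 200: return 2;
        case 300: return 3;
        default:  return 0;
    }
}

/* Hand-nested comparisons: the programmer fixes the order, so the
 * statistically dominant value can be tested first and short-circuit
 * the remaining tests. */
int classify_nested(int code)
{
    if (code == 200) return 2;   /* assume this value arrives most often */
    if (code == 100) return 1;
    if (code == 300) return 3;
    return 0;                    /* none of the above */
}
```

The nested version is the text-language equivalent of nesting two-case structures in order of probability.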
01-20-2014 12:38 PM - edited 01-20-2014 12:39 PM
@Darin.K wrote:
In this case, nesting is faster than multiple cases in a single structure so there is something which needs to be fixed IMO. In C you have exquisite control over the order of comparisons and the short-circuiting, we should have that in LV as well. That could simply mean the compiler respects the order of the cases (and pins the default to the rear).
Since an n>2 case structure is similar to a switch statement in C (or a Select Case in other text-based languages), I assume it evaluates each case in turn and will stop checking conditions once it finds the case that is true. Is that what you are saying as well?
Since LabVIEW doesn't have any obvious method of determining which case it compares against first, is there something that determines the order? Will it make the conditional comparisons in order based on the order of the frames within the case structure? Will it start executing case 0 of the structure if that happens to be True any earlier than it would execute case 192 (if you had a case structure with that many cases) if that happened to be True? In other words, will reordering the cases within the case structure have any effect, however subtle, on execution time?
01-20-2014 01:53 PM
@RavensFan wrote:
@Darin.K wrote:
In this case, nesting is faster than multiple cases in a single structure so there is something which needs to be fixed IMO. In C you have exquisite control over the order of comparisons and the short-circuiting, we should have that in LV as well. That could simply mean the compiler respects the order of the cases (and pins the default to the rear).
Since an n>2 case structure is similar to a switch statement in C (or a Select Case in other text-based languages), I assume it evaluates each case in turn and will stop checking conditions once it finds the case that is true. Is that what you are saying as well?
Since LabVIEW doesn't have any obvious method of determining which case it compares against first, is there something that determines the order? Will it make the conditional comparisons in order based on the order of the frames within the case structure? Will it start executing case 0 of the structure if that happens to be True any earlier than it would execute case 192 (if you had a case structure with that many cases) if that happened to be True? In other words, will reordering the cases within the case structure have any effect, however subtle, on execution time?
With the caveat that each case ends with a 'break' statement, I fully expect the analogy to the switch statement to hold. LV does not allow the fall-through that makes for some interesting C constructs. With this caveat (break statements), you could also write a chain of if / else-if / else statements with the expectation that the performance would be the same. That is like nesting case structures. In C++ I do not notice a difference (other than code readability); in LV I see a real difference.
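A minimal C sketch of that caveat (hypothetical values), showing the break-per-case form that maps onto a LV case structure and the fall-through form that does not:

```c
#include <stdio.h>

/* Every case ends with break: this behaves like an if / else-if / else
 * chain, which is the closest analogy to a LV case structure. */
void with_breaks(int x)
{
    switch (x) {
        case 1:  puts("one");   break;
        case 2:  puts("two");   break;
        default: puts("other"); break;
    }
}

/* No break after case 1: it deliberately falls through into case 2,
 * a construct with no direct LV equivalent. */
void with_fallthrough(int x)
{
    switch (x) {
        case 1:  puts("one");   /* falls through */
        case 2:  puts("two");   break;
        default: puts("other"); break;
    }
}
```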
<teaching old dog new tricks by rerunning benchmarks in newer LV version/>
Things are a bit better these days, so it is heading in the right direction. There is clearly a binary search going on, so the worst case is bounded for a case structure with n>2. It benchmarks pretty well these days, with less of a clear penalty. Overall faster too, which is always nice. String comparisons are expensive, and it seems (I will have to check) that LV makes no pre-flight check on length equality; it just dives into character comparisons.
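For illustration only (not how LV is actually implemented), a pre-flight length check would look something like this, assuming both lengths are already known:

```c
#include <stdbool.h>
#include <string.h>

/* Reject most mismatches with one cheap length comparison before doing
 * any character-by-character work. */
static bool keys_equal(const char *a, size_t len_a,
                       const char *b, size_t len_b)
{
    if (len_a != len_b)
        return false;                  /* cheap O(1) rejection           */
    return memcmp(a, b, len_a) == 0;   /* only now touch the characters  */
}
```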
Bottom line for now:
1. avoid string comparisons if you can (if you care about speed).
2. if you have no expectation for the distribution of input values, the binary search is quite good and bounds the worst-case scenario at the expense of a single compile-time sort (see the sketch after this list).
3. if you have a dominant value in the input, give it a structure with just two cases: the target and the default. In the default case, either add a single nested case structure with the remaining cases or pull out the next dominant value.
This gives maximum performance with minimal nesting.
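A sketch of what the compile-time sort plus binary search buys you (hypothetical, pre-sorted selector strings, in C rather than LV): every lookup costs at most about log2(n) comparisons, no matter which case fires.

```c
#include <string.h>

/* Selector values sorted once, conceptually at compile time. */
static const char *keys[] = { "abc", "def", "ghi", "jkl" };

/* Binary search over the sorted keys; returns the case index or -1
 * for the default case. */
static int find_case(const char *sel)
{
    int lo = 0;
    int hi = (int)(sizeof keys / sizeof keys[0]) - 1;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        int cmp = strcmp(sel, keys[mid]);
        if (cmp == 0) return mid;
        if (cmp < 0)  hi = mid - 1;
        else          lo = mid + 1;
    }
    return -1;
}
```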
01-20-2014 03:14 PM
Darin.K wrote:
Bottom line for now:
1. avoid string comparisons if you can (if you care about speed).
Does this mean the JKI state machine (a case structure with string comparisons) would be slower than an enum-based state machine?
Darin.K wrote:
3. if you have a dominant value in the input, give it a structure with just two cases: the target and the default. In the default case, either add a single nested case structure with the remaining cases or pull out the next dominant value.
To clarify: you're suggesting separating the most prominent (predicted most likely) case from the Default case?
"In the default case either add a single nested case with multiple structures or perhaps pull out the next dominant value."
What does this mean exactly?
01-20-2014 03:26 PM
@Darin.K wrote:
String comparisons are expensive, and it seems (I will have to check) that LV makes no pre-flight check on length equality; it just dives into character comparisons.
I'm thinking it doesn't do any length checking, because you can do string ranges such as "a".."d". I can't remember all the gotchas that determine whether something falls within the range or not, but for the sake of argument I'll assume "ball" fits within this range. It has 4 characters, whereas the strings defining the case selection range have 1 character each, so the number of characters to compare against is somewhat indeterminate.
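A quick C sketch of why a length pre-check cannot help with range cases (the exact LV rules for ".." range endpoints are not reproduced here): the comparison is lexicographic, so the 4-character "ball" still lands between the 1-character endpoints "a" and "d".

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *s = "ball";
    /* Lexicographic range test: string length plays no role in the outcome. */
    int in_range = strcmp(s, "a") >= 0 && strcmp(s, "d") <= 0;
    printf("\"%s\" in \"a\"..\"d\": %s\n", s, in_range ? "yes" : "no");
    return 0;
}
```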
01-20-2014 04:01 PM
@battler. wrote:
Darin.K wrote:
Bottom line for now:
1. avoid string comparisons if you can (if you care about speed).
Does this mean the JKI state machine (a case structure with string comparisons) would be slower than an enum-based state machine?
@Darin.K wrote:
3. if you have a dominant value in the input, give it a structure with just two cases: the target and the default. In the default case, either add a single nested case structure with the remaining cases or pull out the next dominant value.
To clarify: you're suggesting separating the most prominent (predicted most likely) case from the Default case?
"In the default case either add a single nested case with multiple structures or perhaps pull out the next dominant value."
What does this mean exactly?
String-based state machines like the JKI are slower than their enum-based brethren, but this falls outside the realm where it makes a difference. It really only matters when you are running a case structure in a tight loop. Responding to user events, or even machine events in most cases, happens on a slow timescale to begin with. I worry about this when I am streaming a lot of data and need to keep up with the processing, or when I am processing really large data files. There are really good reasons to use enums instead of strings; speed is way down on that list.
As to the minimal nesting, what I suggest is the following example:
If I know there are three strings I am looking for, 'abc', 'def' and 'ghi', and I know that 'abc' is going to be coming 90% of the time, then I will create a case structure with two cases: 'abc' and default. Inside the default case I put a second case structure which covers the remaining cases ('def', 'ghi' and default). Rarely I may know that 'def' makes up 99% of the remaining cases, so I may nest again ('def' + default), but it is rare to need more than a couple of layers of nesting.
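In C the same nesting looks roughly like this (the 90% figure and the codes 'abc'/'def'/'ghi' are just the hypothetical values from the example above):

```c
#include <string.h>

static int dispatch(const char *sel)
{
    if (strcmp(sel, "abc") == 0)       /* dominant value checked first:   */
        return 1;                      /* ~90% of inputs stop here        */

    /* The outer structure's "default": a nested dispatch over the rest. */
    if (strcmp(sel, "def") == 0)
        return 2;
    if (strcmp(sel, "ghi") == 0)
        return 3;
    return 0;                          /* innermost default               */
}
```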