Got some interesting benchmarks on To Upper vs. "Branchless Programming"

wsimpson0050 · ‎11-13-2025

GregR, you are the GOAT. I have seen some of your stuff on Lavag.org and just wanted to let you know you are awesome when it comes to LabVIEW.

altenbach · ‎11-13-2025

Thanks for all your insights. I would never dig so deep in any of this. 😄

Over the last decades, I have spent a lot of time optimizing heavy computations to use all available cores, and LabVIEW always did a fantastic job. A decade ago, the parallelization advantage was exactly the number of CPU cores (... and even slightly higher if they were hyperthreaded).

This has changed dramatically. Intel has P and E cores, AMD has Zen 5 and Zen 5c cores, and thermal management now adapts the clock to the number of cores in use: higher if only one core does the heavy lifting, lower if all cores are busy, all to stay within the thermal envelope. I can imagine that distributing the workload over unequal cores is not trivial for the OS.

An interesting comparison about the advancements can be seen in my benchmarks.

Back in 2012, my dual Xeon E5-2687W (2x 8 cores) was pushing more than 100 A across the fabric of each CPU (150 W at 1.35 V is about 111 A).

I recently got an HX 370 (12 cores) mini PC that cost 10% of the price but is 3x faster in single-core work, has almost double the parallel performance, and does not act as a noisy space heater.

I tend to limit parallelization for general use because it simply does not make a difference (and can even hurt!). The inherently parallel nature of LabVIEW allows many different things to happen at the same time, and trying to suck all resources into one tiny part of the code needs a very good justification. Many algorithms cannot be parallelized anyway.

LabVIEW Champion.

GregR · ‎11-13-2025

@altenbach wrote:

Here's the full LUT:

I assume there are some industry conventions, e.g. xFF is x9F in "uppercase". 😄

Keep in mind this map is also codepage specific. This covers codepage 1252 which is what Windows uses for English and most European languages. Other languages use different encodings where the upper half of this would be different and there is the possibility of multibyte characters. Also Mac and Linux use UTF-8 where basically all accented letters are multibyte. All this is to say that To Upper is actually a pretty complicated operation. If you are going to create your own, make sure you understand all the use cases.
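To make the code page point concrete, here is a minimal sketch in Python (not LabVIEW, and not necessarily how To Upper is implemented internally). It rebuilds a 256-entry table for codepage 1252 from Python's own Unicode case rules, so individual entries may differ from LabVIEW's map. The function name `build_upper_lut` is made up for illustration.

```python
def build_upper_lut(codepage: str = "cp1252") -> bytes:
    """Build a 256-entry byte-to-byte uppercase table for a single-byte code page."""
    table = bytearray(range(256))  # start from the identity mapping
    for b in range(256):
        try:
            ch = bytes([b]).decode(codepage)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this code page
        up = ch.upper()
        if len(up) != 1:
            continue  # e.g. 'ß' uppercases to 'SS', which cannot fit in one byte
        try:
            enc = up.encode(codepage)
        except UnicodeEncodeError:
            continue  # the uppercase form does not exist in this code page
        if len(enc) == 1:
            table[b] = enc[0]
    return bytes(table)


LUT = build_upper_lut()
print(hex(LUT[0xFF]))             # prints 0x9f: ÿ maps to Ÿ in codepage 1252
print(b"na\xefve".translate(LUT))  # one table lookup per byte

# The same letter in UTF-8 takes two bytes, so a byte-wise table cannot handle it.
print("ÿ".encode("utf-8"), "ÿ".upper().encode("utf-8"))
```

Passing a different single-byte code page name changes the upper half of the table, which is exactly why a hand-rolled map is only valid for the encoding it was built for.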

softball · ‎11-13-2025

Hi

Everyone should read GregR's entry carefully. He is absolutely right!

One problem is that Windows uses several code pages, and there is no natural sorting order when switching between them. This might seem like a bad thing, but it is actually a minor problem, because most people use the same code page on their computers in the region they live in.

The bigger problem is that different OSes are not at all aligned on the sorting order of their 'code pages', even when used in the same region of the world.

macOS and iOS have their own understanding of sorting order. It is pretty much like Windows, but not quite the same. ( I will be polite and not mention how iTunes introduced its own sorting order ).

Android and Linux ( and the sleeping monster behind them, UNIX in its various flavors ) have a very different opinion of sorting order compared to Windows. They break every sensible rule.

Just to make sure we talk about the same problem :

>> Sorting order is important when you list the folders and files present on your hard drive.

I have created my own knowledge base built on folder and file names and more. Thousands of folders and files. In order to show content in the right order, some sorting order must be obeyed or the listings look like senseless crap.

As I like to open this knowledge base from Windows, iOS and Android, each OS must show everything in the exact same listing order.

I have learned the lesson and therefore made a short list of special characters ( besides a..z and A..Z and 0..9 ) that sort properly in a folder or file name and obey a common sorting order on all OSes.

They are <space> , '-' , '.' , '^' , '+' and '='. There are a few more characters, but they don't look pleasing to my eye for this purpose.

A surprisingly small list. Abandoning Android and Linux expands the list. It is easy to check for yourself. Just create folders and files using the characters you like. Space can of course not be the first character.
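A quick way to screen existing names against that short list is a small check like the sketch below. It is an illustration only: the function name `is_portable_name` is made up, and the character set is just the one listed above (ASCII letters, digits, space, and `- . ^ + =`, with no leading space), not an authoritative rule for any OS.

```python
import re

# First character: anything from the list except space.
# Remaining characters: the full list, including space.
_PORTABLE = re.compile(r"[A-Za-z0-9.^+=-][A-Za-z0-9 .^+=-]*")


def is_portable_name(name: str) -> bool:
    """Return True if the name uses only the characters from the short list."""
    return _PORTABLE.fullmatch(name) is not None


for candidate in ["2025-11 notes.txt", " leading space", "draft_v2", "a+b=c"]:
    print(repr(candidate), is_portable_name(candidate))
```

Names with an underscore or a leading space are rejected here; whether those actually sort differently on a given system is still something to verify by creating test folders, as described above.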

I am particularly annoyed with the Linux community. They think that just because UNIX was used on all the database mainframes of yesteryear, their inherited sorting order concepts are sacred.

To see what I mean, try googling 'LC_COLLATE=C'. Brainless monkey talk.

With this rant behind me, I will just quote GregR's final words :
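For anyone who has not run into it, the difference can be reproduced without touching any locale settings. The sketch below compares plain code point ordering, which is roughly what a `C` collation gives for ASCII names, with a simple case-insensitive ordering. The second one is only a stand-in; real file managers use more elaborate rules, and the folder names are invented.

```python
names = ["beta", "Alpha", "_notes", "alpha2", "Zeta"]

# Code point order: every uppercase letter sorts before '_' and before
# every lowercase letter, because that is their order in ASCII.
by_code_point = sorted(names)

# Case-insensitive order: compare the case-folded form of each name.
case_insensitive = sorted(names, key=str.casefold)

print(by_code_point)     # ['Alpha', 'Zeta', '_notes', 'alpha2', 'beta']
print(case_insensitive)  # ['_notes', 'Alpha', 'alpha2', 'beta', 'Zeta']
```

The same five folders end up in two different orders, which is the kind of mismatch described above when one knowledge base is browsed from several systems.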

All this is to say that To Upper is actually a pretty complicated operation. If you are going to create your own, make sure you understand all the use cases.

Regards

Andrey_Dmitriev · ‎11-13-2025

@softball wrote:

Hi

Everyone should read GregR's entry carefully. He is absolutely right !

A problem is that Windows uses several code pages...

I would like to kindly remind everyone that the primary goal of the initial message was to compare and explore branched versus branchless code and to discuss how this may affect performance in the context of LabVIEW. The small, simplified example was taken from a video as a piece of code that includes a branch, nothing more.
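In text form, the two variants being compared look roughly like the sketch below. This is an illustration in Python, not the LabVIEW diagram or the code from the video, and it handles ASCII `a`..`z` only. Python's interpreter branches internally either way, so this shows the shape of the two approaches, not their timing; any real measurement has to be done in compiled code or in LabVIEW itself.

```python
def to_upper_branched(data: bytes) -> bytes:
    """Uppercase ASCII letters using an explicit conditional."""
    out = bytearray(data)
    for i, c in enumerate(out):
        if 0x61 <= c <= 0x7A:      # branch: taken only for 'a'..'z'
            out[i] = c - 0x20
    return bytes(out)


def to_upper_branchless(data: bytes) -> bytes:
    """Uppercase ASCII letters by turning the condition into arithmetic."""
    out = bytearray(data)
    for i, c in enumerate(out):
        is_lower = (c >= 0x61) & (c <= 0x7A)  # bool, used as 0 or 1
        out[i] = c - 0x20 * is_lower          # subtract 32 or subtract 0
    return bytes(out)


print(to_upper_branched(b"Hello, World 123"))
print(to_upper_branchless(b"Hello, World 123"))
```

Both functions return the same bytes; the second simply does the subtraction unconditionally with a multiplier of 0 or 1 instead of skipping it.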

In my humble opinion, creating a fully functional “UpperCase” function lies entirely outside the scope of this discussion (as do Parallel Loops, to be honest). It serves merely as synthetic demonstration code for a small test, slightly more engaging than a trivial ternary operator but nothing beyond that.

Perhaps we could explore Unicode if relevant, but there is no need to reinvent the wheel, as suitable libraries and functions are already available.

Andrey_Dmitriev · ‎11-13-2025

@altenbach wrote:

Thanks for all your insights. I would never dig so deep in any of this. 😄

Over the last decades, I have spent a lot of time optimizing heavy computations

You're welcome! It has long been my dream to clearly align a LabVIEW diagram with the generated machine code. In the past, I disassembled code from an application built with LabVIEW 6.1. The code was fairly simple, but it was not easy to jump to the right location without digging through a long assembly listing. Now, however, I was finally able to step into running code in the debugger. Once I figured out the easiest way to do that and found where the branch generated by the case structure was actually located, I couldn't stop experimenting with different primitives and structures and writing down the results. I ended up spending half the night on it, like an excited child with a new toy. And yes, I remember your benchmarks; I even sent in at least one result.

alexderjuengere · ‎11-20-2025

so, if

Branchless Programming literally means "programming without jumps/branches" (branches = conditional jumps such as if/else in source code, or jz, jnz, etc. in machine code; jmp is an unconditional jump).

then

the typical “LabVIEW standard state machine” with a large case structure is a performance nightmare on desktop CPUs because it does exactly what modern CPUs hate most: many small, unpredictable indirect jumps.

agreed?

Andrey_Dmitriev · ‎11-20-2025

@alexderjuengere wrote:

so, if

Branchless Programming literally means "programming without jumps/branches" (branches = conditional jumps such as if/else in source code, or jz, jnz, etc. in machine code; jmp is an unconditional jump).

then

the typical “LabVIEW standard state machine” with a large case structure is a performance nightmare on desktop CPUs because it does exactly what modern CPUs hate most: many small, unpredictable indirect jumps.

agreed?

No, or more precisely, it depends. Typically, the time spent inside the individual states of the state machine design pattern is much larger than the time spent on the state transitions themselves. Therefore, it usually doesn't make sense to optimize for the few microseconds saved on branches when switching from one state to another. Such "branchless" optimizations only really matter in highly intensive computational loops. In those cases, there's always the option of moving from relatively inefficient LabVIEW code to a DLL built with a more optimizing compiler.
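A back-of-the-envelope estimate shows the proportions. All three numbers below are assumptions chosen for illustration, not measurements from this thread: a 4 GHz clock, a penalty of about 20 cycles for one mispredicted branch, and a state that does 1 ms of real work. The actual per-transition overhead in LabVIEW is probably larger than a single misprediction, but the conclusion holds as long as the state's work dominates.

```python
CLOCK_HZ = 4.0e9         # assumed CPU clock
MISPREDICT_CYCLES = 20   # assumed cost of one mispredicted branch
STATE_WORK_S = 1.0e-3    # assumed time spent doing real work in one state

mispredict_s = MISPREDICT_CYCLES / CLOCK_HZ
overhead_fraction = mispredict_s / STATE_WORK_S

print(f"{mispredict_s * 1e9:.1f} ns per mispredicted transition")
print(f"{overhead_fraction:.6%} of the time spent in the state")
```

With these assumptions the misprediction costs about 5 ns against 1 ms of work, a few parts per million. In a tight loop whose body takes only a few nanoseconds, the same 5 ns would be a large share of each iteration, which is why it only matters there.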

However, putting an entire state machine into such a DLL can turn into a maintenance nightmare. At that point, it might be better to switch to another language entirely. That said, LabVIEW remains an excellent choice for desktop GUI applications. Based on my practical experience with Delphi, C++/WinForms, C#/WPF, Avalonia, and Rust, almost nowhere is it as easy and comfortable as in LabVIEW.

crossrulz · ‎11-20-2025

@Andrey_Dmitriev wrote:
Such "branchless" optimizations only really matter in highly intensive computational loops.

This is the important thing when it comes to optimizations in general. You should only worry about these types of optimizations when you actually need to. I'm not saying to be inefficient with your code. But there have been a lot of cases lately of people trying to optimize when it is not necessary.

