Which function searches a string the fastest?

kosist90 · ‎04-01-2017

Recently I've been asking myself: which function is best to use when I only need to check whether a string contains some substring? I usually use the "Search and Split String" primitive, but is it fast enough?

Below is a small investigation of mine; I'm curious whether you agree with it or not...

I made a simple timing benchmark of the functions listed in the table below.

I created one VI that directly calls the function under investigation, and another VI that measures the execution time. Each function was called in a loop 1 million times; afterwards the time of the first loop iteration was discarded (because of jitter on the first run), and the maximum, minimum, and mean execution times were computed.

Two cases were also tried: when the string was present in the text, and when it was not.
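Since the block diagram is not shown as text, here is a rough Python analogue of the procedure described above. It is only a sketch: the text, the search strings, and the helper names (`time_calls`, `search_present`, `search_absent`) are made up for illustration, and Python's `in` operator stands in for the LabVIEW primitives.

```python
import statistics
import time

# Hypothetical stand-in for the text being searched.
TEXT = "the quick brown fox jumps over the lazy dog " * 20

def search_present():
    return "lazy" in TEXT       # case 1: the substring exists

def search_absent():
    return "zebra" in TEXT      # case 2: the substring does not exist

def time_calls(func, n_calls):
    """Time every call separately; return (min, max, mean) in seconds."""
    samples = []
    for _ in range(n_calls):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    samples = samples[1:]       # discard the first iteration (first-run jitter)
    return min(samples), max(samples), statistics.fmean(samples)

for label, func in (("present", search_present), ("absent", search_absent)):
    fastest, slowest, mean = time_calls(func, 100_000)
    print(f"{label}: min={fastest:.2e} s  max={slowest:.2e} s  mean={mean:.2e} s")
```

Note that every sample here includes two timer reads and a function call in addition to the search itself.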

Of course, I’m not saying, that this is the most precise time benchmark ever, but let’s check the results:

So it turns out that Search/Split String or Match Pattern seems to be the best choice. The first one takes the least time when the searched string is present in the text, but it is 2 times slower when the string is not present. Overall, though, the average (mean) time for the two is the same.

Do you agree? Or are there other, even faster and more efficient functions?

kosist90 · ‎04-01-2017

And just now I've realized that the time units are seconds, not milliseconds.

CoastalMaineBird · ‎04-01-2017

You are not measuring what you think you're measuring.

Consider this article of mine: <http://culverson.com/what-time-is-it/>.

There's a tool at the end of it that may help you be more accurate.

What you're measuring is the time to do all this:

Start a SEQUENCE structure

Read the timer

Step a SEQUENCE structure

Call a SubVI (unless you've inlined it)

Do the work you're interested in.

Return from a subVI

Step a sequence structure

Read the timer again.

Step OUT of a sequence.

Subtract one time from another

Accumulate results in an array.

If your SubVI is OPEN (Panel showing) when you run, then you have more stuff happening (updating indicators, etc).

If you're changing the code inside the subVI, and closing it before each run, then you have a RELATIVE indicator of which is faster, but that might be masked by all the overhead you have.

If you have 10 uSec of overhead, it's hard to spot the difference between 200 nSec and 250 nSec execution time. Consider the tool I posted in the article.

In any case, the ABSOLUTE times you post are not accurate.
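To make the overhead point concrete outside LabVIEW, here is a small Python sketch (an analogy, not the tool from the article). It times a do-nothing function the same way as the real work, which gives a rough idea of the fixed cost of the measurement itself, and compares that with reading the timer only once around the whole loop. All names (`per_call_mean`, `batch_mean`, `do_nothing`, `search`) are hypothetical.

```python
import time

def per_call_mean(func, n):
    """Mean time per call when the timer is read around every single call."""
    total = 0.0
    for _ in range(n):
        start = time.perf_counter()
        func()
        total += time.perf_counter() - start
    return total / n

def batch_mean(func, n):
    """Mean time per call when the timer is read once around the whole loop."""
    start = time.perf_counter()
    for _ in range(n):
        func()
    return (time.perf_counter() - start) / n

def do_nothing():
    return None                 # measures only call and timer overhead

TEXT = "x" * 200 + "needle"     # made-up text with the match at the end

def search():
    return TEXT.find("needle")  # the work we actually care about

N = 200_000
print(f"per-call, empty : {per_call_mean(do_nothing, N):.2e} s")
print(f"per-call, search: {per_call_mean(search, N):.2e} s")
print(f"batch, empty    : {batch_mean(do_nothing, N):.2e} s")
print(f"batch, search   : {batch_mean(search, N):.2e} s")
```

If the "empty" figure is a large fraction of the "search" figure, the measurement is mostly overhead. The batch version still includes loop and call overhead, so subtracting the empty baseline gives only an estimate.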

Steve Bird
Culverson Software - Elegant software that is a pleasure to use.
Culverson.com

Blog for (mostly LabVIEW) programmers: Tips And Tricks

kosist90 · ‎04-01-2017

Thanks, CoastalMaineBird - next time I'll use this approach for time benchmarking for sure...

But I guess it does not influence the relative comparison between execution times of different functions? Because inside the wrapper VI all the other operations (subtracting times, etc.) are always the same, so they take almost the same execution time...

My intent was mostly to check which function is faster and which is quite slow, not their exact execution times...

CoastalMaineBird · ‎04-01-2017

But one point I want to make is that you can sometimes swamp the thing you want to measure, by the overhead, as I mentioned.

it does not influence the relative comparison between execution times of different functions?

If the exec time of the real thing you want to measure is large, compared to the overhead, then you're fine. But if it's not, you don't know WHAT you're measuring.

I should point out that there are traps in my scheme, as well. You should turn on the LV option that shows CONSTANT FOLDING. If you don't, you could miss the fact that LabVIEW will turn your "2*3*5" computation into a constant "30" and avoid the work altogether. Usually it's fine to do that, but you could be measuring something else besides what you think.

For example, it would be perfectly reasonable in your case, for LabVIEW to figure out that your VI gets the same answer EVERY TIME for the same inputs, AND that you are providing the same inputs EVERY TIME. A smart programmer could recognize that you could reduce the whole thing to a constant, and supply a constant. LabVIEW is evolving into having such smarts. That would help you out in most circumstances, but it throws your timing measurements out the window.
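The same trap exists in text-based languages. As an illustration only (this is CPython, not LabVIEW, and the helper name `runtime_binary_ops` is made up), the sketch below counts how many arithmetic instructions remain in the compiled bytecode of an expression. It assumes CPython, whose compiler folds constant expressions such as `2 * 3 * 5`.

```python
import dis

def runtime_binary_ops(source):
    """Count arithmetic instructions left in the compiled bytecode of an expression."""
    code = compile(source, "<example>", "eval")
    return sum(
        1 for ins in dis.get_instructions(code)
        if ins.opname.startswith("BINARY")
    )

folded = runtime_binary_ops("2 * 3 * 5")    # all operands are constants
unfolded = runtime_binary_ops("a * b * c")  # operands are only known at run time

print("multiplications left for 2 * 3 * 5:", folded)
print("multiplications left for a * b * c:", unfolded)
```

A benchmark of the first expression would time the loading of a constant rather than any multiplication, which is why feeding a benchmark fixed, compile-time-known inputs can be misleading.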


altenbach · ‎04-01-2017

A pillar of science is "peer review" so a snippet with missing subVIs does not allow us to independently evaluate the merits of your results. (A snippet also does not retain execution settings and such).

As a first step, attach your entire benchmark project.

Until then I will refrain from all comments. You also forgot techniques that would operate on the string as a U8 array.

Benchmarking is a minefield, and correct benchmarking is an art. See also our NI-Week talk from 2016.

LabVIEW Champion.

kosist90 · ‎04-01-2017

Here you are - these are the two VIs that I used...

But I didn't set up any special settings, because when this function is used in the application, it will run with the standard setup.

Honestly, I don't trust people who care about milliseconds in completely time non-critical applications running on Windows. That is more about speculation.

So my point was just to find out which function runs faster and which slower. Just out of curiosity, nothing more 😃

If some magic benchmark setup shows that the Search and Replace function is indeed faster than the others, then I'll be really surprised.

But if correcting the benchmark shows a difference of a few milliseconds and does not change the final result of the comparison, then...

But I agree that measuring execution time really precisely, when it is really needed, is quite tricky and not so easy. I'd be very happy to learn more about it.

Thanks a lot,

Sincerely, kosist90

CoastalMaineBird · ‎04-01-2017

I don't trust people who care about milliseconds in completely time non-critical applications running on Windows.

Nobody mentioned Windows until this post right here.

How should I know you're testing a "time non-critical application" ?

After all, you're asking about timing of a low-level function, and measuring things like 4.4 uSec - sounds like you're interested in precise timing.

Ask a better question and you'll get a better answer.


altenbach · ‎04-01-2017

The subVI is empty. We don't want to guess what you did, so we need something that is ready to run out of the box.

Instead of testing a small string many times, it would be more interesting to test a very long string (with a match at a random location) fewer times. It might also be more interesting to see how many matches occur or prove that there is no matching string. The subVI should always return an output.
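As a sketch of what such a test could look like in a text language (Python here, since a LabVIEW diagram cannot be pasted as text), the code below builds a long random string, inserts the search string at a random position, and times two candidate search functions for both the hit and the miss case. Everything in it is an assumption for illustration: the string length, the repeat count, and the names `make_haystack`, `find_with_str`, `find_with_regex`, and `mean_time`. Each candidate always returns an output: the match index, or -1 when there is no match.

```python
import random
import re
import string
import time

NEEDLE = "NEEDLE"  # upper case, so it cannot occur by chance in the lower-case body

def make_haystack(length, needle, rng):
    """Random lower-case text with `needle` inserted at a random position."""
    body = "".join(rng.choices(string.ascii_lowercase, k=length))
    pos = rng.randrange(length + 1)
    return body[:pos] + needle + body[pos:], pos

def find_with_str(haystack, needle):
    return haystack.find(needle)            # index of the match, or -1

def find_with_regex(haystack, needle):
    match = re.search(re.escape(needle), haystack)
    return match.start() if match else -1   # same convention as str.find

def mean_time(func, haystack, needle, repeats):
    """Mean seconds per call, reading the timer once around the loop."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(haystack, needle)
    return (time.perf_counter() - start) / repeats

rng = random.Random(2017)                   # fixed seed so runs are repeatable
haystack, position = make_haystack(200_000, NEEDLE, rng)
for name, func in (("str.find", find_with_str), ("re.search", find_with_regex)):
    hit = mean_time(func, haystack, NEEDLE, 20)
    miss = mean_time(func, haystack, "ABSENT", 20)
    print(f"{name}: match at {func(haystack, NEEDLE)}, hit={hit:.2e} s, miss={miss:.2e} s")
```

Checking the returned index against the known insert position confirms that each candidate really did the search before its timing is compared.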

Having debugging enabled should only be the "standard setup" during code development; it should be off once deployed. The speed only matters when deployed.

@kosist90 wrote:

Honestly, I don't trust people who care about milliseconds in completely time non-critical applications running on Windows.

A millisecond is a very (very!) long time for a computer, so if a program needs to do a few things, each taking a ms, I would be worried.


kosist90 · ‎04-01-2017

@altenbach, thank you very much - I didn't think that the output of the function was important in this case...

@altenbach wrote:

The subVI is empty. We don't want to guess what you did, so we need something that is ready to run out of the box.

Sorry for this, but you don't need to guess - the functions used are listed in the table screenshot in the first post. I didn't keep the functions in the subVI; I just pasted in whatever was needed. Plus, I don't know whether you have the GPower or OpenG toolkits installed, so that's why the subVI is empty...
