"Match Regular Expression" and "Match Pattern" vi's behave differently


Hi,

 

I have a simple string matching need and by experimenting found that the "Match Regular Expression" and "Match Pattern" vi's behave somewhat differently. I'd assume that the regular expression inputs on both would behave the same. A difference I've discovered is that the "|" character (the "vertical bar" character, commonly used as an "or" operator) is recognized as such in the Match Regular Expression vi, but not in the Match Pattern vi (where it is taken literally). Furthermore, I cannot find any documentation in Help (on-line or in LabVIEW) about the "|" character usage in regular expressions. Is this documented anywhere?

 

For example, suppose I want to match any of the following 4 words: "The" or "quick" or "brown" or "fox". The regular expression "The|quick|brown|fox" (without the quotes) works for the Match Regular Expression vi but not the Match Pattern vi. Below is a picture of the block diagram and the front panel results:

 

[Images: Reg Exp Block Diag.PNG, Reg Exp Front Panel.PNG]
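For readers without LabVIEW at hand, here is a minimal sketch of the same behavior in Python. It is only an analogy: Python's `re` module stands in for Match Regular Expression, and a plain substring search stands in for a matcher that takes "|" literally, which is how Match Pattern treats it. The input string is an assumption, since the original one is only visible in the screenshots.

```python
import re

# Assumed sample input; the string used in the original screenshots is not known.
text = "The quick brown fox jumps"

# With alternation support, "|" means "or".
pattern = re.compile(r"The|quick|brown|fox")
first = pattern.search(text)
all_hits = pattern.findall(text)
print(first.group(0), first.start())
print(all_hits)

# Without alternation support, "|" is an ordinary character, so the whole
# expression is searched for as one literal string and is not found.
literal_offset = text.find("The|quick|brown|fox")
print(literal_offset)
```

The last search returns -1 because the input never contains the literal text "The|quick|brown|fox", which mirrors the "no match" result described above.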

 

The Help says that the Match Regular Expression vi performs somewhat slower than the Match Pattern vi, so I started with the latter. But since it doesn't work for me, I'll use the former. But does anyone have any idea of the speed difference? I'd assume it is negligible in such a simple example.

 

Thanks!

Message 1 of 9

Speaking only about execution time, I've done a simple test:

 

[Image: BD_time.JPG]

 

 

So for 1E6 iterations:

Match Regular Expression: about 3400 ms

Match Pattern vi: about 60 ms

 

Marco

 

Message 2 of 9

Yep, you hit a point that's frustrated me a time or two as well (and, incidentally, caused some hair-pulling that I can ill afford).

 

 

The hint is in the help file:

 

For Match Regular Expression: "The Match Regular Expression function gives you more options for matching strings but performs more slowly than the Match Pattern function. ... Use regular expressions in this function to refine searches. ..."

Characters to Find: Regular Expression
VOLTS: VOLTS
A plus sign or a minus sign: [+-]
A sequence of one or more digits: [0-9]+
Zero or more spaces: \s* or " *" (that is, a space followed by an asterisk)
One or more spaces, tabs, new lines, or carriage returns: [\t \r \n \s]+
One or more characters other than digits: [^0-9]+
The word Level only if it appears at the beginning of the string: ^Level
The word Volts only if it appears at the end of the string: Volts$
The longest string within parentheses: \(.*\)
The first string within parentheses but not containing any parentheses within it: \([^()]*\)
A left bracket: \[
A right bracket: \]
cat, cag, cot, cog, dat, dag, dot, and dog: [cd][ao][tg]
cat or dog: cat|dog
dog, cat dog, cat cat dog, cat cat cat dog, and so on: ((cat )*dog)
One or more of the letter a followed by a space and the same number of the letter a, that is, a a, aa aa, aaa aaa, and so on: (a+) \1
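As a quick sanity check, several rows of the table above can be tried in Python. This is a sketch under an assumption: Python's `re` module is not the PCRE library that Match Regular Expression uses, but for these particular patterns the two syntaxes agree. The sample inputs are made up for illustration.

```python
import re

# (pattern from the table, made-up sample input, text the pattern should match)
examples = [
    (r"[+-]", "-12", "-"),
    (r"[0-9]+", "abc 123 def", "123"),
    (r"^Level", "Level 5", "Level"),
    (r"Volts$", "12 Volts", "Volts"),
    (r"\(.*\)", "a (b (c) d) e", "(b (c) d)"),      # longest parenthesized span
    (r"\([^()]*\)", "a (b (c) d) e", "(c)"),        # no nested parentheses
    (r"[cd][ao][tg]", "hotdog", "dog"),
    (r"cat|dog", "my dog", "dog"),                  # alternation
    (r"((cat )*dog)", "cat cat dog!", "cat cat dog"),
    (r"(a+) \1", "aa aa", "aa aa"),                 # backreference
]

results = []
for pattern, text, expected in examples:
    match = re.search(pattern, text)
    found = match.group(0) if match else None
    results.append((pattern, found, found == expected))
    print(f"{pattern!r:18} on {text!r:18} -> {found!r}")
```

The alternation, grouping, and backreference rows are the ones that have no counterpart in Match Pattern.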

 

 

For Match Pattern: "This function is similar to the Search and Replace Pattern VI. The Match Pattern function gives you fewer options for matching strings but performs more quickly than the Match Regular Expression function. For example, the Match Pattern function does not support the parenthesis or vertical bar (|) characters."

Characters to Find: Regular Expression
VOLTS: VOLTS
All uppercase and lowercase versions of volts, that is, VOLTS, Volts, volts, and so on: [Vv][Oo][Ll][Tt][Ss]
A space, a plus sign, or a minus sign: [ +-]
A sequence of one or more digits: [0-9]+
Zero or more spaces: \s* or " *" (that is, a space followed by an asterisk)
One or more spaces, tabs, new lines, or carriage returns: [\t \r \n \s]+
One or more characters other than digits: [~0-9]+
The word Level only if it begins at the offset position in the string: ^Level
The word Volts only if it appears at the end of the string: Volts$
The longest string within parentheses: (.*)
The longest string within parentheses but not containing any parentheses within it: ([~()]*)
A left bracket: \[
A right bracket: \]
cat, dog, cot, dot, cog, and so on: [cd][ao][tg]

 

 

Frustrating, but still manageable.


"Should be" isn't "Is" -Jay
Message 3 of 9

Thanks, Marco. The execution time question in my post was an afterthought, so I didn't try it. So it appears that Match Pattern is nearly 60 times faster! I wouldn't have suspected that much. But 3400 ms / 1E6 iterations is only 3.4 µs per call, so it is negligible in my case. If it were used in a loop as in your example, though, it could be significant!
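The arithmetic behind those figures, written out as a short Python sketch. The only inputs are the numbers Marco reported (1E6 iterations, about 3400 ms and about 60 ms); the variable names are just for illustration.

```python
iterations = 1_000_000      # 1E6 iterations in Marco's test
regex_total_ms = 3400       # Match Regular Expression, total time
pattern_total_ms = 60       # Match Pattern, total time

# Convert ms to microseconds, then divide by the number of calls.
regex_us_per_call = regex_total_ms * 1000 / iterations
pattern_us_per_call = pattern_total_ms * 1000 / iterations
speed_ratio = regex_total_ms / pattern_total_ms

print(f"Match Regular Expression: {regex_us_per_call:.2f} us per call")
print(f"Match Pattern:            {pattern_us_per_call:.2f} us per call")
print(f"Ratio:                    {speed_ratio:.1f}x")
```

That works out to roughly 3.4 µs versus 0.06 µs per call, a ratio of about 57.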

 

-Ed

Message 4 of 9
Solution
Accepted by topic author Edjsch

Thanks, Jeff. That's what I was looking for. BUT my version of LabVIEW, 8.5, does NOT say "For example, the Match Pattern function does not support the parenthesis or vertical bar (|) characters."!

 

See: http://zone.ni.com/reference/en-XX/help/371361D-01/glang/match_pattern/

 

and http://zone.ni.com/reference/en-XX/help/371361D-01/glang/match_regular_expression/

 

Nor is it mentioned in the Special Characters for Match Pattern help: http://zone.ni.com/reference/en-XX/help/371361D-01/lvhowto/specialcharformatchpatt/

 

The only place | was "mentioned" is in the sentence: "Certain regular expressions that use alternation (such as (.|\s)*) require significant resources to process when applied to large input strings." But I am not processing a large string.

 

It looks like NI fixed this omission. What version is your help from?

 

Ed

Message Edited by Edjsch on 05-19-2010 11:11 AM
Message 5 of 9

I searched and found that LabVIEW version 8.6 help has this correction.

 

Ed

Message 6 of 9

Hello my friend, how do you use | as in The|quick|brown|fox in the Match Pattern function?

Message 7 of 9

7 years ago this thread was started, and unless you are here to continue the discussion on the differences between Match Regular Expression and Match Pattern, I suggest making your own thread.

Message 8 of 9

Match Pattern does not support the alternation option of the regular expression grammar. Historically, Match Pattern was introduced with one of the first versions of LabVIEW, and it implemented a simplified version of regular expression matching. There were various regular expression syntaxes back then, and the LabVIEW developers chose to implement one that was fairly powerful but not too complicated to implement.
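If alternation is not available, one possible workaround is to search for each alternative separately and keep the earliest hit. The Python sketch below illustrates the idea only. The helper name `match_any_literal` is hypothetical, and the assumption is that the same logic could be built in LabVIEW by calling Match Pattern once per word in a loop; that is not something this thread confirms.

```python
def match_any_literal(text, words, offset=0):
    """Return (position, word) for the earliest literal match, or None.

    Emulates "word1|word2|..." without alternation by trying each word
    on its own. If two words match at the same position, the one listed
    first wins.
    """
    best = None
    for word in words:
        pos = text.find(word, offset)   # plain substring search, no regex
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, word)
    return best


words = ["The", "quick", "brown", "fox"]
print(match_any_literal("a quick brown fox", words))
print(match_any_literal("nothing here", words))
```

This only covers literal alternatives. Anything that mixes alternation with grouping or repetition is where Match Regular Expression is the practical choice.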

Since then, PCRE (Perl Compatible Regular Expressions) has more or less become the de facto standard for regular expression implementations, which is why NI eventually added the Match Regular Expression function. It uses the PCRE library to implement a fully featured regular expression parser.

 

Changing the Match Pattern function to support the full PCRE syntax was not an option, however, since it uses an incompatible regular expression syntax and doing so would have broken many existing LabVIEW programs. The simpler regular expression syntax of Match Pattern also results in a significant performance difference, which is another reason to keep both functions in LabVIEW.

 

Use Match Pattern if its regular expression syntax supports your use case, and Match Regular Expression if you need the additional features of that function. And if you would rather not learn both (which IMHO is impossible anyway for the full PCRE syntax), simply use Match Regular Expression only.

Rolf Kalbermatter
Message 9 of 9