Complex regular expressions without multiple passes

kc64 · ‎01-13-2010

Does anyone know of a tool that can process more complex regular expressions without chaining multiple copies of the regular expressions VI together?

As an example, if I have an XML string like

<data type="string">Power supply error has occurred.</data>
<retry method="Initialize" />
<target>Sorensen SGA166/188</target>

and I'm interested in the retry tag method only, I might write a regex something like

.*<retry method="([A-Za-z0-9_])*" />.*

to parse out the string inside the tag.

F._Schubert · ‎01-14-2010

Try pattern matching with this:

to get the complete tag. Then you need to get the first attribute by splitting at the space and match again using

[~=]=[/s]+"[~"]"

And now extract the Attribute Value:

"[~"]"

and remove the first and last character from this string to get rid of the quotes.

How well this works depends on how generically you need to handle XML. In general you cannot split attributes using spaces as delimiters, because spaces can appear inside the quotes as well (see your target element), but it would work with the retry element. Also, quotes can be " or ', and if I remember correctly you also need to match case-insensitively.

Felix

kc64 · ‎01-14-2010

Thank you for your reply but I think you missed the point which is to process the string in one pass. My question is not about how to write regular expressions.

In other languages, I can perform the example task with one call and get back the substring--the part in the ()--as the result. Your reply exactly demonstrates the reason for my question which is that you have to run a regex, then split it, then strip off this or that.
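As a rough illustration of the one-call behavior described here, the following sketch uses Python's `re` module (an assumption; the thread does not name a language). It runs one search against the XML fragment from the question and reads the parenthesized part back directly. The variable names are hypothetical, and the `*` has been placed inside the parentheses so the group holds the whole attribute value.

```python
import re

# The XML fragment from the original question.
xml_text = (
    '<data type="string">Power supply error has occurred.</data>\n'
    '<retry method="Initialize" />\n'
    '<target>Sorensen SGA166/188</target>\n'
)

# One search call; the parenthesized group is returned as the substring.
# The * is inside the group so the full attribute value is captured.
match = re.search(r'<retry method="([A-Za-z0-9_]*)" />', xml_text)
retry_method = match.group(1) if match else None

print(retry_method)
```

No splitting or stripping of quotes is needed afterwards, because the quotes sit outside the capture group.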

Jim_Kring · ‎01-14-2010

At the risk of sounding like an advertisement, you might want to check out EasyXML, which is made by JKI (where I work). Parsing XML can get tricky in a hurry. EasyXML makes it... well... easy 🙂

Cheers,

-Jim


smercurio_fc · ‎01-14-2010

kc64 wrote:
Does anyone know of a tool that can process more complex regular expressions without chaining multiple copies of the regular expressions VI together?

As an example, if I have an XML string like

<data type="string">Power supply error has occurred.</data>
<retry method="Initialize" />
<target>Sorensen SGA166/188</target>

and I'm interested in the retry tag method only, I might write a regex something like

.*<retry method="([A-Za-z0-9_])*" />.*

to parse out the string inside the tag.

If you are dealing with XML then why not simply use the XML Parser VIs/functions, assuming you have LV 8.6 or higher? There's an example that ships with LabVIEW called "Query XML Document for a Single Node".

Or you could use the EasyXML VIs as Jim mentioned, which will probably be easier.
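For comparison, here is a minimal sketch of the parser-based approach in a text language. It uses Python's standard `xml.etree.ElementTree` purely as a stand-in; it is not the LabVIEW XML Parser API or EasyXML. The fragment in the question has three top-level elements, so the sketch wraps it in a hypothetical `<root>` element to make it well-formed before parsing.

```python
import xml.etree.ElementTree as ET

# The XML fragment from the original question.
fragment = (
    '<data type="string">Power supply error has occurred.</data>'
    '<retry method="Initialize" />'
    '<target>Sorensen SGA166/188</target>'
)

# Assumption: wrap the fragment in a made-up <root> element, because a
# well-formed XML document must have exactly one top-level element.
root = ET.fromstring("<root>" + fragment + "</root>")

# Look up the retry element and read its "method" attribute.
retry = root.find("retry")
method = retry.get("method") if retry is not None else None

print(method)
```

A parser handles either quote style and spaces inside attribute values without any changes to the lookup.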

kc64 · ‎01-14-2010

Thanks for your replies but they are off the target. I've rewritten my question.

Does anyone know of a tool that can process more complex regular expressions without chaining multiple copies of the regular expressions VI together?

As an example, if I have a string like

My email address is foobar@gmail.com. Please don't spam me.

and I'm interested in the domain name of the email address only, I might write a regex something like

@(\w)*\.(com|net|org)

to parse out the string "gmail".

Darin.K · ‎01-14-2010

kc64 wrote:

As an example, if I have a string like
My email address is foobar@gmail.com. Please don't spam me.
and I'm interested in the domain name of the email address only, I might write a regex something like

@(\w)*\.(com|net|org)

to parse out the string "gmail".

Pardon me if I am way off base, but I am flying blind at the moment (no LV to test what I am about to say). You can add to the power of a regular expression using submatches, also called capture groups. The regex you have written will grab (I think) @gmail.com for the whole match. Let's say you want to get 'gmail' without a second function call. Move the * inside the parentheses so that the first group captures the whole run of word characters instead of a single character. Next, on the block diagram (BD), pull down on the bottom of the Match Regular Expression function to expose a variable number of submatches (two should do in this case). The first one should be 'gmail'. The second one should be 'com'.

In summary, @(\w*)\.(com|net|org) should give you gmail in the first submatch. Of course, my Perl is a bit rusty and LV may not implement it similarly.
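The difference between the two patterns can be checked in a text language. The sketch below uses Python's `re` module, not LabVIEW's Match Regular Expression function, so treat it as an analogy rather than a test of the VI; the variable names are hypothetical. With the `*` outside the parentheses the group is repeated and keeps only its last iteration, while with the `*` inside the group captures the whole word.

```python
import re

text = "My email address is foobar@gmail.com. Please don't spam me."

# Original pattern: the * repeats the one-character group, so the group
# ends up holding only the last character it matched.
outside = re.search(r'@(\w)*\.(com|net|org)', text)

# Suggested pattern: the * is inside the group, so the group holds the
# whole run of word characters.
inside = re.search(r'@(\w*)\.(com|net|org)', text)

print(outside.group(0))                 # whole match
print(outside.group(1))                 # first submatch, original pattern
print(inside.group(1), inside.group(2)) # both submatches, suggested pattern
```

Both patterns produce the same whole match; only the content of the first submatch changes.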

Message Edited by Darin.K on 01-14-2010 01:42 PM

kc64 · ‎01-14-2010

Darin.K--

Thank you very much! That is exactly what I needed and so simple too. I don't know why I never noticed before that the Reg Ex vi was stretchable. Doh.

Thanks. 🙂
