Regular Expressions Board

thoult · ‎01-07-2014

Beat me to it!

The alt text for the comic also has an amusing regex.

---
CLA

Henrik_Volkers · ‎01-07-2014

this one?

/bu|[rn]t|[coy]e|[mtg]a|j|iso|n[hl]|[ae]d|lev|sh|[lnd]i|[po]…last names of elected US presidents but not their opponents.

Greetings from Germany
Henrik

LV since v3.1

“ground” is a convenient fantasy

'˙˙˙˙uıɐƃɐ lɐıp puɐ °06 ǝuoɥd ɹnoʎ uɹnʇ ǝsɐǝld 'ʎɹɐuıƃɐɯı sı pǝlɐıp ǝʌɐɥ noʎ ɹǝqɯnu ǝɥʇ'

thoult · ‎01-09-2014

There's a nice inspection of the regexes and their efficiency from that comic here. Turns out they can be whittled down further.

Nice if you like that sort of thing, I mean!

---
CLA

Ray.R · ‎01-17-2014

How to improve the following regex:

DI([0-9]+)([\s]+=[\s]+)([\S]+)

I'm sure something better can be done for : ([\s]+=[\s]+)

which is meant to skip \s=\s or \s\s\s=\s\s\s or any amount of whitespace surrounding the equals sign. I guess there might be a situation where there are no spaces, but that's unlikely.

Thanks..

Darin.K · ‎01-17-2014

Brackets are for alternatives and are not needed when you have a single alternative: [\s]+ is equivalent to \s+. Personally I would allow for no spaces unless your grammar explicitly forbids it.

As to improvements, beyond the bracket removal I would not change much. Shaving a few characters here and there will not affect compile time or search time noticeably, so I would leave it written in the way you understand.
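A quick way to check both points (brackets around a single class change nothing, and + versus * decides whether spaces are required) is a small sketch like the one below. It uses Python's re module as a stand-in, which is an assumption since the thread does not name an engine, and the sample strings are made up for illustration.

```python
import re

# Illustrative inputs, not taken from a real file.
spaced = "DI000   =   BPtxMute"
tight = "DI000=BPtxMute"

bracketed = re.compile(r"DI([0-9]+)([\s]+=[\s]+)([\S]+)")
plain = re.compile(r"DI([0-9]+)(\s+=\s+)(\S+)")
optional = re.compile(r"DI([0-9]+)(\s*=\s*)(\S+)")

# [\s]+ and \s+ describe the same strings, so the captures are identical.
same_groups = bracketed.search(spaced).groups() == plain.search(spaced).groups()

# With + the pattern needs at least one whitespace character on each side of "=".
strict_on_tight = plain.search(tight)

# With * the whitespace becomes optional, so "DI000=BPtxMute" is accepted too.
loose_on_tight = optional.search(tight).groups()

print(same_groups, strict_on_tight, loose_on_tight)
```

The + version returns no match on the input without spaces, while the * version still captures the number and the value.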

Ray.R · ‎01-17-2014

Thanks Darin!

🙂

JackDunaway · ‎01-17-2014

@Ray.R wrote:

How to improve the following regex:

DI([0-9]+)([\s]+=[\s]+)([\S]+)

It looks like you might be trying to match a key value pair, where the middle submatch is "trash". You can make that a non-capturing group, and allow optional spaces around the "equals":

DI([0-9]+)(?:\s*=\s*)([\S]+)

Capture the DI number in the first submatch and the value in the second, ignoring the equals sign and any amount of whitespace around it. Note that \s matches more than just spaces, however: it matches any whitespace, including tabs and line breaks, so the pattern can reach across lines. With a little more context, we could add some anchors to better constrain the match, ensuring it doesn't, for instance, choke on key/value pairs with an empty value.
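To illustrate the empty-value problem, here is a minimal sketch in Python's re module (an assumption; other engines may differ in details). The anchored pattern is only one hypothetical way to tighten things up, and the input text is made up.

```python
import re

# The pattern from this post: the "=" and its whitespace are grouped but not captured.
loose = re.compile(r"DI([0-9]+)(?:\s*=\s*)([\S]+)")

# Hypothetical tightened variant: anchored per line, only spaces/tabs around "=".
anchored = re.compile(r"^DI([0-9]+)[ \t]*=[ \t]*(\S+)[ \t]*$", re.MULTILINE)

# Made-up input where the first key has an empty value.
text = "DI000 =\nDI001 = BPsystem-Failed"

loose_pairs = loose.findall(text)        # \s* swallows the line break
anchored_pairs = anchored.findall(text)  # the empty-valued line is skipped

print(loose_pairs)
print(anchored_pairs)
```

Because \s also matches the line break, the loose pattern pairs DI000 with the text "DI001" from the next line. The anchored variant skips the empty-valued line and reports only DI001 with its real value.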

Ray.R · ‎01-17-2014

Thanks Jack,

Here's a couple of examples of lines this regex will be dealing with:

DI000 = BPtxMute
DI001 = BPsystem-Failed

Nothing fancy..

I like to learn how to optimize regex code, hence my posts.

😄

lvrat · ‎06-22-2014

How do I find the exact match for a defined pattern? For example:

I need to find $a_LIFT as an exact match. My input strings are

1. if $a_LIFT eq 20

2. if $a_LIFT_TEST eq 0

When using the regular expression \$a_LIFT, it picks up both #1 and #2 in the search, like a greedy match.

Any pattern that is input as regular expression needs to be matched exactly in the input string and disregard any other matches.

Thanks.

*************************************************
CLD
*************************************************

JackDunaway · ‎06-22-2014

@lvrat wrote:

How do I find the exact match for a defined pattern? For example:

I need to find $a_LIFT as an exact match. My input strings are

1. if $a_LIFT eq 20

2. if $a_LIFT_TEST eq 0

When using the regular expression \$a_LIFT, it picks up both #1 and #2 in the search, like a greedy match.

Any pattern that is input as regular expression needs to be matched exactly in the input string and disregard any other matches.

Thanks.

The confusion here might be what "exact match" means, from your perspective compared to the regex engine's perspective. 🙂

\$a_LIFT as a regex indeed finds an exact match for both lines, but it's likely that your desired semantic is closer to ...

(?:\s|\A)\K\$a_LIFT(?=\s|\z)

... meaning, "let's define 'exact match' to mean \$a_LIFT that is preceded by either an ignored whitespace character or the beginning of the string, and followed by a whitespace character or the end of the string, which we'll also ignore." (Consider a lookbehind instead of the \K keep operator if what precedes the match is fixed-width rather than variable-width. For example, have a go with (?<=\s)\$a_LIFT(?=\s) if you don't need the beginning/end-of-string anchors in your application.)
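For anyone trying this outside a PCRE-style engine: Python's standard re module does not support \K, so the sketch below swaps in a pair of negative lookarounds that express the same idea ("no non-whitespace character on either side"). This is a substitute technique and not the pattern above. The third test line is made up to cover the start/end-of-string case.

```python
import re

# (?<!\S) : not preceded by a non-whitespace character (whitespace or start of string)
# (?!\S)  : not followed by a non-whitespace character (whitespace or end of string)
exact = re.compile(r"(?<!\S)\$a_LIFT(?!\S)")

lines = [
    "if $a_LIFT eq 20",
    "if $a_LIFT_TEST eq 0",
    "$a_LIFT",  # hypothetical extra case: the token is the whole string
]

# Keep only the lines where the token stands on its own.
hits = [line for line in lines if exact.search(line)]
print(hits)
```

The second line is rejected because the underscore right after $a_LIFT is a non-whitespace character. Both lookarounds are zero-width, so the match itself is just the token.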
