03-13-2007 10:33 AM - edited 03-13-2007 10:33 AM
Message Edited by eaolson on 03-13-2007 10:33 AM
03-13-2007 11:09 AM
03-13-2007 11:13 AM
03-13-2007 01:17 PM
If the regexp is replaced with [G[bi]], there is no error, so it's not a matter of nested brackets. I couldn't find anything on the PCRE man page that forbids nested brackets specifically, but it makes sense.
adambrewster wrote: I think your regexp is invalid. In regexps, brackets are not the same as parentheses. Parens are for grouping, while brackets are for matching one of a class of characters. Brackets cannot be nested.
I don't believe that's the case. Replace the regexp with [(Gbi)], and the error goes away. So it's not a matter of the '(' being literal, and then encountering a ')' without a matching '('.
Your expression "[(G[bi])]" therefore parses as a character class which matches '(', 'G', '[', 'b', or 'i', followed by an unmatched paren and then a stray closing bracket.
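The parse described above can be checked with a short script. This is only a sketch: it uses Python's `re` module rather than the PCRE engine discussed in this thread, on the assumption that both treat parens and brackets inside a character class the same way for these three patterns. The helper name `try_compile` is made up for illustration.

```python
import re
import warnings


def try_compile(pattern):
    """Return None if the pattern compiles, otherwise the error message."""
    with warnings.catch_warnings():
        # Some Python versions warn about a '[' inside a character class.
        warnings.simplefilter("ignore")
        try:
            re.compile(pattern)
            return None
        except re.error as exc:
            return str(exc)


# The three patterns from this thread.
for pattern in ["[(G[bi])]", "[G[bi]]", "[(Gbi)]"]:
    print(pattern, "->", try_compile(pattern) or "compiles")

# "[G[bi]]" is the class [G[bi] followed by a literal ']',
# so it matches "b]" but not "Gb".
print(bool(re.fullmatch("[G[bi]]", "b]")))
print(bool(re.fullmatch("[G[bi]]", "Gb")))
```

Only the first pattern should be rejected, because its class ends at the first ']' and leaves a ')' with no matching '('.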
It's not my regular expression. A poster at LAVA was having problems with one of his (a truly frightening one), and this seemed to be the element that was causing the problem. I'm pretty sure that the originator of the regexp meant to use G(b|i), which seems like a complicated way of matching "Gb" or "Gi", if you ask me.
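For comparison, here is a small sketch of the two spellings that match "Gb" or "Gi". It again uses Python's `re` module as a stand-in for PCRE, and the variable names are only for illustration.

```python
import re

grouped = re.compile(r"G(b|i)")  # alternation inside a capturing group
classed = re.compile(r"G[bi]")   # character class, no capture

for text in ["Gb", "Gi", "Ga", "G"]:
    # Both patterns should agree on every input.
    print(text, bool(grouped.fullmatch(text)), bool(classed.fullmatch(text)))
```

The only practical difference is that the grouped form also captures the 'b' or 'i' as a submatch.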
daveTW wrote: What string exactly do you want to replace? I think the round braces are not right in this case, since they mark partial matches which are given back by "match regular expression". But you don't want to extract parts of the string, you want to replace them (or delete, with empty <replace string>). So if you leave out the outer [( ... )] then your RegEx means all strings with either "Gb" or "Gi".
03-13-2007 01:34 PM
From the doc you linked:
Part of a pattern that is in square brackets is called a "character class". In a character class the only metacharacters are:
  \  general escape character
  ^  negate the class, but only if the first character
  -  indicates character range
  [  POSIX character class (only if followed by POSIX syntax)
  ]  terminates the character class
Inside a character class, parens are always literal, and [ is a literal except in POSIX classes, such as [:alpha:].
In your second example, "[(Gbi)]" parsed as a character class which matches '(', 'G', 'b', 'i', or ')'.
If you still don't believe me, try this one: "([[)])". If parens and brackets inside brackets aren't literal, then it will fail.
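That test can be tried outside LabVIEW as well. The sketch below assumes Python's `re` module is a fair stand-in for PCRE on this pattern; the name `pattern` is just for illustration.

```python
import re
import warnings

with warnings.catch_warnings():
    # Python warns about a leading "[[" in a class; silence that here.
    warnings.simplefilter("ignore")
    # A capturing group around the class [[)] , which holds '[' and ')'.
    pattern = re.compile(r"([[)])")

for text in ["[", ")", "(", "]"]:
    match = pattern.fullmatch(text)
    print(repr(text), "->", match.group(1) if match else "no match")
```

The pattern compiles, and only "[" and ")" match, which is what you would expect if both characters are literal inside the brackets.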
03-13-2007 01:40 PM
03-13-2007 01:48 PM