PCRE supports recursive patterns, which can be used to match nested subgroups. For example, consider the "grammar"
Q -> \w | '[' A ';' Q* ','? Q* ']' | '<' A '>'
A -> (Q | ',')*
// to match ^A$.
This can be done in PCRE with the pattern
^((?:,|(\w|\[(?1);(?2)*,?(?2)*\]|<(?1)>))*)$
(Example test case: http://www.ideone.com/L4lHE)
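To make the grammar concrete, here is a minimal sketch of a hand-written recursive-descent recognizer for it in Python. This is not the PCRE pattern and not a .NET solution; it only spells out what ^A$ is supposed to accept. The function names (matches_grammar, _parse_q, and so on) are made up for this illustration, and \w is assumed to mean ASCII letters, digits and underscore, as in PCRE's default mode.

```python
def _is_word(ch: str) -> bool:
    # Assumption: \w means ASCII [A-Za-z0-9_], as in PCRE without Unicode mode.
    return ch.isascii() and (ch.isalnum() or ch == "_")


def _parse_q(s: str, i: int):
    """Try to parse one Q at index i. Return the index after it, or None."""
    if i >= len(s):
        return None
    ch = s[i]
    if _is_word(ch):                      # Q -> \w
        return i + 1
    if ch == "[":                         # Q -> '[' A ';' Q* ','? Q* ']'
        i = _parse_a(s, i + 1)
        if i >= len(s) or s[i] != ";":
            return None
        i = _parse_q_star(s, i + 1)
        if i < len(s) and s[i] == ",":    # the optional single comma
            i += 1
        i = _parse_q_star(s, i)
        if i < len(s) and s[i] == "]":
            return i + 1
        return None
    if ch == "<":                         # Q -> '<' A '>'
        i = _parse_a(s, i + 1)
        if i < len(s) and s[i] == ">":
            return i + 1
        return None
    return None


def _parse_q_star(s: str, i: int) -> int:
    """Greedily parse Q*; return the index after the last Q that parsed."""
    while True:
        j = _parse_q(s, i)
        if j is None:
            return i
        i = j


def _parse_a(s: str, i: int) -> int:
    """Greedily parse A -> (Q | ',')*; return the index where it stops."""
    while i < len(s):
        if s[i] == ",":
            i += 1
            continue
        j = _parse_q(s, i)
        if j is None:
            break
        i = j
    return i


def matches_grammar(s: str) -> bool:
    """True if the whole string derives from A, i.e. the ^A$ case."""
    return _parse_a(s, 0) == len(s)
```

For example, matches_grammar("as[abc;d,e]") and matches_grammar("[;[;]]") return True, while matches_grammar("[a]") returns False because the ';' is missing (that last input is my own pick for illustration).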
Should match:
abcdefg
abc,def,ghi
abc,,,def
,,,,,,
[abc;]
[a,bc;]
sss[abc;d]
as[abc;d,e]
[abc;d,e][fgh;j,k]
[b;
<,,,>
<>
<><>
<>,<>
a<<<<>>>>
<<<<<>>>><><<<>>>>
[[;];]
[,;,]
[;[;]]
[<[;]>;<[;][;,<[;,]>]>]
Should not match:
.NET has no recursive patterns. Instead, it provides balancing groups, a stack-based mechanism for matching simple nested patterns.
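As a rough analogy for what "stack-based" means here, the Python sketch below checks that angle brackets are balanced by pushing on '<' and popping on '>'. It is plain Python, not .NET regex syntax, and balanced_angles is a made-up name. The comments point at the commonly used .NET idiom ^(?:(?<open><)|(?<-open>>)|[^<>])*(?(open)(?!))$ only to show which part of the loop each construct loosely corresponds to.

```python
def balanced_angles(text: str) -> bool:
    """Return True if every '<' in text is closed by a later '>'."""
    stack = []
    for ch in text:
        if ch == "<":
            stack.append(ch)   # roughly (?<open><): push a capture
        elif ch == ">":
            if not stack:
                return False   # roughly (?<-open>>) failing on an empty stack
            stack.pop()        # roughly (?<-open>>): pop a capture
        # any other character is skipped, like [^<>]
    return not stack           # roughly (?(open)(?!)): fail if captures remain
```

Here balanced_angles("<<>>") is True and balanced_angles("<<>") is False. This handles only one bracket type with nothing else to validate inside, which is the "simple nested patterns" case; the grammar above needs more than this.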
Is it possible to convert the above PCRE pattern into .NET Regex style?
(Yes, I know it's better not to use regex for this. It's just a theoretical question.)