Saturday, 17 November 2018

php - Match regex pattern that isn't within a bbcode tag




I am attempting to create a regex pattern that will match words in a string that begin with @.



A regex that solves this initial problem is '~(@\w+)~'.
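As a quick sanity check, the sketch below runs that pattern over a sample string that contains a quoted section. It shows that the pattern on its own also picks up the handles inside the quote. The variable names are illustrative.

```php
<?php
// Sample input: two handles before the quote, two inside it, one after it
$string = "@friends @john [quote]@and @jane[/quote] @doe";

// Group 1 captures each @handle; the pattern knows nothing about [quote] tags
preg_match_all('~(@\w+)~', $string, $matches);

print_r($matches[1]);
// => Array ( [0] => @friends [1] => @john [2] => @and [3] => @jane [4] => @doe )
```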



A second requirement is that the code must ignore any matches that occur between [quote] and [/quote] tags.



A couple of attempts that have failed are:



(?:[0-9]+|~(@\w+)~)(?![0-9a-z]*\[\/[a-z]+\])


/[quote[\s\]][\s\S]*?\/quote](*SKIP)(*F)|~(@\w+)~/i


Example: the following string should have an array output as displayed:



$results = [];
$string = "@friends @john [quote]@and @jane[/quote] @doe";

//run regex match
preg_match_all('regex', $string, $results);


//dump results
var_dump($results[1]);

//expected results: array consisting of:
// [0]=>"@friends"
// [1]=>"@john"
// [2]=>"@doe"

Answer




You may use the following regex (based on another related question):



'~(\[quote](?:(?1)|.)*?\[/quote])(*SKIP)(*F)|@\w+~s'


The regex accounts for nested [quote] tags.



Details





  • (\[quote](?:(?1)|.)*?\[/quote])(*SKIP)(*F) - matches the pattern inside the capturing parentheses, and then (*SKIP)(*F) makes the regex engine discard the matched text and resume searching right after it:


    • \[quote] - a literal [quote] string

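    • (?:(?1)|.)*? - zero or more occurrences, as few as possible, of either the whole Group 1 pattern (the (?1) subroutine call, which handles nested quotes) or any single char (., which also matches line breaks because of the s flag)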

    • \[/quote] - a literal [/quote] string


  • | - or

  • @\w+ - an @ followed by one or more word chars.
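To see why the (?1) recursion matters, the sketch below compares the pattern with a simplified, non-recursive variant on a string with nested quotes. Both the variant and the sample string are assumptions made for illustration and are not part of the original answer.

```php
<?php
// Illustrative input with one [quote] nested inside another
$string = "@a [quote]@b [quote]@c[/quote] @d[/quote] @e";

// Recursive pattern from the answer: (?1) re-enters Group 1 for inner quotes
$recursive = '~(\[quote](?:(?1)|.)*?\[/quote])(*SKIP)(*F)|@\w+~s';

// Simplified variant (assumption, for contrast): stops at the first [/quote]
$flat = '~\[quote].*?\[/quote](*SKIP)(*F)|@\w+~s';

preg_match_all($recursive, $string, $nestedAware);
preg_match_all($flat, $string, $nestedBlind);

print_r($nestedAware[0]); // => Array ( [0] => @a [1] => @e )
print_r($nestedBlind[0]); // => Array ( [0] => @a [1] => @d [2] => @e )
```

The non-recursive variant treats the first [/quote] as the end of the outer quote, so @d leaks into the results.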




PHP demo:



$results = [];
$string = "@friends @john [quote]@and @jane[/quote] @doe";
$rx = '~(\[quote](?:(?1)|.)*?\[/quote])(*SKIP)(*F)|@\w+~s';
preg_match_all($rx, $string, $results);
print_r($results[0]);
// => Array ( [0] => @friends [1] => @john [2] => @doe )
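
If the pattern is needed in more than one place, it can be wrapped in a small function. The following is only a sketch: the name extractMentions is hypothetical, and the added i flag (so that [QUOTE] is skipped as well) is an assumption carried over from the /i used in the question's second attempt.

```php
<?php
/**
 * Hypothetical helper (not from the original answer): returns every @handle
 * that sits outside [quote]...[/quote] blocks, including nested ones.
 */
function extractMentions(string $text): array
{
    // Same pattern as above, plus `i` so that [QUOTE] is matched too (assumption)
    $rx = '~(\[quote](?:(?1)|.)*?\[/quote])(*SKIP)(*F)|@\w+~si';

    // preg_match_all() returns false on failure, e.g. when a PCRE limit is hit
    if (preg_match_all($rx, $text, $matches) === false) {
        throw new RuntimeException('preg error code: ' . preg_last_error());
    }

    // Index 0 holds the full matches; Group 1 never takes part in a kept match
    return $matches[0];
}

print_r(extractMentions("@friends @john [QUOTE]@and @jane[/QUOTE] @doe"));
// => Array ( [0] => @friends [1] => @john [2] => @doe )
```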

