Monday, 27 August 2018

regex - What do 'lazy' and 'greedy' mean in the context of regular expressions?




Could someone explain these two terms in an understandable way?


Answer



A greedy quantifier will consume as much as possible. The page http://www.regular-expressions.info/repeat.html gives the example of trying to match HTML tags with <.+>. Suppose you have the following:



<em>Hello World</em>


You may think that <.+> (. means any non-newline character and + means one or more) would match only the <em> and the </em>, when in reality it is greedy and runs from the first < to the last >. This means it will match the whole of <em>Hello World</em> instead of the individual tags you wanted.
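
To see the greedy behaviour concretely, here is a minimal Python sketch. The sample string <em>Hello World</em> and the variable names are assumptions chosen for illustration; any text wrapped in a pair of tags should behave the same way.

```python
import re

# Assumed sample input: some text wrapped in one pair of HTML tags.
text = "<em>Hello World</em>"

# Greedy: .+ first grabs as many characters as it can, then backtracks
# only as far as needed for the trailing > to match.
greedy_matches = re.findall(r"<.+>", text)

print(greedy_matches)  # ['<em>Hello World</em>']
```

The whole string comes back as a single match, because the last > in the input is the one that ends up closing the pattern.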



Making it lazy (<.+?>) will prevent this. By adding the ? after the +, we tell it to repeat as few times as possible, so the match stops at the first > it comes across.
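
The lazy version can be checked the same way. This is a small sketch under the same assumption about the sample string (<em>Hello World</em>); the names used are illustrative only.

```python
import re

# Assumed sample input, chosen for illustration.
text = "<em>Hello World</em>"

# Lazy: .+? matches one character, then extends one character at a time
# only until the following > can match.
lazy_matches = re.findall(r"<.+?>", text)

print(lazy_matches)  # ['<em>', '</em>']
```

Each tag is now returned separately, since every match ends at the nearest > rather than the furthest one.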




I'd encourage you to download RegExr, a great tool that will help you explore Regular Expressions - I use it all the time.


