Wednesday, 16 January 2019

Match a line with multiple regex using Python



Is there a way to check whether a line contains text that matches any of a set of regex patterns?
If I have [regex1, regex2, regex3] and I want to see whether a line matches any of them, how would I do this?
Right now I am using re.findall(regex1, line), but that only tests one regex at a time.


Answer



You can use the built-in function any (or all, if every regex has to match) with a generator expression to cycle through all the regex objects.



any(regex.match(line) for regex in [regex1, regex2, regex3])



(or any(re.match(regex_str, line) for regex_str in [regex_str1, regex_str2, regex_str3]) if the regexes are plain strings rather than pre-compiled regex objects, of course)
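As a minimal sketch of the any/all approach above: the patterns and the helper names matches_any and matches_all are made up for illustration and are not from the original answer. The sketch uses search() rather than match() on the assumption that "contains" means a match anywhere in the line; match() only tests the start of the line.

```python
import re

# Hypothetical patterns, chosen only for illustration.
patterns = [
    re.compile(r"\berror\b"),
    re.compile(r"\bwarn(ing)?\b"),
    re.compile(r"\d{3}-\d{4}"),
]

def matches_any(line, regexes):
    """Return True if at least one regex matches somewhere in the line."""
    # any() stops at the first regex that produces a match object.
    return any(rx.search(line) for rx in regexes)

def matches_all(line, regexes):
    """Return True only if every regex matches somewhere in the line."""
    # all() stops at the first regex that returns None.
    return all(rx.search(line) for rx in regexes)

line = "warning: call 555-1234"
print(matches_any(line, patterns))
print(matches_all(line, patterns))
```

Because any and all short-circuit, the generator expression stops evaluating regexes as soon as the result is known.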



That will be inefficient, though, compared to combining your regexes into a single expression. If this code is time- or CPU-critical, try composing a single regular expression that covers all your needs instead, using the special | regex operator to separate the original expressions.
A simple way to combine all the regexes is to use the string join method:



re.match("|".join([regex_str1, regex_str2, regex_str3]), line)



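Be careful, though: combining the regexes this way can produce a wrong expression in some cases, for instance if the originals rely on numbered groups or backreferences. If they already use the | operator, wrapping each one in a non-capturing group, (?:...), before joining keeps them clearly separated.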
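One way to guard against that problem is to wrap each original pattern in a non-capturing group before joining. The sketch below assumes the patterns are plain strings; the helper names combine and line_matches and the sample patterns are hypothetical. It does not protect patterns that use numbered backreferences or global inline flags, which would still need separate handling.

```python
import re

def combine(pattern_strings):
    """Join pattern strings into one alternation, isolating each in (?:...)."""
    # The non-capturing group keeps each original pattern self-contained,
    # even if it already contains its own | alternatives.
    return re.compile("|".join(f"(?:{p})" for p in pattern_strings))

# Hypothetical pattern strings; the second one already uses |.
pattern_strings = [r"\berror\b", r"warn|caution", r"\d{3}-\d{4}"]
combined = combine(pattern_strings)

def line_matches(line):
    """Return True if any of the original patterns matches somewhere in the line."""
    return combined.search(line) is not None

print(combined.pattern)
print(line_matches("proceed with caution"))
```

Compiling the combined pattern once and reusing it avoids rebuilding the joined string for every line.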

