When I run a regex pattern from an online RegEx testing tool on the text below, it works fine. However, it does not work when I use it in sed on Unix.
Text:
001 Transaction Successful 2016-07-01-12:05:40.383 N 2016-07-01-12:05:44.171
RegEx:
(.*?)<\/DtTm>
Usage in sed: I want to remove everything between the <DtTm> and </DtTm> tags:
sed 's/(.*?)<\/DtTm>//g'
Expected Output:
001 Transaction Successful N
Answer
GNU sed has two modes, basic and extended. Neither of these, nor the single basic mode of less advanced sed implementations, permits non-greedy quantifiers. As per the info sed output:
Note that the regular expression matcher is greedy, i.e., matches are attempted from left to right and, if two or more matches are possible starting at the same character, it selects the longest.
So, if you need non-greedy, you will have to choose another tool, such as Perl (or something else supporting PCRE), which is probably what the online testing tool you mentioned is using.
The good thing is, the Perl substitute command is so stunningly similar to the sed one that you can often just change the program name (and possibly use a different delimiter character in complex REs so you don't end up with sawtooths like \/\/\/\/\/):
perl -pe 's|<DtTm>.*?</DtTm>||g'