Saturday, 29 September 2018

regex - how to use sed, awk, or gawk to print only what is matched?




I see lots of examples and man pages on how to do things like search-and-replace using sed, awk, or gawk.



But in my case, I have a regular expression that I want to run against a text file to extract a specific value. I don't want to do search-and-replace. This is being called from bash. Let's use an example:



Example regular expression:



.*abc([0-9]+)xyz.*



Example input file:



a
b
c
abc12345xyz
a
b
c



As simple as this sounds, I cannot figure out how to call sed/awk/gawk correctly. What I was hoping to do from within my bash script is:



myvalue=$( sed <...something...> input.txt )


Things I've tried include:



sed -e 's/.*([0-9]).*/\\1/g' example.txt # extracts the entire input file
sed -n 's/.*([0-9]).*/\\1/g' example.txt # extracts nothing


Answer



My sed (Mac OS X) didn't work with +, so I used * instead and added the p flag to print the match:



sed -n 's/^.*abc\([0-9]*\)xyz.*$/\1/p' example.txt
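A likely reason + did not work: without extra options sed uses basic regular expressions, where + is an ordinary character. The sketch below contrasts that with extended regular expressions, assuming a sed that accepts the -E option (recent BSD/macOS and GNU versions do). The variable names line, bre and ere are only for illustration.

```shell
line='abc12345xyz'

# Basic regular expression: + is treated as a literal plus sign,
# so the pattern does not match and nothing is printed.
bre=$(printf '%s\n' "$line" | sed -n 's/.*abc\([0-9]+\)xyz.*/\1/p')

# Extended regular expression (-E): + means "one or more",
# and grouping parentheses are written without backslashes.
ere=$(printf '%s\n' "$line" | sed -n -E 's/.*abc([0-9]+)xyz.*/\1/p')

echo "BRE result: '$bre'"
echo "ERE result: '$ere'"
```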


For matching at least one numeric character without +, I would use:



sed -n 's/^.*abc\([0-9][0-9]*\)xyz.*$/\1/p' example.txt
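To tie this back to the question, here is a minimal sketch of capturing the extracted value in a bash variable. It recreates the example input in a temporary file (the variable name input and the use of mktemp are assumptions for the sake of a self-contained example) and then runs the command above.

```shell
# Recreate the example input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' a b c abc12345xyz a b c > "$input"

# -n suppresses sed's default output; the trailing p prints only
# the lines on which the substitution actually happened.
myvalue=$(sed -n 's/^.*abc\([0-9][0-9]*\)xyz.*$/\1/p' "$input")

echo "$myvalue"

rm -f "$input"
```

If more than one line matches, myvalue will contain one extracted value per line.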

