Thursday, 1 August 2019

Regex to match multiple different sub-strings in a given input string



I am trying to write a regular expression that will match multiple sub-strings in a given string. Here is my requirement:



Input String: value(abc) or v(def) or s(xyz)



My regular expression should match both value( and v(.



This is what I wrote: ^(?:value\(|v\()




But the above regex matches either value( or v(, not both. I need the regex to match both. Is there any way to do this?



Also, is there a way to get the substring between the parentheses? For instance, a way to pick out abc or def in the above example?


Answer



Your regex starts off with a start-of-string anchor (^). This causes the regex to only match at the start of your string. Since "v(def" is not at the start of the input string "value(abc) or v(def) or s(xyz)", the regex will not match it. Removing the start-of-string anchor will fix this.
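As a quick check of the anchor's effect, here is a minimal sketch in Python (the question does not name a regex engine, so the choice of Python's re module is an assumption). It runs the original pattern with and without the ^ against the input string from the question:

```python
import re

# Input string taken from the question.
text = "value(abc) or v(def) or s(xyz)"

# Original pattern: the ^ anchor only allows a match at the start of the string.
anchored = re.compile(r"^(?:value\(|v\()")
# Same pattern with the anchor removed.
unanchored = re.compile(r"(?:value\(|v\()")

anchored_matches = anchored.findall(text)
unanchored_matches = unanchored.findall(text)

print(anchored_matches)    # ['value(']
print(unanchored_matches)  # ['value(', 'v(']
```

Other engines expose a similar "find all matches" call under a different name.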



In addition, the two alternatives in your regex are mostly the same, aside from some additional characters in the first alternative. Your regex could be simplified to the following:



v(?:alue)?\(
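To see that the shorter pattern finds the same sub-strings, here is a small sketch, again assuming Python's re module since the question does not say which engine is in use:

```python
import re

# Input string taken from the question.
text = "value(abc) or v(def) or s(xyz)"

# "v", optionally followed by "alue", then a literal opening parenthesis.
simplified = re.compile(r"v(?:alue)?\(")

simplified_matches = simplified.findall(text)
print(simplified_matches)  # ['value(', 'v(']
```

Note that this pattern has no word boundary, so it would also match the v( at the end of a longer name such as rev(. If that matters for your input, prefixing the pattern with \b is one option.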







Update: To get the value of the expression inside the parentheses, you can use a capturing group (surround an expression with ()). Capturing groups are numbered based on the position of their opening parenthesis. The group whose ( comes first is group "1", the second ( is group "2", and so on. Non-capturing groups (?:...) are skipped in this numbering. Depending on what regex engine you are using, you might also be able to use named capturing groups (?<name>...) (I know .NET supports them). You would then use your engine's method of retrieving the value of a capturing group.



For example, the following regex will match:




  • v or value

  • an opening (

  • an optional value made up of alphabetic characters


  • a closing )



The optional value will be stored in the "value" capturing group. You will want to change the expression inside the value group to match the format of your values.



v(?:alue)?\((?<value>[a-zA-Z]*)\)
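How you read the captured text depends on the engine. As one illustration, here is a sketch in Python, which is an assumption since the question does not name an engine. Python's re module writes named groups as (?P<name>...), so the group named value is declared as (?P<value>...) below; the rest of the pattern is unchanged:

```python
import re

# Input string taken from the question.
text = "value(abc) or v(def) or s(xyz)"

# Python syntax for a named group is (?P<name>...).
# The non-capturing (?:alue) does not get a number, so "value" is also group 1.
pattern = re.compile(r"v(?:alue)?\((?P<value>[a-zA-Z]*)\)")

# Collect the captured text from every match in the string.
captured = [m.group("value") for m in pattern.finditer(text)]
print(captured)  # ['abc', 'def']

# The same group can be read by number instead of by name.
first = pattern.search(text)
by_number = first.group(1)
print(by_number)  # abc
```

s(xyz) is not matched because the pattern requires a leading v. Because the character class uses *, an empty pair of parentheses such as v() would match with an empty captured value.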





