Saturday, 29 September 2018

regex - The meaning of \1* operator in Java regexes




I am learning about Java regexes, and I noticed the following operator:



\\1*


I'm having a hard time figuring out what it means (searching the web didn't help).
For example, what is the difference between these two options:



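Pattern p1 = Pattern.compile("(a)\\1*"); // option1
Pattern p2 = Pattern.compile("(a)"); // option2

Matcher m1 = p1.matcher("a");
Matcher m2 = p2.matcher("a");

// find() must succeed before group() is called; otherwise group() throws IllegalStateException
m1.find();
m2.find();

System.out.println(m1.group(0));
System.out.println(m2.group(0));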


Result:




a
a


Thanks!


Answer



\\1 (the Java string literal for the regex \1) is a backreference to the first capturing group, which is (a) here. It matches the same text that the group captured.



So (a)\\1* is equivalent to (a)a* in this particular case.
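The equivalence holds only because the group (a) can capture exactly one string. As a minimal sketch of the general case, the following uses a hypothetical group ([ab]) and a helper named firstMatch (both are illustrative assumptions, not part of the question) to show that a backreference repeats the captured text, not the group's pattern:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackreferenceDemo {
    // Returns the first match of the regex in the input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // \\1 must repeat the exact text captured by group 1 ("a" here),
        // so the match stops before the "b".
        System.out.println(firstMatch("([ab])\\1*", "aab"));  // prints aa

        // Repeating the group's pattern instead accepts any mix of a and b.
        System.out.println(firstMatch("([ab])[ab]*", "aab")); // prints aab
    }
}
```

So ([ab])\\1* and ([ab])[ab]* are not interchangeable, whereas (a)\\1* and (a)a* are.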




Here is an example that shows the difference:



Pattern p1 = Pattern.compile("(a)\\1*");
Pattern p2 = Pattern.compile("(a)");

Matcher m1 = p1.matcher("aa");
Matcher m2 = p2.matcher("aa");

m1.find();

System.out.println(m1.group());
m2.find();
System.out.println(m2.group());


Output:



aa
a



As you can see, when the input contains several consecutive a characters, the first regular expression matches all of them, while the second one matches only the first.
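Note that the overall match and the content of group 1 are two different things: the backreference extends the match but does not change what the group holds. A small sketch, assuming a hypothetical helper named matchAndGroup that returns both values for the first match:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Returns {whole match, text of group 1} for the first match, or null if none.
    static String[] matchAndGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(), m.group(1) };
    }

    public static void main(String[] args) {
        String[] result = matchAndGroup("(a)\\1*", "aaa");
        System.out.println(result[0]); // prints aaa (group 0, the whole match)
        System.out.println(result[1]); // prints a   (group 1, the captured text)
    }
}
```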

