Saturday, 2 June 2018

javascript - Regular Expression to match open and closed html tags




I want to come up with a regular expression that returns true if every closing HTML tag in a given text is matched with an opening one, using JavaScript. If there is an unmatched tag, it should return false.




For example, if the following text is passed "<div>Test</div>" it should return true, but if the following text gets passed "<div>Test</div><div>Boom" it should return false.



With the following expression I can only get it to match the first div tags, so it returns true in both cases:



    var text = "<div>Test</div>";
    var text2 = "<div>Test</div><div>Boom";
    var regex = /[^<>]*<(\w+)(?:(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)>[^<>]*<\/\1+\s*>[^<>]*|[^<>]*<\w+(?:(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/>[^<>]*|<!--.*?-->|^[^<>]+$/;
    var match = regex.test(text);
    console.log(match); // true

    var match2 = regex.test(text2);
    console.log(match2); // still true, should be false


How can I fix it so it functions the way I want it to?


Answer



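The test method returns true for text2 because the regex is not anchored: test succeeds as soon as any part of the string matches one of the alternatives, and the unmatched tag after that part is simply ignored.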
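The difference between an unanchored and an anchored pattern can be seen on a deliberately simplified example. This is only a sketch: the two patterns below are cut-down versions without attribute handling, and the sample string and variable names are made up for illustration.

```javascript
// Simplified patterns (no attributes), used only to show the effect of anchoring.
var unanchored = /<(\w+)>[^<>]*<\/\1>/;
var anchored = /^(?:<(\w+)>[^<>]*<\/\1>|[^<>]+)*$/;

// Hypothetical sample: one matched pair followed by an unclosed tag.
var sample = "<b>ok</b><i>oops";

// Unanchored: finding "<b>ok</b>" anywhere is enough.
console.log(unanchored.test(sample)); // true

// Anchored: the whole string must be made of allowed pieces,
// and "<i>oops" is not one of them.
console.log(anchored.test(sample)); // false
```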



In order to fix it, change your regex this way:




    ^(?:<(\w+)(?:(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)>[^<>]*<\/\1+\s*>|<\w+(?:(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/>|<!--.*?-->|[^<>]+)*$
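As a quick check, the anchored pattern can be exercised as follows. This is a minimal sketch under a few assumptions: the sample strings are taken from the question's description, the third alternative is assumed to be the comment pattern <!--.*?--> (the discussion lists comments among the allowed patterns), and the helper name isBalanced is hypothetical.

```javascript
// Anchored pattern: the whole string must consist of allowed pieces.
// Assumption: the third alternative is the comment pattern <!--.*?-->.
var regex = /^(?:<(\w+)(?:(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)>[^<>]*<\/\1+\s*>|<\w+(?:(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/>|<!--.*?-->|[^<>]+)*$/;

// Hypothetical helper name, used only for this illustration.
function isBalanced(text) {
  return regex.test(text);
}

console.log(isBalanced("<div>Test</div>"));          // true
console.log(isBalanced("<div>Test</div><div>Boom")); // false
console.log(isBalanced("plain text"));               // true
console.log(isBalanced("<br /> and <!-- note -->")); // true
```

Note that the body of a tag is matched with [^<>]*, so this pattern only accepts tags that do not contain other tags.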


Description

[Image: regular expression visualization]



Demo




http://jsfiddle.net/r2LsN/



Discussion



The regex first defines all the allowed patterns:




  1. Tags with body: <tag>...</tag>

  2. Tags without body: <tag/> (zero or more spaces may appear before the /)

  3. Comments: <!-- ... -->

  4. Any text that contains neither < nor >.



These patterns can then appear zero or more times between the beginning and the end of the tested string: ^(?:pattern1|pattern2|pattern3|pattern4)*$.
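To make that structure visible, the same regex can be assembled from its four named parts. This is a sketch rather than the answer's own code: the variable names are hypothetical, and the comment pattern <!--.*?--> is an assumption based on item 3 of the list.

```javascript
// Attribute list shared by both tag patterns (copied from the answer's regex).
var attrs = /(?:(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)/.source;

// The four allowed patterns, in the order listed above.
var tagWithBody = "<(\\w+)" + attrs + ">[^<>]*<\\/\\1+\\s*>"; // 1. tag with body
var tagWithoutBody = "<\\w+" + attrs + "\\/>";                 // 2. tag without body
var comment = "<!--.*?-->";                                    // 3. comment (assumed pattern)
var plainText = "[^<>]+";                                      // 4. text without < or >

// ^(?:pattern1|pattern2|pattern3|pattern4)*$
var composed = new RegExp(
  "^(?:" + [tagWithBody, tagWithoutBody, comment, plainText].join("|") + ")*$"
);

console.log(composed.test("<div>Test</div>"));          // true
console.log(composed.test("<div>Test</div><div>Boom")); // false
```

Because the outer group and the attribute list are non-capturing, the tag name in pattern 1 stays capture group 1, so the \1 backreference still points at it after the parts are joined.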

