Saturday, 13 January 2018

javascript - Regular Expression to match open and closed html tags





I want to come up with a regular expression that returns true if every
closing HTML tag is matched with an opening one in a given text passed in
JavaScript. If there is an unmatched tag, it should return false.



For example, if the following text is passed, it should return true:

"<div>Test</div>"

But if the following text is passed, it should return false:

"<div>Test</div><div>Boom"



With the following expression, I can only get it to match the first pair of
div tags, so it returns true in both cases:



var text = "<div>Test</div>";
var text2 = "<div>Test</div><div>Boom";
var regex =
/[^<>]*<(\w+)(?:(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)>[^<>]*<\/\1+\s*>[^<>]*|[^<>]*<\w+(?:(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/>[^<>]*|<!--.*?-->|^[^<>]+$/;

var match = regex.test(text);
console.log(match); // true
var match2 = regex.test(text2);
console.log(match2); // still true, should be false


How can I fix it so that it works the way I want it to?


Answer





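The test method returns true for text2 because the pattern is not anchored
to the beginning and end of the string: test only needs to find a match
somewhere in the input, and the first matched pair of tags is enough.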
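To illustrate the difference, here is a minimal sketch using deliberately simplified patterns (no attribute handling). The names loose, strict and sample, and the sample input itself, are assumptions made for this illustration and are not taken from the question.

```javascript
// Simplified patterns for illustration only (attributes are not handled).
var loose = /<(\w+)>[^<>]*<\/\1>/;                 // unanchored: any matching pair is enough
var strict = /^(?:<(\w+)>[^<>]*<\/\1>|[^<>]+)*$/;  // anchored: the whole string must fit

var sample = "<div>Test</div><div>Boom"; // assumed input with an unclosed tag

console.log(loose.test(sample));             // true, the first pair matches
console.log(strict.test(sample));            // false, the trailing <div> is unmatched
console.log(strict.test("<div>Test</div>")); // true
```

The unanchored version succeeds as soon as one well-formed pair appears anywhere, which is the behaviour seen in the question.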



To fix it, anchor the whole expression and allow the patterns to repeat:

^(?:<(\w+)(?:(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)>[^<>]*<\/\1+\s*>|<\w+(?:(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/>|<!--.*?-->|[^<>]+)*$


Description

Regular expression visualization:
https://www.debuggex.com/i/6gy-t3B7subnWrvp.png



Demo

http://jsfiddle.net/r2LsN/



Discussion



The regex first defines all the allowed patterns:

  1. Tags with body: <tag>...</tag>

  2. Tags without body: <tag/> (here we can find zero or more
    spaces before />)

  3. Comments: <!-- ... -->

  4. Any text that does not contain < or >.



Then these patterns can appear zero or more times between the beginning and
the end of the tested string:

^(?:pattern1|pattern2|pattern3|pattern4)*$
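The structure described above can be sketched by building the expression from its four parts. This is only an illustration: the names attrs, patterns and isBalanced are hypothetical, and pattern 3 is written as <!--.*?-->, which is an assumption about how comments are matched.

```javascript
// Attribute list shared by patterns 1 and 2 (copied from the regex in the answer).
var attrs = "(?:(?:\\s+\\w+(?:\\s*=\\s*(?:\".*?\"|'.*?'|[^'\">\\s]+))?)+\\s*|\\s*)";

// The four allowed patterns, in the order listed in the discussion.
var patterns = [
  "<(\\w+)" + attrs + ">[^<>]*<\\/\\1+\\s*>", // 1. tag with body
  "<\\w+" + attrs + "\\/>",                     // 2. tag without body
  "<!--.*?-->",                                 // 3. comment (assumed form)
  "[^<>]+"                                      // 4. text without < or >
];

// ^(?:pattern1|pattern2|pattern3|pattern4)*$
var regex = new RegExp("^(?:" + patterns.join("|") + ")*$");

// Hypothetical helper: true when the whole string consists of allowed patterns.
function isBalanced(text) {
  return regex.test(text);
}

console.log(isBalanced("<div>Test</div>"));          // true
console.log(isBalanced("<div>Test</div><div>Boom")); // false
console.log(isBalanced("<br />"));                   // true
console.log(isBalanced("<div><p>x</p></div>"));      // false
```

As the last call shows, pattern 1 only allows plain text in the body, so nested tags are rejected by this expression even when they are properly closed.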


