Monday, 6 November 2017

php - Finding HTML tags in string


I know this question has been asked on SO before, but I can't find the right one and I still suck at regex :/




I have a string, and that string is valid HTML. Now I want to find all the tags with a certain name and attribute.



I tried this regex (i.e. div with type): /(<div type="special_type" src="(.*?)<\/div>)/.



Example string:

<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>




If I use preg_match then I only get

<div type="special_type" src="bla"> match me</div>

which is logical, because the other one has its attributes in a different order.



What regex do I need to get the following array when using preg_match on the example string?



array(0 => '<div type="special_type" src="bla"> match me</div>',
      1 => '<div src="blaw" type="special_type" > match me too</div>')

Answer





A general piece of advice: don't use regex to parse HTML. It will get messy if the HTML changes.
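To illustrate how quickly it gets messy, here is a rough sketch of a regex that accepts the attributes in any order. The sample markup is reconstructed from the question, and the pattern is only one possible approach, not a recommendation. It assumes double-quoted attribute values, no `>` inside attribute values, and no nested divs; any of those would break it.

```php
<?php
// Sample input modelled on the question's example string (an assumption).
$html = <<<'EOF'
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
EOF;

// The lookahead requires type="special_type" somewhere inside the opening
// tag, so the attribute order no longer matters. The lazy .*? stops at the
// first </div>, which is why nested divs are not handled.
$pattern = '/<div\b(?=[^>]*\stype="special_type")[^>]*>.*?<\/div>/s';

// preg_match_all collects every match; preg_match would stop after the first.
preg_match_all($pattern, $html, $matches);

print_r($matches[0]);
```

Note that `preg_match_all` is needed here to collect both elements, since `preg_match` returns only the first match.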



Use DOMDocument instead:



$str = <<<EOF
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($str);
$selector = new DOMXPath($doc);

// select every div that carries type="special_type", whatever the attribute order
$result = $selector->query('//div[@type="special_type"]');

// loop through all found items
foreach ($result as $node) {
    echo $node->getAttribute('src');
}
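If the goal is the array of matching elements from the question rather than just the src values, a minimal sketch along the same lines could serialize each node with `DOMDocument::saveHTML($node)`. The variable names and the sample markup here are illustrative assumptions. The serialized markup is normalized by the parser, so whitespace inside the tags may differ slightly from the input.

```php
<?php
// Sample input modelled on the question's example string (an assumption).
$html = <<<'EOF'
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$matches = [];
foreach ($xpath->query('//div[@type="special_type"]') as $node) {
    // Passing a node to saveHTML() serializes only that node,
    // including its own opening and closing tags.
    $matches[] = $doc->saveHTML($node);
}

print_r($matches);
```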

