Friday 13 October 2017

xml - Parsing RDFa in html/xhtml?

I am using the RDF::RDFa::Parser module in Perl to parse RDF data
out of a website.

On a website with <!DOCTYPE HTML PUBLIC
"-//W3C//DTD HTML 4.01 Transitional//EN"> it works, but on sites using XHTML, i.e. <!DOCTYPE
html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">,
there is no output.



Test website: http://www.filmstarts.de/kritiken/186918.html



use RDF::RDFa::Parser;

my $url = 'http://www.filmstarts.de/kritiken/186918.html';
my $options = RDF::RDFa::Parser::Config->tagsoup;
my $rdfa = RDF::RDFa::Parser->new_from_url($url, $options);

print $rdfa->opengraph('image');
print $rdfa->opengraph('description');

