Web::RDF::XML::Parser
An RDF/XML parser
SYNOPSIS
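use Web::RDF::XML::Parser;
my @result;
my $rdf = Web::RDF::XML::Parser->new;
# Collect each extracted triple as a hash reference.
$rdf->ontriple (sub {
  push @result, {@_};
});
# $doc is a DOM Document (e.g. a Web::DOM::Document object).
$rdf->convert_document ($doc);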
DESCRIPTION
The Web::RDF::XML::Parser module is an implementation of RDF/XML. Using this module, RDF triples embedded within an RDF/XML document or document fragment can be extracted.
The RDF/XML format is no longer widely used. Though this module is still maintained as part of the manakai project, use of it is not recommended.
This module is unsuitable for processing RSS 1.0 documents. Use Web::Feed::Parser instead.
METHODS
The following methods are available:
$rdf = Web::RDF::XML::Parser->new
-
Create an RDF/XML parser.
$rdf->convert_document ($doc)
-
Extract the triples from a document. The argument must be a DOM Document (e.g. a Web::DOM::Document object). Extracted triples are reported through the ontriple callback.
$rdf->convert_rdf_element ($doc)
-
Extract the triples from an element. The argument must be a DOM Element containing the triples, e.g. an rdf:RDF element. Extracted triples are reported through the ontriple callback.
$rdf->ontriple ($code)
$code = $rdf->ontriple
-
Get or set the callback function which is invoked for each triple extracted from the document.
The callback is invoked with the following name/value pairs as arguments: subject, predicate, object, and node. The callback is not expected to throw any exception. The values subject, predicate, and object are parsed term data structures (see Web::RDF::Checker). The node from which the triple is extracted is given as node.
$rdf->onbnodeid ($code)
$code = $rdf->onbnodeid
-
Get or set the code reference that is invoked whenever a blank node identifier is to be constructed.
The code is invoked with an argument, which is used within the module to identify a blank node. The code can return the argument as is, or it can return a modified copy of the argument. Either way, the returned value is used as the blank node identifier. The code must return the same value for the same argument and different values for different arguments. The code is not expected to throw any exception.
This hook is useful when a document contains multiple RDF fragments whose blank nodes have to be distinguished from each other.
The value should not be set while the parser is running. If the value is changed, the result is undefined.
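As an illustration of the requirements above, here is a minimal sketch of an onbnodeid callback that scopes blank node identifiers to a fragment by prefixing them. The helper name make_bnodeid_mapper and the labels frag1 and frag2 are hypothetical and not part of this module. The parser calls are shown only as comments so that the sketch runs without the module installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Web::RDF::XML::Parser): build a
# callback that prefixes every blank node identifier with a label.
# A fixed prefix keeps the mapping deterministic (same argument, same
# result) and distinct (different arguments, different results).
sub make_bnodeid_mapper {
  my ($fragment_label) = @_;
  return sub {
    my ($internal_id) = @_;
    return $fragment_label . ':' . $internal_id;
  };
}

my $mapper_a = make_bnodeid_mapper ('frag1');
my $mapper_b = make_bnodeid_mapper ('frag2');

# Assumed usage: set the hook before each conversion starts, never
# while the parser is running.
#   $rdf->onbnodeid ($mapper_a);
#   $rdf->convert_rdf_element ($first_element);
#   $rdf->onbnodeid ($mapper_b);
#   $rdf->convert_rdf_element ($second_element);
```

With this arrangement the same internal identifier seen in two fragments yields two different blank node identifiers, as long as the labels themselves differ.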
$code = $rdf->onerror
$rdf->onerror ($code)
-
Get or set the error handler for the parser. Any parse error, as well as warning and additional processing information, is reported to the handler. See
<https://github.com/manakai/data-errors/blob/master/doc/onerror.txt>
for details of error handling.
The value should not be set while the parser is running. If the value is changed, the result is undefined.
$code = $rdf->onnonrdfnode
$rdf->onnonrdfnode ($code)
-
Get or set the code reference that is invoked whenever a non-RDF node is detected. Note that use of such a node in an RDF/XML fragment is non-conforming. This hook is intended for injecting validation code (e.g. by Web::HTML::Validator). Note that the node can be a misplaced rdf:RDF element, for example.
The code is invoked with an argument, which is the node in question. The code is expected not to throw any exception. The value should not be set while the parser is running. If the value is changed, the result is undefined.
$code = $rdf->onattr
$rdf->onattr ($code)
-
Get or set the code reference that is invoked whenever an attribute is encountered by the parser. This hook is intended for injecting validation code (e.g. by Web::HTML::Validator).
The code is invoked with two arguments: the node in question and the type of the attribute, which is one of the following:
  common   Normal attributes (e.g. xml:lang="" and xmlns="")
  url      RDF/XML attributes whose value is a URL
  rdf-id   RDF/XML attributes whose value is an rdf-id (NCName)
  string   RDF/XML attributes whose value is a string
  misc     Other RDF/XML attributes
The code is expected not to throw any exception. The value should not be set while the parser is running. If the value is changed, the result is undefined.
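The two-argument calling convention of onattr can be sketched as follows. This is only an illustration: the handler tallies how often each attribute type is reported, and the variable names are hypothetical. The parser calls are shown as comments, and the handler is invoked by hand with placeholder nodes so that the sketch runs on its own.

```perl
use strict;
use warnings;

# Tally of attribute types reported through the hook.
my %attr_count;

# The handler receives the node in question and the attribute type
# (common, url, rdf-id, string, or misc). It must not throw.
my $onattr = sub {
  my ($node, $type) = @_;
  $attr_count{$type}++;
};

# Assumed usage, set before the conversion starts:
#   $rdf->onattr ($onattr);
#   $rdf->convert_document ($doc);

# Simulated invocations; undef stands in for real attribute nodes.
$onattr->(undef, 'url');
$onattr->(undef, 'url');
$onattr->(undef, 'common');
```

A validator would normally inspect the node itself; counting by type is just the smallest handler that uses both arguments.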
ERROR HANDLING
This module extracts RDF triples from an RDF/XML fragment using the algorithm described in the RDF/XML specification. When the input does not conform to the grammar, it tries to recover from the error in the most "natural" way; it might or might not report additional triples depending on how the input is non-conforming.
In most cases where the input is non-conforming, the module reports one or more errors through the onerror handler. To detect all the conformance errors, you have to use a conformance checker (e.g. Web::HTML::Validator) that invokes this module with appropriate hooks and postprocessors.
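A minimal sketch of collecting reported errors is shown below. It assumes that the onerror handler is invoked with name/value pairs; that convention, and the type and level keys used in the simulated call, are assumptions here, and the onerror document referenced under METHODS is the authority on the actual arguments. The parser calls are shown as comments so that the sketch runs without the module installed.

```perl
use strict;
use warnings;

# Errors, warnings, and informational messages are all reported to
# the same handler, so collect everything and filter afterwards.
my @errors;
my $onerror = sub {
  my %error = @_;   # assumption: arguments are name/value pairs
  push @errors, \%error;
};

# Assumed usage, set before the conversion starts:
#   $rdf->onerror ($onerror);
#   $rdf->convert_document ($doc);

# Simulated report with hypothetical values.
$onerror->(type => 'example:error', level => 'm');

# An empty list does not prove conformance; a conformance checker is
# still needed to detect every error.
my $reported = scalar @errors;
```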
DEPENDENCY
Perl 5.8 or later is required.
This module requires the Web::URL::Canonicalize module in the perl-web-url repository <https://github.com/manakai/perl-web-url>.
In addition, it expects DOM objects (e.g. Web::DOM::Document and Web::DOM::Element from <https://github.com/manakai/perl-web-dom>) as input, although there is no direct dependency.
SPECIFICATIONS
- RDFXML
-
RDF 1.1 XML Syntax <https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-xml/index.html>.
- XMLBASE
-
XML Base <https://www.w3.org/TR/xmlbase/>.
XML Base Specification Errata <https://www.w3.org/2009/01/xmlbase-errata>.
- VALLANGS
-
DOM Tree Validation <https://rawgit.com/manakai/spec-dom/409d6f6c0685e96c5b0d2c7aeb894ed567f0d651/validation-langs.html#rdf/xml-integration>.
AUTHOR
Wakaba <wakaba@suikawiki.org>.
LICENSE
Copyright 2013-2018 Wakaba <wakaba@suikawiki.org>.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.