The manakai project

NAME

perl-web-markup - A pure-Perl HTML and XML processor

MODULES

The following modules are available:

Web::HTML::Parser

An HTML parser.

Web::XML::Parser

An XML parser.

Web::HTML::Serializer

An HTML serializer.

Web::XML::Serializer

An XML serializer.

Web::XPath::Parser

An XPath 1.0 parser.

Web::XPath::Evaluator

An XPath 1.0 evaluator.

Web::HTML::Table

An implementation of the HTML table model.

Web::HTML::Microdata

An implementation of HTML microdata.

Web::RDF::XML::Parser

An RDF/XML parser.

Web::Feed::Parser

An RSS and Atom parser.

Web::HTML::Validator

A DOM conformance checker (for HTML and XML).

Web::GPX::Parser

A GPX parser.

DEPENDENCIES

These modules require Perl 5.14 or later. They require Encode, which is included in the Perl distribution, and modules from the perl-web-encodings package <https://github.com/manakai/perl-web-encodings>, which is a submodule of the Git repository. The Web::RDF::XML::Parser module has additional submodule dependencies (see its documentation for details).

In addition, a DOM implementation is required as the input (and output) of these modules, although there is no direct dependency. For the XPath modules, see Web::XPath::Evaluator for their requirements on the DOM implementation. For the other modules, the DOM implementation must support a subset of the features defined in the DOM Standard, the DOM Parsing and Serialization Standard, DOM3 Core, DOM Document Type Definitions, DOM Perl Binding, and manakai's DOM Extensions. An example of such a DOM implementation is the Web::DOM modules in the perl-web-dom package <https://github.com/manakai/perl-web-dom>.
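As an illustration of how a parser and a DOM implementation fit together, here is a minimal sketch. It assumes that both perl-web-markup and perl-web-dom are installed and that Web::HTML::Parser provides a parse_char_string method that fills a caller-supplied document; check the documentation of each module for the authoritative interface. The helper name parse_html_string is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: parse an HTML character string into a DOM document.
# Returns undef when the non-core packages are not installed.
sub parse_html_string {
  my ($html) = @_;

  # Neither package ships with Perl, so load them lazily and bail out
  # cleanly if either one is missing.
  return undef unless eval {
    require Web::DOM::Document;   # DOM implementation (perl-web-dom)
    require Web::HTML::Parser;    # HTML parser (perl-web-markup)
    1;
  };

  my $doc = Web::DOM::Document->new;
  my $parser = Web::HTML::Parser->new;

  # Assumption: the parser populates the supplied document in place.
  $parser->parse_char_string ($html => $doc);
  return $doc;
}

my $doc = parse_html_string ('<!DOCTYPE html><title>Hello</title><p>World');
if (defined $doc) {
  # Print the local name of the root element of the parsed document.
  print $doc->document_element->local_name, "\n";
} else {
  print "Web::DOM::Document or Web::HTML::Parser is not installed\n";
}
```

The parser only writes into the document object it is given, which is why the DOM implementation is a requirement of the caller rather than a direct dependency of the parser.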

The Web::Feed::Parser and Web::GPX::Parser modules require modules from the perl-web-datetime <https://github.com/manakai/perl-web-datetime> and perl-web-url <https://github.com/manakai/perl-web-url> packages.

Validator modules such as Web::HTML::Validator and Web::RDF::Checker require additional external modules; see their documentation.

SEE ALSO

The perl-web-dom package <https://github.com/manakai/perl-web-dom> implements DOM interfaces, which include standard ways to parse or serialize HTML/XML documents. These interfaces are implemented using the perl-web-markup package.

HISTORY

Most of these modules were originally developed under the name "Whatpm" in 2007-2008 <https://suika.suikawiki.org/www/markup/html/whatpm/readme> and then merged into the manakai-core package <https://suika.suikawiki.org/www/manakai-core/doc/web/>. They were split out again into this separate package in 2013.

DEVELOPMENT

The latest version of these modules is available at the GitHub repository: <https://github.com/manakai/perl-web-markup>.

Test results can be reviewed at Travis CI <https://travis-ci.org/manakai/perl-web-markup>.

Known issues are recorded at <https://manakai.g.hatena.ne.jp/task/4/> and GitHub Issues.

AUTHOR

Wakaba <wakaba@suikawiki.org>.

LICENSE

Copyright 2007-2021 Wakaba <wakaba@suikawiki.org>.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.