Web::HTML::Tokenizer
An HTML and XML tokenizer
DESCRIPTION
THIS MODULE IS DEPRECATED. DON'T USE THIS MODULE FOR NEW APPLICATIONS.
The Web::HTML::Tokenizer module provides an implementation of an HTML and XML tokenizer. Despite its name, this module can be used for XML documents as well as HTML.
It is not intended to be used directly by general-purpose applications; instead, it is used as part of an HTML or XML parser, such as Web::HTML::Parser and Web::XML::Parser.
The module is intended to be a conforming HTML tokenizer according to the Web Applications 1.0 specification (though it is meaningless to discuss the conformance of a tokenizer in isolation). By setting the XML flag, it can also tokenize XML documents in a way consistent with the HTML tokenization specification. You might consider it an implementation of the XML5 tokenization algorithm as "patched" by later HTML5 development.
SEE ALSO
SPECIFICATIONS
- [HTML]
  HTML Living Standard <http://www.whatwg.org/specs/web-apps/current-work/complete.html#tokenization>.
- [XML]
  XML 1.0 <http://www.w3.org/TR/xml/>.
  XML 1.1 <http://www.w3.org/TR/xml11/>.
  XML5. See <http://suika.suikawiki.org/~wakaba/wiki/sw/n/XML5> for references.
AUTHOR
Wakaba <wakaba@suikawiki.org>.
LICENSE
Copyright 2007-2014 Wakaba <wakaba@suikawiki.org>.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.