The manakai project


HTML microdata


  use Web::HTML::Microdata;
  $doc->inner_html (q{
    <p itemscope>
      <span itemprop=a>bb</span>
      <img itemprop=b src="" alt=Logo>
    </p>
  });
  $md = Web::HTML::Microdata->new;
  $items = $md->get_top_level_items ($doc);
      # [
      #   {type => 'item', node => $doc->query_selector ('p'),
      #    props => {
      #      a => [{type => 'string', text => 'bb',
      #             node => $doc->query_selector ('span')}],
      #      b => [{type => 'url', text => '',
      #             node => $doc->query_selector ('img')}],
      #    },
      #    types => {}, id => undef},
      # ]


The Web::HTML::Microdata module provides access to microdata items in the document.


The following methods are available:

$md = Web::HTML::Microdata->new

Create a new instance of the microdata implementation.

$code = $md->onerror
$md->onerror ($code)

Get or set the error handler for the implementation. Microdata errors, warnings, and additional processing information are all reported to the handler. See <> for details of error handling.

The value should not be set while the implementation is running. If the value is changed while it is running, the result is undefined.
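
As a rough sketch of how a handler might be written, the code below collects reports in an array instead of printing them. It assumes that the handler is invoked with a list of name/value pairs (the keys type, level, and node used here are assumptions, not taken from this document), and it simulates one invocation by hand so that it runs without the module.

```perl
use strict;
use warnings;

my @reports;

# Collect every report for later inspection.
my $onerror = sub {
  my %args = @_;  # assumed: name/value pairs describing the report
  push @reports, \%args;
};

# Simulated invocation, standing in for a call made by the implementation.
# The argument names and values are illustrative assumptions.
$onerror->(type => 'example error', level => 'm', node => 'element');

# With the real module, the handler would be registered before processing:
#   $md->onerror ($onerror);
```

Registering the handler before calling any processing method is consistent with the note above that the value should not be changed while the implementation is running.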

$items = $md->get_top_level_items ($node)

Return an array reference of top-level microdata items in the subtree rooted at the specified node. The argument must be a DOM Element, Document, or DocumentFragment.

Each element of the returned array is an "item" data, described later in this document.

$item = $md->get_item_of_element ($element)

Return a top-level microdata item created by the specified element. The argument must be a DOM Element.

The element must be an element that creates an item (i.e. an HTML element with the itemscope attribute specified). If the specified element does not create an item according to the spec, the result might not be meaningful.

The method returns an "item" data, described later in this document.



An "item" data is a hash reference containing the following name/value pairs:

type

Always the string item.

node

The Element that created the microdata item.

props

The hash reference containing properties of the microdata item. The hash keys are property names in the item. The hash values are the corresponding property values, represented as array references of zero or more values. Property values are represented as "value" data. Property values are sorted in tree order of the elements in which the values are contained.

types

The hash reference containing types of the microdata item. Note that this is a different member from type. The hash keys are item types. Each hash value indicates whether the item has that item type.

id

The global identifier of the microdata item, if any.
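
To illustrate the structure, the following sketch builds an "item" data by hand, mirroring the synopsis, and reads it back. Plain strings stand in for DOM nodes, the item type URL is a placeholder added for illustration, and the helpers item_has_type and first_text are hypothetical names that are not part of Web::HTML::Microdata.

```perl
use strict;
use warnings;

# Hand-built "item" data; strings stand in for DOM nodes (assumption).
my $item = {
  type => 'item',
  node => 'p-element',
  props => {
    a => [{type => 'string', text => 'bb', node => 'span-element'}],
    b => [{type => 'url', text => '', node => 'img-element'}],
  },
  types => {'http://example.com/Thing' => 1},  # placeholder item type
  id => undef,
};

# Hypothetical helper: true if the item has the given item type.
sub item_has_type {
  my ($item, $type) = @_;
  return !!$item->{types}->{$type};
}

# Hypothetical helper: text of the first string or URL value of a
# property, or undef if the property has no such value.
sub first_text {
  my ($item, $name) = @_;
  for my $value (@{$item->{props}->{$name} || []}) {
    return $value->{text}
        if $value->{type} eq 'string' or $value->{type} eq 'url';
  }
  return undef;
}

print first_text ($item, 'a'), "\n";
```

Because each property maps to an array reference, a caller that expects a single value still has to pick one, as first_text does here.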


The "value" data is a hash reference with one of the following structures:

An "item" data

The value is a microdata item.

{type => 'error', node => $node}

The value is a microdata item, but it is not expanded into full "item" data, to prevent the overall data structure from containing a loop. Another full "item" data created from the same element exists elsewhere in the structure. This situation is non-conforming.

As a result, the "value" data is always a directed acyclic graph (DAG).

{type => 'string', text => $text, node => $node}

The value is a string $text. The value is contained in the element $node.

{type => 'url', text => $text, node => $node}

The value is a string $text, obtained from an attribute whose value is a URL. The owner of the attribute is $node.
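
The four kinds of "value" data can be handled by dispatching on the type member. The sketch below flattens a hand-built item into indented text lines. The function dump_item is a hypothetical helper, not part of the module, and strings stand in for DOM nodes.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten an "item" data into indented text lines.
sub dump_item {
  my ($item, $indent) = @_;
  $indent = '' unless defined $indent;
  my @lines;
  for my $name (sort keys %{$item->{props}}) {
    for my $value (@{$item->{props}->{$name}}) {
      if ($value->{type} eq 'item') {
        # Nested item: recurse with deeper indentation.
        push @lines, "$indent$name: (item)";
        push @lines, dump_item ($value, "$indent  ");
      } elsif ($value->{type} eq 'error') {
        # Loop-avoidance placeholder: there is nothing to descend into.
        push @lines, "$indent$name: (error)";
      } else {
        # 'string' or 'url'
        push @lines, "$indent$name: $value->{type} \"$value->{text}\"";
      }
    }
  }
  return @lines;
}

# Hand-built sample data (assumption: strings stand in for DOM nodes).
my $inner = {type => 'item', node => 'div', props => {
  name => [{type => 'string', text => 'Alice', node => 'span'}],
  self => [{type => 'error', node => 'div'}],
}, types => {}, id => undef};
my $outer = {type => 'item', node => 'section', props => {
  author => [$inner],
  link => [{type => 'url', text => 'http://example.com/', node => 'a'}],
}, types => {}, id => undef};

my @lines = dump_item ($outer);
print "$_\n" for @lines;
```

Since an error value is never expanded, the recursion terminates even when the source document contains a microdata loop.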



HTML Standard - Microdata <>.


Wakaba <>.


Copyright 2014 Wakaba <>.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.