The manakai project


Web page conformance checker (validator)


  webhacc [OPTIONS]
  webhacc [OPTIONS] < file.html
  webhacc [OPTIONS] --input "<!DOCTYPE HTML>..."
  webhacc --help
  webhacc --version


The webhacc command is a command-line interface to WebHACC, a Web page conformance checker (i.e. a validator).


If a non-option argument is specified, it is interpreted as a URL or file name to validate. As a special case, the argument - is interpreted as the file "standard input". If the --input option is not specified and no non-option argument is specified, data from the standard input is validated.
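The three ways of supplying input described above can be sketched as small shell wrappers. This is only an illustration: the function names and the WEBHACC variable (naming the command to run) are hypothetical and not part of webhacc itself.

```shell
# Hypothetical wrappers, one per input source described above.
# WEBHACC is an assumed variable naming the command to run;
# it falls back to "webhacc" when unset.

# Validate a URL or a file name given as a non-option argument.
validate_target() {
  "${WEBHACC:-webhacc}" "$1"
}

# Validate the standard input by passing the special argument "-".
validate_stdin() {
  "${WEBHACC:-webhacc}" -
}

# Validate a literal string via the --input option.
validate_string() {
  "${WEBHACC:-webhacc}" --input "$1"
}
```

For example, `validate_stdin < file.html` and `validate_target file.html` would be expected to check the same document, assuming file.html exists in the current directory.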

Unless the --help or --version option is specified, the command validates the specified input and returns the result. The result is written to the standard output, while additional information might be written to the standard error output.

The following options are available:


If this option is specified, run the validator even when the response is in error (a network error or a status code other than 200-299) or is a redirect (a status code in the range 300-399). Otherwise, validation is not performed when the response is in error.


If this option is specified, it is used as the MIME type when the input source is a file or the standard input.

In a future version, this option might also be applied to HTTP input to override the server's Content-Type.


If this option is specified, the value is used as the Unix user name in the cron lines generated by the --generate-cron-lines option (suitable for /etc/crontab or /etc/cron.d). Unless this option is specified, the user name is omitted (suitable for the crontab -e command).


Perform XML DTD validation even when there is no DOCTYPE declaration.


Generate the crontab lines that should be added to schedule updates of the WebHACC script, and exit without validation.


Show the help message and exit without validation.


Assume that the validated document is not intended for the public and that the user is expected to be able to view images. This option affects conformance of the HTML img element.


Validate the specified string as the input (instead of a URL, a file, or the standard input).


Encode the result in JSON.

XXX JSON structure is ...


Disable scripting for the purpose of parsing and validation. By default scripting is enabled. Note that the validator supports no scripting language at the moment. This option affects conformance of noscript elements.


Show a dump of the parsed DOM tree, in a format similar to html5lib's test data (see Web::HTML::Dumper).

In JSON output, the dump is available as the dump value.


Show the innerHTML-serialized value of the parsed DOM tree.

In JSON output, the value is available as the inner_html value.


Show the list of supported standard specifications and exit without validation.


Upgrade the webhacc software.


Show short information on the command and exit without validation. This option can be combined with --json.


If specified, the XML parser will read and process any XML external entities referenced in the XML file.


The LANG environment variable is used to determine the natural language of the output. Note that any character encoding specified by LANG is ignored; the output character encoding is always UTF-8.


When the validation result is positive, or when --help or --version is specified, the command exits with status 0. Otherwise the command exits with a non-zero status.
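Because the outcome is reported through the exit status, the command can be used directly in shell conditionals. The following is a minimal sketch: check_page and the WEBHACC variable are hypothetical names introduced for illustration, and the validator's own output is discarded here so that only the status is used.

```shell
# Hypothetical helper: report PASS or FAIL for one target,
# based only on the exit status of the validator.
# WEBHACC is an assumed variable naming the command to run;
# it falls back to "webhacc" when unset.
check_page() {
  if "${WEBHACC:-webhacc}" "$1" >/dev/null 2>&1; then
    echo "PASS $1"
  else
    echo "FAIL $1"
    return 1
  fi
}
```

A build script could then run, for instance, `check_page index.html || exit 1` to stop when a page does not validate.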


Install by one-line installer

Run the following command:

  $ curl | sh

Wait a few minutes. WebHACC program files are installed into ./local/webhacc and the webhacc runner command is copied as ./webhacc.

Then, set up automatic upgrading (see subsection below).

The WEBHACC_DIR environment variable can be used to specify where WebHACC program files are installed:

  $ curl | \
    WEBHACC_DIR=path/to/webhacc sh

If your system's Perl is older than Perl 5.10, set the PMBP_PERL_VERSION environment variable:

  $ curl | PMBP_PERL_VERSION=latest sh

... so that a newer version of Perl is installed for the WebHACC script (in ./local/webhacc/local/). As this compiles perl, it takes several minutes.

Install step by step

Install make, gcc, perl, git, and wget.

Clone the Git repository in your favorite directory and run the setup command:

  $ git clone git:// path/to/webhacc-cli
  $ cd path/to/webhacc-cli
  $ make deps

Then, invoke the webhacc command in the repository directory (by explicitly specifying the path to the file, by adding the directory to your PATH environment variable, by copying the file to your "bin" directory, or in whatever way you prefer):

  $ cp webhacc ~/bin
  $ cd somewhere
  $ webhacc

Note that the make deps command does not modify any directory or file outside of the repository directory. You can uninstall the application entirely by simply deleting the repository directory.

Then, set up automatic upgrading (see subsection below).

Automated upgrading

As Web standards evolve on a daily basis, the WebHACC program is also updated frequently; without updates, the output of the program could become stale. Once installed, the program can be updated by invoking it with the --upgrade option:

  $ path/to/webhacc --upgrade

This command should be invoked periodically, e.g. by scheduling cron to run it once a week, so that the program is kept up to date. The lines that should be added to your crontab file can be generated by the following command:

  $ path/to/webhacc --generate-cron-lines
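One possible way to add the generated lines to a user crontab is sketched below. It is an assumption about how you might wire this up, not a documented procedure: merged_crontab and the WEBHACC and CRONTAB variables are hypothetical names, and path/to/webhacc stands for wherever the command is installed.

```shell
# Hypothetical helper: print the current user's crontab (if any)
# followed by the lines generated by --generate-cron-lines.
# WEBHACC and CRONTAB are assumed variables naming the commands to run.
merged_crontab() {
  # "crontab -l" fails when no crontab exists yet; ignore that case.
  "${CRONTAB:-crontab}" -l 2>/dev/null
  "${WEBHACC:-path/to/webhacc}" --generate-cron-lines
}
```

Review the output of `merged_crontab` first, then install it with `merged_crontab | crontab -`. Running this more than once would append the generated lines again, so check for duplicates before installing.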


This command requires Perl 5.10 or later.

In addition, it requires various modules for validation. These are available as Git submodules in the modules directory, or can be installed into the local directory in the repository by the make deps command as described in the previous section.


The command supports various Web standard specifications. Run the command with the --specs option to view the list of supported specifications.


Wakaba <>.


Copyright 2007-2015 Wakaba <>.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.