The manakai project

webhacc

Web page conformance checker (validator)

SYNOPSIS

  webhacc [OPTIONS] http://www.example.com/
  webhacc [OPTIONS] < file.html
  webhacc [OPTIONS] --input "<!DOCTYPE HTML>..."
  webhacc --help
  webhacc --version

DESCRIPTION

The webhacc command is a command-line interface to WebHACC, a Web page conformance checker (i.e. a validator).

ARGUMENTS

If a non-option argument is specified, it is interpreted as a URL or file name to validate. As a special case, the argument - is interpreted as the file "standard input". If the --input option is not specified and no non-option argument is specified, data from the standard input is validated.

Unless the --help or --version option is specified, the command validates the specified input and returns the result. The result is written to the standard output, while additional information might be written to the standard error output.

The following options are available:

--check-error-response

If this option is specified, the validator is run even when the response is an error (a network error or a status code other than 200-299) or a redirect (a status code in the range 300-399). Otherwise, validation is not performed when the response is in error.

--content-type=mime-type

If this option is specified, it is used as the MIME type when the input source is a file or the standard input.

In a future version, this option might also be applied to HTTP input to override the server's Content-Type.

--cron-user=user-name

If this option is specified, the value is used as the Unix user name in the cron lines generated by the --generate-cron-lines option (suitable for /etc/crontab or /etc/cron.d). Unless this option is specified, the user name is omitted (suitable for the crontab -e command).

--dtd-validation

Perform XML DTD validation even when there is no DOCTYPE declaration.

--generate-cron-lines

Generate the crontab lines needed to schedule updates of the WebHACC script, then exit without validation.

--help

Show the help message and exit without validation.

--image-viewable

Assume that the validated document is not intended for the public and that its users are expected to be able to view images. This option affects the conformance of the HTML img element.

--input=string

Validate the specified string as the input (instead of a URL or the standard input).

--json

Encode the result in JSON.

XXX JSON structure is ...

--noscript

Disable scripting for the purpose of parsing and validation. By default, scripting is enabled. Note that the validator does not support any scripting language at the moment. This option affects the conformance of noscript elements.

--show-dump

Show a dump of the parsed DOM tree, in a format similar to html5lib's test data (see Web::HTML::Dumper).

In JSON output, the dump is available as the dump value.

--show-inner-html

Show the innerHTML-serialized value of the parsed DOM tree.

In JSON output, the value is available as the inner_html value.

--specs

Show the list of supported standard specifications and exit without validation.

--upgrade

Upgrade the webhacc software.

--version

Show brief information about the command and exit without validation. This option can be combined with --json.

--xml-external-entities

If specified, the XML parser will read and process any XML external entities referenced in the XML file.
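
As an illustration of how the options above can be combined, the following sketch wraps two invocations in shell functions. The function names and the WEBHACC variable (the path to the command, defaulting to webhacc on your PATH) are assumptions made for this example only; the options themselves are the ones documented above.

```shell
# Path to the webhacc command; override it if webhacc is not on your PATH.
# (WEBHACC is a name chosen for this example; webhacc itself does not read it.)
WEBHACC=${WEBHACC:-webhacc}

# Validate a local file via the standard input with an explicit MIME type.
# usage: check_file FILE MIME-TYPE
check_file() {
  "$WEBHACC" --content-type="$2" < "$1"
}

# Validate a string given on the command line and request JSON output.
# usage: check_string STRING
check_string() {
  "$WEBHACC" --json --input="$1"
}
```

For example:

  $ check_file index.html text/html
  $ check_string "<!DOCTYPE HTML><title>Test</title>"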

ENVIRONMENT VARIABLE

The LANG environment variable is used to determine the natural language of the output. Note that any character encoding specified by LANG is ignored; the output character encoding is always UTF-8.

EXIT STATUS

When the validation result is positive, as well as when --help or --version is specified, the command exits with status 0. Otherwise the command exits with a non-zero status.
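
Because the exit status carries the overall result, the command can be used directly in shell conditionals. The following is a minimal sketch; report_status is a hypothetical helper name, and since this document does not say whether different failures produce different codes, the sketch treats every non-zero status the same way.

```shell
# usage: report_status COMMAND [ARGS...]
# Runs the given command, discards its output, and reports the outcome
# based only on the exit status (0 means success, anything else failure).
report_status() {
  if "$@" > /dev/null 2>&1; then
    echo "OK"
  else
    echo "FAILED"
    return 1
  fi
}
```

For example:

  $ report_status webhacc http://www.example.com/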

INSTALL

Install by one-line installer

Run the following command:

  $ curl https://wakaba.github.io/packages/webhacc | sh

Wait a few minutes. WebHACC program files are installed into ./local/webhacc and the webhacc runner command is copied as ./webhacc.

Then, set up automatic upgrading (see subsection below).

The WEBHACC_DIR environment variable can be used to specify where WebHACC program files are installed:

  $ curl https://wakaba.github.io/packages/webhacc | \
    WEBHACC_DIR=path/to/webhacc sh

If your system's Perl is older than Perl 5.10, set the PMBP_PERL_VERSION environment variable:

  $ curl https://wakaba.github.io/packages/webhacc | PMBP_PERL_VERSION=latest sh

This causes a newer version of Perl to be installed for the WebHACC script (in ./local/webhacc/local/). Because this compiles perl, it takes several minutes.

Install step by step

Install make, gcc, perl, git, and wget.

Clone the Git repository in your favorite directory and run the setup command:

  $ git clone https://github.com/manakai/webhacc-cli path/to/webhacc-cli
  $ cd path/to/webhacc-cli
  $ make deps

Then, invoke the webhacc command in the repository directory (by explicitly specifying the path to the file, by adding the directory to your PATH environment variable, by copying the file to your "bin" directory, or by your favorite way):

  $ cp webhacc ~/bin
  $ cd somewhere
  $ webhacc http://example.com/

Note that the make deps command does not modify any directory or file outside of the repository directory. You can uninstall the application entirely by simply deleting the repository directory.

Then, set up automatic upgrading (see subsection below).

Automated upgrading

As Web standards evolve on a daily basis, the WebHACC program is also updated frequently; otherwise the output of the program could become stale. Once installed, the program can be updated by invoking it with the --upgrade option:

  $ path/to/webhacc --upgrade

This command should be invoked periodically by, e.g., scheduling cron to run the command once a week, so that the program is kept up-to-date. The crontab lines that should be added to your crontab file can be generated by the following command:

  $ path/to/webhacc --generate-cron-lines
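
For a system-wide setup, the generated lines can be written to a file under /etc/cron.d together with a user name, as described for the --cron-user option. The following is only a sketch: write_cron_file is a hypothetical helper, and the user name and output file name in the usage line are example values, not defaults of the command.

```shell
# usage: write_cron_file WEBHACC-PATH USER-NAME OUTPUT-FILE
# Writes the generated cron lines, including the user name field,
# to OUTPUT-FILE, replacing any previous content of that file.
write_cron_file() {
  "$1" --generate-cron-lines --cron-user="$2" > "$3"
}
```

For example (writing under /etc/cron.d normally requires root privileges):

  $ write_cron_file path/to/webhacc www-data /etc/cron.d/webhacc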

DEPENDENCY

This command requires Perl 5.10 or later.

In addition, it requires various modules for validation. They are Git submodules in the modules directory, or can be installed to the local directory in the repository by the make deps command as described in the previous section.

SPECIFICATIONS

The command supports various Web standard specifications. Run the command with the --specs option to view the list of supported specifications.

AUTHOR

Wakaba <wakaba@suikawiki.org>.

LICENSE

Copyright 2007-2015 Wakaba <wakaba@suikawiki.org>.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.