The manakai project


URL parser




The following methods are available:

$parser = Web::URL::Parser->new

Create a new parser.

$url = $parser->parse_proxy_env ($string)

Parse a string with the proxy environment variable parser. If parsing fails, undef is returned. Otherwise, a URL record object (Web::URL) is returned.

This method is appropriate for parsing the value of the http_proxy, https_proxy, or ftp_proxy environment variable, after the value has been decoded using the platform's locale-dependent character encoding.
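
The following is a minimal sketch of how this method might be called, not a definitive usage pattern. The helper pick_proxy_value is hypothetical (it is not part of Web::URL::Parser), and the sketch assumes the variable's value is ASCII-only, so no explicit decoding step is shown.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first non-empty value among the
# named variables. The environment is passed as a hash reference so
# that the helper can be exercised without touching the real %ENV.
sub pick_proxy_value {
  my ($env, @names) = @_;
  for my $name (@names) {
    my $value = $env->{$name};
    return $value if defined $value and length $value;
  }
  return undef;
}

my $value = pick_proxy_value (\%ENV, 'http_proxy');

# The module is loaded lazily so that this sketch still runs where
# Web::URL::Parser is not installed.
if (defined $value and eval { require Web::URL::Parser; 1 }) {
  my $parser = Web::URL::Parser->new;
  # Assumption: $value is ASCII-only. A non-ASCII value should be
  # decoded from the locale's character encoding first.
  my $url = $parser->parse_proxy_env ($value);
  if (defined $url) {
    print "Proxy setting parsed into a URL record\n";
  } else {
    print "Proxy setting could not be parsed\n";
  }
}
```

Checking the return value with defined matters here, because a malformed proxy setting yields undef rather than an exception.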

$result = $parser->split_by_urls ($string, NAME => VALUE, ...)

Extract URLs from free-form text for autolinking.

The first argument must be the text string to be parsed.

The second and later arguments are interpreted as name/value pairs of options. If the lax option is set to a true value, parsing is performed in the lax mode. A public Web application (e.g. a forum service interpreting user-posted entries) should not use the lax mode. A client application (e.g. an e-mail client displaying a plain-text mail message) should use the lax mode.

This method only extracts http: and https: URLs. In the lax mode, ttp: and ttps: URL schemes are also detected and are interpreted as http: and https:, respectively.

This method returns an array reference. Its items are array references, each representing a substring of the input text, in order. An inner array is either a text array or a link array. A text array's item at index 0 is a text string, representing a substring that is not a URL. A link array's item at index 0 is a text string, representing a substring that is interpreted as a URL, and its item at index 1 is a text string, which can be used as an input to the URL parser.


    $result = $parser->split_by_urls ("See later!");
    # $result = [
    #   ["See "],
    #   ["", ""],
    #   [" later!"],
    # ];

    $result = $parser->split_by_urls ("ttps://", lax => 1);
    # $result = [
    #   ["", "ttps://"],
    # ];
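
As a rough illustration of consuming this structure, the sketch below renders a result of the documented shape as HTML. The helpers escape_html and render_autolinked are hypothetical, the sample data is hand-written rather than actual parser output, and the code assumes a link array can be recognized by a defined item at index 1. For brevity it uses that item directly as the href value. A real application would first pass it to the URL parser and use the resulting URL.

```perl
use strict;
use warnings;

# Escape characters that are significant in HTML text and in
# double-quoted attribute values.
sub escape_html {
  my ($s) = @_;
  $s =~ s/&/&amp;/g;
  $s =~ s/</&lt;/g;
  $s =~ s/>/&gt;/g;
  $s =~ s/"/&quot;/g;
  return $s;
}

# Hypothetical helper: turn a split_by_urls-style result into HTML.
sub render_autolinked {
  my ($result) = @_;
  my $html = '';
  for my $item (@$result) {
    if (defined $item->[1]) {
      # Link array: index 0 is the original text, index 1 is the
      # string intended as input to the URL parser.
      $html .= sprintf '<a href="%s">%s</a>',
          escape_html ($item->[1]), escape_html ($item->[0]);
    } else {
      # Text array: index 0 is plain text.
      $html .= escape_html ($item->[0]);
    }
  }
  return $html;
}

# Hand-written sample in the documented shape (an assumption, not
# output captured from the parser).
my $sample = [
  ["See "],
  ["http://example.com/?a=1&b=2", "http://example.com/?a=1&b=2"],
  [" later!"],
];
print render_autolinked ($sample), "\n";
# prints: See <a href="http://example.com/?a=1&amp;b=2">http://example.com/?a=1&amp;b=2</a> later!
```

Escaping both the link text and the attribute value keeps user-supplied text from injecting markup, which matters most for the public Web application case described above.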


Web Transport Processing.


Wakaba.


