The manakai project

Web::Transport::ProxyServerConnection

HTTP proxy server connection

SYNOPSIS

  use AnyEvent::Socket;
  use Web::Transport::ProxyServerConnection;

  tcp_server $host, $port, sub {
    my $con = Web::Transport::ProxyServerConnection
        ->new_from_aeargs_and_opts ([@_], {...});
    $con->completed->then (sub {
      warn "Client disconnected and proxy done";
    });
  };

DESCRIPTION

The Web::Transport::ProxyServerConnection module is an implementation of an HTTP proxy server. It wraps an HTTP server's TCP connection socket and forwards any incoming request to the upstream server.

METHODS

See "METHODS" in Web::Transport::GenericServerConnection.

REQUEST HANDLER

A request handler can be specified as the handle_request option of the hash reference passed to the constructor of the proxy server.

A request handler, if specified, is invoked whenever the proxy server has received a request from a client and just before the request is forwarded to the upstream server.

A request handler is a code reference. The code is expected to return a hash reference, or a promise (e.g. Promise) that is to be fulfilled with a hash reference.

The code is invoked with one argument, which is a hash reference. The code can return this same hash reference if the proxy should forward the request as is.

These hash references have the following key/value pairs:

info => $info (argument only)

The metadata of the underlying (downstream) HTTP connection. XXX

request => $request (argument / return value)

A request hash reference. XXX

The argument contains the request that can be forwarded to the upstream (i.e. after any removal of connection-specific headers).

If the returned hash reference does not contain response or error, the request is used to send a request to the upstream server.

client_options => $client_options (return value only)

A hash reference of additional client options, as described for the client method of the handler API object ("HANDLER API OBJECT"), used to obtain the client for making a request with the request argument after the completion of the request handler.

response => $response (return value only)

A response hash reference. XXX

If the returned hash reference contains response, it is used to send a response to the downstream client. No request is made to the upstream server.

error => $error (return value only)

An exception object. It can be any defined Perl value, though an instance of Web::DOM::Error or one of its subclasses is preferred.

When the proxy server has to abort something associated with the handling of the request in question, the exception is used as the error reason.

If the returned hash reference has error and does not have response, an error response is generated from the exception. If the exception's name is Protocol error or Perl I/O error, a 504 response is generated. Otherwise, if the exception's name is HTTP parse error, a 502 response is generated. Otherwise, a 500 response is generated. No request is made to the upstream server.

api => $api (argument only)

The handler API object ("HANDLER API OBJECT") for this invocation of the request handler.

data => $value (return value only)

Any application-specific data. This field can be used to associate the request and response handlers. Any Perl scalar value can be specified.
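The status selection described for the error key can be sketched as follows. This is only an illustration of the documented rules, not the module's own code: status_for_error and the My::Error class are hypothetical names, and the sketch assumes that the exception's name is available through a name method.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: choose the status code of the generated error
# response from the exception's name, following the rules above.
sub status_for_error {
  my ($error) = @_;
  my $name = '';
  # ASSUMPTION: only objects with a |name| method have a name.
  if (blessed $error and $error->can ('name')) {
    $name = $error->name // '';
  }
  return 504 if $name eq 'Protocol error' or $name eq 'Perl I/O error';
  return 502 if $name eq 'HTTP parse error';
  return 500;
}

# Stand-in exception class used only for this illustration.
package My::Error {
  sub new { my ($class, %args) = @_; return bless {%args}, $class }
  sub name { $_[0]->{name} }
}
```

Any value that is not an object with a name, such as a plain string, falls through to the 500 case in this sketch.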

It is important that the proxy server does not allow the upstream server to be the proxy server itself. However, this is in fact a difficult problem: a domain might be resolved into the proxy server's IP address; a proxy server of the proxy server might be misconfigured as the proxy server itself; an upstream server might forward the forwarded request back to the proxy server (i.e. an indirect loop); and so on. It is the request handler's responsibility to reject any abuse or wrong usage of the proxy server.
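A minimal request handler covering the simplest of these cases might look like the sketch below. It rejects requests addressed to a known set of the proxy's own host names and otherwise forwards the request unchanged, storing the URL as data. Several things here are assumptions, because the request and response hash references are not described in this document: the url key with Web::URL-like host, to_ascii, and stringify methods, and the status, status_text, headers, and body keys of the response. The %own_host table and the My::URL and My::Host classes are hypothetical stand-ins. A check by host name does not catch DNS-based or indirect loops.

```perl
use strict;
use warnings;

# Hypothetical set of host names under which this proxy is reachable.
my %own_host = map { $_ => 1 } qw(proxy.example localhost);

my $handle_request = sub {
  my $args = $_[0];
  my $url = $args->{request}->{url};  # ASSUMPTION: a URL object
  my $host = $url->host->to_ascii;    # ASSUMPTION: Web::URL-like methods

  if ($own_host{$host}) {
    # Returning |response| makes the proxy answer directly; no request
    # is made to the upstream server.
    return {response => {
      status => 403, status_text => 'Forbidden',  # ASSUMPTION: key names
      headers => [['Content-Type', 'text/plain']],
      body => "Request loop rejected\n",
    }};
  }

  # Forward the request as is and remember the URL for the response handler.
  return {request => $args->{request}, data => $url->stringify};
};

# Stand-in classes used only to exercise the handler in this sketch.
package My::Host {
  sub new { bless {name => $_[1]}, $_[0] }
  sub to_ascii { $_[0]->{name} }
}
package My::URL {
  sub new { bless {host => $_[1], path => $_[2]}, $_[0] }
  sub host { My::Host->new ($_[0]->{host}) }
  sub stringify { 'http://' . $_[0]->{host} . $_[0]->{path} }
}
```

Such a code reference would be passed as the handle_request option of the constructor.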

RESPONSE HANDLER

A response handler can be specified as the handle_response option of the hash reference passed to the constructor of the proxy server.

A response handler, if specified, is invoked whenever the proxy server has received a response from an upstream server and just before the response is forwarded to the downstream client. It is not invoked when the request handler returns a response (or an error).

A response handler is a code reference. The code is expected to return a hash reference, or a promise (e.g. Promise) that is to be fulfilled with a hash reference.

The code is invoked with one argument, which is a hash reference. The code can return this same hash reference if the proxy should forward the response as is.

These hash references have the following key/value pairs:

info => $info (argument only)

The metadata of the underlying (downstream) HTTP connection. XXX

response => $response (argument / return value)

A response hash reference. XXX

If the returned hash reference contains response, it is used to send a response to the downstream client.

error => $error (return value only)

An exception object. It can be any defined Perl value, though an instance of Web::DOM::Error or one of its subclasses is preferred.

When the proxy server has to abort something associated with the handling of the response in question, the exception is used as the error reason.

If the returned hash reference has error and does not have response, an error response is generated from the exception. If the exception's name is Protocol error or Perl I/O error, a 504 response is generated. Otherwise, if the exception's name is HTTP parse error, a 502 response is generated. Otherwise, a 500 response is generated.

api => $api (argument only)

The handler API object ("HANDLER API OBJECT") for this invocation of the response handler.

data => $value (argument only)

The application-specific data specified by the request handler, if any. The value is not defined if no data was specified.

This can be used to associate the request and response handlers. For example, to save the response data using a file name extracted from the request target URL, the request handler should store the URL as data and the response handler should extract it from data.

closed => $promise (argument only)

A promise (Promise object) which is fulfilled once the response in question has been sent.
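The use of data described above can be sketched as a response handler that derives a file name from a URL string stored by the request handler and then forwards the response unchanged. This is an illustration only: it assumes the request handler stored the URL as a plain string, and @log is a hypothetical in-memory record standing in for actually saving the response.

```perl
use strict;
use warnings;

my @log;  # hypothetical record of what would be saved

my $handle_response = sub {
  my $args = $_[0];
  my $url = $args->{data};  # set by the request handler; may be undefined
  if (defined $url) {
    # Take the last path segment of the URL, ignoring query and fragment.
    my ($file_name) = $url =~ m{/([^/?#]+)(?:[?#].*)?\z};
    push @log, {url => $url, file_name => $file_name // 'index'};
  }
  # Returning the argument forwards the response as is.
  return $args;
};
```

When no data was specified, the sketch records nothing and simply passes the response through.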

HANDLER API OBJECT

The api value of the argument to a request or response handler is a handler API object, which provides convenient methods for use within handlers:

$client = $api->client ($url, $client_options, $api_options)

Return a client (Web::Transport::BasicClient) that is ready to send a request.

The first argument must be a URL (Web::URL object) of the target of the request.

The second argument, if specified, must be a hash reference of additional client options (see "CLIENT OPTIONS" in Web::Transport::BasicClient) used to create a client. By default a set of client options appropriate for the proxy server is used, taking the client option of the proxy server's constructor argument into account, but this argument can be used to override them.

The third argument, if specified, must be a hash reference of additional options used to choose a client. The key option, if specified, sets the short identifier for the client. When the client method is invoked twice with the same origin, client options, and key, the same client is returned if possible. When the method is invoked with a different key, a different client is returned. When the proxy itself fetches a resource, it uses the client whose key is the empty string, which is the default key. The proxy discards the client keyed by the empty string after processing the relevant request/response pairs. If the handlers want to prevent their client from being discarded, they should use their own keys.

This is effectively equivalent to invoking the new_from_url method of the Web::Transport::BasicClient module but this method first looks into the connection pool of the proxy server with appropriate client options. Therefore, if a request or response handler wants to fetch a resource as part of response construction, this method should be used exclusively rather than other HTTP client APIs.

$out_headers = $api->filter_headers ($in_headers, $name => $value, ...)

Remove specified kinds of headers. The first argument must be a canonical headers array reference. The remaining arguments must be zero or more key/value pairs of kinds:

conditional => $boolean

Headers in the "conditional" category, such as If-Modified-Since.

proxy_removed => $boolean

Headers removed by proxies upon forwarding, such as Transfer-Encoding, including headers specified in any Connection: header.

names => {$name => $boolean, ...}

Headers with specified names. The value must be a hash reference, whose keys are header names in lowercase and values are boolean true values.

It returns a new canonical headers array reference.
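The effect of the names kind can be pictured with the stand-in below. It is not the module's implementation and does not cover the conditional or proxy_removed kinds. remove_named_headers is a hypothetical function, and the sketch assumes that each header in the array is an array reference whose first item is the header name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for
#   $api->filter_headers ($in_headers, names => {cookie => 1})
# ASSUMPTION: each header is an array reference, name first.
sub remove_named_headers {
  my ($in_headers, $names) = @_;
  # Keys of $names are lowercase, so compare the lowercased header name.
  # A new array reference is returned; the input is left unchanged.
  return [grep { not $names->{lc $_->[0]} } @$in_headers];
}

my $in_headers = [
  ['Cookie', 'a=b'],
  ['Accept', '*/*'],
  ['cookie', 'c=d'],
];
my $out_headers = remove_named_headers ($in_headers, {cookie => 1});
```

Here only the Accept header remains in $out_headers, regardless of how the removed header names were capitalized.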

$api->note ($message, level => $debug)

Provide an informative or debug message for the application's user or developer. The first argument must be a short character string. The remaining arguments must be zero or more key/value pairs of options.

The level option is the verbosity level of the message; the message is reported to the standard error output only when the debug level (server's debug option's value) is greater than or equal to the level value; the default is zero, i.e. always reported.
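The reporting rule can be written as a small predicate. is_reported is a hypothetical name used only to restate the comparison described above; it is not part of the handler API object.

```perl
use strict;
use warnings;

# Hypothetical predicate: would a message with the given level be
# reported when the server's debug option has the given value?
sub is_reported {
  my ($debug, $level) = @_;
  $level //= 0;  # the default level is zero
  return $debug >= $level ? 1 : 0;
}
```

For instance, with a debug value of 0, a message with level 1 is suppressed, while a message with no level is still reported.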

FEATURES NOT SUPPORTED BY THIS MODULE

The module does not support HTTP proxy authentication. It can be implemented within a request handler.

The module does not support HTTP caches. A cache can be implemented by consulting it for a cached response within a request handler and caching the received response within a response handler.

The module intentionally does not add HTTP Via:, Forwarded:, and X-Forwarded-*: headers to forwarded requests and responses. They can be added within request and response handlers.

This module does not support the HTTP TRACE method.

SEE ALSO

AnyEvent::Socket.

AUTHOR

Wakaba <wakaba@suikawiki.org>.

LICENSE

Copyright 2016-2018 Wakaba <wakaba@suikawiki.org>.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.