The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
=encoding utf-8

=head1 NAME

PSGI - Perl Web Server Gateway Interface Specification

=head1 ABSTRACT

This document specifies a standard interface between web servers and
Perl web applications or frameworks. This interface is designed to promote web application
portability and reduce the duplication of effort by web application
framework developers.

Please keep in mind that PSGI is not Yet Another web application
framework. PSGI is a specification to decouple web server environments
from web application framework code. Nor is PSGI a web application
API. Web application developers (end users) will not run their web
applications directly using the PSGI interface, but instead are
encouraged to use frameworks that support PSGI.

=head1 TERMINOLOGY

=over 4

=item Web Servers

I<Web servers> accept HTTP requests issued by web clients,
dispatching those requests to web applications if configured to do so,
and return HTTP responses to the request-initiating clients.

=item PSGI Server

A I<PSGI Server> is a Perl program providing an environment for a
I<PSGI application> to run in.

PSGI specifying an interface for web applications and the main purpose
of web applications being to be served to the Internet, a I<PSGI
Server> will most likely be either: part of a web server (like Apache
mod_perl), connected to a web server (with FastCGI, SCGI), invoked by
a web server (as in plain old CGI), or be a standalone web server
itself, written entirely or partly in Perl.

There is, however, no requirement for a I<PSGI Server> to actually be
a web server or part of one, as I<PSGI> only defines an interface
between the server and the application, not between the server and the
world.

A I<PSGI Server> is often also called I<PSGI Application Container>
because it is similar to a I<Java Servlet container>, which is Java
process providing an environment for I<Java Servlets>.

=item Applications

I<Web applications> accept HTTP requests and return HTTP responses.

I<PSGI applications> are web applications conforming to the PSGI interface,
prescribing they take the form of a code reference
with defined input and output.

For simplicity,
I<PSGI Applications> will also be referred to as I<Applications>
for the remainder of this document.

=item Middleware

I<Middleware> is a PSGI application (a code reference) I<and> a
I<Server>. I<Middleware> looks like an I<application> when called from a
I<server>, and it in turn can call other I<applications>. It can be thought of
a I<plugin> to extend a PSGI application.

=item Framework developers

I<Framework developers> are the authors of web application frameworks. They
write adapters (or engines) which accept PSGI input, run a web
application, and return a PSGI response to the I<server>.

=item Web application developers

I<Web application developers> are developers who write code on top of a web
application framework. These developers should never have to deal with PSGI
directly.

=back

=head1 SPECIFICATION

=head2 Application

A PSGI application is a Perl code reference. It takes exactly one
argument, the environment, and returns an array reference containing exactly
three values.

  my $app = sub {
      my $env = shift;
      return [
          '200',
          [ 'Content-Type' => 'text/plain' ],
          [ "Hello World" ], # or IO::Handle-like object
      ];
  };

=head3 The Environment

The environment MUST be a hash reference that includes CGI-like headers, as
detailed below. The application is free to modify the environment. The
environment MUST include these keys (adopted from L<PEP
333|http://www.python.org/dev/peps/pep-0333/>,
L<Rack|http://rack.rubyforge.org/doc/files/SPEC.html> and
L<JSGI|http://jackjs.org/jsgi-spec.html>) except when they would normally be
empty.

When an environment key is described as a boolean, its value MUST conform
to Perl's notion of boolean-ness. This means that an empty string or an
explicit C<0> are both valid false values. If a boolean key is not present, an
application MAY treat this as a false value.

The values for all CGI keys (named without a period) MUST be a scalar
string.

See below for details.

=over 4

=item *

C<REQUEST_METHOD>: The HTTP request method, such as "GET" or
"POST". This B<MUST NOT> be an empty string, and so is always
required.

=item *

C<SCRIPT_NAME>: The initial portion of the request URL's I<path>,
corresponding to the application. This tells the application its
virtual "location". This may be an empty string if the application
corresponds to the server's root URI.

If this key is not empty, it MUST start with a forward slash (C</>).

=item *

C<PATH_INFO>: The remainder of the request URL's I<path>, designating
the virtual "location" of the request's target within the
application. This may be an empty string if the request URL targets
the application root and does not have a trailing slash. This value
should be URI decoded by servers in order to be compatible with L<RFC 3875|http://www.ietf.org/rfc/rfc3875>.

If this key is not empty, it MUST start with a forward slash (C</>).

=item *

C<REQUEST_URI>: The undecoded, raw request URL line. It is the raw URI
path and query part that appears in the HTTP C<GET /... HTTP/1.x> line
and doesn't contain URI scheme and host names.

Unlike C<PATH_INFO>, this value B<SHOULD NOT> be decoded by servers. It is an
application's responsibility to properly decode paths in order to map URLs to
application handlers if they choose to use this key instead of C<PATH_INFO>.

=item *

C<QUERY_STRING>: The portion of the request URL that follows the C<?>,
if any. This key MAY be empty, but B<MUST> always be present, even if empty.

=item *

C<SERVER_NAME>, C<SERVER_PORT>: When combined with C<SCRIPT_NAME> and
C<PATH_INFO>, these keys can be used to complete the URL. Note,
however, that C<HTTP_HOST>, if present, should be used in preference
to C<SERVER_NAME> for reconstructing the request URL. C<SERVER_NAME>
and C<SERVER_PORT> B<MUST NOT> be empty strings, and are always
required.

=item *

C<SERVER_PROTOCOL>: The version of the protocol the client used to
send the request. Typically this will be something like "HTTP/1.0" or
"HTTP/1.1" and may be used by the application to determine how to
treat any HTTP request headers.

=item *

C<CONTENT_LENGTH>: The length of the content in bytes, as an
integer. The presence or absence of this key should correspond to the
presence or absence of HTTP Content-Length header in the request.

=item *

C<CONTENT_TYPE>: The request's MIME type, as specified by the client.
The presence or absence of this key should correspond to the presence
or absence of HTTP Content-Type header in the request.

=item *

C<HTTP_*> Keys: These keys correspond to the client-supplied
HTTP request headers. The presence or absence of these keys should
correspond to the presence or absence of the appropriate HTTP header
in the request.

The key is obtained converting the HTTP header field name to upper
case, replacing all occurrences of hyphens C<-> with
underscores C<_> and prepending C<HTTP_>, as in
L<RFC 3875|http://www.ietf.org/rfc/rfc3875>.

If there are multiple header lines sent with the same key, the server
should treat them as if they were sent in one line and combine them
with C<, >, as in L<RFC 2616|http://www.ietf.org/rfc/rfc2616>.

=back

A server should attempt to provide as many other CGI variables as are
applicable.  Note, however, that an application that uses any CGI
variables other than the ones listed above are necessarily
non-portable to web servers that do not support the relevant
extensions.

In addition to the keys above, the PSGI environment MUST also include these
PSGI-specific keys:

=over 4

=item *

C<psgi.version>: An array reference [1,1] representing this version of
PSGI. The first number is the major version and the second it the minor
version.

=item *

C<psgi.url_scheme>: A string C<http> or C<https>, depending on the request URL.

=item *

C<psgi.input>: the input stream. See below for details.

=item *

C<psgi.errors>: the error stream. See below for details.

=item *

C<psgi.multithread>: This is a boolean value, which MUST be true if the
application may be simultaneously invoked by another thread in the same
process, false otherwise.

=item *

C<psgi.multiprocess>: This is a boolean value, which MUST be true if an
equivalent application object may be simultaneously invoked by another
process, false otherwise.

=item *

C<psgi.run_once>: A boolean which is true if the server expects (but does not
guarantee!)  that the application will only be invoked this one time during
the life of its containing process. Normally, this will only be true for a
server based on CGI (or something similar).

=item *

C<psgi.nonblocking>: A boolean which is true if the server is calling the
application in an non-blocking event loop.

=item *

C<psgi.streaming>: A boolean which is true if the server supports callback
style delayed response and streaming writer object.

=back

The server or the application can store its own data in the
environment as well. These keys MUST contain at least one dot, and
SHOULD be prefixed uniquely.

The C<psgi.> prefix is reserved for use with the PSGI core
specification, and C<psgix.> prefix is reserved for officially blessed
extensions. These prefixes B<MUST NOT> be used by other servers or
application. See L<psgi-extensions|PSGI::Extensions> for the list of
officially approved extensions.

The environment B<MUST NOT> contain keys named C<HTTP_CONTENT_TYPE> or
C<HTTP_CONTENT_LENGTH>.

One of C<SCRIPT_NAME> or C<PATH_INFO> MUST be set. When
C<REQUEST_URI> is C</>, C<PATH_INFO> should be C</> and C<SCRIPT_NAME>
should be empty. C<SCRIPT_NAME> B<MUST NOT> be C</>, but MAY be
empty.

=head3 The Input Stream

The input stream in C<psgi.input> is an L<IO::Handle>-like object which
streams the raw HTTP POST or PUT data. If it is a file handle then it
MUST be opened in binary mode. The input stream B<MUST> respond to
C<read> and MAY implement C<seek>.

Perl's built-in filehandles or L<IO::Handle> based objects should work as-is
in a PSGI server. Application developers B<SHOULD NOT> inspect the type or
class of the stream. Instead, they SHOULD simply call C<read> on the object.

Application developers B<SHOULD NOT> use Perl's built-in C<read> or iterator
(C<< <$fh> >>) to read from the input stream. Instead, application
developers should call C<read> as a method (C<< $fh->read >>) to allow for
duck typing.

Framework developers, if they know the input stream will be used with the
built-in read() in any upstream code they can't touch, SHOULD use PerlIO or
a tied handle to work around with this problem.

The input stream object is expected to provide a C<read> method:

=over 4

=item read

  $input->read($buf, $len [, $offset ]);

Returns the number of characters actually read, 0 at end of file, or
undef if there was an error.

=back

It may also implement an optional C<seek> method. If
C<psgix.input.buffered> environment is true, it MUST implement the
C<seek> method.

=over 4

=item seek

  $input->seek($pos, $whence);

Returns 1 on success, 0 otherwise.

=back

See the L<IO::Handle> documentation for more details on exactly how these
methods should work.

=head3 The Error Stream

The error stream in C<psgi.errors> is an L<IO::Handle>-like object to
print errors. The error stream must implement a C<print> method.

As with the input stream, Perl's built-in filehandles or L<IO::Handle> based
objects should work as-is in a PSGI server. Application developers B<SHOULD
NOT> inspect the type or class of the stream. Instead, they SHOULD simply
call C<print> on the object.

=over 4

=item print

  $errors->print($error);

Returns true if successful.

=back

=head3 The Response

Applications MUST return a response as either a three element array
reference, or a code reference for a delayed/streaming response.

The response array reference consists of the following elements:

=head4 Status

An HTTP status code. This MUST be an integer greater than or equal to 100,
and SHOULD be an HTTP status code as documented in L<RFC
2616|http://www.w3.org/Protocols/rfc2616>.

=head4 Headers

The headers MUST be an array reference (B<not> a hash reference)
of key/value pairs. This means it MUST contain an even number of elements.

The header B<MUST NOT> contain a key named C<Status>, nor any keys with C<:>
or newlines in their name. It B<MUST NOT> contain any keys that end in C<-> or
C<_>.

All keys MUST consist only of letters, digits, C<_> or C<->. All
keys MUST start with a letter. The value of the header B<MUST> be a
scalar string and defined. The value string B<MUST NOT> contain
characters below octal 037 i.e. chr(31).

If the same key name appears multiple times in an array ref, those
header lines MUST be sent to the client separately (e.g. multiple
C<Set-Cookie> lines).

=head4 Content-Type

There MUST be a C<Content-Type> except when the C<Status> is 1xx, 204
or 304, in which case there B<MUST NOT> be a content type.

=head4 Content-Length

There B<MUST NOT> be a C<Content-Length> header when the C<Status> is
1xx, 204 or 304.

If the Status is not 1xx, 204 or 304 and there is no C<Content-Length> header,
a PSGI server MAY calculate the content length by looking at the Body. This
value can then be appended to the list of headers returned by the application.

=head4 Body

The response body MUST be returned from the application as either
an array reference or a handle containing the response body as byte
strings. The body MUST be encoded into appropriate encodings and
B<MUST NOT> contain wide characters (> 255).

=over 4

=item *

If the body is an array reference, it is expected to contain an array of lines
which make up the body.

  my $body = [ "Hello\n", "World\n" ];

Note that the elements in an array reference are B<NOT REQUIRED> to end
in a newline. A server SHOULD write each elements as-is to the
client, and B<SHOULD NOT> care if the line ends with newline or not.

An array reference with a single value is valid. So C<[ $html ]> is a valid
response body.

=item *

The body can instead be a handle, either a Perl built-in filehandle or an
L<IO::Handle>-like object.

  open my $body, "</path/to/file";
  open my $body, "<:via(SomePerlIO)", ...;
  my $body = IO::File->new("/path/to/file");

  # mock class that implements getline() and close()
  my $body = SomeClass->new();

Servers B<SHOULD NOT> check the type or class of the body. Instead, they should
simply call C<getline> to iterate over the body, and
call C<close> when done.

Servers MAY check if the body is a real filehandle using C<fileno> and
C<Scalar::Util::reftype>. If the body is real filehandle, the server MAY
optimize using techniques like I<sendfile(2)>.

The body object MAY also respond to a C<path> method. This method is
expected to return the path to a file accessible by the server. This allows
the server to use this information instead of a file descriptor number to
serve the file.

Servers SHOULD set the C<$/> special variable to the buffer size when
reading content from C<$body> using the C<getline> method. This is done by
setting C<$/> with a reference to an integer (C<$/ = \8192>).

If the body filehandle is a Perl built-in filehandle L<IO::Handle> object,
they will respect this value. Similarly, an object which provides the same API
MAY also respect this special variable, but are not required to do so.

=back

=head2 Delayed Response and Streaming Body

The PSGI interface allows applications and servers to provide a
callback-style response instead of the three-element array
reference. This allows for a delayed response and a streaming body
(server push).

This interface SHOULD be implemented by PSGI servers, and
C<psgi.streaming> environment MUST be set to true in such servers.

To enable a delayed response, the application SHOULD return a
callback as its response. An application MAY check if the
C<psgi.streaming> environment is true and falls back to the direct
response if it isn't.

This callback will be called with I<another> subroutine reference (referred to
as the I<responder> from now on) as its only argument. The I<responder>
should in turn be called with the standard three element array reference
response. This is best illustrated with an example:

  my $app = sub {
      my $env = shift;

      # Delays response until it fetches content from the network
      return sub {
          my $responder = shift;

          fetch_content_from_server(sub {
              my $content = shift;
              $responder->([ 200, $headers, [ $content ] ]);
          });
      };
  };

An application MAY omit the third element (the body) when calling
the I<responder>. If the body is omitted, the I<responder> MUST
return I<yet another> object which implements C<write> and C<close>
methods. Again, an example illustrates this best.

  my $app = sub {
      my $env = shift;

      # immediately starts the response and stream the content
      return sub {
          my $responder = shift;
          my $writer = $responder->(
              [ 200, [ 'Content-Type', 'application/json' ]]);

          wait_for_events(sub {
              my $new_event = shift;
              if ($new_event) {
                  $writer->write($new_event->as_json . "\n");
              } else {
                  $writer->close;
              }
          });
      };
  };

This delayed response and streaming API is useful if you want to
implement a non-blocking I/O based server streaming or long-poll Comet
push technology, but could also be used to implement unbuffered writes
in a blocking server.

=head2 Middleware

A I<middleware> component takes another PSGI application and runs it. From the
perspective of a server, a middleware component is a PSGI application. From
the perspective of the application being run by the middleware component, the
middleware is the server. Generally, this will be done in order to implement
some sort of pre-processing on the PSGI environment hash or post-processing on
the response.

Here's a simple example that appends a special HTTP header
I<X-PSGI-Used> to any PSGI application.

  # $app is a simple PSGI application
  my $app = sub {
      my $env = shift;
      return [ '200',
               [ 'Content-Type' => 'text/plain' ],
               [ "Hello World" ] ];
  };

  # $xheader is a piece of middleware that wraps $app
  my $xheader = sub {
      my $env = shift;
      my $res = $app->($env);
      push @{$res->[1]}, 'X-PSGI-Used' => 1;
      return $res;
  };

Middleware MUST behave exactly like a PSGI application from the perspective
of a server. Middleware MAY decide not to support the streaming interface
discussed earlier, but SHOULD pass through the response types that it doesn't
understand.

=head1 CHANGELOGS

1.1: 2010.02.xx

=over 4

=item *

Added optional PSGI keys as extensions: C<psgix.logger> and C<psgix.session>.

=item *

C<psgi.streaming> SHOULD be implemented by PSGI servers, rather than B<MAY>.

=item *

PSGI keys C<psgi.run_once>, C<psgi.nonblocking> and C<psgi.streaming>
MUST be set by PSGI servers.

=item *

Removed C<poll_cb> from writer methods.

=back

=head1 ACKNOWLEDGEMENTS

Some parts of this specification are adopted from the following specifications.

=over 4

=item *

PEP333 Python Web Server Gateway Interface L<http://www.python.org/dev/peps/pep-0333>

=item *

Rack L<http://rack.rubyforge.org/doc/SPEC.html>

=item *

JSGI Specification L<http://jackjs.org/jsgi-spec.html>

=back

I'd like to thank authors of these great documents.

=head1 AUTHOR

Tatsuhiko Miyagawa E<lt>miyagawa@bulknews.netE<gt>

=head1 CONTRIBUTORS

The following people have contributed to the PSGI specification and
Plack implementation by commiting their code, sending patches,
reporting bugs, asking questions, suggesting useful advices,
nitpicking, chatting on IRC or commenting on my blog (in no particular
order):

  Tokuhiro Matsuno
  Kazuhiro Osawa
  Yuval Kogman
  Kazuho Oku
  Alexis Sukrieh
  Takatoshi Kitano
  Stevan Little
  Daisuke Murase
  mala
  Pedro Melo
  Jesse Luehrs
  John Beppu
  Shawn M Moore
  Mark Stosberg
  Matt S Trout
  Jesse Vincent
  Chia-liang Kao
  Dave Rolsky
  Hans Dieter Pearcey
  Randy J Ray
  Benjamin Trott
  Max Maischein
  Slaven Rezić
  Marcel Grünauer
  Masayoshi Sekimura
  Brock Wilcox
  Piers Cawley
  Daisuke Maki
  Kang-min Liu
  Yasuhiro Matsumoto
  Ash Berlin
  Artur Bergman
  Simon Cozens
  Scott McWhirter
  Jiro Nishiguchi
  Masahiro Chiba
  Patrick Donelan
  Paul Driver
  Florian Ragwitz

=head1 COPYRIGHT AND LICENSE

Copyright Tatsuhiko Miyagawa, 2009-2011.

This document is licensed under the Creative Commons license by-sa.

=cut