Chapter 15. The Interwebs

WHAT YOU WILL LEARN IN THIS CHAPTER

  • Understanding the basics of HTTP

  • Understanding a Web server’s role

  • Using Plack to respond to requests

  • Using Web forms

  • Understanding cookies

  • Handling security issues

  • Understanding a Web client’s role

  • Writing software to read Web sites

  • Creating a client to use Web APIs

Ah, Perl, the duct tape of the internet. Duct tape has a reputation for being an amazing, ad hoc supertool for fixing things in a hurry, even as a combat dressing. Perl is the same way. Need something done on the Web quickly? Reach for Perl!

We’ll be paying particular attention to the HyperText Transfer Protocol (HTTP). When you view a Web page in your browser, it was probably sent to you via HTTP (or HTTPS, the encrypted version of HTTP). HTTP is nothing more than text, and Perl excels at text manipulation. Your author believes you need a foundation in how HTTP flows between systems in order to program effectively at a higher level.

The first part of this chapter will be about responding as a server. It will not be “here’s how to write a Web application”, though we’ll create some simple ones, but rather, “here are some concepts you need to know.” In Chapter 19 we’ll have a brief look at Dancer (http://perldancer.org/), one of Perl’s easiest-to-use frameworks for quickly building Web applications.

The next part of the chapter will be about writing client software: accessing Web sites, finding links on Web pages, using Web APIs, and so on. Again, it’s not going to be a full “here’s all you wanted to know about Web clients”, but it will give you a solid start on Web automation.

This chapter assumes that you know a little bit about creating a Web page with HTML (the HyperText Markup Language). If you don’t, check out http://www.w3schools.com/html/ to learn the basics of HTML.

A brief introduction to HTTP

HTTP is a client-server protocol. That means that a client, such as a Web browser or some software you write, makes a request to a server via HTTP and the server responds with, well, something. It might be a static Web page, a page generated on the fly, or an HTTP response telling you 404 Not Found, the dreaded 500 Internal Server Error, or 301 Moved Permanently (a redirect).

To understand how the Web works, you could use a simple telnet client, a standard tool available on all major operating systems (http://en.wikipedia.org/wiki/Telnet). A simple telnet session might look like this:

% telnet example.com 80
Trying 192.0.43.10...
Connected to example.com.
Escape character is '^]'.
HEAD /
HTTP/1.0 302 Found
Location: http://www.iana.org/domains/example/
Server: BigIP
Connection: close
Content-Length: 0
Connection closed by foreign host.

Note

You can use telnet to try to connect to any server on the Internet but it will often fail. Historically there have been a number of security issues surrounding the telnet protocol and, as a result, many servers disable telnet access.

However, you can use a telnet client to impersonate Web, mail, and other clients if you know the rules of the protocol. We’ll be doing a bit of that to learn how Web clients and servers communicate.

When you telnet to a server, you specify the host (example.com in this case) and the port (80, the standard HTTP port):

% telnet example.com 80

And then you’ll see a response similar to:

Trying 192.0.43.10...
Connected to example.com.
Escape character is '^]'.

The “escape character”, in this case, is CTRL-] (pressing the Control key and the right square bracket at the same time). That will cause you to enter a command mode that you can CTRL-C out of.

We then issue a HEAD request against the root of the server:

HEAD /

When you “surf” to a Web page in your browser by clicking a link such as http://www.example.com/some/page/, behind the scenes your browser is probably issuing a GET request to that server:

GET /some/page/

That returns a set of headers giving information about the resource you have requested, followed by a blank line (two consecutive newlines) and then the body of the response (also known as the entity-body), often a Web page written in HTML.

When you issue a HEAD request, you’re saying “I only want the headers for this resource, not the body”. In this case, that’s great because there is no body available for this request (the Content-Length is 0):

HTTP/1.0 302 Found
Location: http://www.iana.org/domains/example/
Server: BigIP
Connection: close
Content-Length: 0

The first line is the HTTP protocol version, followed by the HTTP numeric status code, followed by a human readable description.

Next is a list of HTTP header fields. Each consists of a field name, followed by a colon, followed by the value of that field name. In this case, we see that / at www.example.com can actually be found at http://www.iana.org/domains/example/. In fact, if you go to www.example.com in your browser, when it sees the 302 Found, it will redirect you to http://www.iana.org/domains/example/.

Those are the basics of HTTP. Once you understand it, it’s pretty simple. Plus, because it’s plain text, it’s easy to view for debugging.
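Because an HTTP response is plain text, a few lines of Perl are enough to pick it apart. The following is a minimal sketch, not a full HTTP parser: it takes the response from the telnet session as a string and splits it into the status line and the header fields. The parse_response_head name is invented for illustration, and the sketch ignores details such as repeated headers.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): split the head of an HTTP
# response into its protocol version, status code, reason, and headers.
sub parse_response_head {
    my ($text) = @_;

    # Lines on the wire end in CRLF; accept a bare LF too.
    my ( $status_line, @header_lines ) = split /\r?\n/, $text;

    my ( $version, $code, $reason ) =
      $status_line =~ m{^HTTP/(\d+\.\d+)\s+(\d{3})\s*(.*)$}
      or die "Not an HTTP status line: $status_line";

    my %headers;
    foreach my $line (@header_lines) {
        last if $line eq '';    # a blank line ends the headers
        my ( $name, $value ) = split /:\s*/, $line, 2;
        $headers{ lc $name } = $value;    # field names are case-insensitive
    }
    return {
        version => $version,
        code    => $code,
        reason  => $reason,
        headers => \%headers,
    };
}

my $response = parse_response_head(<<'END');
HTTP/1.0 302 Found
Location: http://www.iana.org/domains/example/
Server: BigIP
Connection: close
Content-Length: 0
END

print "Status:   $response->{code} $response->{reason}\n";
print "Location: $response->{headers}{location}\n";
```

Lower-casing the field names is a design choice for this sketch: HTTP header names are case-insensitive, so normalizing them makes lookups predictable.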

Plack

To get us started with Web development and Perl, we’re going to use Plack (http://plackperl.org/). We won’t be doing anything too complicated, but we’ll see the basics of how it works. You can install Plack with your favorite CPAN client:

$ cpan PSGI Plack

Note

When installing Plack, you don’t really need to install PSGI because it is only the specification of the PSGI interface. However, you’ll find that perldoc PSGI can often help you better understand how Plack works.

Also, if you’re going to do serious Web development with Plack, it’s recommended that you install Task::Plack. That will install many modules that are very helpful when developing Plack applications.

Keep in mind that the Plack examples in this chapter are bare bones, without full support for the features you’ll find in most applications. For example, the telnet HEAD request shown earlier won’t work with our Plack app.

Plack and PSGI were created by Tatsuhiko Miyagawa, an extremely talented and prolific programmer. PSGI is a specification of how a Web server can talk to a Web application. It is modeled after WSGI, a Web server/application interface originally developed for the Python language. Plack is an implementation of the PSGI specification.

Prior to PSGI, many companies found themselves having to choose between different Web servers that accept HTTP requests and Web applications that might process those requests. When the server received a request that should be handled by an application, it needed to know how to talk to that application, and the application needed to know how to respond.

PSGI changes the game tremendously. It sits between the Web server and the Web application, guaranteeing a standard Web interface. As long as both your Web server and Web application “speak” PSGI (today, you’ll find that your popular options all understand PSGI), you can switch to different servers or applications without having to reconfigure how they talk to one another. This is one of many examples of why Perl is duct tape in the web world.

Hello, World!

Now it’s time to show you what we really mean. Be aware that Plack is actually a set of building blocks for Web applications and is not intended to be used by application developers directly. However, it’s easy to use and a great compromise between showing how web applications work and how HTTP operates.

Create a chapter15/ directory, change into it, and save the following as app.psgi:

my $app = sub {
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        ['Hello World'],
    ];
};

A Plack application is a code reference. It’s expected to return an array reference with three values:

  • The HTTP status code

  • An array reference of HTTP headers

  • The “body” of the HTTP response.
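Since a PSGI application is only a code reference, you can call it yourself without any Web server, which makes the three-element return value easy to inspect. The sketch below illustrates that idea only: the %env hash is a hand-built, heavily trimmed stand-in for the environment a real server would pass (perldoc PSGI lists the required keys).

```perl
use strict;
use warnings;

# The same application as app.psgi.
my $app = sub {
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        ['Hello World'],
    ];
};

# A hand-built, incomplete environment, for illustration only.
# A real PSGI server supplies many more keys.
my %env = (
    REQUEST_METHOD => 'GET',
    PATH_INFO      => '/',
);

# Call the code reference and unpack the three-element array reference.
my ( $status, $headers, $body ) = @{ $app->( \%env ) };

my %header_for = @$headers;    # headers arrive as a flat name/value list
print "Status:       $status\n";
print "Content-Type: $header_for{'Content-Type'}\n";
print "Body:         ", join( '', @$body ), "\n";
```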

Once you have saved the app.psgi, run plackup (it’s installed when you install Plack):

$ plackup
HTTP::Server::PSGI: Accepting connections at http://0:5000/

Note

By default, plackup looks for an app.psgi file. However, you can name this anything you want. To use a file with a name that makes more sense to you, pass that name as an argument to plackup:

$ plackup hello.psgi

See perldoc plackup for more information.

Congratulations! You now have a Web server running on your computer.

When you run plackup, it will start a Web server for you. You can configure it to use different Web servers, but we’ll use the default HTTP::Server::PSGI Web server that is installed with Plack.

The plackup command will appear to hang, but that’s because it is waiting for requests. Open your favorite browser and go to http://localhost:5000/. You should see the text Hello World displayed. After you see the page, look at your terminal window that plackup is running in. You’ll see something like this (reformatted for the book, this is all on two lines):

HTTP::Server::PSGI: Accepting connections at http://0:5000/
127.0.0.1 - - [11/Apr/2012:11:42:13 +0200] "GET / HTTP/1.1" 200 11 "-"
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/535.19
(KHTML,like Gecko) Chrome/18.0.1025.151 Safari/535.19"

The first line tells us that HTTP::Server::PSGI is listening on port 5000.

Note

When we’re talking about software, a port is merely a software or process-specific way of waving its little virtual hands and saying “yoo hoo! I’m over here!” Any software or process communicating with that port will need to understand the protocol that port is listening on. In the case of the default 5000 port for plackup, that protocol is HTTP.

The second line shows that a request from IP address 127.0.0.1 (your browser) came in at a particular time and issued an HTTP GET / request. It also contains information about the response code (200 OK in this case), the size of the response body in bytes (11), and the type of client that attempted to connect.

With our simple Web application, connecting to any path, such as http://localhost:5000/asdf/asdf will display Hello World because that’s all you’ve programmed it to do. Let’s quickly expand this to take a look at our environment. Stop plackup (CTRL-C) and edit your app.psgi to look like this:

use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Indent   = 1;
$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Terse    = 1;
my $app = sub {
    my $environment = Dumper( \%ENV );
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "Hello World\n", $environment ],
    ];
};

Restart plackup and refresh your browser page with http://localhost:5000/. You should now see Hello World, followed by a hash listing all of your environment variables. Here is an edited version of what your author’s browser shows:

Hello World
{
    'EDITOR'       => '/usr/bin/vim',
    'GIT_USER'     => 'ovid',
    'HISTFILESIZE' => '1000000000',
    'HISTSIZE'     => '1000000',
    'LC_CTYPE'     => 'UTF-8',
    'LOGNAME'      => 'ovid',
    'PLACK_ENV'    => 'development',
    'PWD'          => '/Users/ovid/beginning_perl/book/chapter15',
    'SHELL'        => '/bin/bash',
    'TERM'         => 'xterm-256color',
    'TERM_PROGRAM' => 'iTerm.app',
}

Let’s rewrite our app.psgi again to get a little closer to a real world example. This time we’ll add an image. Save the following in your app.psgi file:

use strict;
use warnings;
my $app = sub {
    my $env = shift;
    if ( $env->{PATH_INFO} eq '/anne_frank_stamp.jpg' ) {
        open my $fh, "<:raw", "anne_frank_stamp.jpg" or die $!;
        return [ 200, [ 'Content-Type' => 'image/jpeg' ], $fh ];
    }
    elsif ( $env->{PATH_INFO} eq '/' ) {
        return [
          200,
          [ 'Content-Type' => 'text/html' ],
          [ get_index() ]
        ];
    }
    else {
        return [
          404,
          [ 'Content-Type' => 'text/html' ],
          ['404 Not Found']
        ];
    }
};
sub get_index {
    return <<'END';
<html>
  <head><title>Sample page</title></head>
  <body>
    <p>Anne Frank was a young lady living in Amsterdam, hiding
    from the Nazis.</p>
    <p>Everyone should read her diaries.</p>
    <img src="/anne_frank_stamp.jpg"/>
  </body>
</html>
END
}

Note

app.psgi and anne_frank_stamp.jpg available for download at Wrox.com.

This program loads an image of the German Anne Frank stamp. The image is in the public domain and is available at http://commons.wikimedia.org/wiki/File:Anne_Frank_stamp.jpg. Download this image and save it as anne_frank_stamp.jpg in the same directory as your app.psgi file.

Note

While you’re debugging, be aware that you may see more requests in the plackup terminal output than you expect. For example, many browsers automatically request something called a /favicon.ico. If found, it’s rendered as the Web site icon, usually in the URL bar and often on bookmarks.

Restart plackup and go to http://localhost:5000/. You should see a Web page similar to Figure 15-1.

Figure 15-1

This is all pretty normal, but let’s look at the opening lines of the $app subroutine reference:

my $env = shift;
if ( $env->{PATH_INFO} eq '/anne_frank_stamp.jpg' ) {
    open my $fh, "<:raw", "anne_frank_stamp.jpg" or die $!;
    return [ 200, [ 'Content-Type' => 'image/jpeg' ], $fh ];
}

The Plack subroutine reference is passed a single $env hashref argument, documented in perldoc PSGI. The PATH_INFO key holds the currently requested path. In this case, because we are asking for the Anne Frank image, we use open to create a filehandle, then return image/jpeg as the content type and the filehandle as the content of the response. Plack knows how to send that image back to your client.

When we navigate to http://localhost:5000/ in our browser, the following line of code is executed:

return [ 200, [ 'Content-Type' => 'text/html' ], [ get_index() ] ];

And that returns the HTML from the get_index() function:

sub get_index {
    return <<'END';
<html>
  <head><title>Sample page</title></head>
  <body>
    <p>Anne Frank was a young lady living in Amsterdam, hiding
    from the Nazis.</p>
    <p>Everyone should read her diaries.</p>
    <img src="/anne_frank_stamp.jpg"/>
  </body>
</html>
END
}

In that HTML is an img tag:

<img src="/anne_frank_stamp.jpg"/>

That tag causes the browser to issue GET /anne_frank_stamp.jpg to our Plack server. Our app.psgi sees that path and returns the image.

In other words, you went to the http://localhost:5000/ URL in your browser, but your browser actually makes two requests to the server. In fact, for most Web pages on the Internet, a single page will generate many more requests to get everything the page needs to render properly, including the HTML, JavaScript, CSS, multiple images, Flash, and many other potential requests. It can seem quite complicated but, as you can see from the example above, much of this is handled for you and it’s really not as hard as it seems.

Now let’s rewrite app.psgi one last time before moving to the next section.

use strict;
use warnings;
use Plack::Builder;
builder {
    mount '/anne_frank_stamp.jpg' => sub {
        open my $fh, "<:raw", "anne_frank_stamp.jpg" or die $!;
        return [ 200, [ 'Content-Type' => 'image/jpeg' ], $fh ];
    };
    mount '/' => sub {
        my $env = shift;
        return $env->{PATH_INFO} eq '/'
          ? [200,['Content-Type' => 'text/html'],[get_index()]]
          : [404,['Content-Type' => 'text/html'],['404 Not Found']];
    };
};
sub get_index {
    return <<'END';
<html>
  <head><title>Sample page</title></head>
  <body>
    <p>Anne Frank was a young lady living in Amsterdam, hiding
    from the Nazis.</p>
    <p>Everyone should read her diaries.</p>
    <img src="/anne_frank_stamp.jpg"/>
  </body>
</html>
END
}

Plack::Builder is a module that provides a DSL (domain-specific language) to make writing Plack applications a little bit easier. This app.psgi does the same thing as our last app.psgi, but it does so with the builder and mount commands. The builder function says “I’m going to take the following code reference and use this to build the app”.

The mount function allows you to map a particular path to a particular section of code. This is much easier than managing long if/elsif/else blocks. We do have a ?: ternary operator in the / path, but that’s only to show that we can still use this if needed.

The mount command is implemented with Plack::App::URLMap and it does not allow “dynamic” mappings. Thus, there’s no way to say “everything that is not mapped is a 404”. Web frameworks like Dancer, Catalyst and Mojolicious give you much more flexibility here, but this is enough for us to do what we need.
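To see why mapping paths to code beats a chain of if/elsif/else tests, here is a rough plain-Perl analogy that uses a hash as a dispatch table. It is only an analogy and makes no claim about how Plack::App::URLMap works internally; %handler_for and $not_found are names invented for this sketch, and the table matches exact paths only.

```perl
use strict;
use warnings;

# Map each exact path to the code that handles it (hypothetical names).
my %handler_for = (
    '/'      => sub { [ 200, [ 'Content-Type' => 'text/plain' ], ['Home'] ] },
    '/about' => sub { [ 200, [ 'Content-Type' => 'text/plain' ], ['About'] ] },
);

# Fallback for any path that is not in the table.
my $not_found =
  sub { [ 404, [ 'Content-Type' => 'text/plain' ], ['404 Not Found'] ] };

# A PSGI-shaped application: look up the path, fall back to the 404.
my $app = sub {
    my $env     = shift;
    my $handler = $handler_for{ $env->{PATH_INFO} } || $not_found;
    return $handler->($env);
};

foreach my $path ( '/', '/about', '/asdf/asdf' ) {
    my $response = $app->( { PATH_INFO => $path } );
    print "$path => $response->[0] $response->[2][0]\n";
}
```

Adding a page means adding one entry to the hash, with no change to the lookup logic.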

Handling parameters

Many times you’ll see a URL like this:

http://www.example.com/?name=john&color=blue&color=red

The question mark in a URL indicates the beginning of a query string. The query component is defined in RFC 3986 (http://www.ietf.org/rfc/rfc3986.txt); by convention, it holds a collection of name/value pairs. Each name and value is separated by an equals (=) sign and each pair is separated by an ampersand (&) or a semicolon (;). In the example above, we have two parameters, name and color. The name parameter has one value, john, and color has two values, blue and red. We’re going to use Plack::Request to handle query strings.
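To make the format concrete, here is a minimal sketch that parses a query string by hand with core Perl only. It is for illustration; in real code, let Plack::Request (or a similar module) do this for you. The parse_query name is invented for the sketch, and it handles only the basics: splitting on & or ;, turning + into a space, and decoding %XX escapes.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a query string into a hash of array references,
# because any parameter may appear more than once.
sub parse_query {
    my ($query) = @_;
    my %values_for;
    foreach my $pair ( split /[&;]/, $query ) {
        next unless length $pair;    # skip empty pairs such as "a=1&&b=2"
        my ( $name, $value ) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        foreach ( $name, $value ) {
            tr/+/ /;                              # + encodes a space
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;    # %XX encodes one byte
        }
        push @{ $values_for{$name} }, $value;
    }
    return \%values_for;
}

my $params = parse_query('name=john&color=blue&color=red');
foreach my $name ( sort keys %$params ) {
    print "$name=", join( ',', @{ $params->{$name} } ), "\n";
}
```

Storing every value in an array reference, even for parameters that appear once, keeps the calling code uniform.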

Save the following code as params.psgi:

use strict;
use warnings;
use Plack::Builder;
use Plack::Request;
builder {
    mount '/' => sub {
        my $env     = shift;
        my $request = Plack::Request->new($env);
        my @params  = sort $request->param;
        my $body    = '';
        foreach my $param (@params) {
            my $values = join ',' => $request->param($param);
            $body .= "$param=$values\n";
        }
        $body ||= "No params found";
        return [ 200, [ 'Content-Type' => 'text/plain' ], [$body] ];
    };
};

Note

params.psgi available for download at Wrox.com.

Open a separate terminal window and run the following command:

plackup -r params.psgi

Note

We’re using -r this time because this will make the Web server restart every time we change params.psgi. We’ll be changing it a few times, so this makes it easier to use when developing code. If you get an error similar to could not connect, that probably means you have a syntax error in your code. Go back to your plackup terminal window and look for the error message.

By using -r with plackup, we can have plackup running in one terminal window while we continue to develop in another terminal window.

When you request http://localhost:5000/ in your browser, it should display:

No params found

However, request http://localhost:5000/?name=john;color=red;color=blue and you should see this in your browser window.

color=red,blue
name=john

By now, you may be tired of returning a 3-element array reference because it can be a bit harder to read:

return [ 200, [ 'Content-Type' => 'text/plain' ], [$body] ];

We can replace that return statement with a response object. It’s much easier to read:

my $response = $request->new_response(200);
$response->content_type('text/plain');
$response->content($body);
return $response->finalize;

The $response->finalize handles building and returning that final array reference for you.

Templates

Up until now, we’ve included our HTML in our code. While that might be fine for a small application, it can be hard to maintain for larger applications, particularly as your code becomes a mess of Perl, HTML, plain text, and SQL (we’ll cover databases in Chapter 16, Databases).

As a general rule, you want the different logical sections of your programs separated. One common way of doing this is using what is known as the Model-View-Controller pattern, more commonly referred to as MVC. There are a few variants of MVC, but we’ll cover a popular one for the Web.

We’re not going to show a full-blown MVC system here, but the basic components are described in Table 15-1.

Table 15-1

Component    Role
Model        Oversees the management of the business logic and data.
View         The part the client sees (a Web page, in this case).
Controller   Receives data from a view, passes it to the model, and returns the results to a view (possibly the same one).


So someone visits your Web page (the view) and enters a number on a form, and that data gets sent to a controller (the mount points, in our PSGI example). The controller receives the data, passes it to the correct model (the sub references in our examples), and returns the results to a view.
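That flow can be sketched in a few lines of plain Perl. Everything here is hypothetical and stripped to the bone (the function names and the doubling “business logic” are invented for illustration); the point is only which piece talks to which.

```perl
use strict;
use warnings;

# Model: business logic and data. It knows nothing about HTML.
sub model_double {
    my ($number) = @_;
    return $number * 2;
}

# View: presentation only. It knows nothing about how results are computed.
sub view_result {
    my ( $number, $result ) = @_;
    return "<p>Twice $number is $result</p>\n";
}

# Controller: receives the request data, asks the model for an answer,
# and hands that answer to a view.
sub controller {
    my ($params) = @_;
    my $number = $params->{number};
    return "<p>Please enter a number</p>\n"
      unless defined $number && $number =~ /^-?\d+$/;
    return view_result( $number, model_double($number) );
}

print controller( { number => 21 } );
print controller( {} );
```

Because the model never touches HTML, you could swap the view for a plain-text or spreadsheet export without changing the calculation.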

In Plack, there’s really not a clean separation of these concepts, but we can fake it well enough. We’re going to create small templates with Template::Tiny to give you an idea of what’s going on.

Template::Tiny is a very small templating engine written by Adam Kennedy. It is designed to be minimal, fast, and appropriate for small applications. In short, it’s what we need. Later, you’d want to use the Template Toolkit module (the package name of the module is Template), Text::Xslate or other more robust templating modules than Template::Tiny.

First, in the same directory as your app.psgi and your params.psgi, create a templates directory. In that directory, create a file named params.tt that contains the following:

<html>
  <head><title>Parameters</title></head>
  <body>
[% IF have_params %]
    <p>Our list of params:</p>
    <table rules="all">
      <tr><th>Name</th><th>Value</th></tr>
  [% FOREACH param IN params %]
      <tr><td>[% param.name %]</td><td>[% param.value %]</td></tr>
  [% END %]
    </table>
[% ELSE %]
    <p><strong>No params supplied!</strong></p>
[% END %]
  </body>
</html>

At this point, our directory structure (assuming you’ve been typing in all the examples), should look like this:

./
|--anne_frank_stamp.jpg
|--app.psgi
|--params.psgi
|--templates/
|  |--params.tt

This HTML code with the strange syntax is Template::Tiny syntax. Template::Tiny actually doesn’t know anything about HTML. It does nothing except handle loops, if/else/unless statements and variable interpolation. All Template::Tiny commands are wrapped in [% %] brackets.

Consider the following hash reference:

{
    have_params => 1,
    params      => [
        { name => 'name',  value => 'john'     },
        { name => 'color', value => 'red,blue' },
    ],
}

If you process the above Template::Tiny template with this hash reference, this tag:

[% IF have_params %]

will evaluate to true, passing control to the block with the <table> HTML tag. In that block, you’ll see this:

[% FOREACH param IN params %]
      <tr><td>[% param.name %]</td><td>[% param.value %]</td></tr>
[% END %]

The [% FOREACH param IN params %] will iterate over the params array reference, setting param to each contained hash reference, in turn. Then when you call [% param.name %] and [% param.value %], it’s identical to calling $param->{name} and $param->{value}. In Perl code, the entire template would look like this (omitting the HTML for clarity):

my $hashref = { ... };
if ( $hashref->{have_params} ) {
    foreach my $param (@{ $hashref->{params} }) {
        print $param->{name},$param->{value};
    }
}
else {
    print "No params supplied!";
}

And that’s pretty much the entire Template::Tiny syntax. And here’s how we use it in our new params.psgi. We’re going to use File::Slurp to make reading the template code a bit easier.

use strict;
use warnings;
use Plack::Builder;
use Plack::Request;
use Template::Tiny;
use File::Slurp 'read_file';
builder {
    mount '/' => sub {
        my $env     = shift;
        my $request = Plack::Request->new($env);
        my @params;
        foreach my $param ( sort $request->param ) {
            my $values = join ',' => $request->param($param);
            push @params => { name => $param, value => $values };
        }
        my $content = get_content(
            'templates/params.tt',
            {
                params      => \@params,
                have_params => scalar @params,
            }
        );
        my $response = $request->new_response(200);
        $response->content_type('text/html');
        $response->content($content);
        return $response->finalize;
    };
};
sub get_content {
    my ( $file, $vars ) = @_;
    my $template_code = read_file($file);
    my $output;
    my $template      = Template::Tiny->new;
    $template->process( \$template_code, $vars, \$output );
    return $output;
}

We build up an array of parameters and pass a reference to it, along with the template name, to the get_content subroutine. We use read_file from File::Slurp to read the contents of the template. We then pass the contents of the file as a scalar reference as the first argument to the process method of the instantiated Template::Tiny object. The hash reference of variables is passed as the second argument and a reference to the scalar containing our output is the third argument. (The syntax is a bit odd to maintain forward compatibility with the Template Toolkit module.)

my $output;
my $template      = Template::Tiny->new;
$template->process( \$template_code, $vars, \$output );
return $output;

Once the template is processed, we return the output variable and set that as our $response->content.

Now, when you visit http://localhost:5000/, you should see this on your Web page:

No params supplied!

When you visit http://localhost:5000/?name=john;color=red;color=blue;job=janitor you should see a page that looks vaguely like this:

Our list of params:
Name    Value
color   red,blue
job     janitor
name    john

It’s not a pretty Web page, but now you can see how we’re separating out the view (sometimes called the presentation layer) from the main logic of our code.

In fact, at this point we could even take some code from our anonymous subroutine, put it into a module in lib/, and start making params.psgi a very tiny controller, with your model in lib/ and your view in templates/. As we’ve mentioned, though, if we start trying to do too much in Plack, it’s time to look at a real Web framework.

One last caveat: what do you think happens if you visit this URL?

http://localhost:5000/?job=%3Chr/%3E%3Cstrong%3Ehi%20there!%3C/strong%3E

That’s the URL-encoded form of this:

http://localhost:5000/?job=<hr/><strong>hi there!</strong>

The exact appearance will depend on your browser, but basically, in the Value column, you should see hi there! in bold print with a line above it. Why? Because this is that line in the template after the value is added:

<tr><td>job</td><td><hr/><strong>hi there!</strong></td></tr>

This is sadly a common problem on the Web, known as cross-site scripting (XSS). People write Web applications and forget to encode user-supplied data before sending it to a Web page. To fix this, add the following line to your code:

use HTML::Entities 'encode_entities';

And when you push the parameters onto the array:

push @params => { name => $param, value => $values };

Change it to this:

push @params => {
    name  => encode_entities($param),
    value => encode_entities($values)
};

Now when you visit that URL, you should see something like this:

Name    Value
job     <hr/><strong>hi there!</strong>

Warning

A common mistake when working with user data submitted from the Web is to encode HTML data as soon as you receive it. The idea is to ensure that no one can forget to encode the data before it is sent out to a Web page. Unfortunately, someone invariably forgets that it has already been encoded and re-encodes the data, causing strange things like &amp;amp; and other weirdness to show up on the Web page.

A stronger reason, however, is that you might want to use the data for something else that is not Web related. If you export the data to a spreadsheet, your users may not be impressed to see HTML entities there.

As a general rule, encode the HTML data right before it’s rendered in HTML and not before.

And if you look at the source code, you’ll see this (formatted to fit the page):

<tr>
  <td>job</td>
  <td>&lt;hr/&gt;&lt;strong&gt;hi there!&lt;/strong&gt;</td>
</tr>

The encode_entities function from HTML::Entities will encode strings into their corresponding HTML entities. For example, < becomes &lt; and > becomes &gt;.

There are hundreds of pre-defined character entities for HTML and they’re beyond the scope of what we’re doing here, but see http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references for a complete list. For the brave, you can also read the W3C specification: http://www.w3.org/TR/REC-html40/sgml/entities.html. Be warned, though: W3C specifications are not designed to be easy to read. They’re designed to be complete.
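The double-encoding problem described in the earlier warning is easy to reproduce. The sketch below uses a deliberately tiny, hand-written stand-in for encode_entities (the escape_html name is invented here, and it covers only five characters) so that it runs with core Perl alone. In real code, use HTML::Entities.

```perl
use strict;
use warnings;

# Simplified stand-in for HTML::Entities::encode_entities, for
# illustration only. It escapes just five characters.
my %entity_for = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&#39;',
);

sub escape_html {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$entity_for{$1}/g;
    return $text;
}

my $value = '<hr/><strong>hi there!</strong>';
my $once  = escape_html($value);
my $twice = escape_html($once);    # the mistake: encoding encoded data

print "Encoded once:  $once\n";
print "Encoded twice: $twice\n";
```

The second call turns every & produced by the first call into &amp;, which is why a browser would display the entity text itself rather than the original characters.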

Handling POST requests

So far we’ve been handling GET requests. If you wish to pass extra data to the page, you do so via the query string in the URL. However, if you have data that should not be in a URL, such as a username and password, someone’s going to be very upset when they share that URL with someone before they notice that they’re sharing their private data.

This is where POST comes in. In HTTP, your client would send a POST request that looks similar to this:

POST /login HTTP/1.1
Content-Length: 30
Content-Type: application/x-www-form-urlencoded
Host: localhost:5000
Origin: http://localhost:5000
Referer: http://localhost:5000/login

username=ovid&password=youwish

In the example above, the POST tells the server that the data is in the entity body. The entity body begins after two consecutive newlines (after the Referer: header in the example above). Because the POST content is sent in the entity body, it will not show up in the URL.
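As a rough sketch of that layout, the following core-Perl snippet assembles a POST request as text and then splits it back apart at the blank line. The build_post and split_request names are invented for illustration, the sketch does no URL encoding of the form values, and it uses the CRLF line endings that HTTP specifies on the wire.

```perl
use strict;
use warnings;

my $CRLF = "\r\n";

# Hypothetical helper: assemble a form POST as the text sent to the server.
sub build_post {
    my ( $host, $path, @fields ) = @_;
    my @pairs;
    while ( my ( $name, $value ) = splice @fields, 0, 2 ) {
        push @pairs, "$name=$value";    # real code must URL-encode these
    }
    my $body = join '&', @pairs;
    return join $CRLF,
      "POST $path HTTP/1.1",
      "Host: $host",
      'Content-Type: application/x-www-form-urlencoded',
      'Content-Length: ' . length($body),    # ASCII only, so chars == bytes
      '',                                    # blank line: end of the headers
      $body;
}

# Hypothetical helper: separate the headers from the entity body.
sub split_request {
    my ($request) = @_;
    my ( $head, $body ) = split /\r\n\r\n/, $request, 2;
    return ( $head, $body );
}

my $request = build_post( 'localhost:5000', '/login',
    username => 'ovid', password => 'youwish' );
my ( $head, $body ) = split_request($request);

print "Headers:\n$head\n\n";
print "Entity body: $body\n";
```

Computing Content-Length from the body, rather than typing it by hand, keeps the header and the entity body in agreement.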

Warning

Many developers believe that because a POST puts the data in the entity body, it’s more secure than a GET. Aside from people copying and pasting sensitive data in a URL, it is not more secure. If you want more security, there are many things you can do, including switching to HTTPS.

Note

A couple of things about our HTTP POST example: first, your author knows that Referer: is misspelled. Sadly, this happened a long time ago and became formalized in RFC 1945, released in 1996. Pedants lament; the rest of us deal with it.

Also, our Content-Type: is application/x-www-form-urlencoded. There are plenty of others available, but we won’t be covering them.

Let’s create a login page. This page doesn’t really “work” in the sense of allowing you to login (hey, this is an intro!), but it shows how a POST request works and also lets us see a bit more refactoring of the params.psgi application.

First, create templates/login.tt with the following:

<html>
  <head><title>Login</title></head>
  <body>
    <fieldset>
      <legend>Pretend to Login, please</legend>
      <form action="/login" method="POST">
        <table>
          <tr><td>Username</td><td><input type="text"
            name="username" /></td></tr>
          <tr><td>Password</td><td><input type="password"
            name="password" /></td></tr>
        </table>
        <div align="center"><input type="submit" value="Submit" /></div>
      </form>
    </fieldset>
  </body>
</html>

This doesn’t actually have any template parameters in it and there are better ways of handling it than putting it in templates/, but this is fine for our purposes. Note that the form’s action sends us back to /login, because that path is what will handle the “login” request.

When you render that HTML, it should resemble Figure 15-2.

Figure 15-2

Now to see how to render it. Modify your params.psgi to contain the following code:

use strict;
use warnings;
use Plack::Builder;
use Plack::Request;
use Template::Tiny;
use File::Slurp 'read_file';
use HTML::Entities 'encode_entities';
builder {
    mount '/' => sub {
        my $env     = shift;
        my $request = Plack::Request->new($env);
        my @params  = get_params_array($request);
        my $content = get_content(
            'templates/params.tt',
            {
                params      => \@params,
                have_params => scalar @params,
            }
        );
        return response( $request, $content );
    };
    mount '/login' => sub {
        my $request = Plack::Request->new(shift);
        my $content;
        if ( $request->param('username') && $request->param('password') ) {
            my @params = get_params_array($request);
            $content = get_content(
                'templates/params.tt',
                {
                    params      => \@params,
                    have_params => scalar @params,
                }
            );
        }
        else {
            $content = get_content('templates/login.tt');
        }
        return response( $request, $content );
    };
};
sub get_params_array {
    my $request = shift;
    my @params;
    foreach my $param ( sort $request->param ) {
        my $values = join ',' => $request->param($param);
        push @params => {
            name  => encode_entities($param),
            value => encode_entities($values)
        };
    }
    return @params;
}
sub response {
    my ( $request, $content ) = @_;
    my $response = $request->new_response(200);
    $response->content_type('text/html');
    $response->content($content);
    return $response->finalize;
}
sub get_content {
    my ( $file, $vars ) = @_;
    $vars ||= {};
    my $template_code = read_file($file);
    my $template      = Template::Tiny->new;
    my $output;
    $template->process( \$template_code, $vars, \$output );
    return $output;
}

We have factored out our response() generation into its own subroutine. The code to get the content for the parameters has also been factored into get_params_array(). It takes the request as an argument and returns the array of parameter key/value pairs. That leaves us with a much leaner builder section.

You can still recognize the / path once you understand the get_params_array() and response() code. It does the same thing it did before. Our “login” code merely checks to see that we have POSTed both a username and a password (any will do). If you haven’t, it will render the login form. If you have, it will render the templates/params.tt page that we saw from our previous example.

Try it now by going to http://localhost:5000/login. Any username and password combination should return the template that shows the values you entered. For example, if you entered ovid and youwish for your username and password, you should see this:

Our list of params:
Name      Value
password  youwish
username  ovid

The URL, however, remains http://localhost:5000/login. As far as Plack::Request (and many other request handlers) is concerned, it makes no difference whether it reads your params from a POST or a GET. Thus, even if you switch this to a GET request, you’ll still see the table listing params:

http://localhost:5000/login?username=ovid;password=youwish

You can protect against this, if you wish, by only allowing processing of the username and password with a POST request method.

if (   'POST' eq $request->method
    && $request->param('username')
    && $request->param('password') )
{
    my @params = get_params_array($request);
    $content = get_content(
        'templates/params.tt',
        { params => \@params, have_params => scalar @params },
    );
}

Warning

We can’t issue enough warnings in this chapter to say that the code presented here is not secure. Internet security is a serious problem and we are presenting this information as examples only. The management regrets harping on this, but it’s important.

Sessions

Your author would ask, at this point, for all serious Web professionals to turn to the next section and ignore the horrible, horrible abuse of sessions that will happen here.

HTTP is, by design, a stateless protocol. This means that each request is independent of every other request. In the early days of the Web, every time you visited a Web page, the server had no idea you had been there before. Then a Netscape employee named Lou Montulli had the brilliant idea of taking “magic cookies” (no, not the type you buy in Amsterdam), which were already in use in other software, and implementing them in Netscape Navigator, a browser popular back in the mid-to-late 1990s. This was one of the most important events in the history of the Web. Now, if a browser returns a cookie to a host, the host can know that the user had previously visited.

In the code that follows, Plack::Session uses cookies to pass a session key back and forth between the Plack software and the client. The Plack software will use the value of the cookie to look up its in-memory session data. Because this data is not persistent, it will not survive between server restarts. In serious production applications, session data is generally saved in a persistent state, such as in a database or memcached.
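To see the idea without any middleware in the way, here is a minimal sketch of a cookie-keyed, in-memory session store. It is not how Plack::Middleware::Session is implemented; the session_id cookie name and both subroutines are invented for this illustration, and the ID generator is far too guessable for real use.

```perl
use strict;
use warnings;
use Digest::SHA 'sha1_hex';

# Session id => data. Held in memory only, so a restart loses everything.
my %session_for;

# Hypothetical id generator. Fine for a demo, too guessable for production.
sub new_session_id {
    return sha1_hex( join '|', time, $$, rand );
}

# Given the value of a Cookie: header, return the session id and its data,
# creating a fresh session when the cookie is missing or unknown.
sub session_from_cookie {
    my $cookie = shift;
    $cookie = '' unless defined $cookie;
    my ($id) = $cookie =~ /\bsession_id=([0-9a-f]{40})\b/;
    $id = new_session_id() unless defined $id && exists $session_for{$id};
    $session_for{$id} ||= {};
    return ( $id, $session_for{$id} );
}

# First request: no cookie yet, so a new session is created.
my ( $id, $session ) = session_from_cookie(undef);
$session->{username} = 'ovid';

# Second request: the browser sends the id back and we find the same data.
my ( $same_id, $same_session ) = session_from_cookie("session_id=$id");
print "Welcome back, $same_session->{username}\n";
```

Note that only the key travels in the cookie. The data itself never leaves the server, which is why the end-user can’t simply edit it.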

Now let the abuse begin!

First, you need to install Plack::Middleware::Session from the CPAN. Then, at the top of your params.psgi, after the modules you use, add the following two lines:

use Plack::Session;
use constant SESSION_TIME => 30;

The session time is in seconds. We’re only using 30-second sessions because your author is sadistic. Feel free to adjust that to taste.

Next, add the following two subroutines:

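sub time_remaining {
    my $session = shift;
    # No start time means nobody has logged in: treat that as expired.
    my $started   = $session->get('time') || 0;
    my $remaining = SESSION_TIME - ( time - $started );
    $remaining = 0 if $remaining < 0;
    return $remaining;
}
sub session_expired {
    my ( $request, $session ) = @_;
    # A bare return is false: the session is still alive.
    return if time_remaining($session);
    # Out of time: discard the session and send the user to /login.
    $session->expire;
    my $response = $request->new_response;
    $response->redirect('/login');
    return $response->finalize;
}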

The time_remaining() subroutine returns the number of seconds left in your session.

The session_expired() subroutine will return false if you have time remaining in your session. Otherwise, it expires the session and returns a redirect to /login.
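If the arithmetic isn’t obvious, here is a small sketch of the same calculation with the start time and the current time passed in explicitly, so the results are predictable. The seconds_left() name and the timestamps are made up for this example.

```perl
use strict;
use warnings;
use constant SESSION_TIME => 30;

# Same calculation as time_remaining(), but with both times supplied by
# the caller instead of read from the session and the clock.
sub seconds_left {
    my ( $started, $now ) = @_;
    my $remaining = SESSION_TIME - ( $now - $started );
    return $remaining < 0 ? 0 : $remaining;
}

print seconds_left( 1000, 1012 ), "\n";    # 12 seconds used, 18 left
print seconds_left( 1000, 1045 ), "\n";    # 45 seconds used, so 0
```

Clamping the result at zero is what lets session_expired() treat the return value as a simple true/false test.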

Then, change your templates/params.tt file to this:

<html>
  <head><title>Parameters</title></head>
  <body>
    <p>Hello [% username %]. You have [% time %] seconds left.</p>
[% IF have_params %]
    <p>Our list of params:</p>
    <table rules="all">
      <tr><th>Name</th><th>Value</th></tr>
  [% FOREACH param IN params %]
      <tr><td>[% param.name %]</td><td>[% param.value %]</td></tr>
  [% END %]
    </table>
[% ELSE %]
    <p><strong>No params supplied!</strong></p>
[% END %]
  </body>
</html>

The only change is immediately after the body tag where we display the session username and the time remaining.

Next, rewrite your builder, again, to match the following:

builder {
    enable 'Session';
    mount '/' => sub {
        my $env     = shift;
        my $request = Plack::Request->new($env);
        my $session = Plack::Session->new($env);
        if ( my $redirect = session_expired( $request, $session ) ) {
            return $redirect;
        }
        my @params = get_params_array($request);
        if ( $session->get('from_login') ) {
            push @params => {
                name  => 'username',
                value => $session->get('username'),
            };
            $session->remove('from_login');
        }
        my %template_vars = (
            params      => \@params,
            have_params => scalar( @params ),
            username    => $session->get('username'),
            time        => time_remaining($session),
        );
        my $content = get_content( 'templates/params.tt', \%template_vars, );
        return response( $request, $content );
    };
    mount '/login' => sub {
        my $env     = shift;
        my $request = Plack::Request->new($env);
        my $session = Plack::Session->new($env);
        my $content;
        if ( $request->param('username') && $request->param('password') ) {
            $session->set( 'username', $request->param('username') );
            $session->set( 'time', time );
            $session->set( 'from_login', 1 );
            my $response = $request->new_response;
            $response->redirect('/');
            return $response->finalize;
        }
        else {
            $content = get_content('templates/login.tt');
        }
        return response( $request, $content );
    };
};

When you first restart the app, you’re presented with the login screen. Let’s say you log in with a username of Bob and a password of Dobbs. You’ll see a screen like this:

Hello Bob. You have 30 seconds left.
Our list of params:
Name      Value
username  Bob

You can refresh this screen as often as you like and as soon as you have zero seconds left, you’re redirected to the /login screen. Your “username” and the time left on your session are stored in the session itself. Let’s break down how this works.

When you have entered both a username and password, the following code is executed:

if ( $request->param('username') && $request->param('password') ) {
    $session->set( 'username', $request->param('username') );
    $session->set( 'time', time );
    $session->set( 'from_login', 1 );
    my $response = $request->new_response;
    $response->redirect('/');
    return $response->finalize;
}

This sets the username, time, and from_login values in our session. In our toy example, this session is held in memory.

When the browser is redirected to /, the following is the relevant bit of code related to the session:

if ( my $redirect = session_expired( $request, $session ) ) {
    return $redirect;
}
my @params = get_params_array($request);
if ( $session->get('from_login') ) {
    push @params => {
        name  => 'username',
        value => $session->get('username'),
    };
    $session->remove('from_login');
}

The session_expired() function will return a redirect to /login if the session was started more than SESSION_TIME seconds ago, or if no start time was ever stored in the session (in other words, nobody has logged in).

Note

You could have accessed the session cookie directly and set the expiration time on that. However, the savvy end-user can edit their cookies and change the time manually, artificially extending their session life. That’s why it’s a good idea to not rely on the cookie expiration time for session length.

The $session->get('from_login') call checks whether the from_login value was set in the session (you can’t rely on the Referer: value because the end-user can change that, too). If it was, the username is added to the list of params for rendering and the from_login value is cleared.

Web Clients

Whew! We’ve covered a huge amount about the concepts behind writing Web applications, but what about writing Web clients? The client most people are familiar with is the Web browser, but that’s for, well, browsing the Web.

Many times you have specific tasks you want a client to accomplish, but a Web browser might be a poor choice. So instead, you write your own client to do the task for you. For example, you might want to get all of the images from a particular Web page. If there are hundreds of images, it might be easier to write a client to get those images for you.

Warning

Before we go any further, you must remember this: web clients are fun and web clients are easy to write. They can save you a lot of trouble, but can also get you in a lot of trouble. Many Web sites have extremely clear terms of service (TOS) which state that you may not use software to “spider” or “automate” their Web site. Others require you to go through official channels to get an API key before you write a client.

They’re not doing this to be mean. They’re doing this for a variety of reasons. They might have limited resources and your Web client might spider them so fast that they have trouble responding to requests. Or they might have time-sensitive content that should not be stored, or it might be copyrighted, and so on. Before you write a client to automate some work with a Web site, be sure to read their TOS and understand what your rights and responsibilities are.

There are ways to work around Web sites blocking your clients, but we’re not going to discuss them here and we encourage you to think carefully before you do. If you must run a client you wrote against the Web site of someone trying to block you, ask permission and if you don’t get it, don’t do it.

As a general pattern for writing a Web client, you’ll be going through three steps:

  • Navigate to where you’re trying to go

  • Fetch the content

  • Parse the content

Now let’s see some examples.

Note

When you’re writing a Web client, you’ll want to understand the various HTTP response codes that you may receive. For example, a 200 means the request succeeded and a 404 means the resource was Not Found. There are many possible response codes and they can be hard to remember. Check out http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html to understand a bit more about them.
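You don’t have to memorize every code, because the first digit tells you most of what you need to know. Here is a minimal sketch; the status_class() subroutine is invented for this example, and in real client code the response object can usually answer these questions for you.

```perl
use strict;
use warnings;

# Map an HTTP status code to the name of its class, based on its
# first digit. Hypothetical helper for illustration only.
sub status_class {
    my $code = shift;
    my %class_for = (
        1 => 'informational',
        2 => 'success',
        3 => 'redirection',
        4 => 'client error',
        5 => 'server error',
    );
    return 'unknown' unless $code =~ /\A([1-5])[0-9][0-9]\z/;
    return $class_for{$1};
}

printf "%d is a %s response\n", $_, status_class($_) for 200, 301, 404, 500;
```

Roughly speaking, a 4xx means you should fix your request, while a 5xx means the problem is on the server’s side.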

Sometimes you want to fetch the links on a Web page. We’ll use LWP::Simple from the libwww-perl distribution to get the HTML for a Web page and HTML::SimpleLinkExtor to extract the links.

You will need to download both of these modules from the CPAN:

$ cpan LWP::Simple HTML::SimpleLinkExtor

Note

The libwww-perl distribution includes many modules that make life easier while writing Web clients. However, they do not support HTTPS (encrypted) URLs. You will need to install LWP::Protocol::https separately. Otherwise, you may get strange errors when writing clients.

So here’s a little script to get us started:

use strict;
use warnings;
use HTML::SimpleLinkExtor;
use LWP::Simple 'get';
my $url       = shift @ARGV or die "Hey, gimme a URL!";
my $html      = get($url) or die "Could not get '$url'";
my $extractor = HTML::SimpleLinkExtor->new;
$extractor->parse($html);
my @links = $extractor->links;
unless (@links) {
    print "No links found for $url\n";
    exit;
}
for my $link (sort @links) {
    print "$link\n";
}

Note

get_links.pl available for download at Wrox.com.

You can run this against, say, a popular search engine like this:

perl get_links.pl http://searchenginename/

It will print out a sorted list of all links found on that page. The get() function exported from LWP::Simple accepts a URL and returns the contents, and HTML::SimpleLinkExtor extracts the links. You can play with this for a few Web pages and you’ll be amazed at how many links they have. But after a while, you may get tired of the Could not get error, particularly if you visit the Web page in your browser and you know that it’s there. The get() function simply returns undef on failure, without telling you what went wrong. So let’s update the program using LWP::UserAgent instead of LWP::Simple.

use strict;
use warnings;
use HTML::SimpleLinkExtor;
use LWP::UserAgent;
my $url = shift @ARGV or die "Hey, gimme a URL!";
my $ua = LWP::UserAgent->new;
$ua->timeout(10);
my $response = $ua->get($url) or die "Could not get '$url'";
unless ( $response->is_success ) {
    die $response->status_line;
}
my $html = $response->decoded_content;
my $extractor = HTML::SimpleLinkExtor->new;
$extractor->parse($html);
my @links = $extractor->links;
unless (@links) {
    print "No links found for $url\n";
    exit;
}
for my $link ( sort @links ) {
    print "$link\n";
}

Now try running this with perl get_links.pl whitehouse.gov (note the lack of http://). You should get an error similar to the following:

400 URL must be absolute at get_links.pl line 14.

Ah! That’s better. Now at least we have some idea of what our errors are.
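If you would rather be forgiving about a missing scheme, you could tidy up the argument before handing it to the user agent. This is only a sketch: the absolute_url() subroutine is hypothetical, and defaulting to http:// is an assumption that won’t suit every site.

```perl
use strict;
use warnings;

# Prepend http:// when the argument has no scheme of its own.
# Assumes that anything without a scheme is a plain Web address.
sub absolute_url {
    my $url = shift;
    return $url if $url =~ m{\A[a-z][a-z0-9+.-]*://}i;
    return "http://$url";
}

print absolute_url('whitehouse.gov'),       "\n";    # http://whitehouse.gov
print absolute_url('https://example.com/'), "\n";    # https://example.com/
```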

Extracting Comments from Web Pages

Your author has a friend, whom we shall presume wishes to be nameless, who has a habit of responding in online forums in a very friendly, informative manner. One of the forums she participates in allows a subset of HTML to be used, including HTML comments. So she embeds HTML comments in her replies. In HTML, they look like this:

<!-- this is a comment -->

The comments can span multiple lines. Her HTML comments span multiple jurisdictions of vitriol spewed at the person she is responding to. So let’s write a small program that prints the HTML comments in a Web page. Sadly, I cannot point you to her comments as this is a family-friendly book, but you’ll enjoy the end result nonetheless.

use strict;
use warnings;
use LWP::UserAgent;
use HTML::TokeParser::Simple;
my $url = shift @ARGV or die "Hey, gimme a URL!";
my $ua = LWP::UserAgent->new;
$ua->timeout(10);
my $response = $ua->get($url) or die "Could not get '$url'";
unless ( $response->is_success ) {
    die $response->status_line;
}
my $html = $response->decoded_content;
my $parser = HTML::TokeParser::Simple->new( \$html );
while ( my $token = $parser->get_token ) {
    print $token->as_is, "\n" if $token->is_comment;
}

Note

get_comments.pl available for download at Wrox.com.

This program uses HTML::TokeParser::Simple to parse the HTML returned by LWP::UserAgent. There are a wide variety of parsers available, some more suited to extracting information than others, but this is a pretty easy one to start with (disclaimer: your author wrote it).

The key portion of this code is here:

my $parser = HTML::TokeParser::Simple->new( \$html );
while ( my $token = $parser->get_token ) {
    print $token->as_is, "\n" if $token->is_comment;
}

If you have the text of a Web page, you must pass it as a reference to the constructor. Then, you can keep calling $parser->get_token to get the next “bit” of the Web page. Tokens are things such as HTML tags, HTML comments, text, and so on. We walk through all of the tokens and only print the ones that are comments. Easy, eh?

One major Web comic has an ASCII pterodactyl embedded in their comments. Another site has <!--IE6sux--> 54 times. You can have a lot of fun finding unexpected comments on Web sites (remember to obey their terms of service).

Filling Out Forms Programmatically

OK, so we’ve written two simple examples so far. Boooooring. Let’s do something a little more involved. Let’s write some software to fill out a form on a Web site and see what happens when we submit it!

Er, except that’s hard to do in a book for a couple of reasons: websites often change their content or TOS and your author would not like to be sued down to his skivvies for encouraging people to do this. Fortunately, we have a workaround.

Our last “Try it out” section had a Web form that you can fill in and submit. Perfect! So go back and run plackup characters.psgi for this example and leave that running in another terminal window. That’s going to be the Web server you’ll run this example against. You’ll need to install WWW::Mechanize and HTML::TableExtract to make this work.

Type in the following example and save it as post_character.pl.

use strict;
use warnings;
use WWW::Mechanize;
use HTML::TableExtract;
my $url  = 'http://localhost:5000/';
my $mech = WWW::Mechanize->new;
$mech->get($url);
$mech->follow_link( text_regex => qr/click here/i );
$mech->submit_form(
    form_number => 1,
    fields      => {
        name       => 'Bob',
        profession => 'redshirt',
        birthplace => 'mars',
    },
);
my $extractor = HTML::TableExtract->new;
$extractor->parse($mech->content);
foreach my $table ( $extractor->tables ) {
    foreach my $row ( $table->rows ) {
        printf "%-20s - %s\n" => @$row;
    }
}

Note

post_character.pl available for download at Wrox.com.

Assuming that you didn’t do something silly like change the HTML in the characters.psgi example, you should get output similar to the following (obviously the numeric values will be different):

Name                 - Bob
Profession           - Doomed
Birth place          - Mars
Strength             - 24
Intelligence         - 22
Health               - 1

It looks like our poor red shirt is going to die. On the plus side, at least he’s smart enough to know it.

WWW::Mechanize has a lovely interface. The code looks remarkably similar to what you might do as a human.

my $mech = WWW::Mechanize->new;
$mech->get($url);
$mech->follow_link( text_regex => qr/click here/i );

As you will recall, when you go to the main page it has a link telling you to Please click here to create a new character.

There is only one form on the page, so we submit our values to form number 1.

$mech->submit_form(
    form_number => 1,
    fields      => {
        name       => 'Bob',
        profession => 'redshirt',
        birthplace => 'mars',
    },
);

Obviously, you’d have to read the HTML of the page to know the names and appropriate values for a given form, but it’s pretty darned easy to do. Just make sure the values you submit match the value="..." data and not the human-visible labels.

Then we use HTML::TableExtract to get the values from the table printed on the next page:

my $extractor = HTML::TableExtract->new;
$extractor->parse($mech->content);
foreach my $table ( $extractor->tables ) {
    foreach my $row ( $table->rows ) {
        printf "%-20s - %s\n" => @$row;
    }
}

To be fair, this was a rather simple example. If you need to do this with more complicated Web sites, you’ll need to read the documentation carefully.

Note

Imagine that you submitted the Web form to generate a character’s stats but didn’t have access to the back-end code. By repeatedly submitting the form and collecting the data, you could eventually get an idea of what’s going on behind the scenes, such as calculating the average value and standard deviation of stats based on profession and birth place.

This is one of the many reasons why Web sites have annoying captchas: it’s very hard to stop people from automating things that you don’t want automated.

Be aware that many Web developers put all of their form validation in JavaScript and not on the back end. Thus, if you use these techniques, you may be able to submit data and generate errors that are hard to reproduce using a browser (sometimes even if you have JavaScript disabled). Be very careful with them and don’t use them irresponsibly.
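The average and standard deviation mentioned in the note above are easy to calculate once you have collected the numbers. Here is a minimal sketch using List::Util, which ships with Perl. The mean() and std_dev() subroutines and the sample scores are made up for this example; the scores are not real output from the character generator.

```perl
use strict;
use warnings;
use List::Util 'sum';

sub mean {
    my @values = @_;
    return sum(@values) / scalar @values;
}

# Population standard deviation: the square root of the average squared
# distance from the mean.
sub std_dev {
    my @values   = @_;
    my $mean     = mean(@values);
    my $variance = sum( map { ( $_ - $mean )**2 } @values ) / scalar @values;
    return sqrt $variance;
}

# Made-up Strength scores, standing in for values scraped from the form.
my @strength = ( 2, 4, 4, 4, 5, 5, 7, 9 );
printf "Average %.2f, standard deviation %.2f\n",
    mean(@strength), std_dev(@strength);
```

A small standard deviation would suggest the stat is mostly a fixed bonus, while a large one would suggest that random rolls dominate.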

Summary

In this chapter, you’ve learned some of the basics of writing Web applications. You’ve used Plack extensively to learn a bit about HTTP and how to read query parameters sent to your application via GET and POST requests. You’ve created simple templates to keep your HTML or other presentation code separated from your application’s main logic. You’ve also learned a little bit about how sessions and cookies work.

You have finally started learning to use your powers for evil (that’s the Web clients), but we’d appreciate it if you didn’t do that. You’ve learned how to write software to read the HTML on a Web site and print out interesting information about it. You’ve also used WWW::Mechanize to automate the process of filling out forms on Web pages. Finally, you learned a bit about using Web APIs to access Web services.

Exercises

  1. Update the characters.psgi and related templates to include Education. A character can study Combat, Medicine, or Engineering. These should give +2 to strength, health and intelligence, respectively.

  2. Using the updated characters.psgi from Exercise 1, update the WWW::Mechanize example to generate 100 Programmer characters, born on Earth, with an Engineering education. Print out the average stats for Strength, Intelligence and Health, along with the high and low values (actually, the standard deviation would be better, but this is not a statistics book).

WHAT YOU LEARNED IN THIS CHAPTER

Topic                     Key Concept
HTTP                      A plain-text protocol to communicate between clients and servers
PSGI                      A specification of how Web servers and applications can communicate
Plack                     A Perl implementation of PSGI
Query string              An encoded way of passing additional information to a Web application
GET                       A way of fetching HTTP resources, often with an embedded query string
POST                      A way of modifying HTTP resources
Cookies                   Small bits of text data stored by your browser and returned to a server
Sessions                  A way of maintaining information about a particular Web client
HTML::SimpleLinkExtor     Extract links from HTML documents
HTML::TokeParser::Simple  Parse HTML documents
WWW::Mechanize            Automate the navigation of Web pages

Answers to exercises

1. Update the characters.psgi and related templates to include Education. A character can study Combat, Medicine, or Engineering. These should give +2 to strength, health and intelligence, respectively.

First, we’ll look at the templates:

In the templates/characters.tt, add the following select group after the Profession. This will allow you to choose your education.

<tr>
  <td>Education</td>
  <td>
    <select name="education">
      <option value="combat">Combat</option>
      <option value="medical">Medical</option>
      <option value="engineering">Engineering</option>
    </select>
  </td>
</tr>

In templates/character_display.tt, add the following line after Profession. It will allow the chosen education to be displayed.

<tr><td>Education</td><td>[% character.education %]</td></tr>

The main work is in our characters.psgi, but it’s fairly easy.

In the generate_character() subroutine, the %adjustments_for hash now looks like this:

my %adjustments_for = (
    profession => {
        programmer => {
            strength     => -3,
            intelligence => 8,
            health       => -2,
        },
        pilot    => { intelligence => 3 },
        redshirt => { strength     => 5 }
    },
    birthplace => {
        earth => {
            strength     => 2,
            intelligence => 0,
            health       => -2,
        },
        mars => { strength     => -5, health => 2 },
        vat  => { intelligence => 2,  health => -2 }
    },
    education => {
        combat      => { strength     => 2 },
        medical     => { health       => 2 },
        engineering => { intelligence => 2 }
    },
);

The %label_for hash now looks like this:

my %label_for = (
    profession => {
        pilot      => "Starship Pilot",
        programmer => "Programmer",
        redshirt   => "Doomed",
    },
    education => {
        combat      => "Combat",
        medical     => "Medical",
        engineering => "Engineering",
    },
    birthplace => {
        earth => "Earth",
        mars  => "Mars",
        vat   => "Vat 3-5LX",
    },
);

And you just need to add education to the list of attributes you are iterating over:

foreach my $attribute (qw/name education profession birthplace/) {
    # create character
}

With that, run plackup characters.psgi and try it out.

2. Using the updated characters.psgi from Exercise 1, update the WWW::Mechanize example to generate 100 Programmer characters, born on Earth, with an Engineering education. Print out the average stats for Strength, Intelligence and Health, along with the high and low values (actually, the standard deviation would be better, but this is not a statistics book).

use strict;
use warnings;
use WWW::Mechanize;
use HTML::TableExtract;
use List::Util qw/min max sum/;
my $url  = 'http://localhost:5000/';
my $mech = WWW::Mechanize->new;
my %stats_for = map { $_ => [] } qw/Strength Intelligence Health/;
for ( 1 .. 100 ) {
    $mech->get($url);
    $mech->follow_link( text_regex => qr/Please click here/ );
    $mech->submit_form(
        form_number => 1,
        fields      => {
            name       => 'Bob',
            profession => 'programmer',
            education  => 'engineering',
            birthplace => 'earth',
        },
    );
    my $te = HTML::TableExtract->new;
    $te->parse( $mech->content );
    foreach my $ts ( $te->tables ) {
        foreach my $row ( $ts->rows ) {
            if ( exists $stats_for{ $row->[0] } ) {
                push @{ $stats_for{ $row->[0] } } => $row->[1];
            }
        }
    }
}
while ( my ( $stat, $values ) = each %stats_for ) {
    my $min = min @$values;
    my $max = max @$values;
    my $avg = sum(@$values)/scalar @$values;
    print "$stat:  Min ($min) Max ($max) Average ($avg)\n";
}

Running this on your author’s computer takes just over a second. Running this over the Web would likely take much longer. Here is the output from two sample runs:

Health:  Min (2) Max (23) Average (12.72)
Strength:  Min (5) Max (26) Average (15.32)
Intelligence:  Min (15) Max (39) Average (26.32)

Health:  Min (0) Max (24) Average (12.12)
Strength:  Min (4) Max (26) Average (14.79)
Intelligence:  Min (17) Max (38) Average (26.29)

It appears that in the second run, we generated a dead programmer. Oops.
