Chapter 7. Subroutines
WHAT YOU WILL LEARN IN THIS CHAPTER
How to declare a subroutine
Passing data to subroutines
Returning data from subroutines
Using prototypes
Using subroutine references
Understanding recursion
Implementing error checking
A subroutine is just a way of providing a “name” to a piece of code. This is useful when you need to execute the same piece of code in several different places in your program, but you don’t want to just “cut-n-drool” the same code all over the place.
Even if you don’t want to reuse a piece of code, applying a name can be useful. Compare the following two lines of code:
my $result = 1 + int( rand(6) );
my $result = random_die_roll();
Just by intelligently naming a subroutine, you can see that the second line of code is much clearer than the first. Thus, we can use subroutines to make our code more self-documenting. As an added benefit, the name of a subroutine is documentation that you don’t forget to add.
Subroutine Syntax
A basic subroutine (often just called a “sub”) is declared with the syntax of:
sub IDENTIFIER BLOCK
IDENTIFIER is the name of the subroutine and BLOCK is the block of code that is executed. So if you want to write a subroutine that simulates the roll of one six-sided die, you can write it like this:
sub random_die_roll {
return 1 + int( rand(6) );
}
The return() builtin is used to return data from a subroutine. You can use the subroutine either before or after it’s defined.
Now that you have assigned a name to that block of code, you can use it more or less like any Perl builtin. This code will print a random number from 1 to 6.
my $result = random_die_roll();
print $result;
sub random_die_roll {
return 1 + int( rand(6) );
}
Note
In Perl, there is no formal distinction between a subroutine and a function. In some programming languages the two are different: a function returns a value and a subroutine does not. There is no such distinction in Perl. As a result, people will sometimes refer to subroutines as functions. Again, don’t get hung up on terminology. Functionality (pun probably not intended) is what you should pay attention to.
Argument handling
Subroutines are often used when you want to reuse some code but with different data. We call the data we pass to subroutines arguments. For example, while six sided dice are the most common, many games have dice with a different number of sides. So we might want to pass to random_die_roll() the number of sides of the die we wish to roll:
my $result = random_die_roll(10);
The arguments to a subroutine are stored in the special @_ array. Here’s how to write the sub that will allow us to optionally pass the number of sides of the die we wish to roll:
sub random_die_roll {
my ($number_of_sides) = @_;
# have a useful default if called with no arguments
$number_of_sides ||= 6;
return 1 + int( rand($number_of_sides) );
}
Warning
Note that we have parentheses around the variable we’re assigning the subroutine arguments to. This is just normal Perl syntax to force list context. Here’s a common mistake many Perl beginners make:
sub random_die_roll {
my $number_of_sides = @_;
# ... more code
}
That evaluates the @_ array in scalar context, setting $number_of_sides to the number of elements in @_. That’s probably not what you wanted.
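The difference between the two assignments is easy to see side by side. The following is a minimal sketch; first_argument() and argument_count() are hypothetical names used only for illustration, and they differ only in the parentheses around the variable.

```perl
use strict;
use warnings;

# Hypothetical helpers that differ only in the parentheses.
sub first_argument {
    my ($first) = @_;    # list context: $first gets the first argument
    return $first;
}

sub argument_count {
    my $count = @_;      # scalar context: $count gets the number of arguments
    return $count;
}

print first_argument( 10, 20, 30 ), "\n";    # prints 10
print argument_count( 10, 20, 30 ), "\n";    # prints 3
```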
If you prefer, you can also write the argument handling like this:
sub random_die_roll {
my $number_of_sides = shift;
# ... more code here
}
The shift() and pop() builtins, when used in a subroutine and called with no arguments, default to operating on @_: shift() removes and returns the first value and pop() removes and returns the last. You can be explicit if you prefer:
my $number_of_sides = shift @_;
Sometimes you’ll see subroutine calls prefixed with an ampersand:
my $result = &random_die_roll();
While valid, this is an older form of subroutine syntax we recommend you do not use except in one very special case:
my $result = &random_die_roll;
Note that we’ve called &random_die_roll without parentheses. When we do that, the current value of @_, if any, is passed to the new subroutine. This is sometimes useful, but it’s confusing because it looks like you called the subroutine without any arguments.
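To see the implicit passing of @_ in action, here is a small sketch. The show_args() and outer() subroutines are hypothetical names invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical sub that reports whatever arguments it received.
sub show_args {
    return "got: @_";
}

sub outer {
    # &show_args with no parentheses reuses outer()'s current @_
    my $implicit = &show_args;

    # show_args() with parentheses passes an empty argument list
    my $explicit = show_args();

    return ( $implicit, $explicit );
}

my ( $implicit, $explicit ) = outer( 1, 2, 3 );
print "$implicit\n";    # prints "got: 1 2 3"
print "$explicit\n";    # prints "got: " (no arguments were passed)
```

Nothing in the call &show_args hints that three arguments are being passed along, which is why this form tends to surprise readers.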
Multiple Arguments
Sometimes you want to roll a die more than once and add up the value of each die roll. Passing multiple arguments to a subroutine is very simple. Here’s how to roll a six-sided die three times and print the result:
sub random_die_roll {
my ( $number_of_sides, $number_of_rolls ) = @_;
# have a useful default if called with no arguments
$number_of_sides ||= 6;
# the number of times to roll the die defaults to 1
$number_of_rolls ||= 1;
my $total = 0;
for ( 1 .. $number_of_rolls ) {
$total += 1 + int( rand($number_of_sides) );
}
return $total;
}
print random_die_roll( 6, 3 );
Because there is more than one way to do it, you can handle the arguments like this:
my $number_of_sides = shift;
my $number_of_rolls = shift;
Or if you prefer to be explicit:
my $number_of_sides = shift @_;
my $number_of_rolls = shift @_;
Note that subroutines in Perl are variadic. That means they can take a variable number of arguments. So if you pass too many arguments to a subroutine, Perl will usually ignore the extra arguments. The following will print a random number from 1 to 10 and will ignore the second argument:
sub random_die_roll {
my ($number_of_sides) = @_;
# have a useful default if called with no arguments
$number_of_sides ||= 6;
return 1 + int( rand($number_of_sides) );
}
print random_die_roll( 10, 3 );
In fact, you can pass as many arguments as you like and Perl will still happily ignore them:
print random_die_roll( 10, 3, $some_val, @foobar );
This is a legacy of Perl’s roots that we still have today. There are modules such as Params::Validate to help deal with this, but Perl programmers usually just read the documentation and know how they’re supposed to call the subroutines.
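If you would rather not rely on callers reading the documentation, you can check the argument count by hand. The following is only a sketch of one possible approach, not how Params::Validate works; the error message wording is an assumption, and it uses croak() from the standard Carp module to report the problem.

```perl
use strict;
use warnings;
use Carp 'croak';

sub random_die_roll {
    # @_ in a numeric comparison is evaluated in scalar context,
    # giving the number of arguments passed
    croak "random_die_roll() takes at most one argument" if @_ > 1;

    my ($number_of_sides) = @_;
    $number_of_sides ||= 6;
    return 1 + int( rand($number_of_sides) );
}

print random_die_roll(10), "\n";

# Passing an extra argument is now an error instead of being ignored
my $ok = eval { random_die_roll( 10, 3 ); 1 };
print "Error: $@" unless $ok;
```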
Named Arguments
When you start passing multiple arguments to a subroutine, it can be confusing to know what the arguments mean. Is the following telling us to roll a six-sided die four times or a four-sided die six times?
print random_die_roll( 6, 4 );
One way to remove that ambiguity is to use named arguments. In Perl, we handle this by passing a hash:
print random_die_roll(
number_of_sides => 6,
number_of_rolls => 4,
);
sub random_die_roll {
my %arg_for = @_;
# assign useful defaults
my $number_of_sides = $arg_for{number_of_sides} || 6;
my $number_of_rolls = $arg_for{number_of_rolls} || 1;
my $total = 0;
for ( 1 .. $number_of_rolls ) {
$total += ( 1 + int( rand($number_of_sides) ) );
}
return $total;
}
This is very useful because not only is it more self-documenting, it also makes it easy for any argument to be optional. When we called random_die_roll(6,3), what if we wanted the default number of sides but to have it rolled three times? You’d have to write something like the following:
my $result = random_die_roll(undef, 3);
# or
my $result = random_die_roll(0, 3);
Both of those can be confusing because their intent may not be clear. Instead, you can write the following:
print random_die_roll( number_of_rolls => 4 );
There is a slight problem with this, though. What if someone doesn’t read your documentation (you write documentation, don’t you?) and they try to call it like this?
print random_die_roll(2);
sub random_die_roll {
my %arg_for = @_;
# assign useful defaults
my $number_of_sides = $arg_for{number_of_sides} || 6;
my $number_of_rolls = $arg_for{number_of_rolls} || 1;
my $total = 0;
for ( 1 .. $number_of_rolls ) {
$total += ( 1 + int( rand($number_of_sides) ) );
}
return $total;
}
Assuming you have warnings enabled, that will warn about Odd number of elements in hash assignment. You will also get the default values for the $number_of_sides and $number_of_rolls. Quite often programmers overlook warnings, forget to enable them, or have so many other warnings that they miss simple ones like this. A better way of handling named arguments is to pass a hash reference instead.
print random_die_roll(
{
number_of_sides => 6,
number_of_rolls => 4,
}
);
sub random_die_roll {
my ($arg_for) = @_;
# assign useful defaults
my $number_of_sides = $arg_for->{number_of_sides} || 6;
my $number_of_rolls = $arg_for->{number_of_rolls} || 1;
my $total = 0;
for ( 1 .. $number_of_rolls ) {
$total += ( 1 + int( rand($number_of_sides) ) );
}
return $total;
}
With this code, if you have use strict (and you should), then calling random_die_roll(6) results in the following fatal error:
Can't use string ("6") as a HASH ref while "strict refs" in use
It’s far better to have your program die horribly than to return bad data.
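If you want a friendlier message than the strict refs error, you can test the argument yourself with the ref() builtin. This is a minimal sketch under a few assumptions: the error text is invented for illustration, and treating a call with no arguments as "use all defaults" is a design choice of this sketch rather than something the original subroutine promised.

```perl
use strict;
use warnings;
use Carp 'croak';

sub random_die_roll {
    my ($arg_for) = @_;

    # Assumption: a call with no arguments means "use every default"
    $arg_for = {} unless defined $arg_for;

    # ref() returns 'HASH' for a hash reference and '' for a plain scalar
    croak "random_die_roll() expects a hash reference"
      unless ref($arg_for) eq 'HASH';

    my $number_of_sides = $arg_for->{number_of_sides} || 6;
    my $number_of_rolls = $arg_for->{number_of_rolls} || 1;

    my $total = 0;
    $total += 1 + int( rand($number_of_sides) ) for 1 .. $number_of_rolls;
    return $total;
}

print random_die_roll( { number_of_rolls => 4 } ), "\n";

# A plain number is rejected with our own message
my $ok = eval { random_die_roll(6); 1 };
print "Error: $@" unless $ok;
```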
Aliasing
One thing to be aware of when using subroutines is that the @_ array aliases its arguments. Thus, you can write the following:
my $number = 40;
inc_by_two($number);
print $number;
sub inc_by_two {
$_[0] += 2;
}
That modifies the $number variable in place and prints 42. However, if you call it like this:
inc_by_two(40);
That generates the following error:
Modification of a read-only value attempted at ...
Naturally, the aliasing cascades, so this throws the same error:
inc_list(3,2,1);
sub inc_list {
foreach (@_) {
$_++;
}
}
As a general rule, subroutines are safest when they don’t have side effects like this. Instead of trying to rely on aliasing to change variables in place, you should generally assign @_ to new variables and return new values.
sub inc_list {
my @numbers = @_;
foreach (@numbers) {
$_++;
}
return @numbers;
}
state variables (pre- and post-5.10)
When you call a subroutine, variables declared in that sub are reinitialized every time you call the subroutine. However, sometimes you only want to initialize the variable once and have it retain its value between subroutine invocations. If you are using Perl version 5.10.0 or better, you can declare a state variable. Here’s a subroutine that tracks the number of times it has been called.
use 5.010;
sub how_many {
state $count = 0; # this is initialized only once
$count++;
print "I have been called $count time(s)\n";
}
how_many() for 1 .. 5;
That prints:
I have been called 1 time(s)
I have been called 2 time(s)
I have been called 3 time(s)
I have been called 4 time(s)
I have been called 5 time(s)
On versions of Perl older than 5.10.0, you can still do this, but you wrap the subroutine in a block and declare the $count variable in that block, but outside of the subroutine:
{
my $count = 0;
sub how_many {
$count++;
print "I have been called $count time(s)\n";
}
}
how_many() for 1 .. 5;
That prints the same thing.
The reason it works is because the subroutine is in the block in which the $count variable has been declared. It is said to “close over” the scope of that variable and is thus known as a closure. Closures are common in Perl, but are usually used with anonymous subroutines, which we’ll discuss later.
Note that the $count variable doesn’t really need to be declared in a block like that, but if you don’t, other sections of code might see the $count variable and accidentally change its value. The block is just there to safely restrict the scope of $count.
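Because the block only limits where $count is visible, more than one subroutine can live inside it and share the same private variable. The sketch below is a variation on the example above: how_many() returns the count instead of printing it, and reset_count() is a hypothetical companion sub added for illustration.

```perl
use strict;
use warnings;

{
    my $count = 0;    # visible only inside this block

    sub how_many {
        $count++;
        return $count;
    }

    # Hypothetical companion sub that shares the same private variable
    sub reset_count {
        $count = 0;
        return;
    }
}

how_many() for 1 .. 3;
print how_many(), "\n";    # prints 4
reset_count();
print how_many(), "\n";    # prints 1
```

Code outside the block can only change $count by calling one of these two subroutines.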
Warning
It’s generally a bad idea to have a subroutine referring to variables not explicitly passed to the subroutine. This is because if some other code changes those variables in a way the subroutine does not expect, it can be very difficult to find out what code is responsible for making that change. This is why, for older Perls, we put the $count variable in a very limited scope to make sure that other code can’t touch it.
However, this style of making state variables is clumsy and error prone. Consider a subroutine that wants to make sure it’s never called with the same argument twice in a row:
use strict;
use warnings;
do_stuff($_) for 1 .. 5;
{
my $last = 0;
sub do_stuff {
my $arg = shift;
if ( $arg == $last ) {
print "You called me twice in a row with $arg\n";
}
$last = $arg;
}
}
That code generates the following warning:
Use of uninitialized value $last in numeric eq (==) at ...
Why? Variable declaration happens at compile time, before the code is run. However, variable assignment happens at runtime and the assignment of 0 to $last doesn’t happen until after the calls to do_stuff(). Thus, the first time do_stuff() is called, $last is declared but has no value assigned to it! This is not an issue with state variables:
use strict;
use warnings;
use 5.010;    # enables the state keyword
do_stuff($_) for 1 .. 5;
sub do_stuff {
state $last = 0;
my $arg = shift;
if ( $arg == $last ) {
print "You called me twice in a row with $arg\n";
}
$last = $arg;
}
That doesn’t generate the warning. $last is still declared at compile time, but the initialization of a state variable happens the first time execution enters the do_stuff() subroutine, so $last already has the value 0 when it is first compared with $arg.
Note
See perldoc feature and perldoc -f state for more information about using state variables.
Passing a list, hash, or hashref?
This section isn’t really about Perl, but about good coding style. You can skip it if you want, but if you’re new to programming, it’s worth reading.
Many times when writing a subroutine we have to decide if we want to pass single arguments, multiple arguments, references, and so on. Here are a few good rules of thumb to consider.
If you have more than two arguments to pass to a subroutine, consider using a hash reference to use named arguments, particularly if some of the arguments are optional. Consider the following subroutine call where the account number may be optional. If the customer only has one account, then the subroutine might default to that account. If you want to check the balance and there is no amount to $debit, that might be optional too. Named arguments are warranted here:
# probably bad
my $balance = get_balance( $customer, $account_number, $debit );
# better
my $balance = get_balance({
account_number => $account_number,
customer => $customer,
debit => $debit,
});
With that, you can omit the account_number and debit and still have code that is easy to read. Plus, the order of the arguments becomes irrelevant.
But you might think that passing a hash reference is overkill here. It’s perfectly easy to read with good variable names, right? Well, you may find yourself in a section of your code where the variable names are not so clear:
my $balance = get_balance({
account_number => $acct,
customer => $co,
});
Well-chosen named arguments make code much easier to read.
So is there ever a reason to pass a list to a subroutine? Sure! If you only pass one or two items, or if every item in the list is conceptually the same, passing a list is fine:
sub sum {
my @numbers = @_;
my $total = 0;
$total += $_ foreach @numbers;
return $total;
}
print sum(4, 7, 2, 100);
In this case, using named arguments would be silly as we’re just summing a list of numbers.
Sometimes passing a list would be a bad idea. Imagine if the numbers you passed into sum() were two million order totals you’ve just read from a CSV file. When you pass the list to sum(), Perl must copy every value and this might eat up a lot of memory. Instead, you can pass a reference and Perl will only copy the single value of the reference:
sub sum {
my ($numbers) = @_;
my $total = 0;
$total += $_ foreach @$numbers;
return $total;
}
print sum(\@two_million_numbers);
Sometimes you might want to pass a hash to a sub, but as explained previously, there is nothing to stop one from passing something that isn’t a hash. As a result, hard-to-find bugs can creep into your code. Using a hashref when you want a hash is much safer.
Returning data
When writing subroutines, it’s not very helpful if you can’t return data. We’ll explain many of the ways of doing this that you’ll encounter in real code. The clearest way to do this is to use the return builtin.
Returning true/false
Many of the most basic subroutines return a true or false value. Here’s one way to write an is_palindrome() subroutine, ignoring the case of the word:
sub is_palindrome {
my $word = lc shift;
if ( $word eq scalar reverse $word ) {
return 1;
}
else {
# a bare return returns an empty list which evaluates to false
return;
}
}
for my $word (qw/Abba abba notabba/) {
# remember that the ternary ?: operator is a shortcut for if/else
my $maybe = is_palindrome($word) ? "" : "not ";
print "$word is ${maybe}a palindrome\n";
}
And that prints:
Abba is a palindrome
abba is a palindrome
notabba is not a palindrome
You’ll notice that, unlike some other languages, you can put a return statement anywhere in the body of the subroutine. However, we can make this subroutine even simpler:
sub is_palindrome {
my $word = lc shift;
return $word eq scalar reverse $word;
}
If you don’t include an explicit return statement in a subroutine, the subroutine will return the result of the last expression to be evaluated, allowing you to write is_palindrome() as follows:
sub is_palindrome {
my $word = lc shift;
$word eq scalar reverse $word;
}
It’s strongly recommended that you use an explicit return on all but the simplest subroutines because in a complicated subroutine, explicit return statements clarify flow control.
Warning
Some developers prefer to return an empty string or a zero for “false”.
sub is_palindrome {
my $word = lc shift;
# the parentheses stop reverse from swallowing the ?: expression
return $word eq scalar( reverse $word ) ? 1 : 0;
}
That’s OK, but consider the following:
if ( my @result = is_palindrome($word) ) {
# do something
}
That’s a silly example, but if you return an empty string or a zero for false, then @result will now be a one-element array and will evaluate to true! This can cause strange bugs in your code if you don’t consider this.
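The effect is easy to demonstrate by counting the elements each style of "false" produces in list context. This is a minimal sketch; both subroutine names are hypothetical and exist only for the comparison.

```perl
use strict;
use warnings;

# Hypothetical subs that both mean "false" but behave differently
sub false_with_bare_return { return }      # empty list in list context
sub false_with_zero        { return 0 }    # one-element list in list context

my @bare = false_with_bare_return();
my @zero = false_with_zero();

print "bare return: ", scalar(@bare), " element(s)\n";    # 0 element(s)
print "return 0:    ", scalar(@zero), " element(s)\n";    # 1 element(s)

# A list assignment used as a condition is true when the list is non-empty
if ( my @result = false_with_zero() ) {
    print "return 0 looks true in a list assignment\n";
}
```

In scalar context both subroutines still return a false value; the difference only appears when the result is assigned to a list.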
Note
If you need a review of true and false values, see “The if Statement” in Chapter 4, Working With data.
Returning single and multiple values
As you might guess from the preceding examples, returning a single value is as simple as return $some_value:
use constant PI => 3.1415927;
sub area_of_circle {
my $radius = shift;
return PI * ( $radius ** 2 );
}
print area_of_circle(3);
That will print 28.2743343, the area of a circle with a radius of 3 (of whatever units you’re using).
Returning multiple values is simple. Just return them!
return ( $first, $second, $third );
Be aware, though, that if you return an array or hash, its data will be flattened into a list:
sub double_it {
my @array = @_;
$_ *= 2 for @array;
return @array;
}
That will return a new list with the values doubled. However, if you want to return two arrays, or two hashes, or an array and a hash, and so on, you will want to return references:
sub some_function {
my @args = @_;
# do stuff
return \@array1, \@array2;
}
my ( $arrayref1, $arrayref2 ) = some_function(@some_data);
Be careful with returning multiple values. Many languages only allow a single value to be returned from a subroutine. This is actually not a bad idea. If you’re trying to return too much from a single subroutine, it’s often a sign that the subroutine is trying to do too much.
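As a concrete illustration of returning two array references, here is a small sketch. The split_even_odd() subroutine is a hypothetical example, not something used elsewhere in this chapter.

```perl
use strict;
use warnings;

# Hypothetical example: split a list of integers into evens and odds
sub split_even_odd {
    my @numbers = @_;
    my ( @even, @odd );
    for my $number (@numbers) {
        if ( $number % 2 == 0 ) {
            push @even, $number;
        }
        else {
            push @odd, $number;
        }
    }

    # Returning the arrays themselves would flatten them into one list,
    # so return a reference to each instead
    return \@even, \@odd;
}

my ( $even, $odd ) = split_even_odd( 1 .. 6 );
print "even: @$even\n";    # even: 2 4 6
print "odd:  @$odd\n";     # odd:  1 3 5
```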
Note
You may have noticed the last line of the _running_total subroutine we used earlier:
sub _running_total {
state $running_total = 0;
my $numbers = shift;
my $total = 0;
$total += $_ for @$numbers;
$running_total += $total;
return $total, $running_total;
}
Note that we are returning a list of values but we’re not using parentheses around the list. In Perl, it’s fine to return a list like this. The comma operator is what defines a list (not the parentheses, as many people believe) and since return has a fairly low precedence (Chapter 4, Working With data), there is no need to wrap the list in parentheses. However, many people feel more comfortable with using parentheses here and that’s fine:
return ( $total, $running_total );
With or without parentheses, returning a list this way is the same thing. Just remember that you will need the parentheses when assigning the values to variables:
my ( $total, $running_total ) = _running_total(\@numbers);
wantarray
The wantarray builtin (perldoc -f wantarray) gives you some information about how the subroutine was called. It returns undef if you don’t use the return value (void context), a false but defined value if you use it in scalar context, and a true value if you are expecting a list. The following should make this clear:
sub how_was_i_called {
if ( not defined wantarray ) {
# no return value expected
print "I was called in void context\n";
}
elsif ( not wantarray ) {
# one return value expected
print "I was called in scalar context\n";
}
else {
# a list is expected
print "I was called in list context\n";
}
}
how_was_i_called();
my $foo = how_was_i_called();
my ($foo) = how_was_i_called();
my @bar = how_was_i_called();
my ( $this, $that ) = how_was_i_called();
my %corned_beef = how_was_i_called();
That prints:
I was called in void context
I was called in scalar context
I was called in list context
I was called in list context
I was called in list context
I was called in list context
The first how_was_i_called() did not assign the result to any values, so it’s in “void” context.
The second how_was_i_called() assigns to my $foo and results in a scalar context.
The my ($foo) results in a list context because the parentheses force a list context. Also, the my @bar, my ( $this, $that ), and my %corned_beef result in the subroutine being called in list context.
There are a variety of uses for wantarray, but it is usually used for returning a reference when called in scalar context:
sub double_it {
my @array = @_;
$_ *= 2 for @array;
return wantarray ? @array : \@array;
}
With that, if you call double_it() in scalar context, you will get an array reference back.
Use of the wantarray builtin is controversial, and many programmers recommend against it because it can lead to surprising behavior when a developer is not expecting the subroutine to behave differently just because they’re calling it in a different context.
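To make the context-dependent behavior concrete, the sketch below calls the same double_it() subroutine in list and scalar context. Only the calling code is new; the subroutine is the one shown above.

```perl
use strict;
use warnings;

sub double_it {
    my @array = @_;
    $_ *= 2 for @array;
    return wantarray ? @array : \@array;
}

my @list = double_it( 1, 2, 3 );    # list context: a list of values
my $ref  = double_it( 1, 2, 3 );    # scalar context: an array reference

print "@list\n";          # 2 4 6
print "@$ref\n";          # 2 4 6
print ref($ref), "\n";    # ARRAY
```

The two calls look almost identical, which is exactly why a reader can be caught out by the different return types.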
FAIL!
Subroutines never know how they’re going to be called (or at least, they shouldn’t), but they should be able to handle problems. Here’s a great example of a problem:
sub reciprocal {
my $number = shift;
return 1 / $number;
}
As you may recall from math class, the reciprocal of a number is one divided by that number (or that number raised to the power of −1). However, what happens when you pass a zero to our reciprocal subroutine? Your program dies with an Illegal division by zero error. Or what happens if you pass a reference instead of a number? Or maybe you passed a string? That’s where you want to check the error and handle it appropriately.
“Wake up! It’s time to die!”
Sometimes you need your program to die rather than spit out bad data. You can use the die builtin for this. The die builtin optionally accepts a string. It will print that string to STDERR (see Chapter 4, Working With data) and halt the program’s execution at that point (though you can trap this with eval() as we’ll see later). So let’s say we have a program that should be executed via the command line as:
perl count_to.pl 7
And that should count from 1 to the number supplied. You want that number to look like a number and to be greater than 0. Otherwise, you want the program to die. Arguments to programs are passed via the @ARGV variable (though we’ll cover this more in Chapter 18, Common tasks when we cover command line handling). We’ll also use the looks_like_number() subroutine exported from the standard Scalar::Util module.
use strict;
use warnings;
use Scalar::Util 'looks_like_number';
my $number = $ARGV[0];
if ( not @ARGV or not looks_like_number($number) or $number < 1 ) {
die "Usage: $0 positivenumber";
}
print "$_\n" for 1 .. $number;
If you run that without any arguments, with an argument that doesn’t look like a number, or with a number less than 1, the program will die with the following error message:
Usage: count_to.pl positivenumber at count_to.pl line 8
Note
The $0 variable contains the name of the program you’re currently running. See perldoc perlvar for more information.
That’s a very handy way of stopping a program before serious problems occur and letting the user know what the problem is.
If a problem is worth a warning but not worth stopping the program, you can warn instead:
unless ($config_file) {
warn "No config file supplied. Using default config";
$config_file = $default_config_file;
}
warn prints its message to STDERR just as die does, but your program keeps running.
carp and croak
Calling die is useful, but you might notice that it prints the line number of where it died. Quite often that’s a problem because you don’t want to know where the code died, but the line number of the calling code. This is where the carp() and croak() subroutines come in. These are optionally exported by the standard Carp module.
use Carp 'croak';
sub reciprocal {
my $number = shift;
if ( 0 == $number ) {
croak "Argument to reciprocal must not be 0";
}
return 1 / $number;
}
reciprocal(0);
And that will print something like:
Argument to reciprocal must not be 0 at reciprocal.pl line 5
main::reciprocal(0) called at reciprocal.pl line 11
It tells you where the error occurred (line 5) and where it was called from (line 11). In this simple example, it’s not that important, but in larger programs where reciprocal() can be called from multiple locations, it’s vital information to track down the error.
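The two-line report above appears because the subroutine and its caller are in the same package. When the subroutine lives in a different package from the calling code, croak() normally reports a single line that points at the caller. The sketch below illustrates this; My::Math is a hypothetical package name, and the error is trapped with a block eval (covered later in this chapter) only so that the program can print it and carry on.

```perl
use strict;
use warnings;

package My::Math;    # hypothetical package name
use Carp 'croak';

sub reciprocal {
    my $number = shift;
    croak "Argument to reciprocal must not be 0" if 0 == $number;
    return 1 / $number;
}

package main;

# The message refers to this line in main, not to the croak line above
my $ok    = eval { My::Math::reciprocal(0); 1 };
my $error = $@;
print $error unless $ok;
```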
If you don’t want to stop the program but you need a warning, there’s also the carp() subroutine which is like croak(), but for warn instead of die.
use Carp qw(croak carp);
unless ($config_file) {
carp "No config file supplied. Using default config";
$config_file = $default_config_file;
}
The Carp module also exports confess() and cluck(). These are like croak() and carp(), but they also provide full stack traces.
eval
Sometimes you want to try to run some code that might fail, but handle the failure gracefully rather than killing the program. This is where the eval() builtin comes in handy. There are two types of eval: string and block.
String eval
The first form of eval is used with an expression (typically a string, but not always). The Perl interpreter is used to interpret the expression and, if it succeeds, the code is then executed in the current lexical scope. This form of eval is often used to delay loading code until runtime or to allow a developer to fall back to an alternative solution to a problem. The special $@ variable is set if there are errors.
Consider trying to debug the following example, shown earlier in the chapter:
use Data::Dumper;
$Data::Dumper::Indent = 0;
my @numbers = ( 1, 2, 3 );
my @new = map { $_++ } @numbers;
print Dumper(\@numbers, \@new);
That printed something like this:
$VAR1 = [2,3,4];$VAR2 = [1,2,3];
However, the $VAR1 and $VAR2 variables can be confusing, particularly when you’re trying to figure out what went wrong with your program. Data::Dumper offers a syntax which allows you to “name” these variables:
print Data::Dumper->Dump(
[\@numbers, \@new],
[qw/*numbers *new/],
);
And that prints a much more “friendly” result:
@numbers = (2,3,4);@new = (1,2,3);
However, the syntax is cumbersome. As a result, your author has released Data::Dumper::Names. It behaves like Data::Dumper, but tries to provide the names of the variables. Simply change Data::Dumper to Data::Dumper::Names and you should get the above output. But what if you don’t have that installed? You can use a string eval to fall back to Data::Dumper.
eval "use Data::Dumper::Names";
if ( my $error = $@ ) {
warn "Could not load Data::Dumper::Names: $error";
# delay loading until runtime. This is a standard module
# included with Perl
eval "use Data::Dumper";
}
$Data::Dumper::Indent = 0;
my @numbers = ( 1, 2, 3 );
my @new = map { $_++ } @numbers;
print Dumper(\@numbers, \@new);
With this code, regardless of whether or not you could successfully load Data::Dumper::Names, you still get sensible output, though you’ll get a large warning message to boot.
Block eval
The block form of eval is used to trap the error with code that might fail. This is similar to try/catch with other languages, though it has some issues as we’ll soon see.
sub reciprocal { return 1/shift }
for (0 .. 3) {
my $reciprocal;
eval {
$reciprocal = reciprocal($_);
}; # the trailing semicolon is required
if ( my $error = $@ ) {
print "Could not calculate the reciprocal of $_: $error\n";
}
else {
print "The reciprocal of $_ is $reciprocal\n";
}
}
And that prints:
Could not calculate the reciprocal of 0: Illegal division by zero at recip.pl line 1.
The reciprocal of 1 is 1
The reciprocal of 2 is 0.5
The reciprocal of 3 is 0.333333333333333
As you can see, the block form of eval is very useful. Unfortunately, it’s also tricky to use safely. Let’s look at a few of the problems and their solutions.
eval gotchas
You probably noticed that after the block eval, we immediately saved the error into a variable:
eval { ... };
if ( my $error = $@ ) {
handle_error($error);
}
Why is that? Because in the example above, if handle_error() itself has an eval, it may reset $@, causing us to lose our error message.
Another common mistake is this:
if ( my $result = eval { some_code() } ) {
# do something with $result
}
else {
warn "Could not calculate result: $@";
}
As you might guess, if some_code() is allowed to return a false value (zero, the empty string, undef, and so on), then we might think we have an error when we actually don’t. A better way to write the above is this:
my $result;
my $ok = eval { $result = some_code(); 1 };
if ($ok) {
# do something with $result
}
else {
my $error = $@;
warn "Could not calculate result: $error";
}
Note that the eval block has a bare 1 as the last expression. The block will return the value of the last expression and if some_code() does not generate an error, then $ok will be set to 1 and $result will have the return value of some_code(). Otherwise, $ok will be set to undef.
But there’s still a problem with the above code! If you’re working on a large system, it’s entirely possible that your eval() might be called from code that is also wrapped in an eval. When you call eval(), you’ve clobbered the outer code’s $@. So we need to rewrite this again, localizing $@!
my $result;
my $ok = do {
local $@;
eval { $result = some_code(); 1 };
};
That’s starting to get tedious, but it’s fairly safe. You now know about the problems with eval, which you will probably encounter in older code. We strongly recommend that you install the excellent Try::Tiny module from the CPAN.
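The false-return pitfall described above can be seen in a few lines. In this sketch, some_code() is a hypothetical stand-in that succeeds but legitimately returns 0, and the variable names are chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sub that succeeds but legitimately returns a false value
sub some_code { return 0 }

# Naive form: the false return value is mistaken for a failure
my $naive_looks_failed = eval { some_code() } ? 0 : 1;

# Safer form: the trailing 1 makes the eval block true unless it died,
# and local keeps any outer value of $@ from being clobbered
my $result;
my $ok = do {
    local $@;
    eval { $result = some_code(); 1 };
};

print "naive check reports failure: $naive_looks_failed\n";      # 1
print "safer check reports success: $ok (result: $result)\n";    # 1 (result: 0)
```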
Try::Tiny
The Try::Tiny module provides a try/catch/finally system for Perl. Let’s rewrite our reciprocal code using it.
use Try::Tiny;
sub reciprocal { return 1/shift }
for my $number (0 .. 3) {
my $reciprocal;
try {
$reciprocal = reciprocal($number);
print "The reciprocal of $number is $reciprocal\n";
}
catch {
my $error = $_;
print "Could not calculate the reciprocal of $number: $error\n";
};
}
This behaves exactly like our previous eval solution, but it will not clobber the $@ variable. Also, note that any error is now contained in $_ instead of $@, which is why we now name the number as $number to avoid confusion.
The catch block only executes if the try block trapped an error.
You can also provide an optional finally block that always executes, error or not:
try {
$reciprocal = reciprocal($number);
print "The reciprocal of $number is $reciprocal\n";
}
catch {
my $error = $_;
print "Could not calculate the reciprocal of $number: $error\n";
}
finally {
print "We tried to calculate the reciprocal of $number\n";
};
Install Try::Tiny from the CPAN and read the documentation for more information about this excellent module. You may also want to read its source code (perldoc -m Try::Tiny) to learn more about effective use of prototypes (explained later in this chapter), though some of the code is advanced.
Subroutine references
One lovely and powerful feature about Perl is the ability to take references to subroutines. This seems strange, but if you’re familiar with this feature, you can do strange and wonderful things. You can take references to existing subroutines or create anonymous subroutine references.
Existing subroutines
We previously mentioned the use of a leading ampersand to call a subroutine. Just as $, @, and % are the sigils for scalars, arrays, and hashes, the & is the sigil for subroutines, though it’s not seen as often. Thus, taking a reference to an existing subroutine is:
sub reciprocal { return 1 / shift }
my $reciprocal = \&reciprocal;
And there are two ways of calling this:
# first form: dereference with the & sigil
my $result = &$reciprocal(4);
print $result;
# second form: dereference with the -> operator
$result = $reciprocal->(4);
print $result;
The first method, using &$reciprocal(4), is dereferencing the subroutine with the & sigil and calling with arguments like usual. However, we recommend the second form, $reciprocal->(4), using the standard -> dereferencing operator. This is easier to read (you’re less likely to miss that leading &) and it’s more consistent in your code if you consistently use the dereferencing operator.
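One common use for references to existing subroutines is a dispatch table: a hash that maps names to subroutine references. The following is a minimal sketch; the add(), subtract(), and calculate() subroutines and the %operation_for hash are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

sub add      { return $_[0] + $_[1] }
sub subtract { return $_[0] - $_[1] }

# Map operation names to references to the named subroutines above.
my %operation_for = (
    add      => \&add,
    subtract => \&subtract,
);

sub calculate {
    my ( $name, $left, $right ) = @_;
    my $operation = $operation_for{$name}
      or die "Unknown operation: $name";

    # Call through the reference with the -> dereferencing operator.
    return $operation->( $left, $right );
}

print calculate( 'add',      7, 3 ), "\n";    # 10
print calculate( 'subtract', 7, 3 ), "\n";    # 4
```

Adding a new operation then only requires a new subroutine and one more entry in the hash.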
Anonymous subroutines
Just as you can take references to anonymous arrays and hashes (amongst other things), you can also take references to anonymous subroutines by omitting the subroutine name identifier and assigning the result to a variable:
my $reciprocal = sub { return 1 / shift };
print $reciprocal->(4);
Closures
So far taking references to subroutines seems interesting, but how do you use this? One way is to use a closure. A closure is a subroutine that refers to variables defined outside of its block. It is said to “close over” these variables. These have a variety of uses, though we won’t cover them extensively. We strongly recommend the book Higher Order Perl by Mark Jason Dominus if you truly wish to have your mind twisted by their power.
Note
While a closure does not need to be an anonymous subroutine, it’s usually implemented as such.
Closures are often used for iterators and lazy evaluation. Let’s say you want to periodically fetch the next Fibonacci number. In mathematics, Fibonacci numbers are in the form:
F(0) = 0
F(1) = 1
F(n) = F(n-1) + F(n-2)
So we wind up with an infinite list like:
0, 1, 1, 2, 3, 5, 8, 13, 21 ...
Obviously computing an infinite list all at once is not feasible, so we’ll use a closure to create an iterator that will generate these numbers one at a time. This code is explained in Example 7.1, “Computing the Fibonacci sequence”.
Example 7.1. Computing the Fibonacci sequence
#!perl
use strict;
use warnings;
use diagnostics;
sub make_fibonnaci {
my ( $current, $next ) = ( 0, 1 );
return sub {
my $fibonnaci = $current;
( $current, $next ) = ( $next, $current + $next );
return $fibonnaci;
};
}
my $iterator = make_fibonnaci();
for ( 1 .. 10 ) {
my $fibonnaci = $iterator->();
print "$fibonnaci\n";
}
Note
fibonnaci.pl available for download at Wrox.com.
The make_fibonnaci() subroutine returns an anonymous subroutine which references the $current and $next variables declared in the make_fibonnaci() subroutine, but outside of the anonymous subroutine. The $iterator variable contains a reference to this anonymous subroutine and it “remembers” the values of the $current and $next variables. Every time it is invoked, it updates the values of $current and $next and returns the next Fibonacci number. Eventually, we get to the for loop which prints the first 10 Fibonacci numbers. You can pass the $iterator variable to other subroutines just like any other variable and it will still remember its state.
In fact, you can create several iterators with this same subroutine and each will have a separate copy of $current and $next.
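To see that independence in action, here is a small sketch that reuses the make_fibonnaci() subroutine from Example 7.1 and creates two iterators. The variable names $first, $second, @from_first, and $from_second are ours, not part of the original example.

```perl
use strict;
use warnings;

# Same closure factory as in Example 7.1.
sub make_fibonnaci {
    my ( $current, $next ) = ( 0, 1 );
    return sub {
        my $fibonnaci = $current;
        ( $current, $next ) = ( $next, $current + $next );
        return $fibonnaci;
    };
}

my $first  = make_fibonnaci();
my $second = make_fibonnaci();

# Advance only the first iterator five times.
my @from_first = map { $first->() } 1 .. 5;

# The second iterator closed over its own $current and $next,
# so it still starts from the beginning of the sequence.
my $from_second = $second->();

print "first:  @from_first\n";     # first:  0 1 1 2 3
print "second: $from_second\n";    # second: 0
```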
Prototypes
A prototype is a very simple compile time argument check for subroutines. After the subroutine name but before the opening curly brace of the block, you can include a prototype in parentheses. The syntax looks like this:
sub sreverse($) {
my $string = shift;
return scalar reverse $string;
}
my $raboof = sreverse 'foobar';
print $raboof;
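print sreverse 'foobar', 'foobar';
The first print displays raboof, the reverse of foobar (you may recall that reverse takes a list and does not reverse a string unless called in scalar context). Because the ($) prototype makes sreverse() take exactly one argument, the last line passes only the first 'foobar' to sreverse() and hands the second one to print, so it displays raboof followed by foobar.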
Argument coercion
With a prototype using the scalar sigil $, we force scalar context on the argument to sreverse(). We also guarantee, because only one sigil has been used in the prototype, that only one variable will be used as the argument.
So you can write this:
sub sreverse($) {
my $string = shift;
return scalar reverse $string;
}
print sreverse("this", "that");
And Perl will fail at compile time, telling you that you have passed too many arguments to the subroutine:
Too many arguments for main::sreverse at proto.pl line 5, near ""that")"
Execution of proto.pl aborted due to compilation errors.
Note that you don’t even need strict or warnings for this error to stop your program from compiling.
You can also use @ or % for a prototype. This will slurp in all remaining arguments in list context.
sub foo(@) {
my @args = @_;
...
}
That might seem silly, but it means you can combine it with another prototype character:
sub random_die_rolls($@) {
my ( $number_of_rolls, @number_of_sides ) = @_;
my @results;
foreach my $num_sides (@number_of_sides) {
my $total = 0;
$total += int( 1 + rand($num_sides) ) for 1 .. $number_of_rolls;
push @results, $total;
}
return @results;
}
my @rolls = random_die_rolls 3, 6, 12, 20;    # example dice: 6, 12, and 20 sides
print join "\n", @rolls;
That might print something like:
8
26
31
It simulates 3 rolls of each of the subsequent dice with the requisite number of sides. In this particular case, the prototype offers no particular advantage.
So far there’s nothing terribly exciting here, but you can start to do interesting things if you put a backslash in front of a sigil. When you do this, you can pass the variable and it will be accepted as a reference. Here’s a subroutine that will attempt to lower-case all hash values that are not references.
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;
sub my_lc(\%) {
my $hashref = shift;
foreach my $key (keys %$hashref) {
next if ref $hashref->{$key};
$hashref->{$key} = lc $hashref->{$key};
}
}
my $name = 'Ovid';
my %hash = (
UPPER => 'CASE',
Camel => 'Case',
);
# hey, no backslash required!
my_lc %hash;
print Dumper(\%hash);
And that prints out:
$VAR1 = {
'Camel' => 'case',
'UPPER' => 'case'
};
Because the hash is passed as a reference, it’s modified in place. Just copy the hash and return it if you don’t want this behavior:
sub my_lc(\%) {
my $hashref = shift;
my %hash = %$hashref;
foreach my $key (keys %hash) {
next if ref $hash{$key};
$hash{$key} = lc $hash{$key};
}
return %hash;
}
my %lc_hash = my_lc %hash;
More Prototype Tricks
There’s a lot more you can do with prototypes, but we generally don’t recommend them if you don’t know what you’re doing. They don’t specify what type of variable you’re passing in. They tend to specify the context of the variable you’re passing in and this mimics Perl built-ins. For example, let’s say you want to write your own length() subroutine. In Perl, the length() builtin is only for scalars. It’s not for arrays and hashes. Here’s a lovely little example, borrowed from a long Tom Christiansen email to the Perl 5 Porters list (and republished at http://www.perlmonks.org/?node_id=861966).
For some reason, you decide that you want to write a wrapper around the length() builtin because you want it to handle arrays and hashes. We’ve already shown how to handle this with a dispatch table, but let’s try to handle this with prototypes.
sub mylength($) {
my $arg = shift;
return
'ARRAY' eq ref $arg ? scalar @$arg
: 'HASH' eq ref $arg ? scalar keys %$arg
: length $arg;
}
my $scalar = "whee!";
print mylength($scalar), "\n";
my @array = ( 1, 18, 9 );
print mylength(@array), "\n";
my %hash = ( foo => 'bar' );
print mylength(%hash), "\n";
You can probably already guess that something is wrong because even though we haven’t covered how to use prototypes with different kinds of arguments, this looks, well, strange. Except that it’s stranger than you think. This prints out:
5
1
3
You can understand why it prints 5 for whee!, but why 1 for the array and 3 for the hash? The mylength() with a $ prototype prints 1 for the array with three elements because the $ prototype forces scalar context, so $arg contains the number of elements in the array, not the array itself! Thus, you wind up returning the value of length(3) and the string "3" is only one character long, thus returning 1.
The hash is even stranger. In the previous example, that prints 3 on some implementations. This is because that hash in scalar context probably evaluates to something like 1/8, as described in Chapter 3, Variables. The string "1/8" has a length of three. (Since Perl 5.26, a hash in scalar context evaluates to its number of keys, so this example prints 1 there.) An empty hash in scalar context evaluates to 0, which has a string length of 1.
Warning
If the output of mylength() seems strange to you, be aware that Perl’s length() builtin behaves the same way. See perldoc -f length.
You can fix that by wrapping the three primary data type sigils in the \[] prototype syntax. This tells Perl to pass a single scalar or array or hash as a reference to the subroutine.
sub mylength(\[$@%]) {
my $arg = shift;
return
'ARRAY' eq ref $arg ? scalar @$arg
: 'HASH' eq ref $arg ? scalar keys %$arg
: length $$arg;
}
my $scalar = "whee!";
print mylength($scalar), "\n";
my @array = ( 1, 18, 9 );
print mylength(@array), "\n";
my %hash = ( foo => 'bar' );
print mylength(%hash), "\n";
That prints the expected:
5
3
1
We don’t even test for an invalid reference type, such as a subroutine reference, being passed to mylength() because Perl will try to check that at compile-time.
Warning
Note that parentheses are required here because otherwise you will get an error about Too many arguments for main::mylength. Why did we need parentheses here and not for the sreverse() subroutine earlier? This is because of a known bug in Perl that has been fixed in version 5.14. You can read the gory details at https://rt.perl.org/rt3/Public/Bug/Display.html?id=75904 if you’re curious.
Mimicking builtins
You can also use prototypes to mimic certain types of builtins. A backslash before a sigil tells Perl that you want that variable to be accepted as a reference. So you can rewrite push like this:
sub mypush(\@@) {
my ( $array, @args ) = @_;
@$array = ( @$array, @args );
}
mypush @some_array, $foo, $bar, $baz;
mypush @some_array, @some_other_array;
This works because the @ sigil in a prototype tells Perl to slurp in the rest of the arguments as a list. You can use a % sigil in a prototype, but it’s pretty useless unless you use a backslash to force a reference.
You can also separate optional arguments with a semicolon.
sub mytime(;$) {
my $real_time = shift;
if ($real_time) {
return scalar localtime;
}
else {
return "It's happy hour!";
}
}
This mytime() subroutine will usually lie to you and tell you it’s fine for a drink, but if you pass it a true value, it returns a string representing a human-readable version of the current local time.
Sat Dec 24 11:11:26 2011
One nifty trick with prototypes is using an ampersand (&) as the first argument. Let’s say you want to increment every element in a list by one. You might write this:
use Data::Dumper;
my @numbers = ( 1, 2, 3 );
my @new = map { $_++ } @numbers;
print Dumper(\@numbers, \@new);
That prints out:
$VAR1 = [
2,
3,
4
];
$VAR2 = [
1,
2,
3
];
If you look at that carefully, you realize that you’ve incremented all of the values of the original list but not the new one! Why is that? We briefly explained the map builtin in Chapter 4, Working With data. In that explanation, we mentioned that $_ is aliased to every element in the original list. Because $_++ uses the postincrement operator, we’ve successfully modified the original value of $_ in the @numbers list, but we’ve returned $_ to @new before we incremented it!
We can use a clever subroutine prototype to create an apply() subroutine that will apply an anonymous subroutine to every element in a list and return a new list. This will leave your old list intact and successfully create the new list:
sub apply (&@) {
my $action = shift;
my @shallow_copy = @_;
foreach (@shallow_copy) {
$action->();
}
return @shallow_copy;
}
use Data::Dumper;
my @numbers = ( 1, 2, 3 );
my @new = apply { $_++ } @numbers;
print Dumper(\@numbers, \@new);
And this prints the desired result.
$VAR1 = [
1,
2,
3
];
$VAR2 = [
2,
3,
4
];
The &@ prototype allows a subroutine to accept a block as the first argument and this block is considered to be an anonymous subroutine. You are not allowed to use a comma after it. The @ will allow you to pass a list after the anonymous subroutine.
In the apply() subroutine, we copy @_ to @shallow_copy and then iterate over @shallow_copy. Because the loop aliases $_ to each variable in our new array, the $action anonymous subroutine doesn’t touch the original array, which lets it “do the right thing”.
Of course, being a shallow copy, this will now break:
my @munged = apply { $_->[0]++ } @list;
The dclone() subroutine from Storable (described in Chapter 6, References) will let you do a deep copy, if needed.
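As a rough sketch of that idea, the variant below deep-copies its arguments with dclone() before applying the block. The name apply_deep() and the sample @list data are our own assumptions for illustration; only dclone() comes from Storable.

```perl
use strict;
use warnings;
use Storable 'dclone';

# Hypothetical variant of apply() that works on a deep copy of its arguments.
sub apply_deep (&@) {
    my $action    = shift;
    my @deep_copy = @{ dclone( \@_ ) };
    foreach (@deep_copy) {
        $action->();
    }
    return @deep_copy;
}

my @list   = ( [ 1, 'a' ], [ 2, 'b' ] );
my @munged = apply_deep { $_->[0]++ } @list;

print "original: $list[0][0] $list[1][0]\n";        # original: 1 2
print "munged:   $munged[0][0] $munged[1][0]\n";    # munged:   2 3
```

The price is that every call copies the whole data structure, which can be expensive for large lists.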
Forward Declarations
A forward declaration is a subroutine declaration without a subroutine body. It’s just a way of telling Perl “hey, I’m going to define this subroutine later”. Some programmers like predeclaring their subroutines because it solves certain parsing problems in Perl. We won’t cover it in-depth but we will explain one case where it can prevent compile errors.
Note
There’s a saying that only perl (lower-case) can parse Perl (upper-case). This is true. Many languages have extremely well-defined grammars that allow you to unambiguously declare the semantics of a given expression. For a variety of reasons, this is not possible with Perl. That’s why the Perl parser is heuristic in nature. Heuristic means “it usually guesses correctly”. Very, very seldom will you have issues with this, but for some examples of how the perl parser can sometimes get things wrong, see perldoc -f map.
Note that when using prototypes, you’ll often get subtle errors if you omit the parentheses. For example, here’s a mysterious error you may get:
use strict;
use warnings;
use diagnostics;
my $reciprocal = reciprocal 4;
sub reciprocal($) {
return 1/shift;
}
That’s going to generate a number of errors, even though the code looks fine. The first one looks like this:
Number found where operator expected at recip.pl line 5, near "reciprocal 4" (#1)
(S syntax) The Perl lexer knows whether to expect a term or an operator.
If it sees what it knows to be a term when it was expecting to see an
operator, it gives you this warning. Usually it indicates that an
operator or delimiter was omitted, such as a semicolon.
(Do you need to predeclare reciprocal?)
What’s happening here? Well, when the Perl parser starts compiling the code down to its internal form, it encounters the reciprocal 4 construct. Because it has not yet seen the prototype for the reciprocal subroutine, it doesn’t know that 4 is an argument for a subroutine named reciprocal(). You can solve this in one of two ways. One way is to define the reciprocal() subroutine before that line of code. That ensures that when Perl gets to reciprocal 4, it already knows what it is.
If you prefer your subroutines to be defined after the main body of code, you can use a forward declaration with the correct prototype:
use strict;
use warnings;
use diagnostics;
sub reciprocal($);
my $reciprocal = reciprocal 4;
sub reciprocal($) {
return 1/shift;
}
That lets Perl successfully parse reciprocal 4 when it gets to it.
Prototype Summary
Prototypes can be confusing and complicated, but to top it off, they’re also buggy. You’ve already seen one bug. Another one is that there are a number of invalid prototypes you can declare, such as (@@).
There are also useless prototypes you can declare. Consider a prototype of (@$). The @ symbol tells Perl to slurp in all arguments, leaving nothing for the $. Perl will not warn you about this.
Also, when we get to the chapter on objects (Chapter 12, Object Oriented Perl), you may be tempted to use prototypes for methods. This does not work because prototypes are checked at compile time but we don’t know what method we will be calling until runtime. For now, just remember that prototypes are a bit of a minefield. They would have been left out of this book entirely, were it not for the fact that a number of programmers use them and often do so incorrectly. You are now warned.
There are far more issues with prototypes, but they’re far beyond the scope of this book. We can only recommend that if you wish to use them, read about them carefully and make sure you know what you’re doing.
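One way to convince yourself that prototypes only affect calls Perl can resolve at compile time is to call a prototyped subroutine through a reference. The sketch below reuses sreverse() from earlier in this chapter; the $reference and $result variables are ours. The prototype() builtin returns the declared prototype as a string.

```perl
use strict;
use warnings;

sub sreverse($) {
    my $string = shift;
    return scalar reverse $string;
}

# The builtin prototype() reports the declared prototype.
print prototype( \&sreverse ), "\n";    # $

# A direct call such as sreverse( 'foo', 'bar' ) would not compile.
# A call through a reference is not checked against the prototype,
# so the extra argument is accepted and simply ignored by the body.
my $reference = \&sreverse;
my $result    = $reference->( 'foo', 'bar' );
print "$result\n";                      # oof
```

Method calls are resolved at runtime in much the same way, which is why a prototype on a method has no effect.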
Note
For more information on prototypes, see the Prototypes section of perldoc perlsub.
Recursion
A recursive subroutine is a subroutine that calls itself. Why might it do this? Because it’s often clearer to express something in a recursive form. Also, sometimes we find it easier to break a large problem into smaller problems and solve those. We’ll look at both.
Basic recursion
Remember that we define Fibonacci numbers as:
F(0) = 0
F(1) = 1
F(n) = F(n-1) + F(n-2)
Here’s how we would write that as a recursive subroutine, finding the nth Fibonacci number.
sub F {
my $n = shift;
return 0 if $n == 0;
return 1 if $n == 1;
return F($n - 1) + F($n - 2);
}
print F(7);
And that will correctly print 13. Notice how it very closely matches the mathematical definition of Fibonacci numbers.
Warning
One very important thing to remember about recursive functions is that they should almost always have one or more statements that return without recursing. This is to prevent infinite loops. If you write a recursive subroutine that never returns, look at your return statements carefully and see if you forgot to have one break out of the recursion.
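As a small illustration of that advice, here is a sketch of a recursive factorial. The factorial() name is our own example and does not appear elsewhere in this chapter. The first return is the base case that stops the recursion, and the second moves the argument closer to that base case on every call.

```perl
use strict;
use warnings;

sub factorial {
    my $n = shift;

    # Base case: return without recursing, so the recursion stops.
    return 1 if $n <= 1;

    # Recursive case: each call moves $n closer to the base case.
    return $n * factorial( $n - 1 );
}

print factorial(5), "\n";    # 120
```

Testing with <= rather than == also covers arguments that would otherwise skip past the base case. By contrast, calling the F() subroutine shown earlier with a negative number never reaches either of its base cases.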
Divide and Conquer
Divide and conquer, in computer science, is a way of breaking a problem down into smaller problems and trying to solve each of those, perhaps breaking those down into smaller problems. For example, let’s say you have a sorted list of integers and you want to find an integer in that list. One way to do this is to iterate over the list:
sub search {
my ( $numbers, $target ) = @_;
for my $i ( 0 .. $#$numbers ) {
return $i if $numbers->[$i] == $target;
}
return;
}
This code works, but it can be very slow. Imagine if you have a list of 1,000 elements. You might have 1,000 iterations before you find the number. Doing this repeatedly could be very slow. A better strategy (again, assuming the list of numbers is sorted) is to do a binary search. In this search, we compare our number to the value at the midpoint of the list. If it is smaller, repeat the process for the first half of the list. If it is larger, repeat for the second half of the list. Repeat until you’ve found the index or run out of list. We can see that this means for the first iteration, we have at most 500 numbers to compare, then 250, then 125, 63, 32, 16, 8, 4, 2, 1. So we have at most 10 iterations before we find the number. Here’s how this might work.
my @numbers = map { $_ * 3 } ( 0 .. 1000 );
sub search {
my ( $numbers, $target ) = @_;
return _binary_search( $numbers, $target, 0, $#$numbers );
}
sub _binary_search {
my ( $numbers, $target, $low, $high ) = @_;
return if $high < $low;
# divide array in two
my $middle = int( ( $low + $high ) / 2 );
if ( $numbers->[$middle] > $target ) {
# search the lower half
return _binary_search( $numbers, $target, $low, $middle - 1 );
}
elsif ( $numbers->[$middle] < $target ) {
# search the upper half
return _binary_search( $numbers, $target, $middle + 1, $high );
}
# found it!
return $middle;
}
print search(\@numbers, 699),"\n";
print search(\@numbers, 28),"\n";
That prints 233 when you search for the number 699, and only a blank line when you search for the number 28, because the bare return yields an empty list (undef in scalar context) when the number is not found. It’s also very fast. You’ll note how we’ve successfully divided the problem into smaller and smaller steps recursively until we find what we’re looking for.
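To make the difference in work concrete, the following sketch counts how many elements each strategy examines. The linear_steps() and binary_steps() subroutines are hypothetical helpers, and the binary search is written iteratively here (rather than recursively, as above) only because that makes the counter easier to keep.

```perl
use strict;
use warnings;

my @numbers = map { $_ * 3 } ( 0 .. 1000 );

# Linear search that also reports how many elements it examined.
sub linear_steps {
    my ( $numbers, $target ) = @_;
    my $steps = 0;
    for my $i ( 0 .. $#$numbers ) {
        $steps++;
        return ( $i, $steps ) if $numbers->[$i] == $target;
    }
    return ( undef, $steps );
}

# Iterative binary search with the same bookkeeping.
sub binary_steps {
    my ( $numbers, $target ) = @_;
    my ( $low, $high, $steps ) = ( 0, $#$numbers, 0 );
    while ( $low <= $high ) {
        $steps++;
        my $middle = int( ( $low + $high ) / 2 );
        if    ( $numbers->[$middle] > $target ) { $high = $middle - 1 }
        elsif ( $numbers->[$middle] < $target ) { $low  = $middle + 1 }
        else                                    { return ( $middle, $steps ) }
    }
    return ( undef, $steps );
}

my ( $linear_index, $linear_count ) = linear_steps( \@numbers, 2997 );
my ( $binary_index, $binary_count ) = binary_steps( \@numbers, 2997 );
print "linear: index $linear_index after $linear_count steps\n";
print "binary: index $binary_index after $binary_count steps\n";
```

Both searches find the same index, but the linear search examines a thousand elements to get there while the binary search needs no more than ten.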
Memoization
Recursive subroutines can be very expensive in terms of CPU time and memory. If the subroutine is a pure subroutine, you can memoize (cache or “memorize” previous results) it. The Memoize module (bundled with Perl since version 5.8 and also available on the CPAN) can help with this.
The memoize subroutine provided by the module allows a subroutine to remember a previous result for a set of arguments. The first time you call a memoized subroutine it calculates the value. On any subsequent call it returns the cached value.
Note
A pure subroutine is a subroutine that relies only on the arguments passed to it and always returns the same value for each set of arguments. It’s also guaranteed not to have side-effects.
use Memoize;
memoize('F');
sub F {
my $n = shift;
return 0 if $n == 0;
return 1 if $n == 1;
return F($n - 1) + F($n - 2);
}
print F(50);
That quickly prints 12586269025, but if you remove the memoize('F') line, it will take several hours to run. That’s because the recursive subroutine calls are often calculating the same thing, calling themselves over and over. If you walk through the subroutine several times, you’ll understand why this saves so much time.
Of course, everything has a price. The memoize subroutine works by using extra memory to store the computed value. Often you’ll find that trading CPU time for RAM is a good trade-off.
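To show where that extra memory goes, here is a hand-rolled sketch of the same idea using a hash as the cache. This is not how the Memoize module is implemented; it only illustrates the general technique, and the fib_cached() name and %cache hash are our own.

```perl
use strict;
use warnings;

{
    # Cache shared by every call to fib_cached(); one entry per argument.
    my %cache;

    sub fib_cached {
        my $n = shift;
        return $n if $n < 2;    # F(0) = 0, F(1) = 1

        # Compute each value only once, then reuse the stored result.
        $cache{$n} = fib_cached( $n - 1 ) + fib_cached( $n - 2 )
          unless exists $cache{$n};
        return $cache{$n};
    }
}

print fib_cached(50), "\n";    # 12586269025
```

Each distinct argument costs one hash entry, which is the RAM you trade for the CPU time saved.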
Try it out Writing a recursive maze generator
Example 7.4
Type in the following program and save it as maze.pl.
use strict;
use warnings;
use diagnostics;
use List::Util 'shuffle';

my ( $WIDTH, $HEIGHT ) = ( 10, 10 );
my %OPPOSITE_OF = (
    north => 'south',
    south => 'north',
    west  => 'east',
    east  => 'west',
);
my @maze;
tunnel( 0, 0, \@maze );
print render_maze( \@maze );
exit;

sub tunnel {
    my ( $x, $y, $maze ) = @_;
    my @directions = shuffle keys %OPPOSITE_OF;
    foreach my $direction (@directions) {
        my ( $new_x, $new_y ) = ( $x, $y );
        if    ( 'east'  eq $direction ) { $new_x += 1; }
        elsif ( 'west'  eq $direction ) { $new_x -= 1; }
        elsif ( 'south' eq $direction ) { $new_y += 1; }
        else                            { $new_y -= 1; }

        # if a previous tunnel() through the maze has not visited
        # the square, go there. This will replace the _ or |
        # character in the map with a space when rendered
        if ( have_not_visited( $new_x, $new_y, $maze ) ) {

            # make a two-way "path" between the squares
            $maze->[$y][$x]{$direction} = 1;
            $maze->[$new_y][$new_x]{ $OPPOSITE_OF{$direction} } = 1;

            # This program will often recurse more than one
            # hundred levels deep and this is Perl's default
            # recursion depth level prior to issuing warnings.
            # In this case, we're telling Perl that we know
            # that we'll exceed the recursion depth and to
            # not warn us about it
            no warnings 'recursion';
            tunnel( $new_x, $new_y, $maze );
        }
    }

    # if we get to here, all squares surrounding the current square
    # have been visited or are "out of bounds". When we return,
    # we may return to a previous tunnel() call while we're
    # digging, or we return completely to the first tunnel()
    # call, in which case we've finished generating the maze.
    # This return is not strictly necessary, but it makes it
    # clear what we're doing.
    return;
}

sub have_not_visited {
    my ( $x, $y, $maze ) = @_;

    # the first two lines return false if we're out of bounds
    return if $x < 0 or $y < 0;
    return if $x > $WIDTH - 1 or $y > $HEIGHT - 1;

    # this returns false if we've already visited this cell
    return if $maze->[$y][$x];

    # return true
    return 1;
}

# creates the ASCII strings that will make up the maze
# when printed
sub render_maze {
    my $maze = shift;

    # $as_string is the string representation of the maze
    # start with _________________________________________
    my $as_string = "_" x ( 1 + $WIDTH * 2 );
    $as_string .= "\n";
    for my $y ( 0 .. $HEIGHT - 1 ) {

        # add the | vertical border at the left side
        $as_string .= "|";
        for my $x ( 0 .. $WIDTH - 1 ) {
            my $cell = $maze->[$y][$x];

            # if the neighbor is true - we have a path
            $as_string .= $cell->{south} ? " " : "_";
            $as_string .= $cell->{east}  ? " " : "|";
        }
        $as_string .= "\n";
    }
    return $as_string;
}
Note
maze.pl available for download at Wrox.com.
Run the program as perl maze.pl. You should see output similar to the following:
[Sample output: a randomly generated 10 by 10 maze drawn with _ and | characters under a top border of 21 underscores]
Due to the random nature of this program, your maze will likely not match the one above.
How it works
This is our most complex Try It Out to date and the code should be read and run a few times to understand how it works.
We start at position 0,0, randomly shuffle the north, south, east and west directions and choose the first direction. If that puts us in a square that is not out-of-bounds (outside of our grid boundaries) and has not yet been visited, then we mark a two-way path between the two squares. Then we move to the new square and repeat the process. This moving to a new square is done recursively by calling tunnel() with our new square’s coordinates.
When we get to a square that is surrounded by out-of-bounds or surrounded by already visited squares, then we return from the tunnel() subroutine and the next of the random north, south, east and west directions is tried for the previous squares.
Eventually every north, south, east, and west direction for every square will be tried. When that’s done, the recursion ends and we render the map. Let’s look at the successive building of a 3 by 3 map.
Y_______
0|_|_|_|
1|_|_|_|
2|_|_|_|
  0 1 2 X
As you can see, the upper-left corner is 0,0, the upper-right is 2,0, the lower-left is 0,2 and the lower-right is 2,2. Because arrays start with 0, the largest index is 2, which is why we refer to $HEIGHT - 1 and $WIDTH - 1 in the code. The code starts in the upper-left. Now here’s a sample run:
 _______     _______     _______
1|_|_|_|    2| |_|_|    3| |_|_|
 |_|_|_|     |_|_|_|     |_ _|_|
 |_|_|_|     |_|_|_|     |_|_|_|

 _______     _______     _______
4| |_|_|    5| |_|_|    6| |_|_|
 |_  |_|     |_  |_|     |_  |_|
 |_|_|_|     |_ _|_|     |_ _ _|

 _______     _______     _______
7| |_|_|    8| |_| |    9| |_  |
 |_  | |     |_  | |     |_  | |
 |_ _ _|     |_ _ _|     |_ _ _|
As you can see, the code randomly progressed (tunneled) from 0,0 to 0,1 to 1,1 to 1,2 before ending up at a dead-end at 0,2 in the fifth rendition of the maze. What does it do then? It can’t go left or down because those would be out of bounds. It can’t go up because that’s a visited square. It can’t go right because that’s also a visited square. As a result, the tunnel() subroutine will return, but it returns to itself because it called itself. The code then continues in the for loop for square 1,2. If you play around with this a bit, particularly for larger maps, you’ll understand better how recursion can draw the entire maze.
The downloadable version available at Wrox.com is a bit more elaborate. It attempts to redraw the maze at every step to let you see how the maze is being built.
Things to watch for
Writing subroutines allows you to write more maintainable code, but there are a few guidelines that will make your subroutines better. None of these guidelines should be taken as hard and fast rules.
Argument aliasing
Don’t forget that the @_ array aliases the arguments to the subroutine. It’s easy to forget this and write code that usually works, but breaks when you least expect it. Here’s some code which tries to modify an array “in place”, but breaks when you pass it hard-coded values:
sub fix_names {
$_ = ucfirst lc $_ foreach @_;
}
fix_names(qw/alice BOB charlie/);
That will throw a Modification of a read-only value error because arguments to fix_names() are hard-coded into the program.
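One way to sidestep the problem is to leave @_ alone and return new values instead. The sketch below does that; fixed_names() is a hypothetical replacement, not a subroutine from earlier in the chapter.

```perl
use strict;
use warnings;

# Build and return new strings instead of assigning to the aliases in @_.
sub fixed_names {
    return map { ucfirst( lc $_ ) } @_;
}

my @names = fixed_names(qw/alice BOB charlie/);
print "@names\n";    # Alice Bob Charlie
```

Because nothing is assigned to the elements of @_, it does not matter whether the caller passed variables or hard-coded values.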
Scope issues
As much as possible, subroutines should only rely on the arguments passed to them and not on variables declared outside of it. You may have noticed that with the exception of one of the _running_total() examples (and even that closely encapsulated the state in an outer block), and the “maze” example in this chapter, we’ve adhered to this rule closely. Why? Take a look at this subroutine:
sub withdraw {
my $amount = shift;
if ( $customer->{balance} - $amount < $minimum_balance ) {
croak "$customer->{name} cannot withdraw $amount";
}
$customer->{balance} -= $amount;
}
Where did $minimum_balance come from? Where did $customer come from? What happens if something else changes them in a way to make their data invalid? Who changed them? If you move this subroutine somewhere else, are those external variables still in scope?
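For comparison, here is a sketch of the same subroutine with everything it needs passed in as arguments. The argument order and the sample %customer data are our own assumptions for illustration.

```perl
use strict;
use warnings;
use Carp 'croak';

# Everything the subroutine depends on arrives as an argument.
sub withdraw {
    my ( $customer, $amount, $minimum_balance ) = @_;
    if ( $customer->{balance} - $amount < $minimum_balance ) {
        croak "$customer->{name} cannot withdraw $amount";
    }
    $customer->{balance} -= $amount;
    return $customer->{balance};
}

my %customer = ( name => 'Alice', balance => 100 );
print withdraw( \%customer, 30, 10 ), "\n";    # 70
```

Now the subroutine can be moved anywhere, and a reader can see at the call site exactly which data it uses.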
So why did the maze.pl example earlier in this chapter break this rule? It’s a trade-off. The opening of the tunnel() subroutine looked like this:
sub tunnel {
my ( $x, $y, $maze ) = @_;
However, if we passed in all of the variables we needed (taking into consideration the variables needed in have_not_visited()), then it would have looked like this:
sub tunnel {
my ( $x, $y, $maze, $opposite_of, $height, $width ) = @_;
At which point, the argument list is starting to get ridiculous and it’s harder to figure out what’s going on. For a one-off demonstration, this is OK. In reality, when this much data needs to be tracked, switching to object-oriented programming (Chapter 12, Object Oriented Perl) is one strategy to control the chaos.
Doing too much
Your author has worked on corporate code with “subroutines” which are thousands of lines long. They’re a mess and it’s very hard to figure out what’s going on.
A subroutine should generally do one thing and do it well. If it needs to do more, it can call other subroutines to help it out. If you try to do too much in a subroutine, not only does the subroutine start to get confusing, but what happens if something else needs that “extra” behavior you’ve squeezed into that subroutine? Keep subroutines small and tightly focused.
Too many arguments
We’ve already listed the example of what the maze.pl tunnel() function would look like if we passed in all required variables. In fact, if you look at the downloadable version, you’d need to pass in even more:
my (
$x, $y, $maze, $opposite_of, $height, $width,
$delay, $can_redraw, $clear
) = @_;
There are ways of working around this, but this example would have been ridiculous if we passed in that many arguments. When something like this happens, try to rewrite your code in such a way that you need fewer arguments. If you can’t, consider switching to named arguments and passing in a hashref, though in this case it would not have helped much.
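In case named arguments are new to you, here is a minimal sketch of the idea. The describe_maze() subroutine and its argument names are hypothetical and are not part of maze.pl.

```perl
use strict;
use warnings;

# Hypothetical subroutine that takes one hash reference of named arguments.
sub describe_maze {
    my ($arg_for) = @_;

    # Defaults come first; any values the caller supplies override them.
    my %arg = ( width => 10, height => 10, x => 0, y => 0, %$arg_for );
    return "$arg{width}x$arg{height} maze starting at $arg{x},$arg{y}";
}

print describe_maze( { height => 5, width => 20 } ), "\n";
# 20x5 maze starting at 0,0
```

The caller can list the arguments in any order and leave out the ones that have sensible defaults, which is easier to read than a long positional list.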
Summary
You now know far more about subroutines than you probably expected. In Perl, subroutines are very powerful and can even be assigned to variables as references and passed around.
They are a very useful way of organizing your code with named identifiers to promote code reuse and more readable code.
Exercises
1. Write a subroutine named average() that, given a list of numbers, returns the average of those numbers. Don’t worry about error checking.
2. Take the subroutine average() and add error checking to it. Make sure the error is fatal. Hint: try the looks_like_number() subroutine from Scalar::Util, described earlier in this chapter.
3. Write a subroutine called make_multiplier() that takes a number and returns an anonymous subroutine. The returned anonymous subroutine will accept a number and return that number multiplied by the first number. Use your code to make the following print yes, twice. Hint: you’ll be using a closure.
my $times_seven = make_multiplier(7);
my $times_five = make_multiplier(5);
print 21 == $times_seven->(3) ? "yes\n" : "no\n";
print 20 == $times_five->(4) ? "yes\n" : "no\n";
4. Write a sum() subroutine that sums its arguments via recursion.
WHAT YOU LEARNED IN THIS CHAPTER
| Topic | Key Concepts |
|---|---|
| @_ | The subroutine argument array |
| return | How to return data from a subroutine |
| wantarray | Determine the context in which a subroutine was called |
| warn/carp | How to report warnings |
| die/croak | How to report problems and stop the program |
| eval STRING | Delay the parsing of code until runtime |
| eval BLOCK | Trap fatal errors in code |
| $@ | The default eval error variable |
| Try::Tiny | A better way of trapping errors in Perl |
| Subroutine refs | How to pass subroutines as variables |
| Closures | Subroutines that refer to variables defined in an outer scope |
| Prototypes | Sigils added to a subroutine definition to suggest how arguments are passed |
| Recursive subroutines | Subroutines that call themselves |
| Memoization | Making subroutines faster by using more memory |
Answers to Exercises
1. Write a subroutine named average() that, given a list of numbers, returns the average of those numbers. Don’t worry about error checking.
sub average {
    my @numbers = @_;
    my $total = 0;
    $total += $_ foreach @numbers;
    return $total / @numbers;
}
print average(qw< 1 5 18 3 5>);
That code prints 6.4, the average of the numbers passed in.
2. Take the subroutine average() and add error checking to it. Make sure the error is fatal. Hint: try the looks_like_number() subroutine from Scalar::Util, described earlier in this chapter.
use Scalar::Util 'looks_like_number';
use Carp 'croak';
sub average {
    my @numbers = @_;
    my $total = 0;
    foreach my $number (@numbers) {
        if ( not looks_like_number($number) ) {
            croak "$number doesn't look like a number";
        }
        else {
            $total += $number;
        }
    }
    return $total / @numbers;
}
print average(qw< 1 5 18 bob 3 5>);
3. Write a subroutine called make_multiplier() that takes a number and returns an anonymous subroutine. The returned anonymous subroutine will accept a number and return that number multiplied by the first number. Use your code to make the following print yes, twice. Hint: you’ll be using a closure.
sub make_multiplier {
    my $number = shift;
    return sub { return shift(@_) * $number };
}
my $times_seven = make_multiplier(7);
my $times_five = make_multiplier(5);
print 21 == $times_seven->(3) ? "yes\n" : "no\n";
print 20 == $times_five->(4) ? "yes\n" : "no\n";
4. Write a sum() subroutine that sums its arguments via recursion.
sub sum {
    return 0 unless @_;
    my ( $head, @tail ) = @_;
    return $head + sum(@tail);
}
print sum( 1, 93, 3, 5 );




