Chapter 11. Packages and Modules
WHAT YOU WILL LEARN IN THIS CHAPTER
Understanding packages and namespaces
Using package variables
Defining subroutines in packages
Exporting subroutines
Understanding scoping
Writing POD: Plain Old Documentation
Creating packages with Module::Build and ExtUtils::MakeMaker
Using BEGIN, CHECK, INIT and END
Up to now, all of our code has been in a single file. However, that doesn’t work when building larger systems. You need to understand how to logically break apart your applications into separate, preferably reusable components called packages or modules. These modules generally live in different files. We’ll explain how to create and organize these packages. Some professional Perl programmers never get beyond this step and still have successful careers. By the end of this chapter, you’ll be well on your way to being a professional Perl programmer.
In the real world, mission critical Perl applications range from a few lines of code to over a million (your author has worked on the latter).
When you have huge systems, would you really want all of that in one file? Probably not. Creating modules allows you to break your application down into small, manageable chunks. Doing so makes it easier to understand and design different parts of your system and helps to avoid what your author thinks of as “a steaming pile of ones and zeros”.
Namespaces and packages
We discussed namespaces very briefly in Chapter 3, Variables. A namespace is a place to organize logically related code and data. It’s given a package name and all subroutines and package variables declared in that namespace cannot be called outside of that namespace unless you prepend the package name to them or if the package “exports” the subroutines to other packages. This allows you to reuse names in different namespaces without worrying about collision. Declaring the subroutine is_stupid() twice in the same namespace will generate a warning (and the first subroutine will be overwritten). Declaring it twice in separate namespaces is just fine.
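As a minimal sketch of this idea, the same subroutine name can live in two namespaces without colliding. The package names My::Cat and My::Dog are hypothetical, and both are declared in one file only to keep the example short:

```perl
use strict;
use warnings;

# Two hypothetical packages, each with its own is_stupid() subroutine.
{
    package My::Cat;
    sub is_stupid { return 0 }
}
{
    package My::Dog;
    sub is_stupid { return 1 }
}

# Outside those namespaces, prepend the package name to call each one.
my $cat = My::Cat::is_stupid();
my $dog = My::Dog::is_stupid();
print "cat: $cat, dog: $dog\n";
```

Neither declaration triggers a redefinition warning because each subroutine belongs to a different namespace.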
A package name is one or more identifiers separated by double colons. As you will recall from Chapter 3, Variables, an identifier must start with a leading letter or underscore. You can optionally follow that with one or more letters, numbers or underscores. The following are all valid package names from modules you will find on the CPAN:
File::Find::Rule
Module::Starter
DBIx::Class
Moose
aliased
Note the last one, aliased. It starts with a lower-case letter. By convention in Perl, a module whose name is all lower case should be a pragma that affects Perl’s compilation (as aliased and autodie do). Think very carefully about using a lower-case name for a module as it’s usually a bad idea.
Note
Actually, as a legacy from earlier versions of Perl, you can use a single quote mark, ', in place of a double colon. So you could refer to the My::Preferred::Customer package as My'Preferred'Customer. However, this is highly frowned upon today. We mention it only because you will sometimes find a programmer trying to be “clever” and using this older style of package name. Be wary of “clever” programmers.
Let’s start with a simple package named My::Number::Utilities. By convention, this should correspond to a path and filename of My/Number/Utilities.pm and it should usually be located in a lib/ directory (in other words, lib/My/Number/Utilities.pm). The .pm extension is what Perl uses to identify a given module. A module is simply a file that contains one or more packages, though it’s generally recommended to have one package per module. It’s also very strongly recommended that your module and package names correspond. You can have a file called My/Sekret/Stuff.pm containing a package named I::Am::A::Lousy::Programmer, but this tends to be confusing. That module should contain a package named My::Sekret::Stuff.
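The mapping from package name to file path can be sketched in a few lines. The helper name package_to_path() is our own invention for illustration and is not something Perl provides:

```perl
use strict;
use warnings;

# Hypothetical helper: convert a package name to the relative path
# that the naming convention expects.
sub package_to_path {
    my $package = shift;
    ( my $path = $package ) =~ s{::}{/}g;    # double colons become slashes
    return "$path.pm";
}

print package_to_path('My::Number::Utilities'), "\n";
```

This prints My/Number/Utilities.pm, which is the path you would create under lib/.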
Now go ahead and create the lib/My/Number/ directory and create an empty Utilities.pm file in it. If you saved the tree.pl utility we created in Chapter 9, Files and Directories, your file structure should look like this:
lib/
| My/
| | Number/
| | |--Utilities.pm
Let’s take the first is_prime() function we created in Chapter 10, Sort, map and grep and use that to make our My::Number::Utilities package. Save the following code in lib/My/Number/Utilities.pm.
package My::Number::Utilities;
use strict;
use warnings;
our $VERSION = 0.01;

sub is_prime {
    my $number = $_[0];
    return if $number < 2;
    return 1 if $number == 2;
    for ( 2 .. int sqrt($number) ) {
        return if !($number % $_);
    }
    return 1;
}

1;
That’s it! You’ve successfully created your first module. Let’s see how to use it.
In the directory containing the lib/ directory, create a file named primes.pl and save the code in Example 11.1, “A simple module” to it (it should look familiar).
Example 11.1. A simple module
use strict;
use warnings;
use diagnostics;
use lib 'lib'; # tell Perl we'll find modules in lib/
use My::Number::Utilities;
my @numbers = qw(
3 2 39 7919 997 631 200
7919 459 7919 623 997 867 15
);
my @primes = grep { My::Number::Utilities::is_prime($_) }
    @numbers;
print join ', ' => sort { $a <=> $b } @primes;
Note
primes.pl and My::Number::Utilities are available for download at Wrox.com.
When you run perl primes.pl, you should see the following output and you’ve successfully used the module.
2, 3, 631, 997, 997, 7919, 7919, 7919
Note
It’s possible, however, that you’ll get an error similar to the following:
Can't locate My/Number/Utilities.pm in @INC
(@INC contains: lib t /home/ovid/perl5/perlbrew/...
BEGIN failed--compilation aborted at primes.pl line 6 (#1)
(F) You said to do (or require, or use) a file that
couldn't be found. Perl looks for the file in all the
locations mentioned in @INC, unless the file name
included the full path to the file. Perhaps you need
to set the PERL5LIB or PERL5OPT environment variable
to say where the extra library is, or maybe the script
needs to add the library name to @INC. Or maybe you
just misspelled the name of the file. See
perlfunc/require and lib.
Reading through that carefully should tell you where to look. In this case, you’ve either misspelled the module name, misspelled a directory or filename when creating the module, or your use lib line doesn’t actually point to the lib/ directory where the module lives (you can use absolute paths if you need to, but they tend not to be portable). If you read through the @INC line, you’ll see where Perl is looking for your module.
Reading through error messages seems to almost be a lost art given that so many error messages are awful, but learning to pay attention to them will make your programming life much easier.
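If you are unsure where Perl is looking, one quick check (a sketch, not the only way) is to print @INC yourself after the use lib line:

```perl
use strict;
use warnings;
use lib 'lib';    # adds lib/ to the front of @INC at compile time

# Print every directory Perl will search for modules, in order.
print "$_\n" for @INC;
```

The exact list depends on your Perl installation, but lib should appear near the top.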
Most of your modules will have effectively the same core:
package Module::Name;
use strict;
use warnings;
our $VERSION = 0.01;    # or some other version number

# module code here

1;
Let’s look at the code for My::Number::Utilities again. The package statement is the first line:
package My::Number::Utilities;
This declares that everything after this declaration belongs to the My::Number::Utilities package. The package statement is either file scoped or block scoped. In this case, from the package declaration to the bottom of the file, everything is in the My::Number::Utilities package. If another file scoped package declaration is found, the subsequent code belongs to the new package:
package My::Math;
use strict;
use warnings;
our $VERSION = 0.01;

sub sum {
    my @numbers = @_;
    my $total   = 0;
    $total += $_ foreach @numbers;
    return $total;
}

package My::Math::Strict;
use Scalar::Util 'looks_like_number';
our $VERSION = 0.01;

sub sum {
    my @numbers = @_;
    my $total   = 0;
    $total += $_ foreach grep { looks_like_number($_) } @numbers;
    return $total;
}

1;
In the above code, we have slightly different variants of the sum() function, but the first can be called with My::Math::sum(@numbers) and the second can be called with My::Math::Strict::sum(). Both the strict and warnings pragmas are in effect until the end of the file, thus affecting My::Math::Strict.
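To see how the two variants differ in practice, here is a condensed single-file sketch. The input list and the warning-capturing handler are our own additions for illustration:

```perl
use strict;
use warnings;

{
    package My::Math;
    sub sum {
        my $total = 0;
        $total += $_ foreach @_;
        return $total;
    }
}
{
    package My::Math::Strict;
    use Scalar::Util 'looks_like_number';
    sub sum {
        my $total = 0;
        $total += $_ foreach grep { looks_like_number($_) } @_;
        return $total;
    }
}

my @numbers = ( 1, 2, '4 apples' );

# Collect warnings instead of printing them so we can count them.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $loose  = My::Math::sum(@numbers);            # '4 apples' is treated as 4
my $strict = My::Math::Strict::sum(@numbers);    # '4 apples' is skipped

print "My::Math::sum:         $loose\n";
print "My::Math::Strict::sum: $strict\n";
print "warnings raised:       ", scalar @warnings, "\n";
```

The first variant returns 7 and raises a "isn't numeric" warning, while the second quietly ignores the bad value and returns 3.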
Sometimes, though, you may want to limit the scope of a package declaration. You do this by enclosing the package declaration in a block:
package My::Package;
use strict;
use warnings;
our $VERSION = 0.01;

{
    package My::Package::Debug;
    our $VERSION = 0.01;

    # this belongs to My::Package::Debug
    sub debug {
        # some debug routine
    }
}

# any code here belongs to My::Package;

1;
Generally, though, putting each package in its own appropriately named file makes it much easier to track that package down later.
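If you want to confirm which package a given line belongs to, the __PACKAGE__ literal holds the current package name. A small sketch of block scoping:

```perl
use strict;
use warnings;

package My::Package;

my @seen;
push @seen, __PACKAGE__;    # file-scoped declaration is in effect

{
    package My::Package::Debug;
    push @seen, __PACKAGE__;    # block-scoped declaration is in effect
}

push @seen, __PACKAGE__;    # the block has ended

print join( ', ', @seen ), "\n";
```

This prints My::Package, My::Package::Debug, My::Package, showing that the inner declaration stops applying at the closing brace.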
You may also be curious about the bare 1 at the end of the package:
1;
In Perl, when you use a package, it must return a true value. If it does not, the use will fail at compile time. Putting a 1; at the end of the package solves this.
Note
Ordinarily, a bare value will generate a warning:
use strict;
use warnings;
'one';
And that will warn about:
Useless use of a constant (one) in void context at ...
However, Perl special cases 1 and 0 to not emit a warning.
use versus require
Generally when you need to load a module, you use it:
use My::Number::Utilities;
The use statement has a variety of different uses and these can be a bit confusing.
use VERSION
use Module VERSION LIST
use Module VERSION
use Module LIST
use Module
The use VERSION form tells Perl that it must use a minimum version of Perl. This being Perl, there are a variety of different formats for the version number. So if you want to declare that your code requires Perl version 5.8.1 or above:
use v5.8.1;
use 5.8.1;
use 5.008_001;
If you require Perl version 5.9.5 or above with use VERSION, Perl also enables the features available in the requested version, as defined by the feature pragma. Thus, instead of saying:
use feature "say";
# or
use feature ":5.12";
You can say:
use v5.12.0;
Prefixing the number with a v (as in v5.12.0) requires a 3-part number and it’s called a version string, or v-string. They have some issues and not everyone likes them. See “Version Strings” in perldoc perldata.
Note
If you have use v5.11.0 or better in your code, strict is automatically enabled. In other words, use strict is not required.
use Module
use Module LIST
use Module VERSION
use Module VERSION LIST
Many times you use a module with:
use Test::More;
Test::More is used in testing your code (Chapter 14, Testing). You may find that you enjoy the subtest() feature of Test::More, but it’s not available until version 0.96, so you can assert that you need at least that version of Test::More:
use Test::More 0.96;
# or
use Test::More v0.96.0;
Warning
If the version number is less than one and you’re not using v-strings, you must have a leading 0:
use Some::Module .32;     # Syntax error, do not use!
use Some::Module 0.32;    # Properly asserting the version
The broken form above is a syntax error in Perl and is related to how Perl’s use statement parses version numbers. Just remember that when using version numbers, you must always have a digit before the decimal point.
When Perl loads Test::More, if its version is less than the supplied version, it will automatically croak().
Test::More accepts an import list. When you use a module, Perl automatically looks for a function in that module named import() and, if it’s found, it will call it for you, passing in any arguments included in the import list when you use the module. So you might use Test::More like this:
use Test::More tests => 13;
Then, the list tests, 13 will be passed as arguments to Test::More::import().
Finally, you can combine the version check and import list:
use Test::More 0.96 tests => 13;
That tells Perl that Test::More must be version 0.96 or better and passes the arguments tests and 13 to the import method.
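To make the import() mechanism concrete, here is a sketch using a hypothetical My::Importer package defined inline. Because there is no My/Importer.pm file for use to load, we call import() by hand, which is what use does for you after loading the module:

```perl
use strict;
use warnings;

# A hypothetical module, defined inline so the example fits in one file.
{
    package My::Importer;
    our @RECEIVED;

    # Perl calls import() as a class method, so the first argument is
    # the package name, followed by the import list.
    sub import {
        my ( $class, @args ) = @_;
        @RECEIVED = @args;
        return;
    }
}

# "use My::Importer tests => 13;" is roughly equivalent to:
#     BEGIN { require My::Importer; My::Importer->import( tests => 13 ); }
My::Importer->import( tests => 13 );

print "import() received: @My::Importer::RECEIVED\n";
```

This prints import() received: tests 13. What a real module does with that list is entirely up to its own import() subroutine.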
Note
Sometimes you want to load a module and not call its import() method. Use parentheses for the LIST when you use the module:
# Don't export Dumper() into our namespace
use Data::Dumper ();
You can also require a module:
require My::Number::Utilities;
The use statement happens at compile time, even if it’s embedded in a code path which would not normally be executed. The require happens at runtime. Normally loading a module with use is what you need. However, sometimes you want to delay loading a module until you actually need it. For example, if you want to debug output with Data::Dumper, it automatically exports the Dumper() subroutine (unless you use Data::Dumper ()).
use Data::Dumper;
print Dumper($some_variable);
However, sometimes you don’t want to load Data::Dumper unless there’s a problem. You might wrap this in a subroutine and use require.
sub debug {
    my @args = @_;
    require Data::Dumper;
    warn Data::Dumper::Dumper(\@args);
}
With the debug() subroutine, Data::Dumper will never be loaded unless debug() is called. Because it’s loaded with require and not use, Data::Dumper’s import() method is not called, so you must call Dumper() with its fully-qualified subroutine name: Data::Dumper::Dumper().
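You can watch the delayed loading happen by checking %INC, the hash where Perl records every file it has loaded. In this sketch, debug() returns the dump rather than calling warn, purely so that we can inspect the result:

```perl
use strict;
use warnings;

sub debug {
    my @args = @_;
    require Data::Dumper;    # loaded only when debug() first runs
    return Data::Dumper::Dumper( \@args );
}

# %INC records every file Perl has loaded so far.
my $loaded_before = exists $INC{'Data/Dumper.pm'} ? 'yes' : 'no';
my $dump          = debug('something went wrong');
my $loaded_after  = exists $INC{'Data/Dumper.pm'} ? 'yes' : 'no';

print "loaded before debug(): $loaded_before\n";
print "loaded after debug():  $loaded_after\n";
```

Assuming nothing else in your program has already loaded Data::Dumper, the first line should report no and the second yes.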
Package variables
Up to now we’ve primarily seen lexically scoped variables declared with the my builtin:
my $foo;
my @bar;
my %baz;
Those are file or block scoped and not visible elsewhere. However, sometimes you want a variable to be seen by other packages. To enable the visibility you need to use fully-qualified package variables (variables with a package name prefixed to them) or declare the variables with our. The our builtin is like my, but it’s for package variables and not lexically scoped ones.
For example, the Data::Dumper module lets you control much of its behavior with package variables:
use Data::Dumper;

# sort hash keys alphabetically
local $Data::Dumper::Sortkeys = 1;

# tighten up indentation
local $Data::Dumper::Indent = 1;

print Dumper(\%hash);
Typing long package names like these can be frustrating, and your author repeatedly types $Data::Dumper::SortKeys = 1 (note the capital K) and then tries to figure out what went wrong. Fortunately, Data::Dumper provides an alternate, cleaner interface, so we recommend reading the documentation.
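As a brief sketch of that alternate interface, Data::Dumper lets you create a dumper object and configure it with method calls instead of package variables. The sample hash here is our own:

```perl
use strict;
use warnings;
use Data::Dumper;

my %hash = ( banana => 2, apple => 1 );

# Configure one dumper object instead of setting package variables.
my $dumper = Data::Dumper->new( [ \%hash ] );
$dumper->Sortkeys(1)->Indent(1);

my $output = $dumper->Dump;
print $output;
```

A misspelled method name such as SortKeys() fails loudly with a "Can't locate object method" error, rather than silently creating a new variable.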
Note
Using local() to restrict changes to package variables to the current scope is not strictly required, but it’s usually good practice. Your author has stumbled on numerous bugs in Perl modules due to authors changing package variables (or not realizing that some other code is doing this).
You would use a package variable when you need a variable that other packages can access using its fully qualified name. We will explain in a bit why this is usually a bad idea, but for now, here’s how you might declare them.
package My::Number::Utilities;
use strict;
use warnings;
our $VERSION = 0.01;
$My::Number::Utilities::PI = 3.14159265359;
$My::Number::Utilities::E = 2.71828182846;
$My::Number::Uitlities::PHI = 1.61803398874; # golden ratio
@My::Number::Utilities::FIRST_PRIMES = qw(
    2 3 5 7 11 13 17 19 23 29
    31 37 41 43 47 53 59 61 67 71
);

sub is_prime {
    #
}

1;
As you can see, we’ve declared several package variables, but the sharp-eyed amongst our readers may have noticed the problem. The $My::Number::Uitlities::PHI variable has a misspelled package name. Oops!
Instead, you use the our declaration to omit the package name.
our $PI  = 3.14159265359;
our $E   = 2.71828182846;
our $PHI = 1.61803398874;    # golden ratio
our @FIRST_PRIMES = qw(
    2 3 5 7 11 13 17 19 23 29
    31 37 41 43 47 53 59 61 67 71
);
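The following sketch shows two things about our: the short name and the fully qualified name refer to the same variable, and a misspelled short name is caught by strict. The string eval is only there so that the deliberate typo ($PHY) does not stop the whole example from compiling:

```perl
use strict;
use warnings;

package My::Number::Utilities;

our $PHI = 1.61803398874;    # golden ratio

# Both names refer to the same package variable.
my $same = \$PHI == \$My::Number::Utilities::PHI ? 'yes' : 'no';
print "same variable: $same\n";

# A misspelled unqualified name fails to compile under strict.
my $ok = eval q{ my $x = $PHY; 1 };
print "typo caught: ", ( $ok ? 'no' : 'yes' ), "\n";
```

Both lines report yes. By contrast, a typo inside a fully qualified name slips past strict, as the misspelled package name above did.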
Note
By convention, variables declared with a package scope have UPPER CASE identifiers. When you’re debugging a subroutine and see an UPPER CASE variable, it not only makes it much easier to know that it’s declared outside of this subroutine, but it also makes it harder for you to accidentally override the value of our $PHI with a my $phi declaration later.
Versions of Perl prior to version 5.6.0 did not have the our builtin, so they used the vars pragma instead.
use vars qw($PI $E $PHI @FIRST_PRIMES);
$PI  = 3.14159265359;
$E   = 2.71828182846;
$PHI = 1.61803398874;    # golden ratio
@FIRST_PRIMES = qw(
    2 3 5 7 11 13 17 19 23 29
    31 37 41 43 47 53 59 61 67 71
);
Today if you must use package variables, it is recommended that you use the our builtin instead of the vars pragma.
So what’s wrong with package variables?
package Universe::Roman;
use My::Number::Utilities;
$My::Number::Utilities::PI = 3;
And all the other universes fall apart. The proper way to do that is:
package Universe::Roman;
use My::Number::Utilities;
local $My::Number::Utilities::PI = 3;
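Here is a single-file sketch of what local buys you. The circumference() subroutine is hypothetical and exists only to show that code called from inside the block sees the temporary value:

```perl
use strict;
use warnings;

{
    package My::Number::Utilities;
    our $PI = 3.14159265359;

    # Hypothetical helper that depends on the package variable.
    sub circumference {
        my $radius = shift;
        return 2 * $PI * $radius;
    }
}

my ( $inside, $after );
{
    # The change is visible to everything called from this block...
    local $My::Number::Utilities::PI = 3;
    $inside = My::Number::Utilities::circumference(1);
}

# ...and is undone automatically when the block exits.
$after = My::Number::Utilities::circumference(1);

print "inside the block: $inside\n";
print "after the block:  $after\n";
```

Inside the block the circumference is 6. After the block, the original value of $PI is back in effect for everyone else.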
However, since you can’t force other packages to declare your package variables with local, it’s better to just provide a subroutine that encapsulates this value:
package My::Number::Utilities;
use strict;
use warnings;
our $VERSION = 0.01;
sub pi { 3.14159265359 }
And now the value of PI is read-only.
package Universe::Roman;
use My::Number::Utilities;
my $PI = My::Number::Utilities::pi();
Note
Trivia time: It’s a widely held belief that the ancient Romans thought the value of π (PI) was 3. This is a myth. The Roman mathematician Ptolemy calculated the value of π as approximately 3.14166. While this value is wrong, it’s close enough for Roman construction needs.
The stories of Alabama trying to pass a law legislating the value of π as 3 are also not true, but Indiana did try and pass a law altering the value of π back in 1897. It was only stopped due to a mathematician visiting the Indiana legislature during the debate.
http://www.agecon.purdue.edu/crd/Localgov/Second%20Level%20pages/Indiana_Pi_Story.htm
Warning
You’ll often see code like this at the top of modules:
package Foo;
use strict;
use warnings;
our ( $THIS, $THAT, $OTHER ) = qw( foo bar baz );
When a variable should be available throughout the entire package, programmers often use the our builtin to declare these variables at the top of the package. Unless you have a good reason for allowing other packages to read (and change) these variables directly, this is a bad idea. Just declare the variables with my.
my ( $THIS, $THAT, $OTHER ) = qw( foo bar baz );
This at least protects these variables from other packages changing their value.
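A short sketch of that protection follows. The accessor this() is our own addition, and the package is wrapped in a block only so the example fits in one file:

```perl
use strict;
use warnings;

{
    package Foo;

    # Lexical: visible only inside this block (or this file, if the
    # variable were declared at file scope).
    my $THIS = 'foo';

    # A read-only accessor is the only way in from outside.
    sub this { return $THIS }
}

{
    no warnings 'once';

    # This assigns to an unrelated package variable named $Foo::THIS ...
    $Foo::THIS = 'changed from outside';
}

# ... so the value the package actually uses is untouched.
print Foo::this(), "\n";
```

This prints foo, because the lexical $THIS and the package variable $Foo::THIS are two different variables.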
We repeat: do not use package variables unless absolutely necessary. There is one clear exception: declaring version numbers.
Version numbers
You’ll notice in our My::Number::Utilities module, we declared a version number with our:
our $VERSION = 0.01;
While not strictly needed, it’s strongly recommended that you declare a version number for your modules. Thus, if your Time::Dilation module version 2.3 has a bug and you release version 2.4, programmers who wish to avoid your bug can do this:
use Time::Dilation 2.4;
When you assert a version of a module in a use statement, Perl will check the package variable $Module::Name::VERSION and throw an exception if the version is lower than the version needed.
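Under the hood, the check is performed by the VERSION() method that every package inherits. This sketch defines a version 2.3 Time::Dilation package inline and calls that method directly, trapping the exception with eval:

```perl
use strict;
use warnings;

# A hypothetical module at version 2.3, defined inline for brevity.
{
    package Time::Dilation;
    our $VERSION = 2.3;
}

# "use Time::Dilation 2.4;" performs this check after loading the module.
my $ok    = eval { Time::Dilation->VERSION(2.4); 1 };
my $error = $@;

print $ok
    ? "version is new enough\n"
    : "version check failed: $error";
```

The error message states that version 2.4 is required and that this is only version 2.3.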
If you do not provide a version number for the module, it makes life difficult for other developers trying to use your code. As a general rule, the version declaration we’ve shown above should be sufficient for your needs, but some argue for the following:
our $VERSION = '0.001';    # make sure it's in quotes!
$VERSION = eval $VERSION;
David Golden has an excellent description of why this is the preferred way to write version numbers:
http://www.dagolden.com/index.php/369/version-numbers-should-be-boring/
In short, it allows the module’s version number to always be considered the same regardless of how the module is being used, such as when the $VERSION is parsed for a CPAN upload or which version of Perl you are using. The full reasons are beyond the scope of this book, but it’s worth reading David Golden’s writing on this topic.
Subroutines in other packages
When building software, you have different parts of the software for different tasks. However, you probably don’t want to type my $pi = My::Number::Utilities::pi() all the time. Also, the My::Number::Utilities module likely has “private” subroutines that you should not be able to call. The former is handled by exporting and the latter is handled by naming conventions.
Exporting
Let’s look at our My::Number::Utilities package again:
package My::Number::Utilities;
use strict;
use warnings;
our $VERSION = 0.01;
sub pi() { 3.14166 } # good enough for 2,000 year old aqueducts and bridges

sub is_prime {
    my $number = $_[0];
    return if $number < 2;
    return 1 if $number == 2;
    for ( 2 .. int sqrt($number) ) {
        return if !($number % $_);
    }
    return 1;
}

1;
Others using this package are going to get annoyed at typing My::Number::Utilities::pi() and My::Number::Utilities::is_prime() every time they want to use those functions, so you can export those functions to the calling code’s namespace. The most popular module for doing this is Exporter, which has shipped with Perl since version 5. Here’s another case where you need to use package variables because Exporter’s interface requires it.
Near the top of module code you review, you’ll often see the following:
use base 'Exporter';
our @EXPORT_OK   = qw(pi is_prime);
our %EXPORT_TAGS = ( all => \@EXPORT_OK );
This means that those functions, pi() and is_prime(), can be exported. Now, when someone wants to use your module, they can import those functions by specifying their names in the import list:
use My::Number::Utilities 'pi', 'is_prime';
Or if they prefer, they can import only the functions they want:
use My::Number::Utilities 'is_prime';
print is_prime($number) ? "$number is prime" : "$number is not prime";
When you use base 'Exporter', you are inheriting from the Exporter module (we’ll cover inheritance in Chapter 12, Object Oriented Perl; for now, just follow along). When someone uses your module, the Exporter::import() function is called. It uses your module name and finds the @EXPORT, @EXPORT_OK and %EXPORT_TAGS package variables to determine what functions can be exported. Only function names in @EXPORT_OK and @EXPORT can be exported, but using @EXPORT is usually not recommended because @EXPORT exports all the functions listed in the array by default, and the programmer using your module can easily end up with names in their namespace that they never asked for. It’s no fun accidentally importing a build() function and overwriting the build() function you have in your namespace.
Note
We’ve mentioned that we’ll explain inheritance in Chapter 12, Object Oriented Perl when we cover objects, but it should be mentioned that for modules that are not object oriented, some people object to inheriting from Exporter. Exporter allows you to import the import() subroutine, if preferred:
use Exporter 'import';
This imports the import() subroutine directly into your namespace. No other change in your code is required.
With the above code, if a programmer wants all of the functions, they can ask for that with :all.
use My::Number::Utilities ':all';
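To try this without creating separate files, the following sketch defines the module inline. The BEGIN block and the %INC assignment are tricks used only to make a single-file example work with use; in real code the package would live in lib/My/Number/Utilities.pm and neither would be needed:

```perl
use strict;
use warnings;

BEGIN {
    package My::Number::Utilities;
    use Exporter 'import';

    our @EXPORT_OK   = qw(pi is_prime);
    our %EXPORT_TAGS = ( all => \@EXPORT_OK );

    sub pi { 3.14159265359 }

    sub is_prime {
        my $number = $_[0];
        return if $number < 2;
        for ( 2 .. int sqrt($number) ) {
            return if !( $number % $_ );
        }
        return 1;
    }

    # Mark the module as already loaded so "use" does not look for a file.
    $INC{'My/Number/Utilities.pm'} = __FILE__;
}

use My::Number::Utilities ':all';

# Both functions are now callable without the package name.
my @primes = grep { is_prime($_) } 1 .. 20;
print "primes: @primes\n";
print "pi: ", pi(), "\n";
```

This prints the primes 2 3 5 7 11 13 17 19 followed by the value of pi.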
The %EXPORT_TAGS package hash has key/value pairs specifying groups of functions that can be exported. When using your module, a developer uses a key name from %EXPORT_TAGS, prefixed with a colon, :, to say that they want that group of functions.
Note
You can also export package variables with Exporter, but this is a bad idea for the reasons already discussed in the Package Variables section of this chapter.
So if your My::Number::Utilities module has subroutines for mathematical constants such as pi, e and phi, you can allow those to be imported separately from the :all tag:
our %EXPORT_TAGS = (
    all       => \@EXPORT_OK,
    constants => [qw(pi e phi)],
);
Note
You may have noticed that we declared our pi constant subroutine with a null prototype:
sub pi() { 3.14159265359 }
When Perl sees such a prototype, if the body of the subroutine is simple, Perl will try, at compile time, to replace all instances of that subroutine call with the value it returns. This is faster than the subroutine call and is called inlining. So if you have a null prototype, this:
use My::Number::Utilities 'pi';
print pi;
Is equivalent to this:
use My::Number::Utilities 'pi';
print 3.14159265359;
See “Constant Functions” in perldoc perlsub.
And someone who just wants the constants and not the is_prime() function can ask for them:
use My::Number::Utilities ':constants';
Note that you’ll often see the constant pragma used to declare constants. These can be exported just like any other subroutine because they’re just created as subroutines with null prototypes.
our @EXPORT_OK = qw(PI E PHI);
use constant PI  => 3.14159265359;
use constant E   => 2.71828182846;
use constant PHI => 1.61803398874;
Though Exporter is the most common way of exporting functions into other namespaces, modules such as Exporter::NoWork, Perl6::Export and others exist for those who do not care for the Exporter syntax. See a CPAN near you for the latest and greatest alternatives.
Naming conventions
You’ve seen that subroutines representing constants are often UPPER CASE subroutines, but what about other functions?
For a function that other modules are allowed to use, a normal function name is standard:
sub unique {
    #
}
However, for subroutines you wish to remain private, by convention those are prefixed with an underscore:
sub _log_errors {
    #
}
While a developer may know about those subroutines, they also know that they should not rely on them. This is no guarantee that they won’t try to use these subroutines, but a good developer knows that they should not. Perl doesn’t try to rigorously enforce privacy by default. You’re expected to behave yourself.
Note
If you really insist upon truly private subroutines, use an anonymous subroutine assigned to a scalar. For example:
package Really::Private;
use strict;
use warnings;
our $VERSION = '0.01';
use Carp 'croak';

my $is_arrayref_of_hashrefs = sub {
    my $arg = shift;

    # is it an array ref?
    return unless 'ARRAY' eq ref $arg;

    # return boolean indicating if all elements are hashrefs.
    return scalar @$arg == scalar grep { 'HASH' eq ref $_ } @$arg;
};

sub process_records {
    my ($records) = @_;
    unless ( $is_arrayref_of_hashrefs->($records) ) {
        croak "process_records() requires an array ref of hashrefs";
    }

    # process records here
}

1;
In the Really::Private package described above, the $is_arrayref_of_hashrefs variable contains an anonymous subroutine reference called by process_records(). Because the subroutine reference is bound to a lexical scalar, it is not available outside of the file (or block) in which it is declared.
For subroutines that should return a Boolean value, we recommend starting them with is_.
sub is_prime { ... }
Also, you’re strongly urged to use_underscores_to_separate_names insteadOfUpperCaseLetters. Underscores are much easier to read, particularly if English is not your first language.
BEGIN, UNITCHECK, CHECK, INIT and END
There are four special blocks, BEGIN, CHECK, INIT and END, that are executed at different stages of your program. There is also UNITCHECK, which was introduced in Perl version 5.9.5.
These blocks look like subroutines, but they’re not (you can prefix them with the sub keyword, but it’s considered bad style). They automatically execute and cannot be called. Furthermore, you can have multiples of each of these blocks. For example:
package Foo;
use strict;
use warnings;

BEGIN {
    print "This is the first BEGIN block\n";
}

BEGIN {
    print "This is the second BEGIN block\n";
}
And when you use Foo (or even if you just check its syntax with perl -c Foo.pm), it will print out:
This is the first BEGIN block
This is the second BEGIN block
Note
You can read more about BEGIN, INIT, CHECK, UNITCHECK and END in perldoc perlmod.
To understand these special blocks, think of a Perl program’s “lifecycle” as the following steps (this is an oversimplification):
The program is compiled.
The program is executed.
The program is finished.
BEGIN blocks
BEGIN blocks fire during program compilation (Step 1), as soon as the trailing } is found. Ordinarily a print statement happens in Step 2, when the program is executed, so this:
BEGIN {
    print "This is the first BEGIN block\n";
}
print "The program is running\n";
BEGIN {
    print "This is the second BEGIN block\n";
}
Will print this:
This is the first BEGIN block
This is the second BEGIN block
The program is running
Note that if your program contains a syntax error after a BEGIN block, the BEGIN block will still execute because it is executed as soon as it is compiled and before the program finishes compiling. So this:
BEGIN {
    print "This is the first BEGIN block\n";
}
print "The program is running\n";
BEGIN {
    print "This is the second BEGIN block\n";
}
my $x =;
Will print something like this:
syntax error at /var/tmp/eval_GWUz.pl line 8, near "=;"
Execution of /var/tmp/eval_GWUz.pl aborted due to compilation errors.
This is the first BEGIN block
This is the second BEGIN block
Note that because STDERR and STDOUT are separate filehandles, they are not guaranteed to print in sequence, thus leading to this strange case where we may get the syntax error printed before we have the BEGIN block output printing. This is an artifact of how operating systems work and is not a flaw in Perl.
BEGIN blocks always execute in the order they are found.
Note
Note that this:
use Module ();
Is exactly equivalent to this:
BEGIN { require Module; }
That’s because the parentheses with use Module () tell Perl not to call the import() function and the BEGIN block with require therefore does the same thing.
BEGIN blocks are useful for a variety of purposes, such as checking whether or not necessary files exist before the program runs or verifying that you are on the correct operating system.
END blocks
END blocks are like BEGIN blocks, but they happen in Step 3, when the program is exiting. They will even be called if you die(), but signals and (the incredibly rare) segfaults can cause them to be skipped. They are useful if you need to clean anything up after your program finishes running.
They are executed in the reverse order that they are defined. Thus this:
END {
    print "This is the first END block\n";
}
END {
    print "This is the second END block\n";
}
Prints this:
This is the second END block
This is the first END block
INIT, CHECK and UNITCHECK blocks
INIT, CHECK and UNITCHECK blocks fire after your program is done compiling but before it executes. Because of this, unlike BEGIN blocks, they will not execute if there is a syntax error. CHECK blocks run immediately after Step 1 (compilation) is finished, in LIFO (last in, first out) order. INIT blocks run after CHECK blocks and just before Step 2, the program execution. They run in the order they are defined (FIFO, first in, first out).
INIT {
    print "This is the first INIT block\n";
}
CHECK {
    print "This is the first CHECK block\n";
}
INIT {
    print "This is the second INIT block\n";
}
CHECK {
    print "This is the second CHECK block\n";
}
That prints out:
This is the second CHECK block
This is the first CHECK block
This is the first INIT block
This is the second INIT block
As you can see, the CHECK blocks run before the INIT blocks, in reverse order. The INIT blocks are run in the order defined.
UNITCHECK blocks were introduced in Perl 5.9.5. They were designed to solve a problem where code that is loaded during program execution (such as with a require MODULENAME or a string eval) will not execute CHECK and INIT blocks. Those blocks would not be executed because they only execute between compilation and execution, not during execution.
A UNITCHECK runs immediately after the code containing it is compiled, even if you are already in the program execution phase. This allows you to “check” necessary conditions before the containing code is executed.
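To tie the phases together, this sketch records the order in which each kind of block fires. The blocks are deliberately written out of order in the source, and the @ORDER array is our own bookkeeping. It requires Perl 5.10 or later for UNITCHECK:

```perl
use strict;
use warnings;

our @ORDER;

INIT      { push @ORDER, 'INIT' }
CHECK     { push @ORDER, 'CHECK' }
UNITCHECK { push @ORDER, 'UNITCHECK' }
BEGIN     { push @ORDER, 'BEGIN' }

push @ORDER, 'main program';

# UNITCHECK also fires for code compiled at runtime, such as a string eval.
eval q{ UNITCHECK { push @ORDER, 'UNITCHECK in eval' } push @ORDER, 'eval body'; 1 }
    or die $@;

print "$_\n" for @ORDER;

END { print "END\n" }
```

When run as a script, this should list BEGIN, UNITCHECK, CHECK, INIT, main program, UNITCHECK in eval and eval body, followed by END as the program exits.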
Plain Old Documentation (POD)
Documentation is worth it just to be able to answer all your mail with ‘RTFM’ ~ Alan Cox
So we’ve written a lot of code by now, but what about documenting it? Whenever you read a module’s documentation on the CPAN, you’re reading POD, short for Plain Old Documentation. POD is a very quick and easy way to write documentation for your modules and it’s quick to learn.
Note
POD is not just for modules. You are encouraged to use it in all of your code that has a lifespan greater than a few hours. You will thank yourself.
When reading module documentation on your computer, you generally do so with the perldoc command. When you type something like perldoc Convert::Distance::Imperial, it searches through @INC for Convert/Distance/Imperial.pm and Convert/Distance/Imperial.pod. It will attempt to format a .pm file if it contains POD. It automatically assumes that a .pod file is POD. This allows you to write a module and keep the documentation in a separate file, if desired:
|--convert.pl
| lib/
| | Convert/
| | | Distance/
| | | |--Imperial.pm
| | | |--Imperial.pod
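The lookup described above can be sketched in a few lines of Perl. This is only an illustration of the idea, not how perldoc is actually implemented; find_doc_files is a hypothetical helper name, and the real tool applies more rules than this:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: list the files a perldoc-style lookup could consider
# for a module name, checking each directory for a .pod and a .pm file.
sub find_doc_files {
    my ( $module, @dirs ) = @_;
    my @parts = split /::/, $module;
    my @found;
    for my $dir (@dirs) {
        next if ref $dir;    # @INC may contain code references; skip them
        for my $ext (qw( pod pm )) {
            my $file = File::Spec->catfile( $dir, @parts ) . ".$ext";
            push @found, $file if -f $file;
        }
    }
    return @found;
}

# File::Spec is used here only because it is certain to be installed.
print "$_\n" for find_doc_files( 'File::Spec', @INC );
```

With the layout shown above and lib/ in the search path, asking for Convert::Distance::Imperial would report both Imperial.pod and Imperial.pm.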
POD starts with a command paragraph. A command paragraph is any text starting with = and followed by an identifier. Though it’s called a “paragraph”, it is usually on a single line. POD ends with the =cut command paragraph (or the end of the file). There must be no whitespace to the left of the =.
Warning
Though the =cut on a line by itself is usually sufficient to indicate the end of the POD section, many older POD parsers require a blank line before the =cut.
In general, any text typed as a paragraph in POD will be rendered as such. Here’s a POD paragraph between two subroutines:
sub reciprocal { return 1 / shift }

=pod

This is a paragraph in a POD section. When run through a formatter, the
paragraph text will be rewrapped as needed to fit the needs of your
particular output format.

=cut

sub not_reciprocal { return shift }
The following command paragraphs are recognized. You may create custom ones if you create your own POD parser.
=pod
=head1 Heading Text
=head2 Heading Text
=head3 Heading Text
=head4 Heading Text
=over indentlevel
=item stuff
=back
=begin format
=end format
=for format text...
=encoding type
=cut
While POD documentation is often interspersed with code, particularly with a documentation section before each subroutine, many programmers prefer to put their documentation at the end of the module, after a __END__ or __DATA__ literal. There are arguments for and against each style. We’ll leave that up to you.
Documentation Structure
Though the exact format varies, you’ll notice that most modules on the CPAN follow a documentation layout similar to the following (and generally in this order):
NAME - The module name
SYNOPSIS - A brief code snippet showing usage
DESCRIPTION - A description of what the module is for
EXPORT - An optional list, if any, of what the module exports
FUNCTION/METHODS - Detailed descriptions of every subroutine/method
BUGS - Known bugs and how to report new ones
AUTHOR - Who wrote the module (often more than one author)
LICENSE - The license terms of the module
It is strongly recommended that you follow this format unless you have a strong reason not to. This makes your documentation consistent with other Perl modules and makes it easier to read. See the documentation for DBIx::Class for a good example of why you might want a slightly different format.
Other common sections include VERSION, DIAGNOSTICS, SEE ALSO (related modules) and CONTRIBUTORS (non-authors who’ve nonetheless offered useful feedback or patches).
The sections generally begin with a =head1 command paragraph.
=head1 NAME

Convert::Distance::Imperial - Convert imperial units to other units

=head1 VERSION

VERSION 0.001

=head1 SYNOPSIS

    use Convert::Distance::Imperial 'miles_to_inches';
    my $inches = miles_to_inches(453285);
Headings
POD, by default, supports 4 levels of headings:
=head1 ALL CAPS TEXT
=head2 Some Text
=head3 Some text
=head4 Some text
The pod2html formatter, included with Perl, will render these as <h1>, <h2>, <h3> and <h4>, respectively. Other POD formatters will obviously make different choices. The ALL CAPS for the =head1 command is not strictly required for all POD formatters, but some require it, so you should probably stick with it.
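To see how a formatter turns headings into markup, here is a small sketch. Note the assumption: it uses Pod::Simple::HTML, a different formatter from pod2html, simply because it is easy to drive from a short script. Its exact markup will not match pod2html's, but the heading levels map the same way.

```perl
use strict;
use warnings;
use Pod::Simple::HTML;

# Build the POD as a string. Paragraphs are separated by blank lines.
my $pod = join "\n\n",
    '=head1 NAME',
    'Convert::Distance::Imperial - Convert imperial units to other units',
    '=head2 Some Text',
    'A paragraph under a second-level heading.',
    '=cut',
    '';

my $html   = '';
my $parser = Pod::Simple::HTML->new;
$parser->output_string( \$html );    # collect the output in $html
$parser->parse_string_document($pod);

# Show only the heading elements from the generated page.
print "$1\n" while $html =~ m{(<h[1-4]\b.*?</h[1-4]>)}sg;
```

The =head1 line should appear inside an h1 element and the =head2 line inside an h2 element, though the surrounding attributes depend on the formatter version.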
Paragraphs
A paragraph in POD, as mentioned, is merely text you type in a POD section. Note that there must be no whitespace at the start of any paragraph line.
=pod

This is a POD paragraph.

This is a second POD paragraph.

=cut
Lists
A list begins with =over indentlevel (typically the number 4), has one or more =item commands and ends with a =back command.
=over 4

=item * This is a list item

=item * This is a second list item.

This is an optional paragraph explaining the second list item.

=back
You may have an optional paragraph after an =item command. No =headn commands are allowed and while you might think that a nested list is handy, not all POD formatters respect them.
If you don’t want a bulleted list, you can create a numbered list manually. Use 1., 2., 3., etc., after each =item. Many POD parsers are weak in this area, so double check that your desired POD parser handles this correctly.
=over 4

=item 1. This is a list item

=item 2. This is a second list item.

This is an optional paragraph explaining the second list item.

=item 3.

=back
If you don’t want a bulleted or numbered list, just use =item followed by your desired list text.
The indentlevel is optional as it defaults to 4. Many POD formatters ignore it entirely, while others consider that to be the number of ems width of indent (an em is the width of the capital letter M in the base font of the document).
Verbatim
So why can’t we have any whitespace at the start of a line of a normal POD paragraph? Because any text with leading whitespace is rendered verbatim. This makes it very easy to insert code in your documentation.
=head1 SUBROUTINES

=head2 C<miles_to_yards>

    use Convert::Distance::Imperial 'miles_to_yards';
    my $yards = miles_to_yards($miles);
    print "$miles miles is $yards yards\n";

The C<miles_to_yards()> subroutine takes a number, in miles, and returns a
number, in yards.
We’ll explain that funky C<> stuff in just a bit.
Miscellaneous
With headings, paragraphs, lists and verbatim text, you now know most of the POD syntax people use. However, there are a few other commands we’ll take the time to explain. These are merely the most popular and you will want to read perldoc perlpod to understand what else you can do.
Formatting codes
Sometimes you want to have a bit more control over the output. It can be useful to have fixed-width text, bold or italics text. All paragraphs and some command paragraphs allow formatting codes, also known as interior sequences. Here’s how I might format this paragraph in POD:
Sometimes you want to have a bit more control over the output. It can be useful to have C<fixed-width text>, B<bold> or I<italics> text. All paragraphs and some command paragraphs allow formatting codes, also known as I<interior sequences>. Here's how I might format this paragraph in POD.
Formatting codes begin with a single upper case letter, followed by a <, followed by the desired text and ending with a >. Some POD formatters require all of these to be on the same line. Table 11.1, “Common Formatting Codes” has common POD formatting codes.
Table 11.1. Common Formatting Codes
Code | Meaning |
|---|---|
C<text> | Fixed-width ('C’ode) |
C<< text >> | Fixed-width and ignore special characters ( |
B<text> | Bold |
I<text> | Italics |
E<text> | Escape text (generally, you can use HTML escape names such as |
S<text> | All spaces are non-breaking |
L<text> | Create a link |
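As a small illustration of several of these codes together, here is a sketch of a script with its documentation after __END__. The heading text and sentences are made up for the example; the codes themselves (C<< >>, E<lt>, E<gt>, B<>, I<> and S<>) are the standard ones described in perldoc perlpod.

```perl
use strict;
use warnings;

print "Run 'perldoc $0' to see the formatted documentation.\n";

__END__

=head1 FORMATTING CODE EXAMPLES

Sort numerically with C<< $a <=> $b >>. The doubled angle brackets, with
a space inside each pair, let the code contain angle brackets without
confusing the formatter.

In ordinary text, write E<lt> and E<gt> to get literal angle brackets.

B<Warning:> the command S<make realclean> should stay on one line.
I<Italics> work as well.

=cut
```

Running perldoc on the file is a quick way to check how your own formatter renders each code.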
Linking
Linking is often skipped in Perl modules, but it’s good to know, particularly since good linking will show up on CPAN modules and make it easier to cross-reference other documents. Note that names must not contain | or / and if they contain < or >, they must be balanced.
There are three primary linking formats:
L<name>
This links to a Perl manual page, such as L<Scalar::Util> or L<perlunitut>. This form of linking does not allow spaces in the name.
L<name/"sec"> or L<name/sec>
Link to a section of a man page. For example, L<perlpod/"Formatting Codes">
L</"sec"> or L</sec>
Link to another section of the current POD document.
If you prefer, you can prefix any of these with a text| to give them a more readable name:
L<Read about formatting codes|perlpod/"Formatting Codes">
Or that’s the theory, at least. Some POD formatters struggle with this syntax.
Finally, you can link to a URL:
L<http://overseas-exile.blogspot.com/>
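Older POD formatters do not let you give a nice “text” name to a URL, so the text|link syntax may not work with them. More recent versions of perldoc perlpod document L<text|scheme:...> as valid, so check the formatter you are targeting.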
Encoding
Your POD documents are generally written in ASCII or Latin-1. However, if you need them to be in another encoding, you must specify this with the =encoding command.
=encoding utf8
The Encode::Supported documentation, part of the Encode distribution, will give you a list of supported encodings.
Creating and Installing
Writing a module is all fine and dandy, but what about installing it? When a module is properly installed, you no longer require a use lib 'lib'; line to tell Perl where to find it. The module will probably be installed in one of the paths in @INC and Perl will find it when you use it. This is also the first step in creating a distribution that can be given to other programmers for installation or uploaded to the CPAN. Sharing is good. Installable modules are shareable modules.
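If you are curious where Perl looks for modules, a short sketch like the following prints the search path. The use lib 'lib' line is included only to show its effect on @INC; once a module is properly installed you would drop it.

```perl
use strict;
use warnings;
use lib 'lib';    # adds ./lib to the front of @INC at compile time

print "Perl will search these locations for modules:\n";
print "  $_\n" for @INC;
```

The lib entry shows up near the top of the list, ahead of the directories where installed modules live.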
Creating a simple module
In the old days, people used a program that ships with Perl called h2xs, but it’s so old and out of date that we mention it just to say “don’t bother”. Today many people are using Dist::Zilla to create and install modules and while we recommend it, it’s beyond the scope of this book. Instead, we recommend that you start by installing Module::Starter from the CPAN. It will provide a module-starter program. Let’s create an installable version of our Convert::Distance::Imperial program.
module-starter --module=Convert::Distance::Imperial \
    --author='Curtis "Ovid" Poe' \
    --email=ovid@cpan.org
That will create a directory named Convert-Distance-Imperial/. Using our tree.pl program, we see the following directory structure:
$ tree.pl Convert-Distance-Imperial/
Convert-Distance-Imperial/
|--Changes
|--MANIFEST
|--Makefile.PL
|--README
|--ignore.txt
| lib/
| | Convert/
| | | Distance/
| | | |--Imperial.pm
| t/
| |--00-load.t
| |--boilerplate.t
| |--manifest.t
| |--pod-coverage.t
| |--pod.t
There’s a lot of stuff here, so let’s go over each item.
Note
You may find it annoying to type your name and email every time you run module-starter. perldoc Module::Starter doesn’t (as of this writing) suggest how to avoid that, but perldoc module-starter tells you that you can create a $HOME/.module-starter/config file (where $HOME is your home directory) and add your name and email in that:
author: Curtis "Ovid" Poe
email: ovid@cpan.org
Then you can just type:
module-starter --module=My::Module
And the author and email information will be filled in for you.
The Changes file contains a list of changes for each version of your program.
The MANIFEST should list each file that must be included in the actual distribution.
The Makefile.PL is a Perl program used to create a makefile that you use with make (or sometimes nmake or dmake on Windows). A makefile is a file that explains how to build your software. If you decide to read the makefile, be very careful not to change it unless you’re familiar with them. It contains embedded tabs and if you accidentally convert them to spaces, you will break the makefile.
The README is for the user to understand how to build and install the distribution and often has the distribution documentation embedded in it.
The ignore.txt is a template to use with various version control systems, such as git or Subversion, to know which files to ignore. You often want to copy that file (or its contents) to an appropriately named file for your version control system.
The lib/ directory contains the modules you wish to install.
The t/ directory contains the tests for the module. We’ll cover testing in Chapter 14, Testing.
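To give a feel for what a Makefile.PL contains, here is a minimal sketch using ExtUtils::MakeMaker's WriteMakefile function. The file that module-starter generates for you will differ in its details (it typically sets more options), so treat this as an illustration and not as the generated file.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker;

# Describe the distribution; running "perl Makefile.PL" writes the makefile.
WriteMakefile(
    NAME          => 'Convert::Distance::Imperial',
    VERSION_FROM  => 'lib/Convert/Distance/Imperial.pm',    # reads $VERSION
    ABSTRACT_FROM => 'lib/Convert/Distance/Imperial.pm',    # reads POD NAME
    AUTHOR        => 'Curtis "Ovid" Poe <ovid@cpan.org>',
    PREREQ_PM     => { 'Test::More' => 0 },                 # any version
);
```

VERSION_FROM and ABSTRACT_FROM let the module file remain the single source for the version number and the one-line description.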
You can just copy your version of Convert/Distance/Imperial.pm to Convert-Distance-Imperial/lib/Convert/Distance/Imperial.pm and you have an installable module.
Let’s do that now in Try It out 11-2.
Now that you’ve created your first “proper” module, turning it into a distribution is a snap. After you’ve done perl Makefile.PL, make and make test, you can type make dist and something called a tarball will be created for you. In this case it will have a name similar to Convert-Distance-Imperial-0.01.tar.gz. That’s suitable for uploading to the CPAN. If you wish to do that, you will need a PAUSE (Perl Authors Upload SErver) account. You can apply for one at https://pause.perl.org/ and start sharing your CPAN modules with everyone else.
You will also notice that after you type make, there are many extra files that have been built, such as a makefile, a blib/ directory and a pm_to_blib file. They can be useful for debugging build problems, but to make them go away, you can just type make realclean. They will return the next time you run perl Makefile.PL and make.
Makefile.PL or Module::Build?
You might think it odd that a language like Perl uses an external tool like a makefile to control how it builds its modules. After all, Java has ant and Ruby has rakefiles, why not a pure Perl alternative? This is because when Perl was first introduced, long before either Java or Ruby, it was very common on Unix-like systems and people who were likely to use Perl already knew about makefiles.
The Perl module that creates the actual makefile is called ExtUtils::MakeMaker (often referred to simply as EUMM) and it’s a beast to maintain. This is because there are many different implementations of the make program, not all of which are compatible with one another. Further, different operating systems have different constraints about filenames, paths, how commands get executed, and so on. Because the makefile must respect those constraints, the job gets harder. Imagine all of the different types of make utilities and the different incompatible operating systems and you can understand why this system has been hard to maintain.
As a result, the Module::Build project was started. It’s written entirely in Perl and is much easier to extend than EUMM. Unfortunately, Module::Build was buggy when it first came out. Further, because EUMM had some design flaws when it was implemented, Module::Build fixed some of those flaws and this led to subtle incompatibilities between the two. Michael Schwern, the maintainer of EUMM, has tried to convince people to switch to Module::Build, but many developers have chosen not to.
If you wish to use Module::Build with module-starter, just pass the --mb switch to module-starter. You’ll also want to read Module::Build::Authoring. The build process is then:
perl Build.PL
./Build
./Build test
./Build install
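For comparison with Makefile.PL, a minimal Build.PL might look like the following sketch. The values are illustrative and the file that module-starter --mb generates will likely contain more options; see Module::Build::Authoring for the full list.

```perl
use strict;
use warnings;
use Module::Build;

my $builder = Module::Build->new(
    module_name       => 'Convert::Distance::Imperial',
    license           => 'perl',
    dist_author       => 'Curtis "Ovid" Poe <ovid@cpan.org>',
    dist_version_from => 'lib/Convert/Distance/Imperial.pm',
    build_requires    => { 'Test::More' => 0 },
);

# Writes the ./Build script used by the commands shown above.
$builder->create_build_script;
```

Because the generated Build script is itself a Perl program, no make utility is needed at any step.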
Finally, there’s the Module::Install module. This is designed primarily to work with ExtUtils::MakeMaker and is very easy to learn, particularly for new programmers. Module::Build shipped with Perl starting with version 5.9.4, so many installations already have it. It was removed from the core distribution in Perl 5.22, so on newer versions you may need to install it from the CPAN. You will have to install Module::Install separately.
Summary
In this chapter, you have learned the basics of writing modules and building distributions. You’ve learned about the phases of program execution and how to export subroutines to other packages. You’ve also learned how to document your modules. We strongly recommend that you read perldoc perlmod for more information.
Exercises
1. Write a module, Convert::Distance::Metric, which contains the following subroutines:
kilometers_to_meters
meters_to_kilometers
Make those subroutines optionally exportable and let people also import all of them with:
use Convert::Distance::Metric ":all";
2. Add this module to your Convert-Distance-Imperial distribution. Don’t forget to add it to the MANIFEST.
3. Add full POD to the Convert::Distance::Metric module. Include the following sections:
NAME - The module name
SYNOPSIS - A brief code snippet showing usage
DESCRIPTION - A description of what the module is for
EXPORT - An optional list, if any, of what the module exports
FUNCTION - Detailed description of every subroutine
BUGS - Known bugs and how to report new ones
SEE ALSO - Include a link to Convert::Distance::Imperial.
AUTHOR - Who wrote the module (often more than one author)
LICENSE - The license terms of the module
Be sure to type perldoc lib/Convert/Distance/Metric.pm to verify the POD output.
4. Write a short program to convert 3.5 kilometers to meters and convert the answer back to kilometers.
5. (optional). We haven’t covered testing yet, but edit the t/00-load.t test program in the Convert-Distance-Imperial distribution and try to add a test to verify that you can load Convert::Distance::Metric. You can check to see if it works with:
prove -lv t/00-load.t
or just run:
perl Makefile.PL
make
make test
WHAT YOU LEARNED IN THIS CHAPTER
Topic | Key Concepts |
|---|---|
Namespace | A container which groups names |
Package | A namespace for package variables, subroutines, and so on |
Module | A file which contains one or more packages |
Distribution | A single file containing everything needed to build and install a module or group of modules |
use and require | Loading modules at compile time and run time |
Exporting | A way to put subroutines in other packages |
BEGIN, et al. | Blocks of code which execute at specific phases of the program run |
POD | How to document your code |
Answers to exercises
1. Write a module, Convert::Distance::Metric, which contains the following subroutines:
kilometers_to_meters
meters_to_kilometers
Make those subroutines optionally exportable and let people also import all of them with:
use Convert::Distance::Metric ":all";
One way of writing this package would be:
package Convert::Distance::Metric;
use strict;
use warnings;

our $VERSION = '0.01';

use Exporter 'import';
our @EXPORT_OK   = qw( kilometers_to_meters meters_to_kilometers );
our %EXPORT_TAGS = ( all => \@EXPORT_OK );

use constant METERS_PER_KILOMETER => 1000;

sub meters_to_kilometers {
    my $meters = shift;
    return $meters / METERS_PER_KILOMETER;
}

sub kilometers_to_meters {
    my $kilometers = shift;
    return $kilometers * METERS_PER_KILOMETER;
}

1;
Don’t forget that trailing 1!
2. Add this module to your Convert-Distance-Imperial distribution. Don’t forget to add it to the MANIFEST.
When you have added Convert::Distance::Metric, you should see a file layout like this in your lib/ directory:
lib/
| Convert/
| | Distance/
| | |--Imperial.pm
| | |--Metric.pm
Your MANIFEST should now look like this:
Changes
lib/Convert/Distance/Imperial.pm
lib/Convert/Distance/Metric.pm
Makefile.PL
MANIFEST            This list of files
README
t/00-load.t
t/manifest.t
t/pod-coverage.t
t/pod.t
If you do not include lib/Convert/Distance/Metric.pm in your manifest, it will not be included in the distribution when you type make dist.
3. Add full POD to the Convert::Distance::Metric module. Include the following sections:
NAME - The module name
SYNOPSIS - A brief code snippet showing usage
DESCRIPTION - A description of what the module is for
EXPORT - An optional list, if any, of what the module exports
FUNCTION - Detailed description of every subroutine
BUGS - Known bugs and how to report new ones
SEE ALSO - Include a link to Convert::Distance::Imperial.
AUTHOR - Who wrote the module (often more than one author)
LICENSE - The license terms of the module
Be sure to type perldoc lib/Convert/Distance/Metric.pm to verify the POD output.
For simplicity’s sake, we’re going to add the POD after a final __END__ literal.
__END__

=head1 NAME

Convert::Distance::Metric - Convert kilometers to meters and back

=head1 SYNOPSIS

    use Convert::Distance::Metric ":all";
    print kilometers_to_meters(7);
    print meters_to_kilometers(3800);

=head1 DESCRIPTION

This is a simple module to convert kilometers to meters and back. It's mainly
here to show how modules are built and documented.

=head1 EXPORT

The following functions may be exported on demand. You can export all of them
with:

    use Convert::Distance::Metric ':all';

=over 4

=item * C<kilometers_to_meters>

=item * C<meters_to_kilometers>

=back

=head1 FUNCTIONS

=head2 C<kilometers_to_meters>

    my $meters = kilometers_to_meters($kilometers);

This function accepts a number representing kilometers and returns the number
of meters in that number of kilometers.

=head2 C<meters_to_kilometers>

    my $kilometers = meters_to_kilometers($meters);

This function accepts a number representing meters and returns the number of
kilometers in that number of meters.

=head1 BUGS

None known. Report bugs via email to C<me@example.com>.

=head1 SEE ALSO

See the L<Convert::Distance::Imperial> module for imperial conversions.

=head1 AUTHOR

Curtis "Ovid" Poe C<ovid@cpan.org>

=head1 LICENSE

Copyright 2012 Curtis "Ovid" Poe.

This program is free software; you can redistribute it and/or modify it
under the terms of either: the GNU General Public License as published
by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.
4. Write a short program to convert 3.5 kilometers to meters and convert the answer back to kilometers.
use strict;
use warnings;
use lib 'lib';
use Convert::Distance::Metric ":all";

my $kilometers = 3.5;
my $meters     = kilometers_to_meters($kilometers);
print "There are $meters meters in $kilometers kilometers\n";
$kilometers = meters_to_kilometers($meters);
print "There are $kilometers kilometers in $meters meters\n";
Running the program should print out:
There are 3500 meters in 3.5 kilometers
There are 3.5 kilometers in 3500 meters
5. (optional). We haven’t covered testing yet, but edit the t/00-load.t test program in the Convert-Distance-Imperial distribution and try to add a test to verify that you can load Convert::Distance::Metric. You can check to see if it works with:
prove -lv t/00-load.t
or just run:
perl Makefile.PL
make
make test
The initial t/00-load.t looks something like this:
#!perl -T

use Test::More tests => 1;

BEGIN {
    use_ok( 'Convert::Distance::Imperial' ) || print "Bail out!\n";
}

diag( "Testing Convert::Distance::Imperial $Convert::Distance::Imperial::VERSION, Perl $], $^X" );
After you add Convert::Distance::Metric, it should look like this:
#!perl -T

use Test::More tests => 2;

BEGIN {
    use_ok( 'Convert::Distance::Imperial' ) || print "Bail out!\n";
    use_ok( 'Convert::Distance::Metric' )   || print "Bail out!\n";
}

diag( "Testing Convert::Distance::Imperial $Convert::Distance::Imperial::VERSION, Perl $], $^X" );
Testing this with the prove utility should produce output similar to the following:
$ prove -lv t/00-load.t
t/00-load.t ..
1..2
ok 1 - use Convert::Distance::Imperial;
ok 2 - use Convert::Distance::Metric;
# Testing Convert::Distance::Imperial 0.01, Perl 5.010001, /Users/curtispoe/perl5/perlbrew/perls/perl-5.10.1/bin/perl
ok
All tests successful.
Files=1, Tests=2,  0 wallclock secs ( 0.02 usr  0.01 sys +  0.02 cusr  0.00 csys =  0.05 CPU)
Result: PASS




