Chapter 2. Understanding the CPAN
WHAT YOU WILL LEARN IN THIS CHAPTER
Understanding the CPAN
Finding and evaluating modules
Downloading and installing modules manually
Using CPAN clients to install modules
This is the end of Chapter 10, Sort, map and grep. Or it was. Many Perl books, if they include information about the CPAN (Comprehensive Perl Archive Network), mention it almost as an afterthought, just as your author was going to. However, CPAN is the soul of Perl. Its use is so common that your author repeatedly stumbled on creating compelling examples of Perl without duplicating code already on the CPAN. Thus, the CPAN is now not only near the front of the book, it has an entire chapter all to itself. You cannot be a real Perl programmer without understanding the CPAN.
It’s been said that the best way to make a technology popular is to release a killer app that requires it. VisiCalc, the first spreadsheet program for personal computers, made the Apple II computer popular. Ruby on Rails is the killer app that made the Ruby programming language famous.
Perl has the CPAN. Though many have tried, nothing compares to the CPAN.
In 1994, on the Perl-packrats mailing list, an idea was born. The idea was simple: make a single place for Perl authors to upload their modules and for others to download them. That idea became the Comprehensive Perl Archive Network (CPAN) and was launched in 1995. Since then, it has grown to enormous size. By October of 2011, the CPAN had this to say for itself (http://www.cpan.org/):
The Comprehensive Perl Archive Network (CPAN) currently has 100,649 Perl modules in 23,600 distributions, written by 9,282 authors, mirrored on 269 servers.
The breadth of modules available on the CPAN is amazing. There are many popular Web frameworks available, and there is DBI, the standard database interface. Or, if you prefer ORMs (Object-Relational Mappers), there’s Rose::DB. The CPAN has plenty of artificial intelligence modules in the AI:: namespace (we’ll explain namespaces a bit more in Chapter 3, Variables) and more testing modules than you can imagine in the Test:: namespace. There’s even an entire bioperl distribution because Perl is used heavily in biology research.
There’s even an Acme:: namespace, where people upload humorous modules just for fun. Your author has over 40 modules on the CPAN at http://search.cpan.org/~ovid/, though many of them are for rather obscure problems.
In fact, that’s part of what makes the CPAN so great. When you have a relatively obscure problem, there’s a good chance there’s a CPAN module for it. Today, many are surprised when they have a problem and there’s not a CPAN module for it. Whenever possible, don’t reinvent the wheel. Look for a solution on the CPAN and see if you can save yourself a lot of time and effort by using someone else’s code. That’s why it’s there.
Oh, and did we mention that all code on the CPAN is both free and open source?
You’re going to see many differences between Windows and other operating systems here. That’s unfortunate, but we’ll try to minimize those differences as much as possible. The short description: use the automated tools we recommend for you (CPAN clients, for example) and don’t try to do this stuff manually. You’ll probably get it wrong until you understand what’s happening here. Fortunately, this is probably your biggest hurdle if you use Windows.
CPAN and METACPAN
The cpan.org Web site is the “original” CPAN and is currently the one that most people think of when they think of the CPAN Web site. It allows you to browse distributions, search distributions, check test results on modules and read reviews of said modules.
As an alternative to cpan.org, there’s the new metacpan.org. When writing a book, there’s always a danger in describing new technology due to the chance that it will change or cease to exist by the time the book is printed (I really hope my publisher doesn’t read this section), but metacpan.org has enough developers working on it and seems stable enough that it’s worth including in this book.
It has a search engine with autocomplete driven by the excellent ElasticSearch search engine (http://www.elasticsearch.org/). In addition to offering everything that cpan.org offers, it also has an API to allow you to write your own CPAN tools, if needed. You can sign up for a free account with metacpan, add modules as favorites, link other accounts to your metacpan account, and even register a PayPal donation email address. In short, it’s social networking for the CPAN. Add the API on top of it and your author expects that metacpan is the future of the CPAN. Your author has also been wrong before.
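As a taste of what the API makes possible, here is a minimal sketch that builds the URL for a module’s metadata record. The fastapi.metacpan.org/v1 base URL and the helper name metacpan_module_url are assumptions on our part, so check the metacpan documentation before relying on them.

```shell
# metacpan_module_url NAME: print the API URL for a module's metadata.
# Assumption: the API lives under https://fastapi.metacpan.org/v1
metacpan_module_url() {
    printf 'https://fastapi.metacpan.org/v1/module/%s\n' "$1"
}

# Usage (needs network access and the curl program):
#   curl -s "$(metacpan_module_url Weather::Google)"
```

The response, if the endpoint is as assumed, is a JSON document describing the module, which you could feed to whatever tool you’re writing.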
Alternatively, some people like other sites such as cpan.uwinnipeg.ca, but those are less popular.
You won’t actually be using much of this information when you first start learning Perl, but the further you go in your Perl journey, the more crucial CPAN will be. You will repeatedly find yourself facing a hard problem, only to find that someone else has done the work for you and uploaded it to the CPAN for you.
Finding and Evaluating Modules
For cpan.org, you can browse the modules at http://www.cpan.org/modules/index.html. You can browse by author, module name, recent modules, and so on. However, many people are looking for modules to handle a problem they need to solve, not for a particular author or module name. Given the size of the CPAN, browsing is somewhat impractical. You want to search for a module and not just browse them. For that, you want to use http://search.cpan.org/.
The front page of search.cpan.org has a list of module categories you can browse through, but given the size of the CPAN, this list is not well maintained. Instead, use the search box. So let’s say you need to write some software that displays the weather forecast. Searching for “weather” brings up something like this:
And that’s just the first page of search results!
Each result actually has a bit more detail. For example, the Weather::Google module has this:
Weather::Google
Perl interface to Google's Weather API
Weather-Google-0.05 (2 Reviews) - 26 Jan 2010 - Daniel LeWarne
The first line is the name of the module and it’s also a link to the module documentation. After that is a short description, its current distribution name, a link to reviews (if any), the date of its release and the author name. As you get more familiar with the CPAN and the Perl community, you’ll learn to recognize author names, which may help you decide whether a given distribution is worth looking at.
If you click on the Weather::Google link, you’ll be taken to a page looking similar to Figure 2-1.
There’s a lot of information on this page, so we’ll just cover the highlights.
The first thing you probably want to do is click on the next Weather::Google link, in the Modules section on the bottom of the page. That will generally show you the main documentation for the module. Larger modules, such as DBIx::Class, will often have many modules bundled together and you’ll need to read through the list carefully to understand which ones will give you the most useful information. There may even be a “Documentation” section below the Modules section.
In reading through the documentation, you’ll find that most Perl modules have a standard format. You’ll see sections for NAME, SYNOPSIS, DESCRIPTION, and so on. Just reading through those three sections should tell you if the module in question satisfies your needs.
If you return to the page illustrated in Figure 2-1, you’ll see that Weather::Google has a “CPAN Testers” section with PASS (337) FAIL (32). When you upload a module to the CPAN (well, to PAUSE, but we’re not covering that), many people will download your module and attempt to build it on their system. As you can see, Weather::Google fails on roughly 9% of the systems that reported (32 of 369 reports). This is a rather high failure rate and you might want to click the [ View Reports ] link and browse through some of the test failures to find out what’s going on.
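If you want to compare failure rates between candidate modules, the arithmetic is easy to script. The following is only a sketch; the function name fail_rate is ours and the numbers are the PASS and FAIL counts quoted above.

```shell
# fail_rate PASS FAIL: print the percentage of tester reports that failed.
fail_rate() {
    awk -v pass="$1" -v fail="$2" 'BEGIN {
        total = pass + fail
        if (total == 0) { print "n/a"; exit }   # no reports at all
        printf "%.1f\n", 100 * fail / total
    }'
}

fail_rate 337 32   # prints 8.7
```

A single percentage hides a lot, of course. The reports themselves tell you whether the failures are on a platform you care about.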
Weather::Google also has a Rating section. Most modules do not have user ratings attached, but here we see that we have two five-star (good) ratings. You can click through to read what the ratings say.
There is, of course, much more information available on this page and you should play around with it and try to learn a bit more about it.
Downloading and Installing
You’ve searched for a module, found one you want, and now you want to install it. That’s usually fairly simple once you’ve done it the first time or two, but getting to that first module to install can be problematic if you’re on Windows.
We’re going to explain how to do this manually because you’ll need to know when you eventually start writing your own modules. Later we’ll explain using various CPAN tools which will make most of this automatic. After you’ve read about manual installation, you’ll be grateful that there’s an automatic procedure that does all of this work for you. However, you’ll sometimes find that you need to install modules by hand, or maybe you’re just a masochist and like doing things the hard way. It’s up to you.
First, in Figure 2-1 you’ll see a “download” link next to the module name. Click that link to download the distribution. For the Weather::Google distribution we’ve been using, you’ll be downloading a file named Weather-Google-0.05.tar.gz. Most CPAN distributions (exceptions tend to be very old distributions) end with .tar.gz or .tgz. These are “tarred”, “gzipped” files. There’s some old Unix history going on behind the names, but you can ignore that. For OS X and Linux users, you can unpack the distribution with this command:
tar zxf Weather-Google-0.05.tar.gz
If you have the tar command, you can type man tar for more information about the tar command. Warning: it’s a long, complicated page and if you’re unfamiliar with man output, it can be daunting. A Web search may prove more useful.
Windows users will generally find that WinZip or other “zip” programs they have will allow them to unpack .tgz files. If you don’t have a command line interface, just double-click on the distribution icon to unpack it. Just make sure it’s unpacked into the correct directory.
It’s also possible that the distribution will come with a .zip extension. If your tar command supports zip archives (the bsdtar that ships with OS X does; GNU tar does not), you should be able to just tar zxf filename.zip. Otherwise, use a zip program to handle it. You won’t find these distributions very often and they’re usually from Windows users.
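If you unpack distributions often, a small helper that picks the right tool from the file extension saves some typing. This is a sketch under a couple of assumptions: the function name unpack_dist is ours, and the .zip branch assumes the unzip program is installed.

```shell
# unpack_dist FILE: unpack a CPAN distribution into the current directory,
# choosing the tool from the file extension.
unpack_dist() {
    case "$1" in
        *.tar.gz|*.tgz) tar zxf "$1" ;;
        *.zip)          unzip -q "$1" ;;   # assumes unzip is available
        *)              echo "unpack_dist: unknown archive type: $1" >&2
                        return 1 ;;
    esac
}

# Usage:
#   unpack_dist Weather-Google-0.05.tar.gz
```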
Once unpacked, change to the directory that’s created and list the files (if you’re on Windows, use the dir command instead of ls).
cd Weather-Google-0.05/
ls
You should see a list of files like the following:
Build.PL Changes INSTALL MANIFEST META.yml Makefile.PL README lib t
You can ignore most of those for now. The README file usually contains instructions for installing, but in this case, it’s merely a copy of the documentation that ships with the distribution. That’s OK. What you really are interested in are two files: Build.PL and Makefile.PL.
If you see Build.PL you can build, test, and install your distribution with this:
perl Build.PL
./Build
./Build test
./Build install
If you see Makefile.PL, you can do this:
perl Makefile.PL
make
make test
make install
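The choice between the two command sequences can be sketched as a small shell function that looks at which file the distribution ships. The name build_steps is hypothetical, and preferring Build.PL when both files exist is our assumption, not a rule. It only prints the commands so that you can read them before running anything.

```shell
# build_steps DIR: print the build, test and install commands for the
# distribution unpacked in DIR. Build.PL wins if both files are present
# (an assumption made for this sketch).
build_steps() {
    if [ -f "$1/Build.PL" ]; then
        printf '%s\n' 'perl Build.PL' './Build' './Build test' './Build install'
    elif [ -f "$1/Makefile.PL" ]; then
        printf '%s\n' 'perl Makefile.PL' 'make' 'make test' 'make install'
    else
        echo "build_steps: no Build.PL or Makefile.PL in $1" >&2
        return 1
    fi
}

# Usage:
#   build_steps Weather-Google-0.05
```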
You’ll want to read the output of each of those steps carefully to make sure they’re doing what you want. In this case, when you run perl Build.PL, it will have output similar to the following:
$ perl Build.PL
Checking prerequisites...
  requires:
    !  XML::Simple is not installed
  build_requires:
    !  Test::Pod is not installed

ERRORS/WARNINGS FOUND IN PREREQUISITES. You may wish to install the versions
of the modules indicated above before proceeding with this installation

Run 'Build installdeps' to install missing prerequisites.

Created MYMETA.yml and MYMETA.json
Creating new 'Build' script for 'Weather-Google' version '0.05'
Which means you’ll have to:
$ perl Build installdeps
And hope all of the dependencies install correctly. This may fail due to not having sufficient permissions or simply because some dependencies fail their tests. If your module has a Makefile.PL and no Build.PL, then it might not even allow you to automatically install these dependencies (it depends on how the Makefile.PL is written), thus forcing you to download and install all dependencies by hand, possibly repeating this procedure over and over.
Note that the ./Build test or make test steps are completely optional. They merely run any tests included with the distribution. If you run them, you’ll see output similar to this.
$ ./Build test
t/00-load.t ................. 1/1 # Testing Weather::Google 0.05
t/00-load.t ................. ok
t/01init.t .................. ok
t/02current_conditions.t .... ok
t/03forecast_conditions.t ... ok
t/04forecast_information.t .. ok
t/05language.t .............. ok
t/pod-coverage.t ............ ok
t/pod.t ..................... ok
All tests successful.
Files=8, Tests=388, 4 wallclock secs
Result: PASS
In the process of writing this, your author discovered that Weather::Google requires an internet connection for the tests to run. This is not surprising due to the fact that it contacts Google for the results, but it’s problematic because you won’t always have an internet connection when running tests. It’s one of many subtle issues you can run into when testing.
There’s also a problem with the ./Build install and make install commands. They often require root access and must be run like this:
sudo ./Build install
sudo make install
(If you’re a Windows user, this probably won’t apply as you’ll probably have Administrator access to your box.)
That’s because the default installation is usually in a directory that your regular user accounts won’t have access to. You can install your modules to some place you do have access to if you wish:
perl Build.PL --install_base /path/to/install/modules
# or
perl Makefile.PL INSTALL_BASE=/path/to/your/home/dir
Why do we have both Build.PL and Makefile.PL to build Perl modules? A long time ago, in a garage far, far away, Makefile.PL was created to allow creation of a Makefile to build your Perl module. Unfortunately, with over 100 supported platforms and many different and conflicting make programs, it turned out to be very difficult to write portable makefiles. Plus, some systems don’t support make at all! So Build.PL was created.
Makefile.PL relies on ExtUtils::MakeMaker to create makefiles. Build.PL relies on Module::Build, which needs only Perl to install itself. As Perl is far more portable than make, it was considered by some to be a better solution. In addition, ExtUtils::MakeMaker turns out to be far too difficult to extend for new features. Unfortunately, Module::Build has historically had a few bugs and many developers rejected it. It offers more features, but some of the same features needed to be implemented differently.
The battle between the two formats rages to this day and we’re rather stuck with the mess.
But now you need to understand a lot about how to tell Perl where to find these modules and that can get annoying if you’re not familiar with it. Instead, if you don’t use Windows, use perlbrew if possible. You’ll install the modules in a subdirectory off your home directory and perlbrew will magically handle making sure that Perl knows where your modules are.
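If perlbrew is not an option, the manual approach usually comes down to the PERL5LIB environment variable, which lists extra directories for Perl to search. A distribution installed with an install base of /some/dir puts its modules under /some/dir/lib/perl5. The sketch below builds a suitable value; the function name perl5lib_with and the $HOME/perl5 path in the usage note are only illustrations.

```shell
# perl5lib_with BASE: print a PERL5LIB value with BASE/lib/perl5 placed in
# front of whatever PERL5LIB already holds.
perl5lib_with() {
    if [ -n "${PERL5LIB:-}" ]; then
        printf '%s\n' "$1/lib/perl5:$PERL5LIB"
    else
        printf '%s\n' "$1/lib/perl5"
    fi
}

# Usage, with a hypothetical install base under your home directory:
#   export PERL5LIB="$(perl5lib_with "$HOME/perl5")"
```

Putting the export line in your shell startup file makes the setting stick between sessions.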
If you do use Windows, we recommend Strawberry Perl because most of this will magically work out of the box. However, if you prefer to use ActivePerl, you’ll want to read the ppm section later in this chapter. Fortunately, as of August 2009, ActivePerl has been updated to make using CPAN much easier. Make sure you’re using a recent version of ActivePerl, version 5.10.1 or later. The CPAN client bundled with it is preconfigured and when you first run it, it will note that you’re missing dmake and a compiler and it will download, build and install them for you. You’ll see a message similar to the following when you first run cpan:
C:\>cpan
It looks like you don't have a C compiler and make utility installed. Trying to install dmake and the MinGW gcc compiler using the Perl Package Manager. This may take a few minutes...
Then just wait a few minutes while it downloads and installs everything. After that is done, everywhere that you see instructions to run the make command, you’ll type dmake instead.
Or you can install Strawberry Perl and this is not an issue because it will come bundled with everything you need. Seriously, don’t try installing modules by hand unless you need to and know what you’re doing.
Have you been scared enough to not do this on your own? To be fair, we’ve only skimmed the surface of things that can go wrong if you try to install modules manually. Your author has been doing this for years and he’s quite used to it, but even he prefers the clients.
The CPAN.pm module is the oldest of the CPAN clients and comes bundled with Perl. To run it you just type cpan and that will put you in the CPAN shell. If you use Strawberry Perl for Windows (sense a theme here?), it’s configured for you already. Otherwise, it will first prompt you for basic information. The prompt message may vary. Older versions will ask:
Are you ready for manual configuration? [yes]
Newer versions will ask:
Would you like to configure as much as possible automatically? [yes]
Note that the sense of the question has been reversed. If you’re asked to configure as much as possible automatically, just hit Enter and cpan will set everything up for you, except for your urllist. The urllist tells the client where to find and download CPAN modules from CPAN mirrors all over the world. Just follow the instructions carefully, choosing the continent you’re on, then your country, and finally choosing a few mirrors that are hopefully close to you. Don’t stress too much about getting these mirrors perfect. In fact, newer CPAN clients will even ask whether you want them to pick the mirrors for you automatically, making this much easier than it used to be. All in all, getting started with a CPAN client is a breeze compared to what it used to be.
If you choose to go the manual configuration route, you will be asked many questions about the CPAN build and cache directory, the cache size, what you wish to cache, terminal settings, whether to follow prerequisites, where various programs are installed, and so on. Most of these questions have defaults and if you don’t understand the question, hitting Enter and accepting the default is usually fine.
After configuring the CPAN, you probably want to install Bundle::CPAN. To install a module, you type install module::name at the cpan prompt:
cpan > install Bundle::CPAN
This will take a while for the first time, but it will update your CPAN client to the latest version. It will also add a few extra features, such as readline support, that are not available by default due to license issues.
Next, install the Weather::Google module we mentioned earlier:
cpan > install Weather::Google
When you do this, the client will:
Find the latest version of the module
Download it
Unpack it
Build it
Follow dependencies (optional)
Test it
Install it
If there are any dependencies, the CPAN client will either ask whether you wish to install them or, if you’ve configured it to follow dependencies automatically, go through its find, download, unpack, build, follow, test and install steps for every dependency. For Weather::Google, we find that its dependencies include XML::Simple, which in turn has dependencies of its own. Having your client do all of this automatically for you is a huge timesaver and means it’s more likely to get it right than you are.
Note that if any tests fail, the client will not install the module. You can either choose a different module or, if you’ve investigated the tests and don’t think they apply to you, you can force the module to install anyway:
cpan > force install Weather::Google
To better understand what you can do with your cpan client, a small amount of help is available: enter h at the cpan prompt to see it. The output will vary considerably depending on the CPAN version you have installed.
If you’re using a Linux/OS X computer and you’ve decided to install modules in directories your regular user does not have access to, you may have to type sudo cpan to allow your modules to install. We recommend installing as a non-root user if feasible.
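One way to check whether that applies to you is to ask Perl where it installs modules by default and test whether you can write there. This is a rough sketch; the helper names site_lib and needs_sudo are ours and not part of any CPAN client.

```shell
# site_lib: print the directory where Perl installs site modules by default.
site_lib() {
    perl -MConfig -e 'print $Config{installsitelib}, "\n"'
}

# needs_sudo DIR: succeed if the current user cannot write to DIR.
needs_sudo() {
    [ ! -w "$1" ]
}

# Usage:
#   if needs_sudo "$(site_lib)"; then echo "root access needed"; fi
```

If the check says you need root access, consider installing to a directory you own instead.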
A new and popular CPAN client is cpanm, also known as App::cpanminus. It’s very fast, requires no configuration and has no dependencies on other modules. This makes it very easy to install. If you use a package management system such as those in Debian, FreeBSD ports, and so on, search for cpanminus and attempt to install it that way. You can also install it with this command:
curl -L http://cpanmin.us | perl - --sudo App::cpanminus
If you’re using local::lib, or some other method of ensuring your Perl modules do not require root access to install, you can omit the --sudo option:
curl -L http://cpanmin.us | perl - App::cpanminus
Finally, you can click the Download link at http://search.cpan.org/dist/App-cpanminus/ and install it manually, as explained previously.
tar zxf App-cpanminus-1.5004.tar.gz
cd App-cpanminus-1.5004/
perl Makefile.PL
make
make test
make install
The install step, as mentioned, may need to be changed to sudo make install.
If you’re on Windows and you’re using nmake, just change the last three lines:
nmake
nmake test
nmake install
Then, to install a module, just type cpanm module. The cpanm program will attempt to install the module for you, quickly and easily. It produces very little output beyond “downloading this, configuring that”, and related messages. Many modules will ask questions such as “Do you wish to install X”. cpanm attempts to just do the right thing without bothering you. Large, complicated modules with many dependencies can be a hassle to install even with the cpan client, but cpanm usually makes it pretty easy.
If you’re using ActivePerl, you’re probably on Windows and, if you have trouble with a CPAN client, you can use ppm, the “Perl Package Manager” that ships with ActivePerl. This uses a large set of prebuilt modules that just work. Want to install Text::CSV_XS?
ppm install Text::CSV_XS
If you run ppm without any arguments, a GUI will be launched and you can browse installed packages, upgrade, remove, or install new packages. The GUI will let you do anything the command line version of ppm will do and it may be a more comfortable environment for you to work in. However, you cannot upgrade core modules (modules that ship with Perl) with ppm. As a result, you cannot install any module that requires a core module to be upgraded.
CPAN::Mini isn’t really a client, but it’s so darned useful that you need to know about it. Sometimes you’ll find that you want to install a CPAN module, but you have no Internet connection or a very slow Internet connection. CPAN::Mini allows you to create a “mini” CPAN mirror on your computer, complete with the latest versions of all modules.
To start using CPAN::Mini, open up your favorite text editor and type the following:
local: ~/minicpan/
remote: http://cpan.pair.com/pub/CPAN/
Save that in your home directory as .minicpanrc. The local: key should point to where you want your miniature copy of CPAN to be stored. If you prefer, you can use a full path to a particular directory.
Note that Windows uses a backslash instead of a forward slash for directory separators, but Perl is smart enough to do the right thing, even if you use forward slashes instead.
The remote: key should point to a close CPAN mirror. You can see a list of CPAN mirrors at http://www.cpan.org/SITES.html.
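If you set up several machines, you may prefer to generate the configuration file rather than type it. The following sketch writes the two keys described above; the function name write_minicpanrc is ours, and the paths in the usage note are placeholders you should replace with your own.

```shell
# write_minicpanrc FILE LOCAL REMOTE: write a two-line CPAN::Mini
# configuration file with the given local directory and remote mirror.
write_minicpanrc() {
    cat > "$1" <<EOF
local: $2
remote: $3
EOF
}

# Usage, with a full path for the local mirror directory:
#   write_minicpanrc "$HOME/.minicpanrc" "$HOME/minicpan/" http://cpan.pair.com/pub/CPAN/
```

Note that the function overwrites the target file, so keep a copy if you’ve already customized yours.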
From there you run the minicpan command periodically to update your local copy. Note that the first time you run this command, it will take a long time because it will need to fetch the latest version of every CPAN module. If you run it regularly, subsequent updates will be much faster.
To install modules from your local CPAN::Mini mirror, you can configure your CPAN client to use this mirror:
$ cpan
cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.
cpan> o conf urllist unshift file:///Users/ovid/minicpan
Please use 'o conf commit' to make the config permanent!
As noted in the output, use o conf commit if you want this change to be permanent.
When this is done, attempting to install a module will fetch it from your local mirror instead of using the Internet.
If you use cpanm, tell it to use your mirror and only the mirror:
cpanm --mirror ~/minicpan/ --mirror-only Weather::Google
If you make heavy use of shell aliases, add the following to your list of aliases:
alias minicpanm='cpanm --mirror ~/minicpan/ --mirror-only'
And when you’re without an Internet connection, use minicpanm in place of cpanm.
Congratulations! You now know how to find and install modules from the CPAN! In this chapter you learned about the CPAN, the world’s largest collection of open source code dedicated to a single programming language. You learned about the cpan and cpanm clients, how to create a miniature CPAN mirror, and you’ve installed your first module, Weather::Google.
What You Learned in This Chapter
CPAN: The world’s largest collection of open source code for a single language.
cpan.org: The Web site for the CPAN
CPAN.pm: The original client program for downloading and installing CPAN modules
cpanm: A new and excellent alternative to CPAN.pm
ppm: The CPAN client bundled with ActivePerl
CPAN::Mini: Create a local CPAN mirror