A Perl Module Primer

Aug 18, 2007

I've recently been wrangling with some Perl code for a project at work, and have been putting together a Perl module that collects a number of common functions I need. To do that, I had to remind myself how to create a Perl module. I ran into a number of problems during my initial development, but I eventually worked through all of them. To help myself remember how to do this, and to help any other budding Perl developers, I've written the following little guide. Hopefully it will shed some light on the subject.

Let me preface this guide with two important statements:

  1. I'm not aiming to show you how to create a module for distribution. Most of the other tutorials cover that topic in depth.
  2. I am going to assume that you have a working knowledge of Perl.

To start, let's take a look at our sample module:

package MyPackage;
use strict;
use warnings;

require Exporter;
our @ISA = ("Exporter");

our %EXPORT_TAGS = ( 'all' => [ qw(sayHello whoAreYou $firstName
    %hashTable @myArray) ] );
our @EXPORT_OK = (@{ $EXPORT_TAGS{'all'} });
our @EXPORT = qw();

our $firstName = "Jonah";
our $lastName = "Bishop";

our %hashTable = ( a => "apple", b => "bird", c => "car" );
our @myArray = ("Monday", "Tuesday", "Wednesday");

sub sayHello
{
    print "Hello World!\n";
}

sub whoAreYou
{
    print "My name is $firstName $lastName\n";
}

1;

We start out by declaring our package name with the package keyword. Special Note: If you intend to have multiple modules, and you use the double colon (::) separator, you'll need to set up your directory structure to match. For example, if I had two modules, one named Jonah::ModuleOne and another named Jonah::ModuleTwo, I would need a folder named Jonah, inside of which would live the files ModuleOne.pm and ModuleTwo.pm.
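As a rough illustration of that mapping, the sketch below turns a package name into the relative path that Perl searches for under each directory in @INC. The module_path helper is a name made up for this example; it is not part of Perl itself.

```perl
use strict;
use warnings;

# module_path is a hypothetical helper, shown only to illustrate the
# package-name-to-file mapping that use and require rely on.
sub module_path {
    my ($package) = @_;
    (my $path = $package) =~ s{::}{/}g;   # Jonah::ModuleOne -> Jonah/ModuleOne
    return "$path.pm";
}

print module_path($_), "\n" for qw(Jonah::ModuleOne Jonah::ModuleTwo);
```

Each double colon becomes a directory separator, and the final name becomes a .pm file.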

I next enable the strict and warnings pragmas, since that's good programming practice. Lines 5 and 6 are standard to virtually all Perl modules. First, we require inclusion of the standard Exporter module, then we indicate that our module inherits from said Exporter (the @ISA (is a) array is what sets this).
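To see what the @ISA assignment buys us, here is a small sketch that defines a throwaway package inline (MyInherit is a hypothetical name) and asks Perl whether it now inherits from Exporter. The import method it reports is the one a use statement calls on our behalf.

```perl
use strict;
use warnings;

BEGIN {
    package MyInherit;            # hypothetical package, defined inline for the demo
    require Exporter;
    our @ISA = ("Exporter");      # MyInherit "is a" Exporter
}

# Method lookups that fail in MyInherit fall through to Exporter.
print "isa Exporter: ", (MyInherit->isa("Exporter") ? "yes" : "no"), "\n";
print "has import(): ", (MyInherit->can("import")   ? "yes" : "no"), "\n";
```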

Line 8 is where things get interesting. We need to specify what symbols we want to export from this module. There are a number of ways of doing this, but I have chosen to use the EXPORT_TAGS hash. Special Note: This is a hash, not an array! I recently spent about an hour trying to debug a strange error message, and it all stemmed from the fact that I had accidentally created this as an array.

The EXPORT_TAGS hash gives us a means of grouping our symbols together. We essentially associate a label with a group of symbols, which makes it easy to selectively choose what you want to import when using the module. In this example, I simply have a tag named 'all' which, as you might guess, allows me to import all of the specified symbols I provide in the associated qw() list. Note that you must precede exported variable names with their appropriate character: $ for scalars, @ for arrays, and % for hashes. Exported subroutines don't need to have the preceding & character, but it doesn't hurt if you put it there.
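Here is a minimal sketch of grouping symbols under more than one tag. The package name MyTagged and the tag names subs and vars are invented for this example, and the package is defined inline (with an explicit import call standing in for a use statement) so that the snippet runs as a single file.

```perl
use strict;
use warnings;

BEGIN {
    package MyTagged;                         # hypothetical package for the demo
    require Exporter;
    our @ISA = ("Exporter");

    our %EXPORT_TAGS = (
        subs => [ qw(sayHello) ],
        vars => [ qw($firstName @myArray) ],
    );
    our @EXPORT_OK = map { @$_ } values %EXPORT_TAGS;

    our $firstName = "Jonah";
    our @myArray   = ("Monday", "Tuesday", "Wednesday");
    sub sayHello { return "Hello World!" }
}

# Stands in for: use MyTagged qw(:vars);
BEGIN { MyTagged->import(':vars') }

print "firstName: $firstName\n";
print "myArray has ", scalar(@myArray), " entries\n";
print "sayHello imported? ", (defined &main::sayHello ? "yes" : "no"), "\n";
```

Only the symbols in the requested tag end up in the caller's namespace; sayHello stays behind because the subs tag was never asked for.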

Line 10 shows the EXPORT_OK array. This array specifies the symbols that the user is allowed to request. I have copied the contents of $EXPORT_TAGS{'all'} into it, so every symbol in the 'all' tag can be requested. I will show how to import that tag into a script in just a moment. Line 11 is the EXPORT array, which specifies the symbols that are exported by default. Note that I don't export anything by default. Special Note: It is good programming practice to not export anything by default; the user should specifically ask for their desired symbols when they import your package.
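The difference between the two arrays can be sketched as follows. MyQuiet and its secret subroutine are hypothetical names, and the import calls are made by hand so that each step can be observed at run time.

```perl
use strict;
use warnings;

BEGIN {
    package MyQuiet;                  # hypothetical package for the demo
    require Exporter;
    our @ISA       = ("Exporter");
    our @EXPORT    = ();              # nothing is exported by default
    our @EXPORT_OK = qw(sayHello);    # but sayHello may be requested
    sub sayHello { return "Hello World!" }
    sub secret   { return "not exportable" }
}

# A bare import (what "use MyQuiet;" does) brings in nothing.
MyQuiet->import();
print "after bare import:     ", (defined &main::sayHello ? "imported" : "not imported"), "\n";

# An explicit request for a symbol listed in @EXPORT_OK succeeds.
MyQuiet->import('sayHello');
print "after explicit import: ", (defined &main::sayHello ? "imported" : "not imported"), "\n";

# Asking for a symbol that is in neither array makes Exporter die
# (it also prints a complaint to STDERR first).
my $ok = eval { MyQuiet->import('secret'); 1 };
print "request for secret:    ", ($ok ? "allowed" : "refused"), "\n";
```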

Lines 13 through 27 should be self-explanatory. We set up two scalar variables, $firstName and $lastName, as well as a hash table and an array. Note that we precede each variable with the our declaration, which makes it a package variable (a global that lives in the MyPackage namespace) rather than a lexical one; only package variables can be exported. Since we're using the strict pragma, we need these our declarations; otherwise we'd get some compilation errors.
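A short sketch of why our matters here, using a hypothetical MyScope package: a variable declared with our can be reached from outside through its package name, while one declared with my cannot.

```perl
use strict;
use warnings;

{
    package MyScope;                       # hypothetical package for the demo
    our $shared  = "package variable";     # stored in the symbol table as $MyScope::shared
    my  $private = "lexical variable";     # visible only inside this block
}

print "shared:  ", $MyScope::shared, "\n";
{
    no warnings 'once';   # the next name is deliberately mentioned only once
    print "private: ", (defined $MyScope::private ? "visible" : "not visible"), "\n";
}
```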

Line 29 is very important and can easily be forgotten. When a Perl module is loaded via a use or require statement, Perl expects the last statement in the file to produce a true value when executed. This particular line ensures that this is always the case.
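To watch this rule in action, the sketch below writes two throwaway modules to a temporary directory and tries to load both. The module names TrueEnding and FalseEnding are invented for this example; the only difference between the files is the final statement.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write two throwaway modules (hypothetical names) into a temporary directory.
my $dir = tempdir(CLEANUP => 1);
my %last_line = (TrueEnding => "1;", FalseEnding => "0;");
for my $name (sort keys %last_line) {
    open my $fh, '>', "$dir/$name.pm" or die "Cannot write $name.pm: $!";
    print {$fh} "package $name;\n$last_line{$name}\n";
    close $fh or die "Cannot close $name.pm: $!";
}
unshift @INC, $dir;

# require dies when the file's final statement evaluates to false.
my $true_ok  = eval { require TrueEnding;  1 };
my $false_ok = eval { require FalseEnding; 1 };

print "TrueEnding loaded:  ", ($true_ok  ? "yes" : "no"), "\n";
print "FalseEnding loaded: ", ($false_ok ? "yes" : "no"), "\n";
```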

Now that we've taken a look at the module, let's take a look at a script that uses it:

#!/usr/bin/perl
use strict;
use warnings;
use MyPackage qw(:all);

sayHello();
whoAreYou();

print "$lastName\n"; # WRONG!
print $MyPackage::lastName . "\n"; # RIGHT!

Most of this should be pretty clear. Note, however, how we import the module on line 4. We do the typical use MyPackage statement, but we also include the symbols we want to import. Since we didn't export anything by default, the user has to explicitly ask for the desired symbols. All we exported was a tag name, so we specify it here. Note the preceding colon! When you are importing a tag symbol, it must be preceded by a single colon. This too caused me a great deal of frustration, and it's a subtlety that's easily missed.

One other interesting note: on line 9, we try to print the $lastName variable. Since we never exported that particular variable in our module, referencing it by name only will result in an error. The correct way to access the variable, even though it wasn't exported, is shown on line 10. You must fully qualify non-exported symbols!
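The same point can be sketched in a single runnable file. MyNames is a hypothetical stand-in for MyPackage, defined inline, with an explicit import call doing the job of the use statement.

```perl
use strict;
use warnings;

BEGIN {
    package MyNames;                      # hypothetical stand-in for MyPackage
    require Exporter;
    our @ISA       = ("Exporter");
    our @EXPORT_OK = qw($firstName);      # $lastName is deliberately left out
    our $firstName = "Jonah";
    our $lastName  = "Bishop";
}

# Stands in for: use MyNames qw($firstName);
BEGIN { MyNames->import('$firstName') }

print "Imported:        $firstName\n";          # short name works because it was imported
print "Fully qualified: $MyNames::lastName\n";  # non-exported symbols need the package prefix
```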

Hopefully this quick little guide has made things a little clearer for you. If for no other reason, it will help me remember these subtleties of Perl programming. :-)

7 Comments

kip

2:39 PM on Aug 20, 2007
I have a bunch of small Perl scripts (mainly for automating frequent tasks involving our source control system), that use a library of common functions in a Perl module. The way I do it may not be the best practice (in fact, it's probably not) but it works for my purposes. For my kri_common.pm module I have:
return 1;  #I forget why I had to do this

#
# Converts backslash to slash.  Also removes trailing slash if present.
sub convert_backslash {
  my $path = shift;
  $path =~ s{\\}{/}g;
  $path =~ s{/$}{};
  return $path;
}

#
# Removes the output of pwd from the front of a path.
sub remove_pwd {
  my $path = shift;
  $path = &convert_backslash($path);
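  chomp(my $pwd = `pwd`);     # strip the trailing newline from the command output
  $path =~ s{^\Q$pwd\E/}{};   # \Q...\E keeps path characters from acting as regex syntax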
  return $path;
}
...
Then in my .pl file to use the module, I do this:
#get common functions
push @INC, split(/;/, $ENV{"PATH"});
require kri_common;

$file = &remove_pwd(&convert_backslash($ARGV[0])); #change \ to /
...
I hadn't realized just how much Perl I've forgotten in the last three years. I mainly use it for common tasks, or when I need to do some kind of search that is a little too advanced for grep. This comic seems relevant here. :)

Jonah

4:08 PM on Aug 20, 2007
Shouldn't the "return 1;" appear at the bottom of your module? As your comment suggests, I don't know why you have it at the top. The Perl compiler expects the last expression in the module to return true, so everything I've read indicates that you should do so at the end of the module's file (or block, if you have multiple modules in a single file). I like that comic. You can get that on a t-shirt, and I've seen a guy wearing it around here at IBM. Pretty sweet. I also like this shirt.

kip

4:49 PM on Aug 20, 2007
That's right, it has to return something. I guess I put it at the top and that worked, because there isn't actually any code in the module (subroutine declarations must not count as expressions). I'm not sure if that would work with strict turned on.

Hugh

2:21 AM on Sep 9, 2007
Hi, Enjoyed your article, thanks for writing it. I tried it out and found out the hard way that I needed to use the lib pragma to help Perl locate your example module under the subdirectory I placed it in. use lib "C:/perl/site/lib/MyPackage/"; Is there a better way to do this? The modules I download from ActiveState and other locations aren't using the lib pragma or modifying @INC on the fly. I'm using WinXP SP2 and the ActiveState version of Perl 5.8.8. http://blog.borngeek.com/2007/08/18/a-perl-module-primer/

Jonah

2:52 PM on Sep 10, 2007
Hello Hugh. I'm glad that you liked the article. When you use or require a Perl module, Perl will look through all the directories in the @INC array. One of those directories just happens to be the "." (current) directory. So, if you are willing to put the module in the same folder as your script, you won't need to make use of the lib pragma. If you place your module in another directory, you'll need to use the lib pragma to tell Perl where to find it. As inefficient as that may seem, the lib pragma is how you pass other possible directories to the @INC array. Hope this information helps!
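One common variation on the lib pragma, sketched here on the assumption that your modules sit in a lib folder next to the script (a hypothetical layout), is to build the path with the core FindBin module so that it does not depend on where the script is run from. It is also worth knowing that Perl 5.26 and later no longer include "." in @INC by default, so on newer versions a module that sits beside the script needs something like this too.

```perl
use strict;
use warnings;
use FindBin qw($Bin);     # $Bin holds the directory containing the running script
use lib "$Bin/lib";       # assumed layout: modules live in a lib/ folder beside the script

print "Module search path now includes: $Bin/lib\n";
```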

Hugh

12:54 AM on Sep 12, 2007
Jonah, Thanks for the info it was helpful. I can live with the lib pragma but thought I would ask the question in the off chance there was yet another way to do it - TIMTOWTDI. I never cease to be amazed by Perl and the community that supports it. Keep on writing articles like this, they're very much appreciated by newer Perl enthusiasts like me!

Jonah

3:09 PM on Sep 12, 2007
I noted something in your first comment that I should have picked up on at the time. You were trying to install your package to "C:\Perl\site\lib\MyPackage\". The "Perl\site\lib" folder exists in the @INC array, so you shouldn't have to use the lib pragma to include it! Since you put your package in a subfolder, you'll need to change your use directive to the following:
use MyPackage::MyPackage;
The use directive treats each name before a double colon (::) as a folder name (the package line inside the file should then read package MyPackage::MyPackage; as well, so that the import still works). I'm guessing that's the bug that's lurking in your scripts. I'll do my best to write a few more Perl articles. It's such a great language!
