Perl Variables

Perl is different from popular languages like C, C++, and Java. In C, a variable is the name of a memory location. You can store a piece of data in it, such as an integer, a floating-point number, or a character. Before using a variable, you have to declare it along with its type (whether it is an integer, a character, etc.). In Perl you don't have to deal with any of that. You just use a variable.

A Perl variable is different from a C variable. In Perl, a variable can be a scalar, an array, a hash, a subroutine, or a typeglob. Perplexed? Don't be. A scalar performs all the functions of a C variable and a lot more. In Perl, an array is also a variable. You will see how powerful your code becomes when you can treat an array the way you treat any other variable. A hash used to be called an associative array. It is a different kind of array: an unordered collection indexed by strings. A subroutine is much like a C function, and a lot more. A typeglob is another interesting topic. Each of these data types is distinguished by the first symbol of the variable name. Everything starting with $ is a scalar, everything starting with @ is an array, and so on. See the table below.

Type         Example      Is a name for
Scalar       $cents       An individual value (number or string)
Array        @large       A list of values, keyed by number
Hash         %interest    A group of values, keyed by string
Subroutine   &how         A callable chunk of Perl code
Typeglob     *struck      Everything named struck

Perl variables also have types, but the type is shown by the leading symbol rather than by a declaration. When you use a variable for the first time, Perl creates it automatically. A scalar that has not been assigned holds the special value undef, which behaves as 0 in numeric context and as the empty string in string context. An array or hash that has not been assigned is empty.
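The default values described above can be checked with a short sketch. The variable names are illustrative, and the use of use strict, use warnings, and my is a choice made for this example rather than something the text requires.

```perl
use strict;
use warnings;

my $count;     # declared but never assigned: holds undef
my @items;     # an empty array
my %lookup;    # an empty hash

my ($as_number, $as_string);
{
    no warnings 'uninitialized';        # silence the warning for this demonstration
    $as_number = $count + 0;            # undef acts as 0 in numeric context
    $as_string = "[" . $count . "]";    # undef acts as "" in string context
}

my $item_count = scalar(@items);          # number of elements in the array
my $key_count  = scalar(keys %lookup);    # number of keys in the hash

print "number: $as_number, string: $as_string\n";
print "items: $item_count, keys: $key_count\n";
```

Testing with defined($count) is the usual way to tell an unassigned scalar apart from one that really holds 0 or an empty string.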

Perl Scalars

Scalar Variables

A scalar variable holds a single scalar value. The value represents either a number, a string, or a reference to something. Scalar variable names begin with a dollar sign followed by a letter or an underscore, and then any number of letters, digits, or underscores. Scalar variable names are case-sensitive. Perl has three contexts in which it will interpret a scalar variable: string context, numeric context, and miscellaneous context.

Scalar Data Types:

The scalar data type is the most basic form of data container Perl has. Perl treats strings and numbers in a similar manner. You don't need to declare a scalar, just create it (use it).

$string = "YourString";
$number = 269;
$decimal = 49.42;

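Perl figures out by itself whether a value is an integer, a float, or a string. If you want to include a double quote ( " ) inside a double-quoted string, you need to escape it as ( \" ). A literal $ or @ must likewise be written as ( \$ ) or ( \@ ) so that Perl does not try to interpolate a variable. A semicolon needs no escaping inside a string. You can also use q() in place of single quotes and qq() in place of double quotes. Please refer to Perl Quotes and Escape Sequences for more information.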

Type                            Example
integer                         $answer = 968;
real                            $pi = 3.14159265;
scientific                      $avogadro = 6.02e23;
string                          $car = "BMW";
string with interpolation       $sign = "I love my $car";
string without interpolation    $cost = 'It costs $80000';
another variable                $one = $two;
expression                      $force = $mass * $acceleration;
string output from a command    $cwd = `pwd`;
numeric status of a command     $exit = system("vi $x");
an object                       $car = new Car "BMW";

There is no way to declare a scalar to be of type "number" or "string". Perl converts between the various subtypes as needed, so you can treat a number as a string or a string as a number, and Perl will do the Right Thing. References (pointers), however, are not castable.

Numeric Literals
12345 integer
12345.67 floating point
6.02E23 scientific notation
0xffff hexadecimal
0377 octal

Scalar Operators:

Before performing an operation, a Perl operator determines the type of its operands. If all of the operands are scalars, the result is a scalar. We will explore what happens when the operands are not all scalars in the following chapters. This section only briefly touches on operators; please refer to Perl Operators for more information. Perl supports the common arithmetic operators +, -, *, /, and %.

$a = 4 + 7;      # $a = 11
$a = 4.9 + 3.9;  # $a = 8.8
$a = 10 / 3;     # $a = 3.3333333....
$a = 5 % 3;      # $a = 2, remainder

Perl has different comparison operators for strings and numbers.

Comparison                  Numbers   Example                     Strings   Example
Equal                       ==        if ($one == 5) { ... }      eq        if ($string1 eq $string2) { ... }
Not Equal                   !=        if ($one != 5) { ... }      ne        if ($string1 ne $string2) { ... }
Less Than                   <         if ($one < 5) { ... }       lt        if ($string1 lt $string2) { ... }
Greater Than                >         if ($one > 5) { ... }       gt        if ($string1 gt $string2) { ... }
Less Than or Equal to       <=        if ($one <= 5) { ... }      le        if ($string1 le $string2) { ... }
Greater Than or Equal to    >=        if ($one >= 5) { ... }      ge        if ($string1 ge $string2) { ... }

There are two really handy operators for strings only: (.) and (x). The first concatenates strings. The other repeats them:

"my " . "life.";   # my life. The dot operator concatenates strings
"perl" x 3;        # this is the same as perlperlperl

From time to time, you will want to use a string as a number or a number as a string. Perl converts automatically according to the operator you use. Suppose $string holds "123" and $string2 holds "234". Then $string = $string + $string2; produces 357 (as a number), not 123234, because + is a numeric operator. To get 123234, use the concatenation operator instead: $string . $string2.
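A minimal sketch of these conversions follows. The variable names other than $string and $string2 are made up for the illustration, and adding 0 or concatenating an empty string are just two common ways to force a context.

```perl
use strict;
use warnings;

my $string  = "123";
my $string2 = "234";

my $sum    = $string + $string2;    # + is numeric, so both strings become numbers
my $joined = $string . $string2;    # . is a string operator, so the result is a string
my $number = "42" + 0;              # one way to force a string into a number
my $text   = 5 . "";                # one way to force a number into a string

print "$sum $joined $number $text\n";    # prints: 357 123234 42 5
```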

To be fair, Perl also provides some operators which deal with numbers only. Namely autoincrement and autodecrement.

$a = 4;
$b = 9;
$r = ++$a;  # $r = 5, $a = 5. Increments before assignment
$r = $b++;  # $r = 9, $b = 10. Increments after assignment
$b--;       # postdecrement
--$b;       # predecrement


chop

The chop function chops off the last character of a string scalar and returns the character it removed. It seems like a useless function, but it does come in handy at times.

$r = "perls";
$last = chop($r);   # $last = s, $r = perl

chomp

chomp deletes the last character only if it is a newline (\n), which makes it the safe way to strip the line ending from a line of input.
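The difference between the two functions can be seen in a short sketch. It assumes the default input record separator (a newline); the variable names are illustrative.

```perl
use strict;
use warnings;

my $line = "perl\n";
my $removed_first  = chomp($line);    # removes the newline and returns 1
my $removed_second = chomp($line);    # nothing left to remove, returns 0

my $word = "perls";
my $last = chop($word);               # removes and returns the final character

print "line=$line first=$removed_first second=$removed_second\n";
print "word=$word last=$last\n";
```

Calling chomp twice is harmless, whereas calling chop twice removes two characters.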

Perl Operators

Perl has a whole array of very useful operators. They can generally be classified as follows:

Comparison Operators:

You have to use different operators for numeric and string values to accomplish the same task. A string operator compares its operands as strings even if they look like numbers, and a numeric operator compares them as numbers, so using the wrong family gives misleading results.

Comparison Operators

String   Numeric   Purpose                           Syntax
eq       ==        equal to                          true if $a == $b; true if $s1 eq $s2
ne       !=        not equal to                      true if $a != $b; true if $s1 ne $s2
lt       <         less than                         true if $a < $b; true if $s1 lt $s2
gt       >         greater than                      true if $a > $b; true if $s1 gt $s2
le       <=        less than or equal to             true if $a <= $b; true if $s1 le $s2
ge       >=        greater than or equal to          true if $a >= $b; true if $s1 ge $s2
cmp      <=>       comparison with a signed result   0 if equal; 1 if $a greater; -1 if $b greater
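The sketch below shows why the choice of operator family matters. The values are chosen for illustration only.

```perl
use strict;
use warnings;

# The same two values compared numerically and as strings.
my $numeric_order = (10 <=> 9);        # 1: as numbers, 10 is greater than 9
my $string_order  = ("10" cmp "9");    # -1: as strings, "1" sorts before "9"

# "1.0" and "1" are equal as numbers but different as strings.
my $num_equal = ("1.0" == "1") ? "equal" : "different";
my $str_equal = ("1.0" eq "1") ? "equal" : "different";

print "numeric: $numeric_order, string: $string_order\n";
print "==: $num_equal, eq: $str_equal\n";
```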

String Operators:

Perl has a rich collection of string operators:

.    Concatenate
x    Repeat

Numeric Operators:

**   Raise the left operand to the power of the right operand

Assignment Operators and Equivalence Operators:

These operators are already defined in the tables above in the context of numeric and string. For example = is an assignment operator and eq is an equivalence operator.

Arithmetic Operators
$a + $b       addition
$a * $b       multiplication
$a % $b       modulus (remainder)
$a ** $b      exponentiation

String Operators
$a . $b       dot operator (concatenation): 123 . 456 = 123456
$a x $b       x operator (repetition): 123 x 3 = 123123123

Autoincrement and Autodecrement
++$a, $a++
--$a, $a--

Logical Operators
$a && $b      also $a and $b
$a || $b      also $a or $b
! $a          also not $a

Named Operators

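Operator   Example               Result
int        int(5.6234)           5
length     length("nose")        4
lc         lc("LOWER")           lower
uc         uc("upper")           UPPER
cos        cos(3.14159265/6)     0.8660 (the argument is in radians; this is 30 degrees)
rand       rand(5)               a random number from 0 up to, but not including, 5

rand returns a random number from 0 to less than its argument. If the argument is omitted, a number from 0 to less than 1 is returned.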



Operator precedence and associativity (highest precedence first)
Terms and list operators (leftward) Left
-> Left
++   -- Nonassociative
** Right
! ~ \ and unary + and - Right
=~   !~ Left
* / % x Left
+ - .  Left
<<  >> Left
Named unary operators Nonassociative
<   >   <=   >=   lt   gt   le   ge Nonassociative
==  !=  <=>   eq   ne   cmp Nonassociative
& Left
|  ^ Left
&& Left
|| Left
.. Nonassociative
?: Right
=   +=   -=   *=   and so on Right
,   => Left
List operators (rightward) Nonassociative
not Right
and Left
or xor Left

Autoincrement and Autodecrement

print ++($foo = '99');     # prints '100'
print ++($foo = 'a0');     # prints 'a1'

print ++($foo = 'Az');     # prints 'Ba'
print ++($foo = 'zz');     # prints 'aaa'


Exponentiation binds more tightly than unary minus, so -2**4 is -(2**4), not (-2)**4.

Unary Operators

Unary ! performs logical negation, that is, "not".
Unary - performs arithmetic negation if the operand is numeric. If the operand is an identifier, a string consisting of a minus sign concatenated with the identifier is returned. Otherwise, if the string starts with a plus or minus, a string starting with the opposite sign is returned.

Unary ~ performs bitwise negation, that is 1's complement.

Unary + has no semantic effect whatsoever, even on strings. It is syntactically useful for separating a function name from a parenthesized expression which would otherwise be interpreted as the complete list of function arguments.

Unary \ creates a reference to whatever follows.

Binding Operators

Binary =~ binds a scalar expression to a pattern match, substitution, or translation. Without a binding operator, these operations search or modify the string $_ by default.

Binary !~ is just like =~ except the return value is negated in the logical sense. The following expressions are functionally equivalent:

$string !~ /pattern/
not $string =~ /pattern/
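A short sketch of the binding operators, with strings and patterns invented for the illustration:

```perl
use strict;
use warnings;

my $string = "Perl is practical";

my $has_perl = ($string =~ /Perl/) ? 1 : 0;    # pattern found in $string
my $no_java  = ($string !~ /Java/) ? 1 : 0;    # pattern absent from $string

# Bind a substitution to a copy so that the original is left untouched.
(my $copy = $string) =~ s/practical/powerful/;

# With no binding operator, the match is applied to $_.
$_ = "default target";
my $default_match = /target/ ? 1 : 0;

print "$has_perl $no_java $default_match: $copy\n";
```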

Multiplicative Operators

*, /, and % work as expected: * multiplies, / divides, and % computes the remainder. If you have floating-point operands, use fmod() from the POSIX module instead of %, because % converts its operands to integers before finding the remainder according to integer division.

Binary x is the repetition operator.

As a string replicator:

print '-' x 80;                           # print row of dashes
print "\t" x ($tab/8), ' ' x ($tab%8);    # tab over

As a list replicator:

@ones = (1) x 80;        # a list of 80 1's
@ones = (5) x @ones;     # set all elements to 5

To initialize array and hash slices:

@keys = qw(perls before swine);
@hash{@keys} = ("") x @keys;

which is equivalent to

$hash{perls}  = "";
$hash{before} = "";
$hash{swine}  = "";

Additive Operators

+ and - convert their arguments from strings to numeric values if necessary and return a numeric result. The "." operator provides string concatenation.

$almost = "Fred" . "Flintstone";    # returns FredFlintstone

Another method of concatenation is interpolation:

$fullname = "$firstname $lastname";

Shift Operators

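The bit-shift operators (<< and >>) return the value of the left operand shifted to the left (<<) or to the right (>>) by the number of bits given by the right operand.

1 << 4;     # returns 16
32 >> 4;    # returns 2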

Named Unary and File Test Operators

Some of the functions described in chapter 3 are really unary operators.

sleep 4 | 3;    # is equivalent to (sleep 4) | 3
print 4 | 3;    # is equivalent to print (4 | 3)

This is so because sleep is a named unary operator, which binds more tightly than |, while print is a list operator, which takes everything to its right as its arguments. When in doubt, use parentheses. Remember, if it looks like a function, then it is a function.

A file test operator is a unary operator that takes one argument, either a filename or a filehandle, and tests the associated file to see if something is  true about it.

File Test Operators

Operator   Meaning
-r         File is readable by effective uid/gid
-w         File is writable by effective uid/gid
-x         File is executable by effective uid/gid
-o         File is owned by effective uid
-R         File is readable by real uid/gid
-W         File is writable by real uid/gid
-X         File is executable by real uid/gid
-O         File is owned by real uid
-e         File exists
-z         File has zero size
-s         File has non-zero size (returns size)
-f         File is a plain file
-d         File is a directory
-l         File is a symbolic link
-p         File is a named pipe (FIFO)
-S         File is a socket
-b         File is a block special file
-c         File is a character special file
-t         Filehandle is opened to a tty
-u         File has setuid bit set
-g         File has setgid bit set
-k         File has sticky bit set
-T         File is a text file
-B         File is a binary file (opposite of -T)
-M         Age of file (at startup) in days since modification
-A         Age of file (at startup) in days since last access
-C         Age of file (at startup) in days since inode change
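A few of these tests can be tried on a scratch file. The sketch below assumes the core File::Temp module is available to create the file; the file contents and variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is removed automatically when the program exits.
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "hello";
close($fh) or die "Cannot close $filename: $!";

my $exists   = (-e $filename) ? 1 : 0;    # the file exists
my $is_plain = (-f $filename) ? 1 : 0;    # it is a plain file
my $is_dir   = (-d $filename) ? 1 : 0;    # it is not a directory
my $size     = -s $filename;              # non-zero size is returned in bytes

print "exists=$exists plain=$is_plain dir=$is_dir size=$size\n";
```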

Bitwise Operators


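Bitwise AND, OR, and XOR: &, |, and ^. If both operands are strings, the operation is performed on the individual characters of the strings. If either operand is a number, both operands are converted to integers first.

Operands               Expression             Result
string AND string      "123.45" & "234.56"    020.44 (remember, it is a bitwise AND of the characters)
string AND numeric     "123.45" & 234.56      106
numeric AND numeric    123.45 & 234.56        106
integer AND integer    123 & 234              106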
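The string and numeric behaviors can be compared directly. This is a minimal sketch using the classic behavior of & (no bitwise feature pragma enabled); the variable names are illustrative.

```perl
use strict;
use warnings;

# Two strings: each pair of characters is ANDed together.
my $string_and = "123.45" & "234.56";

# One numeric operand: both sides are treated as integers (123 and 234).
my $mixed_and = "123.45" & 234.56;

# Plain integers: 01111011 AND 11101010 gives 01101010.
my $int_and = 123 & 234;

print "$string_and $mixed_and $int_and\n";    # prints: 020.44 106 106
```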

C-style Logical (Short Circuit) Operators

And:  $a && $b     # $a if $a is false, $b otherwise
Or:   $a || $b     # $a if $a is true, $b otherwise

open(FILE, "filename") || die "Cannot open filename: $!\n";

Range Operator

The range operator .. performs two different tasks. In a list context, it returns a list of values counting (by ones) from the left value to the right value.

for (101 .. 200)  { print; }     # prints 101......200
@foo = @foo[0 .. $#foo];     # an expensive no-op
@foo = @foo[ -5 .. -1];     # slice last 5 items
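@alphabet = ('A' .. 'Z');     # the uppercase letters A through Z


In scalar context, .. returns a Boolean value. It acts as a flip-flop: it becomes true when its left operand is true and stays true until its right operand is true. If either operand is a constant, it is compared with the current input line number ($.).


if (101 .. 200) { print; }     # print 2nd hundred lines
next line if (1 .. /^$/);     # skip header lines
s/^/> / if (/^$/ .. eof());     # quote body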
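Both uses of the range operator can be sketched without reading a file. The markers BEGIN and END and the list of lines are invented for the example.

```perl
use strict;
use warnings;

# List context: .. counts from the left value to the right value.
my @digits = (1 .. 5);

# Scalar context: .. is a flip-flop that switches on at BEGIN and off after END.
my @lines = ("header", "BEGIN", "first", "second", "END", "footer");
my @inside;
for my $line (@lines) {
    if ($line eq "BEGIN" .. $line eq "END") {
        push @inside, $line;
    }
}

print "@digits\n";    # prints: 1 2 3 4 5
print "@inside\n";    # prints: BEGIN first second END
```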

Angle Operator

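The angle operator (<>), sometimes called the diamond operator, is primarily used for reading lines of input from a filehandle, as in <STDIN> or <FILE>. With no filehandle between the angle brackets, it reads from the files named on the command line, or from standard input if there are none.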

Perl Special Variables

Perl contains numerous variables that have a special meaning. Below is a list of many of them.


Variable   English Name   Description


$_   $ARG   The default input and pattern-searching space
$&   $MATCH   The string matched by the last successful pattern match
$`   $PREMATCH   The string preceding whatever was matched by the last successful pattern match
$'   $POSTMATCH   The string following whatever was matched by the last successful pattern match
$+   $LAST_PAREN_MATCH   The last bracket matched by the last search pattern
$*   $MULTILINE_MATCHING   If set to 1, Perl does multi-line matching within a string (the default is 0); deprecated in favor of the /m and /s pattern modifiers
$.   $INPUT_LINE_NUMBER   The current input line number of the last file handle read (an explicit close on a file handle resets the line number)
$/   $INPUT_RECORD_SEPARATOR   The input record separator (newline by default)
$|   $OUTPUT_AUTOFLUSH   If set to any nonzero value, forces a flush after every write or print on the currently selected output device (the default is 0)
$,   $OUTPUT_FIELD_SEPARATOR   The output field separator for the print function
$\   $OUTPUT_RECORD_SEPARATOR   The output record separator for the print function
$"   $LIST_SEPARATOR   The separator placed between the elements of an array interpolated into a double-quoted string (a space by default)
$;   $SUBSCRIPT_SEPARATOR   The subscript separator for multidimensional array emulation
$#   $OFMT   The output format for printed numbers (deprecated)
$%   $FORMAT_PAGE_NUMBER   The current page number of the currently selected output file handle
$=   $FORMAT_LINES_PER_PAGE   The current page length (printable lines) of the currently selected output file handle


Perl Arrays and Lists

Perl has a data structure that is strictly known as array of scalars. This structure is more commonly known as an array or a list. Perl's arrays can be used as a simple list, stack, or even the skeleton of a complex data structure. Anything beginning with an @ symbol is an array.

Arrays and lists:

Arrays are closely related to (but not the same as) lists. A Perl list is a sequence of  comma separated values usually in a set of parentheses. A Perl array is a container for a sequence of values (that is, a container for a list). Lists are commonly used to initialize arrays. Assigning a list to an array places each item in the list in a consecutive element of the array. Lists can also be used to extract values from arrays.

Using Arrays as an Indexed List:

The most common method of  using an array as an indexed list is to directly assign the array all of its values at creation. The following example sets the array variable @months to the months of the year. There are two items to mention regarding the example below: the placeholder JUNK and the keyword qw. Arrays start at index 0: junk is the placeholder so Jan could be 1.

@months = qw(JUNK Jan Feb March April May June July Aug Sept Oct Nov Dec);

@array = qw(a b c d e);

is equivalent to

@array = ("a", "b", "c", "d", "e");

The keyword qw is a shortened form used to extract individual words from a string. The above example can also be done in the following manner:

$months[0] = "JUNK";
$months[1] = "Jan";
...
$months[12] = "Dec";

@home = ("a", "b", "c");
($m, $n, $o) = @home;    # $m = "a", $n = "b", $o = "c"

$home[0] = "a";
$home[1] = "b";
$home[2] = "c";

Notice when you assign the array elements directly, you use the $ character, not the @ character.

List constructor operator:

The list constructor operator could save you the trouble of listing all the values if you are using numbers:

(1 .. 5)            # is equivalent to (1, 2, 3, 4, 5)
(2 .. 6, 59, 98)    # is equivalent to (2, 3, 4, 5, 6, 59, 98)
($x .. $y)          # if $x and $y are two numbers, .. is the range in between


You can assign values to arrays using all the methods discussed above. What we have been doing above is assigning scalar values to an array. Perl also allows you to assign an array to another array.

@onearray = @anotherarray;

You can also mix things up:

@hexcharacters = qw(a b c d e f);
@palindrome = (1 .. 9, @hexcharacters, reverse(@hexcharacters), 9, 8, 7, 6, 5, 4, 3, 2, 1);
# ( 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f, f, e, d, c, b, a, 9, 8, 7, 6, 5, 4, 3, 2, 1 )

Don't be intimidated by the line noise. @hexcharacters is assigned the values a, b, c, d, e, and f. In @palindrome, first we are assigning the values 1, 2, 3, 4, 5, 6, 7, 8, and 9 using the list constructor operator. Then we are assigning the array @hexcharacters. The function reverse() does what you think it does, it reverses the array @hexcharacters. You can take on from there. The point is that you have a lot of ways to assign an array and you can use them simultaneously if you wish or need to do so.

Note that @hexcharacters is not the tenth element of @palindrome; the tenth element is a. In fact, @hexcharacters is not an element of @palindrome at all. This is because a list cannot contain another list or array as an element: the array's elements are flattened into the enclosing list. If you still want an array to be an element of another array, use an array reference (which we will talk about in the references lesson).

Going back to the paragraph before the previous one, Perl offers you a great degree of flexibility and a lot of ways to do the same thing. Well, most of the new ways of doing things come from your ingenuity, but you can't do much if the language won't allow it. To get a feel of what I mean see below:

($one, $two, $three) = (23, 43, 37);           # $one = 23, $two = 43, and $three = 37
@sqr[1..4] = (1, 4, 9, 16);                    # range of indices
@sqrt[1, 49, 9, 16, 4] = (1, 7, 3, 4, 2);      # non-sequential indices
@inverse[@sqr] = (1, 0.25, 0.1111, 0.0625);    # indices stored in another array
($one, $two) = ($two, $one);                   # swaps $one and $two
($d, @array) = ($x, $y, $z);                   # $d = $x and @array = ($y, $z)
($five, @numbers) = @numbers;                  # moves the first element of @numbers to $five

We have seen that scalars can be assigned to arrays. Arrays can also be assigned to arrays. We know from the previous lesson that scalars can be assigned to scalars. Can we assign an array to a scalar? The answer is yes, but don't go away yet. You can assign it, but the scalar does not get the array; it gets the size of the array.

@array = (23, 34, 45);
$scalar = @array;      # $scalar = 3, the length of the array
($scalar) = @array;    # $scalar = 23, the first element of the array
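The effect of the parentheses can be checked with a small sketch. The names $count, $first, $head, and @rest are illustrative.

```perl
use strict;
use warnings;

my @array = (23, 34, 45);

my $count   = @array;    # scalar context: the number of elements
my ($first) = @array;    # list context: the first element

# In a list assignment, an array on the left soaks up all remaining values.
my ($head, @rest) = (1, 2, 3, 4, 5);

print "count=$count first=$first head=$head rest=@rest\n";
```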

You can assign values to more than one array at the same time:

@array1 = @array2 = @array3 = (21, 324, 324);

Accessing array elements selectively:

Perl arrays are indexed from 0 to n-1, where n is the number of elements. If @array is an array of 23 elements, then $array[0] is the first element and $array[22] is the last element. To copy an element's value to a scalar:

$scalar = $array[9];

Array Arithmetic:

$array[5]++; # increment sixth element of @array
$n = 5;
$array[$n];        # accesses the sixth element of the array
$array[++$n];      # accesses the seventh element of the array
$array[--$n];      # would decrement $n and then use as an index
$array[$n] += 5;   # adds 5 to the nth element of the array
($array[0], $array[1]) = ($array[1], $array[0]); # swaps two elements of the array. You can also do this for the entire array

You can also use negative values to access perl arrays. They access the array in reverse:

@array = (23, 34, 4, 3421, 234);
$array[-2];      # 3421
$array[-3];      # 4

Slicing: the act of accessing a list of elements from an array. Here is how you do it:

@array[3, 4];                      # is equivalent to ($array[3], $array[4])
@array[3, 4] = @array[4, 3];       # slice and swap
@array[3, 10, 15] = (4, 98, 120);  # slice and assign values
@array[4, 6, 9] = @array[2, 2, 2]; # assign the value of $array[2] to $array[4], $array[6], and $array[9]

End of Array:

Where does the array end? Every programmer using an array needs to know the answer to this question, regardless of the language. Java does not allow a program to access an array element that is out of bounds (that is, outside the range of the array; for example, the 100th element of a 10-element array). C++ lets you access an element out of bounds, but the behavior is undefined and you will typically get a garbage value. Perl lets you access an element out of bounds, and that access returns the value undef, meaning undefined. Which method do you think is the best? Java or Perl?


A built-in C++ array cannot be extended dynamically. For example, suppose you have an array of 10 elements and the program needs to add an element while running. This is not allowed in C++. It is not allowed for Java arrays either, but Java provides a Vector class that can be resized dynamically. Dynamic resizing is allowed in Perl, and it does not have to be in order: if there is a 10-element array, you can add the 19th element without having to add the eleventh, twelfth, ... eighteenth elements. All elements in between will have the value undef.

@array = (1 .. 4);
$array[6] = "perl";    # (1, 2, 3, 4, undef, undef, perl)

Occasionally you will have to access the last element of the array. $#arrayname gives the index of the last element, so you can use it as a subscript. You can also use -1 as an index.

@array = (12, 43, 54, 213);
print $array[-1];           # 213
print $#array;              # 3, the index of the last element
print $array[$#array];      # 213
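Out-of-bounds reads and dynamic growth can be tried together in a short sketch. The variable names are illustrative.

```perl
use strict;
use warnings;

my @array = (1 .. 4);

my $missing         = $array[10];    # reading past the end yields undef
my $size_after_read = @array;        # reading does not extend the array

$array[6] = "perl";                  # writing past the end extends the array
my $size_after_write = @array;       # indices 4 and 5 now exist and hold undef
my $last_index       = $#array;
my $gap = defined($array[5]) ? "defined" : "undef";

print "$size_after_read $size_after_write $last_index $gap\n";    # prints: 4 7 6 undef
```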

List Value and Arrays

@stuff  = ("one", "two", "three");
$stuff = @stuff;                     # $stuff = 3
$stuff = ("one", "two", "three");    # $stuff = "three": in scalar context the comma operator returns its last value

LISTs do automatic interpolation of sublists. That is, when a LIST is evaluated, each element of the list is evaluated in a list context, and the resulting list value is interpolated into LIST just as if each individual element were a member of  LIST. Thus arrays lose their identity in a LIST.

(@foo, @bar, &Somesub)

contains all the elements of @foo, followed by all elements of @bar, followed by all the elements returned by the subroutine named Somesub when it's called in a list context. You can use a reference to an array if you do not want it to interpolate. Null list is represented by ().

@days + 0;          # implicitly forces @days into a scalar context
scalar(@days);      # explicitly forces @days into a scalar context
@whatever = ();     # assigning a null list
$#whatever = -1;    # truncates the array to zero elements


Using Arrays as Stacks (push and pop):
When I was learning C++, I had to go through a lot of pain to learn how to create my own stack. I didn't have to go through the same pain in Java because there is a class by the name of stack defined in the language. Learning to use it took a little time but was a blessing when compared to C++. In perl, you can convert an array into a stack in one line and then back in another line! No wonder a lazy programmer like myself got hooked to Perl. To utilize an array as a stack, use the push and pop functions:

@LIFO = (1, 2, 3);
push(@myList, @LIFO);    # @myList = (1, 2, 3)
$one = 34;
push(@myList, $one);     # @myList = (1, 2, 3, 34)
push(@myList, 99, 100);  # @myList = (1, 2, 3, 34, 99, 100)
$index = pop(@myList);   # $index = 100

The push function takes an array and a list of elements to append to it. It then appends them and returns the new length of the array. The pop function removes the last element of an array and returns that element. If the array is empty, it returns undef.
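The return values described above can be seen in a minimal sketch. The array name @stack and the values are illustrative.

```perl
use strict;
use warnings;

my @stack;
push(@stack, 1, 2, 3);                 # @stack is now (1, 2, 3)
my $new_length = push(@stack, 34);     # push returns the new length
my $top        = pop(@stack);          # pop returns the element it removed

my @empty;
my $nothing = pop(@empty);             # popping an empty array yields undef

print "length=$new_length top=$top stack=@stack\n";    # prints: length=4 top=34 stack=1 2 3
```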


shift and unshift:

The push and pop functions deal with the highest subscripts. This is sometimes called the right side of an array. Now that we have discovered that an array can also be treated like a stack, it seems a bit awkward to call it an array. This is why the word "list" is often used to refer to an array. The shift and unshift functions deal with the lowest subscripts. This is sometimes called the left side of the array:

unshift(@array, $a);        # like @array = ($a, @array);
unshift(@array, $a, $b);    # like @array = ($a, $b, @array);
$x = shift(@array);         # like ($x, @array) = @array;
@array = (5, 6, 7);
unshift(@array, 2, 3, 4);   # @array is now (2, 3, 4, 5, 6, 7)
$x = shift(@array);         # $x gets 2, @array is now (3, 4, 5, 6, 7)

The unshift and shift functions work just like push and pop respectively, except that they add elements to the start of an array instead of the end.


The push, pop, shift, and unshift functions are special cases of a more general function called splice, which changes the elements of an array. The splice function takes four arguments:

the array to be modified
the index at which it's to be modified
the number of  elements to be removed (starting at the index specified in the previous argument)
a list of extra elements to be inserted at the index (after the previous elements are removed)

The function returns a list of the elements removed from the array being modified.


The following is from Programming Perl:


This function removes the elements designated by OFFSET and LENGTH from an array, and replaces them with the elements of LIST, if any. The function returns the elements removed from the array. The array grows or shrinks as necessary. If LENGTH is omitted, the function removes everything from OFFSET onward. The following equivalences hold (assuming $[ is 0):

Direct Method Splice Equivalent
push(@a, $x, $y) splice(@a, $#a+1, 0, $x, $y)
pop(@a) splice(@a, -1)
shift(@a) splice(@a, 0, 1)
unshift(@a, $x, $y) splice(@a, 0, 0, $x, $y)
$a[$x] = $y splice(@a, $x, 1, $y)
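A few of these equivalences can be exercised in a short sketch. The array contents are invented for the example.

```perl
use strict;
use warnings;

my @a = (10, 20, 30, 40, 50);

# Remove two elements starting at index 1 and insert three in their place.
my @removed = splice(@a, 1, 2, "x", "y", "z");    # @a is now (10, x, y, z, 40, 50)

splice(@a, $#a + 1, 0, 60);                       # same effect as push(@a, 60)
my $first = splice(@a, 0, 1);                     # same effect as shift(@a)

print "removed=@removed first=$first a=@a\n";     # prints: removed=20 30 first=10 a=x y z 40 50 60
```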


The splice function is also handy for carving up the argument list passed to a subroutine. For example, assuming list lengths are passed before lists:

sub list_eq {            # compare two list values
   my @a = splice(@_, 0, shift);
   my @b = splice(@_, 0, shift);
   return 0 unless @a == @b;            # same len?
   while (@a) {
      return 0 if pop(@a) ne pop(@b);
   }
   return 1;
}
if (list_eq($len, @foo[1..$len], scalar(@bar), @bar)) { ... }

It would probably be cleaner just to use references for this, however.


The reverse function reverses the order of the elements of its arguments, returning the resulting list. The original list is always unaltered, reverse works on a copy.

@array1 = (234, 89, 36, 98);
@array2 = reverse(@array1);    # @array2 = (98, 36, 89, 234)

The sort function does what you think it does, it sorts. Note the way numbers are sorted.

sort("one", "two", "three");  # one three two
sort(1, 2, 12, 24);           # 1, 12, 2, 24
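To sort numbers by value rather than as strings, sort accepts a comparison block. The sketch below uses the numeric comparison operator <=> with the special variables $a and $b that sort provides; the array names are illustrative.

```perl
use strict;
use warnings;

my @numbers = (1, 2, 12, 24);

my @as_strings = sort @numbers;                  # default: string comparison
my @ascending  = sort { $a <=> $b } @numbers;    # numeric, smallest first
my @descending = sort { $b <=> $a } @numbers;    # numeric, largest first

print "@as_strings\n";    # prints: 1 12 2 24
print "@ascending\n";     # prints: 1 2 12 24
print "@descending\n";    # prints: 24 12 2 1
```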

The chomp function works on an array variable as well as a scalar variable. Given an array, it removes the trailing newline, if any, from every element.

@stuff = ("one\n", "two\n", "three");
chomp(@stuff);    # @stuff = ("one", "two", "three")

The following table summarizes whole-array and slice notation:

Expression             Same as
@days                  ($days[0], $days[1], ... $days[n])
@days[3, 4, 5]         ($days[3], $days[4], $days[5])
@days[3..5]            @days[3, 4, 5]
@days{'Jan', 'Feb'}    ($days{'Jan'}, $days{'Feb'})

Perl Hashes

Associative Arrays (Hashes): Hashes are also called associative arrays. I will be using the two terms interchangeably. Hashes are indexed by string values instead of an integer index value. Associative arrays, unlike scalar arrays, do not have a sense of order. There is no first addressable element. This is because the indexes of the hashes are strings and information is not stored in a predictable order.

A hash is best thought of as a two-column table, where the left column stores keys and the right column stores their associated scalar values. It's called a hash because a hashing algorithm is used to map each key string to an internal index into the table. To retrieve a value from a hash, you must know the key. If you know a key of the hash %hash and you want to print out the value, you would use the following syntax:

print $hash{'mike'};

This example prints out the value of a key named mike in the hash named %hash. The interior of the curly braces (or the left-hand side of a => operator) of the hash will automatically interpret an identifier as a quoted string. So we can also write:

print $hash{mike};    # notice that there are no quotes.

This is only true, however, if the contents are an unbroken sequence of alphanumerics or underscores. That is, we can't write:

$sound{mike willis}= "son of willis"; # wrong

if we mean:

$sound{"mike willis"} = "son of willis";

Populating a Hash:

Much like the normal array, an associative array can have all its values assigned at once. The following assign records to the hash %cities:

%cities = ("Toronto" => "East", "Calgary" => "Central", "Vancouver" => "West");

is equivalent to

%cities = ("Toronto", "East" => "Calgary", "Central" => "Vancouver", "West");

is equivalent to

%cities = ("Toronto", "East", "Calgary", "Central", "Vancouver", "West");

is equivalent to

$cities{'Toronto'} = "East";
$cities{'Vancouver'} = "West";
$cities{'Calgary'} = "Central";

Functions: Data in hashes is stored in key/value pairs. In the example above, Toronto is the key and East is the value. Because a hash is unordered, its elements cannot be referenced by position as an array's can. The contents of a hash can be listed by using any of the functions keys, values, and each.

Keys: The keys function returns a list of the keys of the given associative array when used in a list context, and the number of keys when used in scalar context.

my %cities = ("Toronto" => "East", "Calgary" => "Central", "Vancouver" => "West");
for $key (keys %cities) {
    print "Key: $key Value: $cities{$key} \n";
}

In scalar context, the keys function gives the number of elements (key-value pairs) in the hash. For example:

if (keys(%cities) == 3) {
    # do something
}

Values: The loop above prints both the keys and their values. If you want only the values of the hash and not the keys, use the values function.

my %cities = ("Toronto" => "East", "Calgary" => "Central", "Vancouver" => "West");
for $value (values %cities) {
    print "Value: $value \n";
}

or

@array = values(%cities);

Each: The each function iterates over the entire hash, returning one key-value pair per call.

my %cities = ("Toronto" => "East", "Calgary" => "Central", "Vancouver" => "West");
while (($key, $value) = each(%cities)) {
    print "Key: $key Value: $value \n";
}

Delete: How do you delete an element from a hash? It cannot be done by assigning a null. It can't be done by chopping it off. There is no order, so there is no connection you can chop off. To tend to this need, delete function was created.

delete $cities{"Toronto"};
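The sketch below lists a hash in a predictable order and then removes a pair. Sorting the keys and checking with exists are additions for the illustration; the text above does not require them.

```perl
use strict;
use warnings;

my %cities = (Toronto => "East", Calgary => "Central", Vancouver => "West");

# Hash order is unpredictable, so sort the keys when the order matters.
my @report;
for my $key (sort keys %cities) {
    push @report, "$key=$cities{$key}";
}

my $removed     = delete $cities{Toronto};             # delete returns the removed value
my $still_there = exists $cities{Toronto} ? 1 : 0;     # the key is gone
my $remaining   = keys %cities;                        # scalar context: number of keys

print "@report\n";    # prints: Calgary=Central Toronto=East Vancouver=West
print "removed=$removed exists=$still_there remaining=$remaining\n";
```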

This will delete the key-value pair.

Hash Slices: Like an array, a hash can also be sliced. Observe:

$cities{"Toronto"} = "East";
$cities{"Calgary"} = "Central";
$cities{"Vancouver"} = "West";

This can be simplified to:

($cities{"Toronto"}, $cities{"Calgary"}, $cities{"Vancouver"}) = ("East", "Central", "West");

or

@cities{"Toronto", "Calgary", "Vancouver"} = ("East", "Central", "West");

A slice can also be used to read several values at once:

@locations = qw(Toronto Calgary Vancouver);
print "Places are: @cities{@locations}\n";

Hash slices can also be used to merge a smaller hash into a larger one. In this example, the smaller hash takes precedence in the sense that if there are duplicate keys, the value from the smaller hash is used:

@destinations{keys %cities} = values %cities;

or

%destinations = (%destinations, %cities);

The values of %cities are merged into the %destinations hash.
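A minimal sketch of the merge, with hash contents invented so that one key (Toronto) appears in both hashes:

```perl
use strict;
use warnings;

my %destinations = (Toronto => "Ontario", Montreal => "Quebec");
my %cities       = (Toronto => "East", Calgary => "Central");

# keys and values return their lists in the same order for an unmodified hash,
# so the slice assignment pairs each key with its own value.
@destinations{keys %cities} = values %cities;

my $count = keys %destinations;    # Toronto, Montreal, and Calgary

print "$count: $destinations{Toronto} $destinations{Montreal} $destinations{Calgary}\n";
```

The duplicate key Toronto ends up with the value from %cities, while Montreal keeps its original value.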

Perl Control Structures

Computers are very efficient at making decisions (the decisions they are programmed to make, that is) and at repeating a task. Control structures let the programmer define those decisions and repetitions.


The if structure has the following syntax:

if (condition) { Code; }
if (condition) { Code; } else { Code; }
if (condition) { Code; } elsif (condition) { Code; } .... else { Code; }

Unlike C++, the {} are never optional. They must be used even when they enclose a single statement:

if ($color eq "red") {
    print "red";
} elsif ($color eq "white") {
    print "white";
} else {
    print "blue";
}


We have just seen ifs and all kinds of elses. Now suppose you don't want the if; you only want the else. You can use unless:

unless (condition) { Code; }

unless ($a < 13) {
    # do something
}


The while loop repeats a block of statements as long as the specified condition is true:

while (Condition) { Code; }


until is to while what unless is to if. In other words, it does the opposite of while: it repeats a statement block until the condition becomes true.

until (condition) { Code; }

do{} while/until:

In a while loop or an until loop, the condition is tested at the top. This means that if the condition is false to begin with (or true, in the case of until), the code inside is never executed. Often you want the code to run at least once before the condition is tested. For such cases there are the do while and do until loops. The while and until behave as usual; the only difference is that they are at the bottom instead of at the top.

do { Code; } while (Expression);

$stops = 0;
do {
    $stops++;
    print "Next stop? ";
    chomp($location = <STDIN>);
} until $stops > 5 || $location eq 'home';


The Perl for loop acts much like C's for loop:

for (Declare / Initialize; Condition; Increment / Decrement) {
    # Code;
}

for ($i = 0; $i <= $#array; $i++) {
    print $array[$i];    # print each element of array, one per loop
}

If you look carefully, you will see that there are three fields inside the parentheses of the for loop. The leftmost one is used for initialization. If you do not wish to initialize anything, leave it blank, but do not omit the semicolon. You can leave any or all of the three fields blank. The middle field is the condition: as long as it is true, the loop continues to run, and as soon as it becomes false, the loop is exited. A blank condition is treated as true. The rightmost field is typically used to increment or decrement values.


The foreach statement takes a list of values and assigns them one at a time to a scalar variable, executing a block of code with each successive assignment.

foreach $i (@list) {
    # code
}

@a = (1, 2, 3);
foreach $item (reverse @a) {
    print $item;
}

The following is also possible because of an implied $_:

foreach (reverse @a) { print; }

next and last:

The next and last operators allow you to modify the flow of your loop. The next operator would allow you to skip to the end of your current loop iteration, and start the next iteration. The last operator would allow you to skip to the end of your block, as if your test condition had returned false.

@array = (1 .. 9);
foreach $item (@array) {
    if ($item == 3) { next; }
    if ($item == 7) { last; }
    print $item, " ";
}
# prints: 1 2 4 5 6

Take a look at the result. The number 3 is missing because the next operator skipped the rest of that iteration before the print statement was reached. 7, 8, and 9 are missing because last ended the loop before the print statement.

Perl also provides labels, but I do not recommend using them.

Perl Subroutines

A subroutine is a small user-defined, self-contained subprogram. Like Perl's built-in functions, a subroutine is invoked by name and may have arguments passed to it. A subroutine may return a scalar or list value.

Defining subroutines:

Subroutines are defined using the sub keyword, followed by the subroutine code in curly braces:

sub dictionary_order {
    @ordered = sort @_;
    return @ordered;
}

The following is an error because & was used:

sub &dictionary_order    # Fatal compile-time error
{
    return sort @_;
}

Calling subroutines:

Subroutines are called by specifying their name, followed by a list of arguments:

@sorted = dictionary_order("eat", "at", "Joes");
@sorted = dictionary_order(@unsorted);
@sorted = dictionary_order(@sheep, @goats, "shepherd", $goatherd);
@sorted = &dictionary_order("eat", "at", "Joes");

You can also call a subroutine without parentheses, provided it has been defined or declared before the call:

sub make_sequence    # args: (from, to, step_size)
{
    # To see the arguments, you can do any of the following:
    print "@_";
    print $_[0], " ", $_[2], "etc";

    # If the caller passes name => value pairs instead, copy them into a hash:
    # %arg = @_;
    # print $arg{min}, " ", $arg{max}, " ", $arg{step_size};

    @list = ();
    for ($n = $_[0]; $n < $_[1]; $n += $_[2]) {
        push @list, $n;
    }
    return @list;
}


# then later...

@stepped_sequence = make_sequence $min, $max, $step_size;

Passing arguments:

Just like any other list, if the argument list contains nested lists or arrays, they are "flattened." Therefore, at the start of the third call to dictionary_order above, @_ would contain the contents of the array @sheep, followed by the contents of @goats, the value "shepherd", and finally the scalar value stored in $goatherd. It is possible to pass two or more arrays to a subroutine and keep them "unflattened" by using explicit references.

Refer back to the first example under Defining subroutines above. The arguments passed to the subroutine are available within its code block via the special @_ array. The built-in function return causes execution of the subroutine to finish immediately and the value specified after the return to be returned as the result. Using a return is optional in a subroutine. If none is specified, the subroutine automatically returns the value of the last expression it actually evaluated.

Because a subroutine's arguments are passed to it in the special array @_, and because all arrays in Perl are dynamically sized, any subroutine may be passed any number of arguments.
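A short sketch makes this concrete. The subroutine total below is a hypothetical helper, not one of Perl's built-ins; it simply adds up whatever it finds in @_.

```perl
use strict;
use warnings;

# total is a made-up helper: it sums however many numbers it is given.
sub total {
    my $sum = 0;
    $sum += $_ for @_;
    return $sum;
}

my @prices = (5, 10, 15);

print total(), "\n";               # no arguments at all
print total(1, 2), "\n";           # two arguments
print total(@prices, 20), "\n";    # the array is flattened, plus one more value
```

The same subroutine handles zero, two, or four arguments without any change to its definition.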

Named arguments:

Suppose we want to implement a subroutine called listdir that provides the functionality of our operating system's directory listing command (i.e., dir or ls). Such a subroutine might take arguments specifying which files to list, what type of files to consider, whether to list hidden files, what details of each file should be reported, whether files and directories should be listed recursively, how many columns to use, and whether the output should be paged or just dumped.

But we certainly don't want to have to specify every one of those nine parameters every time we call listdir:

listdir(undef, undef, 1, 1, undef, undef, undef, 4, 1);

Some programming languages provide a mechanism for naming the arguments passed to a subroutine. Perl supports named arguments in a cunning way. If we pretend that a particular subroutine takes a hash, rather than a list, we can use the => operator to associate a name with each argument. For example:

listdir(cols=>4, page=>1, hidden=>1, sep_dirs=>1);

Inside the subroutine, we simply initialize a hash with the resulting contents of the @_ array. We can access the arguments by name, using each name as the key to an entry in the hash. For example, we can define listdir like so:

sub listdir {
    %arg = @_;    # Convert argument list to hash

    # Use defaults for missing arguments...
    $arg{match} = "*" unless exists $arg{match};
    $arg{cols} = 1 unless exists $arg{cols};
    # etc.

    # Use arguments to control behaviour...
    @files = get_files($arg{match});
    push @files, get_hidden_files() if $arg{hidden};
    # etc.
}


Since the entries of a hash can be initialized in any convenient order, we no longer need to remember the order of the nine potential arguments, as long as we remember their names. Because hashes are flattened inside lists, if we have several calls that require the same subset of arguments, we can store that subset in a separate hash and reuse it:

%std_listing = (cols=>2, page=>1, sort_by=>"data");

listdir(file=>"*.txt", %std_listing);
listdir(file=>"*.log", %std_listing);
listdir(file=>"*.dat", %std_listing);

We can even override specific elements of the standard set of arguments, by placing an explicit version after the standard set. Then the explicit version will reinitialize (i.e. overwrite) the corresponding entry in the hash:

listdir(file=>"*.exe", %std_listing, sort_by=>"size");
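The override trick relies on nothing more than the order of the pairs in the list: a later pair replaces an earlier one with the same key. The sketch below uses that same rule to supply defaults, as an alternative to the unless exists tests shown earlier. The subroutine describe_listing is a hypothetical stand-in for listdir; it only reports the options it ends up with.

```perl
use strict;
use warnings;

# describe_listing is a made-up stand-in for listdir.
sub describe_listing {
    # Defaults come first, so any pair supplied by the caller overrides them.
    my %arg = (match => "*", cols => 1, @_);

    my @pairs;
    push @pairs, "$_=$arg{$_}" for sort keys %arg;
    return join ", ", @pairs;
}

my %std_listing = (cols => 2, page => 1);

print describe_listing(), "\n";
print describe_listing(%std_listing), "\n";
print describe_listing(%std_listing, cols => 4), "\n";
```

The first call shows only the defaults, the second shows the standard set overriding cols, and the third shows an explicit cols overriding the standard set in turn.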

Aliasing of parameters:

Elements of the @_ array are special in that they are not copies of the actual arguments of the function call. Rather, they are aliases for those arguments. That means that if values are assigned to $_[0], $_[1], $_[2], etc., each value is actually assigned to the corresponding argument with which the current subroutine was invoked. In other words, it's call-by-reference rather than call-by-value. The following subroutine increments its first argument each time it's called, but keeps the result less than 10 at all times.

sub cyclic_incr { $_[0] = ($_[0]+1) % 10; }

The result would be:

$next_digit = 8;
print $next_digit;    # prints 8

cyclic_incr($next_digit);
print $next_digit;    # prints 9

cyclic_incr($next_digit);
print $next_digit;    # prints 0

Passing an unmodifiable value like 7, as opposed to a variable like $next_digit, would cause a fatal error. If you don't intend to change the values of the original arguments, it's usually a good idea to explicitly copy the @_ array into a set of variables.

sub next_cyclic {
    ($number, $modulus) = @_;
    $number = ($number+1) % $modulus;
    return $number;
}

The variables $number and $modulus are still global, but the subroutine's parameters are now named and easier to read. To make them local to the subroutine, declare them with the my keyword.
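Here is a sketch of the same subroutine with my added, so that $number and $modulus exist only inside it. The variable names outside the subroutine are invented for the example.

```perl
use strict;
use warnings;

sub next_cyclic {
    # Private copies: changing them does not touch the caller's variables.
    my ($number, $modulus) = @_;
    $number = ($number + 1) % $modulus;
    return $number;
}

my $digit = 9;
my $next  = next_cyclic($digit, 10);

print "digit is still $digit, next is $next\n";
```

Because the arguments were copied, $digit keeps its value of 9 and only the returned value wraps around to 0.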

Calling Context

When a subroutine is called, it's possible to detect whether it was expected to return

  • a scalar value
  • a list or
  • nothing at all

These three possibilities define three contexts in which a subroutine may be called.

listdir(@files);               # void context: no return value expected
$listed = listdir(@files);     # scalar context: scalar return value expected
@missing = listdir(@files);    # list context: list return value expected
($f1, $f2) = listdir(@files);  # list context
print( listdir(@files) );      # list context

Wantarray function

Perl has a built-in function, wantarray, which tells a subroutine what it is expected to return. The function returns

  • undef if the current subroutine was not expected to return a value.
  • "" if it was expected to return a scalar.
  • 1 if it was expected to return a list.

We could use this information to select the appropriate form of return statement (and perhaps optimize for cases where the return value would not be used). For example:

sub listdir {
    # Do file listing, and then:

    return @missing_files if wantarray();
    return $listed_count if defined(wantarray());
}


If a subroutine is always supposed to return a value, we could issue a warning whenever that return value is ignored:

use Carp;

sub listdir {
    # Do file listing, and then:

    return @missing_files if wantarray;
    return $listed_count if defined(wantarray);
    carp "subroutine &listdir was called in void context";
}


We use the Carp::carp subroutine instead of the built-in warn function so that the warning reports the location of the call to listdir, rather than the location within listdir at which the problem was detected.
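The three contexts can be observed directly with a small sketch. The subroutine context_name is a hypothetical helper that does nothing but report how it was called.

```perl
use strict;
use warnings;

# context_name is a made-up helper that reports its calling context.
sub context_name {
    my $want = wantarray;
    return "list"   if $want;            # true: list context
    return "scalar" if defined $want;    # false but defined: scalar context
    return;                              # undef: void context, nothing to return
}

my @in_list   = context_name();
my $in_scalar = context_name();
context_name();                          # void context, result discarded

print "list call returned:   $in_list[0]\n";
print "scalar call returned: $in_scalar\n";
```

The order of the tests matters: the truth test has to come first, because the scalar-context value is false yet still defined.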

Determining a subroutine's caller

The Carp module is useful because it reports the location of a subroutine's caller, rather than the location of the subroutine's code.

caller function

Unlike most languages, Perl makes it easy to determine where a subroutine was called from. The built-in caller function provides details of the caller. What it returns depends on the calling context and on whether it is given an argument.

Scalar Context

In scalar context, caller returns only the package from which the current subroutine was called.

List Context

In list context with no argument, caller returns:

  1. the package from which the current subroutine was called
  2. the name of the file containing the code that called the current subroutine
  3. the line in that file from which the current subroutine was called

In list context with a numeric argument, such as caller(0), it returns those three values followed by:

  4. the name of the subroutine
  5. whether the subroutine was passed arguments
  6. the context in which the subroutine was called (the value returned by wantarray)
  7. the actual source code that called the subroutine (but only if the call was part of an eval TEXT statement)
  8. whether the subroutine was called as part of a require or use statement
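As a minimal sketch of caller in list context, the hypothetical subroutine who_called below builds a short report about its own caller. It uses the no-argument form for the package, file, and line, and caller(0) to pick out the subroutine name, which is the fourth value in the list.

```perl
use strict;
use warnings;

# who_called is a made-up helper that describes where it was called from.
sub who_called {
    my ($package, $file, $line) = caller;    # list context, no argument
    my $sub_name = (caller(0))[3];           # fourth value: this subroutine's full name
    return "$sub_name called from package $package at line $line";
}

my $report = who_called();
print "$report\n";
```

The subroutine name comes back fully qualified with its package, and the line number is that of the call, not of the subroutine's definition.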


Subroutines can also be declared with a prototype, which is a series of specifiers that tells the compiler to restrict the type and number of arguments with which the subroutine may be invoked. For example, in the subroutine definition

sub insensitive_less_than ($$) { return lc($_[0]) lt lc($_[1]); }

the prototype is ($$) and specifies that the subroutine insensitive_less_than can only be called with two arguments, each of which will be treated as a scalar, even if it's actually an array. In other words, a $ prototype causes the corresponding argument to be evaluated in scalar context. That means, for example, that a call like insensitive_less_than(@a, @b) will treat @a and @b as scalars: the two values passed to insensitive_less_than will be the lengths of @a and @b respectively, not their contents. This kind of subtlety is a good reason to avoid using a prototype unless you're very confident that you know its full consequences. Prototypes are only enforced when a subroutine is called using the name(args) syntax. They are not enforced when a subroutine is called with a leading & or through a subroutine reference, and they are also ignored when an object method is called.
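The difference between an enforced and a bypassed prototype can be seen in a small sketch. The subroutine show_pair and the two arrays are invented for the illustration; the subroutine just reports the two values it received.

```perl
use strict;
use warnings;

# show_pair is a made-up helper. Its prototype must be seen by the
# compiler before the calls below, so it is defined first.
sub show_pair ($$) {
    my ($first, $second) = @_;
    return "$first and $second";
}

my @a = (10, 20, 30);
my @b = (40, 50);

my $with_prototype    = show_pair(@a, @b);     # each array is evaluated in scalar context
my $without_prototype = &show_pair(@a, @b);    # & bypasses the prototype; the arrays flatten

print "with prototype:    $with_prototype\n";     # prints: with prototype:    3 and 2
print "without prototype: $without_prototype\n";  # prints: without prototype: 10 and 20
```

With the prototype in force the subroutine sees the two array lengths; with the leading & it sees the first two elements of the flattened list.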

Perl References

In Perl, references are not just pointers, they are data types.

Creating a Reference

A reference is created by putting a backslash in front of a variable. The resulting reference is stored in a scalar.

# Set up the data types.
my $scalarVar = "Something";

# Create a reference to it.
my $scalarRef = \$scalarVar;

Dereferencing a Reference

In order to access the information that a reference points to, the reference must be dereferenced. Perl's references do not automatically dereference themselves when used. For example:

# Initialize variables
my $scalarVar = "something";
my @arrayVar = qw(a b c d e);
my %hashVar = ("Toronto" => "East", "Calgary" => "Central", "Vancouver" => "West");

# Create the references
my $scalarRef = \$scalarVar;
my $arrayRef = \@arrayVar;
my $hashRef = \%hashVar;

# Print out the references.
print "$scalarRef \n";
print "$arrayRef \n";
print "$hashRef \n";

# The output of the program is
SCALAR(0xaddc4)
ARRAY(0xadec0)
HASH(0xade30)

So, how do we dereference? Dereferencing is different for scalars, arrays, and so on. Let's look at each one of them.

Scalar References

A scalar reference is a reference to a scalar value. The example below shows how a scalar reference is created, dereferenced, and printed.

# Creating a scalar variable
my $scalarVar = "something";

# Creating a scalar reference
my $scalarRef = \$scalarVar;

# Printing a scalar variable
print "Var: $scalarVar \n";

# Printing a scalar reference
print "Ref: " . $$scalarRef . "\n";

# The output would be
Var: something
Ref: something

Note the two $ signs used to dereference the scalar reference. Also note the difference in the way the scalar variable and the scalar reference are printed.

Array References:

An array reference is created using the \ operator and dereferenced using @$.

# Create the array
my @letters = qw(a b c d e);

# Create the array reference
my $arrayRef = \@letters;

# Printing the array reference
for $letter (@$arrayRef) {
    print "Letters: $letter \n";
}

Hash References:

Hash references are created using the \ operator and dereferenced using %$.

# Create an associative array
my %who = ('Name' => 'Gizmo', 'Age' => 3, 'Height' => '10 cm', 'Weight' => '10 gm');

# Create the hash reference
my $hashRef = \%who;

# Print the associative array
for $key (sort keys %$hashRef) {
    $value = $hashRef->{$key};
    printf "Key: %10s Value: %-40s\n", $key, $value;
}

# output of program
Key:        Age Value: 3
Key:     Height Value: 10 cm
Key:       Name Value: Gizmo
Key:     Weight Value: 10 gm

Code References

A code reference points to a Perl subroutine. Code references are mainly used for callback functions, where a callback is a function that you ask to have called at a later time. Code references are created with the \ operator and dereferenced with &$.

# define the callback function
sub callBack {
    my ($mesg) = @_;
    print "$mesg\n";
}

# Create the code reference
my $codeRef = \&callBack;

# Call the callback function with different parameters.
&$codeRef("Hi someone");
&$codeRef("something");
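To show why callbacks are useful, here is a sketch in which a code reference is handed to another subroutine, which decides when to call it. Both apply_to_each and shout are hypothetical names. The sketch uses the arrow form $callback->(...), which is equivalent to &$callback(...).

```perl
use strict;
use warnings;

# apply_to_each is a made-up helper: it calls the given code
# reference once for every remaining argument.
sub apply_to_each {
    my ($callback, @items) = @_;
    return map { $callback->($_) } @items;
}

# shout is the callback used in this example.
sub shout { return uc($_[0]) . "!" }

my @loud = apply_to_each(\&shout, "hi", "someone");
print "@loud\n";
```

The subroutine apply_to_each knows nothing about shout; any other code reference could be passed in its place.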

Anonymous Array References

An anonymous array is an array without an associated named variable. This means the array has been defined and stored into a reference instead of an array variable. There will be times when you may want to create a temporary array but don't feel like creating a new array name. When you use an anonymous array, Perl allocates the storage for the array and hands you a reference to it. To create an anonymous array, use square brackets around a list of values. The following is an anonymous array inside an anonymous array.

# create the anonymous array reference
my $arrayRef = [[1, 2, 3, 4], 'a', 'b', 'c', 'd', 'e', 'f'];

# Print out some of the array
print $arrayRef->[0][0] . "\n";
print $arrayRef->[0][1] . "\n";
print $arrayRef->[1] . "\n";

# output is
1
2
a

References are particularly useful in creating multidimensional data structures. As we saw earlier, nested lists are automatically flattened, so trying to build a list of lists doesn't work:

@table = ( ( 1, 2, 3 ), ( 2, 4, 6 ), ( 3, 6, 9 ), );

This fails to have the desired effect because flattening makes the above equivalent to:

@table = (1,2,3,2,4,6,3,6,9);

Fortunately, each element in a Perl array can store any kind of scalar value. Since a reference is just a special kind of scalar, it's possible to write:

@row1 = (1,2,3); @row2 = (2,4,6); @row3 = (3,6,9);

@cols = (\@row1, \@row2, \@row3);

$table = \@cols;

Now the elements in the "row" arrays can be accessed using the arrow notation:

print "2 x 3 is ", $table->[1]->[2];

Of course, tables like this are very popular, so Perl provides syntactic assistance. If we specify a list of values in square brackets instead of parentheses, the result is not a list, but a reference to a nameless (or anonymous) array. That array is automatically initialized to the specified values. So the above code could be written as:

$row1_ref = [ 1, 2, 3]; $row2_ref = [ 2, 4, 6]; $row3_ref = [ 3, 6, 9];

$table = [$row1_ref, $row2_ref, $row3_ref];

or, using nested brackets:

my $table = [
    [ 1, 2, 3 ],
    [ 2, 4, 6 ],
    [ 3, 6, 9 ],
];

And finally

print $table->[1]->[2];

can be replaced with

print $table->[1][2];
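The same table can also be built in a loop rather than typed out. The sketch below is one way to do it, assuming a 3 x 3 multiplication table as in the text. Perl creates each inner array automatically the first time an element of it is assigned.

```perl
use strict;
use warnings;

my $table = [];
for my $row (1 .. 3) {
    for my $col (1 .. 3) {
        # The inner array for this row springs into existence on first use.
        $table->[$row - 1][$col - 1] = $row * $col;
    }
}

print "2 x 3 is $table->[1][2]\n";

# Print the table one row per line.
print join(" ", @$_), "\n" for @$table;
```

Each element of @$table is itself an array reference, so @$_ inside the final loop dereferences one row at a time.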

Anonymous Hash References

Anonymous hash (or associative array) references are created the same way anonymous array references are created. The hash is created and a reference to it is stored directly in a scalar.

my $hashRef = {'Name' => 'Gizmo', 'Age' => 3, 'Height' => '10 cm'};
print $hashRef->{'Name'} . "\n";
print $hashRef->{'Age'} . "\n";
print $hashRef->{'Height'} . "\n";

# output is
Gizmo
3
10 cm

It is possible to create references to anonymous hashes by replacing the parentheses of a hash-like list:

%association = ( cat=>"nap", dog=>"gone", mouse=>"ball" ); # parentheses

with curly braces:

$association = { cat=>"nap", dog=>"gone", mouse=>"ball" }; # curly braces

Like the [...] array constructor, the {...} hash constructor returns a reference, which must be assigned to a scalar variable ($association), not to a hash (%association). Access to the resulting anonymous hash is only possible through the returned reference:

print $association->{cat};

We can even create multilevel hashes, by nesting anonymous hash references:

$behaviour = {
    cat   => { nap => "lap", eat => "meat" },
    dog   => { prowl => "growl", pool => "drool" },
    mouse => { nibble => "kibble" },
};

Accessing the data requires a chain of arrow operators:

print $behaviour->{cat}->{eat};

And, as with multidimensional arrays, any arrows after the first can be omitted:

print $behaviour->{mouse}{nibble};
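A multilevel hash is usually walked with one loop per level. The following sketch uses a cut-down copy of the structure above and collects one line of text per innermost entry; the @lines array is just a convenience for the example.

```perl
use strict;
use warnings;

# A smaller copy of the nested structure from the text.
my $behaviour = {
    cat   => { nap    => "lap", eat => "meat" },
    mouse => { nibble => "kibble" },
};

my @lines;
for my $animal (sort keys %$behaviour) {
    # %{ ... } dereferences the inner hash stored under this animal.
    for my $action (sort keys %{ $behaviour->{$animal} }) {
        push @lines, "$animal: $action => $behaviour->{$animal}{$action}";
    }
}

print "$_\n" for @lines;
```

The keys are sorted only to make the output predictable, since a hash has no order of its own.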

Anonymous Subroutine References

An anonymous subroutine is a subroutine that has been defined without a name. The &$ prefix is used to call the anonymous routine through its reference. The following script creates a reference to an anonymous function.

my $codeRef = sub {
    my $mesg = shift;
    print "$mesg\n";
};

&$codeRef("hi someone");
&$codeRef("something");

Passing subroutine arguments as explicit references

References also provide a means of passing unflattened arrays or hashes into subroutines. Suppose we want to pass an array and a value to a subroutine. We can't call this subroutine in the obvious way:

insert(@ordered, $next_val);

because normal list flattening will squash the contents of @ordered and the value of $next_val into a single list. Instead, we could set up the subroutine insert so that it expected a reference to the array as its first argument:

sub insert {
    ($arr_ref, $new_val) = @_;
    @{$arr_ref} = sort { $a <=> $b } (@{$arr_ref}, $new_val);    # numerical sort
}

We could then call it like so:

insert(\@ordered, $next_val);
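The same technique keeps two or more arrays apart. In this sketch the hypothetical subroutine longer_of receives two array references and returns whichever one refers to the longer array; the animal names are sample data.

```perl
use strict;
use warnings;

# longer_of is a made-up helper that takes two array references.
sub longer_of {
    my ($first_ref, $second_ref) = @_;
    # An array in numeric comparison yields its number of elements.
    return @$first_ref >= @$second_ref ? $first_ref : $second_ref;
}

my @sheep = ("Dolly", "Shaun");
my @goats = ("Billy", "Nanny", "Gruff");

my $longer = longer_of(\@sheep, \@goats);
print "The longer list is: @$longer\n";
```

Had the arrays been passed directly, @_ would have held five names with no way to tell where @sheep ended and @goats began.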

Identifying a Referent

Because a scalar variable can store a reference to any kind of data, and dereferencing a reference with the wrong prefix leads to fatal errors, it's sometimes convenient to be able to determine the type of referent to which a specific reference refers. Perl provides a built-in function called ref that takes a scalar, such as $slr_ref, and returns a description of the kind of reference it contains.

What ref returns

If $slr_ref contains ...              then ref($slr_ref) returns ...
a scalar value (not a reference)      "" (the empty string)
a reference to a scalar               "SCALAR"
a reference to an array               "ARRAY"
a reference to a hash                 "HASH"
a reference to a subroutine           "CODE"
a reference to a filehandle           "IO" or "IO::Handle"
a reference to a typeglob             "GLOB"
a reference to a precompiled pattern  "Regexp"
a reference to another reference      "REF"

The ref function can be used to improve error messages.

die "Expected scalar reference" unless ref($slr_ref) eq "SCALAR";

or to allow a subroutine to automatically dereference any arguments that might be references:

sub trace {
    ($prefix, @args) = @_;
    foreach $arg (@args) {
        if (ref($arg) eq 'SCALAR')   { print $prefix, ${$arg} }
        elsif (ref($arg) eq 'ARRAY') { print $prefix, @{$arg} }
        elsif (ref($arg) eq 'HASH')  { print $prefix, %{$arg} }
        else                         { print $prefix, $arg }
    }
}

The ref function has a vital additional role in object-oriented Perl, where it can be used to identify the class to which a particular object belongs.
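A quick way to get a feel for ref is to feed it one value of each common kind. The sketch below does that for a few of the cases; the values themselves are arbitrary.

```perl
use strict;
use warnings;

my $text = "plain";

# One reference of several kinds, followed by a plain (non-reference) scalar.
my @kinds = map { ref($_) } (\$text, [1, 2], { a => 1 }, sub { 1 }, \\$text, $text);

# ref gives an empty string for a non-reference, so label that case for printing.
print join(",", map { length($_) ? $_ : "(not a reference)" } @kinds), "\n";
```

The last value in @kinds is empty because $text holds an ordinary string rather than a reference.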

Perl Regular Expressions

Regular Expressions

Regular expressions are used to search for patterns in strings of data.

Pattern-Matching Operators:

Pattern-matching operators are the keywords in Perl that perform pattern matches. The difference between regular expression syntax and pattern-matching operators is that regular expressions allow the programmer to build complex patterns, whereas pattern-matching operators deal with how to apply them. The syntax used to perform a pattern match on a string is:

$string =~ /regular expression/expression modifier (optional)

The pattern inside / / is what will be searched for.

The two main pattern matching operators are m//, the match operator, and s///, the substitution operator. There is also a split operator, which takes an ordinary match operator as its first argument but otherwise behaves like a function.

Although we write m// and s/// here, you can pick your own quote characters. On the other hand, for the m// operator only, the m may be omitted if the delimiters you pick are in fact slashes. (You'll often see patterns written this way, for historical reasons.)
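Choosing other delimiters pays off when the pattern itself contains slashes. The following sketch uses braces for a match, a substitution, and a split on a sample file path; the path is only an example value.

```perl
use strict;
use warnings;

my $path = "/usr/local/bin/perl";    # sample value for the illustration

# Match with braces as delimiters, so the slashes need no escaping.
my $matched = $path =~ m{/local/} ? "yes" : "no";

# Copy the string, then substitute in the copy.
(my $moved = $path) =~ s{/usr/local}{/opt};

# split takes a match operator as its first argument.
my @parts = split m{/}, $path;
my @dirs  = @parts[1 .. $#parts];    # drop the empty field before the leading slash

print "matched: $matched\n";
print "moved:   $moved\n";
print "parts:   @dirs\n";
```

Written with slashes, the first pattern would have to be /\/local\//, which is much harder to read.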

Regular Expression Syntax:

There are lots and lots of regular expression constructs in Perl. Regular expressions are most commonly applied to strings with the pattern-binding operators, =~ and !~. The first compares a string to the pattern and succeeds if the two match. The second compares the string to the pattern and succeeds if the match fails. Its syntax mirrors that of the first:

$string !~ /regular expression/expression modifier (optional)

The rules of regular expression matching .


An expression modifier can be added to most regular expressions to modify the behaviour of the expression. The following is an example.

# Create a basic string.
my $string = "Hello World!";

if ($string =~ /Hello World!/) {
    print "Case Match!\n";
}

if ($string =~ /hello WORLD!/i) {
    print "Case insensitive Match!\n";
}
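The two binding operators and the i modifier can be compared side by side. This sketch records the outcome of each test as 1 or 0; the variable names are made up for the example.

```perl
use strict;
use warnings;

my $string = "Hello World!";

my $exact       = $string =~ /Hello World!/  ? 1 : 0;    # same case: matches
my $wrong_case  = $string =~ /hello WORLD!/  ? 1 : 0;    # different case, no modifier
my $ignore_case = $string =~ /hello WORLD!/i ? 1 : 0;    # i modifier ignores case
my $no_digits   = $string !~ /\d/            ? 1 : 0;    # !~ succeeds when the match fails

print "exact=$exact wrong_case=$wrong_case ignore_case=$ignore_case no_digits=$no_digits\n";
```

Only the second test fails, because without the i modifier the pattern is compared case-sensitively.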