[[Category:Programming languages]] [[image:Programming-republic-of-perl.gif|right|Programming Republic Of Perl]] '''Perl''', also '''Practical Extraction and Report Language''' (a [[backronym]], see [[#Name|below]]), is a [[programming language]] released by [[Larry Wall]] on [[December 18]], [[1987]] that borrows features from [[C programming language|C]], [[sed]], [[awk]], [[UNIX shell|shell]] scripting ([[UNIX shell|sh]]), and (to a lesser extent) from many other programming languages. == Rationale == Perl was designed to be a practical language to extract information from text files and to generate reports from that information. One of its mottos is "There's more than one way to do it" (TMTOWTDI - pronounced 'Tim Toady'). Another is ''Perl: the Swiss Army Chainsaw of Programming Languages''. One stated design goal is to make easy tasks easy and difficult tasks possible. Its versatility permits versions of many programming paradigms: [[procedural programming|procedural]], [[functional programming|functional]], and [[Object Oriented Programming|object-oriented]] (though some claim that Perl is not a cleanly designed language because of its multiple paradigms). Perl has a powerful [[regular expression]] engine built directly into its syntax. Perl is often considered the archetypal [[scripting programming languages|scripting language]] and has been called the "glue that holds the web together", as it is one of the most popular [[Common Gateway Interface|CGI]] languages. Its function as a "glue language" can be described broadly as its ability to tie together different systems and interfaces that were not designed to interoperate. Perl is one of the [[programming language]] components of the popular [[LAMP]] free software platform for web development. Perl is [[free software movement|free software]], available under a combination of the ''[[Artistic License]]'' and the [[GPL]]. It is available for most [[operating system|operating systems]] but is particularly prevalent on [[Unix]] and [[Unix-like]] systems (e.g: [[Linux]] and [[FreeBSD]]), and is growing in popularity on [[Microsoft Windows]] systems. As an example of Perl in action, [[Wikipedia]] itself was a [[Common Gateway Interface|CGI]] script written in Perl until January [[2002]]. Another example is [[Slashdot]], which runs on the Perl-based [[Slashcode]] software. When used on the web, Perl is often used in conjunction with the [[Apache web server]] and its [[mod_perl]] module. Perl is regarded by both its proponents and detractors as something of a grab bag of features and syntax. The difference between the two camps lies in whether this is seen as a virtue or a vice. Perl votaries maintain that this varied heritage is what makes the language so useful. Reference is often made to natural languages such as [[English language|English]] and to [[evolution]]. For example, Larry Wall has argued that: :''... we often joke that a camel is a horse designed by a committee, but if you think about it, the camel is pretty well adapted for life in the desert. The camel has evolved to be relatively self-sufficient. On the other hand, the camel has not evolved to smell good. Neither has Perl.'' In recognition of its ugly-but-useful nature, Perl has adopted the camel as its mascot. == Implementation == A huge collection of freely usable [[perl module]]s, ranging from advanced mathematics to database connectivity, networking and more, can be downloaded from a network of sites called [[CPAN]], an [[Acronym|acronym]] for Comprehensive Perl Archive Network. Most or all of the software on CPAN is also available under either the [[Artistic License]], the [[GPL]], or both. CPAN.pm is also the name of the Perl module that downloads and installs other Perl modules from one of the CPAN mirror sites: such installations can be done with interactive prompts, or can be fully automated. Although Perl has most of the ease-of-use features of an interpreted language, it does not strictly interpret and execute source code one line at a time. Rather, perl (the program) first compiles an entire program to an intermediate [[byte code]] (much like [[Java programming language|Java's]] byte code), optimizing as it goes, and then executes that byte code. It is possible to compile a Perl program to byte code to save the compilation step on later executions, though the "interpreter" is still needed to execute that code. == Current version == The current version, 5.8.5, includes [[Unicode]] support. Development of the next major release, Perl 6, is also underway. It will run on [[Parrot virtual machine|Parrot]], a [[virtual machine]] which is being developed as a possible multi-language target architecture. == Control structures == The basic control structures do not differ greatly from those used in the [[C programming language|C]] or [[Java programming language|Java]] programming languages: ''Loops'' while (''Boolean expression'') { ''statement(s)'' } do { ''statement(s)'' } while (''Boolean expression''); do { ''statement(s)'' } until (''Boolean expression''); for (''initialisation'' ; ''termination condition'' ; ''incrementing expr'') { ''statement(s)'' } foreach ( ''array'' ) { ''statement(s)'' } ''If-then-statements'' if (''Boolean expression'') { ''statement(s)'' } unless (''Boolean expression'') { ''statement(s)'' } if (''Boolean expression'') { ''statement(s)'' } else { ''statement(s)'' } if (''Boolean expression'') { ''statement(s)'' } elsif (''Boolean expression'') { ''statement(s)'' } ''For one-line statements, "until", "while" and "if" can also be used as follows'': ''statement(s)'' until ''Boolean expression''; ''statement(s)'' while ''Boolean expression''; ''statement(s)'' if ''Boolean expression''; == Subroutines == [[Subroutine]]s in Perl can be specified with the [[keyword]] 'sub'. [[Variable]]s passed to a subroutine appear in the subroutine as elements of the local (to the subroutine) scalar array @_. Calling a subroutine with three scalar variables results in array elements @_[0], @_[1], and @_[2] within the subroutine. Note that these elements would be referred to as the scalars $_[0], $_[1], and $_[2]. Also shift can be used, without specifying @_, to obtain each value. Changes to elements in the @_ array within the subroutine are reflected in the elements in the calling program. Subroutines naturally return the value of the last expression evaluated, though explicit use of the ''return'' statement is often encouraged for clarity. An example subroutine definition and call follows: sub cube { my $x = shift; $x * $x * $x; } ... $z = -4; $y = cube($z); print "$y\n"; Named parameters can be simulated by passing a hash. == Perl and [[SQL]] databases == [[DBI/DBD]] modules can be used to access most [[ANSI]] [[SQL]] databases, including [[MySQL]], [[PostgreSQL]] and [[Oracle database|Oracle]]. == Perl 5 == Perl5, the most current production version of perl, is an interpreter which processes the text of a Perl script at runtime. Thus, the [[debugger]] is invoked directly from the command line with
perl -dw ScriptName.pl Argument1 ... ...
Note that there is no limit to the number of arguments: Perl is polyadic; any number of arguments can be passed to any Perl subroutine, in general. This concept of "no arbitrary limits" is present in most other parts of the language as well. Perl can read a ten million byte string into a variable, if the machine has the memory for it.
== Perl 6 ==
Perl6 is currently under development, and is planned to separate parsing, compilation and runtime, making a virtual machine that is more attractive to developers looking to port other languages to the architecture.
[[Parrot virtual machine|Parrot]] is the Perl6 runtime, and can be programmed at a low level in [[Parrot assembly language]]. Parrot has existed in a limited form since December 2003. An increasing number of languages have been implemented to various degrees for the Parrot to be 'compiled' to Parrot assembly language opcodes. Besides a subset of the planned Perl6, these include [[BASIC programming language|BASIC]], [[Befunge]], [[Cola programming language|Cola]], [[Forth programming language|Forth]], [[Jako]], [[Ook!]], [[Plot programming language|Plot]], and even [[Python programming language|Python]], [[Ruby programming language|Ruby]], and [[Scheme]].
== Perl code samples ==
The canonical "[[hello world]]" program would be:
| Regular Expression | Description | Example
Note that all the if statements return a TRUE value |
|---|---|---|
| '''.''' | Matches an arbitrary character, but not a newline. |
$string1 = "Hello World\n";
if ($string1 =~ m/...../) {
print "$string1 has length >= 5\n";
}
|
| ( ) | Groups a series of pattern elements to a single element. When you match a pattern within parentheses, you can use any of $1, $2, ... $9 later to refer to the previously matched pattern. |
$string1 = "Hello World\n";
if ($string1 =~ m/(H..).(o..)/) {
print "We matched '$1' and '$2'\n";
}
Output:
We matched 'Hel' and 'o W';
|
| + | Matches the preceding pattern element one or more times. |
$string1 = "Hello World\n";
if ($string1 =~ m/l+/) {
print "There are one or more consecutive l's in $string1\n";
}
|
| ? | Matches zero or one times. |
$string1 = "Hello World\n";
if ($string1 =~ m/H.?e/) {
print "There is an 'H' and a 'e' separated by ";
print "0-1 characters (Ex: He Hoe)\n";
}
|
| ? | Matches the *, +, or {M,N}'d regexp that comes before as few times as possible. |
$string1 = "Hello World\n";
if ($string1 =~ m/(l+?o)/) {
print "The non-greedy match with one or more 'l' ";
print "followed by an 'o' is 'lo', not 'llo'.\n";
}
|
| * | Matches zero or more times. |
$string1 = "Hello World\n";
if ($string =~ m/el*o/) {
print "There is a 'e' followed by zero to many";
print "'l' followed by 'o' (eo, elo, ello, elllo)\n";
}
|
| {M,N} | Denotes the minimum M and the maximum N match count. |
$string1 = "Hello World\n";
if ($string1 =~ m/l{1,2}/) {
print "There exists a substring with at least 1";
print "and at most 2 l's in $string1\n";
}
|
| [...] | Denotes a set of possible matches. |
$string1 = "Hello World\n";
if ($string1 =~ m/[aeiou]+/) {
print "$string1 contains a one or more";
print "vowels\n";
}
|
| | | Matches one of the left or right operand. |
$string1 = "Hello World\n";
if ($string1 =~ m/(Hello|Hi)/) {
print "Hello or Hi is ";
print "contained in $string1";
}
|
| \b | Matches a word boundary. |
$string1 = "Hello World\n";
if ($string1 =~ m/\bllo\b/) {
print "There is a word that starts with";
print " 'llo'\n";
} else {
print "There are no words that start with";
print "'llo'\n";
}
|
| \w | Matches alphanumeric, including "_". |
$string1 = "Hello World\n";
if ($string1 =~ m/\w/) {
print "There is at least one alpha-";
print "numeric char in $string1 (A-Z, a-z, 0-9, _)\n";
}
|
| \W | Matches a non-alphanumeric character. |
$string1 = "Hello World\n";
if ($string1 =~ m/\W/) {
print "The space between Hello and ";
print "World is not alphanumeric\n";
}
|
| \s | Matches a whitespace character (space, tab, newline, formfeed) |
$string1 = "Hello World\n";
if ($string1 =~ m/\s.*\s/) {
print "There are TWO whitespace ";
print "characters separated by other characters in $string1";
}
|
| \S | Matches anything BUT a whitespace. |
$string1 = "Hello World\n";
if ($string1 =~ m/\S.*\S/) {
print "There are TWO non-whitespace ";
print "characters separated by other characters in $string1";
}
|
| \d | Matches a digit, same as [0-9]. |
$string1 = "99 bottles of beer on the wall.";
if ($string1 =~ m/(\d+)/) {
print "$1 is the first number in '$string1'\n";
}
Output:
99 is the first number in '99 bottles of beer on the wall.'
|
| \D | Matches a non-digit. |
$string1 = "Hello World\n";
if ($string1 =~ m/\D/) {
print "There is at least one character in $string1";
print "that is not a digit.\n";
}
|
| ^ | Matches the beginning of a line or string. |
$string1 = "Hello World\n";
if ($string1 =~ m/^He/) {
print "$string1 starts with the characters 'He'\n";
}
|
| $ | Matches the end of a line or string. |
$string1 = "Hello World\n";
if ($string1 =~ m/rld$/) {
print "$string1 is a line or string";
print "that ends with 'rld'\n";
}
|
| [^...] | Matches every character except the ones inside brackets. |
$string1 = "Hello World\n";
if ($string1 =~ m/[^abc]/) {
print "$string1 does not contain the characters a, b, and c\n";
}
|