If this is your first exposure to Perl, please read this document and the perl(1) man page before asking questions in comp.lang.perl.misc. If you're using v4 perl, that page contains all you need to know (or at least enough to get started). If you're using v5 perl, that page will show you where to look for specific information. When we refer to perlmod(1), we mean the "perlmod" man page in section "1" of the manual; likewise, Foo(3pm) means the "Foo" man page in section "3pm" (perl modules) of the library. Note, however, that the perl install does NOT automatically install the module man pages for you.
Hopefully the questions herein are asked often enough that considerable net bandwidth can be saved by looking here before asking. Also, hopefully there is enough information contained here that someone who has never heard of Perl can read this and come away with at least some idea of what Perl is.
Some questions in this group aren't really about Perl, but rather about system-specific issues. You might also consult the Most Frequently Asked Questions list in comp.unix.questions for answers to this type of question.
The current version of perl is 5.001; perl 5.000 emerged into the world on 16 October, 1994. The previous non-beta version was 4.036 (version 4, patchlevel 36). Many of these questions were written for perl4; however, a lot of perl5 information has also been added. Perl5-only features will be clearly marked as such, so as not to cause confusion for those still using perl4. You should upgrade to perl5 as soon as possible, though (see below).
This list was initially written, and is still hacked upon, by Tom Christiansen*. However, due to his erratic schedule, it is currently maintained by Stephen P Potter*. First person singular pronouns, when not in quoted postings, generally are Tom talking.
This document, and all its parts, are Copyright (c) 1994/1995, Stephen P Potter and Tom Christiansen, perlfaq@perl.com. All rights reserved. HTML by Tom Christiansen. Permission to distribute this collection, in part or full, via electronic means (emailed, posted or archived) or printed copy is granted provided that no charges are involved, reasonable attempt is made to use the most current version, and all credits and copyright notices are retained. Requests for other distribution rights, including incorporation in commercial products, such as books, magazine articles, or CD-ROMs should be made to perlfaq@perl.com.
This FAQ is archived on ftp.cis.ufl.edu [128.227.100.198] in the file
pub/perl/doc/FAQ, as well as on rtfm.mit.edu [18.181.0.24] in
/pub/usenet/comp.lang.perl.*. If you have any suggested additions or
corrections to this article, please send them to
Here's the beginning of the description from the perl(1) man page:
I've also been continuously reminding myself of what Henry Spencer
calls ``second system syndrome'', in which everything under the sun gets
added, resulting in a colossal kludge, like OS 360. You'll find that
the new features in Perl 5 are all pretty minimalistic. The
object-oriented features in particular added only one new piece of
syntax, a C++-style method call.
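[For readers who have not seen it, here is a rough sketch of that method-call syntax. It is Perl5 only, and the Counter class and its method names are invented purely for illustration; they are not part of perl or of Larry's posting.]

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical class, made up for this example.
package Counter;

sub new {
    my ($class, $start) = @_;
    my $self = { count => $start };   # object data lives in an ordinary hash
    return bless $self, $class;       # bless associates the hash with the class
}

sub increment {
    my ($self) = @_;
    return ++$self->{count};
}

package main;

my $counter = Counter->new(41);     # class method call
my $value   = $counter->increment;  # object method call, using the arrow
print "$value\n";
```

Everything other than the arrow and bless is plain old subroutines, packages and hashes, which is roughly what ``minimalistic'' means here.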
We did break a few misfeatures in going to Perl 5. It seemed like the
first and last chance to do so. There's a list of the
incompatibilities in the documentation.
Perl 5 is a special case. I've been working on it for years. (This is
part of the reason 4.036 has been so stable!) There are many changes,
most of them for the better, I hope. I don't expect the transition to
be without pain. But that's why I stuck numbered versions out in your
bin directory, so that you can upgrade piecemeal if you like. And
that's why I made the -w switch warn about many of the incompatibilities.
And overriding all that, I've tried to keep it so that you don't have
to know much about the new stuff to use the old stuff. You can upgrade
your knowledge piecemeal too.
The extension mechanism is designed to take over most of the
evolutionary role from now on. And it's set up so that, if you don't
have a particular extension, you know it right up at the front.
In summary, almost every concern that you might think of has already
been (at least) thought about. In a perfect world, every concern
could be addressed perfectly. But in this world we just have to slog
through.
Larry now uses ``Perl'' to signify the language proper and ``perl'' the
implementation of it, i.e. the current interpreter. Hence Tom's
quip that ``Nothing but perl can parse Perl.''
On the other hand, the aesthetic value of casewise parallelism in
``awk'', ``sed'', and ``perl'' as much requires the lower-case version as ``C'',
``Pascal'', and ``Perl'' require the upper-case version. It's also easier
to type ``Perl'' in typeset print than to be constantly switching in
Courier. :-)
In other words, it doesn't matter much, especially if all you're doing
is hearing someone talk about the language; case is hard to distinguish
aurally.
It depends on whether you are talking about the perl binary or
something that you wrote using perl. And, actually, even this isn't
necessarily true.
``Standard'' UNIX terminology is (roughly) this: programs are compiled
into machine code once and run multiple times, scripts are translated
(by a program) each time they are used. However, some say that a
program is anything written which is executed on a computer system.
Larry considers it a program if it is set in stone and you can't change
it, whereas if you can go in and hack at it, it's a script. Of course,
if you have the source code, that makes just about anything a
script. ;)
In general, it probably doesn't really matter. The terms are used
interchangeably. If you particularly like one or the other, use it. If
you want to call yourself a perl programmer, call them programs. If
you want to call yourself a perl scripter, call them scripts. Randal*
and I (at least) will call them hacks. (See question 2.10 ;)
Larry says that a script is what you give an actor, but a program is
what you give an audience.
The first reason is that most of Perl has been derived from standard
utilities, tools, and languages that you are (probably) already
familiar with. If you have any knowledge of the C programming language
and standard C library, the Unix Shell, sed and awk, Perl should be
simple and fun for you to learn.
The second reason that Perl is easy to learn is that you only have to
know a very small subset of Perl to be able to get useful results. In
fact, once you can master
The third reason is that you can get immediate results from your
scripts. Unlike a normal compiled language (like C or Pascal, for
example), you don't have to continually recompile your program every
time you change one little thing. Perl allows you to experiment and
test/debug quickly and easily. This ease of experimentation flattens
the learning curve even more.
If you don't know C or UNIX at all, it'll be a steeper learning curve,
but what you then learn from Perl will carry over into other areas,
like using the C library, UNIX system calls, regular expressions, and
associative arrays, just to name a few. To know Perl is to know UNIX,
and vice versa.
Most definitely. In fact, you should delete the binaries for sed, awk,
cc, gcc, grep, rm, ls, cat... well, just delete your /bin directory.
But seriously, of course you shouldn't. As with any job, you should
use the appropriate tool for the task at hand. Just because a hammer
will put screws into a piece of board, you probably don't want to do
that.
While it's true that the answer to the question ``Can I do (some
arbitrary task) in Perl?'' is almost always ``yes'', that doesn't mean
this is necessarily a good thing to do. For many people, Perl serves
as a great replacement for shell programming. For a few people, it
also serves as a replacement for most of what they'd do in C. But for
some things, Perl just isn't the optimal choice.
REXX is an interpreted programming language first seen on IBM systems.
Python is an interpreted programming language by Guido van Rossum*.
TCL is John Ousterhout*'s embeddable command language, designed just
for embedded command extensions, but lately used for larger
applications. TCL's most intriguing feature for many people is the
tcl/tk toolset that allows for interpreted X-based tools. Others use
it for its ``expect'' extension.
To avoid any flamage, if you really want to know the answer to this
question, probably the best thing to do is try to write equivalent
code to do a set of tasks. All three have their own newsgroups in
which you can learn about (but hopefully not argue about) these
languages.
To find out more about these or other languages, you might also check
out David Muir Sharnoff*'s posting ``Catalog of Compilers, Interpreters,
and Other Language Tools'' which he posts to the comp.lang.misc,
comp.sources.d, comp.archives.admin, and news.answers newsgroups. It's
a comprehensive treatment of many different languages. (Caveat lector:
he considers Perl's syntax ``unappealing''.)
Perl is available from any comp.sources.misc archive. You can use an
archie server (see the alt.sources FAQ in news.answers) to find these
if you want.
If there is a site in Asia or Japan, please tell us about it. Thanks!
You can also retrieve perl via non-ftp methods:
The following is a list of known ftpmail sites. Please attempt to use
the site closest to you with the ftp archive closest to it. Many of
these sites already have perl on them. For information on how to use
one of these sites, send email containing the word ``help'' to the
address.
If all else fails, mail to Larry usually suffices.
There currently is no way of getting Perl via UUCP. If anyone knows of
a way, please contact me. The OSU site has discontinued the service.
Another possibility is to use UUNET, although they charge you for it.
You have been duly warned. Here's the advertisement:
UUNET now provides access to its extensive collection of UNIX
related sources to non-subscribers. By calling 1-900-468-7727
and using the login ``uucp'' with no password, anyone may uucp any
of UUNET's on line source collection. Callers will be charged 40
cents per minute. The charges will appear on their next telephone
bill.
The file uunet!/info/help contains instructions. The file
uunet!/index//ls-lR.Z contains a complete list of the files
available and is updated daily. Files ending in Z need to be
uncompressed before being used. The file uunet!~/compress.tar is
a tar archive containing the C sources for the uncompress program.
This service provides a cost effective way of obtaining
current releases of sources without having to maintain accounts
with UUNET or some other service. All modems connected to the
900 number are Telebit T2500 modems. These modems support all
standard modem speeds including PEP, V.32 (9600), V.22bis (2400),
Bell 212a (1200), and Bell 103 (300). Using PEP or V.32, a 1.5
megabyte file such as the GNU C compiler would cost $10 in connect
charges. The entire 55 megabyte X Window system V11 R4
would cost only $370 in connect time. These costs are less than
the official tape distribution fees and they are available now
via modem.
Perl runs on virtually all Unix machines simply by following the hints
file and instructions in the Configure script. This auto-configuration
script allows Perl to compile on a wide variety of platforms by
modifying the machine specific parts of the code. For most Unix
systems, or VMS systems for v5 perl, no porting is required. Try to
compile Perl on your machine. If you have problems, examine the README
file carefully. If all else fails, send a message to comp.lang.perl.misc
and crosspost to comp.sys.[whatever]; there's probably someone out
there who has already solved your problem and will be able to help you
out.
Perl4.036 has been ported to many non-Unix systems, although currently
there are only a few (beta) v5 ports. All of the following are
mirrored at ftp://ftp.cis.ufl.edu:/pub/perl/src/. The following are
the (known) official distribution points. Please contact the porters
directly (when possible) in case of questions on these ports.
Note that the latest version of BigPerl4 can also be found at
any SimTel mirror site (ftp.ee.umanitoba.ca does not
necessarily have the latest version), such as:
ftp://oak.oakland.edu/SimTel/msdos/perl/
A beta-test version of bigperl based on Perl 5.000 can be
obtained from the following sites:
ftp://ftp.einet.net/pub/perl5
ftp://ftp.khoros.unm.edu/pub/perl/msdos
ftp://ftp.ee.umanitoba.ca/pub/msdos/perl/perl5
This beta bigperl also contains ported versions of a2p and s2p.
Timothy Murphy* also ported a version of perl to the Macintosh
using Think C. It has probably been abandoned in favour of the
MPW port, but is still available at [134.226.81.10]
ftp://ftp.maths.tcd.ie/pub/Mac/perl-4.035/.
Matthias Ulrich Neeracher* is working on a perl5 port to the
Macintosh. A PowerPC version is available at
ftp://err.ethz.ch/pub/neeri/MacPerlBeta.
The following directions are for perl, version 4. Perl, version 5,
should compile more easily. If not, send mail to The Perl Porters
Mailing List (perl5-porters@nicoh.com).
John Lees* reports:
I have built perl on Solaris 2.1, 2.2 beta, and 2.2 FCS. Take
/usr/ucb out of your path and do not use any BSD/UCB libraries.
Only -lsocket, -lnsl, and -lm are needed. You can use the hint for
Solaris 2.0, but the one for 2.1 is wrong. Do not use vfork. Do not
use -I/usr/ucbinclude. The result works fine for me, but of course
does not support a couple of BSDisms.
Casper H.S. Dik* reports
Michael D'Errico* reports:
According to Andreas Koenig*, under NeXTstep 3.2, both perl4.036 and
perl5.000 compile with the supplied hints file.
However, Bill Eldridge* provides this message to help get perl4.036 on
NeXTstep 3.0 to work:
Many database-oriented extensions to Perl have been written.
Basically, these use the usub mechanism (see the usub/ subdirectory in
the source distribution) to link in a database library, allowing
embedded calls to Informix, Ingres, Interbase, Oracle and Sybase.
Here are the authors of the various extensions:
Buzz Moschetti* has organized a project to create a higher level
interface to allow you to write your queries in a database-independent
fashion. If this type of project interests you, send mail to
<perldb-interest-request@vix.com> and ask to be placed on the
``perldb-interest'' mailing list.
Here's a bit of advertising from Buzz:
The official archive for DBperl extensions is ftp://ftp.demon.co.uk/pub/perl/db
It's the home of the evolving DBperl API Specification.
Here's an extract from the updated README there:
snmperl was written by Guy Streeter (streeter@ingr.com), and was
posted in late February 1993 to comp.protocols.snmp. It can be found
archived at one of two (known) places:
Here is the gist of the README:
USE:
There are four subroutines defined in the callable interface:
snmp_get, snmp_next, snmp_set, and snmp_error.
snmp_get and snmp_next implement the GET and GETNEXT operations,
respectively. The first two calling arguments are the hostname and
Community string. The IP address of the host, as a dotted-quad ASCII
string, may be used as the hostname. The rest of the calling
arguments are a list of variables. See the CMU package documentation
for how variables may be specified.
snmp_set also takes hostname and Community string as arguments. The
remaining arguments are a list of triples consisting of variable name,
variable type, and value. The variable type is a string, such as
``INTEGER'' or ``IpAddress''.
snmp_get, snmp_next, and snmp_set return a list containing
alternating variables and values. snmp_get and snmp_next will simply
omit non-existent variables on return. snmp_set will fail completely
if one of the specified variables does not exist (or is read-only).
snmp_error will return a text string containing some error
information about the most recent snmp_get|next|set call, if it had an
error.
OTHER NOTES:
The changes I made to mib.c involve the formatting of variable values
for return to the caller. I took out the descriptive prefix so the
string contains only the value.
Enumerated types are returned as a string containing the symbolic
representation followed in parentheses by the numeric.
DISTRIBUTION and OWNERSHIP
perl and the CMU SNMP package have their own statements. Read them.
The work I've done is free and clear. Just don't say you wrote it if
you didn't, and don't say I wrote it if you change it.
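As a rough illustration of the return conventions the README describes, here is a Perl5 sketch that turns the alternating variable/value list into an associative array and pulls apart an enumerated value. It does not call snmperl itself: fake_snmp_get is a made-up stand-in so the sketch runs anywhere, and the host address, community string, variable names and values are all invented.

```perl
#!/usr/bin/perl -w
use strict;

# Stand-in for the extension's snmp_get (hypothetical, for illustration only).
# It ignores its arguments and returns canned, alternating variables and values.
sub fake_snmp_get {
    my ($host, $community, @vars) = @_;
    return ('sysName.0', 'gateway', 'ifOperStatus.1', 'up(1)');
}

# Split an enumerated value such as "up(1)" into symbolic and numeric parts.
# Values that are not in that form come back unchanged, with no number.
sub split_enum {
    my ($value) = @_;
    return ($1, $2) if $value =~ /^(.*)\((\d+)\)$/;
    return ($value, undef);
}

# An alternating list assigns straight into an associative array.
my %result = fake_snmp_get('192.0.2.1', 'public', 'sysName.0', 'ifOperStatus.1');

foreach my $var (sort keys %result) {
    my ($symbol, $number) = split_enum($result{$var});
    my $suffix = defined $number ? " [$number]" : '';
    print "$var = $symbol$suffix\n";
}
```

Since snmp_get and snmp_next simply omit non-existent variables, checking for a key in the resulting array is one way to notice that something you asked for did not come back.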
No. Larry thinks it likely that he'll be certified before perl is.
Yes there is: comp.lang.perl.misc.
This group, which currently can get up to 150 messages per day,
contains all kinds of discussions about Perl; everything from bug
reports to new features to the history to humour and trivia. This is
the best source of information about anything Perl related, especially
what's new with Perl5. Because of its vast array of topics, it
functions as both a comp.lang.* style newsgroup (providing technical
information) and also as a rec.* style newsgroup, kind of a support
group for Perl addicts (PerlAnon?). There is also the group comp.lang.perl.announce, a
place specifically for announcements related to perl (new releases,
the FAQ, new modules, etc).
Larry is a frequent poster to this group, as are most (all?) of the
seasoned Perl programmers. Questions will be answered by some of the
most knowledgeable Perl Hackers, often within minutes of a question
being posted (give or take distribution times).
There are a number of books either available or planned. Mostly
chronologically, they are:
This is probably the most well known and most useful book for 4.036 and
earlier. This is part of O'Reilly's hugely successful ``Nutshell Handbook''
series. Besides serving as a reference guide for Perl, it also contains
tutorial material and is a great source of examples and cookbook
procedures, as well as wit and wisdom, tricks and traps, pranks and
pitfalls. The code examples contained therein are available from
ftp://ftp.ora.com/pub/examples/nutshell/programming_perl/perl.tar.Z or
ftp://ftp.cis.ufl.edu/pub/perl/ora/programming_perl. Corrections and
additions to the book can be found in the Perl4 man page right before
the BUGS section under the heading ERRATA AND ADDENDA.
Another of O'Reilly's ``Nutshell Handbooks'', by Randal Schwartz. This book is
a smaller, gentler introduction to perl and is based on Randal's
perl classes. While in general this is a good book for learning perl
(as its title suggests), early printings did contain many typos and don't
cover some of the more interesting features of perl. Please check the
errata sheet at ftp.ora.com, as well as the on-line examples.
If you can't find these books in your local technical bookstore, they
may be ordered directly from O'Reilly by calling 1-800-998-9938 if in
North America and 1-707-829-0515 otherwise.
Johan Vromans* created a beautiful reference guide. The reference
guide comes with the Camel book in a nice, glossy format. The LaTeX
(source) and PostScript (ready to print) versions are available for FTP
from ftp.cs.ruu.nl:/pub/DOC/perlref-4.036.1.tar.Z in Europe or from
ftp.cis.ufl.edu:/pub/perl/doc/perlref-4.036.tar.gz in the United
States. Obsolete versions in TeX or troff may still be available, but
these versions don't print as nicely. See also:
Johan has also updated and released a reference guide based on version
5.000. This is available from the same places as the 4.036 guide.
This version is also available from prep.gnu.ai.mit.edu in the /pub/gnu
section along with the perl5 source. It may be added to the standard
perl5 distribution sometime after 5.002. If you are using version
5.000, you will want to get this version rather than the 4.036 version.
Larry routinely carries around a camel stamp to use when autographing
copies of his book. If you can catch him at a conference you can
usually get him to sign your book for you.
Please note that none of the above books are perfect; all have some
inaccuracies and typos. The two which Larry is directly associated
with (the O'Reilly books) are probably the most technically correct,
but also the most dated. Carefully looking over any book you are
considering purchasing will save you much time, money, and frustration.
Starting with the March 1995 edition of Unix Review, Randal Schwartz* has been
authoring a bi-monthly Perl column. This has so far been an introductory
tutorial.
Larry Wall has published a 3-part article on perl in Unix World
(August through October of 1991), and Rob Kolstad also had a 3-parter
in Unix Review (May through July of 1990). Tom Christiansen also has
a brief overview article in the trade newsletter Unix Technology
Advisor from November of 1989. You might also investigate ``The Wisdom
of Perl'' by Gordon Galligher from SunExpert magazine; April 1991
Volume 2 Number 4. The Dec 92 Computer Language magazine also
contains a cover article on Perl, ``Perl: the Programmers Toolbox''.
Many other articles on Perl have been recently published. If you
have references, especially on-line copies, please mail them to
the FAQ maintainer for inclusion in this notice.
The USENIX LISA (Large Installations Systems Administration) Conferences
have for several years now included many papers on tools written in
Perl. Old proceedings of these conferences are available; look in
your current issue of ``;login:'' or send mail to office@usenix.org
for further information.
Japan seems to be jumping with Perl books. If you can read Japanese,
here are a few you might be interested in. Thanks to Jeffrey Friedl*
and Ken Lunde* for this list. (NOTE: my screen cannot handle Japanese
characters, so this is all in English for the moment. NOTE2: these
books are written in Japanese; the titles below are just translations.)
Title: How to Write Perl (Perl Shohou)
Author: Toshiyuki Masui
Pages: 352 Publisher: ASCII Corporation
Pub. Date: July 1, 1993 ISBN: 4-7561-0281-6
Price: 3200Y Author Email: masui@shocsl.sharp.co.jp
Comments: More advanced than ``Welcome..'' and not meant as an
introduction. Uses the standard perl and has examples for handling
Japanese text.
Title: Introduction to Perl (Nyuumon Perl)
Author: Shinji Kono
Pages: 203 Publisher: ASCII Corporation
Pub. Date: July 11, 1994 ISBN: 4-7561-0292-1
Price: 1800Y Author Email: kono@csl.sony.co.jp
Comments: Uses the interactive Perl debugger to explain how things
work.
Title: Perl Programming
Authors: L Wall & R Schwartz Translator: Yoshiyuki Kondo
Pages: 637+32 Publisher: Softbank Corporation
Pub. Date: February 28, 1993 ISBN: 4-89052-384-7
Price: 4500Y Author Email: cond@lsi-j.co.jp
Comments: Official Japanese translation of the Camel book,
``Programming Perl''. Somewhat laced with translator notes to
explain the humour. The most useful book. Also includes the Perl
Quick Reference -- in Japanese!
As of August, 1995, ORA has contracted with Stephen to handle the
Camel update. According to the accepted timeline, the first draft
is to be finished by the end of April, 1996. The tutorial sections
are being cut some, and the book will take on much more of a reference
style. Don't worry, it will still contain its distinctive humor and
flair.
There are no current plans to update the Llama. For the most part,
it serves as a good introduction for both major versions of perl.
There may be some minor editing to it, but probably nothing major.
If anything, it is more likely that a third book (working title:
Learning More Perl) will be written as a tutorial for the new perl5
paradigm.
Since 1993, several ftp sites have sprung up for Perl and Perl related
items. The site with the biggest repository of Perl scripts right now
seems to be ftp.cis.ufl.edu [128.227.100.198] in /pub/perl. The
scripts directory has an INDEX with over 400 lines in it, each
describing what the script does. The src directory has sources and/or
binaries for a number of different perl ports, including MS-DOS,
Macintosh and Windows/NT. This is maintained by the Computing Staff at
UF*.
Note: European users please use the site src.doc.ic.ac.uk
[149.169.2.1] in /pub/computing/programming/languages/perl/
The link speed would be a lot better for all. Contact
L.McLoughlin@doc.ic.ac.uk for more information. It is updated
daily.
There are also a number of other sites. I'll add more of them as I get
information on them.
[site maintainers: if you want to add a blurb here, especially if you
have something unique, please let me know. -spp]
The Comprehensive Perl Archive Network (CPAN) is in heavy development.
Once the main site and its mirrors are fully operational, this answer
will change to reflect its existence.
The World Wide Web is exploding with new Perl sites all the time. Some
of the more notable ones are:
http://www.cis.ufl.edu/perl
http://www.metronet.com/1h/perlinfo, which has a great section on
Perl5.
http://www.eecs.nwu.edu/perl/perl.html
http://web.nexor.co.uk/perl/perl.html, a great site for European
and UK users.
``Perl-Users'' is the mailing list version of the comp.lang.perl.misc
newsgroup. If you're not lucky enough to be on USENET you can post to
comp.lang.perl.misc by sending to one of the following addresses.
Which one will work best for you depends on which nets your site is
hooked into. Ask your local network guru if you're not certain.
The Perl-Users list is bidirectionally gatewayed with the USENET
newsgroup comp.lang.perl.misc. This means that VIRGINIA functions as a
reflector. All traffic coming in from the non-USENET side is
immediately posted to the newsgroup. Postings from the USENET side are
periodically digested and mailed out to the Perl-Users mailing list. A
digest is created and distributed at least once per day, more often if
traffic warrants.
All requests to be added to or deleted from this list, problems,
questions, etc., should be sent to:
Yes, there are. ftp.cis.ufl.edu:/pub/perl/comp.lang.perl.*/monthly has
an almost complete collection dating back to 12/89 (missing 08/91
through 12/93). They are kept as one large file for each month.
A more sophisticated query and retrieval mechanism is desirable,
preferably one that allows you to retrieve articles using fast-access
indices, keyed on at least author, date, subject, thread (as in ``trn'')
and probably keywords. Right now, the MH pick command works for this,
but it is very slow to select on 18000 articles.
If you have, or know where I can find, the missing sections, please let
perlfaq@perl.com know.
Yes there is. Set your WAIS client to
archive.orst.edu:9000/comp.lang.perl.*. According to their
introduction, they have a complete selection from 1989 on.
Bill Middleton <wjm@feenix.metronet.com> offers this:
and other things to see examples of how other folks have done this
or that. This service is still under construction, but I'd like to
get feedback, if you have some time.
There's also a WaisSearch into all the RFC's and some other fairly
nifty stuff.
There is a #Perl channel on IRC (Internet Relay Chat) where Tom and
Randal have been known to hang out. Here you can get immediate answers
to questions from some of the most well-known Perl Hackers.
The perl5-porters (perl5-porters@nicoh.com) mailing list was created to
aid in communication among the people working on perl5. However, it
has overgrown this function and now also handles a good deal of traffic
about perl internals.
USENIX, LISA, SUG, WCSAS, AUUG, FedUnix and Europen sponsor tutorials
of varying lengths on Perl at the System Administration and General
Conferences. These public classes are typically taught by Tom Christiansen*.
In part, Tom and Randal teach Perl to help keep bread on their tables
while they continue their pro bono efforts of documenting
perl (Tom keeps writing more man pages for it :-) and expanding the
perl toolkit through extension libraries, work which they enjoy doing
as it's fun and helps out the whole world, but which really doesn't
pay the bills. Such is the nature of free(ly available) software.
Send mail to <perlclasses@perl.com> for details and availability.
Tom is also available to teach on-site classes, including courses on
advanced perl and perl5. Classes run anywhere from one-day to week-long
sessions and cover a wide range of subject matter. Classes can
include lab time with exercises, a generally beneficial aspect. If you
would like more information regarding Perl classes or when the next
public appearances are, please contact Tom directly at 1.303.444.3212.
Randal Schwartz* provides a 2-day lecture-only and a 4-5 day lecture-lab
course based on his popular book ``Learning Perl''. For details, contact
Randal directly via email or at 1.503.777.0095.
Internet One provides a 2 day ``Introduction to Perl'' and 2 day
``Advanced Perl'' workshop. The 50% hands-on and 50% lecture format
allows attendees to write several programs themselves. Supplied
are the user manuals, reference copies of Larry Wall's ``Programming
Perl'', and a UNIX directory of all training examples and
labs. To obtain outlines, pricing, or scheduling information, use
labs. To obtain outlines, pricing, or scheduling information, use
the following:
At this time, the known list of companies that ship Perl includes at
least the following, although some have snuck it into /usr/contrib or
its moral equivalent:
BSDI
Comdisco Systems
CONVEX Computer Corporation
Crosspoint Solutions
Data General
Dell
DRD Corporation
IBM (SP systems)
Intergraph
Kubota Pacific
Netlabs
SGI (without taintperl)
Univel
Some companies ship it on their ``User Contributed Software Tape'',
such as DEC and HP. Apple Computer has shipped the MPW version of
Macintosh Perl on one of their Developer CDs (EssentialsToolsObjects
#11) (and they included it under ``Essentials'' :-)
Many other companies use Perl internally for purposes of tools
development, systems administration, installation scripts, and test
suites. Rumor has it that the large workstation vendors (the TLA set)
are seriously looking into shipping Perl with their standard systems
``soon''.
People with support contracts with their vendors are actively
encouraged to submit enhancement requests that Perl be shipped
as part of their standard system. It would, at the very least,
reduce the FTP load on the Internet. :-)
If you know of any others, please send them in.
Not really. Although perl is included in the GNU distribution, at last
check, Cygnus does not offer support for it. However, it's unclear
whether they've ever been offered sufficient financial incentive to do
so. Feel free to try.
On the other hand, you do have comp.lang.perl.misc as a totally gratis
support mechanism. As long as you ask ``interesting'' questions, you'll
probably get plenty of help. :-)
While some vendors do ship Perl with their platforms, that doesn't mean
they support it on arbitrary other platforms. And in fact, all they'll
probably do is forward any bug reports on to Larry. In practice, this
is far better support than you could hope for from nearly any vendor.
If you purchase a product from Netlabs (the company Larry works for),
you actually can get a support contract that includes Perl.
The companies who won't use something unless they can pay money for it
will be left out. Often they're motivated by wanting someone whom they
could sue. If all they want is someone to help them out with Perl
problems, there's always the net. And if they really want to pay
someone for that help, well, any of a number of the regular Perl
``dignitaries'' would appreciate the money. ;-)
If companies want ``commercial support'' for it badly enough, speak up --
something might be able to be arranged.
These are the ``just another perl hacker'' signatures that some people
sign their postings with. About 100 of the earlier ones are
available from the various FTP sites.
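For anyone who has never seen one, the idea is simply a small program that prints the phrase, usually in some needlessly roundabout way. The following is a deliberately tame sketch written for this answer (it is not one of the archived signatures, and the sub name japh is just a label chosen here); real JAPHs tend to be far more devious.

```perl
#!/usr/bin/perl -w
use strict;

# Build the traditional message the long way round: store it backwards
# and reverse it in scalar context before handing it back.
sub japh {
    return scalar reverse ',rekcah lreP rehtona tsuJ';
}

print japh(), "\n";
```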
When people started running out of tricky and interesting JAPHs, some
of them turned to writing ``Will hack perl for ...'' quotes. While
sometimes humorous, they just didn't have the flair of the JAPHs and
have since almost completely vanished.
Over a hundred quips by Larry, from postings of his or source code,
can be found in many of the FTP sites or through the World Wide Web at
"ftp://ftp.cis.ufl.edu/pub/perl/misc/lwall-quotes"
This is NOT a complete list, just some of the more common bugs that
tend to bite people. There are in 5.001:
instead of
Before posting about a bug, please make sure that you are using the
most recent versions of perl (currently 4.036 and 5.001) available.
Please also check at the major archive sites to see if there are any
development patches available (usually named something like
perl5.001a.patch or patch5.001a - the patch itself, or
perl5.001a.tar.gz - a prepatched distribution). If you are not using
one of these versions, chances are you will be told to upgrade because
the bug has already been fixed.
If you are reporting a bug in perl5, the best place to send your bug
is <perlbug@perl.com>, which is currently just an alias for
<perl5-porters@nicoh.com>. In the past, there have been problems with
the perlbug address. If you have problems with it, please send your
bug directly to <perl5-porters@nicoh.com>. You may subscribe to the list
in the customary fashion via mail to <perl5-porters-request@nicoh.com>.
Feel free to post your bugs to the comp.lang.perl.misc newsgroup as
well, but do make sure they still go to the mailing list.
If you are posting a bug with a non-Unix port or a non-standard module
(such as Tk, Sx, etc.), please see the documentation that came with it
to determine the correct place to post bugs.
To enhance your chances of getting any bug you report fixed:
You should post source code to whichever group is most appropriate,
but feel free to cross-post to comp.lang.perl.misc. If you want to
cross-post to alt.sources, please make sure it follows their
posting standards, including setting the Followup-To header
line to NOT include alt.sources; see their FAQ for details.
The perlobj(1) man page is a good place to start, and then you can
check out the excellent perlbot(1) man page written by the dean of perl
o-o himself, Dean Roehrich. Areas covered include the following:
The section on instance variables should prove very helpful to those
wondering how to get data inheritance in perl.
While it used to be deep magic, how to do this is now revealed in the
perlapi(1), perlguts(1), and perlcall(1) man pages, which treat
this matter extensively. You should also check the many extensions
that people have written (see question 1.19), many of which do this
very thing.
Perl.com is just Tom's domain name, registered as dedicated to ``Perl
training and consulting''. While not a full ftp site (he hasn't got
the bandwidth (yet)), it does have some interesting bits, most of which
are replicated elsewhere. It serves as a clearinghouse for certain
perl related mailing lists. The following aliases work:
To keep from cluttering up the FAQ and for easy reference all email
addresses have been collected in this location. For each person
listed, I offer my thanks for their input and help.
Now you can type in any legal Perl code, and it will be immediately
evaluated. You can also examine the symbol table, get stack
backtraces, check variable values, and if you want to, set breakpoints
and do the other things you can do in a symbolic debugger.
While there isn't one included with the perl source distribution (yet)
various folks have written packages that allow you to do at least some
sort of profiling. The strategy usually includes modifying the perl
debugger to handle profiling. Authors of these packages include
Wayne Thompson me@anywhere.EBay.Sun.COM
Ray Lischner lisch@sysserver1.mentor.com
Kresten Krab Thorup krab@iesd.auc.dk
The original articles by these folks containing their profilers are
available at ftp://convex.com/pub/perl/info/profiling.shar.
Recently, Dean Roehrich* has written a profiler for version 5 that
likely will be distributed with the standard release. For now, it
should be available through any of the extension archives as
DProf.tar.gz.
Yes!! It's a version of Berkeley yacc that outputs Perl code instead
of C code! You can get this from
ftp://ftp.sterling.com/local/perl-byacc1.8.2.tar.Z, or send the author
mail for details.
That depends on what you mean. If you want something that works like
vgrind on Perl programs, then the answer is ``yes, nearly''. Here's a
vgrind entry for perl:
David Levine uses this:
If what you mean is whether there is a program that will reformat the
program much as indent(1) will do for C, then the answer is no. The
complex feedback between the scanner and the parser (as in the things
that confuse vgrind) makes it challenging at best to write a stand-alone
Perl parser.
Of course, if you follow the guidelines in perlstyle(1), you shouldn't
need to reformat.
The short answer is: ``No, you can't compile perl into C. Period.''
However, having said that, it is believed that it would be possible to
write a perl to C translator, although it is a PhD thesis waiting to
happen. Anyone need a good challenging thesis?
In the way of further, detailed explication, it seems that the reasons
people want to do this usually break down into one or more of the
following:
You might also look into autoloading functions on the fly, which
can greatly reduce start-up time.
Permission is hereby granted solely to the licensee for use of
this source code in its unaltered state. This source code may
not be modified by licensee except under direction of XYZZY
Inc. Neither may this source code be given under any
circumstances to non-licensees in any form, including source
or binary. Modification of this source constitutes breach of
contract, which voids any potential pending support
responsibilities by XYZZY Inc. Divulging the exact or
paraphrased contents of this source code to unlicensed parties
either directly or indirectly constitutes violation of federal
and international copyright and trade secret laws, and will be
duly prosecuted to the fullest extent permitted under law.
This software is provided by XYZZY Inc. ``as is'' and any
express or implied warranties, including, but not limited to,
the implied warranties of merchantability and fitness for a
particular purpose are disclaimed. In no event shall the
regents or contributors be liable for any direct, indirect,
incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute
goods or services; loss of use, data, or profits; or business
interruption) however caused and on any theory of liability,
whether in contract, strict liability, or tort (including
negligence or otherwise) arising in any way out of the use of
this software, even if advised of the possibility of such
damage.
If you maintain a central site that distributes software to
internal client machines, use rdist(1) to send around a proper
version periodically, perhaps using the -y option on the install
to flag destinations younger than the source.
Let it be noted that in the many, many years that Perl's author
has been releasing and supporting freely redistributable software,
he has NEVER ONCE been bitten by a bogus bug report generated by
someone breaking his code because they had access to it. Rather,
he and many other open software providers (where open software
means that for which the source is provided, the only truly open
software) have saved themselves countless hours of labor thousands
of times over because they've allowed people to inspect the source
for themselves. Proprietary source-code hoarding is its own
headache.
Thus, obscurity for the sake of maintainability would seem to be a
red herring.
Since Emacs version 19 patchlevel 22 or so, there has been both a
perl-mode.el and support for the perl debugger built in. These should
come with the standard Emacs 19 distribution.
In the perl source directory, you'll find a directory called
``emacs'', which contains several files that should help you.
Note that the perl-mode of emacs will have fits with main'foo (single
quote), and mess up the indentation and highlighting. However, note that
in perl5, you should be using main::foo. By the way, did we mention
that you should upgrade?
Daniel Smith <dls@best.com> is working on an interactive Perl shell
called SoftList. It's currently at version 3.0b7a (beta). SoftList
3.0b7a has tcsh-like command line editing, can let you define a file of
aliases so that you can run chunks of perl or UNIX commands, and so
on. You can pick up a copy at ftp.best.com in
/pub/dls/SoftList-3.0b7a.gz.
In release 4 of perl, the only way to do this was to build a
curseperl binary by linking in your C curses library as described in
the usub subdirectory of the perl sources. This requires a modicum of
work, but it will be reasonably fast since it's all in C (assuming you
consider curses reasonably fast :-). Programs written using this
method require the modified curseperl, not vanilla perl, to run.
While this is something of a disadvantage, experience indicates that
it's better to use curseperl than to try to roll your own using
termcap directly.
Fortunately, in version 5, Curses is a dynamically loaded extension by
William Setzer*. You
should be able to pick it up wherever you get Perl 5 from, or at least
these places (expect that the version may change by the time you read
this):
For a good example of using curses with Perl, you might want to pick
up a copy of Steven L Kunz's* ``perl menus'' package (``menu.pl'') via
anonymous FTP from ``ftp.iastate.edu''. It's in the directory /pub/perl
as:
menu.pl is supported on Perl4/curseperl and Perl5/Curses. Complete
user documentation is provided along with several demos and ``beginner
applications''. A menu utility module is provided that is a collection
of useful Perl curses routines (such as ``pop-up'' query boxes) that may
be called from your applications.
Another possibility is to use Henk Penning's cterm package, a curses
emulation library written in perl. cterm is actually a separate
program with which you communicate via a pipe. It is available from
ftp.cs.ruu.nl [131.211.80.17] via anonymous ftp, in the directory
pub/PERL. You may also acquire the package via email in compressed,
uuencoded form by sending a message to mail-server@cs.ruu.nl
containing these lines:
Right now, you have several choices. If you are still using perl4, use
the WAFE or STDWIN packages, or try to make your own usub binding.
However, if you've upgraded to version 5, you have several exciting
possibilities, with more popping up each day. Right now, Tk and Sx
are the best known such extensions.
If you like the tk package, you should get the Tk extension kit,
written by Nick Ing-Simmons*. The official distribution point is at
ftp://ftp.wpi.edu/perl5/private/Tk-b8.tar.gz
but many of the major archive sites now have it in their /ext{ensions}
directory also. Depending upon your location, you may be better off
checking there. Also, understand that the version number may have
changed by the time you read this.
This package replaced the tkperl5 project, by Malcolm Beattie*, which
was based on an older version of Tk, 3.6 as compared to the current
4.X. This package was also known as nTk (new Tk) while it was in the
alpha stages, but has been changed to just Tk now that it is in beta.
Also, be advised that you need at least perl5.001 (preferably 5.002,
when it becomes available) and the official unofficial patches.
You may also use the old Sx package (Athena & Xlib), originally
written by Dominic Giampaolo*, then rewritten for Sx
by Frédéric Chauveau*. It's available from these sites:
WAFE is a package that implements a symbolic interface to the Athena
widgets (X11R5). A typical Wafe application consists in our framework
of two parts: the front-end (we call it Wafe for Widget[Athena]front
end) and an application program running typically as a separate
process. The application program can be implemented in an arbitrary
programming language and talks to the front-end via stdio. Since Wafe
(the front-end) was developed using the extensible TCL shell (cite John Ousterhout), an application program can dynamically submit requests to
the front-end to build up the graphical user interface; the
application can even down-load application specific procedures into
the front-end. The distribution contains sample application programs
in Perl, GAWK, Prolog, TCL, and C talking to the same Wafe binary.
Many of the demo applications are implemented in Perl. Wafe 0.9 can
be obtained via anonymous ftp from
Alternatively, you could use wish from tcl.
Yes -- dynamic loading comes with the distribution. That means that
you no longer need 18 different versions of fooperl floating around.
In fact, all of perl can be stuck into a libperl.so library and
then your /usr/local/bin/perl binary reduced to just 50k or so.
See DynaLoader(3pm) for details.
In perl4, the answer was kinda. One package has been released that does
this, by Roberto Salama*. He writes:
Here is a version of dylperl, dynamic linker for perl. The code here is
based on Oliver Sharp's May 1993 article in Dr. Dobbs Journal (Dynamic
Linking under Berkeley UNIX).
The Makefile assumes that uperl.o is in /usr/local/src/perl/... You
will probably have to change this to reflect your installation. Other
than that, just type 'make'...
The idea behind being able to dynamically link code into perl is that
the linked code should become perl functions, i.e. they can be invoked
as &foo(...). For this to happen, the incrementally loaded code must
use the perl stack, look at sample.c to get a better idea.
The few functions that make up this package are outlined below.
Comments are welcome. I submit this code for public consumption and,
basically, am not responsible for it in any way.
The undump program comes from the TeX distribution. If you have TeX,
then you may have a working undump. If you don't, and you can't get
one, AND you have a GNU emacs working on your machine that can clone
itself, then you might try taking its unexec() function and compiling
Perl with -DUNEXEC, which will make Perl call unexec() instead of
abort(). You'll have to add unexec.o to the objects line in the
Makefile. If you succeed, post to comp.lang.perl.misc about your
experience so others can benefit from it.
If you have a version of undump that works with Perl, please submit
its anon-FTP whereabouts to the FAQ maintainer.
John Dallman* has written a program ``#!perl.exe'' which will do this.
It is available through anonymous ftp from ftp.ee.umanitoba.ca in the
directory /pub/msdos/perl/hbp_30.zip. This program works by finding
the script and perl.exe, building a command line and running perl.exe
as a child process. For more information on this, contact John
directly.
Sure, if they're simple enough. Of course, for most programs,
you'll enter them in a file and call perl on them from your
shell. That way you can go into the hack/execute/debug cycle.
But there are plenty of useful one-liners: see below. (Things
marked perl5 need to be run from v5.000 or better, but the
rest don't care.)
Ok, the last one was actually an obfuscated perl entry. :-)
(Larry wrote) This is a notion out of the Lisp world that says if you
define an anonymous function in a particular lexical context, it
pretends to run in that context even when it's called outside of the
context.
In human terms, it's a funny way of passing arguments to a subroutine
when you define it as well as when you call it. It's useful for
setting up little bits of code to run later, such as callbacks. You
can even do object-oriented stuff with it, though Perl provides a
different mechanism to do that already.
You can also think of it as a way to write a subroutine template without
using eval.
Here's a small example of how this works:
This prints:
This only applies to lexical variables, by the way. Dynamic variables
continue to work as they have always worked. Closure is not something
that most Perl programmers need trouble themselves about to begin with.
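As a minimal sketch of the idea (perl5 only; the names make_adder,
$add_two and $add_ten are made up for illustration and do not come from
Larry's posting), an anonymous subroutine keeps access to the lexical
variable that was in scope when it was created:

```perl
use strict;
use warnings;

# Return an anonymous subroutine that remembers its own copy of $n.
sub make_adder {
    my ($n) = @_;
    return sub { return $_[0] + $n };
}

my $add_two = make_adder(2);
my $add_ten = make_adder(10);

print $add_two->(5), "\n";    # 7
print $add_ten->(5), "\n";    # 15
```

Each call to make_adder creates a fresh $n, so the two subroutines do
not interfere with each other.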
Those are type specifiers:
See the question on arrays of arrays for more about Perl pointers.
While there are a few places where you don't actually need these type
specifiers, except for files, you should always use them. Note that
<FILE> is NOT the type specifier for files; it's the equivalent of awk's
getline function, that is, it reads a line from the handle FILE. When
doing open, close, and other operations besides the getline function on
files, do NOT use the brackets.
Beware of saying:
Normally, files are manipulated something like this (with appropriate
error checking added if it were production code):
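A sketch of the usual pattern, assuming a hypothetical helper named
count_lines and a handle named FILE (neither is from the original
text). The handle appears bare in open() and close(), and inside angle
brackets only when a line is being read:

```perl
use strict;
use warnings;

# Count the lines in the file named by $path.
sub count_lines {
    my ($path) = @_;
    my $count = 0;
    open(FILE, "< $path") or die "can't open $path: $!";
    while (<FILE>) {          # <FILE> reads one line into $_
        $count++;
    }
    close(FILE) or die "can't close $path: $!";
    return $count;
}
```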
Often people request:
Larry's answer is:
You'll be pleased to know that I've been trying real hard to get
rid of unnecessary punctuation in Perl 5. You'll be displeased to
know that I don't think noun markers like $ and @ unnecessary.
Not only do they function like case markers do in human language,
but they are automatically distinguished within interpolative
contexts, and the user doesn't have to worry about different
syntactic treatments for variable references within or without
such a context.
But the & prefix on verbs is now optional, just as ``do'' is in
English. I do hope you do understand what I mean.
For example, you used to have to write this:
It can now be written more cleanly like this:
Strictly speaking, of course, $ and @ aren't case markers, but
number markers. English has mandatory number markers, and people
get upset when they doesn't agree.
It were just convenient in Perl (for the shellish interpolative
reasons mentioned above) to pull the markers out to the front of
each noun phrase. Most people seems to like it that way. It
certainly seem to make more sense than putting them on the end,
like most varieties of BASIC does.
Actually, they don't; all C operators have the same precedence in Perl
as they do in C. The problem is with a class of functions called list
operators, e.g. print, chdir, exec, system, and so on. These are
somewhat bizarre in that they have different precedence depending on
whether you look on the left or right of them. Basically, they gobble
up all things on their right. For example,
will unlink all those file names. A common mistake is to write:
The problem is that this gets interpreted as
To avoid this problem, you can always make them look like function calls
or use an extra level of parentheses:
In perl5, there are low precedence ``and'', ``or'', and ``not'' operators,
which bind less tightly than comma. This allows you to write:
Sometimes you actually do care about the return value:
Yes, print() returns I/O success. That means
returns 5 times whether printing (2+4) succeeded, and
See the perlop(1) man page's section on Precedence for more gory details,
and be sure to use the -w flag to catch things like this.
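To make the pitfall concrete, here is a small sketch using unlink. The
file names are hypothetical and are assumed not to exist, so nothing is
actually removed:

```perl
use strict;
use warnings;

my @files = ('/nonexistent/a', '/nonexistent/b');   # hypothetical paths

# Misleading: || binds more tightly than the list operator, so this is
#   unlink($files[0], ($files[1] || die "unlink failed"));
# and the die depends on $files[1], not on whether unlink succeeded.
#
#   unlink $files[0], $files[1] || die "unlink failed";

# Function-call style: the parentheses end the argument list.
my $removed = unlink($files[0], $files[1]);
$removed == @files or warn "only removed $removed files\n";

# perl5 only: the low-precedence "or" applies to the whole unlink.
unlink @files or warn "nothing removed: $!\n";
```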
One very important thing to be aware of is that if you start thinking
of Perl's $, @, %, and & as just flavored versions of C's * operator,
you're going to be sorry. They aren't really operators, per se,
even if you do think of them that way. In C, if you write
is really
(by which I actually mean)
and not
See the difference? If not, check out perlref(1) for gory details.
Notice that the variables declared with my() are visible only within
the scope of the block which names them. They are not visible outside
of this block, not even in routines or blocks that it calls. local()
variables, on the other hand, are visible to routines that are called
from the block where they are declared. Neither is visible after the
end (the final closing curly brace) of the block at all.
Oh, lexical variables are only available in perl5. Have we mentioned
yet that you might consider upgrading? :-)
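The difference can be sketched as follows (perl5 only). The subroutine
names are invented for this illustration, and the package variable is
written as $main::x to keep it distinct from the lexical $x:

```perl
use strict;
use warnings;

$main::x = "global";

# show() always looks at the package variable $main::x.
sub show { return $main::x }

sub with_local {
    local $main::x = "local";   # dynamic: visible to subs called from here
    return show();
}

sub with_my {
    my $x = "lexical";          # lexical: visible only inside this block
    return show();
}

print with_local(), "\n";   # local
print with_my(), "\n";      # global
print show(), "\n";         # global
```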
5.000 answer:
This only matters when you're making subroutines yourself, at least
so far. This will give you shallow binding:
When you call &$coderef(), it will get whatever dynamic $x happens
to be around when invoked. However, you can get the other behaviour
this way:
Now you'll access the lexical variable $x which is set to the
time the subroutine was created. Note that the difference in these
two behaviours can be considered a bug, not a feature, so you should
in particular not rely upon shallow binding, as it will likely go
away in the future. See perlref(1).
5.001 Answer:
Perl will always give deep binding to functions, so you don't need the
eval hack anymore. Furthermore, functions and even formats
lexically declared nested within another lexical scope have access to
that scope.
See the question on ``What's a closure?''
The most efficient way is using pack and unpack. This is faster than
using substr. Here is a sample chunk of code to break up and put back
together again some fixed-format input lines, in this case, from ps.
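As a rough sketch of the technique, here is a made-up three-column
layout rather than the real output format of any particular ps; the
column widths and variable names are assumptions for illustration only:

```perl
use strict;
use warnings;

# Hypothetical layout: 8-char user, 6-char pid, the rest is the command.
my $template = "A8 A6 A*";

# Build a sample input line in that layout.
my $line = sprintf("%-8s%-6s%s", "root", 123, "/sbin/init");

# Break it up; the "A" format strips trailing blanks from each field.
my ($user, $pid, $command) = unpack($template, $line);
print "$user|$pid|$command\n";    # root|123|/sbin/init

# Put it back together; "A" pads each field with blanks again.
my $rebuilt = pack($template, $user, $pid, $command);
print "round trip ok\n" if $rebuilt eq $line;
```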
You must use the type-globbing *VAR notation. Here is some code to
cat an include file, calling itself recursively on nested local
include files (i.e. those with #include "file", not #include <file>):
If you want finer granularity than 1 second (as usleep() provides) and
have itimers and syscall() on your system, you can use the following.
You could also use select().
It takes a floating-point number representing how long to delay until
you get the SIGALRM, and returns a floating-point number representing
how much time was left in the old timer, if any. Note that the C
function uses integers, but this one doesn't mind fractional numbers.
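The select() alternative is the simpler of the two. A minimal sketch,
where nap is a made-up name and the actual resolution depends on your
system:

```perl
use strict;
use warnings;

# Sleep for a possibly fractional number of seconds using the
# four-argument select() with no filehandles to wait on.
sub nap {
    my ($seconds) = @_;
    select(undef, undef, undef, $seconds);
    return;
}

nap(0.25);    # pause for roughly a quarter of a second
```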
Perl's exception-handling mechanism is its eval operator. You
can use eval as setjmp and die as longjmp. Here's an example
of Larry's for timed-out input, which in C is often implemented
using setjmp and longjmp:
Here's an example of Tom's for doing atexit() handling:
You can register your own routines via the &atexit function now. You
might also want to use the &realcode method of Larry's rather than
embedding all your code in the here-is document. Make sure to leave
via die rather than exit, or write your own &exit routine and call
that instead. In general, it's better for nested routines to exit
via die rather than exit for just this reason.
In Perl5, it is easy to set this up because of the automatic processing
of per-package END functions. These work much like they would in awk.
See perlfunc(1), perlmod(1) and perlrun(1).
Eval is also quite useful for testing for system dependent features,
like symlinks, or using a user-input regexp that might otherwise
blow up on you.
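Both uses can be sketched briefly. The helper name valid_pattern is an
invention for this example; the symlink test follows the idiom shown in
the perlfunc man page:

```perl
use strict;
use warnings;

# Return 1 if $pattern compiles as a regexp, 0 otherwise. The eval
# traps the fatal error that an invalid pattern would otherwise raise.
sub valid_pattern {
    my ($pattern) = @_;
    my $ok = eval { "" =~ /$pattern/; 1 };
    return $ok ? 1 : 0;
}

print valid_pattern('fo+bar'), "\n";     # 1
print valid_pattern('foo(bar'), "\n";    # 0

# symlink() raises a fatal error on systems that do not implement it,
# so the eval only returns true where symlinks are available.
my $has_symlinks = eval { symlink("", ""); 1 } ? 1 : 0;
```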
Perl allows you to trap signals using the %SIG associative array.
Using the signals you want to trap as the key, you can assign a
subroutine to that signal. The %SIG array will only contain those
values which the programmer defines. Therefore, you do not have to
assign all signals. For example, to exit cleanly from a ^C:
There are two special ``routines'' for signals called DEFAULT and IGNORE.
DEFAULT erases the current assignment, restoring the default value of
the signal. IGNORE causes the signal to be ignored. In general, you
don't need to remember these as you can emulate their functionality
with standard programming features. DEFAULT can be emulated by
deleting the signal from the array and IGNORE can be emulated by any
undeclared subroutine.
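A small sketch of these assignments in perl5 syntax. The $interrupted
flag is a hypothetical name; a real program would check it in its main
loop and clean up before exiting:

```perl
use strict;
use warnings;

my $interrupted = 0;

# Install a handler for SIGINT (what ^C sends on most Unix terminals).
$SIG{INT} = sub { $interrupted = 1 };

# ... the main loop would test $interrupted here and exit cleanly ...

$SIG{INT} = 'IGNORE';     # ignore ^C entirely
$SIG{INT} = 'DEFAULT';    # restore the system's default behaviour
```

Setting a flag rather than doing the clean-up inside the handler keeps
the handler itself as short as possible.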
In 5.001, the $SIG{__WARN__} and $SIG{__DIE__} handlers may be used to
intercept die() and warn(). For example, here's how you could promote
uninitialized variables to trigger a fatal error rather than merely complaining:
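One way this might look is sketched below. It assumes the warning text
contains the word "uninitialized", which is how perl words it, and it
passes every other warning through unchanged:

```perl
use strict;
use warnings;

# Promote "uninitialized value" warnings to fatal errors.
$SIG{__WARN__} = sub {
    my ($message) = @_;
    die $message if $message =~ /uninitialized/i;
    warn $message;    # the hook is disabled while it runs, so no recursion
};
```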
Perl only understands octal and hex numbers as such when they occur
as literals in your program. If they are read in from somewhere and
assigned, then no automatic conversion takes place. You must
explicitly use oct() or hex() if you want this kind of thing to happen.
Actually, oct() knows to interpret both hex and octal numbers, while
hex only converts hexadecimal ones. For example:
Without the octal conversion, a requested mode of 755 would turn
into 01363, yielding bizarre file permissions of --wxrw--wt.
If you want something that handles decimal, octal, and hex input,
you could follow the suggestion in the man page and use:
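A short sketch of that approach, wrapped in a helper whose name
(to_number) is made up for this example:

```perl
use strict;
use warnings;

# Convert user input to a number: a leading 0x means hex, a leading 0
# means octal, and anything else is taken as decimal.
sub to_number {
    my ($val) = @_;
    $val = oct($val) if $val =~ /^0/;
    return $val + 0;
}

print to_number("0755"), "\n";    # 493
print to_number("0x1f"), "\n";    # 31
print to_number("755"), "\n";     # 755

# When the input is known to be a file mode, always treat it as octal.
printf "%o\n", oct("755");        # 755
```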
If the dates are in an easily parsed, predetermined format, then you
can break them up into their component parts and call &timelocal from
the distributed perl library. If the date strings are in arbitrary
formats, however, it's probably easier to use the getdate program from
the Cnews distribution, since it accepts a wide variety of dates. Note
that in either case the return values you will really be comparing will
be the total time in seconds as returned by time().
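For the easily parsed case, a sketch using the perl5 Time::Local
module. The YYYY-MM-DD input format and the name date_to_seconds are
assumptions made for this example:

```perl
use strict;
use warnings;
use Time::Local;

# Convert "YYYY-MM-DD" into seconds since the epoch (local midnight).
sub date_to_seconds {
    my ($date) = @_;
    my ($year, $month, $day) = $date =~ /^(\d{4})-(\d\d)-(\d\d)$/
        or die "unparseable date: $date";
    # timelocal() wants the month numbered from 0.
    return timelocal(0, 0, 0, $day, $month - 1, $year);
}

my $earlier = date_to_seconds("1994-10-16");
my $later   = date_to_seconds("1995-03-01");
print "in order\n" if $earlier < $later;
```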
Here's a getdate function for perl that's not very efficient; you can
do better than this by sending it many dates at once or modifying
getdate to behave better on a pipe. Beware the hardcoded pathname.
You can also get the GetDate extension module that's actually the C
code linked into perl from wherever fine Perl extensions are given
away. It's about 50x faster. If you can't find it elsewhere, I
usually keep a copy on perl.com for ftp, since I (Tom) ported it.
Richard Ohnemus <Rick_Ohnemus@Sterling.COM> actually has a getdate.y for
use with the Perl yacc (see question 3.3 "Is there a yacc for Perl?").
You might also consider using these:
You probably want 'getdate.shar'... these and other files can be ftp'd
from the /pub/perl/scripts directory on ftp.cis.ufl.edu. See the README
file in the /pub/perl directory for time and the European mirror site
details.
Here's an example of a Julian Date function provided by Thomas R. Kimpton*.
Perl does not have an explicit round function. However, it is very
simple to create a rounding function. Since the int() function simply
removes the decimal value and returns the integer portion of a number,
you can use
If you examine what this function is doing, you will see that any
number whose fractional part is .5 or greater will be increased to the
next highest integer, and any number whose fractional part is less than
.5 will remain the current integer, which has the same effect as rounding.
A slightly better solution, one which handles negative numbers as well,
might be to change the return (above) to:
which will modify the .5 to be either positive or negative, based on
the number passed into it.
If you wish to round to a specific significant digit, you can use the
printf function (or sprintf, depending upon the situation), which does
proper rounding automatically. See the perlfunc man page for more
information on the (s)printf function.
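Putting the pieces together, a rounding routine along these lines might
look like the sketch below. It uses the <=> operator to give the .5 the
same sign as its argument, which is one way of doing what is described
above:

```perl
use strict;
use warnings;

# Round to the nearest integer, moving halves away from zero.
sub round {
    my ($number) = @_;
    return int($number + .5 * ($number <=> 0));
}

print round(2.4), "\n";     # 2
print round(2.6), "\n";     # 3
print round(-2.6), "\n";    # -3

# Rounding to two decimal places with sprintf.
print sprintf("%.2f", 3.14159), "\n";    # 3.14
```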
Version 5 includes a POSIX module which defines the standard C math
library functions, including floor() and ceil(). floor($num) returns
the largest integer not greater than $num, while ceil($num) returns the
smallest integer not less than $num. For example:
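A brief illustration (perl5 only), showing that the two functions
differ from int() for negative numbers:

```perl
use strict;
use warnings;
use POSIX qw(floor ceil);

print floor(3.7), "\n";     # 3
print ceil(3.2), "\n";      # 4
print floor(-3.7), "\n";    # -4
print ceil(-3.2), "\n";     # -3
print int(-3.7), "\n";      # -3, since int() truncates toward zero
```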
Post it to comp.lang.perl.misc and ask Tom or Randal a question about
it. ;)
Because Perl so lends itself to a variety of different approaches for
any given task, a common question is which is the fastest way to code a
given task. Since some approaches can be dramatically more efficient
than others, it's sometimes worth knowing which is best.
Unfortunately, the implementation that first comes to mind, perhaps as
a direct translation from C or the shell, often yields suboptimal
performance. Not all approaches have the same results across different
hardware and software platforms. Furthermore, legibility must
sometimes be sacrificed for speed.
While an experienced perl programmer can sometimes eye-ball the code
and make an educated guess regarding which way would be fastest,
surprises can still occur. So, in the spirit of perl programming
being an empirical science, the best way to find out which of several
different methods runs the fastest is simply to code them all up and
time them. For example:
Perl5 includes a new module called Benchmark.pm. You can now simplify
the code to use the Benchmarking, like so:
It will output something that looks similar to this:
For example, the following code will show the time difference between
three different ways of assigning the first character of a string to
a variable:
The results will be returned like this:
For more specific tips, see the section on Efficiency in the
``Other Oddments'' chapter at the end of the Camel Book.
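As a sketch of the empirical approach, the following uses timeit() and
timestr() from the perl5 Benchmark module to compare three ways of
taking the first character of a string. The test string and iteration
count are arbitrary choices, and the timings will differ from machine
to machine:

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

my $string = "just another perl hacker";
my $count  = 100_000;    # arbitrary iteration count for this sketch

# Three candidate implementations, keyed by a descriptive name.
my %candidates = (
    substr => sub { my $c = substr($string, 0, 1) },
    regexp => sub { my ($c) = $string =~ /^(.)/ },
    unpack => sub { my $c = unpack("A1", $string) },
);

foreach my $name (sort keys %candidates) {
    my $elapsed = timeit($count, $candidates{$name});
    print "$name: ", timestr($elapsed), "\n";
}
```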
You don't have to quote strings that can't mean anything else in the
language, like identifiers with any upper-case letters in them.
Therefore, it's fine to do this:
but you can't get away with this:
in place of
The requirements on semicolons have been increasingly relaxed. You no
longer need one at the end of a block, but stylistically, you're better
to use them if you don't put the curly brace on the same line:
is ok, as is
but you probably shouldn't do this:
because you might want to add lines later, and anyway, it looks
funny. :-)
Actually, I lied. As of 5.001, there are two autoquoting contexts:
Variable suicide is a nasty side effect of dynamic scoping and the way
variables are passed by reference. If you say
Then you have just clobbered $_[0]! Why this is occurring is pretty
heavy wizardry: the reference to $x stored in $_[0] was temporarily
occluded by the previous local($x) statement (which, you'll recall,
occurs at run-time, not compile-time). The work around is simple,
however: declare your formal parameters first:
That doesn't help you if you're going to be trying to access @_
directly after the local()s. In this case, careful use of the package
facility is your only recourse.
Another manifestation of this problem occurs due to the magical nature
of the index variable in a foreach() loop.
What's happening here is that $m is an alias for each element of @num.
Inside &ug, you temporarily change $m. Well, that means that you've
also temporarily changed whatever $m is an alias to!! The only
workaround is to be careful with global variables, using packages,
and/or just be aware of this potential in foreach() loops.
The perl5 static autos via my() do not exhibit this problem.
This is a bug in 4.035. While in general it's merely a cosmetic
problem, it often comanifests with a highly undesirable coredumping
problem. Programs known to be affected by the fatal coredump include
plum and pcops. This bug has been fixed since 4.036. It did not
resurface in 5.001.
While the $^ variable contains the name of the current header format,
there is no corresponding mechanism to automatically do the same thing
for a footer. Not knowing how big a format is going to be until you
evaluate it is one of the major problems.
If you have a fixed-size footer, you can get footers by checking for
lines left on page ($-) before each write, and printing the footer
yourself if necessary.
Another strategy is to open a pipe to yourself, using open(KID, "|-")
and always write()ing to the KID, who then postprocesses its STDIN to
rearrange headers and footers however you like. Not very convenient,
but doable.
See the perlform(1) man page for other tricks.
This is caused by a strange occurrence that is often dubbed ``feeping
creaturism''. Larry is always adding one more feature, always getting
Perl to handle one more problem. Hence, it keeps growing. Once you've
worked with perl long enough, you will probably start to do the same
thing. You will then notice this problem as you see your scripts
becoming larger and larger.
Oh, wait... you meant a currently running program and its stack size.
Mea culpa, I misunderstood you. ;) While there may be a real memory
leak in the Perl source code or even whichever malloc() you're using,
common causes are incomplete eval()s or local()s in loops.
An eval() which terminates in error due to a failed parsing will leave
a bit of memory unusable.
A local() inside a loop:
will build up 100 versions of @array before the loop is done. The
work-around is:
This local array behaviour has been fixed for perl5, but a failed
eval() still leaks.
One other possibility, due to the way reference counting works, is
when you've introduced a circularity in a data structure that would
normally go out of scope and be unreachable. For example:
When $x goes out of scope, the memory can't be reclaimed, because
there's still something pointing to $x (itself, in this case). A
full garbage collection system could solve this, but at the cost
of a great deal of complexity in perl itself and some inevitable
performance problems as well. If you're making a circular data
structure that you want freed eventually, you'll have to break the
self-reference links yourself.
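Here is a small perl5 sketch of breaking such a cycle by hand. The Node package, its peer field, and the DESTROY counter are all hypothetical; they exist only so you can see that both objects really are freed once one link is removed.

```perl
use strict;
use warnings;

package Node;
our $destroyed = 0;
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class; }
sub DESTROY { $Node::destroyed++ }    # count objects as they are freed

package main;

{
    my $left  = Node->new('left');
    my $right = Node->new('right');
    $left->{peer}  = $right;          # left points at right...
    $right->{peer} = $left;           # ...and right points back: a cycle

    delete $left->{peer};             # break one link before leaving scope
}

print "nodes freed: $Node::destroyed\n";
```

If you comment out the delete, neither DESTROY runs until the program exits.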
Yes, you can, since Perl has access to sockets. An example of the rup
program written in Perl can be found in the script ruptime.pl at the
scripts archive on ftp.cis.ufl.edu. I warn you, however, that it's not
a pretty sight, as it uses nothing from h2ph or c2ph, so everything is
utterly hard-wired.
Some System V based systems, notably Solaris 2.X, redefined some of the
standard socket constants. Since these were constant across all
architectures, they were often hardwired into the perl code. The
``proper'' way to deal with this is to make sure that you run h2ph
against sys/socket.h, require that file and use the symbolic names
(SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, SOCK_RDM, and SOCK_SEQPACKET).
Note that even though SunOS 4 and SunOS 5 are binary compatible, these
values are different, and require a different socket.ph for each OS.
Under version 5, you can also ``use Socket'' to get the proper values.
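A minimal perl5 sketch of the ``use Socket'' route follows. It just prints the values your system uses and opens a throwaway TCP socket; passing 0 as the protocol (meaning "the default for this socket type") is a choice made for this example.

```perl
use strict;
use warnings;
use Socket;    # exports PF_INET, SOCK_STREAM, SOCK_DGRAM, ...

# Whatever these print on your system, the symbolic names stay portable.
printf "SOCK_STREAM=%d SOCK_DGRAM=%d\n", SOCK_STREAM, SOCK_DGRAM;

if (socket(my $sock, PF_INET, SOCK_STREAM, 0)) {
    print "socket created with symbolic constants\n";
    close($sock);
} else {
    warn "socket: $!";
}
```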
From the manual:
Now you can freely use /$pattern/ without fear of any unexpected meta-
characters in it throwing off the search. If you don't know whether a
pattern is valid or not, enclose it in an eval to avoid a fatal run-
time error.
Perl5 provides a vastly improved way of doing this. Simply put the
new quotemeta escape (\Q) in front of the variable inside your pattern.
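As a rough perl5 illustration, here is \Q next to the quotemeta() function it corresponds to. The sample strings are invented for the example.

```perl
use strict;
use warnings;

my $pattern = 'a.b*c';                  # contains two metacharacters

my $literal_hit = ('xxa.b*cxx' =~ /\Q$pattern\E/) ? 1 : 0;  # matches the literal text
my $plain_hit   = ('aXbbbc'    =~ /$pattern/)     ? 1 : 0;  # unquoted: . and * are live
my $quoted_miss = ('aXbbbc'    =~ /\Q$pattern\E/) ? 1 : 0;  # quoted: no longer matches

my $quoted = quotemeta($pattern);       # the same escaping, as a function call

print "$literal_hit $plain_hit $quoted_miss $quoted\n";
```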
Remember that the substr() function produces an lvalue, that is, it may
be assigned to. Therefore, to change the first character to an S, you
could do this:
This assumes that $[ is 0; for a library routine where you can't know
$[, you should use this instead:
To do things like translation of the first part of a string, use
substr, as in:
If you don't know the length of what to translate, something like this
works:
although in this case, it runs more slowly than does the previous
example.
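The listings for this answer are missing here, so the following is a sketch of the substr-as-lvalue idea rather than the original code. It assumes $[ is 0, and the sample strings are made up.

```perl
use strict;
use warnings;

my $name = 'perl';
substr($name, 0, 1) = 'S';                    # assign to the first character

my $title = 'hello world';
substr($title, 0, 5) =~ tr/a-z/A-Z/;          # translate only the first 5 chars

my $line = 'first second';
if ($line =~ /^(\S+)/) {                      # length not known in advance
    substr($line, 0, length($1)) =~ tr/a-z/A-Z/;
}

print "$name | $title | $line\n";
```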
If you want a count of a certain character (X) within a string, you can
use the tr/// function like so:
This is fine if you are just looking for a single character. However,
if you are trying to count multiple character substrings within a
larger string, tr/// won't work. What you can do is wrap a while loop
around a pattern match.
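Both counting approaches can be sketched like this; the strings and the "abc" substring are placeholders.

```perl
use strict;
use warnings;

# Single character: tr/// returns the number of characters it matched.
my $string = 'aXbXXc';
my $count  = ($string =~ tr/X//);      # the string itself is left unchanged

# Multi-character substring: loop over a /g match and count the hits.
my $text = 'abcabcab';
my $hits = 0;
$hits++ while $text =~ /abc/g;

print "$count $hits\n";
```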
No, or at least, not by themselves.
Regexps just aren't powerful enough. Although Perl's patterns aren't
strictly regular because they do backreferencing (the \1 notation), you
still can't do it. You need to employ auxiliary logic. A simple
approach would involve keeping a bit of state around, something
vaguely like this (although we don't handle patterns on the same line):
A rather more elaborate subroutine to pull out balanced and possibly
nested single chars, like ` and ', { and }, or ( and ) can be found
on convex.com in /pub/perl/scripts/pull_quotes.
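To give a flavor of the "bit of state" idea, here is a sketch that keeps a depth counter while walking the string one character at a time. It is not the pull_quotes routine mentioned above; the sub name balanced_spans is hypothetical, it only handles ( and ), and it quietly skips unmatched closers.

```perl
use strict;
use warnings;

# Return every outermost balanced (...) group found in $text.
sub balanced_spans {
    my ($text) = @_;
    my ($depth, $start, @spans) = (0, 0);
    for my $i (0 .. length($text) - 1) {
        my $ch = substr($text, $i, 1);
        if ($ch eq '(') {
            $start = $i if $depth == 0;     # remember where a group opens
            $depth++;
        }
        elsif ($ch eq ')' && $depth > 0) {
            $depth--;
            push @spans, substr($text, $start, $i - $start + 1)
                if $depth == 0;             # group closed at the top level
        }
    }
    return @spans;
}

print join('|', balanced_spans('a (b (c) d) e (f)')), "\n";
```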
The basic idea behind regexps being greedy is that they will match the
maximum amount of data that they can, sometimes resulting in incorrect
or strange answers.
For example, I recently came across something like this:
This code was supposed to match everything between a set of
parentheses. The expected output was:
However, the backreference ($1) ended up containing "is) an (example",
clearly not what was intended.
In perl4, the way to stop this from happening is to use a negated
group. If the above example is rewritten as follows, the results are
correct:
In perl5 there is a new minimal matching metacharacter, '?'. This
character is added to the normal metacharacters to modify their
behaviour, such as ``*?'', ``+?'', or even ``??''. The example would now be
written in the following style:
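The three variants side by side, as a sketch. The sample string is an assumption, reconstructed so that the greedy match yields the "is) an (example" result quoted above.

```perl
use strict;
use warnings;

my $text = 'This (is) an (example) of multiple parens';

my ($greedy)  = $text =~ /\((.*)\)/;      # grabs as much as it can
my ($negated) = $text =~ /\(([^)]*)\)/;   # perl4 style: negated character class
my ($minimal) = $text =~ /\((.*?)\)/;     # perl5 style: minimal match
my @all       = $text =~ /\((.*?)\)/g;    # every parenthesized piece

print "$greedy\n$negated\n$minimal\n@all\n";
```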
Hint: This new operator leads to a very elegant method of stripping
comments from C code:
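A sketch of that kind of one-liner, under perl5 only. The /s modifier lets . cross newlines so multi-line comments go too. The sample C text is made up, and note that this simple form does not protect comment-like text inside string literals.

```perl
use strict;
use warnings;

my $code = "int a; /* one */ int b; /* two\n lines */ int c;";

(my $stripped = $code) =~ s{/\*.*?\*/}{}gs;   # minimal match up to the nearest */

print "$stripped\n";
```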
Since we're talking about how to strip comments under perl5, now is a
good time to talk about doing it in perl4. Since comments can be
embedded in strings, or look like function prototypes, care must be
taken to ignore these cases. Jeffrey Friedl* proposes the following
two programs to strip C comments and C++ comments respectively:
C comments:
C++ comments:
(Yes, Jeffrey says, those are complete programs to strip comments
correctly.)
I'm trying to split a string that is comma delimited into its different
fields. I could easily use split(/,/), except that I need to not split
if the comma is inside quotes. For example, my data file has a line
like this:
Due to the restriction of the quotes, this is a fairly complex
solution. However, we thankfully have Jeff Friedl* to handle these for
us. He suggests (assuming that your data is contained in the special
variable $_):
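The original listing did not survive here. What follows is a sketch of one regex-based approach to the problem, not necessarily the code that was suggested. The sub name split_csv and the sample line are invented, and this version drops empty unquoted fields, which may or may not suit your data.

```perl
use strict;
use warnings;

# Split one comma-delimited line, keeping commas that sit inside "quotes".
sub split_csv {
    my ($line) = @_;
    my @fields;
    # Either a double-quoted field (backslash escapes allowed) or a
    # run of non-comma characters.
    push(@fields, defined($1) ? $1 : $3)
        while $line =~ m/"([^"\\]*(\\.[^"\\]*)*)"|([^,]+)/g;
    return @fields;
}

my @fields = split_csv('SAR001,"Cimetrix, Inc",N,8,"Error, Core Dumped"');
print join(' | ', @fields), "\n";
```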
Well, it does. The thing to remember is that local() provides an array
context, and that the <FILE> syntax in an array context will read all the
lines in a file. To work around this, use:
You can use the scalar() operator to cast the expression into a scalar
context:
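A perl5 sketch of the difference follows. It reads from an in-memory filehandle so it can run anywhere; the sub names and the three-line sample are hypothetical.

```perl
use strict;
use warnings;

# List context: reads every remaining line, keeps only the first.
sub first_line_list   { my ($fh) = @_; local($_) = <$fh>;         return $_; }

# Forced scalar context: reads exactly one line.
sub first_line_scalar { my ($fh) = @_; local($_) = scalar(<$fh>); return $_; }

my $data = "first\nsecond\nthird\n";

open(my $fh, '<', \$data) or die "open: $!";
my $got_list        = first_line_list($fh);
my $rest_after_list = <$fh>;              # nothing left to read
close($fh);

open($fh, '<', \$data) or die "open: $!";
my $got_scalar        = first_line_scalar($fh);
my $rest_after_scalar = <$fh>;            # the second line is still there
close($fh);

print defined $rest_after_list ? "list: more\n" : "list: drained\n";
print "scalar: next is $rest_after_scalar";
```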
You should check out the Frequently Asked Questions list in
comp.unix.* for things like this: the answer is essentially the same.
It's very system dependent. Here's one solution that works on BSD
systems:
Under perl5, you should look into getting the ReadKey extension from
your regular perl archive.
A closely related question to the no-echo question below is how to
input a single character from the keyboard. Again, this is a system
dependent operation. As with the previous question, you probably want
to get the ReadKey extension. The following code may or may not help
you. It should work on both SysV and BSD flavors of UNIX:
You could also handle the stty operations yourself for speed if you're
going to be doing a lot of them. This code works to toggle cbreak
and echo modes on a BSD system:
Note that this is one of the few times you actually want to use the
getc() function; it's in general way too expensive to call for normal
I/O. Normally, you just use the <FILE> syntax, or perhaps the read()
or sysread() functions.
For perspectives on more portable solutions, use anon ftp to retrieve
the file /pub/perl/info/keypress from convex.com.
Under Perl5, with William Setzer's Curses module, you can call
&Curses::cbreak() and &Curses::nocbreak() to turn cbreak mode on and
off. You can then use getc() to read each character. This should work
under both BSD and SVR systems. If anyone can confirm or deny
(especially William), please contact the maintainers.
For DOS systems, Dan Carson <dbc@tc.fluke.COM> reports:
To put the PC in ``raw'' mode, use ioctl with some magic numbers gleaned
from msdos.c (Perl source file) and Ralf Brown's interrupt list (comes
across the net every so often):
Then to read a single character:
And to put the PC back to ``cooked'' mode:
So now you have $c. If ord($c) == 0, you have a two byte code, which
means you hit a special key. Read another byte with sysread(STDIN,$c,1),
and that value tells you what combination it was according to this
table:
This is all trial and error I did a long time ago, I hope I'm reading the
file that worked.
Terminal echoing is generally handled directly by the shell.
Therefore, there is no direct way in perl to turn echoing on and off.
However, you can call the command "stty [-]echo". The following will
allow you to accept input without it being echoed to the screen, for
example as a way to accept passwords (error checking deleted for
brevity):
Again, under perl 5, you can use Curses and call &Curses::noecho() and
&Curses::echo() to turn echoing off and on. Or, there's always the
ReadKey extension.
Yes, there is. Using the substitution command, you can match the
blanks and replace it with nothing. For example, if you have the
string " String " you can use this:
or even
Note however that Jeffrey Friedl* says these are only good for shortish
strings. For longer strings, and worst-case scenarios, they tend to
break down and become inefficient.
For the longer strings, he suggests using either
It should also be noted that for generally nice strings, these tend to
be noticeably slower than the simple ones above. It is suggested that
you use whichever one will fit your situation best, understanding that
the first examples will work in roughly every situation known even if
slow at times.
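For reference, the simple two-substitution form looks something like this when wrapped in a sub. The name trim is hypothetical, and this is the "shortish strings" variety, not one of the tuned alternatives.

```perl
use strict;
use warnings;

# Strip leading and trailing whitespace from a copy of the argument.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;     # leading blanks
    $s =~ s/\s+$//;     # trailing blanks
    return $s;
}

print '[', trim('  String  '), "]\n";
```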
This one will do it for you:
The reason you can't just do
You could have written that
Placed in a function:
This is especially important when you're going to unpack
an ascii string that might have tabs in it. Otherwise you'll be
off on the byte count. For example:
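Here is a sketch of tab expansion done with the column position taken into account. The sub name expand_tabs is hypothetical, tab stops are assumed to fall every 8 columns, and the input is assumed to be a single line with no embedded newline.

```perl
use strict;
use warnings;

# Expand tabs to 8-column tab stops in one line of text.
sub expand_tabs {
    my ($line) = @_;
    # Each pass expands the leftmost run of tabs. Everything before it is
    # already tab-free, so its length is the current column.
    1 while $line =~
        s/^([^\t]*)(\t+)/$1 . ' ' x (length($2) * 8 - length($1) % 8)/e;
    return $line;
}

my $expanded = expand_tabs("ab\tcd\t\tx");
printf "%d %d %d\n", index($expanded, 'cd'), index($expanded, 'x'), length($expanded);
```

With the tabs gone, unpack() templates that count bytes line up with what you see on the screen.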
Well, nothing precisely, but it's not a good way to write
maintainable code. It's just fine to use grep when you want
an answer, like
But using it in a void context like this:
Is using it for its side-effects, and side-effects can be mystifying.
There's no void grep that's not better written as a for() loop:
In the same way, a ?: in a void context is considered poor form:
When you can write it this way:
Of course, using ?: in expressions is just what it's made for,
and just fine (but try not to nest them.).
Remember that the most important things in almost any program are,
and in this order:
On the other hand, if you're just trying to write JAPHs (aka Obfuscated
Perl entries), or write ugly code, you would probably invert these :-)
Always make sure to use a $ for single values and @ for multiple ones.
Thus element 2 of the @foo array is accessed as $foo[1], not @foo[1],
which is a list of length one (not a scalar), and is a fairly common
novice mistake. Sometimes you can get by with @foo[1], but it's
not really doing what you think it's doing for the reason you think
it's doing it, which means one of these days, you'll shoot yourself
in the foot; ponder for a moment what these will really do:
This may seem confusing, but try to think of it this way: you use the
character of the type which you want back. You could use @foo[1..3] for
a slice of three elements of @foo, or even @foo{A,B,C} for a slice of
%foo. This is the same as using ($foo[1], $foo[2], $foo[3]) and
($foo{A}, $foo{B}, $foo{C}) respectively. In fact, you can even use
lists to subscript arrays and pull out more lists, like @foo[@bar] or
@foo{@bar}, where @bar is in both cases presumably a list of subscripts.
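A short sketch of the $ versus @ rule, with made-up data, in perl5 syntax:

```perl
use strict;
use warnings;

my @foo = qw(a b c d e);
my %foo = (A => 1, B => 2, C => 3);

my $second  = $foo[1];                # one value: use $
my @middle  = @foo[1 .. 3];           # several values: use @
my @abc     = @foo{'A', 'B', 'C'};    # slice of the %foo table

my @bar     = (4, 0);
my @by_list = @foo[@bar];             # subscripts taken from another list

print "$second | @middle | @abc | @by_list\n";
```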
In Perl5, it's quite easy to declare these things. For example
And now reference $A[2]->[0] to pull out ``yy''. These may also nest
and mix with tables:
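The declarations themselves are missing above, so this is only a sketch with invented data, arranged so that $A[2]->[0] comes out as ``yy'' as the text says. The %table structure is likewise hypothetical.

```perl
use strict;
use warnings;

# A list of references to anonymous lists.
my @A = (
    [ 'ww', 'wx' ],
    [ 'xx', 'xy' ],
    [ 'yy', 'yz' ],
);

# Lists and tables nested inside a table.
my %table = (
    fruits => [ 'apple', 'pear' ],
    sizes  => { small => 1, large => 10 },
);

print $A[2]->[0], ' ', $table{fruits}->[1], ' ', $table{sizes}->{large}, "\n";
```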
Perl4 is infinitely more difficult. Remember that Perl[0..4] isn't
about nested data structures. It's about flat ones, so if you're
trying to do this, you may be going about it the wrong way or using the
wrong tools. You might try parallel arrays with common subscripts.
But if you're bound and determined, you can use the multi-dimensional
array emulation of $a{'x','y','z'}, or you can make an array of names
of arrays and eval it.
For example, if @name contains a list of names of arrays, you can get
at the j-th element of the i-th array like so:
or in one line
You could also use the type-globbing syntax to make an array of *name
values, which will be more efficient than eval. Here @name holds a list
of pointers, which we'll have to dereference through a temporary
variable.
For example:
In fact, you can use this method to make arbitrarily nested data
structures. You really have to want to do this kind of thing badly to
go this far, however, as it is notationally cumbersome.
Let's assume you just simply have to have an array of arrays of
arrays. What you do is make an array of pointers to arrays of
pointers, where pointers are *name values described above. You
initialize the outermost array normally, and then you build up your
pointers from there. For example:
Now make a couple of arrays of pointers to these:
And finally make an array of pointers to these arrays:
To access an element, such as AAA[i][j][k], you must do this:
Similar manipulations on associative arrays are also feasible.
You could take a look at recurse.pl package posted by Felix Lee*, which
lets you simulate vectors and tables (lists and associative arrays) by
using type glob references and some pretty serious wizardry.
In C, you're used to creating recursive datatypes for operations like
recursive descent parsing or tree traversal. In Perl, these algorithms
are best implemented using associative arrays. Take an array called
%parent, and build up pointers such that $parent{$person} is the name
of that person's parent. Make sure you remember that $parent{'adam'}
is 'adam'. :-) With a little care, this approach can be used to
implement general graph traversal algorithms as well.
This answer will work under perl5 only. Did we mention that you should
upgrade? There is a perl4 solution, but you are using perl5 now,
anyway, so there's no point in posting it. Right?
The best way to do this is to use an associative array to model your
structure, then either a regular array (AKA list) or another
associative array (AKA hash, table, or hash table) to store it.
Or even
Note that if you want an associative array of lists, you'll want to make
assignments like
And with lists of associative arrays, you'll use
Study these for a while, and in an upcoming FAQ, we'll explain them fully:
And while you're at it, take a look at these:
See perlref(1) for details.
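Since the listings are not shown here, the following perl5 sketch gives the general shape of both arrangements. The names %HoL and @LoH and all of the data are placeholders.

```perl
use strict;
use warnings;

# Associative array of lists: each value is a reference to a list.
my %HoL;
push @{ $HoL{flintstones} }, 'fred', 'barney';
push @{ $HoL{jetsons} },     'george';

# List of associative arrays: each element is a reference to a table.
my @LoH;
$LoH[0]{husband} = 'fred';
$LoH[0]{wife}    = 'wilma';
push @LoH, { husband => 'george', wife => 'jane' };

print "$HoL{flintstones}[1] $LoH[1]{wife}\n";
```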
There are several possible ways, depending on whether the
array is ordered and you wish to preserve the ordering.
a) If @in is sorted, and you want @out to be sorted:
This is nice in that it doesn't use much extra memory,
simulating uniq's behavior of removing only adjacent
duplicates.
b) If you don't know whether @in is sorted:
c) Like (b), but @in contains only small integers:
d) A way to do (b) without any loops or greps:
e) Like (d), but @in contains only small positive integers:
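The listings for (a) through (e) are missing here. As a sketch, these are three common shapes for the job in perl5; they roughly correspond to cases (a), (b) and (d), and the data is made up.

```perl
use strict;
use warnings;

my @in = qw(b a b c a);

# Sorted input: drop adjacent duplicates, the way uniq does.
my @sorted = sort @in;
my (@adjacent, $prev);
for (@sorted) {
    push @adjacent, $_ unless defined $prev && $prev eq $_;
    $prev = $_;
}

# Unsorted input: remember what has been seen, keep first occurrences.
my %seen;
my @first_seen = grep { !$seen{$_}++ } @in;

# No loop or grep: let a hash slice collapse the duplicates.
my %saw;
@saw{@in} = ();
my @from_keys = sort keys %saw;

print "@adjacent | @first_seen | @from_keys\n";
```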
There are several ways to approach this. If you are going to make
this query many times and the values are arbitrary strings, the
fastest way is probably to invert the original array and keep an
associative array lying about whose keys are the first array's values.
Now you can check whether $is_blue{$some_color}. It might have been
a good idea to keep the blues all in an assoc array in the first place.
If the values are all small integers, you could use a simple
indexed array. This kind of an array will take up less space:
Now you check whether $is_tiny_prime[$some_number].
If the values in question are integers instead of strings, you can save
quite a lot of space by using bit strings instead:
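All three lookups can be sketched as follows. The names %is_blue and @is_tiny_prime come from the text above; the contents, and the $bits variable, are assumptions.

```perl
use strict;
use warnings;

# Arbitrary strings: invert the list into an associative array.
my @blues = qw(azure cerulean teal);
my %is_blue;
@is_blue{@blues} = (1) x @blues;

# Small integers: a plain indexed array.
my @tiny_primes = (2, 3, 5, 7, 11);
my @is_tiny_prime;
@is_tiny_prime[@tiny_primes] = (1) x @tiny_primes;

# Integers again, one bit apiece, using vec().
my $bits = '';
vec($bits, $_, 1) = 1 for @tiny_primes;

print "teal is blue\n"       if $is_blue{teal};
print "7 is a tiny prime\n"  if $is_tiny_prime[7];
print "11 is set in bits\n"  if vec($bits, 11, 1);
```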
You have to declare a sort subroutine to do this, or use an inline
function. Let's assume you want an ASCII sort on the values of the
associative array %ary. You could do so this way:
If you wanted a descending numeric sort, you could do this:
You can also inline your sort function, like this, at least if
you have a relatively recent patchlevel of perl4 or are running perl5:
If you wanted a function that didn't have the array name hard-wired
into it, you could do this:
If you want neither an alphabetic nor a numeric sort, then you'll
have to code in your own logic instead of relying on the built-in
signed comparison operators ``cmp'' and ``<=>''.
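A sketch of the named-subroutine and inline forms, using the %ary name from the text with invented contents. The sub name by_value is hypothetical.

```perl
use strict;
use warnings;

my %ary = (apple => 3, pear => 10, fig => 7);

# Named sort subroutine: ASCII order on the values.
sub by_value { $ary{$a} cmp $ary{$b} }
my @ascii = sort by_value keys %ary;

# Inline block: descending numeric order on the values.
my @descending = sort { $ary{$b} <=> $ary{$a} } keys %ary;

print "@ascii | @descending\n";
```

Notice that the ASCII sort puts 10 ahead of 3, which is why you pick ``<=>'' when the values are numbers.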
Note that if you're sorting on just a part of the value, such as a
piece you might extract via split, unpack, pattern-matching, or
substr, then rather than performing that operation inside your sort
routine on each call to it, it is significantly more efficient to
build a parallel array of just those portions you're sorting on, sort
the indices of this parallel array, and then to subscript your original
array using the newly sorted indices. This method works on both
regular and associative arrays, since both @ary[@idx] and @ary{@idx}
make sense. See page 245 in the Camel Book on ``Sorting an Array by a
Computable Field'' for a simple example of this.
For example, here's an efficient case-insensitive comparison:
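The listing is missing here; what follows is a perl5 sketch of the parallel-array technique just described, with lc() as the computed field. The data and variable names are made up.

```perl
use strict;
use warnings;

my @data = qw(banana Apple cherry Date);

my @keys   = map { lc } @data;                            # compute each key once
my @idx    = sort { $keys[$a] cmp $keys[$b] } 0 .. $#data; # sort the indices
my @sorted = @data[@idx];                                 # subscript the original

print "@sorted\n";
```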
While the number of elements in a @foobar array is simply @foobar when
used in a scalar context, you can't figure out how many elements are in an
associative array in an analogous fashion. That's because %foobar in
a scalar context returns the ratio (as a string) of number of buckets
filled versus the number allocated. For example, scalar(%ENV) might
return ``20/32''. While perl could in theory keep a count, this would
break down on associative arrays that have been bound to dbm files.
However, while you can't get a count this way, one thing you can use
it for is to determine whether there are any elements whatsoever in
the array, since ``if (%table)'' is guaranteed to be false if nothing
has ever been stored in it.
As of perl4.035, you can say
keys() when used in a scalar context will return the number of keys,
rather than the keys themselves.
Pictures help... here's the %ary table:
And these conditions hold
If you now say
your table now reads:
and these conditions now hold; changes in caps:
Notice the last two: you have an undef value, but a defined key!
Now, consider this:
your table now reads:
and these conditions now hold; changes in caps:
See, the whole entry is gone!
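The pictures did not survive, so here is the same story as a runnable sketch. It uses perl5's exists(); the key names and values are invented.

```perl
use strict;
use warnings;

my %ary = (a => 'x', b => 0, c => '');

# Assigning undef leaves the key in place with no value behind it.
$ary{a} = undef;
my $exists_after_undef  = exists  $ary{a} ? 1 : 0;
my $defined_after_undef = defined $ary{a} ? 1 : 0;

# delete removes the key and the value together.
delete $ary{a};
my $exists_after_delete = exists $ary{a} ? 1 : 0;

# A false value is still defined, and its key still exists.
my $b_defined_but_false = (defined $ary{b} && !$ary{b}) ? 1 : 0;

print "$exists_after_undef $defined_after_undef $exists_after_delete $b_defined_but_false\n";
```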
Several reasons. One is because backticks do not interpolate within
double quotes in Perl as they do in shells.
Let's look at two common mistakes:
This should have been:
But you'll have an extra newline you might not expect. This
does not work as expected:
$back = `pwd`; chdir($somewhere); chdir($back); # WRONG
Because backticks do not automatically eat trailing or embedded
newlines. The chop() function will remove the last character from
a string. This should have been:
You should also be aware that while in the shells, embedding
single quotes will protect variables, in Perl, you'll need
to escape the dollar signs.
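A sketch of the two fixes, assuming a Unix system with pwd and echo on the PATH. It uses perl5's chomp() where the text uses chop(); chomp only removes a trailing newline, which is a bit safer.

```perl
use strict;
use warnings;

# Strip the trailing newline before using the output as a directory name.
chomp(my $back = `pwd`);
chdir('/')   or die "chdir /: $!";
chdir($back) or die "chdir $back: $!";

# Single quotes belong to the shell, not to Perl: escape the dollar sign
# yourself if you want the shell to see it.
chomp(my $literal = `echo '\$HOME'`);

print "$back\n$literal\n";
```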
The natural way to program in those languages may not make for the fastest
Perl code. Notably, the awk-to-perl translator produces sub-optimal code;
see the a2p man page for tweaks you can make.
Two of Perl's strongest points are its associative arrays and its regular
expressions. They can dramatically speed up your code when applied
properly. Recasting your code to use them can help a lot.
How complex are your regexps? Deeply nested sub-expressions with {n,m} or
* operators can take a very long time to compute. Don't use ()'s unless
you really need them. Anchor your string to the front if you can.
Something like this:
next unless /^.*%.*$/;
runs more slowly than the equivalent:
next unless /%/;
Note that this:
There's no need to use /^.*foo.*$/ when /foo/ will do.
Remember that a printf costs more than a simple print.
Don't split() every line if you don't have to.
Another thing to look at is your loops. Are you iterating through
indexed arrays rather than just putting everything into a hashed
array? For example,
First of all, it would be faster to use Perl's foreach mechanism
instead of using subscripts:
Better yet, this could be sped up dramatically by placing the whole
thing in an associative array like this:
You should also look at variables in regular expressions, which is
expensive. If the variable to be interpolated doesn't change over the
life of the process, use the /o modifier to tell Perl to compile the
regexp only once, like this:
Finally, if you have a bunch of patterns in a list that you'd like to
compare against, instead of doing this:
If you build your code and then eval it, it will be much faster.
For example:
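A sketch of the build-then-eval idea in perl5 syntax. The pattern list is invented, the patterns are assumed not to contain a slash, and in real code you would want to be sure they come from somewhere you trust, since they end up inside an eval.

```perl
use strict;
use warnings;

my @patterns = ('foo', 'ba[rz]');

# Build the text of a subroutine with one fixed match per pattern,
# then compile it once.
my $code = 'sub { my ($line) = @_; '
         . join('', map { "return 1 if \$line =~ m/$_/; " } @patterns)
         . 'return 0; }';

my $matcher = eval $code or die "eval failed: $@";

for my $line ('a bar here', 'nothing relevant') {
    print $matcher->($line) ? "hit:  $line\n" : "miss: $line\n";
}
```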
If these are system calls and you have the syscall() function, then
you're probably in luck -- see the next question. If you're using a
POSIX function, and are running perl5, you're also in luck: see
POSIX(3pm).
For arbitrary library functions, however, it's not quite so
straight-forward. See ``Where can I learn about linking C with Perl?''.
[Note: as of perl5, you probably want to just use h2xs instead, at
least, if your system supports dynamic loading.]
These are generated from your system's C include files using the h2ph
script (once called makelib) from the Perl source directory. This will
make files containing subroutine definitions, like &SYS_getitimer, which
you can use as arguments to your function.
You might also look at the h2pl subdirectory in the Perl source for how to
convert these to forms like $SYS_getitimer; there are both advantages and
disadvantages to this. Read the notes in that directory for details.
In both cases, you may well have to fiddle with it to make these work; it
depends how funny-looking your system's C include files happen to be.
If you're trying to get at C structures, then you should take a look
at using c2ph, which uses debugger ``stab'' entries generated by your
BSD or GNU C compiler to produce machine-independent perl definitions
for the data structures. This allows you to avoid hardcoding
structure layouts, types, padding, or sizes, greatly enhancing
portability. c2ph comes with the perl distribution. On an SCO
system, GCC only has COFF debugging support by default, so you'll have
to build GCC 2.1 with DBX_DEBUGGING_INFO defined, and use -gstabs to
get c2ph to work there.
See the file /pub/perl/info/ch2ph on convex.com via anon ftp
for more traps and tips on this process.
This message:
In general, this is a dangerous move because you can find yourself in a
deadlock situation. It's better to put one end of the pipe to a file.
For example:
If you have ptys, you could arrange to run the command on a pty and
avoid the deadlock problem. See the chat2.pl package in the
distributed library for ways to do this.
At the risk of deadlock, it is theoretically possible to use a
fork, two pipe calls, and an exec to manually set up the two-way
pipe. (BSD systems may use socketpair() in place of the two pipes,
but this is not as portable.) The open2 library function distributed
with the current perl release will do this for you.
This assumes it's going to talk to something like adb, both writing to
it and reading from it. This is presumably safe because you ``know''
that commands like adb will read a line at a time and output a line at
a time. Programs like sort or cat that read their entire input stream
first, however, are quite apt to cause deadlock.
There's also an open3.pl library that handles this for stderr as well.
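Under perl5 the open2 routine lives in the IPC::Open2 module. Here is a minimal sketch that talks to tr(1), chosen only because it is small and commonly installed. The writing end is closed before any reading is done, which sidesteps the deadlock for this tiny exchange.

```perl
use strict;
use warnings;
use IPC::Open2;

# open2(READ_FROM_CHILD, WRITE_TO_CHILD, command, args...)
my $pid = open2(my $from_child, my $to_child, 'tr', 'a-z', 'A-Z');

print $to_child "hello, adb\n";
close($to_child);                 # send EOF so the child flushes and exits

my $reply = <$from_child>;
close($from_child);
waitpid($pid, 0);                 # reap the child

print $reply;
```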
There are three basic ways of running external commands:
In the first case, both STDOUT and STDERR will go the same place as
the script's versions of these, unless redirected. You can always put
them where you want them and then read them back when the system
returns. In the second and third cases, you are reading the STDOUT
only of your command. If you would like to have merged STDOUT and
STDERR, you can use shell file-descriptor redirection to dup STDERR to
STDOUT:
Another possibility is to run STDERR into a file and read the file
later, as in
Note that you cannot simply open STDERR to be a dup of STDOUT
in your perl program and avoid calling the shell to do the redirection.
This doesn't work:
Be apprised that you must use Bourne shell redirection syntax in
backticks, not csh! For details on how lucky you are that perl's
system() and backtick and pipe opens all use Bourne shell, fetch the
file from convex.com called /pub/csh.whynot -- and you'll be glad that
perl's shell interface is the Bourne shell.
There's an &open3 routine out there which was merged with &open2 in
perl5 production.
These statements:
will not fail just for lack of the bogus_command. They'll only
fail if the fork to run them fails, which is seldom the problem.
If you're writing to the TOPIPE, you'll get a SIGPIPE if the child
exits prematurely or doesn't run. If you are reading from the
FROMPIPE, you need to check the close() to see what happened.
If you want an answer sooner than pipe buffering might otherwise
afford you, you can do something like this:
This works fine if bogus_command doesn't have shell metas in it, but
if it does, the shell may well not have exited before the kill 0. You
could always introduce a delay:
but this is sometimes undesirable, and in any event does not guarantee
correct behavior. But it seems slightly better than nothing.
Similar tricks can be played with writable pipes if you don't wish to
catch the SIGPIPE.
Perl provides a builtin variable which holds the status of the last
backtick command: $?. Here is exactly what the perlvar page says about
Because some stdio's set error and eof flags that need clearing.
Try keeping around the seekpointer and go there, like this:
If that doesn't work, try seeking to a different part of the file and
then back. If that doesn't work, try seeking to a different part of
the file, reading something, and then seeking back. If that doesn't
work, give up on your stdio package and use sysread. You can't call
stdio's clearerr() from Perl, so if you get EINTR from a signal
handler, you're out of luck. Best to just use sysread() from the
start for the tty.
Perl doesn't expand tildes -- the shell (ok, some shells) does.
The classic request is to be able to do something like:
which doesn't work. (And you don't know it, because you
did a system call without an ``|| die'' clause! :-)
If you know you're on a system with the csh, and you know
that Larry hasn't internalized file globbing, then you could
get away with
but that's pretty iffy.
A better way is to do the translation yourself, as in:
More robust and efficient versions that checked for error conditions,
handled simple ~/blah notation, and cached lookups are all reasonable
enhancements.
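One possible shape for such a translation, as a perl5 sketch. The sub name expand_tilde is hypothetical. It handles both ~user and plain ~/ forms, leaves the name alone when the user can't be found, and assumes a Unix-style password database.

```perl
use strict;
use warnings;

# Expand a leading ~ or ~user in a path; anything else passes through.
sub expand_tilde {
    my ($path) = @_;
    $path =~ s{^~([^/]*)}{
        $1 eq ''
            ? ($ENV{HOME} || (getpwuid($<))[7] || '~')   # plain ~
            : ((getpwnam($1))[7] || "~$1")               # ~user, if known
    }e;
    return $path;
}

print expand_tilde('~/notes'), "\n";
```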
Larry's standard answer is to send it through the shell to perl filter,
otherwise known as tchrist@perl.com. Contrary to popular belief, Tom
Christiansen isn't a real person. He is actually a highly advanced
artificial intelligence experiment written by a graduate student at the
University of Colorado. Some of the earlier tasks he was programmed to
perform included:
(This IS a joke... please quit calling me and asking about it!)
Actually, there is no automatic machine translator. Even if there
were, you wouldn't gain a lot, as most of the external programs would
still get called. It's the same problem as blind translation into C:
you're still apt to be bogged down by exec()s. You have to analyze
the dataflow and algorithm and rethink it for optimal speedup. It's
not uncommon to see one, two, or even three orders of magnitude of
speed difference between the brute-force and the recoded approaches.
Sure, you can connect directly to them using sockets, or you can run a
session on a pty. In either case, Randal's chat2 package, which is
distributed with the perl source, will come in handy. It addresses
much the same problem space as Don Libes's expect package does. Two
examples of managing an ftp session using chat2 can be found on
convex.com in /pub/perl/scripts/ftp-chat2.shar .
Caveat lector: chat2 is documented only by example, may not run on
System V systems, and is subtly machine dependent both in its ideas
of networking and in pseudottys. See also question 4.21, ``Why doesn't
my sockets program work under System V (Solaris)?''
Randal also has code showing an example socket session for handling the
telnet protocol to get a weather report. This can be found at
ftp.cis.ufl.edu:/pub/perl/scripts/getduatweather.pl.
Gene Spafford* has a nice ftp library package that will help with ftp.
As of perl4.036, there is a certain amount of globbing that is passed
out to the shell and not handled internally. The following code (which
will, roughly, emulate ``chmod 0644 *'')
is the equivalent of
Until globbing is built into Perl, you will need to use some form of
non-globbing work around.
Something like the following will work:
If you've installed tcsh as /bin/csh, you'll never have this problem.
Larry says that the solution is to put a call to seek in yourself.
First try
The statement seek(GWFILE, 0, 1); doesn't change the current position,
but it does clear the end-of-file condition on the handle, so that the
next
If that doesn't work (depends on your stdio implementation), then
you need something more like this:
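The listing is missing, so this is a sketch of the remember-the-position variety: note where reading stopped with tell(), and seek() back there before reading again. The sub name read_new_lines and the temporary file are assumptions made so the example can run on its own.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Read whatever has been appended since the position saved in $$pos_ref.
sub read_new_lines {
    my ($fh, $pos_ref) = @_;
    seek($fh, $$pos_ref, 0);      # go back to where we stopped; clears EOF
    my @lines = <$fh>;
    $$pos_ref = tell($fh);        # remember the new stopping point
    return @lines;
}

my ($out, $path) = tempfile(UNLINK => 1);
$out->autoflush(1);
print $out "one\ntwo\n";

open(my $in, '<', $path) or die "open $path: $!";
my $pos    = 0;
my @first  = read_new_lines($in, \$pos);

print $out "three\n";             # the file grows...
my @second = read_new_lines($in, \$pos);   # ...and we pick up the new line

print scalar(@first), " then ", scalar(@second), "\n";
```

In a real tail-like loop you would sleep between calls rather than append to the file yourself.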
Generally speaking, if you need to do this you're either using poor
programming practices or are far too paranoid for your own good. If you
need to do this to hide a password being entered on the command line,
recode the program to read the password from a file or to prompt for
it. (see question 4.24) Typing a password on the command line is
inherently insecure as anyone can look over your shoulder to see it.
If you feel you really must overwrite the command line and hide it, you
can assign to the variable ``$0''. For example:
It should be noted that some OSes, like Solaris 2.X, read directly from
the kernel information, instead of from the program's stack, and hence
don't allow you to change the command line.
In the strictest sense, it ``can't'' be done. However, there is special
shell magic which may allow you to do it. I suggest checking out
comp.unix.shell and reading the comp.unix.questions FAQ.
When perl is started, you are creating a child process. Due to the way
the Unix system is designed, children cannot permanently affect their
parent shells.
When a child process is created, it inherits a copy of its parent's
environment (variables, current directory, etc). When the child
changes this environment, it is changing the copy and not the original,
so the parent isn't affected.
If you must change the parent from within a perl script, you could try
having it write out a shell script or a C-shell script and then using
``. script'' or ``source script'' (sh, Csh, respectively)
If you've ever tried to use a variable for a filehandle, you may well
have had some problems. This is just revealing one of the icky places
in perl: filehandles aren't first-class citizens the way everything
else is, and it really gets in the way sometimes.
Of course, it's just fine to say
But you'll still get into trouble for trying:
You can also do this:
But this is ok:
There are about four ways of passing in a filehandle. The way
everyone tries is just to pass it as a string.
Unfortunately, that doesn't work so well, because the package that
the printit() function is executing in may in fact not be the
one that the handle was opened in:
A simple fix would be to pass it in fully qualified;
because if you don't, the function is going to have to do
some crazy thing like this:
CODE 1:
CODE 2:
However, it turns out that you don't have to use a typeglob inside
the function. This also works:
CODE 3:
As does this even, in case you want to make an object by blessing
your reference:
CODE 4:
I used to think that you had to use 1 or preferably 2, but apparently
you can get away with number 3 and 4 as well. This is nice because it
avoids ever assigning to a typeglob as we do in 2, which is a bit
risky.
Some other problems with #1: if you're using strict subs, then you
aren't going to be able to do that: the strict subs will get you.
Instead, you'll have to pass in 'main::Some_Handle', but then down in
your function, you'll get blown away by strict refs, because you'll be
using a string as a symbol. So really, the best way is to pass the
typeglob (or occasionally a reference to the same).
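The numbered listings are missing above, so here is a separate perl5 sketch of passing a handle as a typeglob and as a reference to one. The sub name print_to and the handle name LOG are hypothetical, and the handle writes into a string so the example is self-contained.

```perl
use strict;
use warnings;

# Print to whatever handle the caller passed in.
sub print_to {
    my ($fh, @text) = @_;
    print {$fh} @text;
}

my $buffer = '';
open(LOG, '>', \$buffer) or die "open: $!";

print_to(\*LOG, "via glob reference\n");   # reference to the typeglob
print_to(*LOG,  "via typeglob\n");         # the typeglob itself

close(LOG);
print $buffer;
```

Both calls keep working under ``use strict'', and neither depends on which package print_to happens to be compiled in.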
Normally perl ignores trailing blanks in filenames, and interprets
certain leading characters (or a trailing "|") to mean something
special. To avoid this, you might want to use a routine like this.
It makes non-fullpathnames into explicit relative ones, and tacks
a trailing null byte on the name to make perl leave it alone:
This function works reasonably well to figure out whether a variable
will be disliked by the taint checks automatically enabled by setuid
execution:
and in particular, never does any system calls.
General Information and Availability
1.1) What is Perl?
1.2) What are perl4 and perl5, are there any differences?
1.3) What features does perl5 provide over perl4?
1.4) Where can I get docs on perl5?
1.5) Will perl5 break my perl4 scripts?
1.6) When will Perl stabilize?
1.7) What's the difference between "perl" and "Perl"?
1.8) Is it a perl program or a perl script?
1.9) Is perl difficult to learn?
1.10) Should I program everything in perl?
1.11) How does perl compare with other scripting languages, like Tcl, Python or REXX?
1.12) Where can I get Perl over the Internet (FTP)?
1.13) How can I get Perl via email?
1.14) How can I get Perl via UUCP?
1.15) Are there other ways of getting perl?
1.16) Has perl been ported to machine FOO?
1.17) How do I get perl to compile on Solaris?
1.18) How do I get perl to compile on a NeXT?
1.19) What extensions are available for Perl and where can I get them?
1.20) What is dbperl and where can I get it?
1.21) Which DBM should I use?
1.22) Is there an SNMP aware perl?
1.23) Is there an ISO or ANSI certified version of Perl?
1.1) What is Perl?
Perl is a compiled scripting language written by Larry Wall*.
Here's the beginning of the description from the perl(1) man page:
Perl is an interpreted language optimized for scanning arbitrary
text files, extracting information from those text files, and
printing reports based on that information. It's also a good
language for many system management tasks. The language is
intended to be practical (easy to use, efficient, complete)
rather than beautiful (tiny, elegant, minimal). It combines
(in the author's opinion, anyway) some of the best features
of C, sed, awk, and sh, so people familiar with those languages
should have little difficulty with it. (Language historians
will also note some vestiges of csh, Pascal, and even
BASIC-PLUS.) Expression syntax corresponds quite closely to C
expression syntax. Unlike most Unix utilities, perl does not
arbitrarily limit the size of your data--if you've got the
memory, perl can slurp in your whole file as a single string.
Recursion is of unlimited depth. And the hash tables used by
associative arrays grow as necessary to prevent degraded
performance. Perl uses sophisticated pattern matching techniques
to scan large amounts of data very quickly. Although optimized
for scanning text, perl can also deal with binary data, and can
make dbm files look like associative arrays (where dbm is
available). Setuid perl scripts are safer than C programs
through a dataflow tracing mechanism which prevents many
stupid security holes. If you have a problem that would
ordinarily use sed or awk or sh, but it exceeds their
capabilities or must run a little faster, and you don't want to
write the silly thing in C, then perl may be for you. There are
also translators to turn your sed and awk scripts into perl
scripts. OK, enough hype.
1.2) What are perl4 and perl5, are there any differences?
Perl4 and perl5 are different versions of the language. Perl4 was the
previous release, and perl5 is ``Perl: The Next Generation.''
Perl5 is, essentially, a complete rewrite of the perl source code
from the ground up. It has been modularized, object oriented,
tweaked, trimmed, and optimized until it almost doesn't look like
the old code. However, the interface is mostly the same, and
compatibility with previous releases is very high.
1.3) What features does perl5 provide over perl4?
If you get the newest source (from any of the main FTP sites), you will
find a directory full of man pages (possibly to be installed as section
1p and 3pm) that discuss the differences, new features, old
incompatibilities and much more. Here, however, are some highlights as
to the new features and old incompatibilities.
(Thanks to Tom Christiansen* for this section)
*foo = \$bar;
*foo = \&bletch;
output_autoflush STDOUT 1;
goto &realsub.
1.4) Where can I get docs on perl5?
The complete perl documentation is available with the Perl
distribution, or can be accessed from the following sites.
Note that the PerlDoc ps file is 240 pages long!!
http://www.metronet.com/0/perlinfo/perl5/manual/perl.html
http://web.nexor.co.uk/perl/perl.html (Europe)
ftp://ftp.cis.ufl.edu/pub/perl/doc/PerlDoc.ps.gz
ftp://ftp.uu.net/languages/perl/PerlDoc.ps.gz
ftp://www.metronet.com/pub/perl/perl5/manual/PerlDoc.ps.gz
ftp://ftp.zrz.tu-berlin.de/pub/unix/perl/PerlDoc.ps.gz (Europe)
ftp://ftp.cs.ruu.nl/pub/PERL/perl5.0/doc/PerlDoc.ps.gz (Europe)
ftp://sungear.mame.mu.oz.au/pub/perl/doc/PerlDoc.ps.gz (Oz)
unable to access as of 7/15/95
ftp://www.metronet.com/pub/perl/perl5/manual/perl5-info.tar.gz
1.5) Will perl5 break my perl4 scripts?
In general, no. However, certain bad old practices have become highly
frowned upon. The following are the most important of the known
incompatibilities between perl4 and perl5. See perltrap(1) for more
details.
Mail("foo@bar.com")
needs to be
Mail("foo\@bar.com");
The compiler catches this.
1.6) When will Perl stabilize?
When asked at what point the Perl code would be frozen, Larry answered:
Part of the redesign of Perl is to allow us to more or less freeze
the language itself. It won't totally freeze, of course, but I think
the rate of change of the core of the language is asymptotically
approaching 0. In fact, as time goes on, now that we have an official
extension mechanism, some of the things that are currently in the core
of the language may move out (transparently) as extensions. This has
already happened to dbmopen().
The whole idea behind
Perl is to be a fast text-processing, system-maintenance, zero-startup
time language. If it gets to be so large and complicated that it isn't
fast-running and easy to use, it won't be to anyone's benefit.
My motto from the start has been, ``If it ain't broke, don't fix it.''
I've been trying very hard not to remove those features from Perl that
make it what it is. At the same time, a lot of streamlining has gone
into the syntax. The new yacc file is about half the size of the old
one, and the number of official reserved words has been cut by 2/3.
All built-in functions have been unified (dualified?) as either list
operators or unary operators.
I really like a lot of the features in Perl, but in order for Perl to
be useful on a long term basis, those features have to stay put. I
bought the Camel book less than a year ago and it sounds like within
another year it will be obsolete.
The parts of Perl that the Camel book covers have not changed all that
much. Most old scripts still run. Many scripts from Perl version 1.0
still run. We'll certainly be revising the Camel, but the new man
pages are split up such that it's pretty easy to ferret out the new
info when you want it.
Not only is it a lot of work to recompile Perl
on 20+ machines periodically, but it's hard to write scripts that are
useful in the long term if the guts of the language keep changing.
(And if I keep having to buy new books. I keep hearing about new
features of Perl 5 that aren't documented in any of the perl 5
documentation that I can find.)
I think you'll find a lot of folks who think that 4.036 has been a
pretty stable platform.
Are there any plans to write a Perl compiler? While interpreted Perl
is great for many applications, it would also be cool to be able to
precompile many scripts. (Yes, I know you can undump things, but
undump isn't provided with Perl and I haven't found a copy.) The
creation of a perl library and dynamically-loadable modules seems
like a step in that direction.
Yes, part of the design of Perl 5 was to make it possible to write a
compiler for it. It could even be done as an extension module, I
suppose. Anyone looking for a master's thesis topic?
1.7) What's the difference between ``perl'' and ``Perl''?
32! [ ord('p') - ord('P') ]
1.8) Is it a perl program or a perl script?
1.9) Is perl difficult to learn?
Not at all. Many people find Perl extremely easy to learn. There are
at least three main reasons for this.
#!/usr/local/bin/perl
print "Hello, world\n";
you can start writing Perl scripts. In fact, you will probably never
have to (or be able to) know everything about Perl. As you feel the
need or desire to use more sophisticated features (such as C structures
or networking), you can learn these as you go. The learning curve for
Perl is not a steep one, especially if you have the headstart of having
a background in UNIX. Rather, its learning curve is gentle and
gradual, but it is admittedly rather long.
1.10) Should I program everything in Perl?
1.11) How does Perl compare with other scripting languages, like Tcl, Python
or REXX?
1.12) How can I get Perl over the Internet?
Version 4:
Volume Issues Patchlevel and Notes
------ ------ ------------------------------------------------
18 19-54 Patchlevel 3, Initial posting.
20 56-62 Patches 4-10
Version 5:
Volume Issues Patchlevel and Notes
------ ------ -----------------------------------------------
45 64-128 Initial Posting, patchlevel 0.
Since 1993, a number of archives have sprung up specifically for Perl
and Perl related items. Larry maintains the official distribution
site (for both perl4.036 and perl5) at netlabs. Probably the largest
archive is at the University of Florida. The sites below are listed in
order of how likely they are to have the sources.
Site Directory and notes IP
--------------------------------------------- -------
North America:
ftp://ftp.netlabs.com/pub/outgoing/perl5.0/ 192.94.48.152
ftp://ftp.cis.ufl.edu/pub/perl/src/5.0/ 128.227.100.198
ftp://prep.ai.mit.edu/pub/gnu/ 18.71.0.38
not current as of 7/15/95
ftp://ftp.uu.net/languages/perl/ 192.48.96.9
not current as of 7/15/95
ftp://ftp.khoros.unm.edu/pub/perl/ 198.59.155.28
not current as of 7/15/95
ftp://ftp.cbi.tamucc.edu/pub/duff/Perl/ 165.95.1.3
ftp://ftp.metronet.com/pub/perl/sources/ 192.245.137.1
ftp://genetics.upenn.edu/perl5/ 128.91.200.37
Europe:
ftp://ftp.cs.ruu.nl/pub/PERL/perl5.0/src/ 131.211.80.17
ftp://ftp.funet.fi/pub/languages/perl/ports/perl5/ 128.214.248.6
ftp://ftp.zrz.tu-berlin.de/pub/unix/perl/ 130.149.4.40
ftp://src.doc.ic.ac.uk/packages/perl5/ 146.169.17.5
Australia:
ftp://sungear.mame.mu.oz.au/pub/perl/src/5.0/ 128.250.209.2
South America (mirror of ftp://prep.ai.mit.edu/pub/gnu):
ftp://ftp.inf.utfsm.cl/pub/gnu/ 146.83.198.3
http://src.doc.ic.ac.uk/packages/perl5/ 146.169.17.5
gopher://src.doc.ic.ac.uk/0/packages/perl5/ 146.169.17.5
1.13) How can I get Perl via Email?
United States:
Massachusetts: ftpmail@decwrl.dec.com
New Jersey: bitftp@pucc.princeton.edu
North Carolina: ftpmail@sunsite.unc.edu
Europe/UK:
Germany: ftpmail@ftp.uni-stuttgart.de
bitftp@vx.gmd.de
UK: ftpmail@doc.ic.ac.uk
Australia: ftpmail@cs.uow.edu.au
Henk P Penning* suggests that if you are in Europe you should try the
following (if you are in Germany or the UK, you should probably use one
of the servers listed above):
Email: Send a message to 'mail-server@cs.ruu.nl' containing:
begin
path your_email_address
send help
send PERL/perl5.0/INDEX
end
The path-line may be omitted if your message contains a normal
From:-line. You will receive a help-file and an index of the
directory that contains the Perl stuff.
1.14) How can I get Perl via UUCP?
1.15) Are there other ways of getting perl?
Anonymous Access to UUNET's Source Archives
1-900-GOT-SRCS
3110 Fairview Park Drive, Suite 570
Falls Church, VA 22042
+1 703 876 5050 (voice)
+1 703 876 5059 (fax)
info@uunet.uu.net
1.16) Has perl been ported to machine FOO?
There is a mailing list for discussing Macintosh Perl. Contact
mpw-perl-request@iis.ee.ethz.ch.
1.17) How do I get Perl to compile on Solaris?
You must remove all the references to /usr/ucblib AND
/usr/ucbinclude. And ignore the Solaris_2.1 hints. They are wrong.
The undefining of vfork() probably has to do with the confusion it
gives to the compilers. If you use cc, you mustn't compile
util.c/tutil.c with -O. I only used the following libs: -lsocket
-lnsl -lm (there is a problem with -lmalloc)
If you are using Solaris 2.x, the signal handling is broken. If
you set up a signal handler such as 'ripper' it will be forgotten
after the first time the signal is caught. To fix this, you need
to recompile Perl. Just add '#define signal(x,y) sigset((x),(y))'
after the '#include
1.18) How do I get Perl to compile on a NeXT?
To get perl to compile on NeXTs, you need to combine the ANSI
and BSD headers:
cd /usr/include
mkdir ansibsd
cd ansibsd
ln -s ../ansi
ln -s ../bsd
Then, follow the configuration instructions for NeXTs, replacing
all mention of -I/usr/include/ansi or -I/usr/include/bsd with
-I/usr/include/ansibsd.
1.19) What extensions are available for Perl and where can I get them?
Some of the more popular extensions include those for windowing,
graphics, or data base work. Most of the major sites contain an
archive of the extensions, usually in the ext directory. Since the
list of available extensions changes so often, I have opted to list
only the sites and directories, not the individual extensions. Please
check the closest archive for more information.
1.20) What is dbperl and where can I get it?
What Target DB Who
-------- ----------- ----------------------------------------
?Infoperl Informix Kurt Andersen (kurt@hpsdid.sdd.hp.com)
Ingperl Ingres Tim Bunce (timbo@ig.co.uk) and Ted Lemon
Interperl Interbase Buzz Moschetti (buzz@bear.com)
Isqlperl Informix William Hails bill@tardis.co.uk
Oraperl Oracle Kevin Stock (kstock@Auspex.com)
Pgperl Postgres Igor Metz (metz@iam.unibe.ch)
*Sqlperl Ingres Ted Lemon (mellon@ncd.com)
Sybperl Sybase Michael Peppler (mpeppler@itf.ch)
Uniperl Unify 5.0 Rick Wargo (rickers@coe.drexel.edu)
? Does this one still exist?
*Sqlperl appears to have been subsumed by Ingperl
Perl is an interpreted language with powerful string, scalar, and
array processing features developed by Larry Wall that ``nicely
bridges the functionality gap between sh(1) and C.'' Since
relational DB operations are typically textually oriented, perl is
particularly well-suited to manage the data flows. The C source
code, which is available free of charge and runs on many platforms,
contains a user-defined function entry point that permits a
developer to extend the basic function set of the language. The
DBperl Group seeks to exploit this capability by creating a
standardized set of perl function extensions (e.g. db_fetch(),
db_attach()) based on the SQL model for manipulating a relational
DB, thus providing a portable perl interface to a variety of
popular RDBMS engines including Sybase, Oracle, Ingres, Informix,
and Interbase. In theory, any DB engine that implements a dynamic
SQL interpreter in its HLI can be bolted onto the perl front end
with predictable results, although at this time backends exist
only for the aforementioned five DB engines.
DBI/ The home of the DBI archive. To join the DBI mailing list
send your request to perldb-interest-REQUEST@vix.com
DBD/ Database Drivers for the DBI ...
Oracle/ By Tim Bunce (not yet ready!)
Ingres/ By Tim Bunce (not yet started!)
mod/ Other Perl 5 Modules and Extensions ...
Sybperl/ By Michael Peppler, mpeppler@itf.ch
perl4/ Perl 4 extensions (using the usub C interface)
oraperl/ ORACLE 6 & 7 By Kevin Stock, kstock@auspex.com
sybperl/ SYBASE 4 By Michael Peppler, mpeppler@itf.ch
ingperl/ INGRES By Tim Bunce timbo@ig.co.uk and Ted Lemon
isqlperl/ INFORMIX By William Hails, bill@tardis.co.uk
interperl/ INTERBASE By Buzz Moschetti, buzz@bear.com
oraperl/ ORACLE 6 & 7 By Kevin Stock (sadly no longer on the net)
sybperl/ SYBASE 4 By Michael Peppler, mpeppler@itf.ch
ingperl/ INGRES By Tim Bunce timbo@ig.co.uk and Ted Lemon
isqlperl/ INFORMIX By William Hails, bill@tardis.co.uk
interperl/ INTERBASE By Buzz Moschetti, buzz@bear.com
uniperl/ UNIFY 5.0 By Rick Wargo, rickers@coe.drexel.edu
pgperl/ POSTGRES By Igor Metz, metz@iam.unibe.ch
btreeperl/ NDBM perl extensions. By John Conover, john@johncon.com
ctreeperl/ C-Tree perl extensions. By John Conover, john@johncon.com
duaperl/ X.500 Directory User Agent. By Eric Douglas.
scripts/ Perl and shell scripts
rdb/ RDB is a perl RDBMS for ASCII files. By Walt Hobbs,
hobbs@rand.org
shql/ SHQL is an interactive SQL database engine. Written as a
shell script, SHQL interprets SQL commands and
manipulates flat files based on those commands. By
Bruce Momjian, root@candle.uucp
xbase/ Perl scripts for accessing xBase style files (dBase III)
refinfo/ Reference information
sqlsyntax/ Yacc and lex syntax and C source code for SQL1 and SQL2
from ftp.uu.net:/pub/uunet/published/oreilly/nutshell/yacclex,
and a draft SQL3 syntax from Jeff Fried <jfried@informix.com>+
formats/ Details of file formats such as Lotus 1-2-3 .WK1
There are also a number of non SQL database interfaces for perl
available from ftp.demon.co.uk. These include:
Directory Target System Authors and notes
--------- ------------- -------------------------------------------
btreeperl NDBM extension John Conover (john@johncon.com)
ctreeperl CTree extension John Conover (john@johncon.com)
duaperl X.500 DUA Eric Douglas
rdb RDBMS Walt Hobbs (hobbs@rand.org)
shql SQL Engine Bruce Momjian (root@candle.uucp)
1.21) Which DBM should I use?
As shipped, Perl (version 5) comes with interfaces for several DBM
packages (SDBM, old DBM, NDBM, GDBM, Berkeley DBM) that are not supplied
but either come with your system or are readily accessible via FTP. SDBM
is guaranteed to be there. For a comparison, see AnyDBM_File(3pm) and
DB_File(3pm).
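As a rough sketch of what using one of these interfaces looks like, the following ties a hash to an SDBM file, stores a value, and reads it back through a fresh tie. The routine name dbm_roundtrip() and the scratch path are assumptions made for this example.

```perl
use strict;
use Fcntl;        # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;    # the DBM flavour that ships with perl5

# dbm_roundtrip() is a hypothetical helper: store one key/value pair in
# an SDBM file, then read it back after re-tying the hash.
sub dbm_roundtrip {
    my ($path, $key, $value) = @_;
    my %db;

    tie(%db, 'SDBM_File', $path, O_RDWR|O_CREAT, 0644)
        or die "can't tie $path: $!";
    $db{$key} = $value;
    untie %db;

    tie(%db, 'SDBM_File', $path, O_RDONLY, 0644)
        or die "can't re-tie $path: $!";
    my $found = $db{$key};
    untie %db;

    # On most Unix systems SDBM keeps its data in these two files.
    unlink "$path.pag", "$path.dir";
    return $found;
}

print dbm_roundtrip("/tmp/faq_sdbm_demo.$$", "camel", "Programming perl"), "\n";
```

Swapping SDBM_File for another of the supported packages should only require changing the module name, provided that package is installed on your system.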
1.22) Is there an SNMP aware Perl?
FILE -rw-rw-r-- 3407 Aug 11 1992 snmperl.README
FILE -rw-r--r-- 17678 Aug 11 1992 snmperl.tar.Z
This directory contains the source code to add callable C subroutines
to perl. The subroutines implement the SNMP functions ``get'',
``getnext'', and ``set''. They use the freely-distributable SNMP package
(version 1.1b) from CMU.
I didn't find all the places where the CMU library writes to stderr
or calls exit() directly.
Guy Streeter
streeter@ingr.com
April 1, 1992 (not a joke!)
1.23) Is there an ISO or ANSI certified version of Perl?
Informational Sources
2.1) Is there a USENET group for perl?
2.2) Have any books or magazine articles been published about perl?
2.3) When will the Camel and Llama books be updated?
2.4) What FTP resources are available?
2.5) What WWW/gopher resources are available?
2.6) Can people who don't have access to USENET get comp.lang.perl.misc?
2.7) Are archives of comp.lang.perl.* available?
2.8) Is there a WAIS server for comp.lang.perl.*?
2.9) What other sources of information about Perl or training are available?
2.10) Where can I get training classes on Perl?
2.11) What companies ship or use perl?
2.12) Is there commercial, third-party support for perl?
2.13) What is a JAPH? What does "Will hack perl for ..." mean?
2.14) Where can I get a collection of Larry Wall witticisms?
2.15) What are the known bugs?
2.16) Where should I post bugs?
2.17) Where should I post source code?
2.18) Where can I learn about object-orienting Perl programming?
2.19) Where can I learn about linking C with Perl? [h2xs, xsubpp]
2.20) What is perl.com?
2.21) What do the asterisks (*) throughout the FAQ stand for?
2.1) Is there a USENET group for Perl?
2.2) Have any books or magazine articles been published about Perl?
SEwP is not meant as instruction in the Perl language, but rather
as an example of how Perl may be used to assist in the semi-formal
software engineering development cycles. There's a lot of Perl
code that's fairly well commented, but most of the book describes
software engineering methodologies. For the perl-challenged,
there's a light treatment of the language as well, but they refer
to the llama and the camel for the real meat.
Title: Welcome to Perl Country (Perl-no Kuni-he Youkoso)
Authors: Kaoru Maeda, Hiroshi Koyama, Yasushi Saito and Arihito
Fuse
Pages: 268+9 Publisher: Science Company
Pub. Date: April 25, 1993 ISBN: 4-7819-0697-4
Price: 2472Y Author Email: maeda@src.ricoh.co.jp
Comments: Written during the time the Camel book was being
translated. A useful introduction, but uses jperl (Japanese Perl)
which is not necessarily compatible.
2.3) When will the Camel and Llama books be updated?
2.4) What FTP resources are available?
2.5) What WWW/gopher resources are available?
2.6) Can people who don't have access to USENET get comp.lang.perl.misc?
Perl-Users@UVAARPA.VIRGINIA.EDU
Perl-Users-Request@uvaarpa.Virginia.EDU
2.7) Are archives of comp.lang.perl.misc available?
2.8) Is there a WAIS server for comp.lang.perl.*?
I have set up a perl script retrieval service and WaisSearch here at
feenix. To check it out, just point your gopher at us, and select the
appropriate menu option. The WaisSearch is of the iubio type, which
means you can do boolean searching. Thus you might try something
like:
caller
ioctl and fcntl
grep and socket not curses
2.9) What other sources of information about Perl or training are available?
2.10) Where can I get training classes on Perl?
2.11) What companies use or ship Perl?
2.12) Is there commercial, third-party support for Perl?
2.13) What is a JAPH? What does "Will hack perl for ..." mean?
2.14) Where can I get a list of Larry Wall witticisms?
2.15) What are the known bugs?
#if __GNUC__
was incorrectly translated into
if ( &__GNUC__ ) {
instead of
if ( defined(&__GNUC__) ? &__GNUC__ : 0 ) {
2.16) Where should I post bugs?
2.17) Where should I post source code?
2.18) Where can I learn about object-oriented Perl programming?
Idx Subsections in perlobj.1 Lines
1 NAME 2
2 DESCRIPTION 16
3 An Object is Simply a Reference 60
4 A Class is Simply a Package 31
5 A Method is Simply a Subroutine 34
6 Method Invocation 75
7 Destructors 14
8 Summary 7
Idx Subsections in perlbot.1 Lines
1 NAME 2
2 INTRODUCTION 9
3 Instance Variables 43
4 Scalar Instance Variables 21
5 Instance Variable Inheritance 35
6 Object Relationships 33
7 Overriding Superclass Methods 49
8 Using Relationship with Sdbm 45
9 Thinking of Code Reuse 111
2.19) Where can I learn about linking C with Perl? [h2xs, xsubpp]
perl-packrats: The archivist list
perl-porters: The porters list
perlbook: The Camel/Llama/Alpaca writing committee
perlbugs: The bug list (perl-porters for now)
perlclasses: Info on Perl training
perlfaq: Submissions/Errata to the Perl FAQ
(Tom and Steve)
perlrefguide: Submissions/Errata to the Perl RefGuide
(Johan)
2.21) What do the asterisks (*) throughout the FAQ stand for?
Larry Wall lwall@netlabs.com
Tom Christiansen tchrist@wraeththu.cs.colorado.edu
Stephen P Potter spp@psa.pencom.com
Andreas König k@franz.ww.TU-Berlin.DE
Bill Eldridge bill@cognet.ucla.edu
Buzz Moschetti buzz@bear.com
Casper H.S. Dik casper@fwi.uva.nl
David Muir Sharnoff muir@tfs.com
Dean Roehrich roehrich@ironwood.cray.com
Dominic Giampaolo dbg@sgi.com
Frédéric Chauveau fmc@pasteur.fr
Gene Spafford spaf@cs.purdue.edu
Guido van Rossum guido@cwi.nl
Henk P Penning henkp@cs.ruu.nl
Jeff Friedl jfriedl@omron.co.jp
Johan Vromans jv@NL.net
John Dallman jgd@cix.compulink.co.uk
John Lees lees@pixel.cps.msu.edu
John Ousterhout ouster@eng.sun.com
Jon Biggar jon@netlabs.com
Ken Lunde lunde@mv.us.adobe.com
Malcolm Beattie mbeattie@sable.ox.ac.uk
Matthias Neeracher neeri@iis.ee.ethz.ch
Michael D'Errico mike@software.com
Nick Ing-Simmons Nick.Ing-Simmons@tiuk.ti.com
Randal Schwartz merlyn@stonehenge.com
Roberto Salama rs@fi.gs.com
Steven L Kunz skunz@iastate.edu
Theodore C. Law TEDLAW@TOROLAB6.VNET.IBM.COM
Thomas R. Kimpton tom@dtint.dtint.com
Tim Bunce timbo@ig.co.uk
Timothy Murphy tim@maths.tcd.ie
UF Computer Staff consult@cis.ufl.edu
William Setzer William_Setzer@ncsu.edu
Programming Aids
3.1) How do I use perl interactively?
3.2) Is there a perl profiler?
3.3) Is there a yacc for perl?
3.4) Is there a pretty printer (similar to indent(1)) for perl?
3.5) How can I convert my perl scripts directly to C or compile them into binary form?
3.6) Where can I get a perl mode for emacs?
3.7) Is there a perl shell?
3.8) How can I use curses with perl?
3.9) How can I use X or Tk with perl?
3.10) Can I dynamically load C user routines?
3.11) What is undump and where can I get it?
3.12) How can I get '#!perl' to work under MS-DOS?
3.13) Can I write useful perl programs on the command line?
3.14) What's a "closure"?
3.1) How can I use Perl interactively?
The easiest way to do this is to run Perl under its debugger. If you
have no program to debug, you can invoke the debugger on an `empty'
program like this:
perl -de 0
(The more positive hackers prefer perl -de 1. :-)
3.2) Is there a Perl profiler?
3.3) Is there a yacc for Perl?
3.4) Is there a pretty-printer (similar to indent(1)) for Perl?
PERL|perl|Perl:\
:pb=^\d?(sub|package)\d\p\d:\
:bb={:be=}:cb=#:ce=$:sb=":se=\e":lb=':\
:le=\e':tl:\
:id=_:\
:kw=\
if for foreach unless until while continue else elsif \
do eval require \
die exit \
defined delete reset \
goto last redo next dump \
local undef return \
write format \
sub package
It doesn't actually do everything right; in particular,
things like $#, $', s#/foo##, and $foo'bar all confuse it.
# perl 4.x David Levine <levine@ics.uci.edu> 05 apr 1993
# Derived from Tom Christiansen's perl vgrindef. I'd like to treat all of
# perl's built-ins as keywords, but vgrind fields are limited to 1024
# characters and the built-ins overflow that (surprise :-). So, I didn't
# include the dbm*, end*, get*, msg*, sem*, set*, and shm* functions. I
# couldn't come up with an easy way to distinguish beginnings of literals
# ('...') from package prefixes, so literals are not marked.
# Be sure to:
# 1) include whitespace between a subprogram name and its opening {
# 2) include whitespace before a comment (so that $# doesn't get
# interpreted as one).
perl4:\
:pb=^\d?(sub|package)\d\p\d:\
:id=$%@_:\
:bb=\e{:be=\e}:cb=\d\e#:ce=$:sb=\e":se=\e":\
:kw=accept alarm atan2 bind binmode caller chdir chmod chop \
chown chroot close closedir connect continue cos crypt defined delete \
die do dump each else elsif eof eval exec exit exp fcntl fileno flock \
for foreach fork format getc gmtime goto grep hex if include index int \
ioctl join keys kill last length link listen local localtime log lstat \
m mkdir next oct open opendir ord pack package pipe pop print printf \
push q qq qx rand read readdir readlink recv redo rename require reset \
return reverse rewinddir rindex rmdir s scalar seek seekdir select send \
shift shutdown sin sleep socket socketpair sort splice split sprintf \
sqrt srand stat study sub substr symlink syscall sysread system \
syswrite tell telldir time times tr truncate umask undef unless unlink \
unpack unshift until utime values vec wait waitpid wantarray warn while \
write y:
3.5) How can I convert my perl scripts directly to C or compile them into
binary form?
This is UNPUBLISHED PROPRIETARY SOURCE CODE of XYZZY, Inc.; the
contents of this file may not be disclosed to third parties,
copied or duplicated in any form, in whole or in part, without
the prior written permission of XYZZY, Inc.
shar /usr/local/{lib,bin,man}/perl myprog
Just don't overwrite their own Perl installation if they have one!
3.6) Where can I get a perl-mode for emacs?
3.7) Is there a Perl shell?
Not really. Perl is a programming language, not a command
interpreter. There is a very simple one called ``perlsh''
included in the Perl source distribution. It just does this:
$/ = ''; # set paragraph mode
$SHlinesep = "\n";
while ($SHcmd = <>) {
$/ = $SHlinesep;
eval $SHcmd; print $@ || "\n";
$SHlinesep = $/; $/ = '';
}
Not very interesting, eh?
3.8) How can I use curses with perl?
menu.pl.v3.1.tar.Z
menu.pl is a complete menu front-end for perl+curses and demonstrates
a lot of things (plus it is useful to boot if you want full-screen
menu selection ability). It provides full-screen menu selection
ability for three menu styles (single-selection, multiple-selection,
and ``radio-button''). The ``perl menus'' package also includes routines
for full-screen data entry. A ``template'' concept is implemented to
create a simple (yet flexible) perl interface for building data-entry
screens for registration, database, or other record-oriented tasks.
See the question on retrieving perl via mail for more information on
how to retrieve other items of interest from the mail server
there.
begin
send PERL/cterm.shar.Z
end
3.9) How can I use X or Tk with Perl?
STDWIN is a library written by Guido van Rossum* (author of the Python
programming language) that is portable between Mac, Dos and X11. One
could write a Perl agent to speak to this STDWIN server.
ftp.wu-wien.ac.at[137.208.3.5]:pub/src/X11/wafe-0.9.tar.Z
#!/usr/local/bin/perl
#####################################################################
# An example of calling wish as a subshell under Perl and
# interactively communicating with it through sockets.
#
# The script is directly based on Gustaf Neumann's perlwafe script.
#
# Dov Grobgeld dov@menora.weizmann.ac.il
# 1993-05-17
#####################################################################
$wishbin = "/usr/local/bin/wish";
die "socketpair unsuccessful: $!!\n" unless socketpair(W0,WISH,1,1,0);
if ($pid=fork) {
select(WISH); $| = 1;
select(STDOUT);
# Create some TCL procedures
print WISH 'proc echo {s} {puts stdout $s; flush stdout}',"\n";
# Create the widgets
print WISH <<TCL;
# This is a comment "inside" wish
frame .f -relief raised -border 1 -bg green
pack append . .f {top fill expand}
button .f.button-pressme -text "Press me" -command {
echo "That's nice."
}
button .f.button-quit -text quit -command {
echo "quit"
}
pack append .f .f.button-pressme {top fill expand} \\
.f.button-quit {top expand}
TCL
;
# Here is the main loop which receives and sends commands
# to wish.
while (
3.10) Can I dynamically load C user routines?
dyl.h
dyl.c - code extracted from Oliver Sharp's article
hash.h
hash.c - Berkeley's hash functions, should use perl's but
could not be bothered
dylperl.c - perl usersubs
user.c - userinit function
sample.c - sample code to be dyl'ed
sample2.c - "
test.pl - sample perl script that dyl's sample*.o
&dyl("file.o"): dynamically link file.o. All functions
and non-static variables become visible from within perl. This
function returns a pointer to an internal hash table corresponding
to the symbol table of the newly loaded code.
eg: $ht = &dyl("sample.o")
This function can also be called with the -L and -l ld options.
eg: $ht = &dyl("sample2.o", "-L/usr/lib", "-lm")
will also pick up the math library if sample.o
accesses any symbols there.
&dyl_find("func"): find symbol 'func' and return its symbol table entry
&dyl_functions($ht): print the contents of the internal hash table
&dyl_print_symbols($f): prints the contents of the symbol returned by
dyl_find()
There is very little documentation, maybe something to do for a future
release. The files sample.o, and sample2.o contain code to be
incrementally loaded, test.pl is the test perl script.
3.11) What is undump and where can I get it?
3.12) How can I get #!perl to work under MS-DOS?
3.13) Can I write useful perl programs on the command line?
# what's octal value of random char (":" in this case)?
perl -e 'printf "%#o\n", ord(shift)' ":"
# sum first and last fields
perl -lane 'print $F[0] + $F[$#F]'
# strip high bits
perl -pe 'tr/\200-\377/\000-\177/'
# find text files
perl -le 'for(@ARGV) {print if -f && -T}' *
# trim newsrc
perl5 -i.old -pe 's/!.*?(\d+)$/! 1-$1/' ~/.newsrc
# cat a dbmfile
perl -e 'dbmopen(%f,shift,undef);while(($k,$v)=each%f){print "$k: $v\n"}' /etc/aliases
# remove comments from C program
perl5 -0777 -pe 's{/\*.*?\*/}{}gs' foo.c
# make file a month younger than today, defeating reaper daemons
perl -e '$X=24*60*60; utime(time(),time() + 30 * $X,@ARGV)' *
# find first unused uid
perl5 -le '$i++ while getpwuid($i); print $i'
# find first unused uid after 100, even with perl4
perl -le '$i = 100; $i++ while ($x) = getpwuid($i); print $i'
# detect pathetically insecurable systems
perl5 -le 'use POSIX; print "INSECURE" unless sysconf(_PC_CHOWN_RESTRICTED)'
# display reasonable manpath
echo $PATH | perl5 -nl -072 -e '
s![^/+]*$!man!&&-d&&!$s{$_}++&&push@m,$_;END{print"@m"}'
sub newprint {
my $x = shift;
return sub { my $y = shift; print "$x, $y!\n"; };
}
$h = newprint("Howdy");
$g = newprint("Greetings");
# Time passes...
&$h("world");
&$g("earthlings");
Howdy, world!
Greetings, earthlings!
Note particularly that $x continues to refer to the value passed into
newprint() despite the fact that the my $x has seemingly gone out
of scope by the time the anonymous subroutine runs. That's what
closure is all about.
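Each call to a generator like newprint() gets its own private copy of the lexical, and that copy can carry state from one call to the next. A minimal sketch of that point, using a hypothetical make_counter() that is not part of Perl itself:

```perl
use strict;

# make_counter() is a hypothetical helper for illustration.
# Each call creates a fresh lexical $count, which the returned
# anonymous subroutine captures and keeps alive.
sub make_counter {
    my $count = shift;
    return sub { return $count++; };
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# The two counters advance independently of each other.
my @seen = (&$from_one(), &$from_one(), &$from_ten(), &$from_one());
print "@seen\n";    # prints "1 2 10 3"
```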
General Programming, Regexps, and I/O
4.1) What are all these $@%*<> signs and how do I know when to use them?
4.2) Why do Perl operators have different precedence than C operators?
4.3) What's the difference between dynamic and static (lexical) scoping?
4.4) What's the difference between deep and shallow binding?
4.5) How can I manipulate fixed-record-length files?
4.6) How can I make a file handle local to a subroutine?
4.7) How can I sleep or alarm for under a second?
4.8) How can I do an atexit() or setjmp()/longjmp()? (Exception handling)
4.9) How can I catch signals?
4.10) Why isn't my octal data interpreted correctly?
4.11) How can I compare two date strings?
4.12) How can I find the Julian Day?
4.13) Does perl have a round function? What about ceil() and floor()?
4.14) What's the fastest way to code up a given task in perl?
4.15) Do I always/never have to quote my strings or use semicolons?
4.16) What is variable suicide and how can I prevent it?
4.17) What does ``Malformed command links'' mean?
4.18) How can I set up a footer format to be used with write()?
4.19) Why does my program keep growing in size?
4.20) Can I do RPC?
4.21) Why doesn't my sockets program work under System V (Solaris)? What does the error message ``Protocol not supported'' mean?
4.22) How can I quote a variable to use in a regexp?
4.23) How can I change the first N letters of a string?
4.24) How can I count the number of occurrences of a substring within a string?
4.25) Can I use Perl regular expressions to match balanced text?
4.26) What does it mean that regexps are greedy? How can I get around it?
4.27) How do I use a regular expression to strip C style comments from a file?
4.28) How can I split a [character] delimited string except when inside [character]?
4.29) Why doesn't local($foo) = <FILE>; work right?
4.30) How can I detect keyboard input without reading it?
4.31) How can I read a single character from the keyboard under UNIX and DOS?
4.32) How can I get input from the keyboard without it echoing to the screen?
4.33) Is there any easy way to strip blank space from the beginning/end of a string?
4.34) How can I output my numbers with commas added?
4.35) How do I expand tabs in a string?
4.36) What's wrong with grep in a void context?
4.1) What are all these $@%*<> signs and how do I know when to use them?
$foo = BAR;
Which will be interpreted as
$foo = 'BAR';
and not as
$foo = <BAR>;
If you always quote your strings, you'll avoid this trap.
open (FILE, ">/tmp/foo.$$");
print FILE "string\n";
close FILE;
If instead of a filehandle, you use a normal scalar variable with file
manipulation functions, this is considered an indirect reference to a
filehandle. For example,
$foo = "TEST01";
open($foo, "file");
After the open, these two while loops are equivalent:
while (<$foo>) {}
while (<TEST01>) {}
as are these two statements:
close $foo;
close TEST01;
but NOT to this:
while (<$TEST01>) {} # error
        ^
        ^ note spurious dollar sign
This is another common novice mistake; often it's assumed that
open($foo, "output.$$");
will fill in the value of $foo, which was previously undefined. This
just isn't so -- you must set $foo to be the name of a filehandle
before you attempt to open it.
How about changing perl syntax to be more like awk or C? I $$mean @less
$-signs =
Then it would be less like the shell. :-)
&california || &bust;
california or bust;
4.2) How come Perl operators have different precedence than C operators?
unlink $foo, "bar", @names, "others";
unlink "a_file" || die "snafu";
unlink("a_file" || die "snafu");
unlink("a_file") || die "snafu";
(unlink "a_file") || die "snafu";
unlink $foo, "bar", @names, "others" or die "snafu";
unless ($io_ok = print("some", "list")) { }
$io_ok = print(2+4) * 5;
print(2+4) * 5;
returns the same 5*io_success value and tosses it.
*x[i]
then the brackets will bind more tightly than the star, yielding
*(x[i])
But in perl, they DO NOT! That's because the ${}, @{}, %{}, and &{}
notations (and I suppose the *{} one as well for completeness) aren't
actually operators. If they were, you'd be able to write them as *()
and that's not feasible. Instead of operators whose precedence is
easily understandable, they are instead figments of yacc's grammar.
This means that:
$$x[$i]
is really
{$$x}[$i]
which is actually
${$x}[$i]
and not
${$x[$i]}
4.3) What's the difference between dynamic and static (lexical) scoping?
What are my() and local()?
[NOTE: This question refers to perl5 only. There is no my() in perl4]
Scoping refers to visibility of variables. A dynamic variable is
created via local() and is just a local value for a global variable,
whereas a lexical variable created via my() is more what you're
expecting from a C auto. (See also ``What's the difference between
deep and shallow binding.'') In general, we suggest you use lexical
variables wherever possible, as they're faster to access and easier to
understand. The ``use strict vars'' pragma will enforce that all
variables are either lexical, or fully qualified by package name. We
strongly suggest that you develop your code with ``use strict;'' and the
-w flag. (When using formats, however, you will still have to use
dynamic variables.) Here's an example of the difference:
#!/usr/local/bin/perl
$myvar = 10;
$localvar = 10;
print "Before the sub call - my: $myvar, local: $localvar\n";
&sub1();
print "After the sub call - my: $myvar, local: $localvar\n";
exit(0);
sub sub1 {
my $myvar;
local $localvar;
$myvar = 5; # Only in this block
$localvar = 20; # Accessible to children
print "Inside first sub call - my: $myvar, local: $localvar\n";
&sub2();
}
sub sub2 {
print "Inside second sub - my: $myvar, local: $localvar\n";
}
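To see the difference without wading through print statements, here is a minimal sketch (perl5 only). The names $setting, show(), with_local() and with_my() are made up for illustration. The point is that a sub called from inside the scope of a local() sees the temporary value, whereas a my() variable is invisible to anything outside its own block.

```perl
use strict;

our $setting = "global";          # package variable, visible to local()
my  @seen;                        # collects what each call observed

sub show { push @seen, $setting } # reads whatever $setting currently is

sub with_local {
    local $setting = "dynamic";   # temporary value, seen by called subs
    show();
}

sub with_my {
    my $setting = "lexical";      # private to this block only
    show();                       # still sees the package variable
}

with_local();
with_my();
show();
print join(", ", @seen), "\n";    # dynamic, global, global
```

Once with_local() returns, the old value of the package variable is restored automatically, which is why the last call sees "global" again.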
4.4) What's the difference between deep and shallow binding?
{
my $x = time;
$coderef = sub { $x };
}
{
my $x = time;
$coderef = eval "sub { \$x }";
}
require 5.001;
sub mkcounter {
my $start = shift;
return sub {
return ++$start;
}
}
$f1 = mkcounter(10);
$f2 = mkcounter(20);
print &$f1(), &$f2();
11 21
print &$f1(), &$f2(), &$f1();
12 22 13
4.5) How can I manipulate fixed-record-length files?
# sample input line:
# 15158 p5 T 0:00 perl /mnt/tchrist/scripts/now-what
$ps_t = 'A6 A4 A7 A5 A*';
open(PS, "ps|");
$_ =
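As a rough sketch of how such an unpack() template carves up one fixed-width record, consider the following. The record is built with sprintf() so that it is padded to exactly the widths in the template. Those widths and the sample values are assumptions for illustration, since real ps output varies from system to system.

```perl
use strict;

# Column widths assumed for illustration: pid 6, tty 4, stat 7, time 5, rest.
my $ps_t = 'A6 A4 A7 A5 A*';

# Build one hypothetical record padded to exactly those widths.
my $line = sprintf("%-6s%-4s%-7s%-5s%s",
                   "15158", "p5", "T", "0:00",
                   "perl /mnt/tchrist/scripts/now-what");

# The A format strips trailing blanks from each field for us.
my ($pid, $tt, $stat, $time, $command) = unpack($ps_t, $line);
print "pid=$pid tty=$tt stat=$stat time=$time cmd=$command\n";
```

The same template handed to pack() goes the other way and produces a padded record from a list of fields.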
4.6) How can I make a file handle local to a subroutine?
sub cat_include {
local($name) = @_;
local(*FILE);
local($_);
warn "
4.7) How can I call alarm() or usleep() from Perl?
# alarm; send me a SIGALRM in this many seconds (fractions ok)
# tom christiansen <tchrist@mox.perl.com>
sub alarm {
require 'syscall.ph';
require 'sys/time.ph';
local($ticks) = @_;
local($in_timer,$out_timer);
local($isecs, $iusecs, $secs, $usecs);
local($itimer_t) = 'L4'; # should be &itimer'typedef()
$secs = int($ticks);
$usecs = ($ticks - $secs) * 1e6;
$out_timer = pack($itimer_t,0,0,0,0);
$in_timer = pack($itimer_t,0,0,$secs,$usecs);
syscall(&SYS_setitimer, &ITIMER_REAL, $in_timer, $out_timer)
&& die "alarm: setitimer syscall failed: $!";
($isecs, $iusecs, $secs, $usecs) = unpack($itimer_t,$out_timer);
return $secs + ($usecs/1e6);
}
4.8) How can I do an atexit() or setjmp()/longjmp() in Perl? (Exception handling)
$SIG{ALRM} = 'TIMEOUT';
sub TIMEOUT { die "restart input\n" }
do { eval { &realcode } } while $@ =~ /^restart input/;
sub realcode {
alarm 15;
$ans =
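Stripped of the alarm handling, the underlying exception mechanism is just die() inside an eval block. Here is a minimal perl5 sketch. The routine risky() and its error text are hypothetical, chosen only to have something that fails.

```perl
use strict;

# risky() is a hypothetical routine that fails on bad input.
sub risky {
    my $n = shift;
    die "negative input\n" if $n < 0;
    return sqrt($n);
}

my @report;
for my $n (16, -1) {
    my $root = eval { risky($n) };   # die inside eval sets $@ instead of exiting
    if ($@) {
        chomp(my $err = $@);
        push @report, "$n: caught '$err'";
    } else {
        push @report, "$n: root is $root";
    }
}
print "$_\n" for @report;
```

The die() plays the part of longjmp() and the enclosing eval plays the part of setjmp(); the message ends up in $@.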
sub atexit { push(@_exit_subs, @_) }
sub _cleanup { unlink $tmp }
&atexit('_cleanup');
eval <<'End_Of_Eval'; $here = __LINE__;
# as much code here as you want
End_Of_Eval
$oops = $@; # save error message
# now call his stuff
for (@_exit_subs) { &$_() }
$oops && ($oops =~ s/\(eval\) line (\d+)/$0 .
" line " . ($1+$here)/e, die $oops);
4.9) How do I catch signals in perl?
$SIG{'INT'} = 'CLEANUP';
sub CLEANUP {
print "\n\nCaught Interrupt (^C), Aborting\n";
exit(1);
}
#!/usr/bin/perl -w
require 5.001;
$SIG{__WARN__} = sub {
if ($_[0] =~ /uninit/) {
die $_[0];
} else {
warn $_[0];
}
};
4.10) Why doesn't Perl interpret my octal data octally?
{
print "What mode would you like? ";
$mode = <STDIN>;
$mode = oct($mode);
unless ($mode) {
print "You can't really want mode 0!\n";
redo;
}
chmod $mode, $file;
}
$val = oct($val) if $val =~ /^0/;
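A small sketch of what goes wrong and how oct() puts it right. The string "644" stands in for whatever the user typed. Used as a plain number it is decimal 644, which is not the file mode anyone meant.

```perl
use strict;

my $from_user = "644";                 # what <STDIN> would hand you, minus newline
my $as_number = $from_user + 0;        # decimal 644, not a useful file mode
my $as_mode   = oct($from_user);       # 0644 octal, i.e. decimal 420

# oct() also understands a leading 0x, so one call handles both notations.
my $hex = oct("0x1f");                 # 31

printf "%d %d %o %d\n", $as_number, $as_mode, $as_mode, $hex;
```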
4.11) How can I compare two date strings?
sub getdate {
local($_) = shift;
s/-(\d{4})$/+$1/ || s/\+(\d{4})$/-$1/;
# getdate has broken timezone sign reversal!
$_ = `/usr/local/lib/news/newsbin/getdate '$_'`;
chop;
$_;
}
date.pl - print dates how you want with the sysv +FORMAT method
date.shar - routines to manipulate and calculate dates
ftp-chat2.shar - updated version of ftpget. includes library and demo
programs
getdate.shar - returns number of seconds since epoch for any given
date
ptime.shar - print dates how you want with the sysv +FORMAT method
4.12) How can I find the Julian Day?
#!/usr/local/bin/perl
@theJulianDate = ( 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 );
#************************************************************************
#**** Return 1 if we are after the leap day in a leap year. *****
#************************************************************************
sub leapDay
{
my($year,$month,$day) = @_;
if ($year % 4) {
return(0);
}
if (!($year % 100)) { # years that are multiples of 100
# are not leap years
if ($year % 400) { # unless they are multiples of 400
return(0);
}
}
if ($month < 2) {
return(0);
} elsif (($month == 2) && ($day < 29)) {
return(0);
} else {
return(1);
}
}
#************************************************************************
#**** Pass in the date, in seconds, of the day you want the *****
#**** julian date for. If your localtime() returns the year day *****
#**** return that, otherwise figure out the julian date. *****
#************************************************************************
sub julianDate
{
my($dateInSeconds) = @_;
my($sec, $min, $hour, $mday, $mon, $year, $wday, $yday);
($sec, $min, $hour, $mday, $mon, $year, $wday, $yday) =
localtime($dateInSeconds);
if (defined($yday)) {
return($yday+1);
} else {
return($theJulianDate[$mon] + $mday + &leapDay($year,$mon,$mday));
}
}
print "Today's julian date is: ",&julianDate(time),"\n";
4.13) Does perl have a round function? What about ceil() and floor()?
sub round {
my($number) = shift;
return int($number + .5);
}
return int($number + .5 * ($number <=> 0));
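Because int() truncates toward zero, adding a flat .5 misrounds negative numbers: int(-2.6 + .5) is -2, not -3. The sign-aware form can be sketched as a complete sub like this, assuming ordinary-sized floating-point input:

```perl
use strict;

# Round half away from zero; assumes ordinary-sized floating-point input.
sub round {
    my($number) = shift;
    return int($number + .5 * ($number <=> 0));
}

my @rounded = map { round($_) } (2.4, 2.6, -2.4, -2.6, 0);
print "@rounded\n";
```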
#!/usr/local/bin/perl
use POSIX qw(ceil floor);
$num = 42.4; # The Answer to the Great Question (on a Pentium)!
print "Floor returns: ", floor($num), "\n";
print "Ceil returns: ", ceil($num), "\n";
Which prints:
Floor returns: 42
Ceil returns: 43
4.14) What's the fastest way to code up a given task in perl?
$COUNT = 10_000; $| = 1;
print "method 1: ";
($u, $s) = times;
for ($i = 0; $i < $COUNT; $i++) {
# code for method 1
}
($nu, $ns) = times;
printf "%8.4fu %8.4fs\n", ($nu - $u), ($ns - $s);
print "method 2: ";
($u, $s) = times;
for ($i = 0; $i < $COUNT; $i++) {
# code for method 2
}
($nu, $ns) = times;
printf "%8.4fu %8.4fs\n", ($nu - $u), ($ns - $s);
use Benchmark;
timethese($count, {
Name1 => '...code for method 1...',
Name2 => '...code for method 2...',
... });
Benchmark: timing 100 iterations of Name1, Name2...
Name1: 2 secs (0.50 usr 0.00 sys = 0.50 cpu)
Name2: 1 secs (0.48 usr 0.00 sys = 0.48 cpu)
use Benchmark;
timethese(100000, {
'regex1' => '$str="ABCD"; $str =~ s/^(.)//; $ch = $1',
'regex2' => '$str="ABCD"; $str =~ s/^.//; $ch = $&',
'substr' => '$str="ABCD"; $ch=substr($str,0,1); substr($str,0,1)=""',
});
Benchmark: timing 100000 iterations of regex1, regex2, substr...
regex1: 11 secs (10.80 usr 0.00 sys = 10.80 cpu)
regex2: 10 secs (10.23 usr 0.00 sys = 10.23 cpu)
substr: 7 secs ( 5.62 usr 0.00 sys = 5.62 cpu)
4.15) Do I always/never have to quote my strings or use semicolons?
$SIG{INT} = Timeout_Routine;
or
@Days = (Sun, Mon, Tue, Wed, Thu, Fri, Sat, Sun);
$foo{while} = until;
$foo{'while'} = 'until';
for (1..10) { print }
@nlist = sort { $a <=> $b } @olist;
for ($i = 0; $i < @a; $i++) {
print "i is $i\n" # <-- oops!
}
This is like this
------------ ---------------
$foo{line} $foo{"line"}
bar => stuff "bar" => stuff
4.16) What is variable suicide and how can I prevent it?
$x = 17;
&munge($x);
sub munge {
local($x);
local($myvar) = $_[0];
...
}
sub munge {
local($myvar) = $_[0];
local($x);
...
}
@num = 0 .. 4;
print "num begin @num\n";
foreach $m (@num) { &ug }
print "num finish @num\n";
sub ug {
local($m) = 42;
print "m=$m $num[0],$num[1],$num[2],$num[3]\n";
}
Which prints out the mysterious:
num begin 0 1 2 3 4
m=42 42,1,2,3
m=42 0,42,2,3
m=42 0,1,42,3
m=42 0,1,2,42
m=42 0,1,2,3
num finish 0 1 2 3 4
4.17) What does ``Malformed command links'' mean?
4.18) How can I set up a footer format to be used with write()?
4.19) Why does my Perl program keep growing in size?
for (1..100) {
local(@array);
}
local(@array);
for (1..100) {
undef @array;
}
sub oops {
my $x;
$x = \$x;
}
4.22) How can I quote a variable to use in a regexp?
$pattern =~ s/(\W)/\\$1/g;
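Here is a short sketch of why the quoting matters. The sample strings are invented. Unquoted, "a.b*c" used as a pattern would happily match "aXbbc" and "axc" as well; once every non-word character is backslashed, only the literal text matches. In perl5 the builtin quotemeta() does the same job.

```perl
use strict;

my $pattern = "a.b*c";                       # hypothetical user-supplied text
(my $quoted = $pattern) =~ s/(\W)/\\$1/g;    # backslash every non-word char

my @hits = grep { /^$quoted$/ } ("a.b*c", "aXbbc", "axc");
print "quoted: $quoted\n";
print "hits: @hits\n";
```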
4.23) How can I change the first N letters of a string?
substr($var,0,1) = 'S';
substr($var,$[,1) = 'S';
While it would be slower, you could in this case use a substitute:
$var =~ s/^./S/;
But this won't work if the string is empty or its first character is a
newline, which ``.'' will never match. So you could use this instead:
$var =~ s/^[^\0]?/S/;
substr($var, $[, 10) =~ tr/a-z/A-Z/;
/^(\S+)/ && substr($_,$[,length($1)) =~ tr/a-z/A-Z/;
For some things it's convenient to use the /e switch of the substitute
operator:
s/^(\S+)/($tmp = $1) =~ tr#a-z#A-Z#, $tmp/e
4.24) How can I count the number of occurrences of a substring within a
string?
$string="ThisXlineXhasXsomeXx'sXinXit";
$count = ($string =~ tr/X//);
print "There are $count Xs in the string";
$string="-9 55 48 -2 23 -76 4 14 -44";
$count++ while $string =~ /-\d+/g;
print "There are $count negative numbers in the string";
4.25) Can I use Perl regular expressions to match balanced text?
while(<>) {
if (/pat1/) {
if ($inpat++ > 0) { warn "already saw pat1" }
redo;
}
if (/pat2/) {
if (--$inpat < 0) { warn "never saw pat1" }
redo;
}
}
4.26) What does it mean that regexps are greedy? How can I get around it?
$_="this (is) an (example) of multiple parens";
while ( m#\((.*)\)#g ) {
print "$1\n";
}
is
example
while ( m#\(([^)]*)\)#g ) {
while (m#\((.*?)\)#g )
s:/\*.*?\*/::gs
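To compare the three approaches side by side, this sketch runs each pattern over the same string in list context and collects what the parentheses captured. The array names are made up for illustration.

```perl
use strict;

my $text = "this (is) an (example) of multiple parens";

my @greedy = $text =~ m#\((.*)\)#g;      # .* runs to the last ")"
my @lazy   = $text =~ m#\((.*?)\)#g;     # perl5: .*? stops at the first ")"
my @negate = $text =~ m#\(([^)]*)\)#g;   # perl4-compatible alternative

print "greedy: @greedy\n";
print "lazy:   @lazy\n";
print "negate: @negate\n";
```

The greedy version finds a single match, "is) an (example", while the other two each find "is" and "example".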
4.27) How do I use a regular expression to strip C style comments from a
file?
#!/usr/bin/perl
$/ = undef;
$_ = <>;
s#/\*[^*]*\*+([^/*][^*]*\*+)*/|([^/"']*("[^"\\]*(\\[\d\D][^"\\]*)*"[^/"']*|'[^'\\]*(\\[\d\D][^'\\]*)*'[^/"']*|/+[^*/][^/"']*)*)#$2#g;
print;
#!/usr/local/bin/perl
$/ = undef;
$_ = <>;
s#//(.*)|/\*[^*]*\*+([^/*][^*]*\*+)*/|"(\\.|[^"\\])*"|'(\\.|[^'\\])*'|[^/"']+# $1 ? "/*$1 */" : $& #ge;
print;
4.28) How can I split a [character] delimited string except when inside
[character]?
SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"
undef @fields;
push(@fields, defined($1) ? $1:$3)
while m/"([^"\\]*(\\.[^"\\]*)*)"|([^,]+)/g;
4.29) Why doesn't local($foo) = <FILE> work right?
local($foo);
$foo = <FILE>;
local($foo) = scalar(<FILE>);
4.30) How can I detect keyboard input without reading it?
sub key_ready {
local($rin, $nfd);
vec($rin, fileno(STDIN), 1) = 1;
return $nfd = select($rin,undef,undef,0);
}
4.31) How can I read a single character from the keyboard under UNIX and DOS?
$BSD = -f '/vmunix';
if ($BSD) {
system "stty cbreak /dev/tty 2>&1";
}
else {
system "stty", '-icanon';
system "stty", 'eol', "\001";
}
$key = getc(STDIN);
if ($BSD) {
system "stty -cbreak /dev/tty 2>&1";
}
else {
system "stty", 'icanon';
system "stty", 'eol', '^@'; # ascii null
}
print "\n";
sub set_cbreak { # &set_cbreak(1) or &set_cbreak(0)
local($on) = $_[0];
local($sgttyb,@ary);
require 'sys/ioctl.ph';
$sgttyb_t = 'C4 S' unless $sgttyb_t; # c2ph: &sgttyb'typedef()
ioctl(STDIN,&TIOCGETP,$sgttyb) || die "Can't ioctl TIOCGETP: $!";
@ary = unpack($sgttyb_t,$sgttyb);
if ($on) {
$ary[4] |= &CBREAK;
$ary[4] &= ~&ECHO;
} else {
$ary[4] &= ~&CBREAK;
$ary[4] |= &ECHO;
}
$sgttyb = pack($sgttyb_t,@ary);
ioctl(STDIN,&TIOCSETP,$sgttyb) || die "Can't ioctl TIOCSETP: $!";
}
$old_ioctl = ioctl(STDIN,0,0); # Gets device info
$old_ioctl &= 0xff;
ioctl(STDIN,1,$old_ioctl | 32); # Writes it back, setting bit 5
sysread(STDIN,$c,1); # Read a single character
ioctl(STDIN,1,$old_ioctl); # Sets it back to cooked mode.
# PC 2-byte keycodes = ^@ + the following:
# HEX KEYS
# --- ----
# 0F SHF TAB
# 10-19 ALT QWERTYUIOP
# 1E-26 ALT ASDFGHJKL
# 2C-32 ALT ZXCVBNM
# 3B-44 F1-F10
# 47-49 HOME,UP,PgUp
# 4B LEFT
# 4D RIGHT
# 4F-53 END,DOWN,PgDn,Ins,Del
# 54-5D SHF F1-F10
# 5E-67 CTR F1-F10
# 68-71 ALT F1-F10
# 73-77 CTR LEFT,RIGHT,END,PgDn,HOME
# 78-83 ALT 1234567890-=
# 84 CTR PgUp
4.32) How can I get input from the keyboard without it echoing to the
screen?
print "Please enter your password: ";
system("stty -echo");
chop($password=<STDIN>);
print "\n";
system("stty echo");
4.33) Is there any easy way to strip blank space from the beginning/end of
a string?
s/^\s*(.*?)\s*$/$1/; # perl5 only!
s/^\s+|\s+$//g; # perl4 or perl5
s/^\s+//; s/\s+$//;
$_ = $1 if m/^\s*((.*\S)?)/;
or
s/^\s*((.*\S)?)\s*$/$1/;
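If you do this often, the two-substitution form wraps up nicely in a sub. This is only a sketch, and the name trim() is our own invention rather than a builtin.

```perl
use strict;

# trim() is a hypothetical helper name wrapping the two-substitution form.
sub trim {
    my $s = shift;
    $s =~ s/^\s+//;      # leading blanks
    $s =~ s/\s+$//;      # trailing blanks, including a final newline
    return $s;
}

my $clean = trim("  \t hello world \n");
print "[$clean]\n";
```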
4.34) How can I print out a number with commas into it?
sub commify {
local($_) = shift;
1 while s/^(-?\d+)(\d{3})/$1,$2/;
return $_;
}
$n = 23659019423.2331;
print "GOT: ", &commify($n), "\n";
GOT: 23,659,019,423.2331
The reason you can't just do
s/^(-?\d+)(\d{3})/$1,$2/g;
is that you have to put the comma in and then recalculate everything.
Some substitutions need to work this way. See the question on
expanding tabs for another example.
4.35) How do I expand tabs in a string?
1 while s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
while (s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e) {
# spin, spin, spin, ....
}
sub tab_expand {
local($_) = shift;
1 while s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
return $_;
}
$NG = "/usr/local/lib/news/newsgroups";
open(NG, "< $NG") || die "can't open $NG: $!";
while (
4.36) What's wrong with grep() or map() in a void context?
@bignums = grep ($_ > 100, @allnums);
@triplist = map {$_ * 3} @allnums;
grep { $_ *= 3 } @nums;
for (@nums) { $_ *= 3 }
fork ? wait : exec $prog;
if (fork) {
wait;
} else {
exec $prog;
die "can't exec $prog: $!";
}
Notice at no point did cleverness enter the picture.
Arrays and Shell and External Program Interactions
5.1) What is the difference between $array[1] and @array[1]?
5.2) How can I make an array of arrays or other recursive data types?
5.3) How can I make an array of structures containing various data types?
5.4) How can I extract just the unique elements of an array?
5.5) How can I tell whether an array contains a certain element?
5.6) How can I sort an associative array by value instead of by key?
5.7) How can I know how many entries are in an associative array?
5.8) What's the difference between "delete" and "undef" with %arrays?
5.9) Why don't backticks work as they do in shells?
5.10) Why does my converted awk/sed/sh script run more slowly in perl?
5.11) How can I call my system's unique C functions from perl?
5.12) Where do I get the include files to do ioctl() or syscall()? [h2ph]
5.13) Why do setuid perl scripts complain about kernel problems?
5.14) How can I open a pipe both to and from a command?
5.15) How can I capture STDERR from an external command?
5.16) Why doesn't open() return an error when a pipe open fails?
5.17) Why can't my script read from STDIN after I gave it ^D (EOF)?
5.18) How can I translate tildes (~) in a filename?
5.19) How can I convert my shell script to perl?
5.20) Can I use perl to run a telnet or ftp session?
5.21) Why do I sometimes get an "Argument list too long" when I use <*>?
5.22) How do I do a "tail -f" in perl?
5.23) Is there a way to hide perl's command line from programs such as "ps"?
5.24) I {changed directory, modified my environment} in a perl script. How
come the change disappeared when I exited the script? How do I get
my changes to be visible?
5.25) How can I pass a filehandle to a function, or make a list of
filehandles?
5.26) How can I open a file with a leading ">" or trailing blanks?
5.27) How can I tell if a variable is tainted?
5.1) What is the difference between $array[1] and @array[1]?
@foo[0] = `cmd args`;
@foo[1] = <FILE>;
Just always say $foo[1] and you'll be happier.
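The practical difference is one of context: assigning to a one-element slice puts the right-hand side in list context. A quick perl5 sketch makes this visible. The probe sub context() is made up for the purpose and simply reports how it was called, via wantarray.

```perl
use strict;

# context() is a hypothetical probe reporting how it was called.
sub context { wantarray ? "list" : "scalar" }

my @foo;
$foo[0] = context();     # element assignment: scalar context
@foo[1] = context();     # one-element slice: list context

print "@foo\n";
```

With a file read or backticks on the right, list context means you slurp every line and keep only the first, which is rarely what was intended.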
5.2) How can I make an array of arrays or other recursive data types?
@A = (
[ 'ww' .. 'xx' ],
[ 'xx' .. 'yy' ],
[ 'yy' .. 'zz' ],
[ 'zz' .. 'zzz' ],
);
%T = (
key0, { k0, v0, k1, v1 },
key1, { k2, v2, k3, v3 },
key2, { k2, v2, k3, [ 'a' .. 'z' ] },
);
Allowing you to reference $T{key2}->{k3}->[3] to pull out 'd'.
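Here is a cut-down, runnable sketch of the same perl5 structures with the keys quoted, showing how to reach individual elements and how to count rows and columns. The variable names $second, $letter, $rows and $width are ours.

```perl
use strict;

my @A = (
    [ 'ww' .. 'xx' ],
    [ 'xx' .. 'yy' ],
);
my %T = (
    'key0', { 'k0', 'v0', 'k1', 'v1' },
    'key2', { 'k2', 'v2', 'k3', [ 'a' .. 'z' ] },
);

my $second = $A[0][1];                 # same as $A[0]->[1]
my $letter = $T{key2}->{k3}->[3];
my $rows   = scalar @A;                # number of inner arrays
my $width  = scalar @{ $A[0] };        # elements in the first inner array

print "$second $letter $rows $width\n";
```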
$ary = $name[$i];
$val = eval "\$$ary[$j]";
$val = eval "\$$name[$i][\$j]";
{ local(*ary) = $name[$i]; $val = $ary[$j]; }
@w = ( 'ww' .. 'xx' );
@x = ( 'xx' .. 'yy' );
@y = ( 'yy' .. 'zz' );
@z = ( 'zz' .. 'zzz' );
@ww = reverse @w;
@xx = reverse @x;
@yy = reverse @y;
@zz = reverse @z;
@A = ( *w, *x, *y, *z );
@B = ( *ww, *xx, *yy, *zz );
@AAA = ( *A, *B );
local(*foo) = $AAA[$i];
local(*bar) = $foo[$j];
$answer = $bar[$k];
5.3) How do I make an array of structures containing various data types?
%foo = (
'field1' => "value1",
'field2' => "value2",
'field3' => "value3",
...
);
...
@all = ( \%foo, \%bar, ... );
print $all[0]{'field1'};
@all = (
{
'field1' => "value1",
'field2' => "value2",
'field3' => "value3",
...
},
{
'field1' => "value1",
'field2' => "value2",
'field3' => "value3",
...
},
...
);
$t{$value} = [ @bar ];
%{$a[$i]} = %old;
$table{'some key'} = @big_list_o_stuff; # SCARY #0
$table{'some key'} = \@big_list_o_stuff; # SCARY #1
@$table{'some key'} = @big_list_o_stuff; # SCARY #2
@{$table{'some key'}} = @big_list_o_stuff; # ICKY RANDALIAN CODE
$table{'some key'} = [ @big_list_o_stuff ]; # same, but NICE
$table{"051"} = $some_scalar; # SCARY #3
$table{"0x51"} = $some_scalar; # ditto
$table{051} = $some_scalar; # ditto
$table{0x51} = $some_scalar; # ditto
$table{51} = $some_scalar; # ok, i guess
$table{"51"} = $some_scalar; # better
$table{\@x} = $some_scalar; # SCARY #4
$table{[@x]} = $some_scalar; # ditto
$table{@x} = $some_scalar; # SCARY #5 (cf #0)
5.4) How can I extract just the unique elements of an array?
$prev = 'nonesuch';
@out = grep($_ ne $prev && (($prev) = $_), @in);
undef %saw;
@out = grep(!$saw{$_}++, @in);
@out = grep(!$saw[$_]++, @in);
undef %saw;
@saw{@in} = ();
@out = sort keys %saw; # remove sort if undesired
undef @ary;
@ary[@in] = @in;
@out = sort @ary;
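As a worked example of the %saw idiom on made-up data: each element passes the grep only the first time it is seen, so input order is preserved, and the hash ends up holding a count of how often each one occurred.

```perl
use strict;

my @in = qw(pear apple pear fig apple);

my %saw;
my @out = grep(!$saw{$_}++, @in);    # keeps first occurrences, in input order

print "@out\n";
print "pear seen $saw{pear} times\n";
```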
5.5) How can I tell whether an array contains a certain element?
@blues = ('turquoise', 'teal', 'lapis lazuli');
undef %is_blue;
for (@blues) { $is_blue{$_} = 1; }
@primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31);
undef @is_tiny_prime;
for (@primes) { $is_tiny_prime[$_] = 1; }
@articles = ( 1..10, 150..2000, 2017 );
undef $read;
grep (vec($read,$_,1) = 1, @articles);
Now check whether vec($read,$n,1) is true for some $n.
5.6) How do I sort an associative array by value instead of by key?
foreach $key (sort by_value keys %ary) {
print $key, '=', $ary{$key}, "\n";
}
sub by_value { $ary{$a} cmp $ary{$b}; }
sub by_value { $ary{$b} <=> $ary{$a}; }
foreach $key ( sort { $ary{$b} <=> $ary{$a} } keys %ary ) {
print $key, '=', $ary{$key}, "\n";
}
foreach $key (&sort_by_value(*ary)) {
print $key, '=', $ary{$key}, "\n";
}
sub sort_by_value {
local(*x) = @_;
sub _by_value { $x{$a} cmp $x{$b}; }
sort _by_value keys %x;
}
@idx = ();
for (@data) { push (@idx, "\U$_") }
@sorted = @data[ sort { $idx[$a] cmp $idx[$b] } 0..$#data];
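Mind which comparison operator you use in the sort block. This sketch, with invented data, sorts the same keys by value twice, once numerically with <=> and once as strings with cmp. The results differ because the string "10" sorts before "3".

```perl
use strict;

my %ary = ('apple', 3, 'fig', 10, 'pear', 1);

my @by_num = sort { $ary{$a} <=> $ary{$b} } keys %ary;   # numeric, ascending
my @by_str = sort { $ary{$a} cmp $ary{$b} } keys %ary;   # string comparison

print "numeric: @by_num\n";
print "string:  @by_str\n";
```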
5.7) How can I know how many entries are in an associative array?
$count = keys %ARRAY;
5.8) What's the difference between "delete" and "undef" with %arrays?
keys values
+------+------+
| a | 3 |
| x | 7 |
| d | 0 |
| e | 2 |
+------+------+
$ary{'a'} is true
$ary{'d'} is false
defined $ary{'d'} is true
defined $ary{'a'} is true
exists $ary{'a'} is true (perl5 only)
grep ($_ eq 'a', keys %ary) is true
undef $ary{'a'}
keys values
+------+------+
| a | undef|
| x | 7 |
| d | 0 |
| e | 2 |
+------+------+
$ary{'a'} is FALSE
$ary{'d'} is false
defined $ary{'d'} is true
defined $ary{'a'} is FALSE
exists $ary{'a'} is true (perl5 only)
grep ($_ eq 'a', keys %ary) is true
delete $ary{'a'}
keys values
+------+------+
| x | 7 |
| d | 0 |
| e | 2 |
+------+------+
$ary{'a'} is false
$ary{'d'} is false
defined $ary{'d'} is true
defined $ary{'a'} is false
exists $ary{'a'} is FALSE (perl5 only)
grep ($_ eq 'a', keys %ary) is FALSE
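The tables above can be checked with a few lines of perl5. This sketch records, after each operation, whether the value is defined, whether the key exists, and how many keys remain. The variable names are ours.

```perl
use strict;

my %ary = ('a', 3, 'x', 7, 'd', 0);

undef $ary{'a'};                       # value gone, key stays
my $after_undef = join ",",
    (defined $ary{'a'} ? 1 : 0), (exists $ary{'a'} ? 1 : 0), scalar(keys %ary);

delete $ary{'a'};                      # key and value both gone
my $after_delete = join ",",
    (defined $ary{'a'} ? 1 : 0), (exists $ary{'a'} ? 1 : 0), scalar(keys %ary);

print "after undef:  $after_undef\n";
print "after delete: $after_delete\n";
```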
5.9) Why don't backticks work as they do in shells?
$foo = "$bar is `wc $file`"; # WRONG
$foo = "$bar is " . `wc $file`;
chop($back = `pwd`); chdir($somewhere); chdir($back);
Shell: foo=`cmd 'safe $dollar'`
Perl: $foo=`cmd 'safe \$dollar'`;
5.10) How come my converted awk/sed/sh script runs more slowly in Perl?
next if /Mon/;
next if /Tue/;
next if /Wed/;
next if /Thu/;
next if /Fri/;
runs faster than this:
next if /Mon/ || /Tue/ || /Wed/ || /Thu/ || /Fri/;
which in turn runs faster than this:
next if /Mon|Tue|Wed|Thu|Fri/;
which runs much faster than:
next if /(Mon|Tue|Wed|Thu|Fri)/;
@list = ('abc', 'def', 'ghi', 'jkl', 'mno', 'pqr', 'stv');
for $i ($[ .. $#list) {
if ($pattern eq $list[$i]) { $found++; }
}
foreach $elt (@list) {
if ($pattern eq $elt) { $found++; }
}
%list = ('abc', 1, 'def', 1, 'ghi', 1, 'jkl', 1,
'mno', 1, 'pqr', 1, 'stv', 1 );
$found += $list{$pattern};
(but put the %list assignment outside of your input loop.)
for $i (1..100) {
if (/$foo/o) {
&some_func($i);
}
}
@pats = ('_get.*', 'bogus', '_read', '.*exit', '_write');
foreach $pat (@pats) {
if ( $name =~ /^$pat$/ ) {
&some_func();
last;
}
}
@pats = ('_get.*', 'bogus', '_read', '.*exit', '_write');
$code = <<EOS;
while (<>) {
study;
EOS
foreach $pat (@pats) {
$code .= <<EOS;
if ( /^$pat\$/ ) {
&some_func();
next;
}
EOS
}
$code .= "}\n";
print $code if $debugging;
eval $code;
5.11) How can I call my system's unique C functions from Perl?
5.12) Where do I get the include files to do ioctl() or syscall()? [h2ph]
5.13) Why do setuid Perl scripts complain about kernel problems?
YOU HAVEN'T DISABLED SET-ID SCRIPTS IN THE KERNEL YET!
FIX YOUR KERNEL, PUT A C WRAPPER AROUND THIS SCRIPT, OR USE -u AND UNDUMP!
is triggered because setuid scripts are inherently insecure due to a
kernel bug. If your system has fixed this bug, you can compile Perl
so that it knows this. Otherwise, create a setuid C program that just
execs Perl with the full name of the script. Here's what the
perldiag(1)
man page says about this message:
YOU HAVEN'T DISABLED SET-ID SCRIPTS IN THE KERNEL YET!
(F) And you probably never will, since you probably don't have
the sources to your kernel, and your vendor probably doesn't
give a rip about what you want. Your best bet is to use the
wrapsuid script in the eg directory to put a setuid C wrapper
around your script.
5.14) How do I open a pipe both to and from a command?
# first write some_cmd's input into a_file, then
open(CMD, "some_cmd its_args < a_file |");
while (
# or else the other way; run the cmd
open(CMD, "| some_cmd its_args > a_file");
while ($condition) {
print CMD "some output\n";
# other code deleted
}
close CMD || warn "cmd exited $?";
# now read the file
open(FILE,"a_file");
while (<FILE>) {
5.15) How can I capture STDERR from an external command?
system $cmd;
$output = `$cmd`;
open (PIPE, "cmd |");
$output = `$cmd 2>&1`;
open (PIPE, "cmd 2>&1 |");
$output = `$cmd 2>some_file`;
open (PIPE, "cmd 2>some_file |");
open(STDERR, ">&STDOUT");
$alloutput = `cmd args`; # stderr still escapes
Here's a way to read from both of them and know which descriptor
you got each line from. The trick is to pipe only STDOUT through
sed, which then marks each of its lines, and then sends that
back into a merged STDOUT/STDERR stream, from which your Perl program
then reads a line at a time:
open (CMD,
"(cmd args | sed 's/^/STDOUT:/') 2>&1 |");
while (
5.16) Why doesn't open return an error when a pipe open fails?
open(TOPIPE, "|bogus_command") || die ...
open(FROMPIPE, "bogus_command|") || die ...
$kid = open (PIPE, "bogus_command |"); # XXX: check defined($kid)
(kill 0, $kid) || die "bogus_command failed";
$kid = open (PIPE, "bogus_command </dev/null |");
sleep 1;
(kill 0, $kid) || die "bogus_command failed";
The status returned by the last pipe close, backtick
(``) command, or system() operator. Note that this
is the status word returned by the wait() system
call, so the exit value of the subprocess is
actually ($? >> 8). Thus on many systems, $? & 255
gives which signal, if any, the process died from,
and whether there was a core dump. (Mnemonic:
similar to sh and ksh.)
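Picking the status word apart looks roughly like this. The sketch runs a child perl that exits with status 3, using $^X (the path of the running interpreter) so that it does not depend on any particular external command. The bit layout shown is the usual Unix one and may differ on other systems.

```perl
use strict;

# Run a child perl that exits with status 3; $^X is the running interpreter.
system($^X, "-e", "exit 3");

my $exit_value = $? >> 8;      # high byte: the child's exit status
my $signal     = $? & 127;     # low seven bits: signal number, if any
my $dumped     = $? & 128;     # next bit: set when core was dumped

print "exit=$exit_value signal=$signal core=", ($dumped ? "yes" : "no"), "\n";
```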
5.17) Why can't my perl program read from STDIN after I gave it ^D (EOF) ?
$where = tell(LOG);
seek(LOG, $where, 0);
5.18) How can I translate tildes in a filename?
open(FILE, "~/dir1/file1");
open(FILE, "~tchrist/dir1/file1");
$filename = <~tchrist/dir1/file1>;
$filename =~ s#^~(\w+)(/.*)?$#(getpwnam($1))[7].$2#e;
5.19) How can I convert my shell script to Perl?
* monitor comp.lang.perl.misc and collect statistics on which
questions were asked with which frequency and to respond to them
with stock answers. Tom's programming has since outgrown this
paltry task, and it was assigned to an undergraduate student from
the University of Florida. After all, we all know that students
from UF aren't able to do much more than documentation anyway.
Against all odds, that undergraduate student has become a
professional system administrator, perl programmer, and now
author of the second edition of "Programming Perl".
* convert shell programs to perl programs
5.20) Can I use Perl to run a telnet or ftp session?
5.21) Why do I sometimes get an "Arguments too long" error when I use <*>?
while (<*>) {
chmod 0644, $_;
}
open(FOO, "echo * | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
while (
opendir(DIR,'.');
chmod 0644, grep(/\.c$/, readdir(DIR));
closedir(DIR);
This example is taken directly from ``Programming Perl'' page 78.
5.22) How do I do a "tail -f" in Perl?
seek(GWFILE, 0, 1);
for (;;) {
for ($curpos = tell(GWFILE); $_ =
5.23) Is there a way to hide perl's command line from programs such as "ps"?
#!/usr/local/bin/perl
$0 = "Hidden from prying eyes";
open(PS, "ps |") || die "Can't PS: $!";
while (
5.25) How can I pass a filehandle to a function, or make a list of
filehandles?
$fh = "/some/path";
open($fh, "< $fh");
print $fh "string\n";
$fharray[$i] = "/some/path";
open($fharray[$i], "< $fharray[$i]");
print $fharray[$i] "stuff\n";
$tmp_fh = $fharray[$i];
print $tmp_fh "stuff\n";
print { $fharray[$i] } "stuff\n";
printit(Some_Handle);
printit(main::Some_Handle);
sub printit {
my $fh = shift;
my $package = (caller)[0];
$fh =~ s/^[^':]+$/${package}::$&/;
while (<$fh>) {
print;
}
}
A better solution is to pass a typeglob instead:
printit(*Some_Handle);
sub printit {
local *FH = shift;
while (<FH>) {
print;
}
}
printit(*Some_Handle);
sub printit {
my $fh = shift;
while (<$fh>) {
print;
}
}
printit(\*Some_Handle);
sub printit {
my $fh = shift;
while (<$fh>) {
print;
}
}
5.26) How can I open a file with a leading ">" or trailing blanks?
sub safe_filename {
local($_) = shift;
m#^/# ? "$_\0" : "./$_\0";
}
$fn = &safe_filename("<<<something really wicked ");
open(FH, "> $fn") || die "couldn't open $fn: $!";
5.27) How can I tell if a variable is tainted?
sub tainted {
! eval { join('',@_), kill 0; 1; };
}