Dr. Dobb's is part of the Informa Tech Division of Informa PLC

perlcc & Compiling Perl Script


Nov02: perlcc & Compiling Perl Script

Robert is the former maintainer of the Linux Frequently Asked Questions with Answers Usenet FAQ. He can be reached at rkiesling@mainmatter.com.


Perl has always had a compiler, as the perlcompile manual page provided with Version 5.8.0 points out, but its normal use is only to generate the internal bytecode that is run by the interpreter.
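One way to see what this built-in compiler produces is the B::Concise back end, which prints the op tree for a piece of code without running it. The following is a minimal sketch, assuming a Perl installation that includes the B::Concise module; the function name show_optree is made up for illustration.

```shell
#!/bin/sh
# show_optree CODE: print the op tree perl builds for a snippet of code.
# (show_optree is a hypothetical helper name, not part of Perl.)
show_optree() {
    # -MO=Concise loads the B::Concise back end; the code is compiled
    # but not executed. The "-e syntax OK" notice goes to stderr.
    perl -MO=Concise -e "$1" 2>/dev/null
}

# Usage:
#   show_optree 'print "Hello, world!\n";'
```

Each line of the output describes one internal operation, so even a one-line script expands into several entries.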

All recent versions of Perl, however, have had the ability to generate C source code or run a compiler to produce a standalone executable program. The library modules B.pm, C.pm, and CC.pm provide an interface to Perl's internal bytecode compiler. The perlcc script processes command-line options and runs all of the components of the compilation process. A listing of command-line options that perlcc accepts is given in Figure 1.

In earlier versions of Perl, compiler support for extension modules was inconsistent, reducing the usefulness of the compiler. There are still some language features that don't compile correctly, like unsigned math operations. But scripts that use complex extensions like Perl/Tk are much easier to turn into standalone programs using Perl 5.8.0.

There has been much reworking of the Perl internals—improvements in the IO abstraction layer, new interpreter threads, and code for a new version of MacPerl. This redesign effort has also made Perl scripts easier to compile.

The compiler is still experimental, though, and some of its modules, especially those that optimize code, are not yet fully integrated with the rest of the Perl libraries. It's likely that you'll need to experiment to find the best way to turn scripts into standalone programs.

Why Not to Compile Perl

Perl remains a primarily interpreted language. Some of the language's greatest advantages, such as its ability to bind variables very late in the interpreter process and its lack of strong data typing, make the language difficult to compile and add a lot of extra code to executables. Standalone programs don't run noticeably faster, and Perl gurus say that the executables compiled from scripts aren't any more secure than the interpreted versions.

However, standalone programs compiled from Perl do not need the interpreter or libraries to run. Also, compiling Perl scripts with statically linked extensions built into the interpreter might be a better option for producing specific applications. This could be the best reason to provide compiled programs, especially those that rely on libraries that users must install themselves.

The compiler prefers scripts to have the extension .p, which can be confusing if you also write Pascal programs. In fact, the compiler's design and operation are reminiscent of a Pascal p-code compiler, without the necessity of an extra run-time module.

Executables compiled from intermediate C code are also huge when statically linked against the Perl libraries. A minimal "Hello, world!" script like the hello.p script shown in Listing 1 results in a binary that is over 1 MB in size. Scripts that use the Perl/Tk GUI result in binaries of at least 5 MB. When generating C code, the compiler does not translate a Perl script into corresponding C source code. Instead, it generates a C representation of the internal bytecode, including the code for all the library modules that the script uses.

However, generating a C language source file and then using the compiler's optimization can reduce executable size by about 10 percent. Producing stripped binaries, which do not have symbol table information, can reduce the size of an executable by at least another 200 KB.
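The effect of stripping is easy to measure on your own binaries. The sketch below assumes the standard strip and wc utilities are installed; file_size and strip_report are hypothetical helper names. Because strip modifies the file in place, try it on a copy first.

```shell
#!/bin/sh
# file_size FILE: print the size of FILE in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# strip_report BINARY: strip the symbol table from BINARY in place and
# report the size before and after.
strip_report() {
    binary=$1
    before=$(file_size "$binary")
    strip "$binary" || return 1
    after=$(file_size "$binary")
    echo "$binary: $before -> $after bytes (saved $((before - after)))"
}

# Usage:
#   cp textedit textedit.stripped
#   strip_report textedit.stripped
```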

Building Perl with a shared version of its libraries can also reduce the size of executables, but then you need to include the shared libraries with the program for systems that don't have them already. You must also take care that the version of the library on the target system is the same as the version on the system the script was compiled on.
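To compare the build system with a target system, you can ask each Perl installation how it was configured. This sketch reads three entries from Perl's standard Config module; perl_lib_info is a hypothetical helper name, and matching output on both machines is only a first check, not a guarantee of compatibility.

```shell
#!/bin/sh
# perl_lib_info: print the settings that matter when a compiled program
# depends on a shared libperl. (perl_lib_info is a hypothetical name.)
perl_lib_info() {
    perl -MConfig -e '
        printf "perl version : %s\n", $Config{version};
        printf "shared lib   : %s\n", $Config{useshrplib};
        printf "library file : %s\n", $Config{libperl};
    '
}

# Usage: run on both machines and compare the output.
#   perl_lib_info
```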

Table 1 shows the file sizes for three Perl scripts: the hello.p script mentioned earlier, touch.p, which mimics the UNIX touch utility (Listing 2), and textedit.p, a Perl/Tk script for a simple text editor (Listing 3).

Table 1 also lists the sizes of statically linked executables compiled with perlcc, the size of the generated C source files, and the sizes of executables generated by GCC with and without optimization.

The smallest executables are those generated using "gcc -O". Higher levels of optimization actually increase executable size slightly and result in much longer compilation times.

All of the scripts were compiled on a SPARCstation 20 with Solaris 8 and GCC 3.0.3. Other hardware platforms and operating systems produce similar results.

Compiling with perlcc and GCC

The perlcc command-line arguments are similar to those of a C compiler. The following command, for example, generates the executable textedit from the script textedit.p.

# perlcc textedit.p -o textedit

To generate the C source file textedit.p.c from the textedit.p script, use the -S command-line argument.

# perlcc -S textedit.p
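Because perlcc is not present in every Perl installation, a build script may want to check for it before compiling. The wrapper below is a sketch only: compile_script is a hypothetical name, and the perlcc invocation inside it is the same one shown above.

```shell
#!/bin/sh
# compile_script SCRIPT OUTPUT: build a standalone executable with perlcc.
# (compile_script is a hypothetical wrapper, not part of Perl.)
compile_script() {
    script=$1
    output=$2
    if ! command -v perlcc >/dev/null 2>&1; then
        echo "perlcc not found on PATH" >&2
        return 127
    fi
    if [ ! -f "$script" ]; then
        echo "no such script: $script" >&2
        return 1
    fi
    perlcc "$script" -o "$output"
}

# Usage:
#   compile_script textedit.p textedit
```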

The libperl.a static library and the Perl include files are located in directories where Perl, not GCC, can find them. In addition to these directories, you must also specify which system libraries to link with the program. Example 1 shows the options that you must provide to GCC to compile and link the textedit.p script.
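One way to discover these options rather than typing them by hand is the ExtUtils::Embed module, which reports the compiler and linker flags the installed Perl was built with. The sketch below assumes ExtUtils::Embed and GCC are installed and that the C file came from perlcc -S; perl_build_flags and build_from_c are hypothetical helper names, and the flags a particular script needs may differ from what this produces.

```shell
#!/bin/sh
# perl_build_flags: print the include paths, libraries, and other flags
# needed to compile and link against the installed perl.
perl_build_flags() {
    perl -MExtUtils::Embed -e ccopts -e ldopts
}

# build_from_c C_FILE OUTPUT: compile a perlcc-generated C file with GCC.
build_from_c() {
    c_file=$1
    output=$2
    # The unquoted substitution is intentional: the flags must be split
    # into separate arguments.
    gcc -O -o "$output" "$c_file" $(perl_build_flags)
}

# Usage:
#   perlcc -S textedit.p
#   build_from_c textedit.p.c textedit
```

Running perl_build_flags on its own is a quick way to see where libperl and the Perl include files live on a given system.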

When Static Linking is Better

All compiled scripts also require linking to the DynaLoader.a library, whether or not the script loads additional Perl libraries, because the dynamic module loader is so completely integrated into Perl that the interpreter cannot function without it.

If the size of compiled programs is critical or you want to use Perl in embedded environments (although the language designers recommend against it), you should consider building the Perl interpreter with statically linked versions of the libraries you want to include.

Example 2 shows how to build Perl with the Tk library modules statically linked. You must first unpack the Perl/Tk source code in the Perl source tree. When you run the Configure script, answer "n" to the dynamic loading option and specify which extensions you want to include.

In practice, building a statically linked interpreter requires that you name every extension you'll need when you configure the Perl interpreter.
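After building, you can confirm which extensions ended up inside the interpreter. This sketch prints the relevant entries from Perl's standard Config module; perl_ext_info is a hypothetical helper name. Run it with the newly built perl first on your PATH.

```shell
#!/bin/sh
# perl_ext_info: show whether dynamic loading is enabled and which
# extensions were linked statically or built as loadable modules.
perl_ext_info() {
    perl -MConfig -e '
        printf "dynamic loading : %s\n", $Config{usedl} || "undef";
        printf "static exts     : %s\n", $Config{static_ext};
        printf "dynamic exts    : %s\n", $Config{dynamic_ext};
    '
}

# Usage:
#   perl_ext_info
```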

Configuring Perl in this way allows you to build application-specific executables, but you'll need to experiment to determine the best procedure for building the executable programs and which extensions to include in the interpreter.

Conclusion

Although Perl is primarily an interpreted language, it provides the tools and configuration options to produce compiled standalone programs that do not need the Perl interpreter or libraries to be present.

TPJ

Listing 1

#!/usr/local/bin/perl

print "Hello, world!\n";


Listing 2

#!/usr/local/bin/perl

use IO::File;

my $filename = $ARGV[0];

if (! defined ($filename) || ! length ($filename)) {
    print "Usage: touch <filename>\n";
    exit 1;
}

if (-f $filename) {  # Update access and modtimes on $filename if
                     # it already exists.
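    my $timenow = time;
    utime $timenow, $timenow, $filename
        or die "touch: cannot update $filename: $!\n";
} else { # create the file
    my $fh = IO::File->new ($filename, O_CREAT)
        or die "touch: cannot create $filename: $!\n";
    undef $fh;  # closes the file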
}


Listing 3

#!/usr/local/bin/perl

use Tk;

my $mw = new MainWindow;

my $textwidget = $mw -> Text
    -> pack (-expand => 1, -fill => 'both');

MainLoop;

