Perl in High Performance Computing Environments

The Perl Journal March 2003

By Moshe Bar

Moshe is a systems administrator and operating-system researcher and has an M.Sc. and a Ph.D. in computer science. He can be contacted at

Linux has brought High Performance Computing (HPC) to the masses. Until a few years ago, only government agencies and big corporations could afford to crunch numbers with Crays and other supercomputers. Today, small companies or even individuals can build a cheap cluster of commodity Linux boxes and run compute-intensive applications.

Two distinct clustering paradigms exist for HPC: Beowulf-style and Single-System-Image-style (SSI). SSI clusters such as openMosix or openSSI connect n Linux boxes so that they look like one giant computer with n CPUs. In the November TPJ, I showed how to use the simple but powerful Perl parallel fork manager module to create parallel environments. The parallel fork manager works best in SSI environments because each newly forked child is automatically sent to another cluster node for execution.
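As a rough illustration of the fork-per-task pattern, here is a minimal sketch that uses only Perl's built-in fork and pipe rather than the fork manager module itself (on CPAN that module is Parallel::ForkManager). The helper name fork_map is hypothetical, and the sketch assumes a Unix-like system. On an SSI cluster, each child created this way is an ordinary process that the cluster software is free to migrate to another node.

```perl
use strict;
use warnings;
use POSIX ();

# Run $work->($i) for $i in 0 .. $n-1, one forked child per value,
# and return the results in index order.
# fork_map is a hypothetical helper name, not part of any module.
sub fork_map {
    my ($n, $work) = @_;
    my @children;
    for my $i (0 .. $n - 1) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: compute, write one line to the pipe, and leave.
            close $reader;
            print {$writer} $work->($i), "\n";
            close $writer;       # flushes the result to the parent
            POSIX::_exit(0);     # skip END blocks and inherited buffers
        }
        # Parent: keep the read end and go on to fork the next child.
        close $writer;
        push @children, [$pid, $reader];
    }
    my @results;
    for my $child (@children) {
        my ($pid, $reader) = @$child;
        my $line = <$reader>;
        close $reader;
        waitpid($pid, 0);        # reap the child
        die "child $pid returned no result" unless defined $line;
        chomp $line;
        push @results, $line;
    }
    return @results;
}

my @squares = fork_map(4, sub { my ($i) = @_; return $i * $i });
print "@squares\n";    # prints "0 1 4 9"
```

All children are forked before any result is read, so they run concurrently; the parent only blocks while collecting the answers.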

For more classic computing problems, such as fast Fourier transforms (FFTs), it is often more efficient to use the Beowulf approach. Beowulf technologies such as the Message Passing Interface (MPI) or Parallel Virtual Machine (PVM) are based on external libraries that implement a virtual parallel computer. Programmers need to modify their applications, adding parallelization calls that split computational loops across multiple nodes.

Traditionally, programming languages such as Fortran and C have been used for number-crunching applications, in part due to the multitude of mathematical and engineering libraries available to programmers. In fact, HPC applications are quite often written ad hoc; that is, they are written by scientists for a particular, temporary problem and then discarded to be replaced by new ad hoc programs, taking the output of the previous program as input for the next step.

To these developers, speed of development is of prime importance. Fortran and C, however, do not lend themselves easily to fast development or ad hoc programming. Perl, on the other hand, is ideal for prototyping and ad hoc programming. Naturally, people have been devising ways to use Perl for number crunching.

Perl and/or scripting language opponents will be quick to point out that interpreted languages have, by definition, a speed disadvantage compared to compiled languages. In my own personal experience, this is only true where pure mathematical performance is concerned. However, as soon as you add some I/O and other outer-world interaction, Perl quickly gains ground in comparison. Additionally, given the ample computing power available on today's CPUs, a few percentage points advantage in cycle efficiency is not going to make such a big difference in execution time.

There are a number of parallel computing modules out there for Perl. The most widely used are the Parallel::Pvm and Parallel::MPI modules from CPAN.

Using the Parallel Modules

Before hacking away on a number-crunching application, you should first install PVM, MPI, or both, along with the corresponding Perl modules from CPAN. For some help in setting up the PVM environment, see 05/parallel/parallel-pvm/_d11729. For help with MPI setup, see

Once your cluster is installed, say with PVM, you start the virtual parallel computer with simple statements like this:

use Parallel::Pvm;
Parallel::Pvm::addhosts("foo", "bar");

Once your cluster is up and running, you register your program as a PVM task, either implicitly by calling any PVM function or explicitly by doing something like this:

$mytid = Parallel::Pvm::mytid;

The next step would be to let an executable run on, say, 16 virtual hosts. In PVM lingo, you call that "spawning" executables. You do that by executing a line like this:

($ntasks, @tids) = Parallel::Pvm::spawn("executable", 16);

In this example, the scalar $ntasks will hold the number of children spawned and the array @tids will hold the task IDs of the children. The spawn function accepts several further optional arguments (for example, to pass arguments to the spawned executable). The module's documentation goes to great lengths to explain every function's use.

Just like in MPI, you may want to send messages to instances of an executable running on a remote node. In Parallel::Pvm, you do that with a code sequence like this:

Parallel::Pvm::initsend;
Parallel::Pvm::pack(2.345, "hello dude");   # example double and string
Parallel::Pvm::pack(1234);                  # example integer
Parallel::Pvm::send($tid, 999);

The first statement makes sure we have a send buffer, which we then fill with a double, a string, and an integer value, respectively. Once the buffer is filled, we send it to a particular task, $tid, and tag the message with 999.

On the receiving end of this message, you unpack the content in the order in which it was packed by executing the following statements:

$bufid = Parallel::Pvm::recv;
($double_t, $str_t) = Parallel::Pvm::unpack;
$int_t = Parallel::Pvm::unpack;

There are dozens of other options and functions in Parallel::Pvm. You can build in fault tolerance by respawning lost children, for instance. You can even provide parallel I/O or send nonblocking messages.

If you need to write a number-crunching application fast, then Parallel::Pvm is certainly worth considering. It's easy to use and powerful.


In MPI, as in PVM, the central idea is to have several instances of the same executable running on various nodes and to use message passing for coordination among the instances. Just as with PVM, installing the MPI libraries is as easy as typing "rpm --install" under Linux (or using similar means under other operating systems). Perl programs that use the Parallel::MPI::Simple module should be launched with the standard MPI command:

mpirun -np 2 perl

MPI is very simple. These six functions allow you to write many programs:

MPI_Init
MPI_Finalize
MPI_Comm_size
MPI_Comm_rank
MPI_Send
MPI_Recv
In fact, in most cases, you don't even need communication between the nodes while they compute. Take, for instance, a simple program to calculate π on any number of available nodes. For that, Listing 1 uses discrete integration (the midpoint rule) of the curve 4/(1+x*x) for 0<x<1; the area under that curve is exactly π.

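Using MPI (see Listing 1), you first initialize MPI, then broadcast the number of divisions to every instance. Within the integration loop, each node then handles its own fixed share of the divisions: the instance with rank $myid starts at division $myid+1 and takes every $numprocs-th division after that. Finally, MPI_Reduce sums the partial results at one of the instances.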
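To see why this round-robin split gives the same answer regardless of the number of nodes, the following sketch reproduces the arithmetic of Listing 1 in a single plain Perl process, with no MPI involved. The function names partial_pi and simulated_pi are hypothetical; simulated_pi merely stands in for what MPI_Reduce with MPI_SUM would do across real instances.

```perl
use strict;
use warnings;

# Integrand from Listing 1: the area under 4/(1+x*x) on [0,1] is pi.
sub f {
    my ($x) = @_;
    return 4.0 / (1.0 + $x * $x);
}

# Partial result that the instance with rank $myid would compute when
# $numprocs instances share $n divisions round-robin.
sub partial_pi {
    my ($myid, $numprocs, $n) = @_;
    my $h   = 1.0 / $n;
    my $sum = 0.0;
    for (my $i = $myid + 1; $i <= $n; $i += $numprocs) {
        $sum += f($h * ($i - 0.5));    # midpoint of division $i
    }
    return $h * $sum;
}

# Stand-in for MPI_Reduce with MPI_SUM: add up every rank's share.
sub simulated_pi {
    my ($numprocs, $n) = @_;
    my $pi = 0.0;
    $pi += partial_pi($_, $numprocs, $n) for 0 .. $numprocs - 1;
    return $pi;
}

printf "%d ranks: pi is approximately %.10f\n", $_, simulated_pi($_, 100)
    for (1, 2, 4);
```

Every division is visited by exactly one rank, so the totals for 1, 2, and 4 simulated ranks agree up to floating-point rounding; only the work per rank shrinks.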


For quick and not-so-dirty development of number-crunching applications, Perl (with the appropriate modules) can be an intelligent choice. Because relatively few people use this technique as of yet, there is still a lot of room for research in this field, particularly in the benchmarking area. Please get back to me with samples of your own parallel Perl programs, as I plan to return to this subject in a future article.


Listing 1


# Location of the locally built Parallel::MPI; adjust to your installation.
use lib qw(/usr/local/perlmod/Parallel-MPI-0.03/contrib/cpi/../../blib/arch);

use Parallel::MPI qw(:all);

# Integrand: the area under 4/(1+x*x) for 0<x<1 is pi.
sub f {
    my ($a) = @_;
    return (4.0 / (1.0 + $a * $a));
}

my $PI25DT = 3.141592653589793238462643;

MPI_Init();
$numprocs = MPI_Comm_size(MPI_COMM_WORLD);
$myid =     MPI_Comm_rank(MPI_COMM_WORLD);

#printf(STDERR "Process %d\n", $myid);

$n = 0;
while (1) {
    # Rank 0 picks the number of divisions: 100 on the first pass,
    # 0 on the second pass to end the loop.
    if ($myid == 0) {
        if ($n==0) { $n=100; } else { $n=0; }
        $startwtime = MPI_Wtime();
    }

    MPI_Bcast(\$n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    last if ($n == 0);

    $h   = 1.0 / $n;
    $sum = 0.0;

    # Each rank takes every $numprocs-th division.
    for ($i = $myid + 1; $i <= $n; $i += $numprocs) {
        $x = $h * ($i - 0.5);
        $sum += f($x);
    }
    $mypi = $h * $sum;

    # Sum the partial results on rank 0.
    MPI_Reduce(\$mypi, \$pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if ($myid == 0) {
        printf("pi is approximately %.16f, Error is %.16f\n",
               $pi, abs($pi - $PI25DT));
        $endwtime = MPI_Wtime();
        printf("wall clock time = %f\n", $endwtime - $startwtime);
    }
}

MPI_Finalize();
