PDL: The Perl Data Language

The Perl Data Language is a Perl extension for numerical manipulation that provides the convenience of Perl with the speed of compiled C.


September 01, 1997
URL:http://www.drdobbs.com/pdl-the-perl-data-language/184410442

DDJ, Software Careers Fall 97: PDL: The Perl Data Language


Karl is an astronomer at the Anglo-Australian Observatory and can be reached at [email protected]. Frossie is a software engineer and astronomer at the UK Infrared Telescope in Hawaii and can be reached at [email protected]. Reprinted courtesy of The Perl Journal (http://www.tpj.com/).
Tables and Examples
Insert Extolling the virtues of Perl and its many uses to the many Perl users out there is preaching to the choir. Nevertheless, there is one fundamental area in computing where Perl has been conspicuously absent-number crunching.

Tarred by the same brush as other scripting languages, Perl (which in fact is semicompiled), is perceived as too slow and memory hungry for heavy numerical computations because it doesn't lend itself to storing and retrieving zillions of numbers quickly. This has been a source of great frustration to us, enthusiastic Perl users who resent being forced to use other environments for our astronomical data analysis. Perl's potential for manipulating numerical data sets speedily and elegantly via an extension was obvious. Hence PDL, the Perl Data Language, was conceived. PDL is a Perl extension for numerical manipulation that provides the convenience of Perl with the speed of compiled C.

PDL introduces a new data structure: the "pdl numerical array," often referred to as a "piddle." (This unfortunate nickname has led to some rather dubious puns in the source code.) A piddle is a special object that can contain a large block of efficiently stored numbers for manipulation with normal mathematical expressions. For example, if $a is a piddle containing a 346 chunk of data, then the Perl statement $b = sin($a) will set $b equal to $a; but with every value replaced, with its sine. Because each operation is implemented via compiled C code, it's nearly as fast as a hand-crafted C program.

PDL can be used "normally" from a script, but it also has a shell interface for interactive data analysis and prototyping. In this article, we demonstrate PDL using the PDL shell, called "perldl." We assume you have PDL-1.11 (see Table 1 for a list of PDL 1.11 functions), PGPLOT-2.0 (which itself requires the pgplot graphics library), and Perl 5.003. If you also have the right versions of the Term::ReadLine and Term::ReadKey modules, the perldl shell allows interactive command-line editing.

The perldl shell, invoked at the command line by perldl, behaves like Perl's debugger. For instance, we can assign values to variables and print them with p, as in Example 1. PDL is really about matrices. In Example 2, we create a 23 matrix and multiply it by itself.

To have true fun with PDL, we need some data. Luckily, the PDL distribution comes with a picture of the sky stored in FITS, the standard format for astronomical data. PDL also supplies rfits(), a function that reads FITS files and returns a piddle containing the data. Example 3(a) shows how we read in our image and plot it.

Now we have data (and we didn't have to spend three nights freezing on top of a mountain to get it). What do we know about it? We know that it is 16-bit with 65,536 elements. But is it 65536*1 or 256*256 or even 16*16*16*16? We can determine the dimensions of our data by using the dims() function; see Example 3(b). dims() is a PDL function that returns the dimensions of a piddle. Not surprisingly (after all, it's a picture of the sky), we have a two-dimensional image: 256*256.

To determine the mean and standard deviation of a piddle, we use the stats() function; Example 3(c). To print a portion of the piddle, we use the sec() function. Since we don't want to display all 65,536 numbers in this particular example, we use sec() to return the bottom-left corner instead. Example 3(d) displays the rectangle between elements (0,252) and (3,255). Additional dimensions are handled seamlessly by passing the extra coordinate values as arguments.

Perhaps you're getting restless at this point. Let's abandon the function calls and jump to the cool stuff. The PDL function imag() uses the PGPLOT library to display images. In Example 4(a), imag() displays the piddle $a; see Figure 1(a). It's full of stars! This is, in fact, an image of Messier 51, a spiral galaxy similar to our own but at a distance of 200,000,000,000,000,000,000 miles away, give or take a few billion. That's a bit too far for us to invade, but we can at least humiliate it, using the command in Example 4(b). This results in Figure 1(b).

Since we're exploring cosmology, we can create something out of nothing; see Example 5(a). As you can see, PDL functions can be chained together just like Perl functions. Two PDL functions are cascaded here: rvals() and zeroes(). First, zeroes() creates a piddle full of zeros-in this case, a 2020 matrix with every element zero. (There's also a ones() function that creates a piddle full of ones.) Then, rvals() fills that piddle with values representing the distance of each element from the center. We now use the exp() function to generate a two-dimensional Gaussian, as shown in Example 5(b) and Figure 2(a).

Figure 1:
Figure 1: (a) Image of Messier 51, a spiral galaxy similar to ours; (b) humiliated galaxy.

The less mathematically inclined will say it looks like a blob. We can inflict a bit more punishment on Messier 51 by convolving it with our newly created Gaussian filter; Example 5(c) and Figure 2(b). This enables us to simulate what we would see if we were observing through very bad viewing conditions, such as a (possibly drunken) haze. This operation takes 20-25 seconds on a Pentium 120 or Sparc 20 with PDL. Doing this with a 2D array in normal (non-PDL) Perl takes 13 minutes and uses 11 times as much memory.

Example 5(d) creates an unsharp masked image, as in Figure 2(c). This is often used in astronomy to emphasize sharp features against a bright background, such as stars in a galaxy, the giant luminous gas clouds we call HII regions, or foreground objects such as UFO's (err, weather balloons).

Where are We Now?

PDL was prototyped early in 1996 and is currently in beta release. (One author's habit of prototyping Perl modules while on astronomical observing trips leaves the other author wondering whether this is a previously unknown symptom of altitude sickness.) It is functional but not yet fully featured. It is under vigorous development, thanks to the work and enthusiasm of the perldl mailing list participants. The current stable version as we write this article is 1.11.

Figure 2:
Figure 2: (a) Two-dimensional Gaussian filter; (b) Messier 51 convolved with Gaussian filter; (c) unsharpened mask.
Several auxiliary modules for PDL are nearing completion: 3D graphics manipulation using OpenGL (Tuomas J. Lukka), an interface to the Meschach matrix library (Etienne Grossmann), an interface to the XMGR plotting tool (Tim Jenness), access to the TIFF, GIF, PostScript, and other formats supported by PBM+ (Christian Soeller), and an interface to the SLATEC numerical library (Tuomas J. Lukka). Other formats are under development for PDL 2.0, which will also feature virtual slicing, easier C extensions, and fast implicit looping.

Anyone wishing to reflect or opine on the rather technical issues surrounding PDL development is welcome to join the perldl mailing list at [email protected] or the porters list at [email protected]. Send your questions to the former and your complaints to the latter. Finally, more information (including Christian Soeller's PDL FAQ) is available at http://www.aao.gov.au/local/www/kgb/perldl/.

Acknowledgment

We thank Pat Seitzer and the IRAF group of the National Optical Astronomy Observations for permission to use the image of M51.

DDJ

DDJ, Software Careers Fall 97: PDL: Table

PDL: The Perl Data Language:
Tables and Examples

Software Careers Fall 1997 Dr. Dobb's Journal

by Karl Glazebrook and Frossie Economou


Jump to Table 1

Examples:

  perldl> $b = 2
  perldl> p $b/2
  1
  perldl> p $b/3
  0.666666666666667
  perldl>
Example 1: Assigning values to variables, then printing them.
  perldl> $a = pdl [5,3,4, [6,4,3];
  perldl> print $a;
  [
    [5,3,4]
    [6,4,3]
  ]
  perldl> $b = $a * $a;
  perldl> print $b;
  [
    [25,9,16]
    [36,16,9]
  ]
Example 2: Creating a 2X3 Matrix and multiplying it by itself.

(a)
perldl> $a = rfits "PDL1.11/m51.fits";
IO loaded
BITPIX = 16 size =65536 pixels
Reading 131072 bytes
BSCALE = 1.0000000000E0 && BZERO
       = 0.0000000000E0
perldl>

(b)
perldl> p dims $a
256 256
perldl>

(c)
perldl> p stats $a
104.193572998047 67.4254211880158
perldl>

(d)
perldl> p sec ($a,0,3,252,255)
[
  [50 51 54 53]
  [50 50 53 54]
  [51 52 53 52]
  [54 53 54 51]
]
perldl>

Example 3: (a)Reading in and plotting an image; (b) determining the dimensions of data by using the dims() function; (c) determining the mean and standard deviation of a piddle using the stats() function; (d) printing a portion of the piddle, using sec() function.

(a)
perldl> imag $a
Loaded PGPLOT
Displaying 256 x 256 image from
  24 to 500 ...

(b)
perldl> imag sin(0.05*$a)
Displaying image 256 x 256 from
  -0.999990224838257 to
  0.999992072582245 ...
Example 4: (a) imag() displayign the piddle $a; (b) results of humiliation.

(a)
perldl> $r = rvals zeroes 20,20

(b)
perldl> $g = exp(-($r/6)**2)/108.08
perldl> imag $g

(c)
perldl> $b = convolve $a.$g
perldl> imag $b

(d)
perldl> imag $a-$b
Example 5: . (a) Chaining PDL functions; (b) using the exp() function to generate a two-dimensional Gaussian; (c) convolving Messier 51 using the newly created Gaussian filter; (d) creating an unsharp masked image.


Tables

(a)                                

+ - * / > < >= Array operators/functions <= << >> & (same as Perl and C | ^ == != += but they act element by -= *= /= %= element) **= <<= >>= &= |= ^= <=> ** % ! ++ - "" atan2* sqrt* sin* cos* log* exp* abs* X Matrix multiplication ~ Matrix transpose byte short Type conversions ushort long float double convert pdl Create/copy a piddle topdl Coerce to piddle if scalar howbig Size of piddle datatype in bytes nelem Number of elements dims Return list of dimensions inplace Perform operation in place list Convert piddle to list, e.g. for (list $x) { } listindices Return list of index values (1D) log10* Take log base 10 min max sum Min/max/sum of piddle zeroes ones Create zero/one-filled piddle sequence Create sequence-filled piddle reshape Reshape the dimensions of a piddle sec Subsection of a piddle int* set Insertion/Setting at Return pixel value at (x, y, z, ...) axisvals* Fill piddle with axis values xvals* yvals* zvals* rvals Fill piddle with distance from its center callext Call external C code in dynamically loadable object convolve Convolve image with kernel (real space) hist Histogram of data stats Return mean and standard deviation transpose Matrix transpose qsort* Quick sort piddle median Median of piddle oddmedian Lower odd median of piddle (b)
fibonacci* Compute Fibonacci series (simple 1D example) cc8compt* Connected 8-component labelling (2D example) (c)
rfits Read a FITS format file wfits Write a FITS format file rcols Read columns in a text file into piddles rgrep Read regexp matches into piddles rdsa Read a DSA format file (Starlink systems only) (d)
imag Display an image ctab Load an image color table line Plot vector as connected points points Plot vector as points errb Plot error bars cont Display image as contour map bin Plot vector as histogram (bin hist $data, for instance) hi2d Plot image as 2D histogram poly Draw a polygon vect Display two images as a vector field hold Hold current plot window range; for example, for overlays release Autoscale new plot window for each command rel Synonym for "release" env Define a plot window, put on "hold" dev Explicitly set a new PGPLOT graphics device (e)
iis Display image iiscur Return a cursor position iiscirc Draw circles on image display saoimage Start SAOimage ximtool Start Ximtool (f)
flush Update C cache from piddle Perl data structure copy Copy a piddle new Create a piddle
Table 1: PDL 1.11 functions. Starred items (log10*, for instance) act as mutators: When you say log10(inplace($a)), every element of $a is replaced with its logarithm base 10. (a) Defined in PDL::Core; (b) defined in PDL::Examples; (c) defined in PDL::Io; (d) defined in PDL::Graphics::PG (PGPLOT graphics); (e) defined in PDL::Graphics::IIS (talk to IIS protocol image display widget); (f) PDL methods.


DDJ

Return to Article

Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.