ObjectivePerl: Objective-C-Style Syntax And Runtime for Perl


The Perl Journal September, 2004

By Kyle Dawkins

Kyle works for the New York-based consultant firm Central Park Software and can be reached at [email protected].


I first encountered the Objective-C language on some of the earliest NeXT workstations in the early '90s and was impressed by its elegance and power. It was very easy to learn and its approach to object-oriented programming was refreshing. Instead of having an all-powerful, strongly typed compiler (like C++, Ada, or Modula-3), it gave the programmer the choice of strong or weak typing at compile time. Unlike those other languages, it also had an incredibly rich run-time environment that gave the programmer a great deal of flexibility and control over the objects in the system.

Then, in the mid-'90s, Java hit the software world with a crash. Java came solidly from the C++ world in terms of its design: A fussy compiler that forced strong-typing and an even fussier and underpowered runtime. Tunnel vision set in and much of the software world somehow convinced itself that Java was the "right" way to do OO and that it was "advanced" and "pure." That's one opinion—one with which Perl programmers often disagree.

After being stuck in the world of Java for years, I ended up in the world of Perl due to some great luck. In Perl, I immediately spotted the runtime-on-steroids that I missed from Objective-C. The extraordinary power and flexibility of the Perl compiler/run-time was a breath of fresh air, and I took to Perl like a duck to water.

Unfortunately, as time went by, some of the luster faded. I was disappointed with the Perl concept of objects; or rather, that Perl had a few different concepts of objects and they were all a bit "bent." An object is a blessed hash? Or a blessed array! Or a tied variable! It was all a bit much. Even though I knew that it was simply a testament to the amazing power of Perl, it didn't stop me from missing some of the things in Objective-C, and even Java, that were lacking in Perl.

Things That Perl Didn't Have

  • Real instance variables.
  • Visibility levels of instance variables.
  • Named arguments in method signatures.
  • Static versus instance methods.

To me, these are nonnegotiable in an object-oriented language. Let's go through each one and talk about it in relation to Perl.

Real Instance Variables

Perl emulates instance variables in many different ways. For objects that are constructed as blessed hashes, which I'd say is the most common paradigm in Perl, the closest thing to instance variables are the keys to the blessed hash. So if I have an object $self, I refer to its "window" instance variable like so:

$self->{window} = SomeWindow->new(10, 10, 100, 100);

This produces a lot of extraneous code in the form of $self->{blah} and sometimes makes it hard to catch errors at compile time, even when you're using strict, since hash keys are not checked in any way. Object-oriented languages like Java implement instance variables in a very clean way. You declare it once in the class:

public String someInstanceVariable = "Hey you!";

and then later on, in some instance method, you just use that variable as a local:

public String someMethod() {
   if (someInstanceVariable.equals("Hey you!")) {
      return "Foo!";
   }
   return "Bar!";
}
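To make the earlier point about unchecked hash keys concrete, here is a minimal sketch in plain Perl. The Widget class and its set_window method are hypothetical names invented for this illustration. The misspelled key compiles without complaint under strict and warnings, and the intended "window" slot is never set.

```perl
use strict;
use warnings;

package Widget;   # hypothetical class, for illustration only

sub new {
    my ($class) = @_;
    return bless { window => undef }, $class;
}

sub set_window {
    my ($self, $window) = @_;
    $self->{windwo} = $window;   # typo in the key: still compiles under strict
    return $self;
}

package main;

my $widget = Widget->new->set_window("main window");

# The intended slot was never written; the typo silently created a second key.
print defined($widget->{window}) ? "window set\n" : "window is still undef\n";
print join(", ", sort keys %$widget), "\n";
```

Nothing fails here at compile time or at run time, which is exactly why this kind of bug tends to surface far from where it was introduced.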

Visibility Levels for Instance Variables

Data abstraction and encapsulation are critical to the object-oriented model. In Perl, you really have to work hard to encapsulate your data. Damian Conway devotes 30 pages to it in his book Object Oriented Perl (Manning, ISBN 1884777791), providing numerous suggestions and techniques. Most Perl programmers trust themselves and others not to break the rules, and everyone's happy. But contrary to what a lot of people think, visibility levels are not for security purposes; they're to make it harder for you to shoot yourself in the foot.

Many object-oriented languages provide at least three levels of visibility for instance variables:

  • Public—any object can access this instance variable on its owning object.
  • Protected—only an instance of this object or an instance of a subclass of this object can access this instance variable.
  • Private—only an instance of this object can access the instance variable.

Some languages also define other levels. For example, Java defines "package" to mean that the instance variable is visible to any object in the same package as the owning instance. Since Perl doesn't really have real instance variables (see above), it obviously doesn't have any kind of built-in visibility system, either.
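As an illustration of how much work encapsulation takes in plain Perl, here is a sketch of one well-known workaround: keeping the data in a lexical variable that only closures can see. The Counter class is a hypothetical example, and this is only one of several possible techniques, not a built-in visibility system.

```perl
use strict;
use warnings;

package Counter;   # hypothetical class, for illustration only

sub new {
    my ($class, $start) = @_;
    # Lexical state: reachable only through the closures below.
    my $count = defined $start ? $start : 0;
    my %methods = (
        increment => sub { return ++$count },
        value     => sub { return $count },
    );
    # The object is a blessed code ref, so there is no hash to reach into.
    return bless sub {
        my ($message) = @_;
        my $method = $methods{$message} or die "No such method: $message\n";
        return $method->();
    }, $class;
}

sub increment { return $_[0]->('increment') }
sub value     { return $_[0]->('value') }

package main;

my $counter = Counter->new(5);
$counter->increment;
print $counter->value, "\n";
```

The data is effectively private, but at the cost of an extra layer of dispatch and a fair amount of boilerplate for every class.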

Named Arguments in Method Signatures

This is such a simple and effective concept that it's mind-boggling that the designers of Java missed it, and it's made the life of many a Java developer a total nightmare. But let's take a look at the concept first and what it means to Perl.

Most of us would agree that a method usually has a name, some arguments, a body, and a return value of some kind. There are many variations, of course (methods with no return values, or no name, or no arguments), but this is the general idea. That's all very well, but more often than not we end up with code that looks like this:

$newObject->init("Phil", 185, 95, "FRANCE", "BROWN", "GREEN");

What is all that? You can't tell from context exactly what the programmer meant by this line of code. You can get the gist, that's for sure. It's initializing $newObject with a slew of values, and we can probably surmise that "Phil" is a name, the next two numbers are probably height and weight (in metric values), then maybe country of birth, then probably hair color, and finally eye color. But maybe it's not "country of birth" but "country of residence"? And maybe BROWN refers to trousers, not hair color? I think you get the point. Java suffers from this terribly, particularly because of its method overloading: A class could have 10 methods, all called set, that take different numbers of arguments. So when you see this in someone's code:

myNewObject.set(10, "Tweak", new Integer(12), "Stalefish");

you're going to be lost.

The solution to this is named arguments. We see it in Perl fairly regularly (and yes, you should pay particular attention to the names in this example):

$newObject->init( -subject => "Phil",
                 -courseNumber => 185,
                 -numberOfStudents => 95,
                 -fieldTripLocation => "FRANCE",
                 -professor => "BROWN",
                 -teachingAssistant => "GREEN" );

That is much easier to understand than the earlier example, and we see that our original assumptions were in fact totally incorrect, too.

But you can take this one step further. If you were to look at the source code to the init method above, you would not immediately see what the arguments are:

sub init {
  my $self = shift;
  ...
}

so if you generated API documentation from your code (which is something I do), you would not see a canonical list of what that method expects. Moreover, since it's really a hash, it has no inherent order.
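One way to partly work around this in plain Perl is to keep an explicit, ordered list of the expected argument names next to the method and check callers against it. The sketch below assumes a hypothetical Course class and a hypothetical @INIT_ARGS list. It shows the idea only, and the list is still a convention rather than part of the method signature.

```perl
use strict;
use warnings;

package Course;   # hypothetical class, for illustration only

# Hypothetical canonical list of the arguments init() expects, in order.
my @INIT_ARGS = qw(subject courseNumber numberOfStudents
                   fieldTripLocation professor teachingAssistant);

sub new { return bless {}, shift }

sub init {
    my ($self, %args) = @_;

    # Reject anything that is not in the declared list.
    my %allowed;
    $allowed{"-$_"} = 1 for @INIT_ARGS;
    for my $key (keys %args) {
        die "Unknown argument: $key\n" unless $allowed{$key};
    }

    # Copy the named arguments into the object, in declared order.
    $self->{$_} = $args{"-$_"} for @INIT_ARGS;
    return $self;
}

package main;

my $course = Course->new->init(
    -subject           => "Phil",
    -courseNumber      => 185,
    -numberOfStudents  => 95,
    -fieldTripLocation => "FRANCE",
    -professor         => "BROWN",
    -teachingAssistant => "GREEN",
);
print "$course->{subject} is taught by $course->{professor}\n";
```

A documentation generator could read @INIT_ARGS, but nothing forces the list and the method body to stay in step.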

In Objective-C, this problem is solved by naming the arguments within the method signature itself:

- (void) initWithSubject:(char *)subject
           courseNumber:(char *)courseNumber
       numberOfStudents:(int)numberOfStudents
      fieldTripLocation:(char *)fieldTripLocation
               taughtBy:(char *)professor
             assistedBy:(char *)teachingAssistant    
{
      ...
}

So you can tell, both from the method signature and any invocation, what is going on. This also helps with class browsers, for example, where you can browse through the class hierarchy and see clearly which arguments go where for any method.

Static Versus Instance Methods

In Perl, there's no real distinction between static and instance methods. Sometimes it's nice, as a programmer, to be able to delineate methods that work on a class level from methods that work on an instance level. Most static methods in Perl are only static because their first argument is the class name, rather than an instance of the class. It's a matter of taste, but I prefer clearly indicating which methods are static and which are not.
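In plain Perl, the distinction can only be enforced by hand, by checking whether the invocant is a reference. The sketch below uses a hypothetical Account class to show that pattern. It is a convention you must repeat in every method, not a language feature.

```perl
use strict;
use warnings;

package Account;   # hypothetical class, for illustration only
use Carp qw(croak);

# Intended as a class ("static") method: the invocant is a class name.
sub new {
    my ($class, %args) = @_;
    croak "new() is a class method" if ref $class;
    return bless { balance => $args{balance} || 0 }, $class;
}

# Intended as an instance method: the invocant must be an object.
sub balance {
    my ($self) = @_;
    croak "balance() is an instance method" unless ref $self;
    return $self->{balance};
}

package main;

my $account = Account->new(balance => 100);
print $account->balance, "\n";

# Calling the instance method on the class is caught only at run time.
my $ok = eval { Account->balance; 1 };
print "class-level call rejected\n" unless $ok;
```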

Enter ObjectivePerl

I had many different projects on the go, some for work, some personal, but all Perl. I found myself longing for the day where I could write nice, clean, tight, readable code of the kind that Objective-C tended to encourage. But I also found myself very attached to the wonderful Perl runtime and its supreme flexibility. So I decided to harness that power and try to solve some of the problems I outlined earlier. But how?

My first decision was a simple one. I like Perl, I like Objective-C, so why not add the familiar Objective-C syntax to Perl, the way it was originally grafted onto C? Thanks to Damian Conway's excellent Filter::Simple, this proved to be possible. Well, almost.

ObjectivePerl Syntax

To understand ObjectivePerl syntax, you need a quick primer in Objective-C syntax. The core of Objective-C syntax is the message-passing mechanism. Rather than invoking methods on objects, you send messages to objects. It's a small distinction but it's subtly different from your standard object/method paradigm. Objects might not respond to a message, but that won't necessarily cause your code to yack. Also, objects can forward messages to their delegate objects, for example. Message passing is achieved using this syntax:

returnValue = [object message];

or with arguments:

returnValue = [someObject doThis:this withThat:that andSomethingElse:thisOtherThing];

You are free to embed messages in messages:

[someObject setValue:[someOtherObject getValue]];
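For readers who know Perl better than Objective-C, the ideas of an unanswered message and of forwarding to a delegate can be approximated in plain Perl with AUTOLOAD and can. The Forwarder and Delegate classes below are hypothetical, and this sketch is not how the Objective-C runtime or ObjectivePerl implements messaging.

```perl
use strict;
use warnings;

package Delegate;   # hypothetical class, for illustration only
sub new   { return bless {}, shift }
sub greet { my ($self, $name) = @_; return "Hello, $name" }

package Forwarder;  # hypothetical class, for illustration only
our $AUTOLOAD;

sub new {
    my ($class, %args) = @_;
    return bless { delegate => $args{delegate} }, $class;
}

# Called for any method Forwarder does not define itself.
sub AUTOLOAD {
    my ($self, @args) = @_;
    (my $message = $AUTOLOAD) =~ s/.*:://;   # strip the package name
    return if $message eq 'DESTROY';
    my $delegate = ref($self) ? $self->{delegate} : undef;
    if ($delegate && $delegate->can($message)) {
        return $delegate->$message(@args);   # forward to the delegate
    }
    return;   # nobody responds: quietly return nothing
}

package main;

my $object = Forwarder->new(delegate => Delegate->new);
print $object->greet("Kyle"), "\n";
print defined($object->fly) ? "flew\n" : "no response to 'fly'\n";
```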

Now, because square brackets already do more than one job in Perl, I couldn't use them as-is, so I needed to find some not-too-different way to embed ObjectivePerl messaging into Perl. I did some hefty regexp searches of the Perl Standard Library and found that ~[object message] wouldn't conflict with anything.

So, using Filter::Simple, I started writing the parser that would translate this new ObjectivePerl syntax into regular Perl method invocations. (Some of you might wonder why I implemented this as a source filter rather than write a real parser, perhaps using Parse::RecDescent. The answer to that is simple: Most of my work is in mod_perl, which retains all its loaded modules in memory. I didn't want any large modules—like Parse::RecDescent—expanding the memory footprint of my Apache processes.) Here's the basic gist of the filter:

use ObjectivePerl;
...your objective perl code...
no ObjectivePerl;  # you don't need this if you're at the end of the file
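The kind of rewriting such a filter performs can be sketched in plain Perl as a function that transforms source text. The version below is a toy under stated assumptions: it handles only messages without arguments, translate_unary_messages is a hypothetical name, and it is not the parser ObjectivePerl actually uses. In a real source filter, a transformation like this would run on the module's source before Perl compiles it.

```perl
use strict;
use warnings;

# Toy translator: rewrites ~[receiver message] into receiver->message.
# Messages that carry arguments (name:value pairs) are left untouched.
sub translate_unary_messages {
    my ($source) = @_;
    $source =~ s{
        ~\[ \s*
        ( \$? [A-Za-z_]\w* (?: ::\w+ )* )   # receiver: scalar variable or class name
        \s+
        ( [A-Za-z_]\w* )                    # message name
        \s* \]
    }{ $1 . '->' . $2 }gxe;
    return $source;
}

my $input  = 'my $instance = ~[MyClass new]; my $v = ~[$instance value];';
my $output = translate_unary_messages($input);
print "$output\n";
```

A single regular expression like this cannot cope with arguments, nested messages, or message-like text inside strings and comments, which is where most of the real parsing effort would go.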

After some tweaking, it actually worked, so I started working on the other aspects of ObjectivePerl that I wanted to include. I wrote some parsing goop that allowed the developer to define classes and their member methods using the Objective-C syntax:

@implementation ClassName [: ParentClass] [<Protocol, Protocol, ...>]
... class body ...
@end

Here's a silly example:

use ObjectivePerl;
@implementation MyClass

+ new {
  ~[$super new];
}
- someMethodWithArgument:$argument andAnother:$another {
  print "Argument: $argument\n";
  print "Another: $another\n";
}

@end

Methods are defined by either a leading "+" indicating that they are static methods, or a leading "-" indicating that they are instance methods. Static methods automagically have the variables $super and $className set, and instance methods get $super, and of course, $self.

So now you could instantiate that like this:

my $instance = ~[MyClass new];

and then invoke its one method like this:

~[$instance someMethodWithArgument:"Hey there" andAnother:$someValue];

What About Instance Variables?

I still needed to implement instance variables and visibility levels. It seemed like the Filter::Simple way of rewriting the source code before it's executed would suit this task, too, so I added the parsing of an instance variable declaration block, like this:

@implementation MyClass
{
  $someInstanceVariable;
  $someOtherInstanceVariable;
}
...
@end

Variables declared in the instance variable block are special insofar as your instance methods automatically have access to them. For example, you could write accessors like this:

- someInstanceVariable {
  return $someInstanceVariable;
}

- setSomeInstanceVariable:$value {
  $someInstanceVariable = $value;
}

which is a nice clean way to encapsulate your instance data. Of course, you can use the instance variables in any instance method, not just accessors:

- performSomeTask {
  my $taskVariable = $someInstanceVariable * 2;
  return ~[$self performSomeOtherTask:$taskVariable];
}

Right now, it's limited to scalars as instance variables, but I don't see that as being much of a problem since I use references all the time.

Instance Variable Visibility

For instance variables to be real, they need to be accessible by instances of the class, but also by instances of subclasses. Furthermore, the developer should be able to restrict access by subclasses to parent class instance variables, using visibility rules. This proved to be the trickiest part of the equation, and the solution is something that I won't go into here. However, the result is that there are two different visibility levels available to the developer: protected and private. The common visibility level, public, has no application here because of the way Perl dereferences objects. In Java, if instance variable rabbit of object instance hat is public, then anyone can do this:

System.out.println(hat.rabbit);

which reaches right into hat and pulls out the value of rabbit. In Perl, there's no such dereferencing mechanism because objects are mostly blessed hashes, and their instance variables are just specified by the hash keys.

To declare visibility levels in ObjectivePerl, you simply use the @protected and @private directives when declaring the instance variables in the declaration block:

@implementation MyClass
{
  @protected: $this, $that, $theOther;
  @private: $myDeepestThoughts, $lingerie;
}
...
@end

So in this case, all five of those variables are accessible to methods in MyClass, but only $this, $that, and $theOther are visible to any subclasses of MyClass. This doesn't mean that the variables are not there in subclasses, just that they can't be accessed directly from methods in the subclass. Here's an example to illustrate this:

use ObjectivePerl;
@implementation BaseClass
{
  @private: $private;
  @protected: $protected;
}

- protectedValue {
  return $protected;
}

- setProtectedValue:$value {
  $protected = $value;
}

- privateValue {
  return $private;
}

- setPrivateValue:$value {
  $private = $value;
}
@end

@implementation SubClass : BaseClass

- dumpProtectedValue {
  print $protected."\n";
}

- dumpPrivateValue {
  print ~[$self privateValue]."\n";
}

@end

Notice that the subclass can transparently refer to the $protected instance variable, even though it did not declare it itself. It cannot refer to $private directly, because that causes a compile-time error. It does, however, still have the $private variable and can set and get its value using the accessor methods defined by the parent class.

Why Bother?

When George Mallory was asked why he wanted to climb Mount Everest, he famously replied "Because it's there." Well, in many ways, this is the same kind of situation: For the most part, I did it because I could, for fun. I also find it useful, however, and I'm using it with some of my own projects. Your mileage may vary. I like the clean coding style, but it might not be your cup of tea. I also was intrigued by the CamelBones project (http://camelbones.sourceforge.net/) that allows you to write GUI applications for Mac OS X in Perl. It struck me that it would be an easy transition from Objective-C (which is the most common language for application development on Mac OS X) to an Objective-C-like version of Perl. From my experience so far, it is.

What Next?

Right now, there's quite a long to-do list, including:

  • Adding visibility levels to methods (like Java has, where a method—like an instance variable—can be private, protected, or public).
  • Implementing the Objective-C concept of Protocols (known as "interfaces" in the Java world).
  • Allowing nonscalars as instance variables.
  • Improving the debugging capabilities, which are presently very rudimentary.

But even in its current form, ObjectivePerl is very usable. Since it doesn't take anything away from regular Perl, you can mix and match at will. If you know how the runtime works, too, you can easily send messages to ObjectivePerl objects from regular Perl, and you can always invoke regular Perl methods from anywhere in ObjectivePerl. You can even subclass regular Perl classes in ObjectivePerl and it will work because underneath it all, your subclass is really just a regular Perl class with a lot of the goop handled for you.

If you're at all curious, go ahead and download it from CPAN and play around. I'd be happy to answer any questions you might have and more than happy to accept help in bug-stomping and improving it!

TPJ

