
lvalue Accessor Methods With Value Validation


July, 2004

Juerd is the owner of Convolution (http://convolution.nl/), and is the author of numerous Perl modules, including Attribute::Property.


You like Perl. You appreciate its flexibility, expressiveness, power, and culture. Or maybe that's just me. I like that Perl enables me to write what I mean in a way that I feel comfortable with. More important is that it lets me change what I don't like. Even when something has always been done in a particular way, I can decide to make things easier for myself. Even better: I can share my code via CPAN and make things easier for everyone who likes it!

For a long time, I used accessor methods in the same style that most Perl coders know them. But it never felt natural. Let me try to explain this using some contrived example code:

{
  my $foo;
  sub foo {
    @_ ? $foo = shift : $foo
  }
}

foo(42);
print foo;

Do you think that looks strange? I do. This is not something you're likely to encounter: a sub that exists only to provide access to a variable. Usually, the variable is used directly. I'm very comfortable with just $foo and I think you are, too.

But for some reason, the object-oriented world is different. For that world, someone once thought that access to variables should only happen through accessor methods, like the one illustrated above. The OO variant would shift a $self and then set an element in the structure that it represents, but the idea is the same.

When I learned OO in Perl, I found this very strange and I didn't like it. I don't find it strange anymore, but I still don't like it. In my perfect world, everything is simple and I don't think much about who can access my data and who should not be allowed to. But more importantly, things should be easy to use.

It may be easy to use such accessor methods and, with the help of some modules, it can even be done without thinking about the internals, but some very fundamental things that Perl programmers are used to are not possible simply because there is no lvalue to work with (see the sidebar "What is an lvalue?" for more on lvalues).

For example, when I made my first LWP program, I wanted to add the name of my program to the user agent string. The original user agent could stay because I was rather proud that I used Perl with LWP. For this, I had to call $ua->agent twice, once for reading the value and once for storing the new one. This really disappointed me because I wanted to use s/$/ $0/, or at least just .= " $0". (This was before letting the user agent string end with a space meant anything.)
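To make the double call concrete, here is a minimal sketch. My::Agent is a hypothetical stand-in for any class with a traditional combined getter/setter; it is not LWP itself, and the initial agent string is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a class with a classic combined getter/setter.
package My::Agent;

sub new { return bless { agent => 'example-agent/1.0' }, shift }

sub agent {
    my $self = shift;
    $self->{agent} = shift if @_;   # setter when called with an argument
    return $self->{agent};          # getter otherwise
}

package main;

my $ua = My::Agent->new;

# Appending needs two calls: one to read, one to write.
$ua->agent($ua->agent . " $0");

# The wished-for form is not possible, because agent() is not an lvalue:
#   $ua->agent .= " $0";
```

The commented-out line is the one the rest of this article tries to make legal.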

Of course, it's possible to just access the internal hash, but that's not part of the public API and I had already learned that laziness is a virtue, so I didn't like having to type curlies all the time.

I don't want OO programming to be that much different from "normal" programming. A method should behave like a sub and a property should behave like a variable. That parentheses are required for method calls with arguments is bad enough already. Properties should behave like variables, as they already do in Ruby, Python, JavaScript, Perl 6, Visual Basic, PHP, and the like. But not in current Perl. In Perl, a property is usually a method that accesses a hash element, and accessing that hash element directly is frowned upon.

Using accessor methods feels like using FETCH and STORE instead of a tied hash: It works, but I'd rather have something a little more convenient.

Attributes

Perl now has attributes. To clarify things, attributes don't really have anything to do with object orientation: at least in the Perl 5 world, an attribute is not the same as a property. (If this is hard to understand, prepare yourself for Perl 6, where attributes become traits, properties become attributes, and properties are runtime traits.)

Attributes can be attached to variables and subroutines when they are declared. For this, the colon is used. Several built-in attributes exist: locked, method, lvalue, and unique. All of these in some way change the behavior of your variable or sub.

The one I'm after is lvalue. This lets me do exactly what I want—have a subroutine (or more specifically, a method) behave like a variable.

Simply put, you have to return an lvalue and that lvalue is then used. This "returning" cannot be done with the return keyword but instead has to be the natural expression the sub evaluates to. In other words, the last expression in the sub.

So my previous example, using this new feature, would be:

{
  my $foo;
  sub foo : lvalue {
    $foo
  }
}

foo = 15;
print foo;

Or, in the simple OO world where every object is just a reference to a blessed hash:

sub foo : lvalue {
  my $self = shift;
  $self->{foo}
}

$object->foo = 15;
print $object->foo;

Now the method can be used as if it were a variable. All those nice +=-like operators work with it and even s/// now works. It's still an accessor method, but at least it feels much better now—and I don't have to use curlies.
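A short sketch of those operators at work. My::Counter and its two properties are hypothetical names chosen only for this illustration.

```perl
use strict;
use warnings;

package My::Counter;   # hypothetical class, for illustration only

sub new { return bless { count => 0, label => 'total  ' }, shift }

# Each accessor returns the hash element itself as an lvalue.
sub count : lvalue { $_[0]{count} }
sub label : lvalue { $_[0]{label} }

package main;

my $c = My::Counter->new;

$c->count += 5;            # compound assignment
$c->label =~ s/\s+$//;     # substitution in place
$c->label .= ':';          # append

print $c->label, ' ', $c->count, "\n";
```

Each of these statements would need a read call plus a write call with a traditional accessor.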

Tradition

History tells us that accessor methods don't just exist to hide internals but also to do something with the new value before it gets assigned: To trim whitespace from it, to automatically make it into an object, or more commonly, to test if the value is what we consider valid. Truth be told, it wasn't history that told me, it was the monks of Perl Monks (http://www.perlmonks.org/). Having learned about the lvalue methods and having used them for a while, I went there to ask why this technique was not used more often. Clearly, it was superior, I thought.

As it turns out, it's superior only for the class user—not for the class maintainer who likes to make sure no strange values are passed because that eventually means debugging a lot and tracing back to where the strange value came from. With normal accessor methods, you can see the new value before you decide that it is valid.

With lvalue methods, you usually don't know what happens later. Someone can take a reference to the variable you returned and assign it a new value 14 times. Traditionally, we have used accessor methods that do not behave like variables because if something behaves like a variable, we do not have total control.
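The loss of control can be sketched as follows. My::Person is a hypothetical class; the point is only that a plain lvalue accessor hands out the hash element itself, so no code of ours runs when it changes.

```perl
use strict;
use warnings;

package My::Person;   # hypothetical class

sub new { return bless { age => 30 }, shift }

sub age : lvalue { $_[0]{age} }

package main;

my $p = My::Person->new;

$p->age = -5;              # nothing checks this value

my $ref = \$p->age;        # reference to the hash element itself
$$ref = 'not a number';    # changed later, far away from the accessor
```

After the last line, the object holds the string, and the class never had a chance to object.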

Total Control

But that doesn't have to be true. It is probably true in other languages, but in Perl, we do in fact have the power to decide what should happen when our variables are used. We can tie them to a class and keep track of everything that happens, adjusting things if what happens is not what we want to happen.

tie is used either to link methods to a variable or to provide a variable-like interface to methods, depending on the point of view. The FETCH and STORE methods are called to fetch and store the value from and into the variable. A variable is the most familiar kind of lvalue.

With tie, it is possible to keep track of what happens to an lvalue, and that is exactly what was missing: control over what happens to the variable after the sub has finished. Even though the recipe sounds simple, since all we have to do is tie our value before returning it, we get complex code when doing so. We also need a class to tie to.

Suppose we want to make sure that whatever value we put into foo is less than 50. That can be written as:

package Tie::Scalar::LessThanFifty;
use Carp qw(croak);
use Tie::Scalar;
our @ISA = ('Tie::StdScalar');

sub STORE {
  my ($self, $new_value) = @_;
  croak "Invalid value" unless $new_value < 50;
  $$self = $new_value;
}

package My::Class;

sub foo : lvalue {
  my $self = shift;
  # Tie only once: re-tying would discard the value already stored.
  tie $self->{foo}, 'Tie::Scalar::LessThanFifty'
    unless tied $self->{foo};
  $self->{foo}
}

$object->foo = 10;
$object->foo = 55;  # "Invalid value"

Improvement

While this works and solves the immediate problem, it isn't really easy to type and maintain. When written like this, a separate class is needed for each value test, and $self->{foo} stays tied long after the calling code has no need for it anymore.

So, ideally, the class we tie to should allow a coderef to be passed for the validity test to avoid needing dozens of Tie classes. And only a temporary variable that acts as an interface for the hash element should be tied so that the actual hash element itself stays untied and things aren't tied any longer than necessary.
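One way this could look is sketched below. My::Tie::Checked and My::Class are hypothetical names, and this is only an illustration of the idea under those assumptions, not the code Attribute::Property actually uses. The tie class receives a reference to the real storage and a coderef for the test; the accessor ties a temporary and returns that.

```perl
use strict;
use warnings;

# Hypothetical tie class: forwards reads and writes to another scalar and
# runs a caller-supplied check before every write.
package My::Tie::Checked;
use Carp qw(croak);

sub TIESCALAR {
    my ($class, $target_ref, $check) = @_;
    return bless { target => $target_ref, check => $check }, $class;
}

sub FETCH { ${ $_[0]{target} } }

sub STORE {
    my ($self, $new_value) = @_;
    croak 'Invalid value' unless $self->{check}->($new_value);
    ${ $self->{target} } = $new_value;
}

package My::Class;

sub new { return bless {}, shift }

sub foo : lvalue {
    my $self = shift;
    # Tie a temporary, not the hash element, so $self->{foo} stays untied.
    tie my $proxy, 'My::Tie::Checked', \$self->{foo}, sub { $_[0] < 50 };
    $proxy
}

package main;

my $object = My::Class->new;
$object->foo = 10;                          # accepted
my $ok = eval { $object->foo = 55; 1 };     # croaks inside STORE
print "rejected: $@" unless $ok;
```

A single tie class now serves every property, and each property only has to supply its own test.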

It would be even better if there were a very simple syntax to create such new-style properties.

Enter Attribute::Property

Attribute::Property does all this and a little more. It lets you write the big block of code above simply as:

sub foo : Property { $_ < 50 }

$object->foo = 10;
$object->foo = 55;  # "Invalid value for foo property"

Its interface is an attribute, like the built-in lvalue attribute. Attributes that are not built into Perl usually begin with a capital letter, like Property in this example. A Class::MethodMaker-like interface could just as easily have been used, but Perl has a new feature and I like trying out new features, especially if that feature lets people use very nice syntax without using a source filter.

Whenever you use the Property attribute, internally, your method is replaced with a complex and micro-optimized method that has the lvalue attribute and returns a tied temporary variable. The original code block ($_ < 50, in this case) is hijacked and passed on to tie, so that it can be used later on when something decides to STORE a new value.

When a value is stored, the original code block is executed and its return value is evaluated in Boolean context. If it is true, the new value is assigned; if it is false, the STORE handler croaks. If the default error message is not appropriate, or you have a more descriptive one, simply croak yourself before the STORE handler gets the chance to do so:

sub foo : Property {
  $_ < 50
    or croak 'Value for foo property exceeds limit' 
}

As you have probably already seen in the examples, the new value is available as $_. But just for convenience, it is also passed in @_. In @_, you'll find the object (or class) and the new value. $_ and $_[1] are both aliases for the new value. That means that you can choose to change it and that the value exists only once in memory. For example, to trim trailing whitespace from the new value:

sub bar : Property { s/\s+$//; 1 }

The extra 1 at the end is to make sure truth is returned. Otherwise, a value without trailing whitespace would have been considered an invalid value.
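How a check can see the new value as both $_ and $_[1] without copying it can be sketched in plain Perl. The helper name run_check is hypothetical, and this is not the module's actual code; it only demonstrates the aliasing that makes in-place changes possible.

```perl
use strict;
use warnings;

# Hypothetical helper: run $check with the value visible both as $_ and as
# $_[1], without copying it.
sub run_check {
    my ($check, $object) = (shift, shift);
    for ($_[0]) {                        # $_ now aliases the caller's scalar
        return $check->($object, $_);    # so does $_[1] inside $check
    }
}

my $value = "some text   ";
my $ok = run_check(sub { s/\s+$//; 1 }, undef, $value);
# $value has lost its trailing whitespace; the original scalar was changed.

my $other = "abc";
run_check(sub { $_[1] = uc $_[1]; 1 }, undef, $other);
# $other was changed through $_[1] in the same way.
```

Because both names refer to the same scalar, a check can clean up the value before it is stored.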

To ease migrating from the old accessor methods to the new lvalue accessor methods, Attribute::Property also lets the user use the old syntax. That means that $object->foo = 5 and $object->foo(5) do the same thing.
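The same dual interface can be written by hand for a plain lvalue accessor without validation. This is a minimal sketch with a hypothetical class name, not the way Attribute::Property implements it:

```perl
use strict;
use warnings;

package My::Thing;   # hypothetical class

sub new { return bless {}, shift }

sub foo : lvalue {
    my $self = shift;
    $self->{foo} = shift if @_;   # old style: $object->foo(5)
    $self->{foo}                  # new style: $object->foo = 5
}

package main;

my $object = My::Thing->new;

$object->foo(5);
my $first = $object->foo;    # value set with the old syntax

$object->foo = 6;
my $second = $object->foo;   # value set with the new syntax
```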

In case no value validation is needed, the validation code block can simply be left out. For efficiency, the value is then left untied. Don't forget that without a code block, subroutine declarations need a semicolon:

sub baz : Property;

If you have several properties, vertically align the colons to get very neat and readable code:

package Article;

sub title    : Property;
sub subtitle : Property;
sub author   : Property;
sub abstract : Property;
sub body     : Property;

All that's missing now is a constructor. You can either craft one yourself (as long as the object is a reference to a blessed hash) or use the simple one that Attribute::Property provides:

sub new      : New;

This is all that is needed to get a functional Article class.
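For those who prefer to craft the constructor themselves, a minimal sketch follows. It assumes the constructor takes key/value pairs, which is a choice made for this example and not necessarily how New behaves; My::Article is a hypothetical class that uses plain lvalue accessors so that the example does not depend on the module.

```perl
use strict;
use warnings;

package My::Article;   # hypothetical; plain lvalue accessors, no module

# Hand-written constructor: the object is a blessed hash reference.
sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

sub title  : lvalue { $_[0]{title} }
sub author : lvalue { $_[0]{author} }

package main;

my $article = My::Article->new(title => 'lvalue Accessor Methods');
$article->author = 'Juerd';

print $article->title, ' by ', $article->author, "\n";
```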

Perlfection

Now, my properties finally behave like variables, without sacrificing validation and being able to alter the new value before it's stored. And all that comes with syntax that lets me write most properties on a single line.

A lot of time was spent on writing Attribute::Property, even though the module has only 80 lines of code, but now I write clean OO code in much less time, so it was for a good cause: laziness.

TPJ

