Why I Love Ruby


The Perl Journal January 2003

By Simon Cozens

Simon is a freelance programmer and author, whose titles include Beginning Perl (Wrox Press, 2000) and Extending and Embedding Perl (Manning Publications, 2002). He's the creator of over 30 CPAN modules and a former Parrot pumpking. Simon can be reached at simon@simon-cozens.org.


In December's The Perl Journal, my fellow columnist brian d foy presented an introduction to Ruby. Well, he's not the only person who's been taking a look at this relative newcomer to the language scene, and I have to admit that I've been growing a lot more impressed with it recently.

This month, I'll take you on another tour of some of the things that attracted me to Ruby.

Perl 6, Now!

Let's start with a polemic: Ruby provides what Perl 6 promises, right now. If you're excited about Perl 6, you should be very, very excited about Ruby. You want a clean OO model? It's there. You want iterators? Got them, too. You want user-redefinable operators? Check. Even the recent discussion on perl6-language about list operators—Ruby's got them all. In fact, a lot of the things that you're waiting on Perl 6 for are already there—and in some cases, cleaner, too.

Let's start by looking at some code. A typical example of object-oriented Perl 5 is shown in Example 1(a).

Not too bad, right? Except, well, package is a bit of a silly name, since it's actually a class; and it would be nicer if we could take arguments to the method in a bit more normal way. And that hash is a bit disconcerting. In Example 1(b), you can see what Perl 6 makes of it.

Much better—except that, unfortunately, you can't actually run Perl 6 code through anything right now. That's always a bit of a problem when you need to get stuff done. So let's see it again, but this time in Ruby; see Example 1(c).

Much neater, no? Apart from the bits that are exactly the same, of course. But what? No dollar signs on the variables? Well, you can have them if you want, but they mean something different in Ruby—dollar signs make variables global. But hey, don't you need something to tell you what's an array or a hash or a scalar? Not in Ruby—and actually, not in Perl 6 either, but for a different reason.

In Perl 6, variable prefixes are just a hint; Larry has said that you should consider them part of the name. You'll be able to dereference an array reference with $myvar[123] and a hash reference with $myvar{hello}, so things looking like scalars won't give you any indication of what's in them.

Ruby takes this approach further—values have types, variables do not. Since everything's an object in Ruby, it doesn't make sense to distinguish between "array variables" and "scalar variables"—everything's an object, and variables hold references to objects. If you get bored with your variable that has an array in it, you can put a hash in it. Ruby doesn't care; it's just a different kind of object.
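
To make the "values have types, variables do not" point concrete, here is a small sketch. The method name classes_of_one_variable and the sample values are made up for illustration; the only thing it relies on is that every object answers to class.

```ruby
# Records the class of whatever one variable holds at three points in time.
def classes_of_one_variable
  seen = []
  thing = [1, 2, 3]          # the variable holds an Array...
  seen << thing.class
  thing = { "one" => 1 }     # ...now a Hash; no declaration changes
  seen << thing.class
  thing = "just a string"    # ...and now a String
  seen << thing.class
  seen
end

p classes_of_one_variable   # prints [Array, Hash, String]
```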

So what are those "@" signs about? They're the Ruby equivalent of Perl 6's $.—method instance variables. The only slight difference is that we want to ensure that the age is an integer; so we call its to_i method to turn it into an integer. We can do this because, as we've mentioned, in Ruby, everything's an object.
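
As a rough illustration of the "@" variables and the to_i call just described, here is a minimal sketch. The Person class, its attribute names, and the sample values are hypothetical stand-ins; this is not the code from Example 1(c).

```ruby
# Hypothetical class for illustration only.
class Person
  attr_reader :name, :age

  def initialize(name, age)
    @name = name       # "@" marks an instance variable
    @age  = age.to_i   # to_i turns the String "42" into the Integer 42
  end

  def describe
    "#{@name} is #{@age} years old"
  end
end

puts Person.new("Alice", "42").describe   # prints Alice is 42 years old
```

Because the conversion is an ordinary method call on the argument, the same initializer would accept anything that responds to to_i.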

Everything's an Object

They say that a foolish consistency is the hobgoblin of little minds, and Perl takes this approach to justify some of its more unusual quirks. But unfortunately, when it comes to programming languages, a lot of consistency isn't foolish at all.

And so with the advent of Perl 6, I found myself wishing for a little more consistency in the area of object-oriented programming. In fact, I really wanted to be able to treat everything as an object, so that I could be sure that it would respond to methods. Ruby gives me that. Let's spend a little time with Ruby's interactive shell—another neat feature—and see what that really means:

irb(main):001:0> a = [1, 2, 3, 4]
[1, 2, 3, 4]
irb(main):002:0> a.class
Array

So arrays are objects; that's pretty natural, as you will want to ask an array for its length, run maps and greps on it, and so on.

irb(main):003:0> a.reverse
[4, 3, 2, 1]

But what about the individual elements in the array?

irb(main):004:0> a[1].class
Fixnum

Mmm, so numbers are just Fixnum objects. But wait, what's a Fixnum?

irb(main):005:0> a[1].class.class
Class

Ah, so even classes are objects; they're just objects of class Class. Fair enough. So this shouldn't be a surprise either:

irb(main):006:0> a[1].class.class.class
Class

It's objects all the way down!

Naturally, this allows pretty interesting introspection possibilities. For instance, we can ask an Array what it can do for us:

irb(main):007:0> a.public_methods
["sort!", "clone", "&", "reverse", ...]

And of course, this list of methods is itself an Array, so we can tidy it up a bit:

irb(main):009:0> a.public_methods.sort

["&", "*", "+", "-", "<<", "<=>", "==", "===", "=~",
"[]", "[]=", "__id__", "__send__", "assoc", "at", 
"class", "clear", "clone", "collect", "collect!", 
"compact", "compact!", "concat", "delete", 
"delete_at", "delete_if", "detect", "display", "dup", 
"each", ...]

Notice that since everything's an object, almost all operators are just methods on objects. One of those operator methods, ===, is particularly interesting; Ruby calls this the "Case equality operator," and it's very similar to a concept you'll see bandied around in Perl 6...
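
The following sketch spells a few operators out as explicit method calls and shows how === behaves for three built-in classes. The constant name CASE_EQUALITY_DEMO is an assumption made for this example; the === behaviors shown (Regexp matching, Range membership, class membership) are standard Ruby.

```ruby
# Each entry pairs the source text with the value it produces.
CASE_EQUALITY_DEMO = {
  '1.+(2)'             => 1.+(2),               # same as 1 + 2
  '[1, 2].<<(3)'       => [1, 2].<<(3),         # same as [1, 2] << 3
  '/^\d+$/ === "42"'   => (/^\d+$/ === "42"),   # Regexp#=== tests for a match
  '(1..5) === 3'       => ((1..5) === 3),       # Range#=== tests membership
  'String === "hello"' => (String === "hello")  # Module#=== tests the class
}

CASE_EQUALITY_DEMO.each do |source, result|
  puts "#{source} gives #{result.inspect}"
end
```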

Making the Switch

Perl 6 is touted to have an impressive new syntax for switch/case statements called "given statements." With a given block, you can pretty much compare anything to anything else using the =~ "smart match" operator and Perl will do the right thing. Use when and a string, and it will test whether the given argument is string equivalent; use when and a regular expression, and it will test whether the argument matches the regex; use when and a class name, and it will test whether the argument is an object of that class. Really neat, huh?

Now I want to make you wonder where that idea came from.

Here's a piece of Perl 6 code taken directly from Exegesis 4:

my sub get_data ($data) {
    given $data {
        when /^\d+$/    { return %var{""} = $_ }
        when 'previous' { 
            return %var{""} // fail NoData
        }
        when %var { 
            return %var{""} = %var{$_}
        }
        default { 
            die Err::BadData : msg=>"Don't understand $_"
        }
    }
}

And translated into Ruby:

def get_data (data)
  case data
    when /^\d+$/    ; return var[""] = data
    when 'previous' ; return var[""] || (fail NoData)
    when var        ; return var[""] = var[data]
    else
      raise Err::BadData, "Don't understand #{data}"
  end
end

Of course, this doesn't quite do what we want because Ruby's default case comparison operator for hashes just checks to see whether two things are both the same hash. The Perl 6ish smart match operator checks through the hash to see whether data is an existing hash key. This code looks very much like the Perl 6 version, but it's not the same.
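
The difference is easy to see in isolation. In this sketch, the SEASONS constant and the classify method are hypothetical names; with an unmodified Hash, a when clause matches only another hash that is equal to it, never one of its keys.

```ruby
SEASONS = { "spring" => 1, "summer" => 2 }

# With the stock Hash#===, "when SEASONS" compares for equality.
def classify(data)
  case data
  when SEASONS then "matched the hash"
  else              "fell through to else"
  end
end

puts classify("spring")                          # prints fell through to else
puts classify({ "spring" => 1, "summer" => 2 })  # prints matched the hash
```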

And we were doing so well.

Everything is Overridable

Not to worry. Not only is everything an object in Ruby, (almost) everything can be overridden, and the Hash class's === method is no exception. So all we need to do is write our own === method that tests to see if its argument is a valid hash key:

class Hash
    def === (x) 
        return has_key?(x)
    end
end

And presto, our case statement now does the right thing. The has_key? method on a Hash object checks to see whether the hash has a given key. But wait, where's the Hash object? Because we're defining an object method, the receiver for the method is implicitly defined as self. And it just so happens that self is the default receiver for any other methods we call inside our definition, so has_key?(x) is equivalent to self.has_key?(x). Now it all makes sense.

Of course, it's a little dangerous to redefine Hash#=== globally, in case other things depend on it. Maybe it would be better to create a variant of Hash by subclassing it:

class MyHash < Hash
    def === (x) 
        return has_key?(x)
    end
end


var = MyHash[...];

As you can see, this means that we can define === methods for our own classes, and they'll also do the right thing inside of when statements.
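
Putting the pieces together, a subclass along these lines can drive a case statement. This is a sketch under assumptions: the names KeyMatchingHash, VARS, and lookup, and the sample data, are invented here, and ArgumentError stands in for the article's Err::BadData.

```ruby
# Hypothetical subclass: === asks "is this one of my keys?"
class KeyMatchingHash < Hash
  def ===(other)
    key?(other)
  end
end

VARS = KeyMatchingHash["width" => 80, "height" => 24]

def lookup(data)
  case data
  when /^\d+$/ then data.to_i     # Regexp#===: a string of digits
  when VARS    then VARS[data]    # our ===: an existing key
  else raise ArgumentError, "Don't understand #{data}"
  end
end

puts lookup("17")      # prints 17
puts lookup("width")   # prints 80
```

Plain Hash objects elsewhere in the program keep their usual behavior, which is the point of subclassing instead of reopening Hash.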

It also means that we can redefine some of the built-in operators to do whatever we want. For instance, Ruby doesn't support Perl-style string-to-number conversion:

irb(main):001:0> 1 + "0.345"
TypeError: String can't be coerced into Fixnum
        from (irb):1:in '+'
        from (irb):1
irb(main):002:0>

And this is one of the things people like about Perl; "scalar" is the basic type, and strings are converted to numbers and back again when context allows for it. Ruby can't do that. Bah, Ruby must really suck, then.

Now we are going to do something very unRubyish.

class Fixnum
    alias old_plus + 
    def + (x)
      old_plus(x.to_f)
    end
end

irb(main):003:0> 1 + "2"
3.0

Ruby lovers would hate me for this. But at least it's possible.

First, we copy the old addition method out of the way because we really don't want to have to redefine addition without it. Now we define our own addition operator, which converts its argument to a float before calling the old method. Why does the addition method take only one argument? Well, remember that 1 + "2" is nothing more than syntactic sugar, and what we're actually calling is a method:

1.+("2")

and the receiver of this method is our self, 1. It's consistent, is it not?
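
For anyone who wants the Perl-ish coercion without touching the built-in number classes, one gentler option is a wrapper. This is a different technique from the alias trick above: the LooseNumber class is hypothetical, and it is used here instead of reopening Fixnum (which current Ruby versions have merged into Integer).

```ruby
# Hypothetical wrapper class for illustration; Integer#+ itself is left alone.
class LooseNumber
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # a + b is sugar for a.+(b): self is the left operand, other is the argument.
  def +(other)
    other = other.value if other.is_a?(LooseNumber)
    LooseNumber.new(@value + other.to_f)   # coerce to a float, then add
  end
end

sum = LooseNumber.new(1) + "2"
puts sum.value                         # prints 3.0
puts LooseNumber.new(1).+("2").value   # the same call, written as a method
```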

You Want Iterators?

There is a set of people on perl6-language who become amazingly vocal when anyone mentions iterators. I don't know why this is. Iterators aren't amazingly innovative or particularly interesting, nor do they solve all known programming ills. But hey, if you really get fired up about iterators, Ruby has those, too.

The most boring iterator Ruby supplies is Array#each. (# is not Ruby syntax—it's just a convention to show that this is an object method on an Array object, not an Array class method.) This is equivalent to Perl's for(@array):

[1,2,3,4].each {
    |x| print "The square of #{ x } is #{ x * x }\n"
}

By the way, here's Ruby's block syntax: We're passing an anonymous block to each, and it's being called back with each element of the array. The block takes an argument, and we define the arguments inside parallel bars. Some people don't like the { |x| ... } syntax. If that includes you, you have two choices: the ever-beautiful sub{ my $x = shift; ... }, or waiting until Perl 6. See? { |x| ... } isn't that bad after all.

You can call each on ranges, too:

(1..4).each {
    |x| print "The square of #{ x } is #{ x * x }\n"
}

Or maybe you prefer the idea of going from 1 up to 4, doing something for every number you see:

1.upto(4) {
    |x| print "The square of #{ x } is #{ x * x }\n"
}

Or even:

100.times {
    |x| puts "I must not talk in class"
}

(puts is just print ..., "\n", after all.)
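
To compare what each of these iterators actually hands to its block, the sketch below collects the yielded values. The method name yielded_by_each_form is made up for the example. Note the parentheses around the range: a method call binds more tightly than the .. operator, so the range has to be grouped before each is called on it.

```ruby
# Collects what each iterator passes to its block.
def yielded_by_each_form
  from_array = []
  from_range = []
  from_upto  = []
  from_times = []

  [1, 2, 3, 4].each { |x| from_array << x }
  (1..4).each       { |x| from_range << x }   # the range needs its parentheses
  1.upto(4)         { |x| from_upto  << x }
  4.times           { |x| from_times << x }   # times counts from 0

  { array: from_array, range: from_range, upto: from_upto, times: from_times }
end

yielded_by_each_form.each do |form, values|
  puts "#{form}: #{values.inspect}"
end
```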

Another frequent request is for some kind of array iterator that also keeps track of which element number in the array you're visiting. Well, guess what? Ruby's got that, too.

irb(main):001:0> a = [ "Spring", "Summer", "Fall", "Winter" ]
["Spring", "Summer", "Fall", "Winter"]

irb(main):002:0> a.each_with_index {
  |elem, index| puts "Season #{ index } is #{ elem }"
}
Season 0 is Spring
Season 1 is Summer
Season 2 is Fall
Season 3 is Winter
["Spring", "Summer", "Fall", "Winter"]

Oh yes—these iterators return the original object, so that they can be chained, just in case you wanted to do something like that.
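
A quick sketch of that chaining, with the constant SEASON_NAMES and the method label_and_reverse invented for the example: since each_with_index hands back the array it was called on, another Array method can follow the block directly.

```ruby
SEASON_NAMES = ["Spring", "Summer", "Fall", "Winter"]

# Labels each element, then chains reverse onto the iterator's return value.
def label_and_reverse(list)
  labels = []
  reversed = list.each_with_index { |elem, index|
    labels << "Season #{index} is #{elem}"
  }.reverse
  [labels, reversed]
end

labels, reversed = label_and_reverse(SEASON_NAMES)
puts labels
p reversed   # prints ["Winter", "Fall", "Summer", "Spring"]
```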

Those were the boring iterators. What about more interesting uses? I saw an interesting Perl idiom the other day for reading key=value lists out of a configuration file into a hash. Here it is:

open(EMAIL, "<$EMAIL_FILE") or die 
    "Failed to open $EMAIL_FILE"; 
my %hash = map {chomp; split /=/} (<EMAIL>);

So how does this translate to Ruby? There was quite a long thread about this on comp.lang.ruby, and I picked out three translations that impressed me for different reasons. Of course, there's more than one way to do it. Here's the first:

h = {}
File.open('fred.ini').read.scan(/(\w+)=(\w+)/) {
    h[$1] = $2 
}

We open a file, read the whole lot into a string, and then iterate on a regular expression—each time the regular expression matches, a block is called, and this associates the hash key with its element. Nice.

Established Perl programmers will see this and jump up and down about depending on the open call never failing. Good thinking, but Ruby has decent structured exceptions; if the open fails, an exception will be raised and hopefully caught somewhere else in your program.

Now that method is cute, but it reads the whole file into a single string. This can be memory hungry if you have 120-MB configuration files. Of course, if you do, you probably have other problems, but people will be pedantic. It'd be much better to read the file one line at a time, right? No problem.

File.foreach("fred.ini") {
    |l| s = l.chomp.split("="); h[s[0]] = s[1]
}

This is a more literal translation of what the Perl code is doing. Notice that Ruby's chomp returns the chomped string, without modifying the original. If you want to modify the original, you need chomp!—methods ending with ! are a warning that something is going to happen to the original object.

But even this method lacks the sweetness of the Perl idiom, which builds the whole hash in one go. Okay. In Ruby, you can construct a hash like this:

h = Hash[ "key" => "value", "key2" => "value2"];

So if we could read in our file, split it into keys and values, and then dump it into a hash constructor like that, we'd have it. Here's our first attempt:

h = Hash[File.open("fred.ini").read.split(/=|\n/)]

That's close, but it has a bit of a problem. Because objects can be hash keys in Ruby, what we've actually done is create a hash with an Array as the key and no value. Oops. To get around this, we need to invoke a bit of Perl 6 magic:

h = Hash[*File.open("fred.ini").read.split(/=|\n/)]

There we go, our old friend unary * turns the Array object into a proper list, and all is well.
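
Here are the three approaches side by side in a form that can be run without a real fred.ini. The helper with_sample_ini, the three parse_ method names, and the file contents are assumptions made for this sketch; it also uses File.read in place of File.open(...).read so that no file handle is left open.

```ruby
require "tempfile"

# Writes made-up key=value lines to a temporary file and yields its path.
def with_sample_ini
  Tempfile.create("fred.ini") do |file|
    file.write("name=fred\ncolor=blue\nshape=round\n")
    file.flush
    yield file.path
  end
end

# Whole file in one string, one block call per regex match.
def parse_with_scan(path)
  h = {}
  File.read(path).scan(/(\w+)=(\w+)/) { h[$1] = $2 }
  h
end

# One line at a time; chomp returns a new string without the newline.
def parse_line_by_line(path)
  h = {}
  File.foreach(path) do |line|
    key, value = line.chomp.split("=", 2)
    h[key] = value
  end
  h
end

# Unary * spreads the flat array into alternating key and value arguments.
def parse_with_splat(path)
  Hash[*File.read(path).split(/=|\n/)]
end

with_sample_ini do |path|
  p parse_with_scan(path)
  p parse_line_by_line(path)
  p parse_with_splat(path)
end
```

All three return the same hash for this input. The splat version depends on the split producing an even number of items, so a blank line or a value containing "=" would upset it.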

So there are built-in iterators. But what if you want to define your own? All methods in Ruby can optionally take a block, and the keyword yield calls back the block. So, assuming we've already defined Array#randomize to put an array in random order, we can create a random iterator like so:

class Array
    def random_each 
        self.randomize.each { |x| yield x }
    end
end

What does this mean? First, get the array in random order, and then for each element of that array, call back the block we were given, passing in the element. Simple, hmm?
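
To try this out, randomize has to exist. The sketch below assumes one possible definition built on the standard Array#shuffle; that definition is an assumption of this example, since the article only presumes the method is already there.

```ruby
class Array
  # Assumed helper: one way Array#randomize might be written.
  def randomize
    shuffle   # returns a new array in random order
  end

  def random_each
    randomize.each { |x| yield x }   # pass each element to the caller's block
  end
end

seen = []
[1, 2, 3, 4, 5].random_each { |x| seen << x }
p seen   # some ordering of 1 through 5
```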

Messing with the Class Model

Let's move on to some less simple stuff, then. In a recent perl6-language post, the eminent Piers Cawley wondered whether or not it would be possible to have anonymous classes defined at run time via Class.new or some such. Man, that would be cool. I'd love to see a language that could do that. You can see this coming, can't you?

c = Class.new;
c.class_eval {
    def initialize
        puts "Just another Ruby hacker"
    end
}
o = c.new;

First, we create a new class, c, at run time, in an object. Now, in the context of that class, we want to set up an initializer; Object#initialize is called as part of Class#new. So now our class has a decent new method that does something; we can instantiate new objects of our anonymous class. But they can't do very much at the moment. Now, we could add new methods to the class with class_eval as before, but that's kinda boring. We've seen that. How about adding individual methods to the object itself?

class << o
    def honk
        raise "Honk! Honk!"
    end
end

o.honk;

This doesn't do anything to our class c; it just specializes o with what's called a singleton method—o and only o gets a honk method.
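
The following sketch checks that claim directly. The method build_honkers and the motto reader are hypothetical, and here honk returns its string instead of raising so that the result is easy to inspect.

```ruby
# Builds an anonymous class and two instances, one with a singleton method.
def build_honkers
  klass = Class.new do
    attr_reader :motto

    def initialize
      @motto = "Just another Ruby hacker"
    end
  end

  special = klass.new
  plain   = klass.new

  # Opens the singleton class of this one object only.
  class << special
    def honk
      "Honk! Honk!"
    end
  end

  [klass, special, plain]
end

klass, special, plain = build_honkers
p klass.name                   # prints nil, the class is anonymous
puts special.honk              # prints Honk! Honk!
p plain.respond_to?(:honk)     # prints false
```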

What about AUTOLOAD? This is a lovely feature of Perl that allows you to trap calls to unresolved subroutines. Ruby calls this method_missing:

class << o
    def method_missing (method, *args)
        puts "I don't know how to #{ method } (with arguments #{ args })";
    end
end

And notice there the Perl 6 unary star again, collecting the remaining arguments into an array.
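
The same hook works at the class level. In this sketch the Echo class is hypothetical and the method returns its message instead of printing it; respond_to_missing? is the standard companion hook that keeps respond_to? in step with method_missing.

```ruby
# Hypothetical class that answers any message it does not define.
class Echo
  def method_missing(method, *args)
    "I don't know how to #{method} (with arguments #{args.inspect})"
  end

  # Report that we respond to everything, matching method_missing above.
  def respond_to_missing?(method, include_private = false)
    true
  end
end

puts Echo.new.fly(1, "two")
# prints I don't know how to fly (with arguments [1, "two"])
```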

There are many other tricks we can play if we do evil things such as override Class#new or play about with the inheritance tree by messing with Module#ancestors. But time is short, and I'm sure you're dying to move on to the last bit: What I hate about Ruby.

Ruby Gripes

Ruby is a comparatively young language, which is a mixed blessing. It's been developed at the right time to learn from the mistakes of other languages; try explaining why Python's array length operator len is a built-in function and not a method, and you'll appreciate the consistency of Ruby's OO paradigm. But, even though it's coming up to its 10th birthday, it's also still finding its way around, and the changes between minor releases are sometimes quite significant.

So what are the things I don't think Ruby has got right quite yet? First, I really, really, really miss using curly braces for my subroutine definitions. You can write subroutines in one line using Ruby; it's not as white-space significant as people make out:

def foo; puts "Hi there!"; end

but braces for blocks just seem so much neater.

I also miss default values for blocks; there is a special variable $_ in Ruby, but it contains the last line read in from the terminal or a file. So you really do have to say

array.each{|x| print x}

because

array.each { print }

won't do what you want.

There are a few other odd things: For instance, variables have to be assigned before they're used, which is probably a good thing but can confuse me at times. I also find myself tripping over Ruby's for syntax; Ruby supports statements modifying if, unless, while, and until, but not for, as for is just syntactic sugar for stuff.each anyway.

But there is one major problem I have with Ruby, and that's basically the reason why I haven't switched over to it wholesale: CPAN. Perhaps Perl's greatest asset is the hundreds and thousands of modules already available. Ruby has a project similar to the CPAN, the Ruby Application Archive. But as the language is still quite young, it hasn't had the time to grow a massive collection of useful code, and the RAA itself has some flaws—it's a collection of links, rather than a mirrored collection of material, and it can be pretty hard to find stuff on it at times.

This is, I'm sure, something that will be sorted out over time, but I have to sadly admit that Ruby's not quite there yet. Of course, Perl 6 will also need to spend time developing a large collection of useful modules; so at least Ruby does have a massive head start—it has a real, existing interpreter that you can download, play with, and use for real code today. And if you're at all interested in Perl 6, I heartily encourage you to do so.

Finally, thanks to David Black and the other members of #ruby-lang who helped review this article.

TPJ

