Dr. Dobb's is part of the Informa Tech Division of Informa PLC

Ruby vs. Perl


Dec02: Ruby vs. Perl

brian has been a Perl user since 1994. He is founder of the first Perl Users Group, NY.pm, and Perl Mongers, the Perl advocacy organization. He has been teaching Perl through Stonehenge Consulting for the past five years, and has been a featured speaker at The Perl Conference, Perl University, YAPC, COMDEX, and Builder.com. He can be contacted at [email protected].


Lately, I have been playing with Ruby. Occasionally, I take a vacation from Perl to try out a new language. Usually, I cannot wait to get back to Perl. I did not find Python's whitespace rules very appealing, Java is just too much work, and most people have never heard of Smalltalk. I am still playing with Ruby, maybe because it comes with Mac OS X 10.2. Should I switch from Perl to Ruby?

Ruby has what most Pascal users have wanted in Perl: a writeln sort of function. The "hello world" program in Ruby is just a little bit shorter than in Perl. I can leave off the trailing newline in Ruby because puts adds it for me, which takes away one of my most common programming omissions.

	#!/usr/bin/perl
	print "Hello world!\n"

	#!/usr/bin/ruby
	puts "Hello world!"

I also like that everything is truly an object, even literals, so that I can call methods on everything. In Perl and some other pseudo object-oriented languages, I first have to get an object to treat something like "0" as an object. In Ruby it already is an object, and I use it this way later in this article. Since Ruby came after Perl and Python, it simplified some things that those languages had to continue to support (although Perl 6 may do even better).
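A minimal sketch of what "even literals are objects" looks like in practice. The values are arbitrary illustrations, and only core Ruby methods are used:

```ruby
# Literals respond to methods directly; no wrapper object is needed.
zero_class   = 0.class                # Integer on current Rubies (Fixnum on older ones)
greeting_len = "Hello world!".length  # a string literal answering a method call
nil_check    = nil.nil?               # even nil is an object

# Integer#times runs the block once per count, starting from 0.
counted = []
3.times { |i| counted << i }

puts zero_class
puts greeting_len      # 12
puts counted.inspect   # [0, 1, 2]
```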

A couple of weeks ago, I started to port my Business::ISBN module, which lets me deal with International Standard Book Numbers (ISBN), to Ruby. The module is fairly simple and does not have that many advanced concepts. It mostly handles simple string processing along with simple arithmetic. It was one of my first real Perl modules, so maybe it should be my first Ruby module, too.

Although Ruby has been around since 1995, it still does not have many modules. One of the reasons that Perl is so useful and always draws me back is that it is easy to get things done because it has so many modules. The Comprehensive Perl Archive Network (CPAN) has about 2500 registered modules, and many more unregistered ones—all of which I can easily find on CPAN Search (http://search.cpan.org/). The Ruby Application Archive (RAA) (http://www.ruby-lang.org/en/raa.html) has only a fraction of that total.

The ISBN is a number that publishers assign to books and similar items. Each binding of a book should get its own ISBN. For instance, Programming Perl, the first edition, has the ISBN 0-937-17564-1. The first part of the ISBN, "0" in this example, identifies the language or country group, and the second part identifies the publisher. The third part uniquely identifies the book, and the last part is the checksum. These might show up in data with or without the hyphens, or with spaces or other characters. I created the Business::ISBN module so I could easily go through tens of thousands of ISBNs, verify their checksums, and sort them by publisher.
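The cleanup step described above might be sketched in Ruby as follows. The helper name normalize_isbn is hypothetical (it is not part of Business::ISBN), and the second sample value is chosen only for illustration:

```ruby
# Hypothetical helper: reduce raw input to the ten significant characters,
# or return nil if the result does not look like an ISBN.
def normalize_isbn(raw)
  cleaned = raw.upcase.gsub(/[^0-9X]/, "")   # drop hyphens, spaces, other noise
  cleaned =~ /\A\d{9}[\dX]\z/ ? cleaned : nil
end

puts normalize_isbn("0-937-17564-1")          # 0937175641
puts normalize_isbn("0 8044 2957 x")          # 080442957X
puts normalize_isbn("not an isbn").inspect    # nil
```

This only checks the shape of the string (nine digits followed by a digit or an X); it does not verify the checksum.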

Part of the Business::ISBN module turns ISBNs into European Article Numbers (EAN), which do the same thing as ISBNs but for more than just books. The EAN prepends three digits to the start of the ISBN, drops the old checksum, and computes a new one. In Perl, I use a fairly basic coding structure to do the job (see Example 1). In the Perl version, line 5 gets the ISBN as a string with no extra characters, and line 7 returns unless the ISBN is in the expected format. After that, I go through the EAN algorithm. The foreach loop adds up the necessary digits of the ISBN so that I can compute the final EAN checksum.

In Ruby, I can simplify a lot of the EAN checksum code with a specialized iterator (see Example 2). Since everything is an object, I can call methods on anything, including 0. In this case I call the step() method on 0. It goes up to 10 in steps of 2. That little bit of code represents the foreach loop in the Perl version.
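As a rough illustration of that iterator (not the article's Example 2), the sketch below converts an ISBN to an EAN. The helper name isbn_to_ean is hypothetical, and the 978 prefix and alternating 1/3 weights are the standard EAN-13 rules for books rather than details spelled out in the text:

```ruby
# Hypothetical helper: convert a ten-character ISBN string (no hyphens)
# into a thirteen-digit EAN string, or return nil for malformed input.
def isbn_to_ean(isbn)
  return nil unless isbn =~ /\A\d{9}[\dX]\z/

  ean = "978" + isbn[0, 9]   # prefix plus the ISBN without its checksum
  sum = 0

  # 0.step(10, 2) yields 0, 2, 4, 6, 8, 10: the even positions.
  0.step(10, 2) do |index|
    sum += ean[index, 1].to_i            # even positions have weight 1
    sum += 3 * ean[index + 1, 1].to_i    # odd positions have weight 3
  end

  check = (10 - sum % 10) % 10   # distance to the next multiple of 10
  ean + check.to_s
end

puts isbn_to_ean("0937175641")   # 9780937175644
```

The two-argument form ean[index, 1] (start, length) is used so that each lookup yields a one-character string that to_i can convert.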

People complain that Perl ranges only go in one direction. In the Perl version of the Business::ISBN::_checksum routine (Example 3), I need to multiply the first ISBN digit by 10, the second digit by 9, and so on for the first nine digits. I have one sequence that is ascending, the position in the string, and another descending, the factor. That is not a big problem in Perl since I can use the reverse() function to switch around 2..10.

Perl needs to know the whole range to reverse it. It cannot give you the last element as the first element until it knows what that last element is. I could have converted the foreach() to a for() or a while(), but that is not very Perly. Ruby handles this naturally because the step() method can go backwards, as in Example 4.
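A descending step() of that kind might look like the sketch below. It is not the article's Example 4; the helper name isbn10_checksum is hypothetical, and the input is assumed to be a string of exactly nine digits:

```ruby
# Hypothetical helper: compute the check character for the first nine
# digits of an ISBN.
def isbn10_checksum(digits)
  sum = 0
  position = 0

  # 10.step(2, -1) yields 10, 9, 8, ... 2: the descending factors.
  10.step(2, -1) do |factor|
    sum += factor * digits[position, 1].to_i
    position += 1
  end

  check = (11 - sum % 11) % 11
  check == 10 ? "X" : check.to_s   # a remainder of 10 is written as "X"
end

puts isbn10_checksum("093717564")   # 1
puts isbn10_checksum("080442957")   # X
```

The factor counts down while position counts up, so no reversed list has to be built first.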

So far Ruby is looking pretty good, at least for the things I have pointed out, but it misses one of Perl's greatest strengths that, until now, I had taken for granted. In the Ruby version of the _checksum routine, I have to explicitly handle the number-to-string and string-to-number conversions. In line 6 I have to turn the ISBN digit from a character into an integer so I can use it in the multiplication, and once I have the checksum, in line 11, I have to make sure that it is a string so that the other methods in the module can use it correctly.
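A small sketch of those explicit conversions, using made-up values rather than anything from the module:

```ruby
digit  = "7"   # one character taken from an ISBN string
factor = 3

product = factor * digit.to_i   # 21: explicit string-to-number conversion
label   = product.to_s + "!"    # "21!": explicit number-to-string conversion

# Without the conversions, Ruby does not guess what was meant.
repeated = digit * factor       # "777": String#* repeats the string
begin
  factor * digit                # Integer * String is not defined
rescue TypeError => e
  puts "TypeError: #{e.message}"
end

puts product
puts label
puts repeated
```

In Perl, "7" * 3 would quietly produce 21; in Ruby the same expression means string repetition, which is why the to_i and to_s calls are needed.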

Ruby strings are sequences, meaning that I can access parts of the string with a range (for which I would use substr() in Perl). However, if I access a single index, like @isbn[9], I get the ordinal value of the character instead of the character itself. If the tenth character, the checksum, is an "X," a valid ISBN checksum character, @isbn[9] returns not the character "X," but its ordinal value, 88 (decimal). This must be useful for something, but not for anything that I do frequently. It does make it easy to process a string as binary data, though. If I want to access part of a string as a string, I need to use a range, even if the range covers only one character. To get the tenth character of @isbn, I use @isbn[9..9]. Although this is annoying, it is better than Perl's substr( $isbn, 9, 1 );.
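The sketch below shows the range form. One caveat: the single-index behavior described here belongs to the Ruby interpreters of the time. From Ruby 1.9 onward a single index returns a one-character string, and String#ord gives the ordinal value. The range form returns a substring either way. The sample ISBN is chosen only because its checksum is an X:

```ruby
isbn = "080442957X"

check_char = isbn[9..9]       # "X": a range always yields a substring
check_ord  = check_char.ord   # 88: String#ord needs Ruby 1.9 or later

# On Ruby 1.9 and later a single index is already a one-character string,
# so this comparison holds; on the older interpreters described above,
# isbn[9] would have been the integer 88 instead.
same = (isbn[9] == check_char)

puts check_char
puts check_ord
puts same
```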

In the end, I will probably keep my eye on Ruby, and when I have free time, I'll play with it. I might even be able to use small Ruby scripts to perform specific tasks. I cannot stop using Perl though, because it is just too useful. No other language can beat the utility of CPAN. Perl has some rough edges, but it gets the job done faster and just as well as a more purist approach.

References

  • EAN International and the Uniform Code Council, http://www.ean-ucc.org/
  • ISBN.org, http://www.isbn.org/standards/home/index.asp
  • Business::ISBN, http://search.cpan.org/author/BDFOY/Business-ISBN-1.70/

TPJ

