

Just Another Version of Algol


January 2000


Introduction

This document is a preliminary specification of the Java™ language. Both the specification and the language are subject to change. When a feature that exists in both Java and ANSI C isn't explained fully in this specification, the feature should be assumed to work as it does in ANSI C. Send comments on the Java Language and specification to [email protected].

(First paragraph of "The Java Language Specification," Release 1.0 Alpha, Sun Microsystems Computer Corporation, February 1, 1995.)

Most programmers accept the obvious fact that Java is a programming language heavily influenced by C. The paragraph quoted above makes clear just how heavy that influence really is. Java is C revisited from a different perspective, and with different goals, and with a few new ideas thrown in. The same statement remains true if you substitute C++ for Java. The perspective is different, as are the goals, and the mix of added ideas is rather different. But the evolutionary process is much the same.

In making this observation, I do not diminish one whit the achievements of James Gosling (principal architect of Java) or Bjarne Stroustrup (principal architect of C++). Quite the contrary. We all build on the work of our predecessors. In fact, anybody who pointedly ignores applicable history is devoting time and effort to reinventing the obvious. We all make progress much faster by recycling proven successes, even as we innovate beyond them.

Dennis Ritchie (principal architect of C) is quick to point out that C derives directly from two earlier languages, B and BCPL. Neither achieved anywhere near the eventual success of C, but each taught some important lessons. These languages, in turn, are obviously influenced by Fortran — a much maligned but highly successful pioneer — and by Algol 60. The latter language achieved nowhere near the widespread user community of Fortran, but it has proved to be arguably more influential in the development of subsequent programming languages.

Algol 60 is, in fact, the granddaddy of all modern procedural languages. The term "procedural" indicates that you, the programmer, get to specify in exquisite detail the procedure by which the program achieves its goal. The term is often contrasted with "declarative," where you describe the nature of the data you have to work on and how you want it morphed, leaving it to the language system (compiler, interpreter, and/or run-time support) to figure out how best to do it. Procedural languages give you lots of opportunities to optimize a program for code size, data representation, and execution speed. They can also be precompiled into moderately compact "executable" files (even if they're interpreted). Given the highly varied needs of computing today, procedural languages rule.

Algol 60 offered a coherent and elegant design at a time when ad hockery was the norm. Among many other things, it showed us the importance of block structure in expressing flow of control, long before Edsger Dijkstra wrote his now-famous 1968 letter to Communications of the ACM pointing out the dangers of using GOTO statements instead. It showed the benefits of nested namespaces in controlling the use (and reuse) of names in a large program. And it showed the benefits of divorcing input/output from the language proper.

So why didn't Algol 60 come to rule the world? The easy answer is that IBM and other large American companies were pushing Fortran at a time when software was a rather small tail on a large dog. The dog was all that expensive hardware — mainframe computers and the climate-controlled basketball courts needed to hold them. Systems software was bundled in at no extra cost with each multimillion-dollar hardware sale. Only a handful of companies made money selling replacements for free software components from IBM, and they had to offer significant benefits even to get a hearing. Software from third parties was generally left to languish.

But that is only part of the story. Algol 60 was far from a perfect solution. It required a call stack, for recursion and block nesting, at a time when hardware support for call stacks was sadly wanting. (Even PL/I, later flogged by IBM as a successor to Fortran and Cobol, suffered from this handicap.) Worse, it required a rather inefficient argument-passing mechanism. An argument to a function call was effectively recomputed each time it was referenced in the called function — if any components of an argument expression changed value, the argument was obliged to change accordingly. Thus, an implementation was obliged to replace each argument with a function pointer, which designated an encapsulated "thunk" of code that computed the argument on demand in the appropriate context.

This elaborate mechanism is not a good idea for passing your typical argument. C has taught us the virtues of "pass by value." It computes each argument expression once, right before the function call. The caller uses that value to initialize a local parameter variable, which then evolves as the called function sees fit. If you want to emulate the Fortran convention of "pass by reference," you pass a pointer value instead and let the called function dereference the pointer as needed. In either case, the program spends considerably less time shoveling arguments about than an equivalent Algol 60 program. And yes, argument passing even now is a significant contributor to execution time for many programs. Think how important it was at a time when a mainframe computer was rather less powerful than a Game Boy.

I could probably cite additional shortcomings, after a quick refresher on the Algol 60 language specification. But these are reasons enough. My point is that Algol 60, for all its elegance, was not a perfect mix of features. Later language designers had good and proper reasons for taking some of its features, stirring them in with a few new ideas, and leaving others behind.

As an aside, Algol 60 did have a direct successor. The programming language Algol 68 started out as a simple rework of the language, but it went off the deep end. It is telling that the original goal was to make a successor called Algol 64. Those extra four years were spent making the language even more abstruse and less likely to see effective implementations. The only living relic of Algol 68 that I know of is the Bourne shell. Steve Bourne came to Bell Labs with a trousseau of Algol 68 code which he transliterated into C. But he persisted in writing a dialect midway between the two languages. You can still see the traces in some oddly expressed and interestingly formatted C code.

Still another early dialect of Algol had wider acceptance, at least in the US military. I've been assured more than once that Jovial stands for "Joe's Own VersIon of ALgol." It always looked to me more like a bastard dialect of Fortran, but I never studied it all that closely. At least the name supports my basic theme, that language designers were happy to perpetuate the Algol 60 culture, to the extent that each designer understood that culture.

We old programmers have seen Algol 60 itself come and go, along with Algol 68, PL/I, and Jovial. Pascal came and stayed much longer, thanks mostly to Borland and some really useful incarnations. C has come and stayed and stayed. It is now the "wallpaper" language that hangs everywhere, almost unseen. C++ has come, on the coattails of C, and is also showing good staying power. Best I can tell, the usage of C++ is still growing, so it's hard to trace out its full life cycle, but it will certainly be a long one.

Enter Java

And then there is Java. Java is unequivocally a member of the Algol 60 family, thanks to its direct C heritage. But it has been promoted as something quite new under the sun (pun intended). It is the programming language for a precisely defined Java Virtual Machine (or JVM), thus eliminating many of the portability problems of C. It avoids the worst bugs of C and C++ programming by restricting the use of pointers and providing automatic garbage collection. And it is a true child of the Internet, with built-in browser-style graphics, network communications, and downloadable executables.

According to the histories I've read, Java first spent several years wandering the desert inside Sun Microsystems, a solution in search of a problem. First it was targeted at smart appliances, then set-top boxes. Finally, it appeared with much fanfare as the perfect solution to the problem of sharing code across the Internet. With a Java interpreter in every web browser, you could write code once and run it anywhere (to crib the oft-repeated tag line from Sun's promotional literature). Hardware would be even more of a commodity than it has already become, since even the distinction between Windows, Mac, and Unix architectures would become irrelevant.

Soon there would be a whole suite of word processors, spreadsheets, and all the other paraphernalia of your typical desktop, but now written in Java. Every time you run an application, its latest version is downloaded from a central source. Your support software keeps getting better with no effort on your part, and all in "Internet time."

That was the happy promise. The happy threat was a kind of platform independence that would soon kill off Microsoft Windows, or at least loosen its iron grip on desktop computers. It is interesting to note that neither the promise nor the threat has been fulfilled, even after nearly five years of Internet time, which supposedly runs rather faster even than dog years. Even more interesting, both promise and threat seem more remote today than when both were first bruited about several years (Internet aeons) ago.

So what happened? The easy answer is that Java has simply been the victim of excessive hype. Many of the goals set for Java, both overt and covert, are simply unattainable. At least, it will take more than a modest little programming language to achieve them. And many of the virtues ascribed to Java are just not that real, or not that novel, or not that important. It is not that Java is a failure as a programming language — far from it. But, at least in my opinion, it has not yet been allowed to find the niche where it can have its greatest success.

Take portability, for example. The designers of Java made a concerted effort to avoid what they perceived as the shortcomings of C in this department. They pegged the sizes of basic data types, the representation of integers and floating-point values, the behavior of a program when overflow or zero divide occurs. All of those issues do lurk in the back of the mind of your typical C programmer who endeavors to write portable code. But the question is, how much do they add to the cost of making code portable?

I've been programming in C practically since the day it was first implemented by Dennis Ritchie in the early 1970s. I've been writing portable C code since the second and subsequent implementations made such a thing possible. I've sold tens of millions of dollars of code that profited mightily from being written in portable C. And I contributed what I learned to the writing of the C Standard in the 1980s. Thus, I feel confident in opining that Java in some ways is an overreaction to the perceived problems of code portability in C.

In particular, the elasticity that C permits its implementers allows for some pretty impressive optimizations on each platform. But the added cost to the programmer is relatively small. The price that Java pays for a more restrictive programming model is performance. A JVM interpreter simply cannot run as fast as native code. Let me hasten to say that Sun, and others, have done an impressive job in producing the fast interpreters we have available today. But the performance hit is still there, and sometimes it matters. I question whether the small payoff in added portability is worth the resulting loss of application domains.

But that is the least of it. Java introduces a host of new opportunities for compromising portability, which I find often overwhelm considerations such as how many bits there are in a long integer. Java has multithreading and synchronization built into the language, for example. That's a great idea, given that most practical implementations of C, C++, or what have you must add multithreading with various ad hoc language extensions and magic library functions. But pushing this stuff down into the language does not eliminate the fundamental problems of multithreading. You still have to worry about race conditions, deadlock, and proving the code correct. Java allows enough latitude in the behavior of thread scheduling to present serious portability problems. The expert Java programmers I know minimize the use of threads in any program intended to be portable. Otherwise, they tell me, it is sure to behave unacceptably on at least one significant platform.

Java allows similar latitude in the interfacing of its graphics primitives to the underlying display system. My friends tell me that displays can suffer interesting changes as you move the code about, unless you stick with the most basic of display structures. Java is hardly alone in this regard, of course. Try crafting any nontrivial web page and look at it with the four most widely used web browsers. You'll see what I mean.

And then there's the interesting variation in the behavior of JVMs across different implementations. I'm sure you've heard the cynical rebuttal to "Write once, run anywhere." Veterans of the Java portability wars will tell you that it's really "Write once, test everywhere." Let me hasten to add once again that this is not a problem peculiar to Java. I learned many years ago that C code could not be considered portable until it was tested successfully on at least three diverse architectures. Even then, you should expect surprises when porting to a peculiar platform. We veterans know that portability is a relative measure of the cost of moving a piece of code versus rewriting it. It is not a boolean attribute. Now the Java enthusiasts are learning the same lesson, in Internet time.

Bulletproofing Java

Then there is the matter of robustness. As I mentioned earlier, Java severely restricts what you can do with pointers, a notorious source of the nastiest bugs in C and C++. Java calls 'em references (not to be confused with the similar C++ term) and only lets you concoct references to class objects allocated on the heap. You can't increment or decrement references, like you can pointers in C and C++, and any use of a null reference causes the JVM to throw an exception. Even more interesting, you have no way of freeing an allocated object and thus discrediting a reference. The JVM decides, in its own good time, when all references to an object have disappeared so that it is safe to actually free the object. This is called "garbage collection," in the trade.

Why go to all this trouble? In particular, why pay for the extra run-time checks for null pointers and the complexities of garbage collection? Because two of the nastiest kinds of bugs in a C or C++ program are dangling pointers and storage leaks. The former arise when a pointer to an object outlives the object itself. This can happen if a function returns a pointer to an automatic object or if the program frees a heap object and later uses a pointer to the freed object. Neither of these things can happen in Java, because you simply can't write code that causes such a bug. The latter kind of bug occurs when you fail to free storage after it is no longer used. A long-running program that leaks storage this way will eventually become starved for heap space. It will then degrade horribly or crash. Java's garbage collector takes responsibility for hunting down orphaned objects, so storage leaks should not happen.

Dangling pointers cause invalid storage accesses. Even the best debugging aids have trouble highlighting the true source of such mysteries. And they are often hard to reproduce well enough to chase down. Similarly, storage leaks might not manifest themselves until a program has been running a long time. So both can escape testing and crop up in the field — the worst kind of bugs imaginable. You can see why many projects will gladly pay the price exacted by Java, in larger and slower executables, to eliminate such bugs. The promise is shorter time to market, because of easier debugging, and safer behavior in the field.

But once again, the results are somewhat oversold for Java. All you have to do is leave one reference lying about — in a static object, for example — and the elaborate network of objects it points to appears to be still in use. The garbage collector is thwarted and you have a de facto storage leak. Similarly, Java can only catch at run time a number of bugs that C++ easily detects at compile time. The lack of templates in Java eliminates a number of important options for stronger type checking. So a Java program faces an even greater risk than a C++ program of shipping with lurking bugs. I believe that Java does promise quicker time to market, for several good reasons, but it is not as bulletproof a language as its enthusiasts proclaim.

The final issue I will touch on only very delicately. It concerns the much touted strength of Java as a highly standardized language. For in some ways, standardization is the most important form of bulletproofing in our modern arsenal. I like to observe that a standard is a treaty between programmers and implementers. Programmers know what they can trust across different vendors, and vendors know how much latitude they have to make a more competitive product. History has shown that a language is safest to use once it falls under the purview of a widely accepted public standard. Otherwise, it changes randomly at the tug of competing vendors, or unexpectedly at the whim of a dominant vendor.

Sun Microsystems developed Java. They own the name as a trademark. They have evolved the language rapidly over the past several years, apparently in pursuit of the broadest possible marketplace. They have vigorously opposed any hint of dialect formation by their own licensees — their lawsuit against Microsoft being the most notable case in point. At the same time, Sun has long paid lip service to the need for a Java standard, and for open competition from vendors who choose not to license the Sun implementation.

Amid some controversy, Sun won permission from ISO several years ago to supply the international standard for Java. I won't repeat all the details of the decision-making process, because the bottom line is clear enough. Sun was given two years to polish up their existing description of the Java language and submit it for a quick vote. Sun argued that they could use the Internet, and their own tighter control of the process, to produce a widely acceptable public standard in Internet time. No need for some committee to meet for three to eight years to reinvent a wheel that was already rolling at high speed.

Well, after two plus years of little visible activity, Sun reneged. I also won't repeat all the recriminations, because the bottom line is clear enough. After aeons of Internet time, there is still no widely accepted public standard for Java. Sun is still in control. They are now working with ECMA, a European standards body with rather less clout than ISO. What the result will be I still can't tell.

But the process to date has taken its toll. Java implementations exist in a spectrum of versions. Writing portable Java has not become easier along the way. Just as C++ suffered from prolonged and excessive invention in the guise of standardization, I fear that Java is suffering a similar fate. What was once a cute little language with an elegant library has put on quite a bit of weight. It no longer resembles its baby pictures. [For a more in-depth discussion on Java standardization, see P.J. Plauger's "Standard C/C++" column elsewhere in this issue. — mb]

Where to Next?

I've intentionally painted a moderately bleak picture of Java today. I do so mostly as an anodyne to all those rah-rah articles we've all had to wade through these past few years. I don't intend to brand Java as a failure. I simply observe that its early, overly ambitious goals have not been achieved. Moreover, I don't think they ever will be.

My personal view of Java is that there's a useful little procedural language in there waiting to escape. Try to view it as just another candidate alongside C, C++, Pascal, and even Visual Basic for a given programming task. Interpreted Java has its advantages where portability matters, but that is not the only form it can take. Gosling and company took some pains to define Java in such a way that it can be compiled as well, just like Pascal or C. That original definition has been compromised over the past few years, as Sun has added more and more features that benefit from an interpreted environment. But the language can still be compiled.

A handful of compiled Java implementations are currently available. Some convert "byte codes" — the machine language for a JVM — into C code for subsequent compilation. Others behave like a true compiler, translating from Java directly to C, assembly language, or some intermediate form digestible by an existing compiler. My company, Dinkumware, Ltd., has been working in tandem with Edison Design Group (which makes C++ translators as well) to produce the library and translator front end needed to make a Java compiler. We obviously have faith that Java compilers will become more important in the near future.

I see Java as a language with definite advantages. It supports the development of larger programs than C, thanks to its object-oriented additions. It is easier to learn than C++, thanks to a much greater economy of concepts. And, as I mentioned earlier, it promises quicker time to market for many program development projects. But Java, at its root, is still just another version of Algol. That observation is intended as both a compliment and a caution.

P.J. Plauger is Senior Editor of C/C++ Users Journal and President of Dinkumware, Ltd. He is the author of the Standard C++ Library shipped with Microsoft's Visual C++, v5.0. For eight years, he served as convener of the ISO C standards committee, WG14. He remains active on the C++ committee, J16. His latest books are The Draft Standard C++ Library, Programming on Purpose (three volumes), and Standard C (with Jim Brodie), all published by Prentice-Hall. You can reach him at [email protected].

