A Walk(back) on Smalltalk’s Nil Side

By Enoch Sower, September 01, 2000

High up, on the class of Metaclass, a lone man strives to find answers to life's persistence proxies: Eye Noire—Private VisualAge Eye.


It was a gloomy Thursday morning as I sauntered into my dingy cube and asked Shane, my counterpart in havoc, how the new production release was going.

"Not bad." (He always says that.)

"You mean we’ll be able to go to this week’s status meeting today? Remember the last release when we huddled around Chris’s cube for hours trying to fix a bug?"

I was pleased to know that the 700 function points that we had added to the application weren’t blowing up. And even more pleased about some performance icing we had added to the cake—a framework of proxies built off of nil that saved 120 milliseconds at main retrieval time by delaying the instantiation of a complex business object, plus a second group of proxies that prevented mainframe transactions from being fired off if they weren’t needed.

Boy, was I wrong about the icing! Instead, the first dollop of fresh egg landed on my face at 8:32 a.m., from Colorado, of all places. Mysterious walkback files were machine-gunned into the log directory, all from one user: about 70 files in a matter of seconds. A quick call revealed that the poor soul had just tried to do a plain retrieval and update transaction, the kind she had been doing all morning without a problem. This time, however, her PC spit out a cryptic error message: An undefined object locked when it didn’t understand one of our persistence helper objects. After a reboot, though, everything worked fine.

Peering into the last of the 70 walkback logs, I immediately noticed something strange. The log didn’t begin at the bottom with the items you would normally see, like:

UIProcess(Process)>>#newProcessOn:stackSize:withArguments:named:
  receiver = UIProcess:(4/9/99 12:14:34 PM){suspended,3}
  arg1 = [] in UIProcess class>>#forkUserInterface
  arg2 = 1024
  arg3 = ()
  arg4 = '(4/9/99 12:14:34 PM)'

Instead, the log mysteriously started where the undefined object didn’t understand something. And the offending class and method belonged, of course, to a concrete proxy: code that I had written! What’s more, because the beginning of the stack trace never printed, I didn’t have a clue as to how that undefined object came into existence. The class’s only constructor should have ensured that the proxy was initialized to one of our persistence helper objects. Obviously, the proxy was being instantiated via another mechanism … but how? Where?

Clearly, my nil descendants also didn’t know how to play with the walkback mechanism—every time you touched them in the wrong way, 70 to 100 files would be spit out, and the machine would lock up.

As more and more of our 4,000 users logged in and started pulling the new production release, the volleys of egg mounted. The next two came from Cleveland sites—users with very similar symptoms. Then the East Coast started sputtering: one, then two. By 11:30 a.m., we had over 10 incidents of this mysterious failure, and our friends at the help desk were tired of rebooting machines.

Thousands of users were sailing along, merrily doing transactions with no signs of trouble. We had put two types of proxies into production, all inheriting off the same abstract class that lived right under nil, and the second type of proxy was problem-free? Hmm…

As it happened, I had a flight home scheduled for that afternoon. Since only 12 users out of thousands had been affected, Chris said I could still go home for the weekend. The captain always stays with the ship … while we rats … well, the Midwest Express to Milwaukee had plenty of room for rats.

We had built our performance proxies by the book—that is, two books that, put together, are about as thick as a NASA manual for a Mars flight: Design Patterns: Elements of Reusable Object-Oriented Software, by Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides (Addison-Wesley, 1994) and The Design Patterns Smalltalk Companion, by Sherman R. Alpert, Kyle Brown and Bobby Woolf (Addison-Wesley, 1998). Our walls were adorned with the right Rational Rose UML diagrams! We had gone through design reviews! We pair-programmed those puppies! We’d done code reviews with the core architecture team! We had tested and retested the proxies for weeks prior to the production release—unit testing, regression testing, automated testing—all without so much as a bump!

By the end of the day, 50 of 4,000 customers had suffered machine lock-up and the log file directory was bursting. All through the weekend, I dreaded the flight back to Cleveland. Sunday evening, I went straight to work, slipped into my cube, logged on to Lotus Notes and discovered, buried amidst 100 irrelevant messages, a notice of an emergency production release that had been delivered on Saturday! One of my proxies had been yanked out of production. My heart sank as I imagined those dreaded words from the director of operations: "Unplug and go home." For a moment I thought, "Cubicide! The only honorable way out."

Chris discovered the key clue: The proxies that bombed were part of cloned object graphs and the ones that never bombed were never cloned. I jumped into Rose and threw some quick sequence diagrams together. The cloning framework deep down makes a call to #shallowCopy, a primitive. However, it is a primitive with a difference: There is Smalltalk code below the primitive call to manually crank out a shallow copy if the virtual machine returns a primitive failure. Of course, I had implemented #shallowCopy in our nil proxy abstract class, so all our object graphs could clone themselves happily—as long as the VM wasn’t stressed. Now I suspected that, on very rare occasions, the VM was deciding to do a failure return from the primitive call, perhaps because a global garbage collection was just beginning when it got the call.

In three weeks of automated testing, the VMs never once faltered on the #shallowCopy primitive. And the primitive call never failed for hundreds of thousands of transactions on the first day of the production release. But it did fail 50 times in a somewhat random fashion. A quick note dashed off to the OTI (Object Technology Inc.) lab confirmed my suspicion that the #shallowCopy primitive call is designed to fail if the VM gets the call and there is "not enough memory to allocate the new object quickly—A non-quick allocate would be something that required a garbage collection operation." So, the VM punts the responsibility for object creation and copying back to the Smalltalk code, which lurks below the primitive call in #shallowCopy, when it thinks the action will take too long!

But how could I prove this was actually happening, since we couldn’t reproduce the error in our automated testing environment? The first step was to comment out the primitive call in #shallowCopy ("<primitive: VMprObjectShallowCopy>"). Subsequent calls to the method would then always fall into the Smalltalk code and, lo and behold, we got the same behavior that we saw in the original 50 failures. Rapid-fire walkback logs (50 to 70 separate logs) appeared each time we called #shallowCopy on the nil side.

A quick tour through the object side of the image revealed that EsStackFrame>>#debugPrintOn: liked to call #debugPrintString on whatever was being written to the walkback log. And there was part of the problem: #debugPrintString wasn’t implemented in our abstract proxy class. Once we implemented it there, each error produced only one log instead of 50 to 70. When I looked at that one walkback log, I knew why we were failing.

The Smalltalk code that executes after a #shallowCopy primitive failure manually makes a new object by sending #basicNew or #basicNew: to the original’s class, then probes the original object’s shape via #instVarAt: and re-creates the same shape on the new object via #instVarAt:put:. We had already implemented #instVarAt: so our proxies could be seen in standard inspectors (it’s called by EpInspector>>#selectedValue), but had not needed to implement #instVarAt:put: … until now. Once we implemented it in our abstract proxy class, the error never recurred, even when the #shallowCopy primitive call was commented out.
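For readers who think better in code, here is a rough Java model of that fallback path. It is only a sketch under loud assumptions: it is not the VisualAge source, every class and method name in it (Slotted, IncompleteProxy, manualShallowCopy and so on) is a hypothetical stand-in, and a boolean flag simulates whether the VM primitive succeeded. The point it tries to show is that a proxy can clone happily on the fast path and still blow up on the slow one if it cannot answer the write-side message.

```java
// Java model of the fallback path described above; NOT the VisualAge source.
// All class and method names here are hypothetical stand-ins.
class ShallowCopySketch {

    /** An object whose instance variables can be reached by 1-based index. */
    static class Slotted {
        protected final Object[] slots;

        Slotted(int size) { this.slots = new Object[size]; }

        Slotted basicNew() { return new Slotted(slots.length); }   // stands in for #basicNew
        int slotCount() { return slots.length; }
        Object instVarAt(int index) { return slots[index - 1]; }    // stands in for #instVarAt:
        void instVarAtPut(int index, Object value) {                // stands in for #instVarAt:put:
            slots[index - 1] = value;
        }
    }

    /** A proxy that can be read slot by slot (enough for an inspector) but not written. */
    static class IncompleteProxy extends Slotted {
        IncompleteProxy(Object helper) {
            super(1);
            slots[0] = helper;
        }

        @Override Slotted basicNew() { return new IncompleteProxy(null); }

        @Override void instVarAtPut(int index, Object value) {
            // Models the "does not understand" failure on the nil side.
            throw new UnsupportedOperationException("#instVarAt:put: not implemented");
        }
    }

    /** Copies slot by slot, as the fallback code does after a primitive failure. */
    static Slotted manualShallowCopy(Slotted original) {
        Slotted copy = original.basicNew();
        for (int i = 1; i <= original.slotCount(); i++) {
            copy.instVarAtPut(i, original.instVarAt(i));
        }
        return copy;
    }

    /** The flag simulates whether the VM primitive succeeded. */
    static Slotted shallowCopy(Slotted original, boolean primitiveSucceeds) {
        if (primitiveSucceeds) {
            // Fast path: the "VM" copies the slots without sending any messages.
            Slotted copy = original.basicNew();
            System.arraycopy(original.slots, 0, copy.slots, 0, original.slots.length);
            return copy;
        }
        return manualShallowCopy(original);
    }

    public static void main(String[] args) {
        Slotted proxy = new IncompleteProxy("persistence helper");
        System.out.println(shallowCopy(proxy, true).instVarAt(1));  // fast path
        try {
            shallowCopy(proxy, false);                               // fallback path
        } catch (UnsupportedOperationException e) {
            System.out.println("fallback failed: " + e.getMessage());
        }
    }
}
```

Passing false for the flag plays the same role as commenting out the primitive call: it forces every copy through the slow path, so the missing method shows up on the first try instead of once in a blue moon.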

However, our testing manager asked an important question: How can we configure our environment so that it traps this kind of error before a production release? Back in my VisualSmalltalk Enterprise days (what a nice Smalltalk, by ParcPlace), I could configure the amount of operating system virtual memory, old space, new space and so on from the command line. And good old VisualAge Smalltalk, by IBM, has some of the same options (see IBM Smalltalk User’s Guide Version 4.5). Just constrict some of these measures beyond what is reasonable and, voilà, you’ve got the old VM choking on the #shallowCopy primitive call.

We didn’t have to go that far, because, as the fates would have it, our Internet server team had gobbled up the flawed proxy framework as soon as it was released to the production configuration. And they were able to break the framework in their test environment. How? They swamp their servers with a Web version of load runner that puts incredible stress on each VM. It didn’t take them long to find me after the walkback logs mushroomed around my code. Happily, I had the whole thing figured out, and when they imported the fixes for #instVarAt:put: and #debugPrintString, everything went fine.

A walk on the nil side sure can be exciting, but if you would rather lead a peaceful life, heed the following:

• Start by implementing your nil descendants with the behavior described in Design Patterns or in the Proxy pattern chapter of The Design Patterns Smalltalk Companion.

• Test whatever you build on the nil side in various browsers and inspectors to make sure you have implemented all the support they require. You might want to put browser support methods in a class extension application that is loaded only during development.

• Make sure you can use ObjectSwapper, ObjectLoader and ObjectDumper with your nil subclassed objects and their realSubjects. Include all the SwapperSupport methods needed.

• Make sure you can break your nil subclassed objects by sending each one a message it doesn’t understand before its realSubject is ever instantiated. That way it will behave properly in the ENVY debugger, and walkback logs will be generated just once at run time.

• Finally, if you like the Visualization tools by IBM and want to make your proxies show up there, you’ll have to figure out what Object>>#dtxBecomeMonitored and Object>>#dtxBecomeMonitoredFromCritical do. The code for these is hidden in most ENVY/Developer environments (a multiuser IDE by Object Technology International with version control, configuration management and a reusable component library), and you’ll have to implement it properly on the nil side for your proxies to work with IBM’s Visualizer.

A Translation for the Non-Smalltalk Fluent
Strong typing is why it takes 53 lines of Java code to deliver just one function point.

In Smalltalk, high-performance proxies are as easy as falling off a log, the log being nil. Java has no simple counterpart to nil, so one has to create one’s own root. In this respect, it’s not trivial to get around java.lang.Object. Typically, proxies in Java are extended from Object, so they carry all of the latter’s baggage, whereas Smalltalk proxies subclassed off of nil carry no state or behavior baggage. Most Java proxies, therefore, do not have the light-weight instantiation advantages of Smalltalk nil proxies. It is precisely this light-weight instantiation overhead that markedly contributes to performance in systems where tens or hundreds of thousands of nil proxies are employed.

In Smalltalk, messages routed to the real subject via the proxy are trapped by overriding #doesNotUnderstand:. But Java has no such convenient dynamic mechanism. The latter’s strong typing makes it very hard to achieve a run-time situation where an undefined message is sent to a proxy and that proxy is really supposed to turn around and delegate that message to its real subject. Reflection can be used to get around some of these issues in Java. However, reflection currently cripples performance, and performance is often the reason we build proxies in the first place.
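To make the reflection route concrete, here is a minimal sketch built on java.lang.reflect.Proxy and InvocationHandler, which arrived in JDK 1.3. The handler's invoke method plays roughly the role of #doesNotUnderstand:, forwarding every call to a real subject that is created only on first use. Account, RealAccount, LazyHandler and the lazy helper are hypothetical names invented for this illustration, and the sketch uses generics and method references from later Java versions, so treat it as a modern rendering of the idea rather than code from the period. It is not thread-safe, and it makes no claim about how reflective dispatch performs.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// Sketch only: Account, RealAccount, LazyHandler and lazy() are hypothetical names.
class LazyProxySketch {

    interface Account {
        int balance();
    }

    /** Counts constructions so the delayed instantiation is observable. */
    static int instantiations = 0;

    static class RealAccount implements Account {
        RealAccount() { instantiations++; }
        public int balance() { return 42; }
    }

    /** Forwards every call to a real subject that is built on first use. */
    static class LazyHandler implements InvocationHandler {
        private final Supplier<?> factory;
        private Object realSubject;

        LazyHandler(Supplier<?> factory) { this.factory = factory; }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            if (realSubject == null) {
                realSubject = factory.get();      // delayed instantiation
            }
            try {
                return method.invoke(realSubject, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();               // rethrow what the real subject threw
            }
        }
    }

    /** Builds a proxy for the given interface type; the proxy can only stand in for interfaces. */
    @SuppressWarnings("unchecked")
    static <T> T lazy(Class<T> type, Supplier<? extends T> factory) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] { type }, new LazyHandler(factory));
    }

    public static void main(String[] args) {
        Account account = lazy(Account.class, RealAccount::new);
        System.out.println("created before first call: " + instantiations);
        System.out.println("balance: " + account.balance());
        System.out.println("created after first call: " + instantiations);
    }
}
```

Note the limits compared with a nil proxy: the dynamic proxy still descends from Object, it works only through interfaces, and calls to toString, hashCode and equals also land in the handler and would wake the real subject.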

Indeed, when Java folks promote strong typing, Smalltalkers are likely to retort: "Strong typing for weak minds—weak typing for strong minds." Strong typing is part of the reason it takes an average of 53 lines of Java code to deliver an International Function Point Users Group (IFPUG) Level 4 function point, while Smalltalk delivers the same function point in only 21 lines (see www.spr.com/library/0langtbl.htm). Customers pay for functionality, measured in standard function points, and typically don’t care what language is under the hood.

Java’s intentional choice of strong typing also makes construction of proxies and their corresponding real subjects ponderous. Here is some Java code from my colleague Dave Harris; it uses an envelope/letter idiom to implement a proxy that stands in for a server, both of which implement the same interface, which defines the service:

interface Server {
    int getInt();
    void putInt( int x );
}

class ConcreteServer implements Server {
    private int x;
    public int getInt() { return x; }
    public void putInt( int x ) { this.x = x; }
}

class ProxyServer implements Server {
    private Server s;
    public void setDelegatee( Server s ) { this.s = s; }
    public int getInt() { return s.getInt(); }
    public void putInt( int x ) { s.putInt( x ); }
}

As you can see, this requires a lot of up-front thought. If we need to add a new method to the ConcreteServer, we must also add it in Server and ProxyServer. For a different kind of server we must start all over again. Lots of effort and duplication, compared to Smalltalk.
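The triplicate bookkeeping is easy to see in a small sketch. The version below repeats Dave's three types, nested in a wrapper class named ProxyServerDemo only so the example compiles as one unit, and adds a reset() method. Both ProxyServerDemo and reset() are hypothetical additions made for illustration; they are not part of the original code.

```java
// Sketch only: ProxyServerDemo and reset() are hypothetical additions.
class ProxyServerDemo {

    interface Server {
        int getInt();
        void putInt(int x);
        void reset();                                   // (1) declare it on the interface
    }

    static class ConcreteServer implements Server {
        private int x;
        public int getInt() { return x; }
        public void putInt(int x) { this.x = x; }
        public void reset() { x = 0; }                  // (2) implement it on the real subject
    }

    static class ProxyServer implements Server {
        private Server s;
        public void setDelegatee(Server s) { this.s = s; }
        public int getInt() { return s.getInt(); }
        public void putInt(int x) { s.putInt(x); }
        public void reset() { s.reset(); }              // (3) forward it from the proxy
    }

    public static void main(String[] args) {
        ProxyServer proxy = new ProxyServer();
        proxy.setDelegatee(new ConcreteServer());       // wire the proxy to its real subject
        proxy.putInt(7);
        System.out.println(proxy.getInt());
        proxy.reset();
        System.out.println(proxy.getInt());
    }
}
```

Leave out step (1) or step (3) and the compiler complains before anything runs, which is the up-front cost and also the safety net of the approach.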

On the other hand, the approach should work and you should be able to build real solutions with it. It does yield what the Java advocates call "the benefits of manifest typing." The boundaries between systems—where uncertainty and change are greatest—are where Java folks feel they most benefit from making interfaces explicit and rigorously enforced.

In C++, we can get closer to the dynamic nature of Smalltalk proxies by overloading the member access operator (->), the counterpart of the period in Java. This allows doing additional work whenever an object is dereferenced, and the proxy ends up behaving like a pointer. Methods then do not have to be defined in triplicate (on the interface, the proxy and on the real subject) as in Java. For a good example of C++ proxies, see pp. 213-215 of Gamma’s Design Patterns. For more information on the basics of Smalltalk proxies, see pp. 213-221 of Alpert’s The Design Patterns Smalltalk Companion. (One wonders when, if ever, a Design Patterns Java Companion will appear.)

But, take heart, all you folks stuck in J-Land. A group of self-sacrificing Smalltalkers is working feverishly on a technology that emits Java byte codes from Smalltalk source and implements features needed to do things like nil proxies. Stay tuned for further news.

— Enoch Sower
