Objects have been around for a while, and sometimes it seems that ever since they were created, folks have wanted to distribute them. However, distribution of objects, or indeed of anything else, has a lot more pitfalls than many people realize, especially when they're under the influence of vendors' cozy brochures. This article is about some of these hard lessons, lessons I've seen many of my clients learn the hard way.
Architect's Dream, Developer's Nightmare
Distributing an application by putting different components on different nodes sounds like a good idea, but the performance cost is steep.
A Mysterious Allure
There's a recurring presentation I used to see two or three times a year during design reviews. Proudly, the system architect of a new OO system lays out his plan for a new distributed object system; let's pretend it's some kind of ordering system. He shows me a design that looks rather like "Architect's Dream, Developer's Nightmare," with separate remote objects for customers, orders, products and deliveries. Each one is a separate component that can be placed on a separate processing node.
I ask, "Why do you do this?"
"Performance, of course," the architect replies, looking at me a little oddly. "We can run each component on a separate box. If one component gets too busy, we add extra boxes for it so we can load-balance our application." The look is now curious, as if he wonders if I really know anything about real distributed object stuff at all.
Meanwhile, I'm faced with an interesting dilemma. Do I just say out and out that this design sucks like an inverted hurricane and get shown the door immediately? Or do I slowly try to show my client the light? The latter is more remunerative, but much tougher, since the client is usually very pleased with his architecture, and it takes a lot to give up on a fond dream.
So, assuming you haven't shown my article the door, I suppose you'll want to know why this distributed architecture sucks. After all, many tool vendors will tell you that the whole point of distributed objects is that you can take a bunch of objects and position them as you like on processing nodes. Also, their powerful middleware provides transparency. Transparency allows objects to call each other, within a process or between processes, without having to know whether the callee is in the same process, in another process or on another machine.
Transparency is valuable, but while many things can be made transparent in distributed objects, performance isn't usually one of them. Although our prototypical architect was distributing objects the way he was for performance reasons, in fact his design will usually cripple performance, make the system much harder to build and deploy, or both.
Remote and Local Interfaces
The primary reason that the distribution-by-class model doesn't work has to do with a fundamental fact of computers: A procedure call within a process is extremely fast. A procedure call between two separate processes is orders of magnitude slower. Make that a process running on another machine, and you can add another order of magnitude or two, depending on the network topography involved. As a result, the interface for an object to be used remotely must be different from that for an object used locally within the same process.
A local interface is best as a fine-grained interface. Thus, if I have an address class, a good interface will have separate methods for getting the city, getting the state, setting the city, setting the state and so forth. A fine-grained interface is good because it follows the general OO principle of lots of little pieces that can be combined and overridden in various ways to extend the design into the future.
A fine-grained interface doesn't work well when it's remote. When method calls are slow, you want to obtain or update the city, state and zip in one call rather than three. The resulting interface is coarse-grained, designed not for flexibility and extendibility but for minimizing calls. Here you'll see an interface along the lines of get-address-details and update-address-details. It's much more awkward to program to, but for performance, you need to have it.
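To make the contrast concrete, here's a minimal sketch; the class and method names are invented for illustration, not taken from any particular framework:

```python
class Address:
    """Fine-grained interface: fine locally, deadly over the wire,
    since reading city, state and zip would cost three round-trips."""
    def __init__(self, city, state, zip_code):
        self._city, self._state, self._zip = city, state, zip_code

    def get_city(self):  return self._city
    def get_state(self): return self._state
    def get_zip(self):   return self._zip

class RemoteAddress:
    """Coarse-grained interface: one (potentially remote) call
    moves all the data at once."""
    def __init__(self, address):
        self._address = address

    def get_address_details(self):
        # Everything the caller needs, bundled into a single call.
        a = self._address
        return {"city": a.get_city(), "state": a.get_state(), "zip": a.get_zip()}

    def update_address_details(self, details):
        # One call to update all three fields together.
        self._address = Address(details["city"], details["state"], details["zip"])
```

A local caller can happily use the fine-grained getters; a remote caller should go through get_address_details so that all three fields travel in one call.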
Of course, what vendors will tell you is that there's no overhead to using their middleware for remote and local calls. If it's a local call, it's done with the speed of a local call. If it's a remote call, it's done more slowly. Thus, you pay the price of a remote call only when you need one. This much is, to some extent, true, but it doesn't avoid the essential point that any object that may be used remotely should have a coarse-grained interface, while every object that isn't used remotely should have a fine-grained interface. Whenever two objects communicate, you have to choose which to use. If the object could ever be in a separate process, you have to use the coarse-grained interface and pay the cost of the harder programming model. Obviously, it only makes sense to pay that cost when you need to, and so you need to minimize the number of interprocess collaborations.
For these reasons, you can't just take a group of classes that you design in the world of a single process, throw CORBA or some such at them and come up with a distributed model. Distribution design is more than that. If you base your distribution strategy on classes, you'll end up with a system that does a lot of remote calls and thus needs awkward, coarse-grained interfaces. In the end, even with coarse-grained interfaces on every remotable class, you'll still end up with too many remote calls and a system that's awkward to modify as a bonus.
A Better Way
Clustering involves putting several copies of the same application on different nodes. If you must distribute, this approach eliminates the latency problems.
Laying Down the Law
Hence, we get to my First Law of Distributed Object Design: Don't distribute your objects!
How, then, do you effectively use multiple processors? In most cases, the way to go is clustering (see "A Better Way"). Put all the classes into a single process and then run multiple copies of that process on the various nodes. That way, each process uses local calls to get the job done and thus does things faster. You can also use fine-grained interfaces for all the classes within the process and thus get better maintainability with a simpler programming model.
Where You Have to Distribute
So you want to minimize distribution boundaries and utilize your nodes through clustering as much as possible. The rub is that there are limits to that approach; that is, there are places where you need to separate the processes. If you're sensible, you'll fight like a cornered rat to eliminate as many of them as you can, but you won't eradicate them all.
One obvious separation is between the traditional clients and servers of business software. PCs on users' desktops are different nodes from the shared repositories of data. Since they are different machines, you need separate processes that communicate. The client/server divide is a typical interprocess divide.
A second divide often occurs between server-based application software (the application server) and the database. Of course, you can run all your application software in the database process itself, using such things as stored procedures. But often that's not practical, so you must have separate processes. They may run on the same machine, but once you have separate processes, you immediately have to pay most of the costs in remote calls. Fortunately, SQL is designed as a remote interface, so you can usually arrange things to minimize that cost.
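As a small illustration of that last point, here's a sketch using Python's built-in sqlite3 module (an in-process database standing in for a remote one); the table and column names are made up:

```python
import sqlite3

# In-memory database, playing the role of the separate database process.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses (id INTEGER PRIMARY KEY, city TEXT, state TEXT, zip TEXT)")
conn.execute("INSERT INTO addresses VALUES (1, 'Boston', 'MA', '02116')")

# One coarse-grained SQL statement fetches the whole row in a single
# round-trip, where a fine-grained object interface would have needed
# a separate call per field.
city, state, zip_code = conn.execute(
    "SELECT city, state, zip FROM addresses WHERE id = ?", (1,)).fetchone()
```

The same property holds for updates: a single UPDATE statement can change several columns at once, which is what makes SQL workable across a process boundary.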
Another separation in process may occur in a Web system between the Web server and the application server. All things being equal, it's best to run the Web and application servers in a single process, but all things aren't always equal.
You may have to separate processes because of vendor differences. If you're using a software package, it will often run in its own process, so again, you're distributing. At least a good package will have a coarse-grained interface.
Finally, there may be some genuine reason that you have to split your application server software. You should sell any grandparent that you can get your hands on to avoid this, but cases do come up. Then you just have to hold your nose and divide your software into remote, coarse-grained components.
The overriding theme, in OO expert Colleen Roe's memorable phrase, is to be "parsimonious with object distribution." Sell your favorite grandma first if you possibly can.
Working with the Distribution Boundary
As you design your system, you need to limit your distribution boundaries as much as possible, but where you have them, you need to take them into account. Every remote call travels on the cyber equivalent of a horse and carriage. All sorts of places in the system will change shape to minimize remote calls. That's pretty much the expected price. However, you can still design within a single process using fine-grained objects. The key is to use them internally and place coarse-grained objects at the distribution boundaries, whose sole role is to provide a remote interface to the fine-grained objects. The coarse-grained objects don't really do anything; they act as a façade for the fine-grained objects. This façade is there only for distribution purposes, hence the name Remote Façade.
Using a Remote Façade helps minimize the difficulties that the coarse-grained interface introduces. This way, only the objects that really need a remote service get the coarse-grained method, and it's obvious to the developers that they are paying that cost. Transparency may have its virtues, but you don't want to be transparent about a potential remote call.
By keeping the coarse-grained interfaces as mere façades, however, you allow people to use the fine-grained objects whenever they know they're running in the same process. This makes the whole distribution policy much more explicit.

Hand in hand with Remote Façade is Data Transfer Object. Not only do you need coarse-grained methods, you also need to transfer coarse-grained objects. When you ask for an address, you need to send that information in one block. You usually can't send the domain object itself, because it's tied into a web of fine-grained local inter-object references. So you take all the data that the client needs and bundle it in a particular object for the transfer, hence the term. (Many people in the enterprise Java community use the term Value Object for Data Transfer Object, but this causes a clash with other meanings of the term Value Object.) Data Transfer Objects appear on both sides of the wire, so it's important that they not reference anything that isn't shared over the wire. This boils down to the fact that Data Transfer Objects usually reference only other Data Transfer Objects and fundamental objects such as strings.
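Here's a minimal sketch of the pair working together; the Customer domain class, the CustomerDTO and the façade are all hypothetical names, chosen only to show the shape of the patterns:

```python
from dataclasses import dataclass

class Customer:
    """Fine-grained domain object, used freely inside the process."""
    def __init__(self, name, city, state):
        self.name, self.city, self.state = name, city, state
        self.orders = []  # part of a web of local inter-object references

@dataclass
class CustomerDTO:
    """Data Transfer Object: references only fundamental types (strings
    here), so it is safe to send across the wire."""
    name: str
    city: str
    state: str

class CustomerFacade:
    """Remote Façade: a coarse-grained front door for remote clients.
    It adds no behavior of its own; it only bundles fine-grained
    domain data into one transferable lump."""
    def __init__(self, customer):
        self._customer = customer

    def get_customer_details(self):
        c = self._customer
        return CustomerDTO(c.name, c.city, c.state)  # one call, one bundle
```

Note that the DTO deliberately omits the orders list: the domain object's tangle of local references stays on the server side, and only plain data crosses the boundary.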
Interfaces for Distribution
Traditionally, the interfaces for distributed components have been based on remote procedure calls, either with global procedures or as methods on objects. In the last couple of years, however, we've begun to see interfaces based on XML over HTTP. SOAP is probably going to be the most common form of this interface, but many people have experimented with it for some years.
XML-based HTTP communication is handy for several reasons. It easily allows a lot of data to be sent, in structured form, in a single round-trip. Since remote calls need to be minimized, that's a good thing. The fact that XML is a common format with parsers available on many platforms allows systems built on very different platforms to communicate, as does the fact that HTTP is pretty universal these days. The fact that XML is textual makes it easy to see what's going across the wire. HTTP is also easy to get through firewalls, where security and political reasons often make it difficult to open up other ports.
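As a rough sketch of the idea, here's how a coarse-grained address reply might be packed into, and unpacked from, a single XML document, using Python's standard xml.etree library; the element names are invented rather than taken from any real schema:

```python
import xml.etree.ElementTree as ET

def address_to_xml(city, state, zip_code):
    """Bundle all the address data into one structured, textual document,
    ready to travel over HTTP in a single round-trip."""
    root = ET.Element("address")
    ET.SubElement(root, "city").text = city
    ET.SubElement(root, "state").text = state
    ET.SubElement(root, "zip").text = zip_code
    return ET.tostring(root, encoding="unicode")

def xml_to_address(xml_text):
    """The receiving side parses the same textual document back apart."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}
```

Because the wire format is plain text, you can read it in a network trace, and either end of the conversation can be built on a completely different platform.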
Even so, an object-oriented interface of classes and methods has value, too. Moving all the transferred data into XML structures and strings can add a considerable burden to the remote call. Certainly, applications have seen a significant performance improvement by replacing an XML-based interface with a binary remote call. If both sides of the wire use the same binary mechanism, an XML interface doesn't buy you much other than a jazzier set of acronyms. If you have two systems built with the same platform, you're better off using the remote call mechanism built into that platform. Web services become handy when you want different platforms to talk to each other. My attitude is to use XML Web services only when a more direct approach isn't possible.
Of course, you can have the best of both worlds by layering an HTTP interface over an object-oriented interface. All calls to the Web server are translated by it into calls on an underlying object-oriented interface. To an extent, this gives you the best of both worlds, but it does add complexity, since you'll need both the Web server and the machinery for a remote OO interface. Therefore, you should do this only if you need an HTTP API as well as a remote OO API, or if the facilities of the remote OO API for security and transaction handling make it easier to deal with these issues than using non-remote objects.
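A toy sketch of that layering, with invented names: the "web server" below is just a function that parses a URL path and forwards the request to the underlying object-oriented interface.

```python
class OrderService:
    """The underlying object-oriented interface."""
    def __init__(self):
        self._orders = {"42": {"status": "shipped"}}

    def get_order_status(self, order_id):
        return self._orders[order_id]["status"]

def handle_http_request(path, service):
    """Tiny stand-in for a web tier: it does nothing but translate a
    URL such as /orders/42/status into a call on the OO interface."""
    parts = path.strip("/").split("/")
    if len(parts) == 3 and parts[0] == "orders" and parts[2] == "status":
        return "200 " + service.get_order_status(parts[1])
    return "404 not found"
```

The translation layer is mechanical, which is exactly the point: all the real behavior stays behind the object-oriented interface, and the HTTP face is just another door into it.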
In this discussion, I've assumed a synchronous, RPC-based interface. However, although that's what I've described, I actually don't think it's always the best way of handling a distributed system. Increasingly, my preference is for a message-based approach that's inherently asynchronous. In particular, I think asynchronous messages are the best use of Web services, even though most of the examples published so far are synchronous.
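To show the shape of that style, here's a minimal sketch using Python's standard queue and threading modules as a stand-in for real messaging middleware; the message names are invented:

```python
import queue
import threading

orders = queue.Queue()   # the "channel" between sender and receiver
processed = []

def worker():
    """The receiving process consumes messages in its own time."""
    while True:
        msg = orders.get()
        if msg is None:          # shutdown sentinel
            break
        processed.append("handled " + msg)

t = threading.Thread(target=worker)
t.start()

# The sender fires and forgets: it does not block waiting for a reply,
# which is the essential difference from a synchronous RPC call.
orders.put("order-1")
orders.put("order-2")
orders.put(None)
t.join()
```

The sender and receiver are decoupled in time: either side can be briefly down or slow without the other having to block on a round-trip.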
For patterns on asynchronous messaging, take a look at www.enterpriseIntegrationPatterns.com.
Martin Fowler is Chief Scientist at ThoughtWorks and a frequent speaker at Software Development conferences. This article is adapted from Patterns of Enterprise Application Architecture, Chapter 7 (Addison-Wesley, 2003). Reprinted with permission.