Welcome to the final column in our series exploring CORBA and XML. Our first column discussed XML in general and explored how XML could be used to address the CORBA versioning problem. The second column considered alternatives for passing XML-defined data between client and target, and included an overview of the OMG's recently adopted "XMLDOM: DOM/Value Mapping Specification" as one of those alternatives. The DOM/Value mapping employs IDL valuetypes to specify an API that allows your application to build and traverse XML parse trees in memory.
In this column we delve into two hot topics related to XML and CORBA: SOAP (Simple Object Access Protocol) and web services. Both make extensive use of XML, and like XML, both are currently over-hyped as all-singing, all-dancing wonder technologies that can cure all your software ills. Such hype often builds expectations that simply can't be met, ultimately resulting in shattered illusions, disappointment, and worst of all, wasted time, effort, and money. As always, we take a pragmatic look at these technologies to see what they're really made of, to see where their real value resides, and to see how they compare and contrast with CORBA.
As we explained in our previous column, SOAP is an emerging distributed middleware technology that uses a lightweight and simple XML-based protocol to allow applications to exchange structured and typed information across the Web. SOAP is designed to support automated web services based on a shared, decentralized, and open web infrastructure. SOAP applications can be written in a wide range of programming languages (such as Java, C++, C, Perl, and C#), can be used in combination with a variety of Internet protocols and formats (such as HTTP, SMTP, and MIME), and can support many types of applications ranging from messaging systems to RPC (remote procedure call). There are three main parts in the SOAP architecture:
- An envelope that describes the contents of a message and how to process it.
- A set of encoding rules for expressing instances of application-defined datatypes.
- A convention for representing remote procedure calls and responses.
Thus, SOAP is similar to CORBA's IIOP (Internet Inter-ORB Protocol) in the sense that it's a protocol for conveying messages between applications. Given our industry's penchant for creating "religious" arguments around technologies that are similar to each other, whether they actually compete or not, comparisons between SOAP and IIOP abound. We discuss some of these arguments below.
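To make the three parts of the SOAP architecture concrete, here is a minimal Python sketch that builds a SOAP 1.1 request using only the standard library. The two `schemas.xmlsoap.org` namespace URIs come from the SOAP 1.1 note; the application namespace, the method name, and the parameter are hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the SOAP 1.1 note.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
# Hypothetical application namespace, used only for this illustration.
APP_NS = "urn:example:stock-quotes"


def build_rpc_request(method, params):
    """Return a SOAP envelope (as bytes) that represents one RPC request."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    # Part 1: the envelope wraps the whole message.
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    # Part 2: the encodingStyle attribute names the encoding rules in force.
    envelope.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # Part 3: the RPC convention models a call as an element named after
    # the method, with one child element per parameter.
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope)


request = build_rpc_request("GetLastTradePrice", {"symbol": "IONA"})
print(request.decode("ascii"))
```

Even this tiny request spends most of its bytes on tags and namespace declarations, which is worth keeping in mind for the transfer syntax discussion that follows.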
Transfer Syntax Issues
Transfer syntax is the format that a protocol uses to transfer data in a message from sender to receiver. Unlike IIOP, which represents message data in a binary format called the CDR (Common Data Representation), SOAP uses XML for its transfer syntax. At least three religious arguments fall under the transfer syntax category, as described below.
- Message size. Those who favor IIOP's binary transfer syntax invariably point out that it yields much smaller messages than SOAP's text-based transfer syntax. For example, sending the number 1 and sending the number 12,345,678 using IIOP each cost four bytes (assuming a long). In SOAP, however, the size is determined by how many character digits are used to represent the number, and by the character set used (i.e., whether it requires one byte per character or multiple bytes per character). On top of this issue, recall that IIOP contains no on-the-wire type identification (except for the Any type), versus SOAP's required use of character-based XML tags to mark all the fields in the message. The resulting difference in the size of an IIOP encoding of a message and the equivalent SOAP encoding can be dramatic.
- Marshaling complexity. Getting data from sender to receiver requires marshaling, and it is well accepted that marshaling overhead can degrade middleware performance and scalability significantly. The marshaling required for CDR is of medium complexity (i.e., marshaling simple types is simple, whereas marshaling complex recursive TypeCodes and valuetypes can be complicated). The bulk of a GIOP (General Inter-ORB Protocol) marshaling engine is fairly straightforward, with the only tricky aspects being the byte alignment rules and byte endianness. In contrast, SOAP marshaling is far more complicated, due to its use of XML as its transfer syntax, which is time consuming to parse. Even though the popularity of XML is driving rapid improvements in XML parsing engines, the complexity of SOAP's transfer syntax versus CDR means that SOAP will always be more expensive to marshal and demarshal. As with message size, whether marshaling overhead makes a difference in an application's performance depends entirely on that application and the computing environment that it runs in. For example, business applications are often I/O-bound due to database access overhead, so marshaling costs are rarely a bottleneck. In contrast, distributed real-time and embedded systems are often much more sensitive to marshaling complexity.
- Message readability. An argument that surfaces quite frequently is that the binary format of CDR makes messages hard for humans to read. The purveyors of this argument tend to be administrators or developers who regularly work on getting systems to interoperate. They often prefer human-readable transfer syntax that allows them to use simple tools to read the messages flowing between endsystems and determine where their interoperability problems lie.
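The "tricky aspects" of CDR mentioned under marshaling complexity, alignment and byte order, can be sketched in a few lines of Python. This is a toy encoder for three primitive types only, not a GIOP marshaling engine; the class and method names are our own invention, and zero is used for padding bytes simply as a convenient choice.

```python
import struct


class CdrWriter:
    """Toy encoder for a few CDR primitives (octet, short, long)."""

    _FORMATS = {"octet": "B", "short": "h", "long": "i"}

    def __init__(self, little_endian):
        # CDR lets the sender use its native byte order and flag it
        # in the message header; here the caller picks the order.
        self._prefix = "<" if little_endian else ">"
        self._buffer = bytearray()

    def write(self, kind, value):
        fmt = self._prefix + self._FORMATS[kind]
        size = struct.calcsize(fmt)
        # CDR aligns each primitive on a multiple of its own size,
        # measured from the start of the stream, padding as needed.
        padding = -len(self._buffer) % size
        self._buffer += bytes(padding)
        self._buffer += struct.pack(fmt, value)

    def getvalue(self):
        return bytes(self._buffer)


big = CdrWriter(little_endian=False)
big.write("octet", 1)         # 1 byte, then 3 bytes of padding
big.write("long", 12345678)   # 4 bytes, aligned on a 4-byte boundary
print(big.getvalue().hex())
```

Compared with parsing XML, this is very little work per field, which is the heart of the marshaling argument.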
But does message size really matter? After all, the largest distributed system we have, the World Wide Web, is also our most successful, despite being mostly based on carrying HTML (a text-based markup language, obviously similar in many ways to XML) over HTTP (a text-based protocol). We've all suffered through sluggish web downloads over slow dial-up links, but the ever-increasing availability of high-bandwidth technologies such as broadband continues to eliminate these web bottlenecks. Moreover, it's usually the embedded images, MP3s, and other non-text files that cause slow downloads, not the text pages themselves.
Despite these arguments to the contrary, size really does matter for some applications. For example, the sizes of network messages matter in mobile or wireless applications where the network has been proven to be a bottleneck via throughput testing and measurement of actual working conditions. Size can also matter when devices possessing limited processing power or memory capacities are involved since parsing and processing text-based message encodings is more expensive than processing binary encodings. Generally, though, the degree to which message size has a noticeable effect on an application's performance and scalability depends heavily on the nature of the application, as well as its computing and communication environment. It's therefore pointless to argue about the superiority of one transfer syntax over another based solely on message size without thorough empirical evidence to determine if the time/space overhead actually affects end-to-end performance requirements.
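The message size comparison is easy to reproduce. The following Python sketch measures one field only, assuming a hypothetical element name `quantity`; it ignores the SOAP envelope and any CDR alignment padding, so treat the numbers as a rough illustration rather than a benchmark.

```python
import struct


def cdr_long_size(value):
    """Size in bytes of a 32-bit CDR long, whatever its value."""
    return len(struct.pack(">i", value))


def xml_field_size(tag, value, encoding="utf-8"):
    """Size in bytes of the same value as a tagged XML text field."""
    return len(f"<{tag}>{value}</{tag}>".encode(encoding))


for number in (1, 12345678):
    print(
        number,
        cdr_long_size(number),                          # always 4
        xml_field_size("quantity", number),             # one byte per character
        xml_field_size("quantity", number, "utf-16-be"),  # two bytes per character
    )
```

The binary size is constant, whereas the text size grows with both the number of digits and the width of the character encoding.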
Frankly, we find the readability argument strange and weak. First, in a successful system, the time spent debugging the system is but a fraction of the time spent running it in production. If the use of a human-readable transfer syntax makes the system run slower in production, thereby decreasing its business effectiveness, then the use of human-readable transfer syntax is a poor engineering design choice. Second, the number of people who actually might need to read the transfer syntax is extremely small. As before, potentially penalizing the performance and throughput of the whole system for the benefit of a small percentage of its users can be a poor engineering design choice.
If using text-based transfer syntax adversely affects your application's performance, but you really must read the messages to help you debug the system, there are ways of having your cake and eating it too. For example, you can download programs that "snoop" CDR and IIOP messages that an application sends and receives, and the snoop program will decode them into text for you. The Ethereal network protocol analyzer, available at <www.ethereal.com>, is a popular snooping tool that works with IIOP. IONA's Orbix 2000 product provides a snoop plug-in that you can dynamically load into an application to have it translate binary CDR messages into text for you to read.
Years ago, similar snooping programs were used to help debug TCP/IP interoperability problems. As TCP/IP became ubiquitous, the need for people to use TCP snooping programs for debugging was greatly reduced, practically to nil. (Aside from actual TCP stack developers, the only people using TCP snooping programs today are students learning the details of TCP/IP, and hackers.) Similarly, as IIOP-based interoperability between different ORBs has improved, the number of people who need to use IIOP snooping programs has dwindled. Therefore, the fact that SOAP uses a text-based transfer syntax might provide marginal assistance to those few who are actually developing SOAP implementations, but beyond that, the fact that it's human-readable (assuming you actually consider XML to be human-readable, that is) is of little or no benefit in practice.
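To give a flavor of what such snooping tools do, here is a minimal Python sketch that decodes only the fixed 12-byte GIOP message header into text. It is not how Ethereal or the Orbix plug-in is implemented; the function name and the sample message are made up for illustration, and a real tool would go on to decode the message body as well.

```python
import struct

# Message types defined by GIOP, indexed by their numeric code.
GIOP_MESSAGE_TYPES = (
    "Request", "Reply", "CancelRequest", "LocateRequest",
    "LocateReply", "CloseConnection", "MessageError", "Fragment",
)


def describe_giop_header(data):
    """Decode the fixed 12-byte GIOP message header into readable text."""
    if len(data) < 12 or data[:4] != b"GIOP":
        raise ValueError("not a GIOP message")
    major, minor, flags, msg_type = struct.unpack("4B", data[4:8])
    # GIOP 1.0 carries a byte_order boolean here; 1.1 and later use
    # bit 0 of a flags octet. Either way, a set low bit means little-endian.
    little_endian = bool(flags & 1)
    (size,) = struct.unpack("<I" if little_endian else ">I", data[8:12])
    order = "little-endian" if little_endian else "big-endian"
    if msg_type < len(GIOP_MESSAGE_TYPES):
        name = GIOP_MESSAGE_TYPES[msg_type]
    else:
        name = "Unknown"
    return f"GIOP {major}.{minor} {name}, {order}, body of {size} bytes"


# Hand-built sample: GIOP 1.0, big-endian, Request, 64-byte body.
sample = b"GIOP" + bytes([1, 0, 0, 0]) + struct.pack(">I", 64)
print(describe_giop_header(sample))
```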
If you were involved with CORBA in the early 1990s, you may recall that CORBA was independent of any particular protocol. Indeed, all the ORBs (Object Request Brokers) that existed back then had their own proprietary protocols because the OMG had not yet standardized IIOP. Orbix had the Plain Old Orbix Protocol (the acronym is left as an exercise for the reader), and Sun and HP's jointly developed DOMF (Distributed Object Management Facility) could use either Sun ONC RPC or DCE RPC as the basis for its ORB-to-ORB protocol. Other ORBs, such as those from HyperDesk, Digital, and IBM, each had their own proprietary protocols, too. It wasn't until the OMG standardized IIOP in the CORBA 2.0 Specification in 1995 that ORBs started to move away from their own native protocols toward supporting the Standard.
Naturally, some applications still require the ability to run over non-standard protocols in order to maximize performance or address legacy constraints. As a result, many ORBs still support alternative non-standard protocols and transports, such as shared memory, bus interconnects, and multicast transports. In fact, an "extensible transport framework" has been proposed in the OMG with the hope of making it possible to plug in different transport protocols underneath ORBs in a standard way.
Given the multi-protocol capacity of CORBA, perhaps the religious wars of "SOAP vs. IIOP" are completely unnecessary in the context of CORBA. In other words, is it possible to access CORBA objects using SOAP? Several ORBs have offered SOAP support in the past. For example, Rogue Wave's (now defunct) Nouveau ORB supported an XML/CORBA marriage called "XORBA," where SOAP could be used to access CORBA objects directly. IONA built a SOAP plug-in for Orbix 2000 shortly after SOAP was first announced and showed SOAP interoperability in live demonstrations with Microsoft at several conferences. However, IONA has never released its SOAP plug-in as a product.
Clearly, SOAP access to CORBA objects is technically feasible. The obvious question one must ask, however, is whether there are any technical benefits to having SOAP access to CORBA objects. Such potential benefits might include the following:
- Firewall traversal. There is no standard port for IIOP, so it normally does not traverse firewalls easily. The OMG's first attempt at an IIOP firewall traversal standard fell apart due to serious technical flaws in the specification submitted for adoption, so the OMG is now on its second round of adoption for IIOP firewall traversal technology. This means that ORBs are still using their own proprietary approaches to dealing with firewalls, such as VisiBroker GateKeeper and IONA's WonderWall.
- Interoperability with non-CORBA systems. Unlike many of its technology predecessors, such as DCE, CORBA, COM, and J2EE, SOAP enjoys ubiquitous industry support, with no SOAP-related industry fragmentation in sight. An informal count via web searching shows that about 60 different implementations of SOAP are available today from a wide variety of sources. Not all of these implementations are complete or correct, yet many of them interoperate. This is due in large part to SOAP implementers advertising SOAP-based servers on their websites, thus allowing other implementers to perform interoperability tests with those servers and compare results.
With respect to firewall traversal, SOAP has an advantage: it is normally layered over HTTP, and HTTP normally uses port 80. Most system administrators allow HTTP traffic on port 80 to come through their firewalls. This means that unless the firewall performs packet checking to try to sniff out certain types of messages and block them, SOAP messages carried over HTTP will easily traverse firewalls. Naturally, network administrators will need to configure their firewalls so that malicious data or requests are not tunneled through SOAP messages.
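The reason SOAP passes through firewalls so easily is visible on the wire: a SOAP request is just the payload of an ordinary HTTP POST. The Python sketch below builds such a request as raw bytes, following the SOAP 1.1 HTTP binding (a `text/xml` content type plus a `SOAPAction` header). The host, path, and action URI are hypothetical, and nothing is actually sent over the network.

```python
def wrap_in_http_post(envelope_xml, host, path, soap_action):
    """Frame a SOAP envelope as the body of an HTTP/1.1 POST request."""
    body = envelope_xml.encode("utf-8")
    headers = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        'Content-Type: text/xml; charset="utf-8"',
        f"Content-Length: {len(body)}",
        # SOAP 1.1 adds this header so that intermediaries such as
        # firewalls can filter SOAP requests if they choose to.
        f'SOAPAction: "{soap_action}"',
    ]
    return "\r\n".join(headers).encode("ascii") + b"\r\n\r\n" + body


envelope = (
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    "<SOAP-ENV:Body/></SOAP-ENV:Envelope>"
)
request = wrap_in_http_post(
    envelope, "www.example.com", "/StockQuote", "urn:example:GetQuote"
)
print(request.decode("utf-8"))
```

A firewall that merely allows port 80 traffic cannot tell this request apart from a web form submission, which is both the convenience and the risk noted above.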
With respect to interoperability, the more SOAP-based systems appear in production environments, the more important interoperability between those systems and CORBA systems will become. Without such interoperability, your middleware systems would evolve into disconnected islands unable to talk to each other, which is neither useful nor desirable for important business sectors. Unfortunately, though, simple on-the-wire interoperability between systems is not always sufficient, due to differences in object models and application semantics. We'll discuss this issue more thoroughly below.
The precise definition of web services is still evolving, but they are generally viewed as having the following characteristics:
- They are accessible via widely deployed protocols such as HTTP and SMTP.
- They support loosely-coupled service-oriented architectures, which consist of services that advertise their existence and their contact information in some form of registry or directory service, along with clients who look up services in the directory and then contact those services using their contact information. Service-oriented architectures also hide implementation details, such as programming language and operating system, from all participants. CORBA services can advertise their object references in Naming or Trader services to allow clients to find them and deliver requests to them, and CORBA hides client and object implementation details, such as operating system and programming language. It's therefore accurate to say that CORBA has supported service-oriented architectures from its inception.
- They send and receive XML-encoded messages.
In our opinion, web services also represent the evolution and convergence of three important areas of technology:
- The World Wide Web. Web services represent the continued evolution of the Web from a browser-oriented interactive system to a large-scale A2A (application-to-application) integration system.
- Traditional middleware. Web services also represent another step in the evolution of middleware, continuing a long line including Sun RPC, DCE, DCOM, CORBA, transactional messaging, and J2EE. Though viewed as novel by many new to the distributed computing arena, the fundamental concepts underlying web services are derived from, and largely identical to, the basic capabilities of these more mature middleware technologies.
- EDI (Electronic Data Interchange). B2B (business-to-business) integration enables trading partners to conduct business transactions based entirely on standardized business document formats and on standardized business interactions called orchestrations or choreographies. These business transactions are executed by automated applications with little to no need for human input into the overall process. Due to the novelty of the Web, many believe that such B2B integration is an entirely new technology area. This is entirely false, however, as standardization processes for EDI were begun in 1979. Since then, EDI has evolved to include hundreds of standard business interactions and standard business document formats.
The fact that web services are evolving from both the Web itself and from traditional middleware is fairly obvious, and thus requires little explanation. The EDI roots of web services, on the other hand, are little understood, yet they are key to understanding the relationship between CORBA and web services. They are also critical to making web services into a viable integration technology.
Most web services that are developed today are very simple. For example, the X Methods website <www.xmethods.net> lists a number of publicly accessible web services. Here you'll find a number of stock quote services, exchange rate calculators, and services that return trivial reports on traffic, airports, and headline news. These services are little more than demoware and are thus useless for real-life applications.
Why are these web service examples so simple? It's not a sign that web services have no place in distributed computing systems. Rather, one reason is that the whole web services area is still quite immature. Currently, many of those developing web services are new to distributed computing, and they're thrilled when they can get a simple client application talking to a simple web service. There's nothing wrong with that, as we all have to start somewhere. However, a much more fundamental reason (one that, despite its apparent simplicity, might even be considered profound) is that trivial services are trivial to use. In other words, trivial services have trivial application choreographies. Just as in dance, application choreography consists of a series of coordinated steps. In A2A integration, these steps represent what's required for two or more applications to be able to interact correctly to carry out some overall function or business process.
Consider how a CORBA client interacts with a CORBA object. Typically, the object advertises its object reference in a Naming or a Trader service. The client looks up the object reference and then uses that reference to invoke a request on the object. As we explained above, this discovery pattern is the hallmark of service-oriented architectures.
CORBA applications can be either service-oriented or session-oriented.
- A service-oriented application is one whose objects are normally persistent CORBA objects (i.e., objects that outlive any server process that might temporarily host them) that are shared by multiple applications. For example, objects supporting the CosNaming::NamingContext interface tend to be service-oriented.
- A session-oriented application, on the other hand, is typically based on the Factory pattern, where persistent Factory objects are invoked to create transient CORBA objects that are private to each client's session. Once the client finishes with its session objects, it normally destroys them.
The sets of interactions between a client, an object, and directory services required to create session-oriented or service-oriented applications are examples of application choreographies.
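The two choreographies can be sketched side by side in Python. Plain in-process objects stand in for CORBA object references here, so there is no ORB, no IDL, and no real Naming service involved; `NamingContext` only mimics the bind and resolve steps, and `CartFactory` and `ShoppingCart` are hypothetical names invented for this illustration.

```python
class NamingContext:
    """Shared, long-lived directory: a service-oriented object."""

    def __init__(self):
        self._bindings = {}

    def bind(self, name, obj):
        self._bindings[name] = obj

    def resolve(self, name):
        return self._bindings[name]


class ShoppingCart:
    """Private to one client and alive only for that client's session."""

    def __init__(self):
        self.items = []
        self.destroyed = False

    def add(self, item):
        if self.destroyed:
            raise RuntimeError("session object no longer exists")
        self.items.append(item)

    def destroy(self):
        self.destroyed = True


class CartFactory:
    """Long-lived factory that creates private session objects."""

    def create(self):
        return ShoppingCart()


# Server side: advertise the factory in the directory.
naming = NamingContext()
naming.bind("CartFactory", CartFactory())

# Client side: look up the factory, create a session object,
# use it, and destroy it when the session ends.
factory = naming.resolve("CartFactory")
cart = factory.create()
cart.add("book")
cart.destroy()
```

Note that none of the ordering (bind before resolve, create before add, destroy last) appears in any method signature. It lives entirely in the choreography that both sides must already understand.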
At a somewhat higher level, each CORBA IDL interface also implies application choreography. For example, the Naming service generally expects applications to set up some name-to-object reference bindings before performing lookups for those bindings. Similarly, a banking interface would likely expect clients to invoke deposit operations before trying to invoke withdrawal operations. The application choreography implied by an IDL interface is sometimes called its contract because it specifies what an object supporting that interface expects from its clients, and what it promises to deliver to its clients in return.
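The banking example shows how a contract hides behind an interface. In the following Python sketch of a hypothetical `Account` class, the method signatures say nothing about ordering, yet a client that withdraws before depositing is rejected.

```python
class InsufficientFunds(Exception):
    """Raised when a withdrawal would overdraw the account."""


class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount

    def withdraw(self, amount):
        # The contract: clients must have deposited enough beforehand.
        if amount > self.balance:
            raise InsufficientFunds(f"balance is only {self.balance}")
        self.balance -= amount


account = Account()
try:
    account.withdraw(50)       # wrong step order for a new account
except InsufficientFunds as error:
    print("rejected:", error)

account.deposit(100)           # correct order: deposit first...
account.withdraw(50)           # ...then withdraw
print(account.balance)
```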
One reason that the web services listed on the X Methods website are so simple is to avoid the need to specify the services' application choreographies. For a stock quote web service, for instance, you expect to pass a stock name string to it and expect it to return the value of that stock. Application choreography doesn't get much simpler than that, and thus it can be implied rather than requiring explicit explanation and documentation. This does not mean, however, that all web services offer trivial application choreographies.
CORBA Objects and Web Services
Up to this point in this column, we have provided essential background information regarding SOAP, web services, and application choreographies. We now build on this information by comparing and contrasting CORBA objects and web services. There are two ways to do this:
- We could treat CORBA objects and web services as equals, and therefore as rivals, and examine the pros and cons of each. For example, at least one published comparison (which unfortunately is full of misinformation regarding CORBA and is generally lacking in objectivity) takes this approach. Such a comparison might appear to help us decide which technology is superior, with the assumption being that the superior technology will win, and is thus the one we should prefer.
- We could recognize that CORBA objects and web services are different and examine where each fits within common distributed computing solutions. We advocate this approach because it allows you to gain the benefits of both technologies and evolve your computing systems gracefully instead of resorting to wholesale disruptive replacement.
The first approach suffers from a number of flaws. First, superior technology rarely wins in isolation, because "winning" is defined by market share and revenue, not by technical features. In fact, in our experience, superior technology almost always loses because the creators of such technology rarely possess the marketing knowledge and skills required to get large numbers of users to want to adopt it. Second, there is no such thing as a "one size fits all" solution, and as a result real-life computing systems are rarely homogeneous. It's unlikely, for example, that if you had a CORBA-based system successfully running in production, you would take it down and wholly replace it with a system based around web services because some "Web Services vs. CORBA" article declared CORBA to be the loser. Such replacement rarely makes good engineering or economic sense.
If you consider application choreographies, it's easy to see that CORBA objects and web services are different. For example, all CORBA objects support the CORBA object model, which among other things allows for run-time navigation of objects' interface inheritance hierarchies. Web services, on the other hand, has no object model whatsoever. This is not to say that we equate the lack of an object model with technical inferiority; rather, we point it out only as a significant difference between the two technologies.
There are several possible ways to combine CORBA and web services:
- Implement CORBA objects using web services. This approach is trivial, given that CORBA completely hides implementation details. A client of an object implemented in this manner would be fully shielded from the fact that the object's servant invoked one or more web services to carry out the client's requests. The servant would translate all values passed to it in each request into a form suitable for invoking the web service, and the servant would also translate any return values from the web service before returning a reply to its client, be it a normal reply or an exception. Such a CORBA object would serve as a wrapper for the web services that implement it, and its primary responsibility would be mapping between CORBA application choreographies and web service application choreographies. As part of doing this, it would also have to map between the CORBA parameter data and the data passed to and from the web services. Given CORBA's traditional role as integration-oriented middleware, wrapping and mapping are two tasks that are quite common in CORBA object implementations, which have been used to wrap and map devices, databases, mainframes, packaged applications, and everything in between.
- Implement web services using CORBA objects. This approach mirrors the previous one and is the more common scenario in practice. This is because web services are viewed as a more unifying technology than CORBA, primarily due to the fact that web services are accessed over the ubiquitous Web infrastructure. In this scenario, a web service implementation, typically written as a J2EE servlet, accesses "back-end" CORBA objects to carry out its function, wrapping (and thus hiding) those CORBA objects, and mapping between the different application choreographies and data. The clients of the web service neither know nor care that CORBA objects implement it.
- Expose CORBA objects as web services. This approach is often discussed (and even marketed by some products) as if it were both obvious and trivial, but it is neither. Rather, this approach is simply untenable. The reason is straightforward: CORBA application choreographies and web services application choreographies do not match. For example, would directly exposing a CORBA object require that the Naming service where it's registered also be exposed as a web service? Would a client of such a web service know enough about Naming to be able to navigate it and find its target service? What entity would be responsible for translating SOAP parameter data into CORBA parameter data, and translating the object's IDL definition into WSDL (Web Services Description Language) for use by the web service client? Could the web service client handle GIOP location forward replies, and if so, how?
The fact that the questions raised above are so difficult to answer is one major reason why direct SOAP access to CORBA objects has not caught on. Because SOAP and web services have evolved together, SOAP is now viewed as being synonymous with web services. As the questions above imply, CORBA application choreographies prevent CORBA objects from serving directly as web services, and therefore the ability to access such objects directly via SOAP is of little or no value.
Though it sounds obvious to say that you cannot integrate your applications unless you address their choreographies, many integration projects fail for this very reason. Such projects tend to consider integration and mapping at only the transport and protocol levels, which is like trying to translate from one human language to another by translating only the words and ignoring sentence structure. As a consequence, these projects often fail to satisfy their long-term end user and customer needs, even though they may perform quite well for limited use cases.
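The wrap-and-map role described above, in which a web service front end hides a back-end object, can be sketched as follows. This Python example is an assumption-laden illustration: `QuoteObject` is a plain class standing in for a back-end CORBA object, the XML element names and the price are invented, and the SOAP envelope and HTTP layers are omitted so that only the mapping logic remains.

```python
import xml.etree.ElementTree as ET


class QuoteObject:
    """Stand-in for a back-end CORBA object (hypothetical interface)."""

    _PRICES = {"IONA": 12.5}  # made-up data for the illustration

    def get_quote(self, symbol):
        return self._PRICES[symbol]  # raises KeyError if unknown


def handle_quote_request(request_xml, backend):
    """Map an XML request document onto a back-end call and back again."""
    request = ET.fromstring(request_xml)
    symbol = request.findtext("symbol")
    response = ET.Element("quoteResponse")
    try:
        # Map the document's data onto the method-oriented back end.
        price = backend.get_quote(symbol)
    except KeyError:
        # Map the back-end exception into a fault in the reply document.
        ET.SubElement(response, "fault").text = f"unknown symbol: {symbol}"
    else:
        ET.SubElement(response, "price").text = f"{price:.2f}"
    return ET.tostring(response, encoding="unicode")


reply = handle_quote_request(
    "<quoteRequest><symbol>IONA</symbol></quoteRequest>", QuoteObject()
)
print(reply)
```

The client sees only documents going in and out. Lookup of the back-end object, its exceptions, and its data types all stay behind the wrapper, which is exactly where the choreography mapping has to be done.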
Middleware for Middleware
Rather than competing with existing middleware technologies, web services is evolving into a role of integrating middleware. Unlike its RPC and EAI (Enterprise Application Integration) technology predecessors, web services technologies and standards enjoy unprecedented industry backing. All the major players and most of the minor ones fully support and endorse the standardization of SOAP and WSDL. So far, the industry remains unfragmented with respect to these standardization efforts, something that never happened for Sun RPC, DCE, COM/DCOM, CORBA, or J2EE. This fact alone puts web services in the driver's seat when it comes to being able to integrate different types of middleware, including CORBA, J2EE, and Microsoft .NET.
By no means, however, does the unified industry backing of web services standards imply that all other middleware technologies, including CORBA, are going to disappear. CORBA has already proven itself more than capable of solving a variety of distributed computing and integration problems, and its high performance, dependability, scalability, and great flexibility have enabled solutions in cases that were previously insolvable. Other middleware technologies, such as .NET, J2EE, and EAI, have their places as well. In many cases, systems based on these technologies have been running successfully within production environments for years, and their owners are not about to "fix what ain't broke."
Given that the role awaiting web services is one of middleware integration, not middleware replacement, one must consider what such integrations might look like. After all, middleware is generally viewed as integration technology, so what does it mean to integrate technologies which themselves serve to integrate? Here, we again turn to the issue of application choreography, but this time at a business process level.
Traditional CORBA integration approaches have typically been used to integrate systems within controlled environments, such as within intranets owned by a single company. Web services, on the other hand, appear to hold promise not only for integration within an intranet, but also integration across the Internet. In essence, integration via web services appears to be headed toward a level of granularity that is more coarse-grained than the typical CORBA-based integration, and toward systems that are more loosely coupled than the typical CORBA system. Web services achieve loose coupling by minimizing interface dependencies and focusing more on the exchange of XML-defined data. In essence, and unlike CORBA objects, they are more document-oriented than method-oriented.
Due to the uncontrolled nature of the Internet, web services intended for consumption by trading partners and others outside of your organization will require strict definitions for their application choreographies. Nailing down e-business interactions between trading partners is not simply a technical requirement to allow their applications to talk to one another. Rather, it's also a business requirement, so that each party knows what it's giving and receiving, and a legal requirement, so that contracts are correctly honored.
This is where the essence of EDI, which we described above as being one of the three technology areas feeding into web services, enters the picture. In short, EDI blazed the trail for standardizing the business processes and business documents required for computing applications to correctly and automatically conduct business on behalf of their owners. This trail is now being followed by modern-day standards such as ebXML (<www.ebxml.org>), RosettaNet (<www.rosettanet.org>), and BizTalk (<www.biztalk.org>). Ultimately, the web services approach is best used to integrate not through programming, but through workflow business logic residing within CORBA systems and other traditional middleware in order to support standardized business processes and documents.
Without standard application choreographies, web services will be forever limited to simple stock quote services and foreign exchange rate calculators. Similarly, without the existence of back-end integrated systems based on traditional middleware, such as CORBA, web services will have nothing to wrap, map, and expose. The fact that SOAP is based on ubiquitous web infrastructure is a bonus, but that alone is not a compelling reason to replace traditional middleware systems with web services. Rather, the strength of web services lies in its ability to implement the business choreographies needed for A2A integration over the Internet using the business logic already residing in your CORBA systems. Comparing CORBA and web services as if they were competitors is highly misguided. Instead, they are complementary, and hopefully we have convinced you that together they solve a problem that neither can solve alone.
This column concludes our series on CORBA and XML. We hope you have gotten as much out of this series of columns as we have in writing it. If you have comments, questions, or suggestions regarding these columns, please let us know at [email protected].
A. Gokhale and D.C. Schmidt. "Optimizing a CORBA IIOP Protocol Engine for Minimal Footprint Multimedia Systems," Journal on Selected Areas in Communications, special issue on Service Enabling Platforms for Networked Multimedia Systems, September 1999.
Steve Vinoski is chief architect and vice president of Platform Technologies for IONA Technologies and is also an IONA Fellow. A frequent speaker at technical conferences, he has been giving CORBA tutorials around the globe since 1993. Steve helped put together several important OMG specifications, including CORBA 1.2, 2.0, 2.2, and 2.3; the OMG IDL C++ Language Mapping; the ORB Portability Specification; and the Objects By Value Specification. In 1996, he was a charter member of the OMG Architecture Board. He is currently the chair of the OMG IDL C++ Mapping Revision Task Force. He and Michi Henning are the authors of Advanced CORBA Programming with C++, published in January 1999 by Addison Wesley Longman.
Doug Schmidt is an associate professor at the University of California, Irvine. His research focuses on patterns, optimization principles, and empirical analyses of object-oriented techniques that facilitate the development of high-performance, real-time distributed object computing middleware on parallel processing platforms running over high-speed networks and embedded system interconnects. He is the lead author of the book Pattern-Oriented Software Architecture: Patterns for Concurrent and Networked Objects, published in 2000 by Wiley and Sons. He can be contacted at [email protected].