Implementing RMI for C++ Objects


April 2003/Implementing RMI for C++ Objects


Introduction

Distributed systems require that computations running in different address spaces, potentially on different hosts, are able to communicate. "Raw" communications via sockets require applications to engage in application-level protocols to encode and decode messages. The design of such protocols is cumbersome and time consuming.

RPC (Remote Procedure Call) abstracts the communication interface to the level of a procedure call. RPC, however, does not translate well into distributed object systems, where communications are needed between program-level objects in different address spaces.

To match the semantics of object invocation, distributed object systems require RMI (Remote Method Invocation) [1]. RMI provides the ability to call a method on a remote object using the same syntax as for a local object. It essentially allows objects residing on another machine to be treated almost as if they were local to your machine.

In this article, I discuss some ideas that accomplish the RMI access mechanism for remote C++ objects. This technique is not limited to communications between applications written in C++. It is equally effective for implementing RMI access to remote C++ objects from clients written in, say, Java.

The sheer number of acronyms associated with distributed programming is overwhelming: COM, DCOM, IIOP, CORBA, XML-RPC, SOAP, etc. Is there room for RMI? Moreover, is there room for your own RMI? I believe the answer is "yes." RMI occupies its own very distinct functional niche. It does not compete with CORBA and similar environments, whose objectives are substantially more ambitious. The associated expenses and the limitations of the imposed framework are often not justified for projects that do not have application distribution as their primary focus.

As in real life where your modest Toyota Corolla is often a reasonable, faster, and cheaper alternative to a Rolls Royce, your own communications infrastructure is often better suited for your purposes. Although straightforward implementations are common (with C++ objects packed into an XML message, sent over the wire, and unpacked), an RMI infrastructure is a better choice.

Similarly, RMI does not compete with and has not been made obsolete by the SOAP (Simple Object Access Protocol) initiative. RMI is an infrastructure that provides access to remote objects via a familiar interface. SOAP, on the other hand, is an XML-based communication protocol that COM, CORBA, or RMI implementations might employ for communication purposes.

Design

The basic idea behind RMI is fairly straightforward and well developed. It is a classic case of the Remote Proxy pattern thoroughly discussed in [2]. The idea is to provide a local proxy object through which all communications with the actual remote object will be channeled. The main responsibilities of such a proxy as described in [2] are:

  • Providing an interface identical to the real object.
  • Controlling creation, access to, and destruction of the real object.
  • Encoding requests to and decoding replies from the real object.

These goals provide clear guidance for implementation. The interesting part is providing a sufficiently powerful and flexible infrastructure that encapsulates common RMI functionality and enables easy implementation (and even auto-generation) of the RMI code based on the declaration of an actual class. Thanks to C++'s exceptional power and versatility, the implementation of a working prototype is a matter of two evenings' work. However, the devil is in the details. It actually took four times longer to iron out some "wrinkles." Figure 1 shows the proxy connection path to the remote object and relationships between classes comprising the RMI infrastructure.

Getting Started

// Original classes.
class Widget
{
  public: 

  virtual ~Widget();
  Widget(const std::string& name);

  private: ...
};
class Circle : public Widget
{
  public: 

  ~Circle();
  Circle(const std::string& name,
         const Point& center, 
         int radius);

  private: ...
};

// Client-based proxy classes
// generated in rmi_widget.h and
// rmi_circle.h declaration files.
namespace RMI {

struct Widget : public RMIB
{
  ~Widget();
  Widget(const std::string& name);
  Widget(const Widget&);
  Widget& operator=(const Widget&);
  // RMI support ctor.
  Widget(const Call& c) : RMIB(c) {}
};
struct Circle : public Widget
{
  // No dtor needed.
  Circle(const std::string& name,
         const Point& center, 
         int radius);

  Circle(const Circle&);
  Circle& operator=(const Circle&);
  // RMI support ctor.
  Circle(const Call& c) : Widget(c) {}
};
} // End of RMI namespace.

The code fragment above shows two stripped-down classes and the corresponding RMI proxies generated for them. Proxies are essentially interface classes. Therefore, they dutifully replicate public interfaces of the original classes. Unlike their real counterparts, proxies do not have the luxury of implicit function generation, and they always add their own "maintenance" RMI::Proxy::Proxy(const RMI::Call&) constructors.

Generated proxy declarations can be used to write the following code on the client:

// Include proxy declarations.
#include <rmi_widget.h>
#include <rmi_circle.h>

using namespace RMI;

int main()
{  ...
  // Connect to the server.
  RMI::connect(...);
  ...
  Point center(11, 22);
  // An RMI::Circle proxy (not the real
  // Circle object) is created on the
  // client.
  Circle circle("Circle", center, 33);
}

That ordinary-looking (that was the goal!) example does a lot more "under the hood" than it shows. The RMI::Circle proxy constructor communicates with the server, which in turn creates a Circle instance in its own address space and returns an RMI reference to that instance. When the circle proxy goes out of scope, it destroys the remote Circle object that it created and represented.
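The lifecycle described above can be illustrated with a minimal in-process sketch. Everything in it is hypothetical: FakeServer stands in for the network and the real server, integer ids stand in for RMI references, and CircleProxy is a hand-written stand-in for the generated RMI::Circle, not the code from Listing 1.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the remote side: it owns the "real"
// objects and hands out integer ids in place of RMI references.
class FakeServer
{
  public:

  FakeServer() : next_id_(1) {}

  // Handles a "create" request and returns the new object's id.
  int create(const std::string& name)
  {
    int id = next_id_++;
    objects_[id] = name;
    return id;
  }
  // Handles a "destroy" request.
  void destroy(int id) { objects_.erase(id); }

  std::size_t live_objects() const { return objects_.size(); }

  private:

  int next_id_;
  std::map<int, std::string> objects_;
};

// Hypothetical hand-written proxy: construction creates the remote
// object, destruction destroys it.
class CircleProxy
{
  public:

  CircleProxy(FakeServer& server, const std::string& name)
  : server_(server), id_(server.create(name)) {}

  ~CircleProxy() { server_.destroy(id_); }

  int id() const { return id_; }

  private:

  // Copying is out of scope for this sketch.
  CircleProxy(const CircleProxy&);
  CircleProxy& operator=(const CircleProxy&);

  FakeServer& server_;
  int id_;
};

// Returns the number of remote objects alive while a proxy is in scope.
std::size_t live_while_in_scope(FakeServer& server)
{
  CircleProxy circle(server, "Circle");
  return server.live_objects();
}
```

Calling live_while_in_scope() on a fresh FakeServer reports one live object, and the server is empty again once the function returns, which mirrors the scope-driven creation and destruction of the remote Circle.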

The steps necessary for building an RMI-enabled distributed system are very similar to those required by Java RMI [1]:

  1. Write Widget and Circle classes.
  2. Run these classes through the RMI code generator to generate declarations for corresponding RMI::Widget and RMI::Circle proxies and the necessary client/server infrastructure.
  3. Compile generated server code.
  4. Write and compile the client application using generated RMI proxies.
  5. Run the system.

The Java concept of dynamic class loading from the server is not supported. Unlike Java (a programming environment), C++ (a programming language) does not specify bytecode formats. That makes implementation of such a feature unrealistic.

Extending the Interface

class Widget
{ ...
  const std::string& name() const;
};
class Circle : public Widget
{ ...
  const Point& center() const;
  void center(const Point&);
};

namespace RMI {

struct Widget : public RMIB
{ ...
  std::string name() const;
};
struct Circle : public Widget
{ ...
  Point center() const;
  void center(const Point&);
};
} // End of RMI namespace.

Given the interface, writing the client code to create, access, and delete remote objects is hardly different from using real Circle instances locally:

using namespace RMI;
using std::string;

int main()
{ ...
  Point center(11, 22);
  Circle circle("Circle", center, 33);
  string name = circle.name();
  circle.center(Point(33, 44));
}

You might have noticed that, unlike their real counterparts, proxy member functions do not return references or pointers (in a C++ sense). That is hardly surprising. All the data resides on the server. There is nothing in the client address space to which the proxy can return a reference or pointer. Therefore, return values from remote methods are returned (and arguments to remote methods are passed internally) either by an RMI reference (proxy) or by value, using object serialization. Data of any type can be passed to or from a remote method as long as the data is a registered remote object (having an RMI proxy generated for it) or a serializable object (having conversion functions implemented for the type).

However, it should be mentioned that some object types cannot sensibly be passed to or returned from a remote method by value. Most of these objects, such as a file descriptor, encapsulate information that makes sense only within a single address space.

Adding Your Own Types

So far I have not mentioned the Point class used in the examples above. Although it is a no-brainer to deduce the class's declaration, it is impossible to tell from the client code whether Point instances are being created locally or remotely. That is the remarkable degree of transparency provided by the discussed RMI infrastructure. However, for the code to compile, Point needs to be integrated into your distributed environment. To do that, you need to decide in which of the following two ways Point instances are to be created:

  1. Remotely (on the server) and returned or passed by an RMI reference.
  2. Locally (on the client or the server) and, therefore, returned or passed by value.

If you go with the first option, you'll need to generate an RMI::Point proxy in the same way you did for Widget and Circle:

// The original class
class Point
{ ...
  Point(int x, int y);
  ...
  int x_;
  int y_;
};

// Generated RMI proxy
namespace RMI {

struct Point : public RMIB
{
  ~Point(); 
  Point(int x, int y);
  Point(const Point&);
  Point& operator=(const Point&);
  // RMI support ctor.
  Point(const Call& c) : RMIB(c) {}
};
} // End of RMI namespace.

Easy. However, this approach is likely to be somewhat inefficient for small or transient objects. Consider:

{  ...
  RMI::Point center(11, 22);
  RMI::Circle* circle =
    new RMI::Circle("new", center, 33);
}

This trivial fragment will result in the following message exchange between the client and the server:

  1. The client creates an RMI::Point proxy and sends a request to create a remote Point object.
  2. The server sends the reply with the reference to the new Point object.
  3. The client sends a request to create a remote Circle object passing the RMI::Point reference as an argument.
  4. The server sends the reply with the reference to the new Circle object.
  5. The client sends a request to delete the remote Point object when center goes out of scope.
  6. The server sends the confirmation of successful deletion.

This is quite a "conversation" for just two lines of code. Moreover, for every proxy on the client, there is a real object on the server allocated using operator new. Remotely allocating and destroying a small and transient Point object is likely to be unnecessarily expensive in terms of both performance and resources. Instead, you might prefer creating Point objects locally and simply passing them by value.

{  ...
  Point center(11, 22);
  RMI::Circle* circle = new
    RMI::Circle("new", center, 33);
}

I use full type qualifications to highlight a subtle change — a Point instance is created instead of an RMI::Point proxy. The client-server dialog shrinks considerably:

  1. The client sends a request to create a remote Circle object passing a locally created Point object by value as an argument.
  2. The client receives the reply with the reference to the new Circle object.

This approach requires the Point class to be serializable. Object serialization is the process of converting a complete object from the memory-based format into a format better suited for storing the object on disk, sending it over the network, etc. In the context of the discussed RMI infrastructure, the serialization requirement translates into having a pair of conversion routines for the Point type:

namespace RMI
{
  string
  convert(const Point& point)
  {
    ostringstream stream;
    stream << point.x()
           << " "
           << point.y();
    return stream.str();
  }
  template<> 
  Point
  convert<Point>(const string& str)
  {
    int x, y;
    istringstream stream(str);
    stream >> x >> y;
    return Point(x, y);
  }
}

The RMI infrastructure already provides serialization routines for some classes and primitive types (int, double, std::string, etc.). The RMI namespace is open — add serialization routines for more types when the need arises.
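As a self-contained illustration of such a pair of routines, the sketch below declares the primary convert template that the explicit specialization relies on and round-trips a Point through its textual form. The Point class shown here is an assumption (the article elides its accessors), and the primary template declaration is my guess at what the infrastructure header provides.

```cpp
#include <sstream>
#include <string>

// Hypothetical value type; the article's Point is assumed to look
// similar, with x() and y() accessors.
class Point
{
  public:

  Point(int x, int y) : x_(x), y_(y) {}
  int x() const { return x_; }
  int y() const { return y_; }

  private:

  int x_;
  int y_;
};

namespace RMI {

// Primary template (declaration only): deserialization is provided
// per type through explicit specializations.
template<class Type> Type convert(const std::string& str);

// Serialization: Point -> "x y".
inline std::string convert(const Point& point)
{
  std::ostringstream stream;
  stream << point.x() << " " << point.y();
  return stream.str();
}

// Deserialization: "x y" -> Point.
template<>
inline Point convert<Point>(const std::string& str)
{
  int x = 0, y = 0;
  std::istringstream stream(str);
  stream >> x >> y;
  return Point(x, y);
}

} // End of RMI namespace.
```

With these in place, RMI::convert(Point(11, 22)) produces the text that travels over the wire, and RMI::convert<Point>() applied to that text rebuilds an equivalent Point on the other side.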

However, I must warn you against getting too adventurous when serializing your classes. Java RMI uses a technology called object serialization to transform an object into a linear format. That technology essentially flattens an object and any objects it references. An object can be simple and self-contained, or it can refer to other objects in complex graph-like (or even cyclic) structures. In general terms, serializing an object is not a trivial task. Unlike Java, C++ does not address the issue, and you'll have to integrate libraries that take care of serialization if you want to use it extensively. A safer approach is to limit the use of serialization and rely on passing objects by an RMI reference. This sensible approach is favored by Java RMI with its "remote object" parameter type.

Behind the Interface

Listing 1 shows the implementation generated for the discussed examples. You may find it boring — it was meant to be so for easy auto-generation. The RMI infrastructure classes are working "double shift" for those generated functions to be as uninteresting as they are. All the client-server communication complexity is hidden inside the RMI::Call class.

class Call
{ ...
  typedef Parameter arg; 

  Call(const RMIB& object_id,
       const char* func_signature, 
       const arg& arg1 = arg::none(),
       const arg& arg2 = arg::none(),
       const arg& arg3 = arg::none())
  : result_(call(object_id,
                 func_signature,
                 arg1, arg2, arg3))
  {
  }
  ...
  private: Parameter result_;
};

// Packs and sends the request, then
// returns the first argument of the
// unpacked reply (the return value).
static Parameter 
call(
  const RMIB& instance,
  const char*      func, 
  const Parameter& arg1, 
  const Parameter& arg2, 
  const Parameter& arg3)
{
  write(pack(
    instance, func, arg1, arg2, arg3));
  return unpack(read()).arg(0);
}

The Call constructor packs the request (pack), sends it to the server (write), reads and unpacks the reply (read and unpack), and stores the return value in Call::result_. Call takes the remote object ID (the object's memory address in the server address space), the signature of the remote method, and the arguments for the method. When an RMI proxy constructor or a static function is called, there is no remote object ID, so null is passed instead (see the constructors in Listing 1). I chose function signatures to be in the familiar C++ declaration form. For example, the Circle constructor signature will be Circle::Circle(const std::string&, const Point&, int). The choice is arbitrary; any string that uniquely identifies the remote method will do.

Point center(11, 22);
Circle circle("new", center, 33);

For the fragment, the remote call to the Circle::Circle(const std::string&, const Point&, int) function will be packed in one of the following formats:

"0x0^Circle::Circle(const std::string&, 
  const Point&, int)^new^0x47180^33" 
"0x0^Circle::Circle(const std::string&, 
  const Point&, int)^new^11 22^33"

depending on whether center is passed by reference ("0x47180") or by value ("11 22"). The format is a simple ASCII string: "object-id^function-signature^arg1^arg2^arg3". This format is not critical and can be ASCII or binary, proprietary, proprietary XML, XML-based SOAP, etc. as long as the client and the server understand each other. However, you might want to consider Java-supported protocols (JRMP, RMI-IIOP, etc.) for easier integration with the standard Java RMI. For my C++-only testing purposes, the overhead of something like XML and SOAP was not justified, as the transferred data is well structured and variability is minimal.
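A minimal sketch of packing and unpacking such a '^'-delimited message follows. The pack and unpack functions here are illustrative stand-ins that work on plain string fields, not the article's actual routines, and they assume that no field contains the '^' character (no escaping is attempted).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Joins fields with '^', mirroring the
// "object-id^function-signature^arg1^arg2^arg3" layout.
// Assumption: no field contains '^' (this sketch does no escaping).
std::string pack(const std::vector<std::string>& fields)
{
  std::string message;
  for (std::size_t i = 0; i < fields.size(); ++i)
  {
    if (i != 0) message += '^';
    message += fields[i];
  }
  return message;
}

// Splits a packed message back into its fields.
std::vector<std::string> unpack(const std::string& message)
{
  std::vector<std::string> fields;
  std::string::size_type begin = 0;
  for (;;)
  {
    std::string::size_type end = message.find('^', begin);
    if (end == std::string::npos)
    {
      // Last field: everything after the final separator.
      fields.push_back(message.substr(begin));
      return fields;
    }
    fields.push_back(message.substr(begin, end - begin));
    begin = end + 1;
  }
}
```

A production format would need an escaping rule or length prefixes, since a signature such as one for operator^ or a string argument could legitimately contain the separator.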

Once packed, a request is simply sent over to the server. I'll get to the details of the request's processing by the server later. At the moment, it is sufficient to know that the server:

  1. Reads and unpacks the message.
  2. Calls the appropriate remote method on the appropriate remote object.
  3. Packs the return value back to the message.
  4. Returns the message to the client.

Therefore, the client will get back something like "0x0^Circle::Circle(const std::string&, const Point&, int)^0x47280". The only argument ("0x47280") represents the return value from the remote method. For the remote call to Circle::Circle(...), that value will be the address of a newly created remote Circle instance.

While still within the RMI::Call constructor, the reply is unpacked, the return value is converted to a Parameter instance, and stored. Then via the chain of Proxy(const Call&) constructors, the newly created Call instance is transferred to the RMIB base-class constructor where the object ID is initialized with the call's result.

RMI::RMIB::RMIB(const Call& call)
: id_(call.result()), bound_(false)
{
}

Handling Different Types of Arguments

In order to uniformly process arguments of different types, Call uses the help of the Parameter class. Parameter is simple but very convenient. It encapsulates type conversion functionality and employs convert serialization functions to do the job:

class Parameter
{ ...
  // Create a Parameter instance
  // using conversion routines.

  template<class Type>
  Parameter(const Type& value)
  : value_(RMI::convert(value)) {}

  // Implicit conversion back
  // to the original types.

  template<class Type>
  operator Type() const
  {
    return RMI::convert<Type>(value_);
  }
  static const Parameter& none();

  private: std::string value_;
};
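The following self-contained sketch shows the same idea in a form that compiles on its own. The to_text and from_text helpers are hypothetical stand-ins for the RMI::convert routines and simply rely on stream insertion and extraction, so they only suit types such as int and double; describe() is likewise an illustrative helper.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-ins for the RMI::convert routines, built on
// stream insertion and extraction.
template<class Type>
std::string to_text(const Type& value)
{
  std::ostringstream stream;
  stream << value;
  return stream.str();
}

// Note: extraction stops at whitespace, so text containing spaces
// would need a dedicated routine.
template<class Type>
Type from_text(const std::string& text)
{
  Type value = Type();
  std::istringstream stream(text);
  stream >> value;
  return value;
}

class Parameter
{
  public:

  // Any streamable value is stored in its textual form.
  template<class Type>
  Parameter(const Type& value) : value_(to_text(value)) {}

  // Implicit conversion back to the type the caller asks for.
  template<class Type>
  operator Type() const { return from_text<Type>(value_); }

  const std::string& text() const { return value_; }

  private:

  std::string value_;
};

// Arguments of different types are accepted uniformly because each
// one converts implicitly to Parameter.
std::string describe(const Parameter& a, const Parameter& b)
{
  return a.text() + "^" + b.text();
}
```

The templated constructor lets a call such as describe(33, 2.5) accept mixed argument types without overloads, and the templated conversion operator lets the receiver write int radius = parameter; to get the value back.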

Processing Remote Calls on the Server

While the code for the RMI proxies is being generated, the corresponding server-side infrastructure is generated as well. Listing 2 shows the server code generated for the Widget and Circle examples. When the server is running, it merely reads a message from the client, unpacks the message, looks up the function matching the provided signature in the function map, calls the function, packs the result, and sends it back to the client.

void
RMI::Server::run() const
{
  // run() is const, so a const
  // iterator is used for the lookup.
  typedef Functions::const_iterator Fit;

  // Enable connectivity infrastructure.
  RMI::connect(...);

  for (;;)
  {
    Message msg = unpack(RMI::read());
    const string& func = msg.function();
    Fit it = functions_.find(func);

    if (it == functions_.end())
    {
      // Function not found.
    }
    else 
    {
      // Call function,
      // pack the result,
      // send back to the client.
      RMI::write(
        it->second(msg).pack());
    }
  }
}
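The lookup-and-call step can be sketched in isolation as follows. The Request structure, the handler functions, and the error reply for an unknown signature are all assumptions made for illustration; the article's Message class and the generated code in Listing 2 are richer than this.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified request: the article's Message also
// carries the object id and up to three arguments.
struct Request
{
  std::string function; // Signature used as the lookup key.
  std::string argument; // Single textual argument.
};

// A handler takes the request and returns the textual reply.
typedef std::string (*Handler)(const Request&);
typedef std::map<std::string, Handler> Functions;

std::string widget_name(const Request&) { return "Widget"; }
std::string echo(const Request& request) { return request.argument; }

// Registers handlers under their signatures, as generated server
// code might do.
Functions make_functions()
{
  Functions functions;
  functions["Widget::name() const"] = &widget_name;
  functions["echo(const std::string&)"] = &echo;
  return functions;
}

// One iteration of the server loop: look up and call the function,
// or report the failure (the error text is this sketch's choice).
std::string dispatch(const Functions& functions, const Request& request)
{
  Functions::const_iterator it = functions.find(request.function);

  if (it == functions.end())
    return "error: unknown function " + request.function;

  return it->second(request);
}
```

Because the map is keyed by the signature string, adding new entries does not disturb the entries that existing clients already call.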

Server-Based Object Persistency

A remote Foo instance on the server is created and destroyed by the corresponding RMI::Foo proxy constructor and destructor on the client. When created, remote objects are bound to their proxies. (Note the bind() calls in the proxy constructors in Listing 1.) The existence of server-based objects is defined by the existence of their client-based proxies. Consequently, the termination of a client-based session will lead to the destruction of all server-based objects created by that session. This is not always the desired behavior.

A modest part of the RMIB interface provides the means for configuring the link between a local proxy and the remote object. That functionality allows extending the lifespan of a server-based object beyond the client session that created the object.

class RMIB
{ ...
  RMIB& rmi_name(const string&);
  RMIB& bind(bool =true);
  bool is_bound() const;
};

template<class T> 
T find(const string& object_name);

When a proxy is being destroyed, the destruction message is sent to the corresponding remote object only if the object is still bound to the proxy being deleted (see proxy destructors in Listing 1). Otherwise, the remote object outlives the proxy that originally created it and remains in the server address space. If a name is assigned to the remote object, it can be later identified by that name and associated with a new proxy.

{
  Point center(11, 22);
  Circle circle("Circle", center, 33);

  // Assign a name to the rem. object
  // and break the proxy-object bond.
  circle.rmi_name("My circle");
  circle.bind(false);

  // RMI proxy goes out of scope and is
  // deleted. The remote object remains
  // in the server address space.
}
...
{
  // Find the remote object and
  // associate new proxy with it.
  Circle circle = RMI::find<Circle>(
                  "My circle");

  // Bind proxy and rem. object.
  circle.bind();

  // Proxy goes out of scope.
  // Destroys the bound remote object.
}
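The effect of the bound flag can be reproduced with a small in-process sketch. Registry is a hypothetical stand-in for the server's object table, and Proxy is a hand-written stand-in for RMIB; only the names rmi_name, bind, and is_bound come from the article.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical in-process stand-in for the server's object table.
class Registry
{
  public:

  Registry() : next_id_(1) {}

  int create() { int id = next_id_++; names_[id] = ""; return id; }
  void destroy(int id) { names_.erase(id); }
  void name(int id, const std::string& n) { names_[id] = n; }

  // Returns the id registered under the name, or 0 if there is none.
  int find(const std::string& n) const
  {
    std::map<int, std::string>::const_iterator it;
    for (it = names_.begin(); it != names_.end(); ++it)
      if (it->second == n) return it->first;
    return 0;
  }
  std::size_t size() const { return names_.size(); }

  private:

  int next_id_;
  std::map<int, std::string> names_;
};

// Hypothetical proxy: destroys the remote object only while bound.
class Proxy
{
  public:

  // Creates a new remote object and binds to it.
  explicit Proxy(Registry& r)
  : registry_(r), id_(r.create()), bound_(true) {}

  // Attaches to an existing remote object without binding.
  Proxy(Registry& r, int id)
  : registry_(r), id_(id), bound_(false) {}

  ~Proxy() { if (bound_) registry_.destroy(id_); }

  Proxy& rmi_name(const std::string& n)
  { registry_.name(id_, n); return *this; }
  Proxy& bind(bool b = true) { bound_ = b; return *this; }
  bool is_bound() const { return bound_; }

  private:

  Proxy(const Proxy&);
  Proxy& operator=(const Proxy&);

  Registry& registry_;
  int id_;
  bool bound_;
};

// First session: create, name, and unbind; the object survives.
void first_session(Registry& r)
{
  Proxy circle(r);
  circle.rmi_name("My circle").bind(false);
}

// Second session: find by name, rebind, and let the proxy destroy it.
void second_session(Registry& r)
{
  Proxy circle(r, r.find("My circle"));
  circle.bind();
}
```

After first_session() the registry still holds the named object; after second_session() it is empty, because the rebound proxy took over responsibility for the object's destruction.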

Versioning with RMI

Once the initial version of a product is shipped, further software development usually becomes substantially more difficult. Developers must take into consideration the existing, already installed product base and ensure backward compatibility. Versioning difficulties in CORBA are well discussed in [4]. The RMI infrastructure discussed here makes modifications to the remote classes totally transparent to existing clients, as long as the signatures of the methods already used by those clients remain the same.

More Things to Do

As I mentioned earlier, the provided code (available for download at <www.cuj.com/code>) is merely a working prototype. Its sole purpose is to illustrate the ideas discussed in this article. Many aspects of C++ functionality have not been addressed in an attempt to stay focused on the principal idea and to keep the article within reasonable limits. Therefore, I am sure the implementation has a lot of room for improvement and further development. For example, you won't find many safety checks. You might want to check for the correct number of passed arguments. The RMI code generator builds the code for the client and the server at the same time, so I didn't need such a check for my testing purposes, but it might be necessary to enable independent client and server upgrades.
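One possible shape for such an argument-count check is sketched below. It is not part of the provided code: declared_arity and arity_matches are hypothetical helpers that count the parameters in a signature string and compare the count with the number of arguments actually received. The sketch assumes the parameter list is the first parenthesized group in the signature, so a signature for operator() would need special handling.

```cpp
#include <cstddef>
#include <string>

// Counts the parameters declared in a signature such as
// "Circle::Circle(const std::string&, const Point&, int)".
// Assumptions: the parameter list is the first parenthesized group,
// and commas nested inside <...> or (...) belong to one parameter.
std::size_t declared_arity(const std::string& signature)
{
  std::string::size_type open = signature.find('(');
  if (open == std::string::npos) return 0;

  std::size_t commas = 0;
  int depth = 0;
  bool any = false; // True once the list holds a non-space character.

  for (std::string::size_type i = open + 1; i < signature.size(); ++i)
  {
    const char c = signature[i];
    if (c == ')' && depth == 0) break; // End of the parameter list.
    if (c == '<' || c == '(') ++depth;
    else if (c == '>' || c == ')') --depth;
    else if (c == ',' && depth == 0) ++commas;
    if (c != ' ') any = true;
  }
  return any ? commas + 1 : 0;
}

// True if the number of arguments received matches the signature.
bool arity_matches(const std::string& signature, std::size_t passed)
{
  return declared_arity(signature) == passed;
}
```

A server could run this check after unpacking a request and before calling the function, rejecting messages from clients built against a different version of the generated code.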

The examples and the code on the CUJ website are far from being exhaustive. However, they employ and demonstrate some commonly used C++ functionality like inheritance, static member functions, implicitly generated member functions, virtual functions, etc. Although that functionality might be sufficient for many applications, many aspects of C++ still have not been addressed, such as const attributes, template classes, multiple inheritance, exception handling, special cases like Foo::Foo(const Foo&, int =0) where the same constructor is the copy constructor and a non-copy constructor, and much more.

The RMI Code Generator

The code implementing RMI proxies is very regular if not boring. Writing such code is better left to an automatic code generator like the Java rmic compiler. I won't go into the implementation details of such a code parser. It can be written using your favorite tools such as Korn shell, sed, yacc, lex, Perl, or even "raw" C++. Figures 2 and 3 show the state transition diagram for such a parser.
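To give a feel for how regular the generated code is, here is a minimal sketch of the emission half of such a generator. It covers no parsing at all: the Method structure is a hypothetical description that a parser front end would have to fill in, and generate_proxy is an illustrative function, not the tool used for this article. Constructors, the destructor, and the copy operations are left out to keep the sketch short.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one public member function, as a
// parser front end might produce it. Return types are assumed to be
// already stripped of references (proxies return by value).
struct Method
{
  std::string result;
  std::string name;
  std::string params;
  bool is_const;
};

// Emits a proxy declaration in the style of the examples above.
std::string generate_proxy(
  const std::string& cls,
  const std::string& base,
  const std::vector<Method>& methods)
{
  std::string out = "struct " + cls + " : public " + base + "\n{\n";

  for (std::size_t i = 0; i < methods.size(); ++i)
  {
    const Method& m = methods[i];
    out += "  " + m.result + " " + m.name + "(" + m.params + ")";
    out += m.is_const ? " const;\n" : ";\n";
  }
  // Every proxy gets the "maintenance" constructor taking a Call.
  out += "  // RMI support ctor.\n";
  out += "  " + cls + "(const Call& c) : " + base + "(c) {}\n";
  out += "};\n";
  return out;
}
```

Given a description of Circle with a single const member function center() returning Point, the function emits a struct Circle deriving from Widget with that member and the Call-based constructor.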

Environment

The code was compiled using gcc 3.0 on Sun SPARC running Solaris 2.6. Do not waste your time trying to compile the code using non-ANSI-compliant "oldies" such as Sun's C++ 5.0 or Microsoft Visual C++ 6.0.

References

[1] jGuru. Fundamentals of RMI, <http://developer.java.sun.com/developer/onlineTraining/rmi/RMI.html>.

[2] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley, 1995).

[3] Mary Campione, Kathy Walrath, and Alison Huml. The Java Tutorial (Addison-Wesley, 2000), <http://java.sun.com/docs/books/tutorial/java/nutsandbolts/datatypes.html>.

[4] Douglas C. Schmidt and Steve Vinoski. "Object Interconnections: CORBA and XML, Part 1: Versioning," C/C++ Users Journal C++ Experts Forum, May 2001, <www.cuj.com/experts/1905/vinoski.htm>.

Download the Code

<batov.zip>

About the Author

With a beginning in machine code (yes, those zeros and ones) some 25 years ago, Vladimir Batov has developed software ever since for nuclear power stations, air traffic control, military radar, many other things, and just for fun. These days apart from his main interest in C++ and software design, he enjoys good books, tennis, sunset over the bay in Melbourne, Australia, and digging on the beach with his three-year-old daughter. He can be reached at [email protected].

