Dr. Dobb's is part of the Informa Tech Division of Informa PLC




ODBC Driver Development


Packaging Calls in XML-SOAP and Communicating Over Sockets

I mentioned earlier that the calls received by your driver are packaged using XML-SOAP and communicated to the server via sockets. The server responds with an XML-SOAP response on the same socket, which is interpreted by the driver and then sent to the application as per the ODBC standard. The key point is that the communication style between your driver and datasource or database server is your choice. The application is not interested in knowing how you obtain the data; it just expects it in the buffers it has specified and in a manner defined by ODBC.

XML is a platform-neutral data representation format. I believe two attributes make it especially useful: it is text based, and it represents hierarchical data easily. With XML, data can be serialized into a transmissible form that is easily decoded on any platform. SOAP defines an encoding scheme on top of XML and so inherits all of XML's advantages; SOAP does not exist apart from XML. Its primary goal is interoperability when XML-encoded requests and responses are used for IPC/RPC. The SOAP framework requires only a few XML tags, and data is serialized in a standard fashion according to its type. As a result, sender and receiver do not need to negotiate the common parts of every communication.

Table 2 shows a call to the function GetAccountBalance with one parameter, the account number, encoded in plain XML and in XML-SOAP.

Table 2: Encoding a call in XML and SOAP/XML.

I have kept the tags required by SOAP in uppercase. As you can see from Table 2, SOAP requires you to put your request-response in an ENVELOPE tag and divide it into two parts with HEADER and BODY tags.
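As a rough illustration of the envelope layout just described, the following sketch wraps a one-parameter call in ENVELOPE, HEADER, and BODY tags by plain string building. The function names EscapeXml and EncodeCall and the parameter tag ACCOUNTNO used in the usage note are hypothetical, chosen only for this illustration. The tags follow the simplified uppercase convention of this article rather than the namespace-qualified names in the SOAP specification.

```cpp
#include <cassert>
#include <string>

// Replace the characters that are special inside XML text content.
std::string EscapeXml(const std::string& text) {
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;";  break;
            case '>': out += "&gt;";  break;
            default:  out += c;       break;
        }
    }
    return out;
}

// Wrap a one-parameter call in the ENVELOPE/HEADER/BODY layout.
// The method and parameter element names are supplied by the caller;
// they are whatever the client and server have agreed on.
std::string EncodeCall(const std::string& method,
                       const std::string& paramName,
                       const std::string& paramValue) {
    assert(!method.empty() && !paramName.empty());  // element names must not be empty
    return "<ENVELOPE><HEADER></HEADER><BODY><" + method + "><" + paramName + ">" +
           EscapeXml(paramValue) +
           "</" + paramName + "></" + method + "></BODY></ENVELOPE>";
}
```

Calling EncodeCall("GetAccountBalance", "ACCOUNTNO", "1001") yields a single-line envelope whose BODY holds a GetAccountBalance element with one ACCOUNTNO child.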

When the response is constructed, the string Response is appended to the function or method name, so GetAccountBalance becomes GetAccountBalanceResponse. If an error occurs, a FAULT tag is used instead of the method name, with details of the error such as its code and message. Table 3 shows a response on success and a response on error.

Table 3: SOAP response for success and error.
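The naming rule for responses can be sketched in a few lines. The helper names ResponseElementName and ClassifyReply, and the ReplyKind enumeration, are hypothetical. The classification here is a deliberately naive substring search; a real driver would inspect the parsed tree instead.

```cpp
#include <cassert>
#include <string>

// Name of the element expected in the body of a successful response.
std::string ResponseElementName(const std::string& method) {
    assert(!method.empty());
    return method + "Response";
}

enum class ReplyKind { Success, Fault, Unknown };

// Decide whether a reply carries a FAULT element or the expected response element.
// Substring search only: good enough for a sketch, not for production parsing.
ReplyKind ClassifyReply(const std::string& xml, const std::string& method) {
    if (xml.find("<FAULT") != std::string::npos) {
        return ReplyKind::Fault;
    }
    if (xml.find("<" + ResponseElementName(method)) != std::string::npos) {
        return ReplyKind::Success;
    }
    return ReplyKind::Unknown;
}
```

With this sketch, a caller would check for ReplyKind::Fault first and only then look for the data inside the response element.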

That is almost all there is to SOAP. You may be surprised, but this relatively new and simple-looking text-based protocol is challenging COM and CORBA as far as RPCs are concerned. Its clear advantages are ease of design and implementation, compatibility across various systems, and the ability to piggyback on protocols like HTTP and so perform RPC even over the Internet.

One thing I have purposely left out is the way complex or compound data types such as arrays or structures are encoded in SOAP. If you are interested in learning more, check the latest SOAP specification on the W3C site or read the excellent article by Don Box, “A Young Person’s Guide to the Simple Object Access Protocol.” Now that we have some idea about encoded calls and responses, let’s see how sockets transport these across process, machine, and network boundaries.

A socket represents an endpoint for communication between processes across a network transport. The two endpoints are connected through a communication link, an abstract term for everything involved in transmitting data. Sockets become easy to understand if you think of opening a file that you can write to or read from. The difference is that the name of the file in this case is a network address and a port number, so instead of test.txt you have something like 127.0.0.1:80. Also, the server produces the content of this file on the fly. The socket API has remained almost unchanged since the early Unix days (hence the common name Berkeley sockets).

You do not have to worry about how sockets are implemented. As long as TCP/IP is installed and you can see a machine on the LAN or reach it on the Internet, you should be able to open a connection to it with the socket API, provided there is a listener on the other end. To make things clear, let’s take an everyday example of sockets in action. Internet Explorer, or any other browser, opens a socket connection to the site using its IP address and port 80, the standard port for HTTP communication. What follows is simple socket IO. The steps in the process are:

  1. Browser opens a socket connection with server address and port 80
  2. Browser writes the request for a resource (e.g., an HTML page) to the socket
  3. Web-server reads the request from the socket and prepares a response (e.g., the requested HTML page)
  4. Web-server writes back the response to the socket
  5. Browser reads it, interprets it as per Content-Type, and renders it as necessary

The browser here is a socket client and the web-server is the socket server. Our ODBC driver acts as a socket client and communicates with our database server, which acts as a socket server.

Let’s take an example, in the context of our driver, of how this whole request-response process works. The client calls SQLExecDirect for the query SELECT * FROM authors. I will show you the actual encoding that the driver ODBCDRV1 (provided with the download) produces for this call, and also the response template, since the actual data in the response depends on the server.

Looking at Example 4, the use of the ENVELOPE, HEADER, and BODY tags as part of a SOAP request should be obvious. I have put a connection ID in the header. Since the server needs to know which client is making the request, the client should always send this information with the request. The header seemed a good place to me, but again SOAP does not specify anything here; it’s your choice. Note that the samples I have provided for download do not use this ID and work in a stateless fashion to keep things simple.

Example 4.

<ENVELOPE>
    <HEADER>     
        <CONNECTID>12345</CONNECTID>
    </HEADER>
    <BODY>
        <SQLExecStmt TYPE="General">
            <STMT>SELECT * FROM authors</STMT>
        </SQLExecStmt>
    </BODY>
</ENVELOPE>

The choice of method name is also between you and your server, so any name that you program your server to recognize will do. I have used SQLExecStmt, but it could be any valid XML tag name, or even SQLExecDirect, the name of the API. Parameters to functions are encoded as child elements of the method name element. The query statement is sent as the STMT element, a child of the method name element. I have used the terms tag and element interchangeably throughout the text; both refer to any valid XML name inside angle brackets.

This is how most of the calls are encoded. Although you will see a reasonably long list of exported function calls in your DEF file, there are only a few types of call that you need to encode, and a generic encoder with a few parameters will work well. Before we see how this request is sent to the server using a socket, let us go a little deeper and see how the encoding is done. This is accomplished using the XML parser and the XML DOM (Document Object Model). A request, as shown in Example 4, is prepared directly using DOM programming, while a response received as a stream of bytes is processed into DOM using the parser. I describe both of these cases below. A complete working example is available in the download as XMLTEST.

Encoding XML Request or Response Using DOM Methods

The Document Object Model is a platform and language-neutral interface that defines a document as a structure that can be programmatically manipulated with ease. If you want to search for the STMT tag and extract its immediate content, the DOM would expose a standard walk and search method for this. More information on this is available at http://www.w3.org/DOM/.
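To make the walk-and-search idea concrete, here is a minimal sketch of a depth-first search for a tag name in a small element tree. The Element structure and the helpers FindFirst and AddChild are hypothetical stand-ins, not the W3C DOM interfaces and not the XMLNode class from the download.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM element: a name, optional text, and child elements.
struct Element {
    std::string name;
    std::string text;
    std::vector<std::unique_ptr<Element>> children;
};

// Depth-first search for the first element with the given name.
// Returns nullptr when no such element exists in the subtree.
const Element* FindFirst(const Element& node, const std::string& name) {
    if (node.name == name) {
        return &node;
    }
    for (const auto& child : node.children) {
        if (const Element* hit = FindFirst(*child, name)) {
            return hit;
        }
    }
    return nullptr;
}

// Convenience helper: add a child element and return a reference to it.
Element& AddChild(Element& parent, const std::string& name, const std::string& text = "") {
    assert(!name.empty());  // an element needs a name
    parent.children.push_back(std::make_unique<Element>());
    Element& child = *parent.children.back();
    child.name = name;
    child.text = text;
    return child;
}
```

Given a tree built for the request in Example 4, FindFirst(root, "STMT") would return the element whose text is the query string, and a search for a tag that is not present returns nullptr.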

Since XML is hierarchical and also supports a list of attributes within an element, we have to provide a structure which can store a tree of elements as well as contain a linked list of attributes. I have implemented this as a C++ class XMLNode in the file XMLTREE.CPP. The request shown in Example 4 can be encoded with ENVELOPE as the root node containing two child elements HEADER and BODY. The HEADER in turn contains the CONNECTID and so on. Example 5 shows how this can be done.

Example 5.

    XMLNode*        root;
    XMLNode*        node1;
    XMLNode*        node2;
    XMLNode*        node3;
    XMLNode*        nodeattr;

    // create the root element
    root = XMLNode::CreateElement ( "ENVELOPE", NULL );

    // create the HEADER and append it to ENVELOPE
    node1 = XMLNode::CreateElement ( "HEADER", NULL );
    root->AppendChildNodeBeforeX( node1, NULL );

    // create the CONNECTID and append it to HEADER
    node2 = XMLNode::CreateElement ( "CONNECTID", "12345" );
    node1->AppendChildNodeBeforeX( node2, NULL );

    // create the BODY and append it to ENVELOPE
    node1 = XMLNode::CreateElement ( "BODY", NULL );
    root->AppendChildNodeBeforeX( node1, NULL );

    // create methodname tag and append it to BODY
    node2 = XMLNode::CreateElement ( "SQLExecStmt", NULL );
    node1->AppendChildNodeBeforeX( node2, NULL );

    // create STMT tag and append it to method name tag
    node3 = XMLNode::CreateElement ( "STMT", "SELECT * FROM authors" );
    node2->AppendChildNodeBeforeX( node3, NULL );

    // create the statement type attribute and attach it to the method name tag,
    // matching TYPE="General" on SQLExecStmt in Example 4
    nodeattr = XMLNode::CreateAttribute ( "TYPE", "General" );
    node2->AppendChildNodeBeforeX( nodeattr, NULL );

    // stream the tree to a file
    root->StreamToFile ( 0, _FILE_STDOUT, stdout );

The samples in the download are built around my parser and DOM implementations, but you are free to adopt anything of your choice, such as MSXML (Microsoft), Xerces (Apache Project), or Expat (James Clark), provided you are willing to change the remaining code. I suggest that you first get comfortable with ODBC driver development and then move on to changing these underlying tools and technologies.

The root node now encapsulates the complete request. CreateElement is a static member function of the XMLNode class that allocates an XMLNode object and assigns it the name given in the first parameter. If the element is supposed to contain text, you can pass the text as the second parameter; a child node is then allocated and its value set to that text. AppendChildNodeBeforeX creates the associations between nodes: it attaches the specified XMLNode object as a child of another XMLNode object, placing it before the existing child given in the second parameter, or making it the last child if that parameter is NULL. The StreamToFile method streams the request to the screen, a file, or a socket by walking the tree and writing its content to the specified target. I explained sockets conceptually earlier in this section; let’s now see how a socket client can be implemented.
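The insert-before semantics and the tree walk described above can be sketched with a small stand-in class. Node, AppendChildBefore, Stream, and ToString are hypothetical names; this is not the XMLNode implementation from the download. As simplifications, the sketch stores text directly in the element rather than in a separate child node, and it omits attributes and escaping.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tree node; layout and names are illustrative only.
struct Node {
    std::string name;
    std::string text;
    std::vector<std::unique_ptr<Node>> children;

    explicit Node(std::string n, std::string t = "")
        : name(std::move(n)), text(std::move(t)) {}

    // Insert `child` before the existing child `before`.
    // When `before` is null (or is not a child of this node), append at the end.
    Node* AppendChildBefore(std::unique_ptr<Node> child, const Node* before) {
        assert(child != nullptr);
        auto pos = std::find_if(children.begin(), children.end(),
                                [before](const std::unique_ptr<Node>& c) {
                                    return c.get() == before;
                                });
        return children.insert(pos, std::move(child))->get();
    }

    // Walk the tree depth-first and write it to any output stream.
    void Stream(std::ostream& out) const {
        out << '<' << name << '>' << text;
        for (const auto& c : children) {
            c->Stream(out);
        }
        out << "</" << name << '>';
    }
};

// Stream a tree into a string, for example before handing it to a socket send.
std::string ToString(const Node& root) {
    std::ostringstream out;
    root.Stream(out);
    return out.str();
}
```

Because Stream writes to a std::ostream, the same walk can target the console, a file stream, or a string buffer that is later written to a socket.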

Socket Client and Server

On Windows, you need to initialize the Winsock library before making any other socket-related call. The initialization call, WSAStartup, also requires you to specify the Winsock version (1.0, 1.1, 2.0) your program needs. Next, you create a socket using the socket API call, specifying the address family (AF_INET for Internet-style addresses), the type of socket (stream or datagram), and the protocol to use (zero selects the default protocol for that family and type, which is TCP for a stream socket). The socket is now created but not connected. In the call to connect, specify the target server address along with the port and address family using the SOCKADDR_IN structure defined in winsock.h. This call actually connects you to the specified socket server.

You must have the correct IP address of the server and the port on which the server is listening, or else the connection will fail. You also need to convert the port number from host byte order to network byte order so that it is independent of the platform on which the client or server is running. I mention this because Intel-based systems store integers least significant byte first (little-endian), whereas network byte order, like many RISC-based systems, is most significant byte first (big-endian). Also, if you have a DNS name like www.yahoo.com, you can use the gethostbyname API to convert it to an IP address. This API is not shown in the example but is part of the samples in the download.
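The effect of the byte-order conversion is easy to see by looking at the bytes that htons leaves in memory. The sketch below uses the port 9999 from the examples in this section; PortBytesOnWire is a hypothetical helper name, and the conditional include is only there so the sketch builds on both Windows and POSIX systems.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#ifdef _WIN32
#include <winsock2.h>   // htons, ntohs; link with ws2_32.lib
#else
#include <arpa/inet.h>  // htons, ntohs
#endif

// Return the two bytes of a port number exactly as they sit in memory
// after conversion to network byte order.
std::array<std::uint8_t, 2> PortBytesOnWire(std::uint16_t port) {
    const std::uint16_t net = htons(port);
    assert(ntohs(net) == port);  // the conversion round-trips
    std::array<std::uint8_t, 2> bytes{};
    std::memcpy(bytes.data(), &net, sizeof net);
    return bytes;
}
```

Port 9999 is 0x270F, so the result is the byte 0x27 followed by 0x0F regardless of whether the host is little-endian or big-endian.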

Once connected, you can send the request using send and receive the response sent by the server using recv. After the request-response session is complete, you can indicate the end of communication using shutdown and finally destroy the socket using closesocket. Example 6 shows this. It connects to the socket server on the local machine (127.0.0.1) on port 9999. I will explain below the implementation of a basic socket server to serve this client. The client sends the string “Hello server” to the server, and the server sends back the string “Hello Client”. Error checking is not shown for brevity. You must link this program with wsock32.lib (Winsock 1.1) or ws2_32.lib (Winsock 2) for it to work.

Example 6.

int             status;     // status or return values
SOCKET          sckt;       // socket handle, like a file handle
WSADATA         wsadata;    // winsock requirement for startup
SOCKADDR_IN     address;    // structure for IP, port etc.
char            buf[128];   // buffer to recv response

    // winsock initialization, specify the version required
    status = WSAStartup ( MAKEWORD( 1, 1 ), &wsadata );

    // create a socket, no address associated as yet
    sckt = socket ( AF_INET, SOCK_STREAM, 0 );  

    // address details of server to communicate with
    address.sin_addr.s_addr = inet_addr ( "127.0.0.1" );
    address.sin_family            =  AF_INET;            // address family
    address.sin_port               =  htons ( 9999 );    // port number

    // connect to server analogous to opening a file
    status = connect ( sckt, ( struct sockaddr* )&address, sizeof(address));

    // send a hello to server/listener
    status = send ( sckt, "Hello server", strlen ( "Hello server" ), 0 );

    // recv response from server (limited to 128 bytes )
    status = recv ( sckt, buf, 128, 0 );

    // initiate a shutdown for both send and recv
    status = shutdown ( sckt, 0x02 );                 

    // close the socket handle
    status = closesocket( sckt );

    // winsock finalization
    WSACleanup ();

To experiment further, you can replace the localhost address 127.0.0.1 with the IP address of your favorite site, replace the port 9999 with the standard HTTP port 80, and specify GET /pagename.htm\r\n\r\n in the send string. You will then receive up to the first 128 bytes (the capacity of buf in the code snippet) of the HTML page you would normally see in the browser.

This is how the XMLNode class used by the sample driver ODBCDRV1 streams the XML request to the socket using the TSocketClient class. Implementing a basic socket server is equally simple. Example 7 shows a socket server that can serve the simple socket client described above. The difference is that instead of calling connect after creating a socket, we bind the socket to an address and port available on the machine using bind. The address and port are specified through the SOCKADDR_IN structure. Next, we put the socket in listening mode using the listen API. At this point the socket is ready to listen but is not yet accepting connections. The accept API is the call on which your program blocks and waits for a connection; on the server side it is the counterpart of the client’s connect. Another distinguishing factor is that when a client connects, accept creates and returns a new socket on which the communication with that client takes place. The first socket can continue listening for other connections; this is how a single server serves multiple clients. (The example does not show this.) If you already have a database server listening and serving sockets, you need not worry about this part.

Example 7.

int             status;         // status or return values
char            buf[128];       // buffer to recv request
SOCKET          listen_sckt;    // socket handle to listen
SOCKET          conn_sckt;      // socket handle for IO
WSADATA         wsadata;        // winsock requirement for startup
SOCKADDR_IN     listen_addr;    // structure for IP, port etc.

    // winsock initialization, specify the version required
    status = WSAStartup ( MAKEWORD( 1, 1 ), &wsadata );

    // create a socket, no address associated as yet
    listen_sckt = socket ( AF_INET, SOCK_STREAM, 0 );

    // prepare to bind socket to a port on any IP for the machine
    listen_addr.sin_family              = AF_INET;          // address family
    listen_addr.sin_port                  = htons ( 9999 ); // port number
    listen_addr.sin_addr.s_addr    = htonl ( 0 );           // any IP address

    // bind the socket to the address family, port, IP
    status = bind ( listen_sckt, ( const struct sockaddr* )&listen_addr, 
                              sizeof(listen_addr));

    // switch to listen mode to allow connections
    status = listen ( listen_sckt, 1 );

    // wait and accept connections
    conn_sckt = accept ( listen_sckt, NULL, NULL );

    // recv request from client
    status = recv ( conn_sckt, buf, 128, 0 );

    // send a hello to client
    status = send ( conn_sckt, "Hello Client", 12, 0 );

    // initiate a shutdown for both send and recv
    status = shutdown ( conn_sckt, 0x02 );                 

    // close the client connection socket
    status = closesocket( conn_sckt );

    // close the listening socket
    status = closesocket( listen_sckt );

    // winsock finalization
    WSACleanup ();

I have described socket communication in its simplest form. Two more important issues remain: synchronization and scalability. Synchronization concerns how the client or server determines that the other side has finished writing its request or response, so that it can move ahead and act on it. There are two ways to solve this. One is the HTTP way of providing headers that contain the length of the content to follow; the actual content starts after the headers and an empty line. The other is to put a fixed signature at the start and end of the content, similar to MIME boundaries. I prefer the second way because our requests and responses are prepared in DOM and streamed directly while walking the tree; determining the length of the content in advance would require walking the DOM tree twice, which would be cumbersome. I use a fixed signature of ___\x4\x4MSG_SOCK\x4\x4___, defined as MSG_SOCK_SIGN in SOCK_CLI.HPP and SOCK_SVR.HPP. You can change it if you wish. A discussion of socket scalability is outside the purview of this article. Winsock provides a number of I/O models to suit the nature of the application and its scalability requirements. I suggest reading Network Programming for Microsoft Windows, by Anthony Jones and Jim Ohlund.
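The signature approach can be sketched independently of the socket calls. The code below is an assumption-laden illustration, not the logic in SOCK_CLI.HPP or SOCK_SVR.HPP: the names kMsgSign, Frame, and FrameReader are hypothetical, and it assumes the same signature marks both the start and the end of a message. The reader accumulates chunks as recv would deliver them and reports when a complete message has arrived.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Signature quoted in the text; the real definition is MSG_SOCK_SIGN in the download.
const std::string kMsgSign = "___\x4\x4MSG_SOCK\x4\x4___";

// Wrap a payload between a leading and a trailing signature.
std::string Frame(const std::string& payload) {
    return kMsgSign + payload + kMsgSign;
}

// Accumulates received chunks and detects the end of a framed message.
class FrameReader {
public:
    // Append one received chunk; returns true once the message is complete.
    bool Feed(const char* data, std::size_t len) {
        buffer_.append(data, len);
        return Complete();
    }

    // True when the buffer starts with the signature and a second one follows it.
    bool Complete() const {
        if (buffer_.compare(0, kMsgSign.size(), kMsgSign) != 0) {
            return false;
        }
        return buffer_.find(kMsgSign, kMsgSign.size()) != std::string::npos;
    }

    // Payload between the two signatures; call only when Complete() is true.
    std::string Payload() const {
        assert(Complete());
        const std::size_t start = kMsgSign.size();
        const std::size_t end = buffer_.find(kMsgSign, start);
        return buffer_.substr(start, end - start);
    }

private:
    std::string buffer_;
};
```

In a receive loop, each successful recv would be passed to Feed until it returns true. The control characters in the signature make an accidental match inside XML text unlikely, but the sketch does not guard against a payload that contains the signature itself.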

