Dr. Dobb's | A Simple Oracle Call Interface

A Simple Oracle Call Interface

Maciej's database library lets you "embed SQL code within C++ applications.

December 01, 2004
URL:http://www.drdobbs.com/cpp/a-simple-oracle-call-interface/184405930

Relational databases are important components of most information systems. Independent of the actual functional problems that they solve, there is always an issue of interfacing between host languages (that is, languages in which the client application is written) and the database server. Not surprisingly, there are different approaches and libraries available to developers, ranging from low-level interfaces shipped by database server vendors to entire mediating layers. Most of these methods impose the necessity of using lots of function calls that are not conceptually relevant to the problem at hand, having only a technical purpose—binding and defining variables, statement preparation, and execution are examples of such calls, for instance.

Another approach is to use embedded SQL, where the SQL code and host language (C or C++, for instance) are mixed, creating a kind of "dialect" where database-related code is as minimal as possible. Pro*C/C++ is one example of this approach. The problem with such approaches, however, is that they use special preprocessors that replace the embedded SQL statements with low-level calls in the host language. This replacement step is artificial and does not mix with the rest of the language, especially with the type system.

In this article, I present a database library in which language features, such as operator overloading and templates are exploited to provide the effect of "embedding" SQL code within C++ applications, staying entirely within the Standard C++ language. (The complete source code for the library is available here.) The advantage of this approach is that the provided syntax plays safe with the rest of the language and does not require a separate preprocessing or compilation step. As a proof-of-concept, I use the Oracle Call Interface (OCI) as the underlying database interface. It should be possible to reuse most of these techniques with other interfaces.

Choosing the Most Natural Syntax

Since a library such as this is intended to provide the most natural and simple syntax for database interfacing, the code it uses should be familiar and not contain distracting elements. In short, you should need to provide only the information that is needed to solve your problem. For example, Example 1(a) means "write the object i to the standard output." There are many things happening when this line is executed, but those that are not changing with regards to the problem at hand (stream buffering, synchronization, locale setting, and the like) are kept out of sight. It is possible to manipulate all the details, but in most of the code there is no need to do so, and Example 1(a) is indeed minimal—it is not possible to make it syntactically simpler without losing the intent.

The question that motivated me to build this library was: Is it possible to achieve the same syntactic minimalism in database interfacing? In other words, what is the minimal syntax for accessing the database? With this in mind, I propose Examples 1(b) and 1(c) which illustrate a syntax that hides details as much as possible, leaving only the information relevant to the problem—database connection, SQL query, and source/destination of local data. The advantage of this syntax is that it is easy to understand if you already know SQL and it requires no more than is really necessary—conceptually, nothing can be removed from this syntax without losing the overall sense, just as in the regular code using the IOStreams library. Is it possible to have a database library that provides this syntax? Thanks to C++ operator overloading, templates, and lifetime rules of temporary objects, the syntax in Example 1 is not only possible, but easy to provide.

(a)

cout << i;

(b)

int count;
sql << "select count(*) from person", into(count);

(c)

int id = 7;
string name = "John";
sql << "insert into person(id, name) values(:id, :name)", use(id), use(name);

(d)

sql << "query", into(var1), use(var2), ...;

Example 1: Choosing a syntax.

The Machinery

Listing One shows two classes—Session and Statement—that provide the interface that's similar to other database interface wrappers and that encapsulate most of what is happening when low-level OCI calls are used. What's important is that the interfaces in Listing One are complete with regard to database access—it is possible to execute all the basic SQL queries with the help of the relevant methods. Of course, such classes cannot provide the syntax that's proposed in Example 1, but they are enough to form a solid foundation for the complete library. Interestingly, the Session and Statement classes encapsulate most of the inner workings of the OCI library, but the low-level concepts of database operation are still exposed.

Listing One

class Session
{
public:
    // connect to the database
    Session(std::string const & serviceName,
        std::string const & userName,
        std::string const & password);
    ~Session();
    void commit();
    void rollback();
    // ...
};
class Statement
{
public:
    Statement(Session &s);
    ~Statement();
    // accumulate the binding info
    void exchange(/* ... */);
    // prepare the SQL statement
    void prepare(std::string const &query);
    // use the binding info to actually bind the variables
    void defineAndBind();
    // execute the query
    bool execute(int num = 0);
    // fetch the next row of data
    bool fetch();
    // ...
};

Typical database access can be expressed on a step-by-step basis; see Figure 1. Each step more or less maps to the methods of the Session and Statement classes, and the code that uses a typical database library needs to closely reflect all these steps. Clearly, there is a deep "impedance mismatch" between the syntax proposed in Example 1 and the approach in Figure 1. Of course, even the most simple SQL query requires that all the steps are executed, but the entire process can be automated and controlled.

1. Connect to the database (this is done once for the entire session).

2. Prepare the SQL statement for execution.

3 (a). Gather the information needed to bind local data.

3 (b). Bind local data to SQL placeholders.

4. Execute the statement.

5. Fetch more rows, if needed.

Figure 1: Database access.

Temporary objects are perfect for this job. They can be invisible, but thanks to their lifetime rules, they can at the same time be predictable when it comes to the timing of their actions. Consider again the proposed syntax, like that in Example 1(d). This syntax can be "translated" into the steps in Figure 1 by a specially introduced temporary object that can be created at the beginning of this expression (created and returned by operator<<) and that lives until the end of the expression, collecting everything on its way, and actually executing the statement in its destructor. Listing Two presents the additional definitions that make this trick possible:

into and use functions that return objects of special types. These objects carry on information about the type and address of the source/destination of data.
The definition of the temporary object's class and a helper starter class, which exists only to create the temporary object.
A member shift operator in the Session class serves two purposes: It enables storing the query and starts the creation of the temporary object that handles the rest of the expression.

Listing Two

// helper functions for preparing binding and define info
template <typename T>
IntoTypePtr into(T &t)
{
    return IntoTypePtr(new IntoType<T>(t));
}
template <typename T>
UseTypePtr use(T &t)
{
    return UseTypePtr(new UseType<T>(t));
}
// type of the temporary object
class TempType
{
public:
    TempType(Session &s);
    // the destructor actually makes the dirty work
    ~TempType();
    // inserter for additional stream-formatted data
    template <typename T>
    TempType & operator<<(T const &t);
    // operators that accept and accumulate bind info
    TempType & operator,(IntoTypePtr const &);
    TempType & operator,(UseTypePtr const &);
    // ...
};
// type of object that is used to kick-start the temporary
class StarterType
{
public:
    StarterType(Session *s);

    template <typename T>
    TempType operator<<(T const &t)
    {
        TempType o(*session_);
        o << t;
        return o;
    }
private:
    Session *session_;
};
class Session
{
public:
    // the starter object
    StarterType once;
    // the inserter operator (first in the expression)
    template <typename T>
    TempType operator<<(T const &t)
    {
        return once << t;
    }
    // ...
};

With these pieces, I can now better explain what happens in Example 1(d) when it is executed:

The sql is an instance of the Session class and retains all the information that is needed to create and execute new statements.
The subexpression sql << "query" uses the once member object to start up the temporary object (this also means that you can use the form sql.once << "query", which means that the statement is supposed to be executed only once). The "query" string (or whatever it is) is remembered by the underlying string stream. From now on, the temporary object is designated as tmp for clarity.
into(var1) is a call to the function that returns the IntoTypePtr object, which carries information about the type and address of the var1 variable. The returned object is called intoObj in this description.
The subexpression tmp, intoObj causes operator, to store the information carried on by intoObj in the main temporary object for later use. This operator returns the same temporary object (by returning *this) so that the following comma operator works in the same way, and so on, until the end of the full expression.
When the tmp object is destroyed, the dirty work is performed behind the scenes to prepare the statement, bind all variables, and execute the query.

A result of this operator overloading is the translation of the SQL-oriented syntax into calls to the regular interface in Listing One. This also explains why Examples 1(b) and 1(c) were separated. Gathering the binding information is executed while the temporary object is "traveling" through the expression, swallowing IntoTypePtr and UseTypePtr objects, whereas the actual binding of all variables is performed at the end of the expression, when the temporary object is destroyed.

The type of temporary object also has overloaded inserter operator<<. Thanks to this, the full expression can contain many insertions and commas. Every insertion is delegated to the underlying string stream object (this also means that you can build the query from the objects of your own types, if they are IOStreams aware) and every comma accumulates the binding info for the consecutive placeholders.

What I've described covers the case when the query is to be executed exactly once. Of course, database access needs can be rarely satisfied by the possibility of only executing one-time queries, so there is also a facility to just prepare statements for execution, leaving it up to you to execute the query and fetch the consecutive rows (if it is a "select" query).

The important thing is that the implementation of into and use functions (Listing Two) do not do much—they only create objects of special types, which need to follow a given interface. Depending on the type of the variable that is bound to the SQL placeholder, a specific specialization of the IntoType<> or UseType<> templates is used. This makes it an extremely extensible mechanism, which can be used to plug in user-defined types into the library. The only thing that is needed for every supported variable type is to write a specialization for the IntoType<> and UseType<> templates (the library itself is a good example of how to do this). Of course, the library contains specializations for commonly used types.

Examples

Listing Three presents examples that put the library to work. Of course, to run this code you need to provide true database credentials (service name, username, and user password) and prepare the database tables that make the example SQL statements valid.

Listing Three

// example program

#include "soci.h"
#include <iostream>

using namespace std;
using namespace SOCI;

int main()
{
    try
    {
        Session sql("DBNAME", "user", "password");
        // example 1. - basic query with one variable used
        int count;
        sql << "select count(*) from some_table", into(count);
        // example 2. - basic query with parameter
        int id = 7;
        string name;
        sql << "select name from person where id = " << id, into(name);
        // example 3. - the same, but with input variable
        sql << "select name from person where id = :id", into(name), use(id);
        // example 4. - statement with no output
        id = 8;
        name = "John";
        sql << "insert into person(id, name) values(:id, :name)",
                                                          use(id), use(name);
        // example 5. - statement used multiple (three) times
        Statement st1 = (sql.prepare <<
            "insert into country(id, name) values(:id, :name)",
            use(id), use(name));
        id = 1; name = "France";  st1.execute(1);
        id = 2; name = "Germany"; st1.execute(1);
        id = 3; name = "Poland";  st1.execute(1);
        // example 6. - statement used for fetching many rows
        Statement st2 = (sql.prepare <<
            "select name from country", into(name));
        st2.execute();
        while (st2.fetch())
        {
            cout << name << '\n';
        }
    }
    catch (exception const &e)
    {
        cerr << "Error: " << e.what() << '\n';
    }
}

A test driver accompanies the library (available electronically), which is self-contained in that it prepares the required database structures by itself. You may find this test driver to be a valuable source of information about what can be really done with the library.

Afterthought: Syntax-First Library Development

One of the biggest contributions of the eXtreme Programming (XP) method is its focus on test-driven development. As a rule of thumb, in XP the test unit is written before the code that is supposed to make the test pass. The result is code that exactly meets its requirements. It can be beneficial to apply a similar concept on another level of code design.

When implementing a library that is meant to provide some particular functionality, the key design problem is to choose the interface of the library. Sadly, most libraries seem to be developed in the "back-end to front-end" direction, where some low-level concepts (network connectivity, database access, filesystem operations, GUI, and the like) are simply wrapped into high-level language structures, hiding some of the underlying complexity but still revealing the fundamental low-level conventions. Listing One presents two classes that together can be considered a poor man's database library. Such libraries have little added value and in some extreme cases can even be a disservice to the entire language community by suggesting that the ability of the high-level language is limited to only provide simple wrappers for what is always available to the C programmers via the low-level APIs. I have heard such claims about the C++ language made by C programmers. The library I present here is based on an approach I call "syntax-first library development," which is exactly the reverse of this scenario.

The way I built the library was to set up the intended syntax before writing a single line of code. After that, I went through a head-scratching and pen-biting phase to come up with the implementation that makes this syntax possible. Granted, the library employs tricks that some programmers may consider to be obscure, but those tricks are meant to be hidden from library users.

The thrust of the library design is similar to the test-first development proposed by the XP method and, by analogy, the syntax selected before implementing the library itself can be considered to be documentation for the library interface in the same way that test units are documentation for the code requirements in XP. Interestingly, syntax-first library development and test-first development can be used together in a library design and development, leading to libraries that are both expressive and well tested.

Maciej is a Ph.D. student at the Institute of Computer Science, Warsaw University of Technology. You can contact him at http://www.msobczak.com/.

December, 2004: A Simple Oracle Call Interface

1. Connect to the database (this is done once for the entire session).

2. Prepare the SQL statement for execution.

3(a). Gather the information needed to bind local data.

3(b). Bind local data to SQL placeholders.

4. Execute the statement.

5. Fetch more rows, if needed.