Java, SQL, Cloudscape, and Derby

Cloudscape is a Java-based, lightweight, embeddable, relational-database engine with a raft of freely available supporting drivers, plug-ins, and other tools.


December 01, 2004
URL:http://www.drdobbs.com/database/java-sql-cloudscape-and-derby/184405927

December, 2004: Java, SQL, Cloudscape, and Derby

Ken is an author, consultant, and industry analyst. He teaches Expert Series seminars and is the editor of SQLSummit.com, WebServicesSummit.com, and GridSummit.com. He can be contacted at [email protected].


Cloudscape DBMS was developed in the quest for a minimalist SQL database management system (DBMS) for embedding in Java applications, web servers, and mobile devices. After passing through Informix's hands, the product was acquired by IBM. IBM subsequently donated Cloudscape to the Apache Software Foundation's Derby Project (http://incubator.apache.org/derby/), an incubator project sponsored by the Apache DB project.

While IBM offers several database products, its flagship DBMS is DB2 Universal Database (UDB). Consequently, IBM aligned Cloudscape 10's SQL dialect with DB2 SQL, although Cloudscape doesn't support all of DB2's features or SQL. Administering a Cloudscape database is therefore easier than administering DB2 or other enterprise databases.

To encourage Cloudscape adoption, IBM aligned Cloudscape 10's SQL dialect with the SQL used by DB2, and embedded it in numerous IBM projects, including WebSphere Portal, WebSphere Application Server, Lotus Workplace, and Tivoli. IBM also donated Cloudscape to the Apache Software Foundation to launch the Derby project, thereby ensuring that the source code and binaries are freely available.

Small Footprint

Enterprise-edition SQL products support a wide range of features—parallel processing, integrated message queuing, replication, online-analytical processing (OLAP), SQL/XML, performance monitoring, and several flavors of database connectivity (ODBC, JDBC, and the like). Of course, these features increase the DBMS's appetite for memory, disk clusters, and multiple processors. For applications such as mobile computing, developers need to trade features for ease of administration and a smaller footprint.

IBM distributes Cloudscape as a 2-megabyte JAR file (cs.jar) that's embeddable in programs written for laptops and mobile devices (http://www-106.ibm.com/developerworks/db2/library/techarticle/dm-0408cline/index.html). A multithreaded architecture also makes Cloudscape suitable for embedding in web servers and XML servers. Cloudscape has been certified with Java 2 Enterprise Edition (J2EE) and it's extensible using Java. You can retrieve Java classes stored in a database and use Java functions in triggers. You can also write Java code to implement SQL stored procedures and user-defined functions (UDFs). Like DB2 UDB, Cloudscape 10 registers UDFs with a SQL CREATE FUNCTION statement and Java stored procedures with a CREATE PROCEDURE statement (see my article "Extenders, UDFs and Stored Procedures," DDJ, August 1999).

Cloudscape provides row-level locking, support for the Unicode character set, and transaction processing with ANSI Standard isolation levels for concurrency control. For client-server processing, it includes a network server that conforms to the Open Group Distributed Relational Database Architecture (DRDA) and protocols used by IBM DB2 servers.

There have been two new versions of the SQL Standard since the first release of Cloudscape. The DBMS does support some SQL:1999 features, but Version 10 does not implement SQL:2003 and its SQL/XML functions (xmlattributes, xmlcomment, xmlconcat, xmlelement, xmlforest, xmlparse, xmlpi, xmlagg, xmlroot). One of the prime attractions of Cloudscape 10 is SQL compatibility with DB2 compatibility. To maintain compatibility with future versions of DB2, the Derby project will add SQL/XML, an XML type, and XQuery. The Derby incubator site also reports prototype code for XQuery and SQL/XML.

Performance, Concurrency, and Scalability

Typical of most SQL products, the amount of memory available to the DMBS affects performance. Sluggish performance with Cloudscape is often corrected by increasing the cache size (pageCache) and/or the Java VM maximum heap size (maxHeap).

When you create a database, Cloudscape creates a directory that includes a lock file, service properties files, a subdirectory (log) for transaction logs, a subdirectory (tmp) for working storage, and a subdirectory (seg0) for data files. Some SQL applications use hundreds or thousands of tables. One factor that limits Cloudscape's ability to scale is that it creates a new file in seg0 each time you add a table to a database.

If you're writing transaction-processing programs for Cloudscape, you can use four JDBC transaction isolation levels:

Higher levels mean Cloudscape uses locks to avoid one thread changing a row while another accesses it. Lower isolation levels permit greater concurrency of access to data such as when you have multiple readers and infrequent updates. A Java program can open multiple databases, but Cloudscape does not support transactions across multiple databases. You can, however, write transaction processing programs that span multiple schemas in a single database.

IBM's Cloudscape FAQ indicates Version 10 is capable of supporting multiuser access and concurrent connections even when operating in embedded mode.

My experience was different using embedded Cloudscape. Attempting to connect to a database threw an exception when there was an open ij session (ij is a Cloudscape utility that lets SQL scripts be executed against JDBC databases) with an active connection to that database. Disconnecting from that database within an ij session didn't resolve the problem. I had to exit ij after disconnecting.

Types and Serialization

Cloudscape implements SQL:1992 and some SQL:1999 features. It provides basic types and Datetime types such as timestamp. It also provides binary and character objects (blob, clob), but not user-defined types. Cloudscape 10 removed several types supported in earlier versions, including boolean, tinyint, nchar, and Java Data Types (user-defined types).

Prior to Cloudscape 10, Java developers used user-defined types to serialize Java objects by streaming them to Cloudscape databases. IBM removed this feature for Version 10 because the implementation of user-defined types did not adhere to part 13 of the SQL Standard.

Security

Developers using Cloudscape often embed it in programs that run on mobile computers such as laptops. Securing a laptop database presents different challenges from securing databases that reside in a managed data center. One solution for limiting access to sensitive data is to restrict programs from accessing views, tables, stored procedures, and other SQL database objects.

Many SQL database administrators use groups and roles for the purpose of defining who can access what data. They use the SQL GRANT and REVOKE commands to grant or deny the authority to access tables and views or to execute procedures for inserting or updating tables. Although adding GRANT and REVOKE are on the wish list for Apache Derby, Cloudscape 10 does not support those commands for managing user rights. To identify database users and passwords with Cloudscape 10, you can set command-line properties or use a properties file.

Java developers embedding Cloudscape can provide additional levels of security by using digitally signed JAR files and Java Cryptography Extensions (JCE). When Cloudscape loads a class from a signed JAR file in the database, it checks for the presence of an X.509 certificate and verifies the signature. If a database is encrypted, none of its contents are viewable as plaintext. Cloudscape uses a boot password for encrypted databases, a separate password from the one used to open a JDBC connection. A rather weak encryption algorithm, DES, is the default but you can specify what algorithm to use.

Because Cloudscape can execute Java stored procedures and user-defined functions embedded in a database, there has been some concern about using SQL injection techniques to execute malicious Java code. Cloudscape calls Java methods only when they are invoked through a stored procedure or function that has been registered with a CREATE PROCEDURE or CREATE FUNCTION SQL statement. In addition, the Java methods run with a Java 2 security manager and will only have permissions defined by the policy file.

Utilities

Cloudscape includes several utilities that operate with a command-line interface and batch files. The sysinfo tool provides version and Java classpath information. The ij utility is an interactive SQL tool. You can use ij to run SQL scripts or to interact with a Cloudscape database by typing in SQL commands.

SQL developers use Data Definition Language commands to create tables, views, triggers, and procedures, and specify keys and indexes. The cslook utility lets you examine database objects such as indexes, keys, and views. (The Apache project refers to the schema extraction tool as "dblook.")

Listing One, a DOS session running the Cloudscape utilities sysinfo and ij, illustrates how to use a JDBC connection attribute to create a new database. You can use ij commands to set properties or create a separate properties file. Properties include the maximum display width for columns, the name of session log files, character encodings, and so on. Listing Two is a SQL script you can execute with ij to create the tables for the program in Listing Three.

A GUI is particularly helpful for novice SQL users. Browsing database contents is also quicker using a tree-structured presentation of the system catalog and schema objects. SQL database managers such as DB2 Universal Database, Oracle 10g, and Microsoft SQL Server include visual tools for managing and querying databases. Cloudscape provides command-line utilities, but IBM has developed an Eclipse plug-in for browsing databases with a visual interface.

Writing JDBC Code

The initial steps of a JDBC program include loading the appropriate driver and connecting to the database. The primary thing that changes when porting a JDBC program from one DBMS to another is the class name of the JDBC driver. For Cloudscape, your Java programs can load the JDBC driver using:

Class.forName
("com.ihost.cs.jdbc.CloudscapeDriver");

To connect to a database with JDBC, your code passes to the getConnection method a URL with protocol and subprotocol values. The connection string can also serve other purposes. You can add attributes such as user ID, password, database name, log device name, and whether to create a new database. You can also specify the encryption algorithm and provider for encrypted databases.

When you create a new database, Cloudscape puts it in the current directory of your JDBC program unless you've defined the system property cloudscape.system.home. In an embedded environment, you also call getConnection to shutdown a Cloudscape database.

DriverManager.getConnection
("jdbc:cloudscape:sample;shutdown=true");

As an alternative to attaching attributes to the connection URL, you can use a Properties object and pass that as the second argument to getConnection (see Listing Three).

The remainder of programming with Cloudscape is similar to writing code for other databases. Your code must use the types provided by Cloudscape, of course, and you have the flexibility of storing JAR files in databases. You can also use Java for writing SQL extensions.

When you deploy a Java program that uses Cloudscape, your production bundle includes JAR files, the Cloudscape log, and the folders (subdirectories) under the database's folder.

Database Container for Web Content

In the information-processing world, a "document" has a broader meaning than in the print industry. We have to deal with a variety of textual content that isn't necessarily formally structured, laid out, and printed. Our software has to access web pages, XML, ASCII and Unicode text, schemas, scripts, word-processing documents, and other text or document content.

Although XML documents, HTML pages, and scripts often exist as content stored in a file container, there are alternatives to the "document in a file" paradigm. Document management systems and content management systems frequently use databases as document containers. Many webmasters use SQL to store message threads from online discussions and to cache site statistics as HTML instead of dynamically generating the data each time. They also use SQL for retrieving boilerplate notices and privacy policies. Some web servers can use a SQL database as an alternative to a log file. Data miners use SQL to save relevant HTML pages for further analysis.

Many web page layout tools operate with HTML pages in files. If you are using one of those products but want to use a SQL content-management system, you'll need software to easily import and export HTML, XML, and other types of files.

Listing Three is a Java program (DocToCloud) that illustrates how to store a web page (HTML) in a Cloudscape table using SQL. It's fairly standard JDBC code that, with minor revisions, will work with other SQL databases. The first part of the program sets up the JDBC protocol and properties to be used for connecting to a Cloudscape database. After connecting to the database, the program creates a JDBC statement and executes a SQL CREATE to create the table that contains the HTML page and related information. It uses a FileInputStream to process the file contain an HTML page, which contains a boilerplate policy statement in this example. Next, the program prepares a SQL INSERT statement using parameter markers. After preparing the query, the program substitutes execution time values for those parameters. Next, it executes the prepared statement and closes out the JDBC statement and connection. After executing this class, the n_data column of the notices table contains the complete HTML document. This example uses a LONG VARCHAR column to store the notice. If you are storing large HTML or XML documents, you should investigate the use of CLOB columns.

DB2 Plug-In for Eclipse

To provide a graphical interface for exploring DB2 and Cloudscape databases, IBM developed a DB2 plug-in for Eclipse (http://www-106.ibm.com/developerworks/ db2/library/techarticle/dm-0408cline2/ index.html). If you're using Eclipse and Cloudscape, you can use the DB2 plug-in for connecting to and exploring Cloudscape databases.

To start browsing, the Connection Wizard of the Database Explorer (Figure 1) walks you through several screens that prompt for information required for a database connection. The first time I used the Connection Wizard to connect to the example database (magic), there was an error message. Because my testing was with the Cloudscape 10 beta version, I tried again. Scrolling back through Connection Wizard screens turned up no bogus information, so I clicked Finish a second time. This attempt succeeded and the Database Explorer opened my example database. Figure 2 shows the tables contained in the magic database user1 schema. The user1 example schema includes tables for storing XML, HTML, scripts, schemas, and other text documents. The xschema table contains schemas such as Document Type Definitions (DTDs) and XML schemas (.xsd). The scripts table includes JavaScript code, Synchronized Multimedia Interaction Language (SMIL), and other scripts.

Figure 3 is the DB Output window after a document had been loaded into the database. After running Listing Two, you'll see the view in Figure 2 by right-clicking on the notices table to browse its contents. In this example database, the notices table stores version and type information for a Privacy Policy notice. The table includes a column for the HTML content of the notice.

Going Forward

IBM will continue to make Cloudscape and related software available from its DB2 Developer Domain web site (http://www.ibm.com/developerworks/db2/downloads/ jcc/). If you want to work with the database manager's source code, browse to the Apache Software Foundation Derby incubator web site (http://incubator.apache.org/derby/DerbyToDo.html). To build Derby, you'll need the Java Developer Kit (JDK) 1.3 to program with the JDBC 2.0 API, or JDK 1.4 to use the JDBC 3.0 API. If you've written programs for Cloudscape 10 and are porting them to work with Derby, you'll have to update the code because of name changes. You'll have to change, for example, com.ihost.cs.jdbc.CloudscapeDriver to org.apache.derby.jdbc.EmbeddedDriver and change other classes from com.ihost.cs to org.apache.derby.

To connect to Derby, IBM provides freely available JDBC drivers. If you want to access Cloudscape/Derby from C/C++, PHP, or Perl, you'll need an ODBC driver. Informix bundled an ODBC driver with older versions of Cloudscape, but IBM now refers Cloudscape users to the Easysoft JDBC-ODBC Gateway (http://www.easysoft.com/products/2023/main.phtml).

Earlier versions of Cloudscape were certified for JDK 1.1 and tested with Personal Java for PDAs and mobile devices. Cloudscape 10 requires a Version 1.3 or 1.4 JVM. Providing an implementation that works with Java 2 Micro Edition is one of the tasks on the ToDo list for Derby developers.

In addition to Cloudscape/Derby, developers wanting an embeddable Java database manager can look to products such as Daffodil DB, FirstSQL/J, and Pointbase. Derby isn't unique as a mature open-source SQL project. Years of development have gone into database managers such as Firebird, Ingres, Birdstep, MySQL, and postgreSQL; DBMSs that have an installed base. But Derby is likely to attract developers looking for inexpensive deployment and ease of migration to DB2.

DDJ



Listing One

D:\Cloudscape\frameworks\embedded\bin>set CLOUDSCAPE_INSTALL=D:\Cloudscape

D:\Cloudscape\frameworks\embedded\bin>java com.ihost.cs.tools.sysinfo
------------------ Java Information ------------------
Java Version:    1.4.2_05
Java Vendor:     Sun Microsystems Inc.
Java home:       C:\j2sdk1.4.2_05\jre
Java classpath:  D:\CLOUDS~1\lib\cs.jar;D:\CLOUDS~1\lib\cstools.jar
OS name:         Windows XP
OS architecture: x86
OS version:      5.1
Java user name:  north
Java user home:  C:\Documents and Settings\north.KNC
Java user dir:   D:\Cloudscape\frameworks\embedded\bin
--------- Cloudscape Information --------
[D:\Cloudscape\lib\cs.jar] 10.0.1.0 beta - (29729)
[D:\Cloudscape\lib\cstools.jar] 10.0.1.0 beta - (29729)
------------------------------------------------------
----------------- Locale Information -----------------
------------------------------------------------------
D:\Cloudscape\frameworks\embedded\bin>ij
D:\Cloudscape\frameworks\embedded\bin>set CLOUDSCAPE_INSTALL=D:\Cloudscape
D:\Cloudscape\frameworks\embedded\bin>java -Dij.protocol=jdbc:cloudscape: 
                                                      com.ihost.cs.tools.ij
ij version 10.0 (c) 1997, 2004 IBM Corp.
ij> connect 'jdbc:cloudscape:magic;create=true';
ij>
Back to article


Listing Two
connect 'jdbc:cloudscape:magic' user 'user1' password 'user1';
CREATE TABLE notices (n_version DATE, n_desc VARCHAR(128), 
                                      n_type VARCHAR(5), n_data LONG VARCHAR);
CREATE TABLE scripts (s_version DATE, s_desc VARCHAR(128), 
                                      s_type VARCHAR(5), s_data LONG VARCHAR);
CREATE TABLE xschemas (x_version DATE,x_desc VARCHAR(128), 
                                      x_type VARCHAR(5), x_data LONG VARCHAR);
CREATE TABLE documents (d_version DATE,d_desc VARCHAR(128), 
                                       d_type VARCHAR(3), d_data LONG VARCHAR);
Back to article


Listing Three
/* Created on Aug 31, 2004 */
package docStoreWithSQL;

/** @author north **/
/****************************************************************************
* FILE NAME:  DocToCloud.java     TITLE: Import HTML into an SQL column
* AUTHOR:  Ken North, Ken North Computing, LLC
* Program reads a web page (HTML) and stores it in an SQL column using JDBC. 
**********************************************************************
* Copyright (c) Ken North Computing, 2004. All rights reserved. Reproduction
* or translation of this work beyond that permitted in Section 117 of the
* United States Copyright Act without express written permission of the
* copyright owner is unlawful.
* The purchaser may make backup copies for his/her own use only and not for
* distribution or resale. The Author and Publisher assume no responsibility
* errors, omissions or damages caused by the use of these programs or from
* use of the information contained herein.
**********************************************************************/

import java.sql.*;
import java.io.*;
import java.util.Properties;

public class DocToCloud {
  public static String protocol = "jdbc:cloudscape:";
  public static void main (String args [])
       throws SQLException, ClassNotFoundException, IOException
  {
    Properties prop = new Properties();
    prop.put("user", "user1");
    prop.put("password", "user1");
    
    System.setProperty("cloudscape.system.home","c:\\cloudscape");
    System.out.println("DocToCloud starting with password and user set");
                        // load the Cloudscape driver 
    Class.forName("com.ihost.cs.jdbc.CloudscapeDriver");
                             //   connect to "magic" database 
    Connection conn = DriverManager.getConnection(protocol + "magic", prop);
    System.out.println("Connected to magic database");
                            // create JDBC Statement for SQL statement
    Statement stmt = conn.createStatement ();
                            // DROP the existing notices table
    try
    {
      stmt.execute ("DROP TABLE notices");
    }
    catch (SQLException e)
    {
                 // catch and ignore table not found exception
    }
                // CREATE the notices table to store standard HTML notices
    // stmt.execute ("CREATE TABLE notices (n_version DATE, 
    //                        n_desc VARCHAR(128), n_data LONG VARCHAR)");
    stmt.execute ("CREATE TABLE notices (n_version DATE, n_desc VARCHAR(128), 
                                   n_type VARCHAR(5), n_data LONG VARCHAR)");
    System.out.println("Created notices table");
                           // read HTML file using input stream
    File file = new File ("d:\\north\\articles\\ddj\\PrivacyPolicy.htm");
    InputStream istr = 
       new FileInputStream ("d:\\north\\articles\\ddj\\PrivacyPolicy.htm");
                           // Do a prepared INSERT into the n_data column
    PreparedStatement pstmt = conn.prepareStatement 
          ("INSERT INTO notices (n_version, n_desc, n_type, n_data) 
                                                     VALUES (?, ?, ?, ?)");
    pstmt.setString (1, "2004-08-31");
    pstmt.setString (2, "Privacy Policy");
    pstmt.setString (3, "HTML");
    pstmt.setAsciiStream (4, istr, (int)file.length ());
    pstmt.execute ();
    System.out.println("Inserted row into notices table");
                        // close connection, statement
    if (pstmt != null)
      pstmt.close();
    if (conn != null)
      conn.close();
    System.out.println("Closed statement and connection.");
  }
}
Back to article

December, 2004: Java, SQL, Cloudscape, and Derby

Figure 1: DB2 Eclipse plug-in Connection Wizard.

December, 2004: Java, SQL, Cloudscape, and Derby

Figure 2: Viewing tables with the DB2 Eclipse plug-in.

December, 2004: Java, SQL, Cloudscape, and Derby

Figure 3: Viewing a table containing an HTML page using the DB2 Eclipse plug-in.

Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.