Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Database

Extenders, UDFs, and Stored Procedures


Aug99: Programmer's Toolchest

Ken consults and teaches Expert Series seminars. He wrote Database Magic with Ken North (Prentice Hall, 1998). Contact him at [email protected].


Database managers have long been used as building blocks for online transaction processing, decision support, and operational systems. Traditional SQL Database Management Systems (DBMS) have done well in managing tables of numbers and characters, but the database development landscape is changing, and influences such as object-oriented programming and the World Wide Web have raised expectations. Now both users and developers expect to store behavior and use richer data types.

Clearly, the emerging generation of feature-rich applications require multimedia, geospatial data, and types that are more complex than traditional SQL rows and columns. Consequently, developers require sophisticated database managers for complex data, User-Defined Functions (UDFs), and User-Defined Types (UDTs). The need for richer types and custom behaviors is a primary reason developers are moving to universal databases -- that is, tools for storing all types of data in a database -- that have object-relational characteristics. Universal databases also store behavior such as methods, functions, triggers, constraints, and procedures.

DB2 Universal Database

When traditional SQL DBMSs evolved into Object-Relational DBMSs (ORDBMS), IBM was an early supplier of object-relational technology. IBM consolidated several DBMS tools to produce its DB2 Universal DataBase (UDB). UDB supports active databases that contain rules and logic, making it possible for you to rely on the database manager to enforce rules about data. UDB offers an open-ended architecture for developing custom types and behaviors, and lets you create both distinct and structured UDTs. Distinct types are derived from intrinsic types, and structured types are a sequence of named, typed attributes.

Besides supporting SQL-92, UDB lets you write extensions to enhance database functionality. You can use SQL constraints and triggers to embed logic in a database, but you can also use programming languages to add extensions to DB2 databases. In this article, I'll discuss DB2 Extenders and server-side programming for extending DB2 using Java stored procedures and UDFs.

Object-Relational Extensions

The UDB architecture supports server plug-ins known as "DB2 Extenders," which extends UDB from being simply a data store for traditional SQL data (characters and numbers). Object-relational DB2 is more capable of storing all of an application's data types. It provides SQL3 containers and rich types such as spatial data, multimedia, images, time series, and biometric data.

DB2 Extenders are shared libraries that supplement DB2's core SQL functionality. They let you write SQL queries that provide capabilities such as text searching, image matching, and content-based queries. By installing extenders, you add USTs and UDFs that work with DB2 databases.

To create extenders, you use the DB2 Extender SDK. Alternatively, you can also obtain extenders from IBM and third-party suppliers. IBM bundles a text extender and an Audio, Video, and Image (AVI) extender with DB2 UDB. The AVI Extender, for instance, provides types and functions for working with image, audio, and video data. It supports Query By Image Content (QBIC), which provides the ability to query for matching colors and textures. The text extender adds new UDFs and UDTs, and provides several indexing methods for text searches.

As Table 1 illustrates, there are several vendors that offer DB2 Extenders, in addition to IBM. Environmental Systems Research Institute (http://www.esri.com/) and MapInfo (http://www.mapinfo.com/) supply spatial extenders used in geographical applications. IsoQuest (http:// www.isoquest.com/) supplies the NetOwl Extender that provides natural-language processing for business intelligence applications. NetOwl is used, for example, to convert the text of classified ads to a searchable database format.

DB2 Extenders are one alternative to adding features to DB2, but another alternative is hands-on programming. DB2 UDB provides a rich programming model using multiple languages and APIs. You can do client and server-side programming using several languages, including REXX, Basic, Fortran, Cobol, PL/I, C, C++, and Java.

Why Java?

With so many programming language alternatives, the obvious question is, "Why should you use Java?" Part of the answer has to do with intrinsic Java features. Java provides automatic memory management, known as garbage collection, that circumvents memory problems, such as runaway pointers and memory leaks. Those problems plague C/C++ programmers because they can corrupt data structures and crash a server. Java compilers generate bytecodes that are verifiable by the Java Virtual Machine (JVM). Java is network and client-server aware, with features such as built-in sockets and remote methods.

JDBC, another reason DB2 developers choose Java, is the de facto standard for data access from Java clients or servers. It will likely become an international standard, such as ODBC and SQL-92. JDBC provides objects that abstract and simplify the process of writing programs to access SQL databases. It enables Java programmers to use a single API for accessing heterogeneous SQL databases. When you use JDBC, you program with objects such as connections, statements, and result sets. You use classes that map to SQL types, provide SQL exception handling, and expose database and result set metadata. JDBC uses dynamic SQL, although you can use static SQL with SQLJ. (For more information about JDBC, see http://ourworld.compuserve .com/homepages/Ken_North/dataacce.htm.)

Java is flexible. DB2 can serve several types of Java clients, including applications, applets, or servlets. Any of those client types can use JDBC and execute Java UDFs and stored procedures. DB2 developers can also use JDBC code in Java stored procedures. JDBC stored in a database looks similar to JDBC in a client. You can easily move behavior from one tier to another, without having to change languages and programming models. This simplifies the process of creating thin clients and server-centric applications.

Java's intrinsic features make it an attractive language for programming all tiers of a multitier application. Portability is perhaps the most compelling argument for Java. By using Java, you avoid mixed programming models with one language for clients, another for creating server-side DLLs, and yet another for writing stored procedures.

When you add Java classes to a database, DB2 uses a JVM on the server to interpret the Java bytecodes. Before you can use Java on the DB2 server, you need to install the JVM and Java run-time classes. The Java classes come with the JDK and various developer environments. Table 2 lists the Java environment used for different operating systems for which there is a DB2 version.

Java installs as part of some operating systems. For others, you must install Java separately on the DB2 server. Table 2 lists the JDK versions that you need to use Java with DB2. It includes a column with the URLs you can use to download the appropriate Java environment. After installing the Java environment on the server, you need to specify database manager variables to set the Java heap size (java_ heap_sz) and point DB2 to the Java classpath (jdk11_path).

SQLJ

SQLJ is a Java preprocessor that generates precompiled SQL for Java applets, applications, servlets, and stored procedures. It reads a source file (.sqlj file) containing SQLJ statements, and emits a Java (.java) source file. SQLJ provides static SQL type checking and simplifies Java database programming. It is available for DB2 UDB versions on AIX, HP-UX, Solaris, OS/2, OS/390, SCO UnixWare 7, and Win32 systems.

Starting with Version 5.2, DB2 UDB includes a SQLJ run-time package (sqlj.runtime.*). For SQL statements that are unknown until execution time, SQLJ generates dynamic SQL and JDBC calls. For known queries, SQLJ uses static SQL and calls to the SQLJ run time. Applications with known schemas gain performance and security benefits from static SQL and precompiled queries. SQLJ uses exemplar schemas for schema validation and compile time type checking. Although it uses host variables, SQLJ isn't a clone of other embedded SQLs. For example, SQLJ doesn't support PREPARE, EXECUTE, or DESCRIBE.

SQLJ keeps track of execution context and connection context. There is an execution context associated with each executable SQL operation. The SQLJ run time uses get and set methods to check or change execution context attributes such as MaxRows, QueryTimeout, and UpdateCount. SQLJ manages multiple contexts and multiple schema connections, and permits session sharing between SQLJ and JDBC programs. SQLJ also provides result set iterators to navigate through query result sets. Iterators support binding to the database by column name or position. DB2 static checks iterator columns against exemplar schemas so iterator columns use Java types that map to the SQL types of query columns. SQLJ generates an accessor method for each column of an iterator and you can pass an iterator as a parameter to a method.

The fundamental benefits of SQLJ are optimization and security. The Java compiler optimizes source code, and DB2 precompiles and optimizes the SQL. By using SQLJ's profile customizer (db2profc), you can precompile SQL statements, call the SQLJ run time, and generate packages in DB2 databases.

UDFs and Stored Procedures

Some DBMS products use block-structured SQL dialects for storing procedures and functions in a database. The DB2 model is different because you write code using traditional programming languages. To implement a UDF or stored procedure, you compile a program and create a DLL, shared library, or Java classes. UDFs are SQL extensions registered with the database to provide a specific function, such as string processing and calculations. UDFs are often used to provide logic associated with user-defined types. UDFs do not use I/O streams or SQL statements. Stored procedures are programs that reside in the database and execute at the server. They can contain SQL statements and JDBC methods that access DB2 databases.

One issue to consider when developing server extensions is the tradeoff between performance and server stability. When you install extensions that execute in the server process address space, you avoid the overhead of context switching (executing instructions to switch to a different address space). Using in-process extensions delivers better performance, but at the risk of crashing the server or corrupting its data structures. DB2 gives you the choice of writing in-process (unfenced) or out-of-process (fenced) procedures and UDFs. It uses different directories to hold the libraries and Java classes that contain unfenced and fenced extensions. To install an unfenced UDF or stored procedure, you must have DBADM or SYSADM authority, or CREATE_NOT_FENCED authority on the database; see Figure 1.

Developer Guidelines

Stored procedures that use JDBC actually import DB2's JDBC classes (import COM.ibm.db2.jdbc.app.*). DB2 developers include other Java packages when creating UDFs and stored procedures:

  • COM.ibm.db2.app.StoredProc
  • COM.ibm.db2.app.UDF

  • COM.ibm.db2.app.Blob

  • COM.ibm.db2.app.Clob

  • COM.ibm.db2.app.Lob

Java stored procedures require the first package, UDFs require the second. The last three are optional packages for processing large objects. When you create a fenced UDF or stored procedure, you install your Java classes under the %DB2PATH%\SQLLIB\FUNCTION directory. When you create an unfenced extension, you install the Java classes under %DB2PATH%\SQLLIB\FUNCTION\UNFENCED. When you create classes that are part of a Java package, you create the corresponding directories under \SQLLIB\FUNCTION. If you compile a class named acme.pizzaOrders.gdiscount, Java produces a bytecode file named gdiscount.class. Before using the gdiscount function in a SQL statement, you must store the class file (gdiscount.class) in \SQLLIB\FUNCTION\acme\pizzaOrders.

When executing a fenced UDF, DB2 loads the Java interpreter inside the db2udf process. For a fenced stored procedure, it loads the interpreter inside the db2dari process.

Development Steps

Java developers can use the dynamic SQL model or embedded SQL (SQLJ) model when creating UDFs and procedures. When using embedded SQL -- including SQLJ -- you connect to the target database, preprocess the source file, compile, link, and generate bind files. The final step, after binding to the database, is to disconnect from the database. Because UDFs do not contain SQL statements, they do not require precompilation and binding. Besides creating the code, you use a CREATE FUNCTION or CREATE PROCEDURE statement to register the extension. You must also store the new class files in the appropriate subdirectory under %DB2PATH%\SQLLIB\FUNCTION.

User-Defined Functions

UDFs typically compare, convert, or return a value. You can use UDFs in SQL statements in the same manner you use built-in functions. UDFs are typically associated with custom types or UDTs. UDFs let you extend the SQL grammar you can use with a database.

You use a CREATE FUNCTION statement to register a UDF, and after registration you can use the UDF in SQL statements. You might, for example, program a function that takes weekly salary as an argument and returns annual salary. If you name the function perannum, you can use it in queries such as the statement in Listing One.

DB2 SQL provides three classes of UDFs -- scalar functions, table functions, and column functions. Scalar functions accept zero or more arguments and return a single, scalar value. Column functions (aggregate functions) return a single value when you pass a set of like values. Table functions are external functions that return a table. This enables you to use table functions in the FROM clause of SELECT statements. It is valid to use a scalar or column UDF wherever you use an expression in a SQL statement.

DB2 UDB supports sourced and external functions. External functions are those you implement in a programming language such as Java. A sourced function is a scalar or column function you derive from an existing function. Sourced functions are useful for deriving behavior for distinct (derived) types. DB2 enables you to overload functions; that is, create more than one function in the schema with the same name. A function signature includes schema name, function name, number of parameters, and types. DB2 differentiates overloaded functions by using a combination of function path, function signature, and parameter comparisons.

Registering a UDF

After developing the program for a UDF, use the CREATE FUNCTION statement to make it known to DB2. You specify argument formats, including type information and the function's return type. CREATE FUNCTION includes a CAST FROM clause for type coercion.

CREATE FUNCTION provides options that affect the execution model and performance of a UDF. When executing a query, DB2 normally calls a UDF once for each row of a result set. CREATE FUNCTION provides a SCRATCHPAD option that you can specify to tell DB2 to use shared memory and instantiate the UDF only once per query. CREATE FUNCTION also supports parallelism. If you are using multiple partitions and parallel queries, you probably want parallel evaluation of UDFs. DB2 uses the ALLOW PARALLEL option by default, and disallow parallel execution if you specify DISALLOW PARALLEL.

Listing Two is the declaration of metaname, a UDF that encodes a character string. It uses Java-style parameters, and resides in a library named "encode." The metaname source code is in Java.

When you register a new UDF, DB2 adds it to the system catalog and updates the SYSCAT.FUNCTIONS and SYSCAT .FUNCPARMS catalog views. It stores information about user-defined types in SYSCAT.DATATYPES.

Java UDF Guidelines

Java UDFs can be fenced or unfenced. Unfenced Java UDFs cannot corrupt the server's address space, but they can degrade server performance or terminate the server process. Both fenced and unfenced UDFs must implement the COM.ibm.db2 .app.UDF interface.

DB2 expects UDFs to have a specific signature, and the Java signature differs from C and C++ UDFs. If you've written C/C++ UDFs, you're familiar with an argument list that includes SQL-argument, SQL-result, SQL-argument-ind, SQL- result-ind, SQL-state, function-name, specific-name, diagnostic-message, scratchpad, call-type, and dbinfo. When you start writing Java UDFs, you use a smaller argument list (SQL-argument and SQL-result). The generic signature for a Java UDF, with arguments of SQL types s1 and s2, looks like this when the returned value is of type s3:

public void funcname (arg1 in1, arg2

in2, arg3 ret) {......}

where funcname is the method name, arg1, arg2, and arg3 are the corresponding Java types for SQL types s1, s2, and s3, arg1 and arg2 are input arguments, and arg3 is the returned argument.

For table functions, Java UDFs use additional arguments for each column in the result set.

Stored Procedures

DB2 stored procedures are programs installed as shared libraries, or in Java's case, as class files. Java stored procedures are public instance methods that must implement the COM.ibm.db2.app.StoredProc interface. They are external procedures that support multiple outputs, and they can use IN, OUT, and INOUT parameters. Java stored procedures can also return result sets. Stored procedures that use JDBC use a public method and pass null arguments to get an internal JDBC connection, as in Listing Three.

You can execute a Java stored procedure by using the standard SQL CALL statement, assuming you have the appropriate authority:

  • DBADM authority, SYSADM authority; or
  • CONTROL privilege for the package associated with the procedure.

  • EXECUTE privilege for the package associated with the procedure.

Registering a Procedure

Before using a Java stored procedure, you must install it in the proper location and register it with a CREATE PROCEDURE statement. You use CREATE PROCEDURE to provide information such as class and method identifiers, and the parameter passing style for Java (DB2GENERAL). You also define all stored Java procedures as EXTERNAL procedures. Listing Four is the declaration of a fenced Java stored procedure having one input argument, and a result set with two outputs.

When you register the procedure using CREATE PROCEDURE, DB2 updates SYSCAT.PROCEDURES in the system catalog. To create a procedure, you must have one of the following:

  • DBADM or SYSADM authority.
  • IMPLICIT_SCHEMA authority on the database, if the schema name doesn't exist.

  • CREATEIN privilege on the schema, if the schema name exists.

Extension Testing and Debugging

Java programmers who've been doing client-side development find that developing server-side classes for a database is an eye-opening experience. Database classes don't have a GUI so there are no problems with layout managers, such as the AWT. Client-side Java developers can use a variety of visual development environments and debuggers, but Java database developers often develop their own debugging and testing solutions. Techniques such as creating test drivers and writing debugging information to log files are helpful when testing classes to run in a database. The fundamental rule I emphasize is, if possible, to use a three-step process:

1. Test and debug your classes outside the database.

2. Test by installing the classes in a development database.

3. Install the classes in a production database.

If you are developing an unfenced UDF or stored procedure, you should expand this process with additional testing. Verify the UDF or procedure runs correctly as a fenced extension to both development and production databases, before installing it as an unfenced extension.

Conclusion

DBMS vendors have taken divergent paths when adding Java to their server product. Some vendors use a single JVM for all versions of their server. IBM, for instance, uses different JVMs for different operating systems. This is a double-edged sword. IBM's Java expertise is spread across product groups -- it isn't concentrated solely in database labs. Tools such as DB2 benefit, for example, when the AS/400 group improves on Java's garbage collection. On the other hand, IBM's database engineers must test DB2 UDB with different JVMs. Multiple JVMs also present a challenge for IBM customers running UDB on different operating systems. Developers writing DB2 code for multiple platforms must pay particular attention to differences in JVMs. Platform variations can produce differences in class implementations, garbage collection, and thread management. DB2 for Solaris, for example, supports Java as a true multithreaded environment by using native operating-system threads. Differences in the Java thread model can present challenges in areas such as thread synchronization and exception handling.

To address Java performance issues, IBM is turning to compilation and other optimizations. For OS/390, DB2 developers will be able to compile stored procedures to native code. Java programs use standard classes that reside in a file named classes.zip. For AS/400, IBM engineers use hotspot technology and precompile the classes.zip file so it is sharable across multiple instances of the Java VM.

With Java and SQLJ becoming standards, it is clear that Java will become a favored tool for extending DB2. For UDB 5.x and later versions, you can extend DB2 SQL by implementing Java UDFs and stored procedures. For UDB 5.2 and later versions, SQLJ lets you use static SQL and generate DB2 packages. Packages and stored procedures minimize network traffic and provide better version control and security.

IBM has more plans in store for Java. It has already committed to the adoption of Enterprise Java Beans (EJB) as the server programming model for Domino Go, WebSphere, and DB2. It will also migrate the SanFrancisco component framework to the EJB model. We can expect to see other Java developments from IBM.

DDJ

Listing One

SELECT id, surname, perannum(salary) from emp

Back to Article

Listing Two

CREATE FUNCTION metaname (en_string)
  RETURNS VARCHAR
  EXTERNAL NAME 'encode!metaname'
  LANGUAGE JAVA
  PARAMETER STYLE DB2GENERAL
  NO SQL
  FENCED

Back to Article

Listing Three

Connection con = getConnection ();

Back to Article

Listing Four

CREATE PROCEDURE BOOKS_IN_STOCK (IN BOOK_NUM  DOUBLE,
                     OUT PRICE DECIMAL(6,2),
                     OUT QTY DOUBLE)
EXTERNAL NAME 'books!instock'
RESULT SETS 1 
FENCED
LANGUAGE JAVA 
PARAMETER STYLE DB2GENERAL





Back to Article


Copyright © 1999, Dr. Dobb's Journal

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.