Native Queries for Persistent Objects

Among other benefits, native queries overcome the shortcomings of string-based APIs.


February 01, 2006
URL:http://www.drdobbs.com/database/native-queries-for-persistent-objects/184406432

February, 2006: Native Queries for Persistent Objects

William is an assistant professor of computer science at the University of Texas in Austin. Carl is chief software architect at db4objects. They can be contacted at [email protected] and [email protected], respectively.


While today's object databases and object-relational mappers do a great job of making object persistence feel native to developers, queries still look foreign in object-oriented programs because they are expressed using either simple strings or object graphs with strings interspersed. Let's take a look at how existing systems would express a query such as "find all Student objects where the student is younger than 20." This query (and the other examples in this article) assumes the Student class defined in Example 1. Different data access APIs express the query quite differently, as illustrated in Example 2. However, they all share a common set of problems:

Design Goals

Our goal is to propose a new approach that solves many of these problems. This article is an overview of the approach, not a complete specification. What if you could simply express the same query in plain Java or C#, as in Example 3? You could write queries without having to think about a custom query language or API. The IDE could actively help to reduce typos. Queries would be fully typesafe and accessible to the refactoring features of the IDE. Queries could also be prototyped, tested, and run against plain collections in memory without a database back end.
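To make the last point concrete, here is a minimal sketch of prototyping the "younger than 20" query against a plain in-memory list, with no database involved. The class name InMemoryQueryDemo, the helper namesYoungerThan20, the Student constructor, and the sample data are assumptions added for illustration; only the getters and the query expression come from Examples 1 and 3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InMemoryQueryDemo {

    // Mirrors the Java Student class of Example 1; the constructor is an
    // assumption added so that test data can be created.
    static class Student {
        private final String name;
        private final int age;

        Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        public String getName() { return name; }
        public int getAge() { return age; }
    }

    // Hypothetical helper: collects the names of all students younger than 20.
    static List<String> namesYoungerThan20(List<Student> students) {
        List<String> names = new ArrayList<>();
        for (Student student : students) {
            if (student.getAge() < 20) { // the query expression of Example 3
                names.add(student.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
            new Student("Alice", 19),
            new Student("Bob", 23),
            new Student("Carol", 18));
        System.out.println(namesYoungerThan20(students)); // prints [Alice, Carol]
    }
}
```

Because the condition is ordinary Java, a typo in getAge or a type mismatch in the comparison is reported by the compiler rather than discovered at runtime.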

At first, this approach seems unsuitable as a database query mechanism. Naively executing Java/C# code against the complete extent of all stored objects of a class would incur a huge performance penalty because all candidate objects would have to be instantiated from the database. A solution to this problem was presented in "Safe Query Objects" by William Cook and Siddhartha Rai [3].

The source code or bytecode of the Java/C# query expression can be analyzed and optimized by translating it to the underlying persistence system's query language or API (SQL [6], OQL [1,8], JDOQL [7], EJBQL [11], SODA [10], and so on), thereby taking advantage of indexes and other optimizations of a database engine. Here, we refine the original idea of safe query objects to provide a more concise and natural definition of native queries. We also examine integrating queries into Java and .NET by leveraging recent features of those language environments, including anonymous classes and delegates.

Therefore, our goals for native queries include:

Defining the Native Query API

What should native queries look like? To produce a minimal design, we evolve a simple query by adding each design attribute, one at a time, using Java and C# (.NET 2.0) as the implementation languages.

Let's begin with the class in Example 1. Furthermore, we assume that we want to query for "all students that are younger than 20 where the name contains an f."

  1. The main query expression is easily written in either programming language; see Example 4.
  2. We need some way to pass a Student object to the expression, as well as a way to pass the result back to the query processor. We can do this by defining a student parameter and returning the result of our expression as a Boolean value; see Example 5.
  3. Now we have to wrap the partial construct in Example 5 into an object that is valid in our programming languages. That lets us pass it to the database engine, a collection, or any other query processor. In .NET 2.0, we can simply use a delegate. In Java, we need a named method, as well as an object of some class to put around the method. This requires, of course, that we choose a name for the method as well as a name for the class. We decided to follow the example that .NET 2.0 sets for collection filtering. Consequently, the class name is Predicate and the method name is match; see Example 6.
  4. For .NET 2.0, we are done designing the simplest possible query interface. Example 6 is a valid object. For Java, our querying conventions should be standardized by designing an abstract base class for queries—the Predicate class (Example 7). We still have to alter our Java query object slightly by adding the extent type to comply with the generics contract (Example 8).
  5. Although Example 8 is conceptually complete, we would like to finish the derivation of the API by providing a full example. Specifically, we want to show what a query against a database would look like, so we can compare it against the string-based examples given in the introduction. Example 9 completes the core idea. We have refined Cook/Rai's concept of safe queries by leveraging anonymous classes in Java and delegates in .NET. The result is a more concise and straightforward description of queries.

Adding all required elements of the API in a step-by-step fashion lets us find the most natural and efficient way of expressing queries in Java and C#. Additional features, such as parameterized and dynamic queries, can be included in native queries using a similar approach [4]. We have overcome the shortcomings of existing string-based query languages and provided an approach that promises improved productivity, robustness, and maintainability without loss of performance.
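The derivation above can be exercised without any database. The following is a minimal sketch, assuming a hypothetical, unoptimized query processor that simply runs every candidate of an in-memory extent against the predicate. The Predicate class follows Example 7 and the anonymous query object follows Example 8; the class name PredicateQueryDemo, the query and youngWithF helpers, the Student constructor, and the sample data are assumptions for illustration and are not part of any db4o API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PredicateQueryDemo {

    // The abstract base class for queries, as in Example 7.
    static abstract class Predicate<ExtentType> {
        public abstract boolean match(ExtentType candidate);
    }

    // Student as in Example 1, plus an assumed constructor for test data.
    static class Student {
        private final String name;
        private final int age;

        Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        public String getName() { return name; }
        public int getAge() { return age; }
    }

    // Hypothetical unoptimized query processor: every candidate in the
    // extent is passed to match(), and the matches are collected.
    static <T> List<T> query(Iterable<T> extent, Predicate<T> predicate) {
        List<T> result = new ArrayList<>();
        for (T candidate : extent) {
            if (predicate.match(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    // "All students younger than 20 whose name contains an f",
    // written as the anonymous class of Example 8.
    static List<Student> youngWithF(List<Student> extent) {
        return query(extent, new Predicate<Student>() {
            public boolean match(Student student) {
                return student.getAge() < 20
                    && student.getName().contains("f");
            }
        });
    }

    public static void main(String[] args) {
        List<Student> extent = Arrays.asList(
            new Student("Stefan", 19),
            new Student("Jeff", 25),
            new Student("Fiona", 18)); // capital F only, so no match
        for (Student student : youngWithF(extent)) {
            System.out.println(student.getName()); // prints Stefan
        }
    }
}
```

An optimizing implementation would keep the same calling convention and only change how query decides which candidates to instantiate.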

Specification Details

A final and thorough specification of native queries is only possible after practical experience. Therefore, this section is speculative. We would like to point out where we see choices and issues with the native query approach and how they might be resolved.

Regarding the API alone, native queries are not new. Without optimizations, we have merely provided "the simplest concept possible to run all instances of a class against a method that returns a Boolean value." Such interfaces are well known: Smalltalk-80 [2, 5], for instance, includes methods to select items from a collection based on a predicate.

Optimization is the key new component of native queries. Users should be able to write native query expressions and the database should execute them with performance on par with the string-based queries that we described earlier.

Although the core concept of native queries is simple, the work needed to provide a solution is not trivial. Code written in a query expression must be analyzed and converted to an equivalent database query format. It is not necessary for all code in a native query to be translated. If the optimizer cannot handle some or all code in a query expression, there is always the fallback to instantiate the actual objects and to run the query expression code, or part of it, with real objects after the query has returned intermediate values. Because this may be slow, it is helpful to provide developers with feedback at development time. This feedback might include how the optimizer "understands" query expressions, and some description of the underlying optimization plan created for the expressions. This will help developers adjust their development style to the syntax that is optimized best and will enable developers to provide feedback about desirable improved optimizations.
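The fallback described above can be pictured as a two-stage evaluation. In the following minimal sketch, a TreeMap stands in for a database index on age, the "age < 20" part of the query is assumed to have been recognized by the optimizer and is answered from that index, and the remaining part of the expression runs as ordinary code on the intermediate candidates only. All names here (FallbackQueryDemo, indexByAge, query, youngWithF) are hypothetical; a real optimizer would derive the split from the query expression itself rather than take it as a parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

class FallbackQueryDemo {

    static abstract class Predicate<ExtentType> {
        public abstract boolean match(ExtentType candidate);
    }

    static class Student {
        private final String name;
        private final int age;

        Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        public String getName() { return name; }
        public int getAge() { return age; }
    }

    // Stand-in for a database index: maps each age to the students of that age.
    static TreeMap<Integer, List<Student>> indexByAge(List<Student> students) {
        TreeMap<Integer, List<Student>> index = new TreeMap<>();
        for (Student student : students) {
            index.computeIfAbsent(student.getAge(), k -> new ArrayList<>()).add(student);
        }
        return index;
    }

    // Stage 1: the part assumed to be optimizable (age < maxAge) is answered
    // by a range scan over the index.
    // Stage 2: the rest of the expression runs on the intermediate results.
    static List<Student> query(TreeMap<Integer, List<Student>> index,
                               int maxAge, Predicate<Student> rest) {
        List<Student> result = new ArrayList<>();
        for (List<Student> bucket : index.headMap(maxAge, false).values()) {
            for (Student candidate : bucket) {
                if (rest.match(candidate)) {
                    result.add(candidate);
                }
            }
        }
        return result;
    }

    static List<Student> youngWithF(List<Student> students) {
        return query(indexByAge(students), 20, new Predicate<Student>() {
            public boolean match(Student student) {
                return student.getName().contains("f"); // not optimized here
            }
        });
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
            new Student("Stefan", 19), new Student("Jeff", 25),
            new Student("Ann", 18), new Student("Ralf", 17));
        for (Student student : youngWithF(students)) {
            System.out.println(student.getName()); // prints Ralf, then Stefan
        }
    }
}
```

In this sketch Jeff is never passed to the unoptimized code, because the index scan already excludes him; that saving is what the optimizer is meant to provide.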

How will optimization actually work? At compile or load time, an enhancer (a separate application or a plug-in to the compiler or loader) inspects all native query expressions in source code or bytecode and generates additional code in the most efficient format the database engine supplies. At runtime, this substituted code is executed instead of the original Java/C# methods. This mechanism will be transparent to developers after they add the optimizer to their compilation or build process (or both).

Our peers have expressed doubts that satisfactory optimization is possible. Because both the native query format and the native database format are well defined, and because the development of an optimizer can be an ongoing task, we are very optimistic that excellent results are achievable. The first results that Cook/Rai produced with a mapping to JDO implementations are very encouraging. db4objects (http://www.db4o.com/) already shows a first preview of db4o with unoptimized native queries today and plans to ship a production-ready Version 5.0 with optimized native queries.

Ideally, any code should be allowed in a query expression. In practice, restrictions are required to guarantee a stable environment, and to place an upper limit on resource consumption. We recommend:

It seems desirable that processing should continue after any exception occurs in query expressions. A query expression that throws an uncaught exception should be treated as if it returned false. There should be a mechanism for developers to discover and track exceptions. We recommend that implementors support both exception callback mechanisms and exception logging.
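One way to picture this recommendation is the following minimal sketch of an unoptimized query loop that treats an exception thrown by the query expression as a non-match and reports it through a callback. The ExceptionCallback interface, the query and namesContainingF helpers, and the class name SafeMatchDemo are assumptions for illustration, not an existing API. Only RuntimeException is caught here, since match declares no checked exceptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SafeMatchDemo {

    static abstract class Predicate<ExtentType> {
        public abstract boolean match(ExtentType candidate);
    }

    // Hypothetical callback through which developers can track exceptions.
    interface ExceptionCallback {
        void onException(Object candidate, RuntimeException exception);
    }

    // Runs every candidate against the predicate. A candidate whose
    // evaluation throws is reported to the callback and treated as if
    // match() had returned false; processing then continues.
    static <T> List<T> query(Iterable<T> extent, Predicate<T> predicate,
                             ExceptionCallback callback) {
        List<T> result = new ArrayList<>();
        for (T candidate : extent) {
            boolean matched;
            try {
                matched = predicate.match(candidate);
            } catch (RuntimeException exception) {
                callback.onException(candidate, exception);
                matched = false;
            }
            if (matched) {
                result.add(candidate);
            }
        }
        return result;
    }

    // The expression throws a NullPointerException for a null name.
    static List<String> namesContainingF(List<String> names,
                                         ExceptionCallback callback) {
        return query(names, new Predicate<String>() {
            public boolean match(String name) {
                return name.contains("f");
            }
        }, callback);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Stefan", null, "Ann");
        // Here the callback doubles as a simple log to standard error.
        List<String> result = namesContainingF(names,
            (candidate, exception) -> System.err.println("query expression failed: " + exception));
        System.out.println(result); // prints [Stefan]
    }
}
```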

The sort order of returned objects might also be defined using native code. An exact definition goes beyond the scope of this article but, using a Java comparator, a simple example might look like Example 10. This code should be runnable both with and without an optimization processor. Querying and sorting could be optimized to be executed as one step on the database server, using the sorting functionality of the database engine.

Conclusion

There are compelling reasons for considering native queries as a mainstream standard. As we have shown, they overcome the shortcomings of string-based APIs. The full potential of native queries will be explored with their use in practice. They have already been demonstrated to provide high value in these areas:

Acknowledgments

Thanks to Johan Strandler for his posting to a thread at TheServerSide that brought the two authors together, Patrick Roomer for getting us started with first drafts of this paper, Rodrigo B. de Oliveira for contributing the delegate syntax for .NET, Klaus Wuestefeld for suggesting the term "native queries," Roberto Zicari, Rick Grehan, and Dave Orme for proofreading drafts of this article, and to all of the above for always being great peers to review ideas.

References

  1. Cattell, R.G.G., D.K. Barry, M. Berler, J. Eastman, D. Jordan, C. Russell, O. Schadow, T. Stanienda, and F. Velez, editors. The Object Data Standard ODMG 3.0. Morgan Kaufmann, 2000.
  2. Cook, W.R. "Interfaces and Specifications for the Smalltalk Collection Classes." OOPSLA, 1992.
  3. Cook, W.R. and S. Rai. "Safe Query Objects: Statically Typed Objects as Remotely Executable Queries." G.C. Roman, W.G. Griswold, and B. Nuseibeh, editors. Proceedings of the 27th International Conference on Software Engineering (ICSE), ACM, 2005.
  4. db4objects (http://www.db4o.com/).
  5. Goldberg, A. and D. Robson. Smalltalk-80: The Language and Its Implementation. Addison-Wesley, 1983.
  6. ISO/IEC. Information technology—database languages—SQL—Part 3: Call-level interface (SQL/CLI). Technical Report 9075-3:2003, ISO/IEC, 2003.
  7. JDO (http://java.sun.com/products/jdo/).
  8. ODMG (http://www.odmg.org/).
  9. Russell, C. Java Data Objects (JDO) Specification JSR-12. Sun Microsystems, 2003.
  10. Simple Object Database Access (SODA) (http://sourceforge.net/projects/sodaquery/).
  11. Sun Microsystems. Enterprise JavaBeans Specification, Version 2.1. 2002 (http://java.sun.com/j2ee/docs.html).

DDJ

(a)

// Java
public class Student {
   private String name;
   private int age;
   public String getName(){
      return name;
   }
   public int getAge(){
      return age;
   }
}


(b)
// C#
public class Student {
   private string name;
   private int age;
   public string Name {
      get { return name; }
   }
   public int Age {
      get{ return age; }
   }
}

Example 1: (a) Java class; (b) C# class.

// Java
// database stands for the query processor; it is not defined in these listings.
List<Student> students = database.query(
   new Predicate<Student>() {
     public boolean match(Student student) {
       return student.getAge() < 20 && student.getName().contains("f");
     }
});
// Sort the result by ascending age.
Collections.sort(students, new Comparator<Student>() {
  public int compare(Student student1, Student student2) {
    return student1.getAge() - student2.getAge();
  }
});

Example 10: Defining the sort order of returned objects.

(a)

String oql = "select * from student in AllStudents where student.age < 20";
OQLQuery query = new OQLQuery(oql);
Object students = query.execute();


(b)
Query query = persistenceManager.newQuery(Student.class, "age < 20");
Collection students = (Collection)query.execute();


(c)
Query query = database.Query();
query.Constrain(typeof(Student));
query.Descend("age").Constrain(20).Smaller();
IList students = query.Execute();

Example 2: (a) Object Query Language (OQL); (b) JDO Query Language (JDOQL); and (c) db4o SODA (using C#).

(a)

// Java
student.getAge() < 20


(b)
// C#
student.Age < 20

Example 3: (a) Java query; (b) C# query.

(a)

// Java
student.getAge() < 20 && student.getName().contains("f")


(b)
// C#
student.Age < 20 && student.Name.Contains("f")

Example 4: (a) Java; (b) C#.

// pseudoJava
(Student student){
  return student.getAge() < 20
    && student.getName().contains("f");
}

// pseudoC#
(Student student){
  return student.Age < 20
    && student.Name.Contains("f");
}

Example 5: PseudoJava and pseudoC#.

(a)

// Java
new Predicate(){
  public boolean match(Student student){
    return student.getAge() < 20
     && student.getName().contains("f");
  }
}


(b)
// C#
delegate(Student student){
  return student.Age < 20
    && student.Name.Contains("f");
}

Example 6: (a) Java; (b) C#.

// Java
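public abstract class Predicate<ExtentType> {
  // The constructor needs no type parameter of its own.
  public Predicate() {}
  // Returns true if the candidate belongs in the query result.
  public abstract boolean match(ExtentType candidate);
}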

Example 7: Predicate class.

new Predicate <Student> () {
  public boolean match(Student student){
    return student.getAge() < 20
      && student.getName().contains("f");
  }
}

Example 8: Adding the Java extent type.
