Channels ▼
RSS

Web Development

dbcoder: Generating Code For Your Database


February, 2004: dbcoder: Generating Code For Your Database

David writes custom software and lives in the Washington, D.C. area. He can be contacted at david@SilberSoft.com.


This article will discuss dbcoder, a tool that generates code based on a MySQL database schema and a code template. It can be used, for example, to generate a set of Perl classes to provide object-oriented access to a database.

Information about an arbitrary database's schema can be embedded into the generated code.

Usage

$ dbcoder —database=payroll —template=program

will produce a version of the program template customized to match the structure of the payroll database. This presumes that there is an available template called "program" and a database named "payroll." A selection of useful and demonstration templates are included as part of the the dbcoder distribution package. In addition, users may write their own templates to fulfill any particular need.

One of the primary products of dbcoder is code generated from the "Perl" template. This generates a set of Perl-language modules that provide classes specific to the named database and its tables. The "Perl" template is distributed as part of the dbcoder package.

dbcoder currently works with MySQL databases. I have started work toward support for PostgreSQL databases, but have since deferred that project to make time for other dbcoder enhancements.

Even though dbcoder is written in Perl, the templates can be written in any language—in fact, if a schema report is all you want, dbcoder can generate that from a template as easily as it generates program code. Note that dbcoder users do not have to know the details about how templates are written. End users can use one of the templates provided with the dbcoder software distribution, or templates written by others within their organization or by third parties. dbcoder will search out templates from various places on the local computer and can produce reports regarding them upon request.

Basic Template Structure

Each template resides within a directory containing at least an information file and a template subdirectory. The information file contains metainformation, which describes the template. The name of the top-level directory is the default name of the template. The template subdirectory contains the actual template files themselves.

Within the template files, a set of data is made available by means of template variables, which need only be mentioned within a set of special delimiters. For example, the following line in a template:

#  This code generated for use with the [[ Database.Name ]] database.

could result in this similar line of generated code:

#  This code generated for use with the payroll database.

Information more complicated than a simple variable can also be referenced. For example, this template snippet produces a list of tables in the database:

#  This database contains the following tables:
[[ FOREACH table = Database.Tables ]]
#    - [[ table.Name ]]
[[ END ]]

Code is normally evaluated in "database context." Within the above loop, information is evaluated within the context of the current table. It is possible to set up a template file to be processed in "table context," in which case that template file is used as the pattern for one generated file per database table.

As an example of how dbcoder might be used, let us say that you have a class for your program that (among other things) accesses data from a specific database table. (This would be generated in table context, as otherwise, there is no value that can be set for the top-level Table variable.)

You might have something like:

Package [[ Database.Name ]]::[[ Table.Name ]];
.
.
.

[[ FOREACH column = Table.Columns ]]
sub get[[ column.Name ]] {
...
}
[[ END ]]

This would result in a set of files of generated code similar to:

Package payroll::timesheetentry;
.
.
.

sub getemployeeid {
...
}

sub getprojectid {
...
}

sub getstarttime {
...
}

sub getduration {
...
}

sub getgoal {
...
}

sub getdone {
...
}

sub getcomment {
...
}

.
.
.

But this example is pretty boring. Suppose you wanted to make the accessor methods that deal with strings pass their results through a translation function before returning them. You could make a template that includes:

[[ FOREACH column = Table.Columns ]]
sub get[[ column.Name ]] {
...
[[ IF column.Type == DBCoder.ColumnType.String ]]
return babblefish( $result );
[[ ELSE ]]
return $result;
[[ END ]]
}
[[ END ]]

This would result in generated code similar to:

Package payroll::timesheetentry;
.
.
.

sub getemployeeid {
...
return $result;
}

sub getprojectid {
...
return $result;
}

sub getstarttime {
...
return $result;
}

sub getduration {
...
return $result;
}

sub getgoal {
...
return babblefish( $result );
}

sub getdone {
...
return babblefish( $result );
}

sub getcomment {
...
return babblefish( $result );
}

.
.
.

The possibilities are limited only by the imagination of the template author.

Directory Variables

The variable data provided to templates is also available to be used within the file names of generated files. The file names of the template files need to be constructed with underscores surrounding the name of the desired variable in the place where the variable values should be substituted.

The use of the variable name Database.Tables marks the template file to be evaluated in table context. If the variable name is used in a directory's name, all files within that directory are likewise evaluated in table context.

Parameters

The template author can specify a set of parameters that will be recognized by the template. Parameters are declared (and generally defined) in a parameter's file in the top-level directory of a template. A parameter can be referenced from within a template just like any built-in template variable, except that parameters are contained in simple variables with only one value.

End users may choose to set values for these parameters, and the files they generate from templates that use the named parameters will include that information.

The Perl Template Toolkit

Originally, templates contained embedded Perl code and the dbcoder program used eval to process that code. The intention was to leverage the power of the Perl eval statement to evaluate code with database information embedded in Perl code.

While this was a nifty hack, it had some major problems. It meant that the templates had to be written by someone who was fluent in Perl—even when the language of the template code was something other than Perl. It also not only exposed the internals of dbcoder to the templates, but it required that template authors be very knowledgeable about those internals. This made it very difficult to change the dbcoder program. Almost any part of it might be used by a template that would then be broken if that part changed. Finally, the template code itself was ugly. It was necessary for the template author to protect any text that should not be evaluated by the dbcoder program. Perl programs in particular tended to turn into exercises in creative string escapes.

But then we discovered the Perl Template Toolkit. Template Toolkit is a CPAN module that encapsulates the general idea of processing a template file and generating a result file. It allows for a set of variable information to be provided to the template. It was not a hard decision to use it instead of the dbcoder template-parsing code.

This brought us a variety of benefits:

  • The dbcoder code base is reduced in size, which eases maintenance of the dbcoder project.
  • The template syntax became much more regular.
  • Many users will already be familiar with the syntax.
  • While we could have again exposed the internal dbcoder objects to the template, we chose instead to provide the Template Toolkit (thus, the templates) with a very specific set of data. The interface to the templates is now totally separate from the internal layout of dbcoder. This opened the way for a lot of code cleanup, which in turn led to a less cluttered and more understandable code base.

Data Collection

Option setting can come from the command line or from a user's project file. If the user has set parameter values, they are collected from the project file as part of this process. dbcoder collects metadata about all the templates that it can find. Templates are searched for in a template path, which by default contains $HOME/.dbcoder and the directory under which the templates distributed with dbcoder were installed.

This collection of data is searched to find the requested template(s). If dbcoder was invoked to produce a report instead of to generate code, it might be that all of the templates' data will be used.

The template metadata is made available to the template by way of the template variable. Specific data within the template information will be available through variables such as Template.Author or Template.Version. The full set of template metadata can be found in the "dbcoder Template Authors' Reference."

The complete tree of database information (database, table, column) is collected from the database engine. This information includes the name of each entity, their relative position within the data hierarchy, and—for columns—their type. Relation and index information are read from the project file. (When dbcoder was written, MySQL did not support references.)

Generating Code from Templates

When it is time to generate result files, the template files from each requested template are processed. (The metainformation was already collected early on.) The hierarchy of template files is traversed depth-first. Each template file is used to generate a result file with the same name. Certain characters embedded in the template file name can be used to substitute information from the template variables. If that information is the name of a table, the contents of that template file or directory are used to generate one result file or directory for each table in the database.

For the most part, template files are processed in database context, but if the template is being processed for a specific table (table context), the information about that table is made available as a top-level template variable.

When each template file is handed off to the Template Toolkit for processing, the name for the target file (possibly modified to contain variable data) and the set of templates variables data is also provided. Variable information includes dbcoder metainformation (version, timestamp of this invocation, etc.), template metainformation (author, version, versions of dbcoder it can work with, etc.), database schema information, and parameters.

Lessons Learned in Development

Testing. dbcoder was conceived and written by John Romkey in 1997. I took over the maintenance and further development (under John's supervision) in January 2001. Coming in to the dbcoder project, I was confronted with a large chunk of code that had some problems and for which we had many goals. After studying the code for a while, I could see that I needed some way to test the existing functionality. There was a "t" directory, which is commonly used for testing by Perl modules, but it had a series of empty files and one file with a few lines of code in it. There was no documentation, and I was unfamiliar with that testing setup. Rather than get side-tracked learning that testing methodology, I chose to create my own tests.

In retrospect, I think that was only a partial mistake. On the one hand, I should have gone with the standard test setup. On the other hand, many of the tests I wrote to fit into my specially constructed test framework would not have worked in the standard test framework. I likely could have saved myself a great deal of effort, though, by using the standard test framework for the simpler tests.

Language Features. As I was wrapping up work on the dbcoder project, I realized that I had not taken as much advantage of inheritance as I could have. The MySQL and PostgreSQL classes should inherit from Database, Table and Column should perhaps inherit from an Element class, and TemplateList and ParameterHash should descend from a common List ancestor.

Documentation. I put a lot of effort into the dbcoder documentation. I added POD to each module so that manual pages could be generated. The POD was interspersed with the code rather than being stuck at the bottom so that it also served as the per-subroutine headers. This worked out well, as documentation that became out of date tended to get noticed and fixed. I think that if the documentation had been separate it would have tended to drift much more than it did.

In addition to the program code documentation, I wrote a set of books targeted to the various roles people take on as they use software. In this case, the roles are "End User," "Administrator," "Template Author," and "dbcoder Developer." I also turned the POD documentation into sections in the appendix of the Developers' Manual. I think this was a good thing for the program modules. It was a bit of overkill, though, to do the same for the test scripts. For a project with more people working on it, that might have made more sense. The books were written in DocBook, which was a good choice, I think, as there are a good assortment of tools for it.

All told, I think the work was successful. The project is in a much better state now then when I started and there is documentation to show how to use the program and to enlighten the next person to work on it, or even to remind me, should I come back to it after a long absence.

More information regarding dbcoder may be found at http://www .dbcoder.org/.

TPJ


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.