Tiny Perl Server Pages and MySQL

Our authors use MySQL and Perl Server Pages to explore web-site personalization techniques.


June 01, 2002
URL:http://www.drdobbs.com/database/tiny-perl-server-pages-and-mysql/184405072

Jun02: Using Tiny Perl Server Pages and mySQL

Andy is a solution architect and Hung-Wai an MCSE. They can be reached at [email protected] and [email protected], respectively.


There are three basic reasons why organizations invest in web-based information technology: to increase revenue, reduce cost, and improve customer relationships. But when visiting web sites, customers increasingly want more than static HTML documents; they now expect personalized content and interaction with the site. In this article, we'll explore personalization techniques that involve technologies ranging from cookies and URL rewriting to tracking user sessions and state management. The tools we use are MySQL and the custom tag library facility of Tiny Perl Server Pages (TPSP); for more information on TPSP, see "A Tiny Perl Server Pages Engine," by Andy Yuen, DDJ, August 2001. In the process, we'll present ConsoleApp, a tool that facilitates building web applications using the Model-View-Controller paradigm. While the code presented here is written in TPSP, the techniques can easily be applied to other programming and scripting languages.

TPSP and Custom Tags

TPSP unites client-side HTML, server-side scripting, and component-based development to produce dynamic pages. TPSP has many JSP-like features including custom tag support. The latter lets you develop custom tag modules to encapsulate complex server-side behaviors and business rules into simple XML-like elements that content developers can use. The only requirements for running TPSP are that your web server supports CGI and that you have Perl 5.005 or later installed on your system. In other words, TPSP is platform- and web-server independent.

Like JSP, TPSP lets you create custom tags. But, unlike JSP, you are not required to provide an XML-based Tag Library Descriptor (TLD) with each tag library you develop. Listing One is a TPSP page that demonstrates the use of nested tags. It uses the custom tags <test:loop>, <test:if>, <test:condition>, <test:then>, and <test:else> to generate 10 random numbers and display either "Head" or "Tail" in an unordered list, depending on whether each random number is greater than 0.5.

Cookies and Hidden Fields

Unlike JSP, previous versions of TPSP did not provide a built-in session object for session tracking. Since HTTP is a stateless protocol, this is a limitation if you want to implement personalization because personalization relies on the availability of a session-tracking mechanism.

A session is a set of interactions between the client's web browser and a web server. A session starts when users first invoke a site's URL and ends when they terminate their browser, or when the web server closes the session after a certain period of inactivity (timeout). Since HTTP is a stateless protocol, it requires an external mechanism to save the state information between client/web server interactions. The most common mechanisms used to maintain state are cookies and hidden fields.

Cookies are name-value pairs similar to CGI query strings. They are sent back and forth between the browser and web server in the HTTP header and are maintained by the browser. They can be created to have an indefinite lifespan or to expire when the browser terminates. Cookies are easy to use. To create one, call the cookie method on TPSP's built-in $request object: $mycookie = $request->cookie(-name=>$COOKIENAME, -value=>$sid, -expires=>'+30m');, where $COOKIENAME is defined elsewhere and contains the name of the cookie, $sid contains the session identifier (SID), and '+30m' makes the cookie expire 30 minutes from now.

A cookie is then sent to the browser in the HTTP header: Out->print($request->header(-cookie=>$mycookie));. To read a cookie, use: $sid = $request->cookie($COOKIENAME);.
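To make concrete what travels in the HTTP header, here is a minimal sketch in plain Perl (no CGI.pm) that formats a Set-Cookie header line and parses the Cookie header a browser sends back. The helper names format_set_cookie and parse_cookie_header are hypothetical and are not part of TPSP; in a TPSP page, the cookie and header methods shown above do this work for you. The sketch assumes that names and values contain only cookie-safe characters.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of TPSP): build a Set-Cookie header line.
# $max_age is the lifetime in seconds; omit it for a cookie that expires
# when the browser terminates. Assumes $name and $value are cookie-safe.
sub format_set_cookie {
    my ($name, $value, $max_age) = @_;
    my $line = "Set-Cookie: $name=$value; Path=/";
    $line .= "; Max-Age=$max_age" if defined $max_age;
    return $line;
}

# Hypothetical helper: parse the value of a request's Cookie header
# ("a=1; b=2") into a hash of name => value.
sub parse_cookie_header {
    my ($header) = @_;
    my %cookies;
    for my $pair (split /;\s*/, defined $header ? $header : '') {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $cookies{$name} = defined $value ? $value : '';
    }
    return %cookies;
}

# A 30-minute session cookie, and the SID read back from a later request.
my $header  = format_set_cookie('ConsoleApp', 'abc123', 30 * 60);
my %cookies = parse_cookie_header('ConsoleApp=abc123; theme=dark');
print "$header\n";
print "SID from cookie: $cookies{ConsoleApp}\n";
```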

One limitation of cookies is their size. A cookie can hold at most about 4 KB of text, and unpredictable behavior may result if you assign a larger value. This makes it impractical to store large amounts of information in cookies. Another drawback is that cookies cannot be passed between different domains. There is also a limit on the number of cookies you can create for a particular domain. However, the most severe drawback is that users can disable cookies in the browser (usually for privacy reasons). Hence, a site must have a fallback mechanism for storing state information.

Hidden fields are one of the many HTML INPUT types. They are set by servers and do not correspond to any displayable user-interface element. To use hidden fields to store state information, use: <INPUT TYPE="hidden" NAME="STATE" VALUE="state info">.

The drawback to hidden fields is that stored information has to be sent across the network multiple times during a session. For example, if you save state information in the first page accessed by users, you have to send the information back in one or more hidden fields. If users add some more information in a subsequent page, you have to add that information in the next page you send, and so forth. This can quickly become messy. If your site relies on numerous existing legacy pages, it may be difficult to change all the legacy code to accommodate the use of hidden fields and preserve state information.
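The following minimal sketch shows why this gets messy: every page has to emit a hidden field for each piece of state collected so far. The helper names escape_html and hidden_fields and the sample data are hypothetical, not part of TPSP.

```perl
use strict;
use warnings;

# Escape the characters that are significant inside an HTML attribute value.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper (not part of TPSP): emit one hidden INPUT element per
# state item. Each page of the session must repeat all items gathered so far.
sub hidden_fields {
    my (%state) = @_;
    my @fields;
    for my $name (sort keys %state) {
        push @fields, sprintf '<INPUT TYPE="hidden" NAME="%s" VALUE="%s">',
            escape_html($name), escape_html($state{$name});
    }
    return join "\n", @fields;
}

# State gathered on the first page, plus what the user added on the second.
my %state = (name => 'Andy', item => 'T-shirt "L"');
print hidden_fields(%state), "\n";
```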

Security is another problem for both cookies and hidden fields: both are sent to the client, where users can read and alter them. Hence, storing sensitive information directly in cookies and hidden fields is not recommended.

URL Rewriting

Instead of saving user information in cookies or hidden fields directly, the alternative is to only store an SID in them and save the state information on the server side. In case cookies are disabled on the client's browser, you can use URL rewriting to achieve a similar result. This involves the inclusion of the SID in the HTTP query string of the URL:

HTTP://domainName/cgi-bin/something.pl?SID=xxxxxxxxx
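As a minimal sketch of the rewriting step, the function below appends the SID to a link's query string, replacing any SID that is already present. The name add_sid is hypothetical and not part of TPSP, and the sketch assumes the SID contains only URL-safe characters.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of TPSP): append the SID to a URL's query
# string, replacing any SID parameter that is already there.
# Assumes $sid needs no URL escaping.
sub add_sid {
    my ($url, $sid) = @_;

    # Split off a fragment (#...) so the SID lands in the query string.
    my $fragment = $url =~ s/(#.*)\z//s ? $1 : '';

    # Drop an existing SID parameter, then tidy up a leftover separator.
    $url =~ s/([?&])SID=[^&]*&?/$1/;
    $url =~ s/[?&]\z//;

    my $sep = $url =~ /\?/ ? '&' : '?';
    return "$url${sep}SID=$sid$fragment";
}

print add_sid('/cgi-bin/something.pl', 'xxxxxxxxx'), "\n";
print add_sid('/cgi-bin/something.pl?cmd=list', 'xxxxxxxxx'), "\n";
```

Every link and form action that a page emits would have to pass through such a function for the session to survive.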

Once you use URL rewriting, you have to make sure that all URLs referencing your TPSP pages include this SID in the query string; otherwise, the session will be lost. Listing Two first tries to save the SID in a cookie and, when that fails, falls back to URL rewriting. The approach works as follows:

  1. Read the SID stored in the named cookie.

  2. If the cookie is present, record that you are using the cookie and go to Step 5.

  3. If SID is present in the query string, record that you are using URL rewriting and go to Step 5.

  4. If you get here, either cookies have been disabled or this is the beginning of a new session. Generate a new and unique SID, then redirect to the same URL with the SID included in the query string and with a cookie carrying the SID as its value in the HTTP header. The redirection will bring you back to Step 1.

  5. The SID is now retrieved from the client. Check if the session is still valid before proceeding to service the request. If the session has expired, go back to Step 4.
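The five steps above can be sketched as a single decision function. This is an illustration under stated assumptions, not the code behind Listing Two: the names new_sid and resolve_session are hypothetical, the validity check is passed in as a code reference, and the SID generator is a simple hash that would need a cryptographically strong source in production.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical SID generator: hash of time, process ID, and a random number.
# Good enough for a sketch; not strong enough for a production site.
sub new_sid {
    return md5_hex(join ':', time, $$, rand);
}

# Decide how to handle a request, following Steps 1-5.
#   $cookie_sid : SID read from the named cookie (undef if absent)
#   $query_sid  : SID read from the query string (undef if absent)
#   $is_valid   : code ref returning true if the session has not expired
sub resolve_session {
    my ($cookie_sid, $query_sid, $is_valid) = @_;

    my ($sid, $via);
    if ($cookie_sid) {                       # Steps 1-2: cookie present
        ($sid, $via) = ($cookie_sid, 'cookie');
    }
    elsif ($query_sid) {                     # Step 3: SID in query string
        ($sid, $via) = ($query_sid, 'url_rewriting');
    }

    # Step 5: serve the request only if the session is still valid.
    if ($sid && $is_valid->($sid)) {
        return { action => 'serve', sid => $sid, via => $via };
    }

    # Step 4: no SID, or an expired one. Redirect with a fresh SID in both
    # the query string and a cookie; the next request starts at Step 1.
    return { action => 'redirect', sid => new_sid() };
}

my %live  = (abc123 => 1);                   # stand-in for the session table
my $valid = sub { exists $live{ $_[0] } };

my $result = resolve_session(undef, 'abc123', $valid);
print "$result->{action} via $result->{via}\n";
$result = resolve_session(undef, undef, $valid);
print "$result->{action} with new SID $result->{sid}\n";
```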

State Management Strategies

Once you have a mechanism to identify a session by its SID, you must determine how to save the state information on the server. If you are using ASP or JSP, you can use the built-in session object to store the information in memory. PHP has a similar mechanism called "session variables." Some of these technologies also implement the URL rewriting mechanism.

There are several ways to save the session data. Hewlett-Packard's Bluestone J2EE application server offers at least three options for state management:

By far, the most common approach is to make the data persistent in a relational database. For example, IBM's J2EE WebSphere application server uses a shared HttpSession (part of the Java Servlet API) implementation. Each interaction with the shared HttpSession is a transaction with a relational database. The SID is used to identify a row in the database. Serialized Java objects are saved in the database as binary large objects (BLOBs).

TPSP also takes the relational-database approach, using MySQL.

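The session table is simple, having only four columns (see Listing Three): sid, the session identifier and primary key; mtime and ctime, two timestamps; and state, a text column that holds the saved state.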

A page object, session object, and several TPSP custom tags have been developed to facilitate state management in TPSP.
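To illustrate server-side state keyed by SID, here is a minimal sketch that uses an in-memory hash as a stand-in for the session table of Listing Three. It is not the PECS::Session implementation. The function names, the 15-minute timeout, the percent-encoded text format for the state column, and the reading of mtime as "last saved" and ctime as "created" are all assumptions made for illustration.

```perl
use strict;
use warnings;

my %session_table;        # stand-in for the MySQL session table: sid => row
my $TIMEOUT = 15 * 60;    # assumed inactivity limit, in seconds

# Percent-encode everything except unreserved characters (byte strings only).
sub pct_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~-])/sprintf '%%%02X', ord $1/ge;
    return $text;
}

sub pct_decode {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $text;
}

# Flatten a hash of state into text suitable for the "state" column.
sub freeze_state {
    my (%state) = @_;
    my @pairs;
    push @pairs, pct_encode($_) . '=' . pct_encode($state{$_})
        for sort keys %state;
    return join '&', @pairs;
}

# Rebuild the hash from the stored text.
sub thaw_state {
    my ($text) = @_;
    my %state;
    for my $pair (split /&/, $text) {
        my ($key, $value) = split /=/, $pair, 2;
        $state{ pct_decode($key) } = pct_decode(defined $value ? $value : '');
    }
    return %state;
}

# Insert or update the row for $sid; $now is the current time in seconds.
sub save_session {
    my ($sid, $now, %state) = @_;
    my $row = $session_table{$sid} ||= { sid => $sid, ctime => $now };
    $row->{mtime} = $now;
    $row->{state} = freeze_state(%state);
    return;
}

# Return a hash ref of saved state, or undef if the session is unknown
# or has been idle for longer than $TIMEOUT.
sub load_session {
    my ($sid, $now) = @_;
    my $row = $session_table{$sid} or return;
    if ($now - $row->{mtime} > $TIMEOUT) {
        delete $session_table{$sid};
        return;
    }
    my %state = thaw_state($row->{state});
    return \%state;
}

my $start = time;
save_session('abc123', $start, user => 'andy', cart => 'book, pen');
my $state = load_session('abc123', $start + 60);
print "user=$state->{user} cart=$state->{cart}\n";
my $later = load_session('abc123', $start + 3600);
print $later ? "still valid\n" : "timed out\n";
```

Keeping only the SID on the client and everything else in the table avoids both the cookie size limit and the need to resend state with every page.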

A Framework Based On the MVC Design Pattern

Model-View-Controller (MVC) is an abstraction that helps you divide functionality among objects to minimize the degree of coupling among them. The "model" deals with the business rules and data, "view" with presentation aspects of the application, and "controller" accepts and interprets user requests and controls the model and possibly multiple views to fulfill these requests.

Several approaches are possible in using MVC. One example is Struts, an open-source initiative from the Jakarta Project sponsored by the Apache Software Foundation. It uses J2EE servlets and JSP technologies. (For more information on Struts, see http://jakarta.apache.org/struts/.)

In TPSP's approach, each web application is made up of one controlling TPSP and one or more model and view TPSPs. The controlling TPSP receives all HTTP requests from clients and directs those requests to the appropriate model and view TPSPs.

The model TPSPs are responsible for invoking the business objects (in our case, Perl modules). In a nontrivial application, the model TPSP extracts parameters from the HTTP request's query string and translates them to a form suitable for invoking business objects. These objects should be designed to maximize reusability and hence should not normally be made to handle HTTP requests directly. Model TPSPs do not produce any direct output. After processing, they send the request back to the controlling TPSP so that it can direct the proper view TPSPs to create the content and send it to the client. The model stores the data in the TPSP page object for view TPSPs to pick up.

The controlling TPSP authenticates users for each request that it receives. That is, it makes sure that users have already logged in to the application before certain pages can be accessed.

Each application has an entry in the appl_global_mapping and multiple entries in the appl_page_mapping tables. These table entries map client HTTP requests to the appropriate model and view TPSPs. Listing Three is the schema for these tables.
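The controller's table-driven routing can be sketched as follows, with Perl hashes standing in for rows of appl_global_mapping and appl_page_mapping. This is not the PECS:Application implementation: the page names are made up, the model is represented by a code reference, and the rule that a user may run a command when the user's security level is at least the command's authen_level is an assumption about how those columns are used.

```perl
use strict;
use warnings;

# Stand-ins for one appl_global_mapping row and two appl_page_mapping rows.
# All page names are made up for illustration.
my %app = (login_page => 'login.psp', error_page => 'error.psp');
my %page_mapping = (
    list => { model        => 'list_model.psp',
              success_view => 'list.psp',
              failure_view => 'error.psp',
              authen_level => 0 },
    edit => { model        => 'edit_model.psp',
              success_view => 'edited.psp',
              failure_view => 'edit_form.psp',
              authen_level => 5 },
);

# Pick the view for a request. $run_model is a code ref standing in for
# invoking the model TPSP; it returns true on success.
# Assumption: a command is allowed when $sec_level >= its authen_level.
sub dispatch {
    my ($cmd, $sec_level, $run_model) = @_;

    my $map = $page_mapping{$cmd} or return $app{error_page};
    return $app{login_page} if $sec_level < $map->{authen_level};

    return $run_model->($map->{model}) ? $map->{success_view}
                                       : $map->{failure_view};
}

my $model_ok   = sub { 1 };
my $model_fail = sub { 0 };
print dispatch('list', 0, $model_ok), "\n";      # open to everyone
print dispatch('edit', 0, $model_ok), "\n";      # not logged in
print dispatch('edit', 9, $model_fail), "\n";    # model reported failure
```

Because the routing lives in data rather than code, adding a page means adding a row, which is what lets one controlling TPSP serve many applications.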

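The fields (columns) in the appl_global_mapping table are appname (the primary key), path, login_page, error_page, home_page, mdate, and comment.

The fields in the appl_page_mapping table are appname, cmd, model, success_view, failure_view, authen_level, and comment; appname and cmd together form the primary key.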

Listing Four is the controlling TPSP. Its complex logic has been hidden by TPSP custom tags. The custom tags use the page and session objects introduced in earlier sections. This controlling TPSP can be reused by other web applications with only minor changes: the cookieName attribute in the PECS:UseSession tag and the name attribute in the PECS:Application tag. The rest is handled by the configuration information in the database. (PECS is short for "TPSP E-Commerce System.")

The page object is not persisted between user and server interaction. Its purpose is to provide a simple in-memory area for TPSPs to exchange information, for example, between model and view TPSPs. The only method it provides (other than the constructor) is attribute. To set an attribute, use: $page->attribute('errMsg', "error message");. To retrieve an attribute, use: $msg = $page->attribute('errMsg');.
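A class with this interface takes only a few lines of Perl. The sketch below is a hypothetical stand-in named My::Page, not the PECS::Page source; it only mirrors the get/set behavior of the attribute method described above.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for the page object; not the PECS::Page source.
    package My::Page;

    sub new {
        my ($class) = @_;
        return bless { attrs => {} }, $class;
    }

    # With a value, set the attribute; without one, just return it.
    sub attribute {
        my ($self, $name, @value) = @_;
        $self->{attrs}{$name} = $value[0] if @value;
        return $self->{attrs}{$name};
    }
}

my $page = My::Page->new;
$page->attribute('errMsg', 'error message');    # set by a model TPSP
my $msg = $page->attribute('errMsg');           # read by a view TPSP
print "$msg\n";
```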

The session object is derived from the page object. Other than the constructor and the attribute method inherited from the page object, it provides the following methods:

The TPSP custom tags used include:

The TPSP template (tpsp.tpl is available electronically together with the complete TPSP package and example applications; see "Resource Center," page 5) has been modified to support these new features. However, the new template is compatible with TPSP pages built using previous versions of TPSP.

To support the session and page object, the built-in $request and $response objects are now instantiated using class Request, a subclass of CGI, instead of using CGI directly as in previous versions of TPSP. The Request class provides one new method, attribute, which functions in the same way as the page object's attribute method. Its main use is for custom tags to retrieve the session and page objects, and other information items that are necessary to perform a task.

The ConsoleApp Example Application

To illustrate the concepts discussed here, we now present a web-based management console for the creation and administration of web applications using the MVC architecture. You can actually use it to change and enhance its own behavior or to create new web applications. The application is bootstrapped using a SQL script that creates a database and inserts data into the appl_global_mapping, appl_page_mapping, user, and security tables. You have to change the absolute path for the application to point to the directory where you are putting all the TPSPs. The web server must be configured to allow CGI scripts to execute in this directory. Write access is also needed because TPSPs are translated dynamically at least once. Figure 1 is a typical ConsoleApp screen. This application only helps you define which TPSPs are needed to handle which requests and where they are located on the server. You are responsible for developing these TPSPs yourself.

There are a few things to be aware of when examining the source code:

Conclusion

In a future article, we'll delve into personalization, explain what it is, how it works, and enhance the web-based ConsoleApp to support the building and administering of web applications that provide rule-based personalized product recommendation. Updates, if any, will be posted on the TPSP home site at http://www.playsport.com/psp_home/.

DDJ

Listing One

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<TITLE>Nested Custom Tag Demo</TITLE>
</HEAD>
<BODY>
<H3>PSP Nested Custom Tag Demo</H3><P>
<UL>
<test:loop repts=10>
<LI>
<test:if>
  <test:condition><%= (rand > .5) %></test:condition>
  <test:then>Head</test:then>
  <test:else>Tail</test:else>
</test:if>
</test:loop>
</UL>
</BODY>
</HTML>


Listing Two

 ...
# extract SID from cookie
my $sid = $request->cookie($self->{'cookiename'});
if ($sid) {
    # cookie present: record the fact that we are using cookie
    $url_flag = 2;
}
else {
    # cookie not present
    if (($sid = $request->param("SID"))) {
        # sid found in query string: record that we are using URL rewriting
        $url_flag = 1;
    }
    else {
        # no sid found, redirect url using both cookie and url rewriting
        $self->_sendSid();
        exit(0);
    }
}
 ...
# check if session has timed out
 ...


Listing Three

DROP DATABASE IF EXISTS pecs;
CREATE DATABASE pecs;
USE pecs;
DROP TABLE IF EXISTS appl_global_mapping;
CREATE TABLE  appl_global_mapping(
  appname   varchar(32) NOT NULL PRIMARY KEY,
  path     varchar(128) NOT NULL,
  login_page    varchar(32) NOT NULL,
  error_page    varchar(32) NOT NULL,
  home_page varchar(32) NOT NULL,
  mdate     date NOT NULL,
  comment   text
);
DROP TABLE IF EXISTS appl_page_mapping;
CREATE TABLE  appl_page_mapping(
  appname   varchar(32) NOT NULL,
  cmd       varchar(32) NOT NULL,
  model     varchar(32) NOT NULL,
  success_view  varchar(32) NOT NULL,
  failure_view  varchar(32) NOT NULL,
  authen_level  int unsigned NOT NULL,
  comment   text,
  PRIMARY KEY (appname, cmd)
);
DROP TABLE IF EXISTS user;
CREATE TABLE user  (
  username  varchar(16) NOT NULL PRIMARY KEY,
  passwd    varchar(16) NOT NULL,
  email     varchar(64) NOT NULL
);
DROP TABLE IF EXISTS security;
CREATE TABLE security  (
  username  varchar(16) NOT NULL,
  appname   varchar(32) NOT NULL,
  sec_level int unsigned NOT NULL,
  PRIMARY KEY (username, appname)
);
DROP TABLE IF EXISTS session;
CREATE TABLE session  (
  sid       varchar(36) NOT NULL PRIMARY KEY,
  mtime     timestamp NOT NULL,
  ctime     timestamp NOT NULL,
  state     text
);
GRANT select, insert, update, delete
ON pecs.*
TO pecs@localhost identified by 'xfiles';


Listing Four

<%!
use PECS::Request;
use PECS::Session;
use PECS::Page;

my $page = new PECS::Page;
$request->attribute('page', $page);
%>
<PECS:UseSession timeout="15" cookieName="ConsoleApp" 
         dsn="DBI:mysql:pecs" user="pecs" password="xfiles">
    <PECS:Application name="ConsoleApp" /> 
</PECS:UseSession> 



Figure 1: Typical ConsoleApp screen.
