
Python Server Pages: Part II


Feb00: Python Server Pages: Part II

Kirby is a Microsoft Certified Software Developer and a contributing author to the Quick Python Book (Manning, 1999). He can be contacted at [email protected].


In the first installment of this two-part article (see "Python Server Pages: Part I," DDJ, January 2000), I introduced Python Server Pages (PSP), a JPython and Java Servlet-based server-side scripting engine. To recap, I created PSP to allow developers familiar with Microsoft's Active Server Pages (ASP) development to write HTML pages with a script embedded in them. The page containing the script is executed on the server and the results are sent to the user's browser. You could, of course, use something like Java Server Pages (JSP) to do this, but when I created PSP, JSP was not available. Besides, Java strikes me as more of a system programming tool, not a scripting language.

PSP uses JPython, the Java-based version of the Python programming language, as its scripting language (see "Examining JPython," DDJ, April 1999). In Part I, I looked at how HTML pages with embedded scripts are translated into compilable JPython code by a Python-based code generator. That's only a small part of the work involved, however. This month, I'll examine the Java Servlet side of PSP, which contains all of the code to compile and execute the JPython code in response to a request from a user.

PSP source code and related files are available electronically at http://www.ciobriefings.com/psp.htm or from DDJ (see "Resource Center," page 7).

Figure 1 shows what happens when users request web pages processed by a Java Servlet engine. The servlet engine installs a filter in the web server so it has a chance to process any user requests. When the servlet engine sees a request it is interested in, the engine processes the request and sends the output back to the web server, which returns it to users. A servlet is associated with a file extension by registering it with the engine, which then invokes the servlet whenever the user requests a file with that extension (such as helloworld.psp).

PSPServlet

The PSPServlet class (Listing One) is the entry point for the servlet. The servlet engine loads it when it receives the first request that the servlet is to handle. (Some servlet engines let you configure the server so your servlet is loaded when the engine starts. This saves users from suffering through the load time.) PSPServlet contains a static code block that is executed when the class is loaded. The static block sets up the services used by PSP for the rest of its execution. Because Java servlets need to be multithreaded, it was important to make sure this initialization code was executed before anything else.

The purpose of the initialization code is to load and configure the JPython compiler and execution environment. More specifically, this section of code:

1. Loads the Python sys module, which provides access to the Python module search path used by JPython.

2. Updates the module search path to include the directory where PSP is installed.

3. Loads the Python code-generation module (discussed in Part I of this article).

Listing One also shows the PSPServlet class's interaction with the PSP class (see Listing Two). The PSP class consists entirely of static members and provides several utility functions to the rest of the application. In this case, the PSP.pspRoot field holds the location of the PSP application. This is a nifty trick that I learned from looking at the source code for JPython (like PSP, JPython's source code is freely available, so free in fact that it is installed when you install JPython). The PSP method findRoot searches through the Java classpath looking for psp.jar, which is the file that contains the PSP class files. If someone else packages his classes into psp.jar, then I'm toast, but this is an easy way to find out information that Java does not normally provide.

You need to know where PSP is installed, because this is where the Python module containing the code generator is. I could have put the code generator in the normal JPython library directory (an early version of PSP did that), but after installing PSP on several servers I found this to be a hassle to maintain. Every server had a different location for JPython, which made it hard to remember where the file was when I wanted to update it. Now the code generator (cg.py) is placed in the directory where the "psp.jar" file is, making it a trivial matter for PSPServlet to find and load. This also makes it easy to update when new versions of PSP are released.
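The classpath search described above can be tried in isolation. The following is a minimal sketch of the same idea as findRoot in Listing Two, with the classpath, jar name, and separator passed in as parameters so the behavior is easy to observe; the class name ClasspathRootSketch and the sample paths are illustrative assumptions, not part of PSP.

```java
class ClasspathRootSketch {
    // Returns the text between the start of the classpath entry that
    // mentions jarName and the jar name itself (normally the directory
    // that holds the jar), or null if the jar is not on the classpath.
    static String findRoot(String classpath, String jarName, String pathSeparator) {
        if (classpath == null) {
            return null;
        }
        int jar = classpath.toLowerCase().indexOf(jarName.toLowerCase());
        if (jar == -1) {
            return null;
        }
        // Back up to the separator that begins this classpath entry.
        int start = classpath.lastIndexOf(pathSeparator, jar) + 1;
        return classpath.substring(start, jar);
    }

    public static void main(String[] args) {
        String cp = "/opt/jpython/jpython.jar:/opt/psp/psp.jar:/opt/lib/other.jar";
        System.out.println(findRoot(cp, "psp.jar", ":"));                   // /opt/psp/
        // A plain substring match is easy to fool, which is the risk noted above.
        System.out.println(findRoot("/opt/lib/mypsp.jar", "psp.jar", ":")); // /opt/lib/my
    }
}
```

The second call shows why the approach is convenient rather than robust: any classpath entry whose name merely contains "psp.jar" matches first.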

The major work performed by PSPServlet is concentrated in the service method, which is called by the servlet engine whenever a request is made that the servlet should handle. In this case, PSPServlet.service is called when a web browser requests a file ending in ".psp" from the server. PSP's service method is straightforward and basically looks for some other object to shift the work to -- a PSPAppContext object. Python Server Pages use application workspaces to keep track of PSP pages and allow them to interact. Each workspace is managed by an instance of PSPAppContext (PSPAppContext.java is also available electronically). The remaining methods of PSPServlet work together to determine which application context can service the request.

The getApplication method of PSPServlet is responsible for looking through the cache for the application. When the cache has no entry, the loadApplication method creates a new application context based on the physical path of the page being loaded. If the directory containing the page also contains a file called "global.psa," a new context is created for this application and is tied to this path. If no global.psa is found there, loadApplication searches the parent directory for the file. This searching goes on until a global.psa file is found or the root of the web server is encountered. If the root is reached without finding the file, then the page is assumed to belong to the default application context. This is exactly the way that ASP looks for its global.asa file. This also means that a PSP application can contain subdirectories and still belong to the same application. It is the global.psa file that controls where an application begins or ends.
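The upward search for global.psa can be sketched without a servlet engine. The example below is a simplified illustration, not the code from Listing One: it works on virtual paths that use "/" as the separator, and the hasGlobalScript predicate is a hypothetical stand-in for the real check that a global.psa file exists in a directory.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

class AppRootSketch {
    static final String DEFAULT_APP = "/";

    // Walks from the page's directory toward the web root and returns the
    // first directory for which hasGlobalScript is true. If none is found,
    // the page belongs to the default application.
    static String findAppRoot(String pageDir, Predicate<String> hasGlobalScript) {
        String dir = pageDir;
        while (!dir.isEmpty()) {
            if (hasGlobalScript.test(dir)) {
                return dir;
            }
            int slash = dir.lastIndexOf('/');
            if (slash < 0) {
                break;
            }
            dir = dir.substring(0, slash); // "/spam/reports" -> "/spam" -> ""
        }
        return DEFAULT_APP;
    }

    public static void main(String[] args) {
        // Pretend only /spam contains a global.psa file.
        Set<String> dirsWithGlobalPsa = new HashSet<>(Arrays.asList("/spam"));
        Predicate<String> check = dirsWithGlobalPsa::contains;
        System.out.println(findAppRoot("/spam/reports/monthly", check)); // /spam
        System.out.println(findAppRoot("/eggs", check));                 // /
    }
}
```

Pages in /spam and in any of its subdirectories resolve to the same application, which is the property the paragraph above describes.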

So ends the process of determining which instance of PSPAppContext can handle the request. The entry point to this process, getApplication, is a synchronized method. Web servers are multithreaded to handle multiple user requests; Java servlets should also be prepared to serve multiple requests. Synchronized methods are used in a few selected instances to protect code that is not otherwise thread safe.
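To see why the entry point is synchronized, consider a check-then-create cache. The sketch below is an assumption-laden simplification (a plain Object stands in for PSPAppContext, and AppCacheSketch is a hypothetical name): without the synchronized keyword, two threads could both see a missing entry and each create its own context for the same application.

```java
import java.util.HashMap;
import java.util.Map;

class AppCacheSketch {
    private final Map<String, Object> apps = new HashMap<>();
    private int created = 0;

    // The lookup and the creation happen under one lock, so only one
    // context is ever created for a given application name.
    synchronized Object getApplication(String name) {
        Object app = apps.get(name);
        if (app == null) {
            app = new Object(); // stand-in for creating an application context
            apps.put(name, app);
            created++;
        }
        return app;
    }

    synchronized int createdCount() {
        return created;
    }

    public static void main(String[] args) throws InterruptedException {
        AppCacheSketch cache = new AppCacheSketch();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> cache.getApplication("/spam"));
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(cache.createdCount()); // 1
    }
}
```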

PSPAppContext

The application context provides a workspace within the JPython interpreter that related pages can interact within. The JPython interpreter (available from the PythonInterpreter class) implements the Python language within a single environment space. If all PSP application pages were to interact within this same space, they could adversely affect each other. In the movie Ghostbusters, the team wants to know why they shouldn't cross the streams of the unlicensed nuclear accelerators on their backs. Harold Ramis's rather vague answer is, "That would be bad." I don't know what having all of the pages within multiple unrelated applications in the same environment would be like, but it definitely sounds bad to me. The PSPAppContext class was created to oversee the interactions with the JPython interpreter. Pages within an application can exchange data while applications are prevented from interacting (within the JPython interpreter at least) with each other.

It is amazing how simple Python Server Pages started out to be. Getting JPython to execute a block of code is as simple as a call to the PythonInterpreter method exec (see Listing Three). In fact, the original version of PSP had only one class and most of the work was handled within the service method of PSPServlet.

If a function called Application_Start exists in global.psa, it is called as soon as the PSPAppContext object is created. You can use this method to set up any services required by the PSP application. I have used this method to initialize data structures, log into databases, clean up temporary files, and similar activities. Any variables you create in global.psa are available to any of the executing PSP pages. PSP applications -- like the Java servlets they are based on -- must support multiple simultaneous requests. This means that any data structures you provide in global.psa should be read only, or use JPython's synchronization features to control access.

One of the more complicated activities performed by PSPAppContext is to get the given page translated into syntactically correct JPython and execute the code. The processPage method is the starting point for this process and is called by PSPServlet's service method. Like application contexts, PSPAppContext keeps a cache of the pages being executed. Why cache the pages? Here are the steps involved in executing a single PSP page from scratch:

1. Translate the virtual path of the page into a physical path (/spam/displaymenu.psp into c:\inetpub\wwwroot\spam\displaymenu.psp).

2. Open the physical page and process it into a real JPython script.

3. Compile the script into Java bytecodes.

4. Execute the page.

Reading the page, translating it into JPython, compiling it, and finally executing it are expensive processes. Once the page has gone all the way through step 3, the compiled bytecodes (really a PyCode object from JPython) are stored in the g_scripts hashtable. The timestamp of the original script file is stored in g_dates. If you update the actual page, the script is loaded and recompiled the next time it is accessed. This makes testing and updating your PSP applications more convenient. Every time you update a page, Python Server Pages makes the changes immediately available. On the other hand, checking the timestamp each time a page is executed is a little expensive itself. Perhaps a future version will contain a configuration option to turn this feature off.
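The timestamp check can be illustrated with two maps that play the roles of g_scripts and g_dates. This is a sketch under stated assumptions rather than the PSPAppContext code: a String stands in for the compiled PyCode object, the compiler function is a hypothetical placeholder for the translate-and-compile steps, and the caller supplies the file's last-modified time (in real code, something like File.lastModified()).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class PageCacheSketch {
    private final Map<String, String> scripts = new HashMap<>(); // role of g_scripts
    private final Map<String, Long> dates = new HashMap<>();     // role of g_dates
    private final Function<String, String> compiler;
    private int compileCount = 0;

    PageCacheSketch(Function<String, String> compiler) {
        this.compiler = compiler;
    }

    // Returns the cached script, recompiling only when the page is new
    // or its timestamp differs from the one recorded at compile time.
    synchronized String getScript(String path, long lastModified) {
        Long cachedDate = dates.get(path);
        if (cachedDate == null || cachedDate.longValue() != lastModified) {
            scripts.put(path, compiler.apply(path));
            dates.put(path, Long.valueOf(lastModified));
            compileCount++;
        }
        return scripts.get(path);
    }

    synchronized int compileCount() {
        return compileCount;
    }

    public static void main(String[] args) {
        PageCacheSketch cache = new PageCacheSketch(path -> "compiled:" + path);
        cache.getScript("/spam/menu.psp", 1000L); // first request: compile
        cache.getScript("/spam/menu.psp", 1000L); // unchanged: cache hit
        cache.getScript("/spam/menu.psp", 2000L); // page edited: recompile
        System.out.println(cache.compileCount()); // 2
    }
}
```

The cost mentioned above is visible here: every call still needs the current timestamp, even when the cached copy is used.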

The methods getPythonScript, loadPythonServerPage, and compile are in charge of maintaining the cache. When attempting to execute a page, the first stop is getPythonScript. If the page cannot be located within the cache (or has been updated since being cached), the loadPythonServerPage method is called. After a wait, this method sends the page through the code generator, which stores its results (a JPython script) in a file of its own choosing. The filename is returned to loadPythonServerPage, which in turn passes it to the compile method. This simple method loads the file into memory, then calls the standard JPython method __builtin__.compile to translate the page into Java bytecodes. Assuming all this goes as planned, the compiled page is ready for execution.

Upon looking through the code, you will notice that PSP does not limit its interaction with JPython to the PythonInterpreter class. PythonInterpreter provides most of the functionality needed to integrate JPython in simple applications; however, more advanced applications require tighter coupling to JPython. I have found JPython's __builtin__ and Py modules to be useful. There is no documentation for these modules, so expect to spend time with the JPython source code if you choose to integrate your Java application this tightly with JPython.

In the actual source code, there are several versions of PSPAppContext's exec method, but PSPAppContext.java (available electronically) shows the main method. This is a simple method because most of the work is done in ExecContext. The ExecContext class is in charge of setting up the execution environment when using JPython to execute a single page within the application. To perform the magic of keeping each application's namespace separated, PSPAppContext and ExecContext use JPython-based dictionaries to hold all of the objects related to the application. PSPAppContext actually sets this up in its constructor when it creates a copy of the dictionary it was passed by PSPServlet. Dictionaries are basically hashtables that JPython uses to store and look up variable names, functions, classes, objects, and anything else created by executing JPython scripts. The beauty of JPython is that when you execute a script, you can provide your own dictionary to be used for this purpose. ExecContext's purpose is to place various objects into the namespace prior to a script being executed.

So what does ExecContext put into the execution environment? A script that can't find out about its environment would not be very useful. A Java servlet has access to Request and Response objects that provide access to information coming into the servlet (Request) and information going back out of the servlet (Response). So that the scripts can access these objects, ExecContext places them into the dictionary used by PSPAppContext when executing a script. Unfortunately, this must be done each time a script is executed, because the Request and Response objects are only valid during a single call to the servlet's service method.
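The namespace handling can be pictured with ordinary Java maps. In this sketch, which is only an analogy for the JPython dictionaries PSP uses, each application copies the base dictionary it is given, and the Request and Response entries are rebound before every execution. The class name NamespaceSketch and the string placeholders for the request and response objects are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class NamespaceSketch {
    private final Map<String, Object> namespace;

    // Each application starts from its own copy of the base dictionary,
    // so names created by one application are invisible to the others.
    NamespaceSketch(Map<String, Object> base) {
        this.namespace = new HashMap<>(base);
    }

    // Rebinds the per-request objects; this has to happen before every
    // execution because they are only valid for a single service call.
    Map<String, Object> prepare(Object request, Object response) {
        namespace.put("Request", request);
        namespace.put("Response", response);
        return namespace;
    }

    public static void main(String[] args) {
        Map<String, Object> base = new HashMap<>();
        base.put("sys", "sys-module");

        NamespaceSketch spam = new NamespaceSketch(base);
        NamespaceSketch eggs = new NamespaceSketch(base);

        // A page in the spam application creates a variable.
        spam.prepare("request-1", "response-1").put("menu", "today's specials");

        System.out.println(eggs.prepare("request-2", "response-2").containsKey("menu")); // false
        System.out.println(spam.prepare("request-3", "response-3").get("menu"));
        System.out.println(spam.prepare("request-4", "response-4").get("Request"));      // request-4
    }
}
```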

Once the namespace is completely configured, the exec method has little to do. It has a namespace and block of Java bytecodes to execute, which it does. Figure 2 shows how these Python Server Pages objects interact within the servlet to produce the output that is sent to the browser.

The code generator translates most everything into a call to a __write__ method. There is no such method in JPython, but the code generator expects you will define one before you try to use the fruits of its labor. In this case, I want all of the output of the pages to go to the user's browser. PSPAppContext accomplishes this by routing all calls to __write__ to the Response.write method. The code in PSPAppContext sets all this up by creating and compiling a __write__ method into the namespace that will be used by any executing pages.
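The indirection through __write__ is easier to see in miniature. The sketch below is not generated PSP code: runGeneratedPage is a hand-written stand-in for a translated page, the namespace is a plain Java map, and a StringWriter plays the role of the servlet response. What it illustrates is that the page only knows the name __write__, and the host decides where that name sends the output.

```java
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

class WriteHookSketch {
    // Stand-in for a translated page: all output goes through whatever
    // the host has bound to the name __write__ in the namespace.
    static void runGeneratedPage(Map<String, Object> namespace) {
        @SuppressWarnings("unchecked")
        Consumer<String> write = (Consumer<String>) namespace.get("__write__");
        write.accept("<html><body>");
        write.accept("Hello, " + namespace.get("name"));
        write.accept("</body></html>");
    }

    // The host binds __write__ to the response before running the page.
    static String render(String name) {
        StringWriter response = new StringWriter();
        Consumer<String> hook = response::write;
        Map<String, Object> namespace = new HashMap<>();
        namespace.put("__write__", hook);
        namespace.put("name", name);
        runGeneratedPage(namespace);
        return response.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("world")); // <html><body>Hello, world</body></html>
    }
}
```

Binding the same name to a different sink (a log file or a test buffer, for example) would redirect the page's output without regenerating the page.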

Pythonizing

In the original version of PSP, the Request and Response objects provided by the servlet engine were passed directly to the executing script. As soon as I started writing PSP applications, however, I hated that most of the page looked and worked like Python code while these objects were clearly Java. For instance, this is how you would get a parameter passed to your page using the servlet version of Request:

parm = Request.getParameter("myparm")

Because the parameters are really just a hashtable of values, it is a shame that I couldn't use Python's normal mode of accessing keyed values -- the dictionary. Consequently, PSP is now composed of several objects (PyRequest, PyResponse, PyParams, and so on) that are wrappers around their servlet counterparts. In PSP, the aforementioned section of code now looks like this:

parm = Request.params["myparm"]

This may not look like a huge change, but to a Python programmer it means everything in the world. All of the major objects passed to the PSP pages have been "Pythonized" in some way to make them more palatable to Python programmers.
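The wrapper idea can be sketched on the Java side. The real PyParams class is not shown in this article, so everything below is hypothetical: ParamsSketch adapts a getParameter-style lookup to dictionary-like behavior (an error on a missing key, a membership test, and a lookup with a default), and how JPython would map the subscript syntax onto such a wrapper is outside the scope of the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Function;

class ParamsSketch {
    private final Function<String, String> getParameter;

    // The function stands in for a call such as Request.getParameter(name).
    ParamsSketch(Function<String, String> getParameter) {
        this.getParameter = getParameter;
    }

    // Dictionary-style lookup: a missing key is an error, not a null.
    String getItem(String key) {
        String value = getParameter.apply(key);
        if (value == null) {
            throw new NoSuchElementException(key);
        }
        return value;
    }

    // Membership test, in the spirit of a dictionary's has_key.
    boolean hasKey(String key) {
        return getParameter.apply(key) != null;
    }

    // Lookup with a fallback, in the spirit of a dictionary's get.
    String get(String key, String fallback) {
        String value = getParameter.apply(key);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        Map<String, String> query = new HashMap<>();
        query.put("myparm", "42");
        ParamsSketch params = new ParamsSketch(query::get);

        System.out.println(params.getItem("myparm"));      // 42
        System.out.println(params.hasKey("other"));        // false
        System.out.println(params.get("other", "none"));   // none
    }
}
```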

Conclusion

With PSP's foundation in Java, applications written using PSP are portable to any environment where Java and Java servlets can be found. From a web developer's view, especially an Active Server Pages trained developer, PSP offers a cheap way to implement server-side scripting on a platform that is portable to virtually every Java platform available. To make you feel more at home, PSP provides many of the same objects and services available to ASP applications.

DDJ

Listing One

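import java.io.File;
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.python.core.*;

public class PSPServlet extends HttpServlet {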
  PyObject m_cgEngine = null;
  static {
    // Initialize the JPython interpreter
    PSP.interp.exec( "import sys" );
    // Put our installation directory at the beginning of the JPython
    // search path.  That means our modules get loaded before anything else,
    // keeping us away from any nasty module collisions.
    PSP.interp.set( "PSPSearchPath", Py.java2py(PSP.pspRoot) );
    PSP.interp.exec( "sys.path = [PSPSearchPath] + sys.path" );
    // Load up the JPython based code generator module used by PSP
    PyObject cg = __builtin__.__import__( new PyString("cg") );
    PyObject cgVersion = cg.__getattr__( "Version" );

    // Create an instance of the code generator that we can use
    String cachedir = PSP.pspRoot + "cache";
    PyObject engineClass = cg.__getattr__( "cgEngine" );
    PyObject engine = engineClass.__call__( Py.java2py(cachedir) );
    PSP.codeGenerator = engine;
  }
  public void service(ServletRequest req, ServletResponse res)
      throws ServletException, IOException {
    String psp = ((HttpServletRequest)req).getServletPath();
    // Get an application object to satisfy this request.
    PSPAppContext ac = getApplication( (HttpServletRequest)req );
    ac.processPage(
        psp, (HttpServletRequest)req,
        (HttpServletResponse)res );
  } // service
  synchronized PSPAppContext getApplication( HttpServletRequest req ) {
    String psp = req.getServletPath();
    psp = psp.replace( '/', File.separatorChar );
    psp = psp.replace( '\\', File.separatorChar );
    // Get application name
    File pspFile = new File( psp );
    String appName = pspFile.getParent();

    PSPAppContext app = (PSPAppContext) PSP.getApp( appName );
    if ( app == null )
        return loadApplication( appName, req );
    else
        return app;
  } // getApplication
  PSPAppContext loadApplication( String name, HttpServletRequest req ) {
    File pspFile = new File( name );
    String appName = name;
    // Look for a global script file indicating the
    // base directory of an application.
    while ( appName != null ) {
        String globalScript = req.getRealPath(
            appName + File.separatorChar + PSP.GLOBAL_SCRIPT);
        File f = new File(globalScript);
        if ( f.exists() )
            break;
        pspFile = new File( appName);
        appName = pspFile.getParent();
    } // while
    if ( appName == null )
        appName = new String( "" + File.separatorChar);
    PSPAppContext app = (PSPAppContext) PSP.getApp( appName );
    if ( app == null ) {
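      // NOTE: 'tr' is not declared anywhere in this listing; it is
      // presumably defined in the full source available electronically.
      app = new PSPAppContext(
        (PyStringMap)PSP.interp.getLocals(), tr, appName );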
      PSP.addApp( appName, app );
    }
    // If we didn't find this application in the same
    // directory where we started
    if ( !appName.equals(name) ) {
      // Put this application object in the cache
      // under the directory name where we finally found it.
      PSP.addApp( name, app );
    }
    return app;
  } // loadApplication
}


Listing Two

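import java.util.Enumeration;
import java.util.Hashtable;
import javax.servlet.http.Cookie;
import org.python.core.*;
import org.python.util.PythonInterpreter;

public class PSP {
  static PythonInterpreter interp = new PythonInterpreter();
  static PyObject codeGenerator = null;
  static Hashtable apps = new Hashtable(); // Cache for Application Contexts
  static String pspRoot = findRoot();
  static final String GLOBAL_SCRIPT = "global.psa";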

  // finds the directory where psp is installed
  public static String findRoot() {
    // Look for psp.jar in java.class.path
    String classpath = System.getProperty("java.class.path");
    if (classpath == null) return null;
    int jar = classpath.toLowerCase().indexOf("psp.jar");
    if (jar == -1) {
      return null;
    }
    // Keep the text from the start of this classpath entry up to the jar name
    int start = classpath.lastIndexOf(java.io.File.pathSeparator, jar)+1;
    return classpath.substring(start, jar);
  } // findRoot
  // adds a new application context to the cache
  static public synchronized void addApp( String name, PSPAppContext app ) {
    apps.put( name, app );
  }
  // gets a cached application context from the cache
  static public synchronized PSPAppContext getApp( String name ) {
    return (PSPAppContext)apps.get( name );
  }
  // clears the application cache for the servlet
  static public synchronized void clearCache( boolean newValue ) {
    apps = new Hashtable();
    System.gc();
  }
  // returns a python dictionary with statistics for the loaded apps
  static public synchronized PyDictionary getAppStats() {
    Hashtable ht = new Hashtable( apps.size() );
    Enumeration e = apps.elements();
    while ( e.hasMoreElements() ) {
      PSPAppContext app = (PSPAppContext)e.nextElement();
      Hashtable htPages = new Hashtable( app.g_scripts.size() );
      // Pairs keys with dates positionally, which assumes g_scripts and
      // g_dates enumerate their entries in the same order
      Enumeration scripts = app.g_scripts.keys();
      Enumeration dates = app.g_dates.elements();
      while ( scripts.hasMoreElements() ) {
        PyString script = new PyString( (String)scripts.nextElement() );
        PyLong date = new PyLong( ((Long)dates.nextElement()).longValue() );
        htPages.put( script, date );
      } // while
      ht.put( Py.java2py(app.g_appName), new PyDictionary(htPages) );
    } // while
    return new PyDictionary( ht );
  } // getAppStats
  // generate a PyDictionary containing HTTP cookies
  static public PyDictionary makeCookies( Cookie[] cookies ) {
    if ( cookies == null )
      return new PyDictionary();
    Hashtable ht = new Hashtable( cookies.length );
    for( int i = 0; i < cookies.length; i++ ) {
      Cookie cookie = cookies[i];
      ht.put( Py.java2py( cookie.getName() ),
          Py.java2py( cookie ) );
    } // for
    return new PyDictionary( ht );
  } // makeCookies
}


Listing Three

import org.python.util.PythonInterpreter;
import org.python.core.*;

public class SimpleEmbedded {
    public static void main(String []args) throws PyException {
        PythonInterpreter interp = new PythonInterpreter();

        System.out.println("Hello, brave new world");
        interp.exec("import sys");
        interp.exec("print sys");

        interp.set("a", new PyInteger(42));
        interp.exec("print a");
        interp.exec("x = 2+2");
        PyObject x = interp.get("x");

        System.out.println("x: "+x);
        System.out.println("Goodbye, cruel world");
    }
}


