Dr. Dobb's is part of the Informa Tech Division of Informa PLC




Python Server Pages: Part I


Jan00: Python Server Pages: Part I

Kirby is a Microsoft Certified Software Developer and a contributing author to the Quick Python Book (Manning, 1999). He can be contacted at [email protected].


Python Server Pages (PSP) constitute a server-side scripting engine designed along the lines of Microsoft's Active Server Pages (ASP) and Sun's Java Server Pages (JSP) specification. The major difference between ASP and PSP is that PSP is written in 100 percent Java and is, therefore, portable to a wide variety of platforms. Web applications written in ASP, on the other hand, can be run only on Microsoft platforms. Unlike JSP, PSP uses JPython as its scripting language, which seems more appropriate for scripting web sites than Java, the systems language used in JSP.

One benefit of using PSP is the huge number of add-on modules available for Python and JPython (both available at http://www.python.org/). With PSP, you can access any module that is compatible with JPython from within your PSP application's pages. And because JPython is itself written in Java, you can access Java packages from PSP applications as well. Since creating PSP, I've deployed it in several intranet sites, and it is currently being evaluated for commercial viability in Internet e-commerce and portal sites.

In the first installment of this two-part article, I'll present PSP. In the next installment, I'll examine PSP's Java servlet code, which contains everything needed to compile and execute the JPython code in response to a request from a user. PSP source code and related files are available electronically at http://www.ciobriefings.com/psp/, or from DDJ (see "Resource Center," page 5).

A Developer's Perspective

HelloWorld.psp (see Listing One) is a PSP implementation of the familiar "Hello World" program. As you can see, it looks similar to Microsoft's Active Server Pages (ASP) and VBScript. All that this simple page (see Figure 1) does is display a user's IP address to prove that a server-side scripting engine processed the page.

There are four things you should notice about HelloWorld.psp:

  • Like ASP, HTML and server-side scripts are mixed throughout the file.

  • Instead of the ASP script markers <% and %>, PSP uses $[ and ]$.

  • PSP, like ASP, provides an include call to import other scripts into the current script.

  • PSP provides an ASP-like object model including the Request object.

One thing that attracted me to web programming in the first place was the ability to use server-side scripting from within my HTML pages. I generally use a tool such as FrontPage to create an initial page design, then rename the file and begin inserting script statements to customize the page. It was important for my development team that I replicate the paradigm they were used to in the Windows NT/IIS environment even when deploying to Solaris. PSP does this by assuming that the contents of the page will be sent directly to the user's web browser. PSP only interferes with the page when it comes across its script markers.

Still, I made the script markers for PSP different from those of ASP because I didn't want someone to confuse PSP pages with ASP/VBScript pages. This was shortsighted. It turns out there is a Python add-on called "win32all" (http://www.python.org/windows/win32all/) that plugs a Python interpreter into ASP. If I had used the ASP script markers, my PSP-based applications might have been portable to ASP.

Like ASP, PSP provides Request, Response, Session, and Server objects that scripts can use to find out about and interact with their HTTP environment. In HelloWorld.psp, the Request object's server dictionary is queried to determine the user's IP address. Although the objects are most likely familiar to you, they have been "Pythonized" so they have a Python feel. For instance, data such as form variables and server parameters are implemented as Python dictionaries instead of Visual Basic-like collections.
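To make the dictionary-style access concrete, here is a minimal sketch written in present-day Python 3 rather than JPython. FakeRequest is a hypothetical stand-in, not PSP's actual Request class; only the server dictionary and the REMOTE_ADDR key come from Listing One, and the form attribute name is an assumption made for illustration.

```python
class FakeRequest:
    """Hypothetical stand-in for PSP's Request object (illustration only)."""

    def __init__(self, server, form=None):
        # Server parameters exposed as a plain dictionary, as in Listing One.
        self.server = dict(server)
        # Form variables as a dictionary; the attribute name "form" is assumed.
        self.form = dict(form or {})


def greeting(request):
    # Same lookup as HelloWorld.psp: index the server dictionary by key.
    return "Hello, %s, from Python Server Pages." % request.server["REMOTE_ADDR"]


request = FakeRequest({"REMOTE_ADDR": "192.0.2.10"}, {"name": "Kirby"})
message = greeting(request)
# Ordinary dictionary operations work: membership tests, keys(), get().
has_agent = "HTTP_USER_AGENT" in request.server
```

Because the data are ordinary dictionaries, a page can loop over keys() or test for a missing entry without any collection-specific API.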

Figure 2 illustrates the general architecture of PSP. What made this solution suitable for me was that everything runs inside of the Java Virtual Machine. As long as I stay away from any Java VM idiosyncrasies, I shouldn't have any problems deploying on Windows NT for those projects that require it, and then on Solaris later on. Eventually, I did have problems moving the final application from Windows NT to Solaris, but it wasn't because of Python Server Pages.

The enabling technology here is JPython, Jim Hugunin's Java implementation of the Python language (for more information, see my article "Examining JPython," DDJ, April 1999). Python offers many benefits over a lot of other scripting languages. In this situation, Python's ability to compile code on-the-fly was vital. With this feature, PSP can take a page that is a combination of HTML and Python, turn it into a series of Java bytecodes, and execute it inside the JVM. In general, here is what happens inside of PSP when users request a web page:

1. Text is read from a .PSP file.

2. The file is converted to a syntactically correct Python module.

3. The Python module is compiled by the JPython built-in compile method.

4. The compiled code is executed.
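The four steps above can be sketched in a few lines. This is an illustration in present-day Python 3, not the PSP servlet: page_to_python and run_page are hypothetical names, the converter ignores $[ ]$ script blocks entirely, and a string literal stands in for the .PSP file read in step 1. It relies only on the built-in compile and exec functions.

```python
def page_to_python(page_text):
    """Step 2, greatly simplified: every line becomes a __write__ call.

    This toy converter ignores $[ ... ]$ script blocks entirely.
    """
    lines = []
    for line in page_text.splitlines():
        # %r produces a quoted, escaped Python string literal.
        lines.append("__write__(%r)" % line)
    return "\n".join(lines) + "\n"


def run_page(page_text):
    source = page_to_python(page_text)              # step 2
    code = compile(source, "<psp page>", "exec")    # step 3
    output = []
    # The host supplies __write__; here it collects text in a list.
    namespace = {"__write__": output.append}
    exec(code, namespace)                           # step 4
    return "\n".join(output)


# Step 1 would read this text from a .PSP file; a literal stands in for it.
html = run_page("<HTML>\n<BODY>Hello</BODY>\n</HTML>")
```

Keeping the compiled code object around is also what makes caching possible: a host could compile once and call exec again for each later request.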

The PSP servlet employs caching to improve performance and implements some other features that make it more usable. Most of the application is concentrated in the Java Servlet that processes requests from the web server. However, one of the hardest parts to write was the code generator used in step 2.

Web Page to Python Code

I prototyped the code generator in JPython. Eventually, I would like to rewrite the generator in Java, but the current code is working remarkably well. Listing Three is an abbreviated version of cgEngine (the code generator class) and related classes. cgEngine is designed to work in a variety of situations other than HTML generation. I have removed some of that support code to concentrate on the parts that are important to PSP.

Again, the code generator turns hybrid HTML/Python files into compatible Python code. Listing Two is the output from the code generator for the HelloWorld.psp page. I've deleted the lines relating to Banner.psp for now, but I'll address them in Part 2 of this article. All that has really happened is that Python statements to output the HTML have replaced the HTML lines. Any Python expressions embedded within the HTML are blended into the final Python statement.

The __convertToPython method of cgEngine is basically a filter function that processes an input file (the unadulterated PSP page) into an output file (a Python module). To do this, the method reads a line of input, then converts it to Python by calling the parseLine method.

The cgEngine class implements a state machine to keep track of what is going on within the input file. parseLine is the primary processing method for this engine and controls the state of cgEngine. The method processes the given line into a series of tokens, each either a Text or an Expression object. The state machine starts out expecting normal, non-Python text and converts any such text it finds into a Text object. When the script prefix ($[) is encountered, the state changes and any text found is wrapped by an Expression object. If the parseLine method is processing an expression and finds the script postfix (]$), the state shifts back to processing normal text. What to do inside each state is relatively easy; the hard part is finding out when to change states. cgEngine actually had to have four states instead of the two I mentioned; of course, it took some experimentation to figure this out. Figure 3 shows the states I eventually identified and how parseLine progresses from one state to the next.
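The four-state idea can be shown in isolation with a compact sketch. This is a present-day Python 3 illustration, not the code from Listing Three: the tokenize function and the tuple-based tokens are assumptions made for brevity, and unlike cgEngine the sketch does not carry state from one line to the next, so a script block left open at the end of a line is dropped.

```python
NORMAL, FOUND_PREFIX, SCRIPT_OPEN, FOUND_POSTFIX = range(4)


def tokenize(line):
    """Split one line into ("text", s) and ("expr", s) tuples."""
    tokens = []
    state = NORMAL
    text, expr = "", ""
    for ch in line:
        if state == NORMAL:
            if ch == "$":
                state = FOUND_PREFIX        # might be the start of "$["
            else:
                text += ch
        elif state == FOUND_PREFIX:
            if ch == "[":
                state = SCRIPT_OPEN         # "$[" confirmed
                if text:
                    tokens.append(("text", text))
                    text = ""
            elif ch == "$":
                text += "$"                 # earlier "$" was literal; stay alert
            else:
                state = NORMAL              # false alarm: keep the "$"
                text += "$" + ch
        elif state == SCRIPT_OPEN:
            if ch == "]":
                state = FOUND_POSTFIX       # might be the start of "]$"
            else:
                expr += ch
        else:                               # FOUND_POSTFIX
            if ch == "$":
                state = NORMAL              # "]$" confirmed
                tokens.append(("expr", expr))
                expr = ""
            elif ch == "]":
                expr += "]"                 # earlier "]" was literal; stay alert
            else:
                state = SCRIPT_OPEN         # false alarm: keep the "]"
                expr += "]" + ch
    if state == FOUND_PREFIX:
        text += "$"                         # a trailing "$" is ordinary text
    if text:
        tokens.append(("text", text))
    return tokens


cells = tokenize("<td>$[Request.server[varName]]$</td>")
```

The two intermediate states are what let a lone "$" in the text, or a "]" inside an expression such as a subscript, pass through without ending the current token.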

After parseLine builds a list of the tokens it found in the input line of text, __convertToPython uses that information to output a valid line of Python code. If parseLine returns a list of tokens that contain only normal text, __convertToPython has a simple job to do. It merely wraps the line in a __write__ statement and puts it into the output file. If the line contains only a Python expression, it is simply written to the output file as well. Some more complex processing goes on if the tokens contain Python expressions and normal text. Near the end of __convertToPython, the method iterates through each token, converting Text tokens into Python expressions and integrating the existing Expression objects. Again, the line is wrapped up into a __write__ statement and written to the file.
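The blending of text and expressions into one statement might look roughly like the sketch below. It is written in present-day Python 3 and uses plain ("text", s) and ("expr", s) tuples in place of the Text and Expression classes; line_to_python and escape_quotes are hypothetical names, and the escaping shown is an assumption about what the omitted escapeQuotes helper does.

```python
def escape_quotes(text):
    # Hypothetical helper: escape backslashes and double quotes for a literal.
    return text.replace("\\", "\\\\").replace('"', '\\"')


def line_to_python(tokens, indent=""):
    """Turn ("text", s) / ("expr", s) tuples into one line of Python source."""
    has_text = any(kind == "text" for kind, _ in tokens)
    if not has_text:
        # Script-only line: pass the Python through untouched.
        return indent + "".join(value for _, value in tokens)
    parts = []
    for kind, value in tokens:
        if kind == "text":
            parts.append('"%s"' % escape_quotes(value))
        else:
            # Expressions are wrapped in str() so they concatenate with text.
            parts.append("str(%s)" % value)
    return indent + "__write__( " + " + ".join(parts) + " )"


row = line_to_python(
    [("text", "<td>"), ("expr", "varName"), ("text", "</td>")], indent="\t")
```

Joining the parts with " + " avoids the step of trimming a leading plus sign, which is a design choice of this sketch rather than of cgEngine.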

The use of the __write__ statement bears some explaining. Originally, I used the standard Python print statement to write out the processed text. Using the old method, the first line in Listing Two would have looked like: print "<HTML>".

When JPython executed this code, the output was sent to the standard output device. My applications that used this code generator would have to redirect the standard output device, execute the code, and restore standard output to its previous value. Now, the application that executes the code produced by the code generator provides a __write__ method that is called by the executing code. The method can do anything it wants to with the data; PSP writes the output to the HTTP response output stream. This turned out to be a poor design decision.

Listing Four contains a PSP page that outputs everything the page knows about the current request. To do this, a for loop is used to iterate through the list of request variables and each one is sent to the output stream. Python does not use a statement of any kind to terminate blocks of code, such as loops. Python uses a trailing colon to indicate the start of a block and indentation to control the contents of the block. For instance, a simple for loop might look like:

for x in range(1,10):
    x = x + 1
print x

The x=x+1 line is in the loop, the print statement is not. While outputting statements, the code generator has to make sure that the Python statements are set to the proper indentation level. You can see this in Listing Five when the loop is writing each HTML table row. It's easy enough to look for the colon at the end of a line to know that it is time to increase the indentation level. It's not so easy to know when to decrease the indentation level. My solution was to introduce a statement that could be used to terminate blocks of code. When the level increases, __convertToPython outputs an additional TAB character to indent the code to a new level. When the level decreases through an end block statement, the TAB character is removed.
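The indentation bookkeeping described above can be sketched as follows. This is a present-day Python 3 illustration with a hypothetical indent_script function, not the logic inside __convertToPython: a trailing colon raises the level for the lines that follow, and an "end block" statement lowers it and is emitted as a comment.

```python
def indent_script(lines):
    """Indent generated statements using a trailing colon and "end block"."""
    level = 0
    result = []
    for raw in lines:
        stmt = raw.strip()
        if stmt.lower() == "end block":
            level = max(level - 1, 0)       # close the innermost block
            result.append("\t" * level + "#end block")
            continue
        result.append("\t" * level + stmt)
        if stmt.endswith(":") and not stmt.startswith("#"):
            level += 1                      # following lines belong to the block
    return "\n".join(result) + "\n"


source = indent_script([
    "total = 0",
    "for x in range(1, 4):",
    "total = total + x",
    "end block",
    "__write__(str(total))",
])
```

Executing the generated source with a __write__ callable in its namespace runs the loop body three times and then writes the total once, because the final statement sits outside the block.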

A regular contributor to the JPython mailing list, Jason Orendorff, pointed out that this could be a problem (he also suggested the __write__ method used earlier). The problem is that Python doesn't care how the code gets indented. I typically use all TABs to indent code, but spaces would work just as well. Worse yet, you only need to be consistent within the block of code. You could indent one block with spaces and the next with TABs. My solution forces all PSP pages to use TABs for indentation; not a huge problem, but it is an unnecessary restriction.

PSP Installation

PSP is distributed in a single .ZIP file that contains the Java Servlet and Python-based code generator. Before installing PSP, you need to install JPython and ensure that your web server supports Java servlets. If your server does not support servlets, I recommend the JRun Servlet engine from Live Software (http://www.livesoftware.com/). It provides servlet support for Microsoft Internet Information Server, Apache, and Netscape, and I've tried it successfully on Windows NT, Linux, and Solaris.

Your servlet engine will need to be configured to load the JPython package and point to the directory where the PSP code resides. The servlet engine can be configured to pass any files that have a .PSP extension to the PSP servlet. Finally, JPython must be configured to load the code generator by copying the directory containing the code generator to your JPython/lib directory.

Some day the PSP distribution will contain the ubiquitous InstallShield for Java; until then, it is installation the hard way.

Conclusion

Although PSP was not so much designed as evolved, there are some aspects of its design that trouble me. Nonetheless, I've used it to implement several intranet applications and it has been trouble free so far.

In the next installment of this two-part article, I'll delve into the Java servlet code. This servlet started out as a simple class to read a PSP page, execute the page, and return the results to the user. This design eventually ballooned to 19 classes supporting page caching, application space, Request and Response objects, and many other features.

DDJ

Listing One

<HTML>
<HEAD>
<TITLE>Hello World</TITLE>
</HEAD>
<BODY>
$[include banner.psp]$
<H1>Hello World</H1>
<BR>
$[
Response.write( "Hello, %s, from Python Server Pages." % (Request.server["REMOTE_ADDR"]))
]$
<BR>
If your IP address was in the greeting above, you have installed Python Server Pages correctly.
</BODY>
</HTML>


Listing Two

__write__(  "<HTML>" )
__write__(  "<HEAD>" )
__write__(  "<TITLE>Hello World</TITLE>" )
__write__(  "</HEAD>" )
__write__(  "<BODY>" )
# ... statements included from banner.psp deleted ...
__write__(  "<H1>Hello World</H1>" )
__write__(  "<BR>" )

Response.write( "Hello, %s, from Python Server Pages." % (Request.server["REMOTE_ADDR"]))
__write__(  "<BR>" )

__write__(  "If your IP address was in the greeting above, you have installed Python Server Pages correctly." )
__write__(  "</BODY>" )
__write__(  "</HTML>" )
__write__(  "" )


Listing Three

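import os
import string

# Note: this abbreviated listing omits the indent method and the
# escapeQuotes helper that the code below calls.
class cgEngine: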
    # These constants are used to keep track of the state of the parser.
    stNormal = 0
    stFoundScriptPrefix1 = 1
    stScriptOpen = 2
    stFoundScriptPostfix1 = 3

    def parseLine( self, inBuf ):
        """Parses a text string into a list of tokens. Each token is either 
        a block of text or a python expression. Classes Text and Expression 
        are used to hold the text blocks and Python expressions. """
        tokens = []
        output = Text()
        expression = Expression()
        nPos = 0
                        
        while nPos < len(inBuf):
            char = inBuf[nPos]
            if char == '$':
                if self.state == cgEngine.stNormal:
                    self.state = cgEngine.stFoundScriptPrefix1
                elif self.state == cgEngine.stFoundScriptPostfix1:
                    self.state = cgEngine.stNormal
                    tokens.append( expression )
                    expression = Expression()
                elif self.state == cgEngine.stFoundScriptPrefix1:
                    output.append( "$" )
                elif self.state == cgEngine.stScriptOpen:
                    expression.append( char )
                else:
                    output.append( char )
            elif char == '[':
                if self.state == cgEngine.stFoundScriptPrefix1:
                    self.state = cgEngine.stScriptOpen
                    expression = Expression()
                    if output.empty() == 0:
                        tokens.append(output)
                        output = Text()
                elif self.state == cgEngine.stScriptOpen:
                    expression.append( char )
                else:
                    output.append( char )
            elif char == ']':
                if self.state == cgEngine.stScriptOpen:
                    self.state = cgEngine.stFoundScriptPostfix1
                elif self.state == cgEngine.stFoundScriptPostfix1:
                    expression.append( "]" )
                else:
                    output.append(char)
            else:
                if self.state == cgEngine.stScriptOpen and char != '\n':
                    expression.append( char )
                elif self.state == cgEngine.stFoundScriptPrefix1:
                    self.state = cgEngine.stNormal
                    output.append("$")
                    output.append(char)
                elif self.state == cgEngine.stFoundScriptPostfix1:
                    self.state = cgEngine.stScriptOpen
                    expression.append("]")
                    expression.append(char)
                elif char != '\n':
                    output.append(char)
            nPos = nPos + 1
        # if there is some output left
        if output.empty() == 0:
            tokens.append( output )
        # if there is an expression
        if expression.empty() == 0:
            tokens.append( expression )
        # if there are no tokens, this must be a blank line
        if len(tokens) == 0 and self.state != cgEngine.stScriptOpen:
            tokens.append( Text() )
        return tokens
    # convertToPython
    # Processes the input file, converting all script expressions into
    # python code and other text into print statements
    def __convertToPython( self, inFile, outFile, tabIndent = 0 ):
        inBuf = "spam"
        self.state = cgEngine.stNormal
        while len(inBuf):
            inBuf = inFile.readline()
            tokens = self.parseLine( inBuf )
            # find out if this is only a script line, or contains some output
            printLine = 0
            for token in tokens:
                if token.__class__ == Text:
                    printLine = 1
                    break
            # Write out this line of output
            outBuff = Text()
            if printLine == 0 and len(tokens):
                s = string.strip( str(tokens[0]) )
                if len(s) and s[len(s)-1] == ":" and s[0] != "#":
                    tabIndent = tabIndent + 1
                elif "end block" == string.lower(s):
                    tabIndent = tabIndent - 1
                    if tabIndent < 0: 
                        tabIndent = 0
                    tokens[0] = Expression("#" + self.indent(tabIndent) + s)
                elif string.lower(s)[:8] == "include ":
                    stmt,name = string.split( s, " ", 1 )
                    path,inFileName = os.path.split( inFile.name )
                    if len(path) == 0:
                        newInFile = open( name, "r" )
                    else:
                        newInFile = open( path + os.sep + name, "r" )
                    self.__convertToPython( newInFile, outFile, tabIndent )
                    newInFile.close()
                    tokens[0] = Expression("#" + self.indent(tabIndent) + s)
            for token in tokens:
                if token.__class__ == Text:
                    outBuff.append( ' + "' + escapeQuotes(str(token)) + '"' )
                    printLine = 1
                if token.__class__ == Expression:
                    if printLine:
                        outBuff.append( ' + str(' + str(token) + ')' )
                    else:
                        outBuff.append( str(token) )
            if printLine:
                # Write out the statement, trimming off the leading + sign.
                outFile.write( self.indent( tabIndent ) + "__write__( " 
                                              + str(outBuff)[2:] + " )\n" )
            else:
                outFile.write( str( outBuff ) + "\n" )
class Expression:
    "Contains a single python expression fragment parsed from a template file."
    def __init__(self, expr = ""):
        self.data = expr
    def append(self, s):
        self.data = self.data + s
    def empty(self):
        return len(self.data) == 0
    def __repr__(self):
        return self.data
class Text:
    "Contains a single text fragment parsed from a template file."
    def __init__(self):
        self.data = ""
    def append(self, s):
        self.data = self.data + s
    def empty(self):
        return len(self.data) == 0
    def __repr__(self):
        return self.data


Listing Four

<HTML>
<HEAD>
<TITLE>PSP Snoop</TITLE>
</HEAD>
<BODY>
<H1>PSP Snoop</H1>
<BR>
<PRE>
<H2>Request Headers:</H2>
<table>
$[
for varName in Request.server.keys():
]$
    <tr>
        <td>$[varName]$</td>
        <td>$[Request.server[varName]]$</td>
    </tr>
$[  
end block ]$
</table>
</BODY>
</HTML>


Listing Five

__write__(  "<HTML>" )
__write__(  "<HEAD>" )
__write__(  "<TITLE>PSP Snoop</TITLE>" )
__write__(  "</HEAD>" )
__write__(  "<BODY>" )
__write__(  "<H1>PSP Snoop</H1>" )
__write__(  "<BR>" )
__write__(  "<PRE>" )
__write__(  "<H2>Request Headers:</H2>" )
__write__(  "<table>" )
for varName in Request.server.keys():
    __write__(  "   <tr>" )
    __write__(  "       <td>" + str(varName) + "</td>" )
    __write__(  "       <td>" + str(Request.server[varName]) + "</td>" )
    __write__(  "" )
    __write__(  "   </tr>" )
#end block
__write__(  "</table>" )
__write__(  "</BODY>" )
__write__(  "</HTML>" )
__write__(  "" )
