Dr. Dobb's Journal October 1999
Solving the Back button problem
By Jean-François Touchette
Jean-François is a senior software designer from Montreal, Canada, and can be contacted at email@example.com.
When writing applications for thin clients such as web browsers, you face several challenges that do not exist with fat-client applications written in Visual Basic, Delphi, or Java. For one thing, your development tools do not include facilities for automatically building reliable, nonrepeatable transactions into applications. Consequently, you must devise your own techniques to prevent users from transmitting duplicate transactions.
In this article, I'll show how you can implement reliable, nonrepeatable transactions using a technique that is applicable to any Java Servlet Development Kit-based architecture. Although the examples I present are based on the IBM WebSphere 2.1 Application Server, they do not make use of any IBM WebSphere extensions. In fact, the concepts presented here can also be implemented in Perl CGI and ASP scripts. The sample code has been tested with JDK 1.1.6 and 1.2.
Under HTTP 1.0, a thin-client web browser's TCP connections to a server are nonpersistent -- each request is a new connection. There is no session such as the one that a fat client opens to, say, a database server. (A typical fat-client session might include a login, various requests, and finally a logout perhaps several hours later.) Luckily, the Java Servlet API automatically emulates a session between a browser and server.
The browser sends a request that invokes a servlet. If this is the first request from this browser, the servlet library creates an HttpSession object and sends the browser a cookie that contains the session ID. The servlet does whatever it was supposed to do and replies. Later, when the browser sends another request invoking a servlet, the cookie value is sent back to the server. The servlet library gets this cookie value, finds the matching HttpSession object, and makes it available to the servlet that is servicing the request. A servlet runs in a thread and can service many clients concurrently. The only place where it can safely hold data belonging to a client between two invocations is the HttpSession object associated with this client.
A servlet can terminate an HttpSession named aSession by calling the method aSession.invalidate(). A servlet would typically do that to service a request from a logout hyperlink. A session also ends automatically when a maximum inactivity time-out value is exceeded. This value is usually configurable for the whole server. An object stored in the session can implement the javax.servlet.http.HttpSessionBindingListener interface to be notified when it is unbound from the session, and can then perform any cleanup required when the session disappears.
Identifying the Problem
But what happens if a user sends a transaction, then clicks on the browser's Back (or Stop) button and resends the transaction? While this user action makes sense for something like sending a request to a search engine, it can be disastrous when sending a buy/sell order to the stock market. Even if the application were correctly designed to separate data input and validation pages from subsequent "confirm execution" pages, users can go two pages back and resend an order. Clearly, if you need nonrepeatable transactions, you need a reliable solution to this Back button problem. Likewise, if the system is slow, users often become impatient and press Stop or Esc, and then press Send again. What happens in this case?
For starters, the servlet only sees that the first Send request is aborted when it attempts to reply to the request. At that point, it's too late -- the transaction is already in the database, perhaps on its way to the stock market. However, the servlet has been invoked a second time for the second Send. Ultimately, the servlet is invoked as many times as the Stop and Send sequence of events is repeated.
Remember that the current HttpSession object is where your servlet should store any variables related to a client (browser). So the solution to the repeated transaction problem is to send a unique pseudo-random number in a hidden field of the HTML form and store this number in the HttpSession -- this pseudo-random value can be called a "transaction token." When the form comes back with the next request, the servlet that handles it compares the value stored in the hidden field with the one stored in the HttpSession. If they are equal, everything is fine and the transaction can be executed. If not, the servlet should test a few conditions and send back the proper diagnostic message. You might want to have the code inquire whether the Send is a repeat of the last transaction or a replay of an even older page. (Of course, there is the possibility of a "man in the middle" attack. The security aspect of such an application should be handled by a secure protocol such as SSL.)
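The comparison described above might be sketched as follows. This is only an illustration under stated assumptions: a plain Map stands in for the HttpSession, the class name TokenCheck, the keys txToken and txTokenPrev, and the token values "a", "b", and "z" are all hypothetical, and remembering the previously consumed token is just one possible way to tell a repeat of the last transaction from a replay of an older page.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a Map stands in for the per-client HttpSession. */
class TokenCheck {
    enum Result { OK, REPEAT_OF_LAST, STALE_PAGE }

    static final String CURRENT = "txToken";       // token the server expects next
    static final String PREVIOUS = "txTokenPrev";  // token consumed by the last accepted request

    /**
     * Compares the token from the form's hidden field with the one in the session.
     * On a match, the expected token is replaced by nextToken so that the same
     * form cannot be accepted twice.
     */
    static Result check(Map<String, String> session, String formToken, String nextToken) {
        String expected = session.get(CURRENT);
        if (expected != null && expected.equals(formToken)) {
            session.put(PREVIOUS, expected);
            session.put(CURRENT, nextToken);
            return Result.OK;
        }
        String previous = session.get(PREVIOUS);
        if (previous != null && previous.equals(formToken)) {
            return Result.REPEAT_OF_LAST;   // the form that was just accepted, sent again
        }
        return Result.STALE_PAGE;           // a form from some older page, or no token at all
    }

    public static void main(String[] args) {
        Map<String, String> session = new HashMap<String, String>();
        session.put(CURRENT, "a");
        System.out.println(check(session, "a", "b")); // first Send
        System.out.println(check(session, "a", "b")); // same form sent again
        System.out.println(check(session, "z", "b")); // form from an older page
    }
}
```

Run as is, the three calls print OK, REPEAT_OF_LAST, and STALE_PAGE. In a servlet, the two failure results would select the diagnostic page to send back instead of executing the transaction.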
The value of the pseudo-random unique number must be tied to this specific client session and must not repeat itself. The former can easily be done by using the sessionId associated with the HttpSession. The latter can be done by feeding the current time, at millisecond resolution, into a hash function.
The MessageDigest class (java.security.MessageDigest) in the Java 1.1 library can generate such a pseudo-random number. This class can generate MD-5 or SHA-1 message digests. The MD-5 message digest is the output of a hash function, which returns a 128-bit value. The SHA-1 message digest is 160 bits long. Both are reputed to be cryptographically strong. Either way, the value is returned as a byte array, which will need to be converted to a String to be stored in an HTML field. Example 1 takes a string and the current time and generates an SHA-1 message digest.
As Table 1 indicates, the MessageDigest class generates an SHA-1 digest faster than an MD-5 digest. Surprisingly, it seems there is no penalty for getting a longer digest (32 bits more, to be exact). Consequently, I'll use SHA-1 from this point on.
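A minimal sketch of the idea, assuming hex encoding as the way to turn the byte array into a String. The names TxTokenSketch, newToken, and toHex are hypothetical, and the ':' separator between the two inputs is my own choice.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of a transaction-token generator based on SHA-1. */
class TxTokenSketch {

    /** Hashes the session ID and a timestamp with SHA-1; returns 40 hex characters. */
    static String newToken(String sessionId, long timeMillis) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            md.update(sessionId.getBytes(StandardCharsets.UTF_8));
            md.update((byte) ':'); // separator so "ab"+"12" and "ab1"+"2" hash differently
            md.update(Long.toString(timeMillis).getBytes(StandardCharsets.UTF_8));
            return toHex(md.digest());
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a standard algorithm name, so this is not expected to happen.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    /** Converts each byte to two lowercase hex digits so the value fits in an HTML field. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(newToken("session-123", System.currentTimeMillis()));
    }
}
```

Because System.currentTimeMillis() can return the same value for two calls made in quick succession, a per-session counter could also be mixed into the digest. That would be an addition to the technique as described here, not part of it.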
But There's Another Problem
Another problem that can arise is also related to a feature of the web browser -- the "open new browser window" item in the File menu. Before designing a thin-client application, you have to ask yourself if a new browser window gets its own cookie and session. The answer is no, because it is the same cookie and, therefore, the same HttpSession (with the same session ID).
Each window can have a different token within the HTML form it contains, but there is only one token value held in the HttpSession object associated with the cookie. A request coming from the first window will succeed, but one from the second will fail as the value of the token held in the HttpSession object has changed. The previous recipe for the transaction token generation with the MessageDigest class cannot work as is.
To improve the technique, assume that each user starts from a known page and that in each different path (or flow) from that start point, every page and servlet will carry a flow ID that will be specific to this flow. If you follow this convention in the design of your thin-client application, it is easy to adapt the transaction token technique to a real-world environment with multiple browser windows. This is done by adding the flow ID (a string) to the input of the MessageDigest and storing it in another hidden field in the HTML form. On the HttpSession side, you need a Dictionary where a key is a flow ID and the associated value is a transaction token.
Say this Dictionary is named "TxTokenDictionary" and that the servlet stores it in the HttpSession object. When a servlet is invoked, it retrieves the flow ID and transaction token from the form variables. Then, it finds the value matching this flow ID in the TxTokenDictionary. If this value is equal to the transaction token from the form, everything is fine. If it cannot be found, there is a problem. If it is different, there is another problem. The program TxTokenDemo.java and its sample output (both available electronically; see "Resource Center," page 5) demonstrate this strategy. Each browser window has its own flow and follows its own path in the application(s). Of course, if the user starts the same flow twice, there's a problem. In some cases, you would want to detect that early on and stop any progress in the second occurrence of that flow. If the functional design requires it, your servlets could handle that by generating a different flow ID each time this application path is entered.
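The lookup described above could be sketched like this. It is an illustration only and is not taken from TxTokenDemo.java. The class name FlowTokenCheck, the result names, and the second flow ID quoteFlow1 with its token Hy1 are hypothetical, and a java.util.Hashtable (a Dictionary subclass) plays the role of the TxTokenDictionary.

```java
import java.util.Hashtable;

/** Hypothetical sketch: one expected token per flow ID, kept in a Dictionary. */
class FlowTokenCheck {
    enum Result { OK, UNKNOWN_FLOW, TOKEN_MISMATCH }

    /** Finds the token stored under flowId and compares it with the form's token. */
    static Result check(Hashtable<String, String> txTokenDictionary,
                        String flowId, String formToken) {
        if (flowId == null) {
            return Result.UNKNOWN_FLOW;      // Hashtable.get(null) would throw
        }
        String expected = txTokenDictionary.get(flowId);
        if (expected == null) {
            return Result.UNKNOWN_FLOW;      // "it cannot be found"
        }
        return expected.equals(formToken)
                ? Result.OK
                : Result.TOKEN_MISMATCH;     // "it is different"
    }

    public static void main(String[] args) {
        Hashtable<String, String> dict = new Hashtable<String, String>();
        dict.put("orderFlow1", "Hx1");   // flow running in the first browser window
        dict.put("quoteFlow1", "Hy1");   // hypothetical flow in a second window

        System.out.println(check(dict, "orderFlow1", "Hx1"));
        System.out.println(check(dict, "quoteFlow1", "Hy1"));
        System.out.println(check(dict, "quoteFlow1", "Hx1"));
        System.out.println(check(dict, "otherFlow", "Hx1"));
    }
}
```

The four calls print OK, OK, TOKEN_MISMATCH, and UNKNOWN_FLOW. Because each window's flow has its own entry, a request from one window no longer invalidates the token held by the other.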
Two Real-World Scenarios
Coming back to the basic problem of a nonrepeatable transaction: How do the transaction tokens change, both in the form and on the HttpSession side, in a real-world sequence of events?
Figure 1 shows a menu page with a link "enter an order," which invokes the servlet startFlow. In Figure 2, the same sequence of events is presented in a timeline diagram. In this diagram, the flow of time goes from the top to the bottom, and the exchanges of requests and replies between the three tiers (browser, app server, and data server) are shown as arrows that are always slanted down.
Suppose that there is already a TxTokenDictionary in the HttpSession (say, another servlet created it when the user completed a log-in dialog). The startFlow servlet generates a TxToken, which has the value Hx1. Then the servlet stores it in the TxTokenDictionary under the key orderFlow1. The servlet continues and prepares an HTML page. It puts two hidden fields in the HTML form: flowId with the value orderFlow1, and token with the value Hx1.
The user fills in the form and presses the Submit button. The URL of the button invokes a second servlet, validateOrder, which retrieves the hidden fields flowId and token from the form and checks if they match with what is in the TxTokenDictionary. If they match, the servlet continues its work and validates the input fields. If they are good, the servlet prepares for the next step -- generating another transaction token value, Hx2, updating orderFlow1 in the TxTokenDictionary, and preparing an HTML form that will ask the user to confirm the order. The hidden field token now gets the value Hx2.
The user gets the "Confirm order..." page in the browser, verifies the data, and clicks Execute. The third servlet, executeOrder, retrieves the field values from the form and validates the token. Its value, Hx2, matches the one stored in the HttpSession, so the servlet proceeds and inserts the order in the database. The insert succeeds, the servlet prepares a third transaction token value, Hx3, embeds it in a page along with the message "your order has been processed successfully," and sends all this HTML back to the browser.
Now, the user reads the confirmation message and clicks OK. The fourth servlet receives the acknowledgment of the confirmation message and checks the token. If it matches the value kept in memory, all is fine and the servlet executes a callPage and lets a JSP display the base menu again.
Figure 3 shows a second timeline diagram that's also related to Figure 1. In this scenario, however, things get messy. The timeline begins at the step where the executeOrder servlet is invoked to handle a request that contains the token value Hx2. At this point, the servlet processes the request normally and executes the sp_insert_order database procedure with success. During that time (perhaps the database is very busy and the response time is unusually long), the user has pressed the Esc key or clicked on the browser's Stop button. By doing this, the user has caused the browser to close its TCP connection to the app server. The servlet is unaware of this, until it tries to reply to the HTTP request that it is processing. At this point, the write on the TCP socket fails and the "insert order OK" message (the page with "your order has been processed successfully") cannot be sent. The only thing that the servlet can do at this point is to remember that this message will need to be sent to this user later. (The new token and the message can be stored in the HttpSession temporarily. And, if the HttpSession is destroyed before the message has been delivered, then the message should be tagged to the user profile in the database. The user should be notified the next time he logs on.)
Even if the user clicks the Execute button shortly after aborting the previous request, the second request to executeOrder for this HttpSession is only serviced by the servlet after the first one has completed. This is because a servlet maintains a lock on the HttpSession while it processes a request.
Consequently, the servlet receives the same execute order request, with the token Hx2 value once again. Because it expects Hx3, the servlet does not duplicate the transaction. Instead, the servlet notifies the user with a diagnostic page saying something like "you stopped the previous request, but the following transaction was completed: insert order OK." When the executeOrder servlet sends this page, it puts the token value Hx3 in it. Therefore, the flow of interaction will be able to resume from this point when the user clicks OK.
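The replay handling in this second scenario might look roughly like the sketch below. Everything in it is an assumption made for illustration: in-memory maps stand in for the HttpSession, a list stands in for the database and the sp_insert_order procedure, tokens are the placeholder strings Hx2 and Hx3 rather than real digests, and the synchronized method only imitates the one-request-at-a-time behavior described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of executeOrder's replay handling; not the article's code. */
class ExecuteOrderSketch {
    /** What would be sent back: a message plus the token for the page's hidden field. */
    static class Reply {
        final String message;
        final String token;
        Reply(String message, String token) { this.message = message; this.token = token; }
    }

    // Stand-ins for state kept in the HttpSession, all keyed by flow ID.
    private final Map<String, String> txTokenDictionary = new HashMap<String, String>();
    private final Map<String, String> consumedToken = new HashMap<String, String>();
    private final Map<String, String> lastOutcome = new HashMap<String, String>();

    final List<String> database = new ArrayList<String>(); // stand-in for the order table
    private int serial = 2; // placeholder so that the next token reads "Hx3"

    /** Records the token that the next request of this flow must carry. */
    synchronized void expect(String flowId, String token) {
        txTokenDictionary.put(flowId, token);
    }

    synchronized Reply executeOrder(String flowId, String formToken, String order) {
        String expected = txTokenDictionary.get(flowId);
        if (expected != null && expected.equals(formToken)) {
            database.add(order);                    // stands in for sp_insert_order
            String next = "Hx" + (++serial);        // a real token would be a digest
            consumedToken.put(flowId, formToken);
            lastOutcome.put(flowId, "insert order OK");
            txTokenDictionary.put(flowId, next);
            return new Reply("your order has been processed successfully", next);
        }
        if (formToken != null && formToken.equals(consumedToken.get(flowId))) {
            // Replay of the request that already ran: report the outcome, do not re-execute.
            return new Reply("you stopped the previous request, but the following "
                    + "transaction was completed: " + lastOutcome.get(flowId), expected);
        }
        // Unknown flow or a token from an older page; expected may be null here.
        return new Reply("this page is no longer valid", expected);
    }

    public static void main(String[] args) {
        ExecuteOrderSketch s = new ExecuteOrderSketch();
        s.expect("orderFlow1", "Hx2");
        s.executeOrder("orderFlow1", "Hx2", "BUY 100 XYZ");   // reply lost: user pressed Stop
        Reply second = s.executeOrder("orderFlow1", "Hx2", "BUY 100 XYZ");
        System.out.println(s.database.size() + " order(s); resume token " + second.token);
    }
}
```

The demo prints "1 order(s); resume token Hx3": the second Send is answered with the diagnostic message and the current token, and the order is inserted only once.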
Developers at the company where I work needed to build many sizable and robust building blocks by themselves after exploring IBM WebSphere, which is the app server platform that we chose. The developers needed to encapsulate the Servlet and JSP objects to build more robust components. For the database connectivity with JDBC, we also felt that the standard building blocks that are offered are too low level. In fact, even though JDBC and Sybase JConnect are at a higher level than the Sybase Open Client Library (a C-level API), they have a small granularity. Our homegrown CT10API and CT10DLL.DLL components (http://www3.sympatico.ca/blonchjj) shield our UNIX C programs and 32-bit Visual Basic apps from the complexity of CT-Lib -- while giving us robust and fast access to the native Sybase connectivity layer.
Because of our needs and our previous experience with general transaction processing components, we have built many Java components that we intend to share with a community of users. We will make that source code available under the GNU LGPL.