Tim is a senior systems analyst and project manager at Texas A&M University. He can be reached at [email protected].
The Internet has evolved from simple static web sites to web-based computing systems that support thousands of users. This evolutionary process experienced tremendous growth with the introduction of Microsoft's Active Server Pages (ASP), an easy-to-use and robust scripting platform. ASP makes it easy to produce dynamic, data-driven web pages.
The next big step came with ActiveX Data Objects (ADO) and the Component Object Model (COM). These tools let you access multiple data sources easily and efficiently and, best of all, in a way that is easy to maintain. Together, ASP and ADO provide the basic infrastructure for separating data from business and presentation logic, following the now infamous "n-Tier architecture."
With the introduction of XML and XSL, web sites are taking another leap forward. In this article, I'll compare these evolutionary leaps with an eye toward web site performance by building two versions of the same web site one using ASP and ADO to generate the user interface (UI), and the other using Microsoft's MSXML parser to transform XML/XSL documents. Benchmark results compare the throughput (transactions per second) of both models. Like most web-related issues, both methods have tradeoffs. In some scenarios, ASP/ADO performs better than MSXML. In other situations, however, MSXML provides a real performance advantage.
A Quick Review of n-Tier Architecture
Client applications typically manage data. To do that effectively, business rules are established to govern the way information is created, managed, and accessed. The presentation layer allows access to data in conjunction with these rules. Program code grouped into these three distinct building blocks data, rules, presentation follows a logical n-Tier architecture. The reason for this is that it allows the reuse of code objects throughout the software application, thereby reducing the amount of code written.
One common mistake results from confusing a "logical" n-Tier architecture with a "physical" n-Tier architecture. Physical n-Tier implementations usually involve web-based software where most code runs on the server, and little (if any) runs on the client. It is important to understand that you can write an executable Visual Basic application that separates these layers (thereby taking advantage of code reuse) even if the software runs entirely on the client. The downside is that the software is more difficult to maintain. When changes need to be made, they must be distributed across all clients. Web-based computer systems follow a physical n-Tier implementation. When changes are necessary, they occur on the server as opposed to the client, thus reducing maintenance costs.
While ASP and ADO make it easy to isolate the data layer into a separate COM object, the separation between business and presentation logic is less intuitive. In fact, ASP does not provide an efficient way to force this separation. Consequently, most developers usually follow a similar process. A disconnected record set is created using ADO and then an ASP page loops through this record set to render some HTML (see Listing One). This is referred to as the "ASP/ADO model." While simple to create, such pages provide little or no separation between the business and presentation layers.
While the ASP/ADO model works, it is not easy to extend to different types of clients, browsers, or platforms. XML and XSL compensate for this shortcoming. MSXML is a tool for merging XML data with XSL style sheets. With MSXML, true separation between data, business, and presentation logic is achievable; see Listing Two. This programming model results in much cleaner code by:
- Creating two instances of the MSXML parser.
- Loading one with XML data and the other with an XSL style sheet.
- Transforming the first object with the second.
- Writing out the results using the response.write statement.
ASP continues to play a role under this new model. However, instead of being the primary method for generating the UI, ASP becomes the glue that binds data and presentation in a reusable way. This type of coding is referred to as the "XML/XSL model." The chief advantage of this approach is the ability to easily generate a different UI for different devices: web browsers, cell phones, handheld organizers, and the like. When data needs to be presented differently, all that has to be produced is a new XSL style sheet. Figure 1 illustrates the differences between the ASP/ADO and XML/XSL models.
Both the ASP/ADO and XML/XSL examples I have provided rely on a separate COM object (written in Visual Basic) to access data. The ASP/ADO data object returns data in a disconnected ADO recordset. The XML/XSL model uses the ADO Stream object to persist data in XML that is then returned as a string. Listings Three and Four show sample code using both approaches.
Listing Three creates an ADO Recordset object based on a SQL statement that is passed as an input parameter. Once the record set object is opened, it is disconnected from the database connection and then returned. Listing Four uses the ADO stream object to persist the record set to XML format. Now, all that remains is to write the stream's contents to the return object. Both these COM functions can be called in your ASP code (see Listings One and Two).
Of course, there are other possible approaches. While the ASP/ADO and XML/XSL models could be implemented using pure ASP, this solution would be more difficult to maintain, options for code reuse would be limited, and web site performance would be reduced.
Performance Testing Primer
Performance testing is one of the most crucial but often overlooked aspects of deploying a web-based application. Performance is typically measured in throughput (the number of requests that can be processed in one second). There are a couple of options for testing a web site's performance. One option is to coordinate with hundreds of users who could browse the web site at a designated point in time. An easier option is to use a performance-testing tool that simulates this load using a much smaller number of client machines. For these purposes, I use Microsoft's (free) Web Application Stress (WAS) tool (http://webtool.rte.microsoft.com/).
To use WAS, install it on several workstations, one of which serves as your test controller. The controller's task is to coordinate a performance test with the other clients. After installing the software, create a script that walks through a web site, just like average users. When this script is completed, set the performance test parameters the stress level, length of the test, and types of performance counters that should be tracked. When beginning the test, your WAS controller machine coordinates the test with the other clients. It then collects the necessary performance data to prepare a detailed report.
Throughput is one of a handful of important indicators of a web site's performance. A report generated by WAS provides the total throughput and also the average Time Till Last Byte (TTLB) for each web page tested. This is the average time required to load a page. When designated, the report will also contain counters from the Windows Performance Monitor. The ASP performance object exposes several important counters:
- ASP requests per second measures throughput, but does not include static content such as images or static HTML pages. It fluctuates depending on the complexity of the ASP code.
- ASP request wait time is the amount of time (in milliseconds) the most recent request waited in the processor queue prior to being processed.
- ASP request execution time is the number of milliseconds the request actually took to execute.
Together, these last two counters measure the time necessary for a server to respond to an ASP page request. When subtracted from the TTLB, the remainder is a good estimate of the network time required for the response to travel from the server.
The benchmarks provided in this article are not official. They are designed to illustrate the relative performance of both coding models. The sample code illustrates the most common approach to implementing these technologies. There are other ways.
A Sample Web Site
Clearly, the XML/XSL model provides increased flexibility. It has all the advantages of n-Tier architectures: reusability, maintainability, and extensibility. Now, for the really important questions, "How does it perform? Is it faster than the ASP/ADO model? If it doesn't perform quite as well, what are the tradeoffs?" To answer these questions, I have built two simple web sites: One uses the ASP/ADO model, the other uses the XML/XSL model. The web site is a simple drill-down collection of pages that contain information on courses offered at Texas A&M University (see Figure 2; http://courses.tamu.com/).
The first page lists the semesters (Fall, Spring, Summer) for which information is available. Students can click a link, then see a list of departments offering courses for the selected semester. Then, a list of courses offered by a selected department is viewed. From there, students select a course, then a list of available sections, instructors, times, and locations are displayed. The sample code contains all of the ASP, XSL, and VB code. Although the production data resides in a Microsoft SQL Server 2000 server, I have included a Microsoft Access database that contains similar data. (Sample code is available electronically from DDJ, see "Resource Center," page 5, and at http://tim-chester.tamu.edu/.)
I have installed the two web sites on a Windows 2000 Advanced Server (SP1), a Compaq Proliant DL580 with 4 Xeon 800 processors and 4-GB RAM. Everything necessary for the web sites, including Microsoft SQL Server 2000, runs directly on this server. The performance tests were conducted using four Gateway workstations running Windows 2000 Professional and the Microsoft WAS Tool. The WAS tool was used to generate 200 threads of load, which simulates about 2000 active users in this case.
The first results may be surprising. The ASP/ADO model performed better, recording a maximum throughput of 166.99 requests per second, including images. The ASP pages-per-second counter reported a throughput of 39.40. This compares to 139.84 requests per second and 33.39 ASP pages per second for the XML/XSL model. On average, the ADO/ASP pages required 6.3 seconds to load, compared to 8.1 seconds for the XML/XSL pages. To account for this difference, two things are apparent. First, it takes more processor time to transform XML/XSL as opposed to looping through an ADO recordset. Second, the XSL is loaded from disk for each request, thus reducing performance. The other performance statistics also bear this out. The XML/XSL approach consumed 98.61 percent of available processor time, compared to 90.01 percent for the ASP/ADO solution. Also, the wait and execution times were also longer for the XML/XSL solution, 4803.41 and 2156.95 versus 4155.48 and 1890.67 milliseconds. Figure 3 summarizes the results of this test.
The ASP/ADO model performs better than the XML/XSL model under this scenario. While the difference is slight, it is real. However, this is only one facet of the comparison. The code used in this test retrieves information from our database each time a page is requested. If our business rules dictate that our data be fresh for each and every request, the code in Listings One and Two is the best option. Sometimes data doesn't have to be current up to the last second. For example, class schedule course information does not change often. So, it is not necessary to retrieve it each and every time from the database. In this situation, MSXML can be used to cache the presentation of data in the ASP Application object. This is something the ASP/ADO model cannot easily do, and it can provide a tremendous performance advantage.
Caching Presentation of Data
The great thing about XML, XSL, and HTML is that they are ordinary string objects. The ASP Application object was designed for storing this type of information, thus it can be used to cache XML, XSL, or HTML presentation of data. This will only work when the data does not have to be current for each page request. This will also depend on the business rules underlying the web site. There is no technical reason why data cannot be cached for days, hours, or even seconds.
Using this method, a database- and processor-intensive web site can be transformed into a memory-intensive web site, provided that the web server has sufficient RAM. How much? Again, it depends on how much data needs to be cached. The entire sample web site used here consists of approximately 15,000 courses and sections. When cached, this requires approximately 120 MB of RAM.
Listing Five suggests one way an HTML presentation of data can be cached using the ASP Application object. In this code, two variables are stored in the application object. The rendered HTML is stored in one variable using the MSXML transformNode method. Then, another application variable stores a timestamp that denotes when the HTML was cached. When requested, the ASP code determines whether the HTML exists in the cache and whether it is current. If the answer to either question is no, the data is retrieved from the database server, transformed using MSXML, and cached in the Application object. The current date and time is also cached. When this occurs, the page executes more slowly simply because the cache is refreshed. However, subsequent requests will use the cached HTML, thereby executing much faster. In the example code, the HTML presentation is cached every 30 minutes. This could just as easily be 30 hours or 30 seconds.
How does this solution perform? Truly astounding is perhaps the only way to accurately describe the results. The web server was able to respond to 1398.84 requests per second, including images and static content. This included 332.04 ASP pages per second, roughly 10 times the throughput of the ASP/ADO model. On average, the XML/XSL cached pages required less than a second to load (.548 seconds) compared to 6.3 seconds for the ASP/ADO pages. Figure 4 compares these results.
Web site performance is not a black and white subject, but actually very, very, gray. One basic premise is often overlooked: The ways in which a web site is coded have as much (or more) to do with performance than the power of the underlying web server. ADO and MSXML are tools that can be used to create high-performance web sites. MSXML provides increased flexibility to developers, but at a cost. When drawing data directly from a database, MSXML performs slower than ADO. However, MSXML provides an easy way to cache the presentation of data, thereby providing up to a 10-fold increase in web site performance. This is a viable solution for web sites that need to support thousands of concurrent users.
<% Language=VBScript %> <!-- ADO/ASP Model Example Code --> <% '////Let's create some objects Dim objRecordset, objData '///create an adodb recordset object set objRecordset = server.createobject("adodb.recordset") '///create an instance of my custom data access object set objData = server.createobject("mydataobject.selectdata") '///go get some data strSQL = "Select Title, Year, Term from Courses" set objRecordset = objData.SelectSQL(strSQL) '///now render the top of my HTML table %> <table> <tr> <td>Title</td> <td>Year</td> <td>Term</td> </tr> <% '///now loop through my recordset to generate my table rows Do while objRecordset.eof <> True %> <tr> <td><%=objRecordset("Title")%></td> <td><%=objRecordset("Year")%></td> <td><%=objRecordset("term")%></td> </tr> <% objRecordset.MoveNext Loop '///now finish the bottom of my HTML table %> </table> <% '///now destroy my objects set objRecordset = Nothing set objData = Nothing %>
<% Language=VBScript %> <!-- XML/XSL Model Example Code --> <% '////Let's create some objects Dim objData, objXML1, objXML2 '///create an instance of my custom data access object set objData = server.createobject("mydataobject.selectdata") '///create two instances of the MSXML DomDocument Object set objXML1 = server.createobject("msxml.domdocument") set objXML2 = server.createobject("msxml.domdocument") '///run my parsers in blocking mode ObjXML1.async = FALSE ObjXML2.async = FALSE '///set msxml1 to my xml data returned from my data object '///go get some data, set parser1 to xml data strSQL = "Select Title, Year, Term from Courses" objXML1.LoadXML(objData.SelectSQL(strSQL)) objXML2.Load(server.mappath("Courses.xsl")) '///now transform the xml and xsl and write it to the browser response.write(objXML1.TransformNode(objXML2)) %>
Public Function SelectSQLStatement(byval strSQLStatement as String) As adodb.Recordset '////public method that executes a sql statement passed as a parameter '//// and returns the results as an ado recordset object create my objects dim objConnection as adodb.Connection dim objRecordset as adodb.recordset '///initialize my objects Set objConnection = New adodb.Connection Set objRecordset = New adodb.Recordset '///use client side cursor objConnection.CursorLocation = adUseClient '///open connection on connection string objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & App.Path & "\courses.mdb" '////open my recordset on my sql statement objRecordset.Open strSqlStatement, objConnection, adOpenForwardOnly, adLockReadOnly, adCmdText '///disconnect my recordset from my connection set objRecordset.ActiveConnection = Nothing '///set recordset to my return object Set SelectSQLStatement = objRecordset '///close my database connection objConnection.Close '///destroy my objects Set objConnection = Nothing End Function
Public Function SelectSQLStatement(byval strSQLStatement as String) As Variant '////public method that executes a sql statement passed as a parameter and '///returns the results as an string create my objects dim objConnection as adodb.Connection dim objRecordset as adodb.recordset dim objStream as adodb.stream '///initialize my objects Set objConnection = New adodb.Connection Set objRecordset = New adodb.Recordset Set objStream = new adodb.stream '///use client side cursor objConnection.CursorLocation = adUseClient '///open connection on connection string objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & App.Path & "\courses.mdb" '////open my recordset on my sql statement objRecordset.Open strSqlStatement, objConnection, adOpenForwardOnly, adLockReadOnly, adCmdText '//persist recordset to xml in stream object objRecordset.Save objStream, adPersistXML '///start stream at first position objStream.Position = 0 '///return xsl as function output SelectSQLStatement = objStream.ReadText '///set recordset to my return object '///close my database objects objStream.Close objRecordset.Close objConnection.Close '///destroy my objects Set objConnection = Nothing Set objRecordset = Nothing Set objStream = Nothing End Function
<% '///check to see if I have presentation in my cache '///and how old it is. if it doesn't exist, or the data is '///older than 30 minutes I refresh the application object If isempty(application("AvailableTerms")) or (datediff("n", application("AvailableTermsTimeStamp"), now()) > 30) Then '///my cache is old or doesn't exist, therefore I just load it '////create two msxml parsers as before create one data object set objData = server.createobject("xmlCourses.Select") set objXML1 = server.CreateObject("MSXML2.DOMDOCUMENT") set objXML2 = server.CreateObject("MSXML2.DOMDOCUMENT") '///run my parsers in blocking mode ObjXML1.async = FALSE ObjXML2.async = FALSE '///load xml data from my data object objxml1.loadXML(objdata.SelectTermsAvailable()) '////load xsl from disk objxml2.load(server.MapPath("terms.xsl")) '//now refresh my application object '//and set a timestamp variable so I know how old my cache is Application.Lock Application("AvailableTerms") = objxml1.transformNode(objxml2) Application("AvailableTermsTimeStamp") = now() Application.UnLock '///now destroy my objects set objXML1 = Nothing set objXML2 = Nothing set objData = Nothing End If '///now write out my results from the cache if the cache existed and '///wasn't old this would be the only thing to occur on this page Response.Write(application("availableterms")) %>