Infrastructure Product Review | A Roundup of Web Traffic Analysis Tools (Web Techniques, July 2001)

Know Your Visitors

By Ernest Black

There are several things you need to know about your site: who's visiting it, where they're coming from, how many of your pages they visit, how long they stay, and which browsers they use. You also need to know which search engines are driving traffic to your site and which keywords or key phrases visitors are searching on, so that you can optimize your site's search results. All Web servers record this data in server logs, which you can analyze by using Web traffic analysis programs.
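To make the raw material concrete, here is a minimal Python sketch of pulling those fields out of a single log entry. It assumes the NCSA combined log format (one common layout, not the only one a server may use); the sample line, the `search.example.com` referrer, and the list of query parameter names are all made up for illustration.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed layout (NCSA combined log format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one combined-format line, or None if it doesn't match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

def search_phrase(referrer, params=("q", "p", "query")):
    """Return the search phrase from a referrer URL if one of the assumed parameters is present."""
    query = parse_qs(urlsplit(referrer).query)
    for name in params:
        if name in query:
            return query[name][0]
    return None

# Made-up sample entry, for illustration only.
sample = ('192.0.2.10 - - [01/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326 '
          '"http://search.example.com/search?q=web+traffic+analysis" '
          '"Mozilla/4.08 [en] (Win98; I)"')

hit = parse_line(sample)
phrase = search_phrase(hit["referrer"])
print(hit["host"], hit["path"], hit["agent"], phrase)
```

Each answer in the list above maps to a field: the host is the visitor, the referrer says where the visitor came from (and, for search engines, what was searched for), and the user-agent string identifies the browser.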

Web traffic analysis applications broadly fall into three market sectors: small business (small sites); enterprise (medium to large, complex sites); and service provider (large data centers or ISP hosts). Within these sectors are further distinctions based on the performance and features a business requires.

If you're a small business with your site hosted by an ISP, you can either use a budget-conscious Web traffic analysis program or the Web traffic software your host provides. In the first case, you may download the server logs to your local machine for processing; in the second case, the host does the processing and reporting for you. Either choice is economical; the service provided by your ISP is most likely free as part of your package (though you may have little control over the settings), but the small, inexpensive business programs ($495 for Funnel Web Analyzer, $699 for WebTrends Log Analyzer) offer greater flexibility.

Enterprise Applications

These stand-alone desktop applications are highly customizable, easy to use, and produce a wide range of reports rich in graphical detail and data. Two outstanding products in this range are WebTrends Enterprise Suite 5.5 ($2499) and Quest Software Funnel Web Analyzer Enterprise 4.02 ($1495).

The WebTrends Enterprise solution offers Web traffic, link, proxy, and streaming media analysis, along with alerting and monitoring. Version 5.5 enhancements include faster log analysis, improved profile management, and new report elements (over 300 possible tables and graphs, more than any other product). It offers a very complete range of features to support a complex corporate Web presence.

WebTrends features easily managed multiple profiles, highly customizable dynamic content analysis, sophisticated session analysis, and a powerful scheduler. If you need to analyze forms, scripts, shopping carts, URL parameters, and customer sessions from start to finish, Enterprise Suite has the necessary tools. In addition, WebTrends' proxy analysis monitors the activity between your network and the Internet, helping to manage your network's bandwidth. The alerting and monitoring function constantly monitors your Web devices, reports any problems, and offers recovery options to decrease downtime.

Funnel Web Analyzer Enterprise provides a broad suite of features including Web traffic analysis, streaming media, diagnostics, and advertising analysis, and surpasses WebTrends in log processing speed. It is also much less costly than WebTrends. While Analyzer Enterprise lacks the depth of session and dynamic content analysis provided by WebTrends, it generates more than 50 comprehensive reports and graphs, offering the best graphics among the available solutions.

Analyzer Enterprise supports reporting on Web traffic for up to 1000 sites, a bonus for small hosts who want to offer their clients detailed reports. Quest Software has released a new product, Web Profiler, which works in tandem with Analyzer Enterprise to provide an overview of your site structure and the relationship between its content and visitor navigation ($595).

Customization and Report Quality

Both products are easy to set up. Funnel Web is even simpler than WebTrends, as all customization is achieved through an easily navigable interface with minimal mouse clicks and keystrokes. However, WebTrends' depth of customization is worth the extra effort. Consider setting filters to exclude specific files from the reports, thus eliminating inflated hit counts for include files, style sheets, or JavaScript files used across the site. Funnel Web's interface for building exclude/include filters is quicker and more intuitive than WebTrends', as all the filters can be added on the same screen. WebTrends requires you to walk through a short series of screens in "wizard" fashion for each filter (see Figure 1 and Figure 2).
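The effect of an exclude filter is easy to see in a few lines of Python. This is only a sketch of the idea, not how either product implements its filters; the patterns, the `/includes/` directory, and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical exclude patterns: style sheets, scripts, and server-side include files.
EXCLUDE = ["*.css", "*.js", "/includes/*"]

def is_excluded(path, patterns=EXCLUDE):
    """True if the requested path matches any exclude pattern (query string ignored)."""
    path = path.split("?", 1)[0].lower()
    return any(fnmatchcase(path, pattern) for pattern in patterns)

def count_pages(paths, patterns=EXCLUDE):
    """Count hits per path, skipping anything an exclude pattern matches."""
    counts = {}
    for path in paths:
        if not is_excluded(path, patterns):
            counts[path] = counts.get(path, 0) + 1
    return counts

# Made-up request paths: supporting files are requested alongside every page view.
hits = ["/index.html", "/style.css", "/menu.js", "/includes/header.inc",
        "/index.html", "/style.css", "/about.html"]

page_counts = count_pages(hits)
print(page_counts)
```

Without the filter, the style sheet would tie with the home page as the most requested file; with it, only the pages themselves are counted.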

But WebTrends surpasses Funnel Web for session-based (as opposed to hit-based) filters, which are critical to dynamic session analysis. While these are more complex, they aren't difficult to set up.
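To illustrate the difference, the sketch below groups individual hits into sessions and then filters whole sessions rather than single hits. It is written in Python under stated assumptions: visitors are identified by host address alone, a session ends after 30 minutes of inactivity (a common convention, not a setting taken from either product), and the sample hits and the `/cart` path are invented.

```python
from datetime import datetime, timedelta

def build_sessions(hits, timeout=timedelta(minutes=30)):
    """Group (visitor, timestamp, path) hits into per-visitor sessions.

    A new session starts when a visitor has been idle longer than `timeout`.
    The 30-minute default is an assumed convention, not a product setting.
    """
    sessions = {}  # visitor -> list of sessions, each a list of (timestamp, path)
    for visitor, when, path in sorted(hits, key=lambda h: (h[0], h[1])):
        visitor_sessions = sessions.setdefault(visitor, [])
        if visitor_sessions and when - visitor_sessions[-1][-1][0] <= timeout:
            visitor_sessions[-1].append((when, path))   # continue the current session
        else:
            visitor_sessions.append([(when, path)])     # start a new session
    return sessions

def sessions_reaching(sessions, path):
    """Session-based filter: keep only the sessions that include a hit on `path`."""
    return [session
            for per_visitor in sessions.values()
            for session in per_visitor
            if any(p == path for _, p in session)]

# Made-up hits for illustration.
hits = [
    ("192.0.2.10", datetime(2001, 3, 1, 9, 0), "/index.html"),
    ("192.0.2.10", datetime(2001, 3, 1, 9, 5), "/cart"),
    ("192.0.2.10", datetime(2001, 3, 1, 14, 0), "/index.html"),
    ("192.0.2.20", datetime(2001, 3, 1, 9, 1), "/index.html"),
]

sessions = build_sessions(hits)
cart_sessions = sessions_reaching(sessions, "/cart")
print(len(cart_sessions), [p for _, p in cart_sessions[0]])
```

A hit-based filter on `/cart` would return one line of the log; the session-based filter returns the whole visit that led to the cart, which is what start-to-finish session analysis needs.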

WebTrends and Funnel Web each support virtually all log formats. As a single user who frequently runs the same report, I found Funnel Web easier to use. But if you need to manage multiple report profiles or automate reporting, WebTrends' profile management and robust scheduler offer more powerful configuration and automation.

Both programs produce a range of high quality report styles. WebTrends offers HTML (framed), MS Word, Excel, comma-delimited, and ASCII; Funnel Web offers HTML with or without frames, RTF, comma-delimited, and PDF, a format no other Web metrics program offers. Note that each program slices the data differently in some categories, creating differences in data classification. Some of these differences are inherent in the programs; others can be tweaked by customizing the settings.

While the two products generate high quality charts, I prefer Funnel Web's graphs because of their detail and the clarity with which they present complex data. Both allow 3D, 2D, pie chart, and other customized graphs that not only present the data colorfully, but that can be pulled from the reports and added to PowerPoint presentations to share with stockholders, employees, or customers.

The reports for both products are simple to navigate, read, and interpret, and they provide accurate analysis. By tweaking the filters and other settings, you can tailor reports to meet your precise needs. Ultimately, you'll draw conclusions based on your own interpretation of the data, so the final criterion is how well the reports facilitate your comprehensive analysis.

Test Results

I tested the performance (defined as the speed with which a tool processes logs and generates the finished report) of both products on a Windows 98 PC, Pentium III 500 with 128MB of RAM, processing six months of logs (182 logs, 23.1MB) covering activity on my site from October 1, 2000 to March 31, 2001. I downloaded the logs to my local machine to eliminate any Internet latency issues other than DNS lookups. Then I published the results on my site.

While testing WebTrends, I compared the results between Enterprise Suite 5.0 and 5.5. In Resolve mode (reverse DNS, in which all numeric IP addresses are resolved to domain names), Enterprise Suite 5.0 averaged 3 minutes and 38 seconds in three tests, whereas Enterprise Suite 5.5 averaged 1 minute and 48 seconds in three tests. Variances in the results due to reverse DNS resolution can be extensive. What's notable about WebTrends' performance is the role played by the FasTrends database. Building the database for the first time takes significantly longer than generating subsequent reports because, once the data is stored, WebTrends processes only those logs that haven't been stored in the database since the last build. Rebuilding the FasTrends database from scratch yielded several long times (3 minutes, 29 seconds; 3 minutes, 54 seconds; 16 minutes, 50 seconds), while subsequent reports consistently took an average of 45 seconds: between 10 and 14 seconds to actually run the analysis, with the remaining time spent generating the final report.
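The reason a stored database speeds up later runs can be sketched in a few lines of Python. This is an illustration of incremental processing in general, not a description of how FasTrends is built: the JSON store, the `update_store` function, and the log file names are all hypothetical.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path

def update_store(log_dir, store_path):
    """Fold any not-yet-processed *.log files into a JSON store; return the names processed."""
    store_path = Path(store_path)
    if store_path.exists():
        store = json.loads(store_path.read_text())
    else:
        store = {"processed": [], "hits_by_host": {}}
    done = set(store["processed"])
    totals = Counter(store["hits_by_host"])
    new_files = [p for p in sorted(Path(log_dir).glob("*.log")) if p.name not in done]
    for log in new_files:
        with log.open() as handle:
            for line in handle:
                fields = line.split()
                if fields:
                    totals[fields[0]] += 1   # assumes the client host is the first field
        store["processed"].append(log.name)
    store["hits_by_host"] = dict(totals)
    store_path.write_text(json.dumps(store))
    return [p.name for p in new_files]

# Demonstration with made-up log files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    logs = Path(tmp)
    store_file = logs / "store.json"
    (logs / "ex001001.log").write_text("192.0.2.10 - - x\n192.0.2.20 - - x\n")
    (logs / "ex001002.log").write_text("192.0.2.10 - - x\n")
    first_run = update_store(logs, store_file)    # both logs are new
    second_run = update_store(logs, store_file)   # nothing new, so nothing is reread
    (logs / "ex001003.log").write_text("192.0.2.30 - - x\n")
    third_run = update_store(logs, store_file)    # only the new log is read
    totals = json.loads(store_file.read_text())["hits_by_host"]

print(first_run, second_run, third_run, totals)
```

The first run pays for every log; later runs pay only for what has arrived since, which matches the pattern in the timings above.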

Funnel Web with DNS lookup turned on initially processed the logs in 50.7 seconds. Subsequent results over five tests varied within a much narrower range than WebTrends did: 22.47, 25, 35, and 48.9 seconds, with a slowest time of 2 minutes, 53.18 seconds, and an overall average (disregarding the slowest time) of 36.4 seconds. The average time for the log analysis alone was 9.5 seconds. Again, variables in reverse DNS lookup account for the different results. On average, Funnel Web proved eight times as fast as WebTrends 5.0 and three times as fast as WebTrends 5.5 (disregarding WebTrends' slowest time).
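The averages and ratios can be rechecked from the timings quoted in this section. The short Python sketch below uses only those figures and assumes that the 36.4-second average covers the initial run plus the four faster subsequent runs.

```python
from statistics import mean

# Elapsed times in seconds, as reported in this section.
funnel_web_runs = [50.7, 22.47, 25, 35, 48.9]   # slowest run (2 min 53.18 s) left out
webtrends_50 = 3 * 60 + 38                      # 3 min 38 s average, Resolve mode
webtrends_55 = 1 * 60 + 48                      # 1 min 48 s average, Resolve mode

funnel_web_avg = mean(funnel_web_runs)
ratio_50 = webtrends_50 / funnel_web_avg
ratio_55 = webtrends_55 / funnel_web_avg

print(f"Funnel Web average: {funnel_web_avg:.1f} s")
print(f"WebTrends 5.0 vs. Funnel Web: {ratio_50:.1f}x")
print(f"WebTrends 5.5 vs. Funnel Web: {ratio_55:.1f}x")
```

On these numbers the sketch reproduces the 36.4-second average and the roughly threefold gap to version 5.5. For version 5.0 it gives a ratio closer to six than eight, so the eightfold figure presumably rests on timings not listed here.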

Service Provider Applications

This market sector has different business needs: automated processing and reporting on Web traffic for any number of sites or any size of server logs (2GB or larger), fast processing speed, simplicity of deployment, reliability, and low resource consumption. These products, typically server-installed and automated, range in price from under $1000 to over $20,000 depending on the number of sites you host and the sheer size of the server logs your system processes.

Hosts commonly use three products: MediaHouse LiveStats, WebTrends Enterprise Reporting Server, and Urchin (prices vary based on provider licensing agreements). The latter comes in Pro 3, Dedicated 3, MultiHome, and Enterprise versions. Urchin sees the end user and the hosting market as one market with two customer sets: hosts and end users. To solve the problems of both user sets, Urchin focuses on a combination of front-end features, back-end performance, heterogeneous network support, multilingual support including double-byte characters, and a scaled pricing model.

I tested Urchin Pro 3.3 ($199 per domain) on a Windows NT Server 4.0 PC with a Pentium 166 and 98MB of RAM. Lacking an Internet connection, I couldn't run reverse DNS lookups. I encountered a few problems installing and running Pro 3.3, but its PDF guide (although much leaner than WebTrends' or Funnel Web's guides) provided the answers as I went. The Windows version of Pro 3.3, which runs on NT or Windows 2000, works with IIS 4, IIS 5, or Apache logs. Urchin supports multiple platforms, including Unix, Cobalt, Solaris, Linux, FreeBSD, and Mac OS X. Support for Windows 95/98/ME is expected in the next release.

You can run either local or remote logs. I chose remote logs. Because Urchin typically runs in auto mode every day, it was tricky to run through the same six months of logs as my previous tests. I had to point Urchin to a specific log; it wouldn't accept a wildcard (*.log) to select a range. You can, however, automate daily FTP logs by using a YYMMDD date pattern. The Unix version does allow command line wildcards. I concatenated the 182 logs into a single log, then processed that 23.1MB log. The analysis took just 2 minutes, 18 seconds. The larger the server log, the faster Urchin performs relative to other programs, especially for reverse DNS lookups, making it ideally suited to large data centers or hosts. Urchin calls this "bi-directional scalability," whereby the product scales from the smallest Web site to the largest data center without compromising performance.

To test the reverse DNS lookup and Urchin's large-file processing, the next test ran a 1.1GB log (5 months of logs, 40,000 to 150,000 hits per day) on an AMD Athlon 650 with a 40GB Integrated Drive Electronics (IDE) drive running FreeBSD 4.2. Without DNS lookup, the elapsed time was 2 minutes, 7 seconds; with DNS at 80 percent resolution, the elapsed time was 50 minutes, 14 seconds.
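Reverse DNS dominates the elapsed time because each distinct address costs a network round trip, and some addresses never resolve. The Python sketch below shows the usual mitigation, caching each answer (including failures), and one way to compute a resolution rate. It is not Urchin's implementation; the rate is assumed here to mean the fraction of distinct addresses that resolve, and the stand-in lookup table with its host names is made up so the example runs offline.

```python
import socket

def make_resolver(lookup=socket.gethostbyaddr):
    """Return a caching reverse-DNS function; `lookup` is injectable for offline use."""
    cache = {}
    def resolve(ip):
        if ip not in cache:
            try:
                cache[ip] = lookup(ip)[0]   # gethostbyaddr returns (hostname, aliases, addresses)
            except OSError:                 # socket.herror and socket.gaierror are OSError subclasses
                cache[ip] = None            # remember failures so each address is tried only once
        return cache[ip]
    return resolve

def resolution_rate(ips, resolve):
    """Fraction of distinct addresses that resolved to a host name."""
    distinct = set(ips)
    resolved = sum(1 for ip in distinct if resolve(ip) is not None)
    return resolved / len(distinct) if distinct else 0.0

# Offline stand-in for DNS with made-up names.
FAKE_DNS = {"192.0.2.10": "host10.example.com", "192.0.2.20": "host20.example.net"}
calls = []

def fake_lookup(ip):
    calls.append(ip)
    if ip not in FAKE_DNS:
        raise socket.herror(1, "Unknown host")
    return (FAKE_DNS[ip], [], [ip])

resolve = make_resolver(fake_lookup)
hits = ["192.0.2.10", "192.0.2.10", "192.0.2.20", "192.0.2.30", "192.0.2.10"]
names = [resolve(ip) for ip in hits]
rate = resolution_rate(hits, resolve)
print(names, len(calls), round(rate, 2))
```

Five hits trigger only three lookups here. On a real log, replacing `fake_lookup` with the default `socket.gethostbyaddr` performs live lookups, and the unresolvable addresses are the ones that stretch the run time.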

The quality of the reports is fine, even though they aren't as graphically rich as those from WebTrends and Funnel Web (see Figure 3). There are several reasons for this, the most important being that Urchin creates its reports on the fly, rather than by storing static HTML pages and associated images. Urchin produces only three lean data files per site per month, applying the data to templates to produce dynamic HTML/JavaScript report pages. The JavaScript handles the final calculations for averages and other results, producing reports for any user-specified date range without creating bulky graphics files. These dynamic pages are easy to navigate, and consume far fewer system resources than the Enterprise class products. As a result, reports aren't produced by reanalyzing the logs; in fact, you can discard the logs once the database is built. This is important if your goal is to provide detailed, accurate, and fast Web traffic reports to tens of thousands of customers. If you require 3D pie charts, histograms, and bar graphs, you can export the data to Excel to create the graphs.
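The idea of computing a report at view time from compact stored data can be sketched as follows. Urchin does this step in JavaScript inside the report page; the sketch uses Python for consistency, and the per-day record layout, the `range_report` function, and every number in it are hypothetical rather than Urchin's actual data format.

```python
from datetime import date

# Hypothetical compact store: one small record per day instead of the raw log lines.
daily = {
    date(2001, 3, 1): {"hits": 40000, "sessions": 4000},
    date(2001, 3, 2): {"hits": 150000, "sessions": 15000},
    date(2001, 3, 3): {"hits": 80000, "sessions": 8000},
}

def range_report(daily, start, end):
    """Compute totals and averages for an inclusive date range from the stored daily records."""
    days = [stats for day, stats in daily.items() if start <= day <= end]
    if not days:
        return None
    total_hits = sum(d["hits"] for d in days)
    total_sessions = sum(d["sessions"] for d in days)
    return {
        "days": len(days),
        "hits": total_hits,
        "avg_hits_per_day": total_hits / len(days),
        "hits_per_session": total_hits / total_sessions,
    }

report = range_report(daily, date(2001, 3, 1), date(2001, 3, 2))
print(report)
```

Because the averages are derived only when someone asks for a date range, nothing has to be rendered or stored per report, and the raw logs are no longer needed once the daily records exist.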

Individual Choice

After determining which product class is right for you, you should consider features, pricing, and performance to help you make a final choice. In the enterprise class, the reporting and graphics quality of both WebTrends and Funnel Web Analyzer is outstanding. You should also take a good look at features unique to each. For the fastest analysis and the best graphics at the lowest price, Funnel Web is the best buy. However, if you need depth of customization, session management, dynamic site reporting, and other high-end features, WebTrends continues to offer the most complete solution in its class.

In the SP category, Urchin is an outstanding performer, offering a wide range of configurations based on platform and server load. It has unique features ideally suited to data centers, government agencies, universities, and large hosts. While you shouldn't rule out the WebTrends Reporting Server, I would recommend that you seriously consider the Urchin product that matches your system needs.


Ernest established his own Web-design firm, Standing Stone Designs, in 1996. He has worked as a Web developer at Applix, a Webmaster at Idiom, and as a freelance Web developer. In April, he became Webmaster at ProSoundWeb.com. You can reach him at [email protected].

