August, 2004: Performance & System Testing

Automating the data-collection process

Tom is a software engineer with IBM Tivoli, where he has been a member of various test organizations and specializes in software performance engineering. He can be contacted at tbodenhe@us.ibm.com.


As a performance tester of enterprise software solutions, I have to monitor and collect a large amount of performance data from a variable number of systems. The enterprise software I test involves distributed components running on a wide range of hardware and operating systems. Since these tests can involve anywhere from four to hundreds of actual computers, I need to automate the collection of performance data as much as possible.

On Windows machines, I had been using a combination of Perfmon and Logman (both included with Windows) to gather the performance data during test runs. But setting up these programs involved manually selecting the devices to report on for each physical system. Some systems are simple single-processor, single-disk, desktop-style boxes, while others are multiprocessor, multiple-drive, high-powered servers. My problem was that I needed to eliminate the manual Perfmon and Logman configuration for these physically different machines while still collecting system-specific data for all of them. I also wanted a simple command-line way to start and stop data collection. And since I often use Perl to analyze the data and generate summary reports for all machines used in a test, the collected data needed to be in a text format.

I developed a solution that leverages the Microsoft Performance Data Helper (PDH) library. By using the PDH library, a single, simple program automatically monitors a wide range of different hardware and collects performance data on all the system devices. The PDH library lets me monitor all the physical activity (processor, hard disk, and network) during tests. By wrapping this functionality into a Windows service, I gained the command-line start/stop ability that I wanted. My primary job responsibility is to run the actual tests, so I have limited time to dedicate to tool development. Using the PDH library in a Windows service minimized the amount of time I spent getting the tool up and running.

The Microsoft Performance Data Helper Library

The underlying performance-monitoring functionality used in Perfmon and Logman is provided by the Microsoft Performance Data Helper (PDH) library. The basic function of the PDH library is contained in PDH.dll and is available to C/C++ developers by including the pdh.h and pdhmsg.h headers in their code and linking against pdh.lib. I recommend that you download and use the latest version of the Microsoft Platform SDK to take advantage of all the functions available in recent versions of the PDH library.

The basic abstraction of performance monitoring in Windows is the counter. The MSDN online documentation defines a counter as "a performance data item whose name is stored in the registry." Performance object counters are defined for physical components such as processors, disks, and network interfaces, while system object counters are defined on processes, threads, and other operating-system objects. Counters are grouped together in queries so that they can all be collected at the same time.

Counters are defined by their name and path. The basic format of this path, along with some actual counter paths, is shown below. As you can see, not all counters have a ParentInstance or InstanceIndex. There are wildcard functions available that take a basic path and expand it to match the counters available on a specific system.
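
In outline, the path format looks like this; SERVER01 is just a placeholder machine name, and the concrete paths are illustrative:

\\Machine\Object(ParentInstance/ObjectInstance#InstanceIndex)\Counter

\\SERVER01\Processor(0)\% Processor Time
\Process(svchost#2)\% Processor Time
\PhysicalDisk(0 C:)\% Disk Time
\Memory\Available MBytes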

Example Counter Strings

I played around with several different string formats when trying to find the counter path names for the different physical object counters I wanted to monitor. Referring again to the basic format, the first thing I found was that I didn't really need to include the Machine name part of the string. I also initially struggled to figure out how to use the ParentInstance/ObjectInstance#InstanceIndex part of the counter path. Different counters may or may not have all of these parts in their paths. For example, the processor counters have counter path names like:

\Processor(0)\% Processor Time

What I found was that I could simply use wildcard symbols for the entire ParentInstance/ObjectInstance#InstanceIndex portion of the counter path, for any counter. Thus, I can just use:

\Processor(*/*#*)\% Processor Time

I can use the (*/*#*) wildcard syntax for every counter. The PDH wildcard-expansion functions expand the names for me and return all matching counter paths. Check out the included source code for examples of this.
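
On a hypothetical two-processor machine, for instance, expanding \Processor(*/*#*)\% Processor Time returns one path per instance, such as:

\Processor(0)\% Processor Time
\Processor(1)\% Processor Time
\Processor(_Total)\% Processor Time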

The Final Design For the PDH Component

I created a simple class, PerfLogger (see Listings One and Two), that follows the necessary procedure for using PDH library counters for performance-data logging; a minimal sketch of the full sequence appears after this list. Those steps are:

  1. Create a query using the PdhOpenQuery function.
  2. Add counters to the query. (a) Generate an array of counters using either the PdhExpandCounterPath or PdhExpandWildCardPath functions. (b) Add counters to the query with the PdhAddCounter function.
  3. Open a logfile using the PdhOpenLog function.
  4. Start logging data to the logfile using the PdhUpdateLog function.
  5. Stop logging and close the logfile with the PdhCloseLog function.
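
Here is that sketch, assuming a hard-coded counter path, a fixed ten-sample run, and no error handling; the filename sample.csv is arbitrary:

#include <windows.h>
#include <pdh.h>
#include <pdhmsg.h>
#pragma comment(lib, "pdh.lib")

int main()
{
    HQUERY   hQuery   = NULL;
    HCOUNTER hCounter = NULL;
    HLOG     hLog     = NULL;
    DWORD    logType  = PDH_LOG_TYPE_CSV;

    // 1. Create a query.
    PdhOpenQuery(NULL, 0, &hQuery);
    // 2. Add a counter to the query (wildcard expansion omitted for brevity).
    PdhAddCounter(hQuery, TEXT("\\Processor(_Total)\\% Processor Time"), 0, &hCounter);
    // 3. Open a CSV logfile bound to the query.
    PdhOpenLog(TEXT("sample.csv"), PDH_LOG_WRITE_ACCESS | PDH_LOG_CREATE_ALWAYS,
               &logType, hQuery, 0, NULL, &hLog);
    // 4. Take ten samples, one per second; each call appends a record to the log.
    for (int i = 0; i < 10; i++) {
        PdhUpdateLog(hLog, TEXT("sample"));
        Sleep(1000);
    }
    // 5. Close the log (and the query along with it).
    PdhCloseLog(hLog, PDH_FLAGS_CLOSE_QUERY);
    return 0;
}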

Writing the PerfLogger class was quick and required a limited amount of coding. This class pulls out all the processor, physical disk, memory, and network interface statistics I want to collect during tests. It handles all the Windows machines in my testbed regardless of the physical hardware differences. The only arguments needed in its constructor are a full path and filename for the log file, and the integer number of seconds between queries to the counters. This gives other team members the flexibility to use the PerfLogger class while putting the log file into any directory and using a different time interval between measurements.
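
For example, with an arbitrary filename and a 15-second interval (both are just placeholders), the class is driven like this:

PerfLogger logger("C:\\perfdata\\node1.csv", 15);  // log file and sampling interval
logger.findAndActivatePerfMetrics();               // expand wildcards, add counters to the query
logger.startPerfLog();                             // loops and logs until stopPerfLog() is called
// ... later, from another thread (the service control handler):
// logger.stopPerfLog();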

Creating a Windows Service for the PerfLogger

After creating the PerfLogger, all the functionality I needed for the actual logging was in place. But I still needed a simple command-line way to start and stop the logging. I had previously used the net start <service name> and net stop <service name> commands to start and stop Windows services from a command line. This offered all the functionality I needed, and the steps to create a Windows service are well documented in the Microsoft Platform SDK.
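
With the service installed under an assumed name such as PerformanceLogger, collection for a test run is simply:

net start PerformanceLogger
rem ... run the test ...
net stop PerformanceLogger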

All services are managed by the Service Control Manager (SCM) in the Windows OS. The SCM is started at system boot-up and is a remote procedure call server. It is responsible for maintaining a database of installed services, starting services at boot-up or on demand, and transmitting control messages to running services. Developers work with five main interaction points with the SCM when creating and running a Windows service:

  • One point is during the service installation. This requires connecting to the SCM on the machine and opening the SCM database. The OpenSCManager function is used to do this. Next, you need to use the CreateService function to register your executable with the Service Control Manager. The CreateService function lets you set the location of the executable for the service, the name of the service, the context of the service (for example, running as its own process or as a shared process), and other SCM database values for the service.
  • The program should have a normal main function that performs two major steps. It creates a SERVICE_TABLE_ENTRY array that contains the name of the service and the ServiceMain function used to start the service, then calls the StartServiceCtrlDispatcher function with that array as the argument. This starts the control dispatcher thread, which loops and waits for control messages for the service.
  • The SCM calls the ServiceMain function to start your service. This function is analogous to the normal main function of a program. You must implement the ServiceMain function in your code. The first task in ServiceMain is to call the RegisterServiceCtrlHandler function to register the method that will handle the messages sent to the service by the SCM. At that point, any code that the service will execute is inserted. I have a while loop that creates an instance of my PerfLogger class, finds the performance counters on the machine, and starts logging the performance data.
  • Another interaction point is the handler function registered during the RegisterServiceCtrlHandler call in ServiceMain. This function contains a switch statement so that the SCM can pass control messages to the service. For my service, the only message I needed to handle was the stop request. When a stop request is received, a pointer to the PerfLogger instance from the ServiceMain function is used to stop the performance logging. The handler also sets the Boolean in the while condition of the ServiceMain loop to FALSE to end that loop.
  • The last point is during deletion of the service from the Service Control Manager. Connect to the SCM and open the SCM database with the OpenSCManager function. Then call the DeleteService function to remove the service from the SCM database.

This defines the basic steps and functions required to create a Windows service that can be started from the command line. A skeletal sketch follows.
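
Here is a skeletal sketch of how those pieces fit together; the service name, logfile path, and interval are assumptions, and installation, removal, and error handling are omitted for brevity:

#include <windows.h>
#include "PerfLogger.h"

static SERVICE_STATUS        g_status;
static SERVICE_STATUS_HANDLE g_statusHandle;
static PerfLogger           *g_logger    = NULL;
static BOOL                  g_running   = FALSE;
static TCHAR                 g_svcName[] = TEXT("PerformanceLogger");

// Handler registered with RegisterServiceCtrlHandler; receives SCM control messages.
void WINAPI ServiceCtrlHandler(DWORD control)
{
    switch (control) {
    case SERVICE_CONTROL_STOP:
        g_status.dwCurrentState = SERVICE_STOP_PENDING;
        SetServiceStatus(g_statusHandle, &g_status);
        g_running = FALSE;                       // ends the ServiceMain loop
        if (g_logger) g_logger->stopPerfLog();   // ends the logging loop
        g_status.dwCurrentState = SERVICE_STOPPED;
        SetServiceStatus(g_statusHandle, &g_status);
        break;
    default:
        SetServiceStatus(g_statusHandle, &g_status);
        break;
    }
}
// Called by the SCM to start the service; analogous to a normal main().
void WINAPI ServiceMain(DWORD argc, LPTSTR *argv)
{
    g_statusHandle = RegisterServiceCtrlHandler(g_svcName, ServiceCtrlHandler);
    g_status.dwServiceType      = SERVICE_WIN32_OWN_PROCESS;
    g_status.dwControlsAccepted = SERVICE_ACCEPT_STOP;
    g_status.dwCurrentState     = SERVICE_RUNNING;
    SetServiceStatus(g_statusHandle, &g_status);

    g_running = TRUE;
    while (g_running) {
        PerfLogger logger("C:\\perflogs\\perfdata.csv", 15);  // assumed path and interval
        g_logger = &logger;
        logger.findAndActivatePerfMetrics();
        logger.startPerfLog();     // blocks here until stopPerfLog() is called
        g_logger = NULL;
    }
}
// The real main just hands this process over to the SCM's control dispatcher.
int main()
{
    SERVICE_TABLE_ENTRY table[] = {
        { g_svcName, ServiceMain },
        { NULL, NULL }
    };
    StartServiceCtrlDispatcher(table);
    return 0;
}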

Results and Other Applications

The PerformanceLogger service I created with these methods (available electronically; see "Resource Center," page 5) has simplified my performance and scale testing of enterprise software. (Also available electronically are a sample configuration file and a sample output file.) I install the service on any Windows machine that is included in my test environment, and I'm ready to collect the basic performance data I need. This shortens my preparation and setup time considerably. Automation is important to any software test effort that involves multiple machines, and this tool removes previously necessary manual configuration steps.

There are other potential applications of the PDH library and Windows services that could provide extra value to Windows administrators. A service like the one I've outlined could serve as a cheap monitoring solution for Windows machines. You could also use the PDH library to build performance monitoring into programs to provide autonomic adjustment of system resource usage. Both the PDH library and the Windows services API allow quick development of test and monitoring tools. Testers can use the methods I've outlined to create test tools that address different aspects of product testing. Play around with the example code and modify it to fit your environment.

References

Braithwaite, Kevin. "Custom Performance Analysis Using the Microsoft Performance Data Helper"; IBM WebSphere Developer Technical Journal; http://www-106.ibm.com/developerworks/websphere/techjournal/0310_braithwaite/braithwaite.html.

Anish, C.V. "Creating a Windows NT/Windows 2000 Service"; Microsoft February 2003 Software Development Kit (SDK); http://www.codeguru.com/Cpp/WP/system/ntservices/article.php/c5701/.

DDJ



Listing One

#include "stdafx.h"
#include <pdh.h>
#include <pdhmsg.h>

#define INITIALPATHSIZE 2048
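/*  PerfLogger wraps a single PDH query.  The constructor opens the query and
    sets the log type and sampling interval, findAndActivatePerfMetrics() adds
    the wildcard-expanded counters to the query, and startPerfLog()/stopPerfLog()
    start and stop CSV logging.
*/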
class PerfLogger{
    char        logFile[512];
    int         intervalBetweenMeasurements;//in milliseconds
    HQUERY      hQuery;
    HLOG        phLog;
    DWORD       logType;
    BOOL        logging;
public:
    PerfLogger();
    PerfLogger(char* logFileName, int interval);
    int findAndActivatePerfMetrics();
    void startPerfLog();
    void stopPerfLog();

private:
    PDH_STATUS getAllMetricsFor(char *wildCardPath);
};


Listing Two
#include "PerfLogger.h"
PerfLogger::PerfLogger(){
}
PerfLogger::PerfLogger(char* logFileName,int interval)
{
    strcpy(logFile,logFileName);
    /*  The logType defines what type of log will be used for output.
        Since I use Perl to often summarize data, a comma separated value
        file is what I wanted.  CSV log files are one of the options, so 
        I was in business.
    */
    logType = PDH_LOG_TYPE_CSV;
    /*  We open a PDH query in the constructor - we'll add counters to it
        later.  By having all our counters in one query, whenever a snapshot
        of the counters is taken, all the counters are sampled at that same 
        time.
    */
    PdhOpenQuery(0,0, &hQuery);
    
    /*  I needed to sample the counters for some integral number of seconds.
        The actual argument is in milliseconds, so we multiply by 1000.
    */
   intervalBetweenMeasurements=interval * 1000;
    
    /*  A boolean value to help keep track of when logging should be ongoing
        or not.
    */
    logging = FALSE;
}
/*  A member to allow us to find the subset of individual machine counters I'm
    interested in logging.  For me, I wanted the
        % Processor Time for all processors.
        % Disk Time for all disks
        % Disk Read Time for all disks
        % Disk Write Time for all disks
        Available Mbytes of memory during the monitoring period.
        Bytes Received/Sec for all network interfaces
        Bytes Sent/Sec for all network interfaces
Although it's bad practice, I don't check the return code status when
wildcarding through the counters.  So far, it hasn't caused me any problems.
*/
int PerfLogger::findAndActivatePerfMetrics(){
    char wildCardPath[256];
    PDH_STATUS pdhStatus;
// Use the counter path format without specifying the computer.
//    \object(parent/instance#index)\counter

    strcpy(wildCardPath,"\\Processor(*/*#*)\\%% Processor Time");
    pdhStatus=getAllMetricsFor(wildCardPath);

    strcpy(wildCardPath,"\\PhysicalDisk(*/*#*)\\%% Disk Time");
    pdhStatus=getAllMetricsFor(wildCardPath);
    strcpy(wildCardPath,"\\PhysicalDisk(*/*#*)\\%% Disk Read Time");
    pdhStatus=getAllMetricsFor(wildCardPath);
    strcpy(wildCardPath,"\\PhysicalDisk(*/*#*)\\%% Disk Write Time");
    pdhStatus=getAllMetricsFor(wildCardPath);
        
    strcpy(wildCardPath,"\\Memory(*/*#*)\\Available MBytes");
    pdhStatus=getAllMetricsFor(wildCardPath);

    strcpy(wildCardPath,"\\Network Interface(*/*#*)\\Bytes Received/sec");
    pdhStatus=getAllMetricsFor(wildCardPath);
    strcpy(wildCardPath,"\\Network Interface(*/*#*)\\Bytes Sent/sec");
    pdhStatus=getAllMetricsFor(wildCardPath);
    return 0;
}
/*  This member actually takes a counter path with wildcards and uses the
    PdhExpandWildCardPath to get all the matching counter paths. It then 
    adds these expanded paths to the query we created in the constructor.
*/
PDH_STATUS PerfLogger::getAllMetricsFor(char *WildCardPath){
   LPSTR  szCtrPath = NULL;
    char   szWildCardPath[256] = "\000";
    DWORD  dwCtrPathSize = 0;

    HCOUNTER phcounter;
    PDH_STATUS  pdhStatus;
    strcpy(szWildCardPath, WildCardPath); // copy the wildcard path into a local buffer
// First try with an initial buffer size.
    szCtrPath = (LPSTR) GlobalAlloc(GPTR, INITIALPATHSIZE);
    dwCtrPathSize = INITIALPATHSIZE;
    pdhStatus = PdhExpandWildCardPath(NULL,szWildCardPath, szCtrPath, 
                    &dwCtrPathSize,NULL);
// Check for a too small buffer.
    if (pdhStatus == PDH_MORE_DATA)
    {
        dwCtrPathSize++;
        GlobalFree(szCtrPath);
        szCtrPath = (LPSTR) GlobalAlloc(GPTR, dwCtrPathSize);
        pdhStatus = PdhExpandWildCardPath(NULL,szWildCardPath, szCtrPath, 
                    &dwCtrPathSize,NULL);
    }
// Add the paths to the query
    if (pdhStatus == PDH_CSTATUS_VALID_DATA)
    {
        LPTSTR ptr;
        ptr = szCtrPath;
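        // szCtrPath is a MULTI_SZ buffer: each expanded counter path is
        // NUL-terminated and the list ends with an extra NUL, so walk it
        // string by string until the empty string is reached.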
        while (*ptr)
        {
            pdhStatus = PdhAddCounter(hQuery,ptr,0,&phcounter);
            ptr += strlen(ptr);
            ptr++;
        }
    }
    else printf("PdhExpandCounterPath failed: %d\n", pdhStatus);
    return pdhStatus;
}
/*  Since eventually the PerfLogger class will be used as a 
    service, I just start logging in an infinite loop - I'll
    count on the Windows Service API to allow me to stop
    logging by updating the value of the boolean.
*/
void PerfLogger::startPerfLog()
{
    logging = TRUE;
    PDH_STATUS pdhStatus;
    // Open the log file for write access.
    pdhStatus = PdhOpenLog (logFile, PDH_LOG_WRITE_ACCESS | 
             PDH_LOG_CREATE_ALWAYS, &logType, hQuery, 0, NULL, &phLog);
   // Capture samples and write them to the log.
   while(logging) {
       pdhStatus = PdhUpdateLog (phLog, TEXT("Some Text."));
       Sleep(intervalBetweenMeasurements); // Sleep between samples
   }
// Close the log and the Query
   pdhStatus = PdhCloseLog (phLog, PDH_FLAGS_CLOSE_QUERY);
}
//  Just to stop the logging loop
void PerfLogger::stopPerfLog(){
    logging = FALSE;
}

