Dr. Dobb's is part of the Informa Tech Division of Informa PLC


Win32 Performance Measurement Options


No Single Solution

A summary of the characteristics of the counter classes (and their underlying timing functions) is given in Table 6. The first thing to note is that none is an out-and-out winner in every conceivable scenario. As mentioned in the introduction, the selection of a particular measurement function (or class) depends not only on its availability on your targeted platform(s) and on the type of measurement (systemwide/per-process/per-thread), but also on the actual resolution of the measurement and on its cost.


Table 6: Advantages and disadvantages of Win32 timing functions.


If you want user and/or kernel timings, then you must use either the threadtimes_counter or processtimes_counter classes, but these are only functional on NT operating systems.

If you want timings that give useful results on busy systems, again the threadtimes_counter or processtimes_counter classes are your choice. Since systems where you cannot suspend or terminate other busy processes are most likely to be high-performance servers, the specificity to NT systems is unlikely to be a problem.

If you want high timing resolutions, then you must use the highperformance_counter class. This does have a high call cost, but has the highest resolution by far. In addition, it appears that the call cost is relatively lower on newer operating systems, so the dissuasively high costs seen in Windows NT 4 are likely to be less and less significant in the future. (Note that highperformance_counter has a wrap time of 100+ years. Each time the processor speed doubles, the wrapping time will halve, so when we have 1-THz machines, we will have to worry about catching the wrapping.)
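The wrap-time arithmetic in the note above can be checked with a short sketch. The helper names are hypothetical, and the 2-GHz and 1-THz frequencies are illustrative assumptions, chosen on the basis that the counter advances at roughly the processor clock rate, as the note implies.

```cpp
#include <cmath>

// Hypothetical helper (not part of WinSTL): seconds until a counter with the
// given number of value bits wraps when it advances at frequency_hz ticks
// per second, that is, 2^value_bits / frequency_hz.
inline double wrap_seconds(int value_bits, double frequency_hz)
{
    return std::ldexp(1.0, value_bits) / frequency_hz;
}

// Unit conversions used to make the results readable.
inline double seconds_to_days(double seconds)
{
    return seconds / 86400.0;
}

inline double seconds_to_years(double seconds)
{
    return seconds / (86400.0 * 365.25);
}
```

Under these assumptions, a signed 64-bit counter (63 value bits) wraps after roughly 146 years at 2 GHz, and after roughly 107 days at 1 THz. For comparison, a 32-bit millisecond counter wraps after about 49.7 days.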

If minimal call cost is the most important factor, tick_counter or multimedia_counter should be used, but be aware that they may wrap on a system that has been running for a long time, including any time spent suspended, because the value continues to be incremented while a machine is suspended. A simple program that demonstrates this is:

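  #include <windows.h>
  #include <stdio.h>

  int main()
  {
    DWORD const dw = ::GetTickCount();

    // Prints the elapsed milliseconds every half second, until interrupted.
    for(; ; ::Sleep(500))
    {
      printf("%lu\n", static_cast<unsigned long>(::GetTickCount() - dw));
    }
  }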
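A related point is how an interval computed by subtraction behaves when the tick value wraps. The following is a minimal sketch, not taken from the counter classes: elapsed_ticks is a hypothetical helper, and std::uint32_t stands in for the Win32 DWORD so that the fragment compiles without the Windows headers.

```cpp
#include <cstdint>

// Hypothetical helper: elapsed ticks between two 32-bit tick values.
// Unsigned subtraction is performed modulo 2^32, so the result remains
// correct across a single wrap of the counter. Intervals longer than one
// full wrap period cannot be distinguished from shorter ones.
inline std::uint32_t elapsed_ticks(std::uint32_t start, std::uint32_t now)
{
    return now - start;
}
```

For example, with start equal to 0xFFFFFF00 and now equal to 0x00000100 (the counter having wrapped in between), the result is 0x200, or 512 ticks.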

On NT operating systems, systemtime_counter is almost as low cost, and it does not have the wrap problem.

If you require support on every operating system, without the use of any dispatching by either the preprocessor or at run time, then you must use tick_counter.

Overall, the choice of a class depends on the circumstances in which it is to be used. The information in this article should be of use when making such assessments, and the two programs described may be run on your target system(s) to provide more detailed information.

performance_counter

Because no single class provides the best solutions in all cases, a seventh counter class, performance_counter, is provided, which has the functionality of highperformance_counter where a high-performance hardware counter is available, otherwise defaulting to that provided by tick_counter. Its implementation is shown in Listing Ten. It also uses the late-evaluation and statics techniques to work out (one time only) whether the hardware counter support is present.

Listing Ten: Extract from winstl_performance_counter.h


/* /////////////////////////////////////////////////////////////
 * ...
 *
 * Extract from winstl_performance_counter.h
 *
 * Copyright (C) 2002, Synesis Software Pty Ltd.
 * (Licensed under the Synesis Software Standard Source License:
 *  http://www.synesis.com.au/licenses/ssssl.html)
 *
 * ...
 * ////////////////////////////////////////////////////////// */

inline /* static */ performance_counter::interval_type performance_counter::_query_frequency()
{
    interval_type   frequency;

    // If no high-performance counter is available ...
    if( !::QueryPerformanceFrequency(reinterpret_cast<LARGE_INTEGER*>
        (&frequency)) ||
        frequency == 0)
    {
        // ... then set the divisor to be the frequency for GetTickCount(), 
        // which is 1000 since it returns intervals in milliseconds.
        frequency = 1000;
    }

    return frequency;
}

inline /* static */ performance_counter::interval_type performance_counter::_frequency()
{
    static interval_type    s_frequency = _query_frequency();

    return s_frequency;
}

inline /* static */ void performance_counter::_qpc(epoch_type &epoch)
{
    ::QueryPerformanceCounter(reinterpret_cast<LARGE_INTEGER*>(&epoch));
}

inline /* static */ void performance_counter::_gtc(epoch_type &epoch)
{
    epoch = ::GetTickCount();
}

inline /* static */ performance_counter::measure_fn_type 
  performance_counter::_get_measure_fn()
{
    measure_fn_type fn;
    epoch_type      frequency;

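    // Apply the same availability test as _query_frequency(), so that the
    // measurement function and the divisor always agree.
    if( ::QueryPerformanceFrequency(reinterpret_cast<LARGE_INTEGER*>(&frequency)) &&
        frequency != 0)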
    {
        fn = _qpc;
    }
    else
    {
        fn = _gtc;
    }

    return fn;
}

inline /* static */ void performance_counter::_measure(epoch_type &epoch)
{
    static measure_fn_type  fn  =   _get_measure_fn();

    fn(epoch);
}

// Operations
inline void performance_counter::start()
{
    _measure(m_start);
}

inline void performance_counter::stop()
{
    _measure(m_end);
}

// Attributes
inline performance_counter::interval_type performance_counter::get_period_count()
   const
{
    return static_cast<interval_type>(m_end - m_start);
}

inline performance_counter::interval_type performance_counter::get_seconds() 
   const
{
    return get_period_count() / _frequency();
}

inline performance_counter::interval_type performance_counter::get_milliseconds() 
   const
{
    interval_type   result;
    interval_type   count   =   get_period_count();

    if(count < __STLSOFT_GEN_SINT64_SUFFIX(0x20C49BA5E353F7))
    {
        result = (count * interval_type(1000)) / _frequency();
    }
    else
    {
        result = (count / _frequency()) * interval_type(1000);
    }

    return result;
}

inline performance_counter::interval_type performance_counter::get_microseconds() 
   const
{
    interval_type   result;
    interval_type   count   =   get_period_count();

    if(count < __STLSOFT_GEN_SINT64_SUFFIX(0x8637BD05AF6))
    {
        result = (count * interval_type(1000000)) / _frequency();
    }
    else
    {
        result = (count / _frequency()) * interval_type(1000000);
    }

    return result;
}
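The two hexadecimal thresholds in get_milliseconds() and get_microseconds() equal the largest signed 64-bit value divided by 1000 and by 1,000,000, respectively. Below the threshold the count can be multiplied first without overflowing, which preserves precision; at or above it the division is done first, trading precision for safety. The following is a portable sketch of the same guard. The name scale_ticks is hypothetical, and the threshold is computed rather than hard-coded, which is a substitution for the constants used in the listing.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of WinSTL): converts a non-negative tick
// count to the requested unit (1000 for milliseconds, 1000000 for
// microseconds) without overflowing a signed 64-bit integer.
inline std::int64_t scale_ticks(std::int64_t count,
                                std::int64_t frequency,
                                std::int64_t units_per_second)
{
    // Largest count that can be multiplied by units_per_second safely.
    const std::int64_t limit =
        std::numeric_limits<std::int64_t>::max() / units_per_second;

    if(count < limit)
    {
        // Multiply first: keeps the fractional part of count / frequency.
        return (count * units_per_second) / frequency;
    }

    // Divide first: loses sub-second precision, but cannot overflow.
    return (count / frequency) * units_per_second;
}
```

For example, 3 ticks at a frequency of 2 ticks per second scale to 1500 milliseconds on the multiply-first path, whereas dividing first would give 1000.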

Despite having to call the underlying timing functions via an additional indirection, the call costs of this class range from 101 to 106 percent of those of the highperformance_counter class over the range of systems used in this analysis.
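The indirection in question is a function pointer that is selected once and then reused. A portable sketch of that pattern follows. All names in it are hypothetical stand-ins (the two measure functions simply write fixed values), and the availability test is assumed to succeed.

```cpp
#include <cstdint>

typedef std::int64_t epoch_type;
typedef void (*measure_fn_type)(epoch_type&);

// Stand-ins for the two underlying timing functions (fixed values only).
inline void fine_measure(epoch_type& epoch)   { epoch = 2; }
inline void coarse_measure(epoch_type& epoch) { epoch = 1; }

// Counts how often the selection logic runs; for demonstration only.
inline int& selection_count()
{
    static int count = 0;
    return count;
}

// Stand-in for the hardware availability test (assumed to succeed here).
inline bool fine_counter_available()
{
    return true;
}

inline measure_fn_type select_measure_fn()
{
    ++selection_count();
    return fine_counter_available() ? &fine_measure : &coarse_measure;
}

inline void measure(epoch_type& epoch)
{
    // Initialized on the first call only; later calls reuse the pointer,
    // so each measurement costs one extra indirect call.
    static measure_fn_type fn = select_measure_fn();

    fn(epoch);
}
```

However many times measure() is called, the selection runs once. In C++11 and later the initialization of a function-local static is thread-safe; compilers contemporary with the article made no such guarantee.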

A final point worth remembering is that if you do not need absolute times, only relative ones, then you should just call get_period_count() on instances of this, or any other, counter class.
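As a small illustration of working with relative values only, two raw period counts can be compared directly, with no frequency lookup or unit conversion. The helper name is hypothetical, and the sketch assumes both counts come from the same counter class so that they share a unit.

```cpp
#include <cstdint>

// Hypothetical helper: expresses one raw period count as a percentage of
// another. baseline_count must be nonzero, and both counts must have been
// taken with the same counter class.
inline std::int64_t relative_percent(std::int64_t baseline_count,
                                     std::int64_t candidate_count)
{
    return (candidate_count * 100) / baseline_count;
}
```

For instance, a candidate count of 212 against a baseline of 200 gives 106 percent.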

References

Java 2 Performance and Idiom Guide, Craig Larman & Rhett Guthrie, Prentice-Hall PTR, 2000.

More Effective C++, Scott Meyers, Addison-Wesley, 1996.

More Exceptional C++, Herb Sutter, Addison-Wesley, 2002.

The five original classes — TickCounter, PerformanceCounter, SystemTimer, ThreadTimes, and ProcessTimes — were developed by my employer Synesis Software (http://synesis.com.au). They have been donated, and reworked somewhat, to form part of the WinSTL open-source project, which aims to apply STL programming techniques to the Win32 API in the form of a robust, lightweight, header-only library (http://winstl.org/).


Matthew Wilson holds a degree in Information Technology and a Ph.D. in Electrical Engineering, and is a software-development consultant for Synesis Software. Matthew's work interests are in writing bulletproof real-time, GUI, and software-analysis software in C, C++, and Java. He has been working with C++ for over 10 years, and is currently bringing STLSoft.org and its offshoots into the public domain. Matthew can be contacted at http://stlsoft.org/.

