The benefit of the macros is that profiling can be turned on and off easily. If PROFILER_PROFILING is defined, the application is profiled; if not, the macros expand to nothing.
Starting with Windows NT, Microsoft operating systems use Unicode strings internally. ANSI strings are converted to Unicode first; if you pass Unicode strings, no conversion is necessary and performance improves. Consequently, I decided to use Unicode strings in the CProfiler class at the expense of not supporting the early, ANSI-only Windows versions. I think this is not a big loss. Note that the macros prefix their string parameters with "L" via the "##" token-pasting preprocessor operator to mark them as Unicode (wide) string literals.
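As a rough illustration of how such macros might be defined, here is a minimal sketch. Only the names PROFILER_PROFILING, PROFILER_START, PROFILER_STOP, CProfiler, and StopInstance() come from the text; the macro bodies, the local variable name profilerInstance, the stub class, and ProfiledFunction() are assumptions made for the example.

```cpp
#include <string>

#define PROFILER_PROFILING  // remove this line to compile the macros away

// Stand-in for the real CProfiler (assumption): it only remembers its id.
class CProfiler {
public:
    explicit CProfiler(const wchar_t* id) : id_(id), stopped_(false) {}
    void StopInstance() { stopped_ = true; }
    const std::wstring& Id() const { return id_; }
private:
    std::wstring id_;
    bool stopped_;
};

#ifdef PROFILER_PROFILING
    // L##id pastes the L prefix onto the literal: "CFoo::Foo" becomes L"CFoo::Foo".
    #define PROFILER_START(id) CProfiler profilerInstance(L##id)
    #define PROFILER_STOP      profilerInstance.StopInstance()
#else
    // With profiling off, both macros expand to nothing.
    #define PROFILER_START(id)
    #define PROFILER_STOP
#endif

// Hypothetical profiled function; returns the id the macro handed to CProfiler.
std::wstring ProfiledFunction() {
    std::wstring seenId;
    PROFILER_START("CFoo::Foo");
#ifdef PROFILER_PROFILING
    seenId = profilerInstance.Id();
#endif
    // ... work to be measured ...
    PROFILER_STOP;
    return seenId;
}
```

With PROFILER_PROFILING defined, ProfiledFunction() returns the wide string L"CFoo::Foo"; without it, the two macro lines reduce to empty statements and the function returns an empty string.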
The basic idea is to create one temporary text file per thread and dump the profiler data to it in comma-separated values (CSV) format. When a CProfiler object is created, it gets its thread id via the Windows GetCurrentThreadId() API call. Then, the object searches the static map of thread ids and temporary files. If a file is already open for the thread, the object uses it. Otherwise, a file named after the "threadId.profiler" pattern is opened and the object uses it. I deliberately selected the "profiler" suffix instead of "csv"; using the obscure "profiler" suffix decreases the possibility of accessing or deleting the wrong files. Note that a mutex controls access to the map.
When the application is launched, the class should be initialized with a directory where the temporary files and the final file will be created. Before the application terminates, the class should process all the temporary files to create the final one; see Listing Three.
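The lookup described above can be sketched in portable C++ as follows. This is not the CProfiler source: the class name ThreadFileMap, its member names, and the use of std::map, std::mutex, and std::ofstream are assumptions, and the caller passes in the thread id (which the article obtains from GetCurrentThreadId()). The directory is assumed to exist already.

```cpp
#include <cstddef>
#include <fstream>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical helper: maps a thread id to its temporary output file.
class ThreadFileMap {
public:
    explicit ThreadFileMap(std::string directory)
        : directory_(std::move(directory)) {}

    // File name pattern from the text: "<threadId>.profiler".
    static std::string FileNameFor(unsigned long threadId) {
        return std::to_string(threadId) + ".profiler";
    }

    // Return the file already opened for this thread, or open a new one.
    std::ofstream& FileFor(unsigned long threadId) {
        std::lock_guard<std::mutex> lock(mutex_);  // the mutex guards the map
        auto it = files_.find(threadId);
        if (it == files_.end()) {
            const std::string path = directory_ + "/" + FileNameFor(threadId);
            it = files_.emplace(threadId,
                                std::ofstream(path, std::ios::app)).first;
        }
        return it->second;  // std::map references stay valid after inserts
    }

    std::size_t OpenFileCount() {
        std::lock_guard<std::mutex> lock(mutex_);
        return files_.size();
    }

private:
    std::string directory_;
    std::mutex mutex_;
    std::map<unsigned long, std::ofstream> files_;
};
```

Only the map lookup is serialized here; because each thread writes to its own file, the writes themselves need no further locking in this sketch.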
int main()
{
    PROFILER_INITCLASS("C:/Profile_ProjectName");
    ...
    PROFILER_PROCESSDATA;
}
If no directory is specified, the class uses the current directory of the profiled application. Because the "current" directory may be changed by other means, you can end up using an unexpected directory in this case. Once the class is initialized, the developer uses the PROFILER_START and PROFILER_STOP macros within the functions to be profiled; see Listing Four.
void CFoo::Foo()
{
    PROFILER_START("CFoo::Foo");
    ...
    PROFILER_STOP;
}
The CProfiler constructor takes a string id and starts profiling. Initially, I thought that the CProfiler destructor should stop profiling. However, as destructors are not usually called deliberately, this would in effect leave stopping the profiler to the compiler. Obviously, users should be given more control over the code scope that is profiled. Hence, a dedicated function, namely CProfiler::StopInstance(), is provided to stop profiling.
The time elapsed in milliseconds between the start and stop of the profiling is the all-important profiler data. I use the Windows QueryPerformanceFrequency() API function to get the current performance-counter frequency in counts per second. If the hardware doesn't support it, the CProfiler class does nothing. The Windows QueryPerformanceCounter() API function is called twice: when the profiling is started and when it is stopped. This function retrieves the current value of the high-resolution performance counter. The difference between the stop and start values is stored in the temporary file. Converting the delta values to milliseconds is left to the tool, to keep the overhead of the CProfiler class low.
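The conversion that is deferred to the tool is simple arithmetic: milliseconds = delta counts × 1000 / frequency. The following sketch shows it in C++; the function name CountsToMilliseconds and the handling of a non-positive frequency are assumptions made for illustration.

```cpp
#include <cstdint>

// Convert a delta of performance-counter counts to milliseconds.
// frequency is in counts per second, as QueryPerformanceFrequency() reports it.
double CountsToMilliseconds(std::int64_t deltaCounts, std::int64_t frequency) {
    if (frequency <= 0) {
        return 0.0;  // assumption: treat a missing counter as "no data"
    }
    return static_cast<double>(deltaCounts) * 1000.0 /
           static_cast<double>(frequency);
}
```

For instance, a delta of 961486 counts at a frequency of 3579545 counts per second corresponds to roughly 268.6 milliseconds.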
The PROFILER_START macro takes a string id for the profiling point and starts the profiling. Function names can be used as the ids. The PROFILER_STOP macro writes the id and the delta in counts, for example "CFoo::Foo,961486", to the temporary file for its thread. If the profiling is not stopped, the record written is "CFoo::Foo,".
When CProfiler::ProcessData() is executed, a final CSV file with a name like "profiling_2006.04.26_11.26.53.csv" is opened, and the frequency of the high-resolution performance counter is recorded in it, for example as "Frequency,3579545". The file name format is "profiling_year.month.day_hour.min.seconds.csv". The tool needs the frequency to convert the delta values to milliseconds. Then, the temporary files are accessed one by one; the data of each is copied to the final file, and the temporary file is deleted. Alternatively, you can first call CProfiler::StopClass() to stop all the profiling operations and call CProfiler::ProcessData() later. Note that once the class is stopped, it no longer generates profiling data. The intention is to run the application under the profiler several times and then process all the final CSV files with the ProcessProfilerData.hta tool. Running an application under the profiler only once doesn't give much data!
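A timestamped name of the same shape as the example "profiling_2006.04.26_11.26.53.csv" can be built with the standard strftime() function. The sketch below is illustrative only; the function name FinalFileName is hypothetical, and the real class may build the name differently.

```cpp
#include <cstddef>
#include <ctime>
#include <string>

// Build a file name such as "profiling_2006.04.26_11.26.53.csv"
// from a broken-down time (for example, the result of std::localtime).
std::string FinalFileName(const std::tm& when) {
    char buffer[64];
    const std::size_t length = std::strftime(
        buffer, sizeof buffer, "profiling_%Y.%m.%d_%H.%M.%S.csv", &when);
    return std::string(buffer, length);
}
```

Because the fields run from year down to second, the names of successive runs sort chronologically.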
Records that have no delta value cause the tool to issue warnings and are otherwise ignored. Although such records provide no timing data, they are still important: assuming that the user expects execution to reach the PROFILER_STOP macros, they show that execution somehow left the expected path. I think they are particularly important when you use the CProfiler class as a probing tool to understand new code rather than to profile well-known code.
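The tool itself is an HTA script whose source is not shown here, so the following C++ sketch only illustrates the check it needs to make: split each record at the comma and flag records whose delta is missing. The names ProfilerRecord and ParseRecord are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// One parsed line of a profiler CSV file.
struct ProfilerRecord {
    std::string id;
    bool hasDelta;       // false for records such as "CFoo::Foo,"
    std::int64_t delta;  // counts; meaningful only when hasDelta is true
};

// Parse an "id,delta" line. Returns false if the line has no comma at all.
// Note: std::stoll throws if the delta text is not a number.
bool ParseRecord(const std::string& line, ProfilerRecord& record) {
    const std::size_t comma = line.rfind(',');
    if (comma == std::string::npos) {
        return false;
    }
    record.id = line.substr(0, comma);
    const std::string deltaText = line.substr(comma + 1);
    record.hasDelta = !deltaText.empty();
    record.delta = record.hasDelta ? std::stoll(deltaText) : 0;
    return true;
}
```

A caller would issue a warning for each record whose hasDelta is false and leave it out of the timing statistics.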