The ParBenCCh suite is a collection of small C and C++ applications designed to characterize compiler optimization capabilities, language support, the overhead of object-oriented programming style, and machine performance. We have developed a testing framework using a virtual base class that encapsulates the essential functionality of any benchmark. Each specific test derives from this base class and contains the code unique to that test. This common interface makes creating a new benchmark straightforward and gives us identically formatted output files containing timing information that can be processed automatically.
- The Haney test, written by Scott Haney, compares the performance of matrix-vector operations in a variety of settings:
- Real matrix multiplication, implemented with a simple matrix class that provides Fortran-like data management and Fortran-style data indexing.
- Complex matrix multiplication by introducing a complex data class and measuring overhead associated with function overloading.
- Real vector operations to test the cost of overloaded operators for arithmetic operations on complete arrays.
Each of these three tests is performed using object-oriented C++, hand-coded C, and hand-coded Fortran 77. The comparisons therefore reflect the optimizations available in each compiler, and show how well each combination of programming language and compiler handles common programming tasks.
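As an illustration of the kind of class the real matrix multiplication test describes, the sketch below stores a matrix in column-major order and indexes it from 1, as Fortran does. It is not the Haney source: FortranMatrix and multiply are hypothetical names, and the loop order is just one reasonable choice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative only: a small matrix class with Fortran-style indexing.
class FortranMatrix {
public:
    FortranMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    // 1-based indices, column-major storage.
    double& operator()(std::size_t i, std::size_t j) {
        assert(i >= 1 && i <= rows_ && j >= 1 && j <= cols_);
        return data_[(j - 1) * rows_ + (i - 1)];
    }
    double operator()(std::size_t i, std::size_t j) const {
        assert(i >= 1 && i <= rows_ && j >= 1 && j <= cols_);
        return data_[(j - 1) * rows_ + (i - 1)];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<double> data_;
};

// C = A * B. The innermost loop walks down a column, which is
// contiguous in column-major storage.
FortranMatrix multiply(const FortranMatrix& a, const FortranMatrix& b) {
    assert(a.cols() == b.rows());
    FortranMatrix c(a.rows(), b.cols());
    for (std::size_t j = 1; j <= b.cols(); ++j)
        for (std::size_t k = 1; k <= a.cols(); ++k)
            for (std::size_t i = 1; i <= a.rows(); ++i)
                c(i, j) += a(i, k) * b(k, j);
    return c;
}
```

The overhead being measured is whatever the compiler fails to remove from the operator() calls compared with hand-written index arithmetic in C or Fortran 77.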
- The Stepanov test measures the so-called abstraction penalty of C++. The code is simple: it adds 2000 doubles in an array 10000 times. The sophistication of the operation ranges from the most basic form, indexing a Fortran-like array with integers, to wrapped pointers wrapped in a reverse-iterator adaptor that is itself wrapped in a reverse-iterator adaptor. For each of 13 tests of increasing abstraction, times are reported relative to the Fortran-like loop.
- The Blitz test measures compiler support for, and the performance of, expression templates, a C++ template mechanism intended to achieve Fortran-like performance while retaining the elegant code style of C++. This test compiles a number of example codes that simply test the compiler's support for expression templates, and runs a small subset of them to generate performance information for the C++ code versus F77 and F90 versions.
- The OpenMP test exercises OpenMP-style parallel direct and indirect addressing. We first allocate a one-dimensional array of doubles whose size is close to the maximum heap space available on the node. We then perform four operations on this array: a 'linear read', which steps through the array in order, reading each element into a temporary variable; a 'linear read-write', which again steps through the array in order but increments each element by 1.0; a 'random read', which steps through the array in random order, reading each element into a temporary variable; and a 'random read-write', which steps through the array in random order, incrementing each element by 1.0. The driver loops over the number of threads and performs each of the four tests the specified number of times, filling the output file with statistical information about the tests.
- The tests in the IndirectAddressing directory exercise parallel indirect addressing using MPI-based parallelism. Three related tests are performed, each representing a phase of the abstract operation A(I) = B(J). In this particular test, we restrict A, B, I, and J to be one-dimensional parallel arrays of long ints. The three tests execute the following phases: T = B(J), A(I) = T, and A(I) = constant. The first two tests are the two halves of the general operation A(I) = B(J), while the third can be considered an initialization step. The index arrays I and J hold random integers restricted to valid indices of A and B, respectively.
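The three phases of the indirect addressing test can be sketched serially as below. This is only an illustration of the operations T = B(J), A(I) = T, and A(I) = constant on ordinary in-memory arrays; it leaves out the MPI distribution entirely, and the function names gather, scatter, and scatterConstant are hypothetical. Indices are 0-based here, following C++ convention.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Phase 1, T = B(J): read B through the index array J.
std::vector<long> gather(const std::vector<long>& b,
                         const std::vector<std::size_t>& j) {
    std::vector<long> t(j.size());
    for (std::size_t k = 0; k < j.size(); ++k) {
        assert(j[k] < b.size());  // J must hold valid indices of B
        t[k] = b[j[k]];
    }
    return t;
}

// Phase 2, A(I) = T: write T into A through the index array I.
void scatter(std::vector<long>& a,
             const std::vector<std::size_t>& i,
             const std::vector<long>& t) {
    assert(i.size() == t.size());
    for (std::size_t k = 0; k < i.size(); ++k) {
        assert(i[k] < a.size());  // I must hold valid indices of A
        a[i[k]] = t[k];
    }
}

// Phase 3, A(I) = constant: the initialization step.
void scatterConstant(std::vector<long>& a,
                     const std::vector<std::size_t>& i,
                     long value) {
    for (std::size_t k = 0; k < i.size(); ++k) {
        assert(i[k] < a.size());
        a[i[k]] = value;
    }
}
```

Calling gather followed by scatter reproduces the general operation A(I) = B(J); in the parallel setting each phase would additionally involve communication for indices that refer to data owned by another process.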
Most of the code is written in C++. Some of the tests require some level of template support. There is one Fortran 77 file in Haney. Several files in the Blitz tests are F77 and F90.
We have included parallelism tests implemented using MPI and OpenMP. The MPI tests have been run on a number of platforms and operating systems, including MPICH on Linux, vendor MPI on OSF, vendor MPI on AIX, and MPICH on Solaris. The OpenMP tests have been executed on a number of platforms using the Guide compiler from KAI.
Files in the Suite
We list the directory structure below:
- ParBenCCh-1.1.2: Contains configuration scripts, data and Makefile information.
- config: A directory holding configuration files; users should not need it.
- BenchmarkBase: Directory holding base class for all derived benchmarking objects: BenchmarkBase.
- Haney: Directory holding source code for Haney test.
- Blitz: Directory holding source code for Blitz tests.
- KAI_Bmarks: Directory holding Stepanov test files.
- OpenMP: Directory holding OpenMP parallel test files.
- IndirectAddressing: Directory holding MPI parallel test files.
If your C++ compiler does not come with a collection of standard headers that includes the STL (e.g., <vector>, <list>), you will need to find one that works with the compiler. http://www.stlport.org provides a version of the STL that has been designed for efficiency and portability.
Each of the deepest-level directories has a similar layout. For instance, KAI_Bmarks contains the files Makefile.am, Makefile.in, Makefile.user, StepanovBench.C, StepanovBench.h, and StepanovBenchTest.C. The StepanovBenchTest.C file is the driver for the test. StepanovBench.C and StepanovBench.h contain the source code for the test itself. Makefile.in is read by configure to build the Makefile. Makefile.am is included to assist users in debugging Makefile errors. Makefile.user is a simple, cleaned-up makefile for users unable to use configure successfully.
Building the Code
In a directory with at least 50 MB of free space, first unpack the ParBenCCh-1.1.2.tar.gz tarball using the following command:
gunzip -c ParBenCCh-1.1.2.tar.gz | tar xf -
This will create the directory "ParBenCCh-1.1.2" that contains all the source to ParBenCCh.
Configuring ParBenCCh for Your Platform
Change into the "ParBenCCh-1.1.2" directory and execute the following two commands:
./configure [options, see below]
make
Configure is a shell script that tests many features of the system, including basic functionality such as what arguments to pass to ar to build a library, what flags to pass to the C compiler to build with debugging symbols, what Fortran libraries are needed at link time, and many other characteristics of the machine and compiler. These settings are stored in a header file that is included in the source code to conditionally compile parts of the code depending on the results of configure. At the conclusion of the configure script, a file Makefile.user.defs is written in the top-level directory containing all the settings determined by configure. Problems or errors encountered at configure time can sometimes be diagnosed by looking at the top-level config.log file, which contains the shell commands that were executed and the resulting output.
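The conditional compilation driven by the generated header can be pictured with a sketch like the one below. The macro names SKETCH_HAVE_MPI and SKETCH_HAVE_OPENMP are hypothetical stand-ins; this document does not show the names the real header defines, so treat this only as an illustration of the pattern.

```cpp
#include <string>

// Hypothetical macros standing in for whatever the configure-generated
// header defines. In a real build they would come from that header.
// #define SKETCH_HAVE_MPI 1
// #define SKETCH_HAVE_OPENMP 1

// Reports which optional code paths were compiled in.
inline std::string enabledFeatures() {
    std::string features = "serial";
#ifdef SKETCH_HAVE_MPI
    features += " mpi";     // compiled only when configure found MPI
#endif
#ifdef SKETCH_HAVE_OPENMP
    features += " openmp";  // compiled only when OpenMP was enabled
#endif
    return features;
}
```

Code guarded this way simply disappears from the build when the corresponding feature was not detected, which is why rerunning configure with different options changes what gets compiled.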
Usually the user needs to configure the code with some options set explicitly. A full list of options for configure can be seen by executing:
./configure --help
However, the short list of options that are typically needed is:
- --enable-SERIAL_TESTS ...execute the serial C++ tests (Stepanov, Haney, Blitz).
- --enable-MPI_TESTS ...execute the MPI test (IndirectAddressing).
- --enable-OPENMP_TESTS ...execute the OpenMP tests (OpenMP).
- --with-F77 ...set Fortran 77 compiler.
- --with-F90 ...set Fortran 90 compiler.
- --with-CC ...choose C compiler.
- --with-CXX ...choose C++ compiler.
- --with-FORTLIBS ...choose Fortran libraries to link to a combined C++, F77, and F90 application.
- --with-mpi-include ...choose location of MPI include files.
- --with-mpi-lib-dirs ...choose location of MPI libraries.
- --with-mpi-libs ...choose MPI libraries.
- --with-mpirun ...define MPI run script name.
- --enable-CXX_OPTIONS ...set C++ compiler options, typically warnings, standard conformance, template flags.
- --enable-CXX_OPT ...set C++ optimization flags.
- --enable-CXX_DEBUG ...set C++ debugger flags.
- --enable-F77_OPTIONS ...set F77 compiler options, typically warnings, standard conformance.
- --enable-F77_OPT ...set F77 optimization flags.
- --enable-F77_DEBUG ...set F77 debugger flags.
- --enable-F90_OPTIONS ...set F90 compiler options, typically warnings, standard conformance.
- --enable-F90_OPT ...set F90 optimization flags.
- --enable-F90_DEBUG ...set F90 debugger flags.
For example, when configuring on a DEC Alpha running OSF1 V4, we use the following configure command (when compiling for speed):
./configure --with-F77=f77 --with-F90=f90 --with-CC=cc --with-CXX=cxx \
  --enable-SERIAL_TESTS \
  '--enable-CXX_OPTIONS=-std ansi -pthread -D__NO_USE_STD_IOSTREAM' \
  '--enable-CXX_OPT=-fast -nofp_reorder -tune host' \
  --enable-CXX_DEBUG= \
  --enable-MPI_TESTS \
  --with-mpi-include=/usr/opt/MPI102/include \
  --with-mpi-lib-dirs=/usr/opt/MPI102/lib \
  '--with-mpi-libs=fmpi mpi elan elan3 pthread' \
  --with-mpirun=prun --with-mpirun_num_procs=-n \
  --enable-FORTLIBS=-lfor \
  '--enable-F77_OPT=-fast -tune host -no_fp_reorder' \
  '--enable-F90_OPT=-fast -tune host -no_fp_reorder'
Because many platforms have different names for the serial C++, MPI C++, and OpenMP C++ compilers, it may be simpler to configure first with --enable-SERIAL_TESTS using the serial C++ compiler and build and run the serial tests. Next, run configure using --enable-MPI_TESTS with the MPI C++ compiler and libraries necessary and build and run those tests. Repeat for the OpenMP tests (if applicable).
If configure was successful, simply execute make to build the executables:
make
Alternatively, you can use Makefile.user:
make -f Makefile.user
This choice requires that the top-level Makefile.user.defs have correct values set for all variables. Configure sets these automatically, but the user can change them with a text editor. Any errors when building the tests using Makefile.user should be investigated by first looking at the top-level Makefile.user.defs.
Finally, to generate the data files from the execution of the tests, execute make check:
make check
or if using the Makefile.user interface,
make -f Makefile.user check
Running the Code
All the test executables will be created in their source directory. Each test is run by building the check target for make as described above. We describe below the names of the resulting data files for each test that must be included in the final report.
Haney: Output indicating the progress of the test is written to standard output, and the data file is HaneyBenchTest.data.
KAI_Bmarks: The StepanovBenchTest program writes to standard output results for the individual tests, but puts the summary data into StepanovBenchTest.data.
Blitz: The Blitz benchmarks are unique in that a collection of several executables is run. The results are stored in MATLAB-formatted files (denoted by the file suffix .m) or in simple text files (denoted by the file suffix .out).
OpenMP: Upon completion there will be a file RandomLoop.data that contains the timing and statistical data generated by RandomLoopTest. It will be up to the user to specify any environment variables needed for OpenMP. These settings must be provided in the report.
IndirectAddressing: Execute IndirectAddressingBenchTest on the parallel machine over a range of processors. For instance:
mpirun -np 4 IndirectAddressingBenchTest
mpirun -np 16 IndirectAddressingBenchTest
mpirun -np 64 IndirectAddressingBenchTest
IndirectAddressingBenchTest appends the results of subsequent runs to the output data file IndirectAddressingBenchTest.data. The benchmark-specific readme file describes the exact runtime parameters we request. The report must list any MPI environment variables that were set during the runs.
We have tried to use platform-independent timers in the serial and OpenMP tests while relying on MPI_Wtime for the MPI tests.
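One way to picture a platform-independent timer is the sketch below, which times a kernel with std::chrono::steady_clock. This is an assumption for illustration only: std::chrono requires C++11, and this document does not say which timers the suite itself uses. The helper name bestTimeSeconds is hypothetical.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Hypothetical helper: runs `kernel` the given number of times and
// returns the shortest observed wall-clock time in seconds.
template <class Kernel>
double bestTimeSeconds(Kernel kernel, int repetitions) {
    double best = std::numeric_limits<double>::max();
    for (int r = 0; r < repetitions; ++r) {
        const auto start = std::chrono::steady_clock::now();
        kernel();
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}
```

In an MPI test the two clock reads would be replaced by calls to MPI_Wtime, which returns wall-clock seconds as a double.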
Courtesy Lawrence Livermore National Lab