Performance Analysis Tools for Linux Developers: Part 1


The free utility is available on all Linux distributions and is generally installed by default. Similar information can be obtained from top or sar, but free is a convenient way to view a snapshot of system memory usage. It can help identify memory leaks (blocks of memory that are allocated but never freed) and disk thrashing caused by excessive swapping.

Figure 3: free View (Idle System)
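The figures that free reports come from the kernel's /proc/meminfo file. As a rough illustration of what such a snapshot contains, the following Python sketch parses text in that format and derives a few totals. The function names are invented for this example, and the "used" calculation is a simplification: the exact formula free applies differs between versions of the tool.

```python
def parse_meminfo(text):
    """Return a dict mapping each /proc/meminfo field name to its numeric value.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def summarize(fields):
    """Compute totals similar in spirit to the columns free prints.

    Assumption: "used" is approximated as total minus free minus
    buffers/cache, which is not exactly what every free version does.
    """
    total = fields.get("MemTotal", 0)
    free = fields.get("MemFree", 0)
    cache = fields.get("Buffers", 0) + fields.get("Cached", 0)
    swap_used = fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)
    return {
        "total_kb": total,
        "free_kb": free,
        "buff_cache_kb": cache,
        "used_kb": total - free - cache,
        "swap_used_kb": swap_used,
    }
```

On a Linux machine, passing the contents of /proc/meminfo to parse_meminfo and then to summarize gives one snapshot. Taking snapshots at intervals while an application runs, and watching whether used_kb or swap_used_kb keeps climbing, is one simple way to spot the leak and swapping symptoms described above.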


The oProfile utility is a system-wide profiler and performance monitoring tool that covers both user space and kernel space (the kernel itself can be included in the profiling). The profiler introduces minimal overhead and can therefore be seen as relatively unobtrusive. However, it does require that the code be compiled with the gcc debugging flag (-g). Although it has been active since 2002 and is stable on a majority of platforms, oProfile still describes itself as an alpha-quality open source tool. The tool is released under the GPL and is, in fact, included by default in kernels from 2.6 onward. It works by collecting data from various CPU counters via a kernel module, then exposing that information to user space through a pseudo file system, in the same way that ps collects data through the "/proc" file system.

Figure 4: opreport from oProfile


The GNU profiler, gprof, is an application-level profiler. The tool is open source, licensed under the GPL, and is available as standard on most Linux distributions. Compiling the code with gcc and the -pg flag instruments it, producing an executable that measures the execution time of functions to an accuracy of a hundredth of a second and exports this information to a file. The gprof application can then parse this file to give a flat-profile representation of the performance data and a call graph.

Figure 5: gprof View

As with many of the tools described in this paper, the profiler collects data at sampling intervals. The timing figures may therefore be statistically inaccurate when a function's run time is close to the sampling interval. Running your application for longer periods reduces these inaccuracies. As the output in Figure 5 shows, gprof can help locate hot spots at function granularity. It can also report this information at a finer, line-by-line granularity using the -l flag.
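To make the effect of run time on accuracy concrete, the short Python sketch below applies the rule of thumb given in the gprof documentation: for a function observed over n samples, the expected error is about the square root of n samples. The function name and the default period are choices made for this illustration; the 0.01-second period matches the hundredth-of-a-second figure quoted earlier.

```python
import math

SAMPLING_PERIOD_S = 0.01  # assumed sampling period: one hundredth of a second


def expected_error(run_time_s, period_s=SAMPLING_PERIOD_S):
    """Return (absolute error in seconds, relative error) for a sampled run time.

    Uses the rule of thumb that n samples carry an expected error of
    roughly sqrt(n) samples.
    """
    if run_time_s <= 0 or period_s <= 0:
        raise ValueError("run time and period must be positive")
    n = run_time_s / period_s              # number of samples taken
    abs_err = math.sqrt(n) * period_s      # sqrt(n) samples, in seconds
    return abs_err, abs_err / run_time_s
```

Under this rule, a function that accumulates 1 second of run time has an expected error of about 0.1 seconds (10%), while one that accumulates 100 seconds has an expected error of about 1 second (1%). The relative error shrinks as the run gets longer, which is why long profiling runs are recommended.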

As an unexpected side-benefit, gprof can suggest function and file orderings within your binary to improve performance.
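The gprof steps described in this section can be summarized as a sequence of command lines. The Python sketch below only assembles and prints them; it does not run anything. The file names app.c and app are placeholders, and the helper names are invented for this example. gmon.out is the default name of the profile file that an executable built with -pg writes when it exits normally.

```python
import shlex


def gprof_commands(source="app.c", exe="app"):
    """Return the steps of one gprof session as argv lists.

    source and exe are placeholder names chosen for illustration.
    """
    profile = "gmon.out"
    return [
        ["gcc", "-pg", "-g", source, "-o", exe],         # instrumented build
        ["./" + exe],                                    # run; writes gmon.out
        ["gprof", exe, profile],                         # flat profile and call graph
        ["gprof", "-l", exe, profile],                   # line-by-line profile
        ["gprof", "--function-ordering", exe, profile],  # suggested function ordering
    ]


def as_shell(commands):
    """Render argv lists as shell command lines, quoting where needed."""
    return [shlex.join(argv) for argv in commands]


for line in as_shell(gprof_commands()):
    print(line)
```

The -g flag is included in the build step because line-by-line profiling relies on debugging information to map samples back to source lines.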


valgrind is an extensible instrumentation framework that is used primarily for detecting memory-related errors and threading problems. It is an open source tool licensed under version 2 of the GPL. The tool can detect errors such as memory leaks and incorrect freeing of memory, and it does so automatically and dynamically while the code is executing. In some cases it can produce false positives.

However, the developers of valgrind claim that it produces correct results 99% of the time, and any such errors can be suppressed. Although valgrind is a very useful tool, it can be extremely intrusive: the code runs much slower than its true execution speed (by a factor of 50 in some cases) and needs to be compiled with the gcc -g flag. Compiling the code without optimization, using the gcc -O0 flag, is also recommended. An example of running a small binary through valgrind can be seen below.

Figure 6: valgrind Example

Although it may be useful in some cases, valgrind can be so obtrusive for real-time applications that wait on I/O that the checking becomes unreliable. However, valgrind can be a highly useful tool when used in conjunction with a unit test and/or nightly build strategy. A clean run of valgrind in a nightly build allows the developer to keep track of any newly introduced latent memory errors.
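One way to wire valgrind into a unit test or nightly build is to let its exit status decide whether the step passes. The following Python sketch shows the idea. The helper names and the injectable runner parameter are inventions of this example; the valgrind options used are --leak-check=full, which reports leaks in detail, and --error-exitcode=1, which makes valgrind exit with status 1 when it has reported errors.

```python
import subprocess


def memcheck_command(binary, *args):
    """Build the argv list for running a test binary under valgrind."""
    return ["valgrind", "--leak-check=full", "--error-exitcode=1", binary, *args]


def run_memcheck(binary, *args, runner=subprocess.run):
    """Return True when the run exits with status 0.

    A zero status means that valgrind reported no errors and that the
    program itself also exited successfully. The runner parameter exists
    only so the function can be exercised without valgrind installed.
    """
    result = runner(memcheck_command(binary, *args))
    return result.returncode == 0
```

A nightly job could call run_memcheck on each unit test binary and fail the build when any call returns False, so that newly introduced memory errors surface the morning after they are committed.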

Like many of the tools presented here, valgrind is not limited to the purpose that most developers have in mind. For example, valgrind can also check for cache misses and branch mispredictions. We strongly encourage you to read the relevant documentation and experiment with this and all of the other tools to fully appreciate their power.

