Performance Analysis Tools for Linux Developers: Part 1


The free utility is available on all Linux distributions and is generally installed by default. Similar information can be obtained from top or sar, but free is a convenient way to view a snapshot of system memory usage. It can help identify memory leaks (memory blocks that are allocated but never freed) and disk thrashing caused by excessive swapping.

Figure 3: free View (Idle System)
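To make the snapshot idea concrete, here is a minimal Python sketch that reads the same kind of figures from /proc/meminfo, the kernel interface that free builds on. The function names parse_meminfo and summarize are illustrative, and "used" here is simply total minus free, which is cruder than the figure free itself reports.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are in kB; a few (such as the HugePages counters) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def summarize(info):
    """Derive a few headline numbers (assumption: a simplified view,
    not the exact arithmetic used by free)."""
    total = info.get("MemTotal", 0)
    free = info.get("MemFree", 0)
    swap_total = info.get("SwapTotal", 0)
    swap_free = info.get("SwapFree", 0)
    return {
        "mem_total_kb": total,
        "mem_free_kb": free,
        "mem_used_kb": total - free,
        "swap_used_kb": swap_total - swap_free,
    }


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            snapshot = summarize(parse_meminfo(f.read()))
    except OSError:
        print("/proc/meminfo is not available on this system")
    else:
        for key, value in snapshot.items():
            print(f"{key:>14}: {value}")
```

Running the script at intervals and watching mem_used_kb or swap_used_kb climb steadily is one rough way to spot the leak or swapping patterns described above.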


The oProfile utility is a system-wide profiler and performance monitoring tool for user space as well as kernel space (the kernel itself can be included in the profiling). The profiler introduces minimal overhead and can therefore be seen as relatively unobtrusive. However, it does require that the code be compiled with the gcc debugging (-g) flag. Although active since 2002 and stable on a majority of platforms, oProfile still describes itself as an alpha-quality open source tool. The tool is released under the GPL and is in fact included by default in kernels from 2.6 onward. It works by collecting data from various CPU counters via a kernel module, then exposing that information to user space through a pseudo file system, in the same way that ps collects data via the "/proc" file system.

Figure 4: opreport from oProfile


The GNU profiler, gprof, is an application-level profiler. The tool is open source, licensed under the GPL, and is available as standard on most Linux distributions. Compiling the code using gcc with the -pg flag instruments it, producing an executable that measures the execution time of functions to an accuracy of a hundredth of a second and writes this information to a file (gmon.out by default). This file can then be parsed by the gprof application, giving a flat-profile representation of the performance data and a call graph.

Figure 5: gprof View

The profiler collects data at sampling intervals, in the same way as many of the tools described in this paper. The timing figures may therefore contain statistical inaccuracies if the measured run time is close to the sampling interval. Running your application for longer periods reduces these inaccuracies. As can be seen from the output in Figure 5, gprof can help locate hot spots at function granularity. However, it can also break this information down at a finer, line-by-line granularity using the -l flag.
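The effect of the sampling interval can be illustrated with a toy model in Python. This is not how gprof is implemented; it only assumes a single function that runs once for a known time, sampled every hundredth of a second with a random starting phase, and counts how many sampling ticks land inside that run. The names estimate_runtime and worst_relative_error are illustrative.

```python
import random

SAMPLE_INTERVAL = 0.01  # seconds; the hundredth-of-a-second figure from the text


def estimate_runtime(true_runtime, interval=SAMPLE_INTERVAL, rng=random):
    """Estimate a run time by counting sampling ticks that fall inside it.

    The first tick arrives at a random phase in [0, interval); each tick
    that lands before the run ends adds one interval to the estimate.
    """
    phase = rng.random() * interval
    ticks = 0
    while phase + ticks * interval < true_runtime:
        ticks += 1
    return ticks * interval


def worst_relative_error(true_runtime, trials=1000, seed=0):
    """Largest relative error seen over many randomly phased trials."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        estimate = estimate_runtime(true_runtime, rng=rng)
        worst = max(worst, abs(estimate - true_runtime) / true_runtime)
    return worst


if __name__ == "__main__":
    # Run times close to the sampling interval versus much longer runs.
    for runtime in (0.015, 0.15, 1.5, 15.0):
        error = worst_relative_error(runtime)
        print(f"true run time {runtime:>6} s  worst relative error {error:.4f}")
```

In this model the absolute error never exceeds one sampling interval, so the relative error shrinks as the run time grows, which is the reasoning behind running the application for longer.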

As an unexpected side-benefit, gprof can suggest function and file orderings within your binary to improve performance.


valgrind is an instrumentation framework used primarily for detecting memory-related errors and threading problems, but it is also extensible. It is an open source tool licensed under GPL2. The tool can detect errors such as memory leaks and incorrect freeing of memory, and it does so automatically and dynamically while the code is executing. In some cases it can produce false positives.

However, the developers of valgrind claim that it produces correct results 99% of the time, and any erroneous reports can be suppressed. Although it is a very useful tool, it can be extremely intrusive: the code runs much slower than its true execution speed (by a factor of 50 in some cases) and needs to be compiled with the gcc -g flag. Compiling without optimization, using the gcc -O0 flag, is also recommended. An example of running a small binary through valgrind can be seen below.

Figure 6: valgrind Example

Although it may be useful in some cases, valgrind can be so obtrusive for real-time applications that wait on I/O as to make the checking unreliable. However, valgrind can be a highly useful tool when used in conjunction with a unit test and/or nightly build strategy. A clean run of valgrind in a nightly build allows the developer to keep track of any newly introduced latent memory errors.
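One possible shape for such a nightly check is sketched below in Python. It relies on valgrind's --leak-check=full and --error-exitcode options; the binary name ./my_test_binary, the exit code 99 and the helper names are assumptions made for illustration, not part of any particular build system.

```python
import subprocess


def valgrind_command(binary, args=(), error_exitcode=99):
    """Build the valgrind command line for one test binary.

    --error-exitcode makes valgrind exit with the given code when it
    reports errors, so a build script can tell a clean run from a dirty one.
    """
    return [
        "valgrind",
        "--leak-check=full",
        f"--error-exitcode={error_exitcode}",
        binary,
        *args,
    ]


def run_check(binary, args=(), error_exitcode=99):
    """Run the binary under valgrind; True means the run was clean.

    A return code of 0 means the program itself succeeded and valgrind
    reported nothing that triggered the error exit code.
    """
    result = subprocess.run(valgrind_command(binary, args, error_exitcode))
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical binary name; print the command a nightly job would run.
    print(" ".join(valgrind_command("./my_test_binary")))
```

A nightly job could call run_check for each unit test binary and fail the build on the first False, so that a newly introduced memory error shows up the morning after it is committed.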

Like many of the tools presented here, valgrind is not limited to the purpose that most developers have in mind. For example, valgrind can also check for cache misses and branch mispredictions. We strongly encourage you to read the relevant documentation and experiment with this and all the other tools to fully appreciate their power.
