Performance Analysis Tools for Linux Developers: Part 1


VTune

VTune [1] from Intel is a proprietary system-level profiler and performance analysis tool for Intel architecture. It introduces minimal overhead and is therefore relatively unobtrusive. VTune works through a kernel module that collects data from various CPU counters each time an interrupt is generated. The granularity of the data ranges from process level down to instruction level, and the data is accessible through a highly usable and configurable GUI.

VTune, when fully configured for your application and operating system, can identify performance issues at several levels of granularity, from system level to microarchitecture level. As a tool for developers, it is extremely valuable because it offers a global view across all of these granularities. OS performance counters can also be monitored and correlated with instruction-level hotspots. By using this correlation, we can answer questions such as "When the memory use in our system begins to ramp, what happens to our application's CPU usage?" If the source code of your test application is hooked into the VTune application, we can also drill down from the application level into threads and then into individual functions.

It is impossible to outline all the features of VTune, or indeed of many of the tools described in this paper; the interested reader is directed to the references.

Intel Thread Checker

The Intel Thread Checker is a plug-in for the VTune debugging environment. It can be used to locate hard-to-find threading errors such as race conditions and deadlocks.

sar

The system activity reporter (sar) is a lightweight open source tool, licensed under the GPL, that is used for collecting system-wide performance measures. The tool is generally installed by default on Linux; where it is not, it can be installed through the sysstat package. Like top and ps, sar collects data from operating system counters via the proc file system. It provides performance data at system-level granularity, reporting on a wide variety of metrics such as CPU usage, disk I/O, memory, network I/O, and IRQs. The tool can update these values at intervals as short as 1 second.

sar provides information only at system-level granularity and is used to give snapshots and overviews of overall system performance. Spurious or unexpected measurements from sar can be a first indication of performance issues in the system as a whole or in a single process or group of processes. It can be configured to run in the background, constantly providing a readily accessible database of system performance at any second during the day.

Figure 7: sar System-wide CPU Usage View

Figure 8: sar System-wide Memory Usage View
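To make the idea of reading operating system counters from the proc file system concrete, here is a minimal Python sketch that derives a system-wide CPU busy percentage from two samples of the aggregate `cpu` line in `/proc/stat`. It is not sar's implementation; the function names (`parse_cpu_line`, `cpu_utilization`, `sample_live`) are hypothetical, and the calculation is a simplified illustration of the kind of figure sar reports.

```python
import time


def parse_cpu_line(line):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    # guest and guest_nice are already counted in user and nice,
    # so only the first eight fields are summed.
    total = sum(values[:8])
    return total - idle, total


def cpu_utilization(before, after):
    """Percentage of time the CPUs were busy between two 'cpu' line samples."""
    busy0, total0 = parse_cpu_line(before)
    busy1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 100.0 * (busy1 - busy0) / elapsed


def sample_live(interval_s=1.0, path="/proc/stat"):
    """Take two samples from a live Linux system (hypothetical helper)."""
    with open(path) as f:
        first = f.readline()
    time.sleep(interval_s)
    with open(path) as f:
        second = f.readline()
    return cpu_utilization(first, second)
```

On a Linux machine, calling `sample_live()` returns the busy percentage over a one-second window, which corresponds to the minimum sampling interval mentioned above.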

LTT

The Linux Trace Toolkit (LTT) consists of a kernel patch and a tool chain that give the user the ability to trace events on the system. These can be kernel events (such as context switches and system calls) or any application-level event. LTT is GPL licensed and has minimal impact on the run-time performance of traced applications. It can be used to isolate performance problems on parallel and real-time systems and to analyze application timing. Any code that the user would like analyzed must be recompiled so that it is instrumented by LTT.

LTTng (LTT Next Generation) is also available as an alternative; it adds features such as a GUI trace viewer. See Figure 9.

Figure 9: Sample LTTng Viewer

iostat

The iostat command is used for monitoring the input/output load on the system's block devices. With multiple block devices in the system, it can be useful to determine which device or devices are currently the bottleneck. iostat provides a per-device view of the number of transfers per second as well as the read and write rates. See Figure 10 for an example of iostat's extended, device-only output during a large file copy. Note the temporary increase in device activity while the file was being copied.

Figure 10: Sample iostat View (File Copy Example)
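The per-device figures described above (transfers per second, read rate, write rate) can be approximated from two snapshots of `/proc/diskstats`. The following Python sketch shows the arithmetic under simplifying assumptions; it is not how iostat itself is implemented, and the names `parse_diskstats` and `device_rates` are hypothetical.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Map device name -> (reads, sectors_read, writes, sectors_written)."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 10:
            continue  # skip blank or truncated lines
        # Columns: major minor name reads merged sectors_read ms_reading
        #          writes merged sectors_written ...
        stats[f[2]] = (int(f[3]), int(f[5]), int(f[7]), int(f[9]))
    return stats


def device_rates(before, after, interval_s):
    """Per-device rates between two /proc/diskstats snapshots."""
    old = parse_diskstats(before)
    rates = {}
    for name, (r1, sr1, w1, sw1) in parse_diskstats(after).items():
        if name not in old:
            continue  # device appeared between snapshots
        r0, sr0, w0, sw0 = old[name]
        rates[name] = {
            "tps": ((r1 - r0) + (w1 - w0)) / interval_s,
            "read_kB_s": (sr1 - sr0) * SECTOR_BYTES / 1024 / interval_s,
            "write_kB_s": (sw1 - sw0) * SECTOR_BYTES / 1024 / interval_s,
        }
    return rates
```

Feeding the function the contents of `/proc/diskstats` read a few seconds apart, then comparing the rates across devices, is one way to spot which device is carrying most of the load.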

iotop

iotop is a Python program with a top-like user interface that can be used to associate processes with I/O. It requires Python version 2.5 or later and a Linux kernel version 2.6.20 or later with the TASK_DELAY_ACCT and TASK_IO_ACCOUNTING options enabled. If these options are not enabled by default, the kernel may need to be recompiled. iotop is licensed under the GPL. It reports the amount of disk I/O occurring within the system on a per-process basis, which lets users determine which applications are using the disks the most.

Figure 11: Sample iotop View
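As a rough illustration of per-process I/O accounting, the Python sketch below ranks processes by the disk bytes they read and wrote between two snapshots of `/proc/<pid>/io`. This is a swapped-in technique for illustration: iotop itself obtains its data through the kernel's taskstats interface rather than by reading these files. The `/proc/<pid>/io` files are only present when I/O accounting is enabled in the kernel, and the function names here are hypothetical.

```python
import os


def parse_proc_io(text):
    """Parse the 'key: value' lines of a /proc/<pid>/io file into a dict."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def disk_bytes(counters):
    """Bytes that actually reached the storage layer (reads plus writes)."""
    return counters.get("read_bytes", 0) + counters.get("write_bytes", 0)


def rank_by_io(before, after):
    """before/after map pid -> raw io text; return [(pid, delta)], largest first."""
    deltas = []
    for pid, text in after.items():
        if pid not in before:
            continue  # process started between snapshots
        delta = disk_bytes(parse_proc_io(text)) - disk_bytes(parse_proc_io(before[pid]))
        deltas.append((pid, delta))
    return sorted(deltas, key=lambda item: item[1], reverse=True)


def snapshot(proc="/proc"):
    """Read /proc/<pid>/io for every visible process (hypothetical helper)."""
    result = {}
    for entry in os.listdir(proc):
        if entry.isdigit():
            try:
                with open(os.path.join(proc, entry, "io")) as f:
                    result[int(entry)] = f.read()
            except OSError:
                continue  # process exited or permission was denied
    return result
```

Taking `snapshot()` twice with a short pause in between and passing both results to `rank_by_io` gives a crude, non-interactive version of the per-process view shown in Figure 11. Reading other users' processes typically requires root privileges.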

