CUDA, Supercomputing for the Masses: Part 19

Debug Focus, Conditional Breakpoints and Memory Checking

Setting a breakpoint on the assignment of v_d in the method assign() in Figure 5 lets us see how Parallel Nsight Debug Focus works. Pressing F11 moves the yellow arrow that indicates program execution to the closing curly bracket, and the display of v_d in the Memory 1 window is updated. Note that the values at indices 1-31 are highlighted in red, indicating that this warp changed those values in v_d. The value at index zero is not highlighted because it did not change: it already contained zero.

Figure 5

Notice that the values after 31 are still zero. This happens because Parallel Nsight has not executed the warp that assigns locations 32-63.

Two important characteristics to note about breakpoints:

  • Program execution stops when any warp hits the breakpoint. As this example demonstrates, a partial result might be displayed because the other threads might not yet have completed their work.
  • Any warp can trigger the breakpoint. Do not assume that the Block 0(0,0,0) Warp 0(0,0,0) will always trigger the breakpoint.

Clicking on Nsight | Device Summary allows us to select any block and warp on the GPU. (To quickly make a selection in large grids, click on Nsight | Debug Focus … and type in the block and thread.) As in Figure 5, Warp 0(0,0,0) is highlighted.

Figure 6

Clicking on the following line, Warp 1 (32,0,0), and then double-clicking on AssignScaleVectorWithError.cu in the Solution Explorer window shows that the yellow arrow is still in the breakpoint circle. This indicates that warp 1 in this block has not yet run. Pressing F11 advances the arrow and shows that the values at indices 32-63 are updated. Selecting any warp in any block other than block 0 shows that tid never exceeds 255, which confirms the incorrect assignment of tid.
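The symptom described above (tid never exceeding 255 in any block) is what a missing block offset in the index calculation would produce. The following is a minimal sketch of that situation, not the code from AssignScaleVectorWithError.cu: the kernel bodies, the names assignBuggy, assignFixed and countAssigned, the block size of 256 and the grid size of 16 are all assumptions made for illustration.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumed form of the bug: the block offset is missing, so every block
// computes the same tid values (0..blockDim.x-1).
__global__ void assignBuggy(float *v_d, int n)
{
  int tid = threadIdx.x;
  if (tid < n)
    v_d[tid] = tid;
}

// Conventional global index: block offset plus thread index.
__global__ void assignFixed(float *v_d, int n)
{
  int tid = blockIdx.x * blockDim.x + threadIdx.x;
  if (tid < n)
    v_d[tid] = tid;
}

// Hypothetical host helper: counts the elements that hold their own index.
static int countAssigned(const std::vector<float> &v)
{
  int count = 0;
  for (size_t i = 0; i < v.size(); ++i)
    if (v[i] == static_cast<float>(i))
      ++count;
  return count;
}

int main()
{
  const int threads = 256;         // assumed block size
  const int blocks = 16;           // hypothetical grid size
  const int n = threads * blocks;  // 4096 elements
  const size_t bytes = n * sizeof(float);

  std::vector<float> init(n, -1.0f);  // -1 marks "never assigned"
  std::vector<float> v_h(n);
  float *v_d = nullptr;
  if (cudaMalloc(&v_d, bytes) != cudaSuccess)
    return 1;

  // Run the buggy kernel on a freshly marked vector.
  cudaMemcpy(v_d, init.data(), bytes, cudaMemcpyHostToDevice);
  assignBuggy<<<blocks, threads>>>(v_d, n);
  cudaMemcpy(v_h.data(), v_d, bytes, cudaMemcpyDeviceToHost);
  printf("buggy: %d of %d elements assigned\n", countAssigned(v_h), n);

  // Run the corrected kernel on a freshly marked vector.
  cudaMemcpy(v_d, init.data(), bytes, cudaMemcpyHostToDevice);
  assignFixed<<<blocks, threads>>>(v_d, n);
  cudaMemcpy(v_h.data(), v_d, bytes, cudaMemcpyDeviceToHost);
  printf("fixed: %d of %d elements assigned\n", countAssigned(v_h), n);

  cudaFree(v_d);
  return 0;
}
```

Under these assumptions, every block of the buggy kernel rewrites the first 256 elements and leaves the rest of the vector untouched, which matches what the Memory window shows when stepping warp by warp.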

It is also possible to set conditional breakpoints on the predefined CUDA variables like blockIdx and threadIdx. Parallel Nsight 1.0 does not support conditional breakpoints on data, but it does support setting breakpoints on a memory address.

To set a conditional breakpoint:

  • Click in the gray area next to the source line to set a breakpoint.
  • Right-click and select Condition … from the menu that appears.
  • Type in the condition. A plus ('+') will be added to the breakpoint circle.

In Figure 7, the condition @blockIdx(10,0,0) is required to be true. The Parallel Nsight CUDA debugger was started and F11 was pressed to step the warp forward. As can be seen, tid is incorrectly set to zero. Pressing F5 to continue shows that scale() correctly calculates tid as 2560.

Figure 7
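The two tid values seen in block 10 can be reproduced in code. The sketch below is an illustration only: it assumes a block size of 256 (inferred from 2560 = 10 × 256), assumes the bug is a missing block offset, and uses the hypothetical names globalTid, inFocusBlock and reportTid. The inFocusBlock() helper is simply an in-code analogue of restricting attention to block (10,0,0); it is not part of Parallel Nsight.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: conventional global index for a 1D launch.
__host__ __device__ inline int globalTid(int block, int blockSize, int thread)
{
  return block * blockSize + thread;
}

// Hypothetical helper: true only for threads in the given block.
__device__ bool inFocusBlock(unsigned bx, unsigned by, unsigned bz)
{
  return blockIdx.x == bx && blockIdx.y == by && blockIdx.z == bz;
}

// Records both index calculations for the first thread of block (10,0,0).
__global__ void reportTid(int *out)
{
  int buggyTid = threadIdx.x;  // assumed form of the bug
  int correctTid = globalTid(blockIdx.x, blockDim.x, threadIdx.x);
  if (inFocusBlock(10, 0, 0) && threadIdx.x == 0) {
    out[0] = buggyTid;
    out[1] = correctTid;
  }
}

int main()
{
  int out_h[2] = {-1, -1};
  int *out_d = nullptr;
  if (cudaMalloc(&out_d, sizeof(out_h)) != cudaSuccess)
    return 1;
  cudaMemcpy(out_d, out_h, sizeof(out_h), cudaMemcpyHostToDevice);

  reportTid<<<16, 256>>>(out_d);  // 16 blocks of 256 threads (assumed)

  cudaMemcpy(out_h, out_d, sizeof(out_h), cudaMemcpyDeviceToHost);
  cudaFree(out_d);
  printf("block 10, thread 0: buggy tid = %d, correct tid = %d\n",
         out_h[0], out_h[1]);
  return 0;
}
```

Unlike a conditional breakpoint, this approach requires recompiling the kernel, so it is best seen as a way to check the arithmetic rather than as a replacement for the debugger.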

Just like the Linux cuda-gdb debugger, Parallel Nsight supports out-of-bounds memory checking. Multiplying tid by 1000, as shown in the code snippet below, purposely causes an out-of-bounds condition that Parallel Nsight catches when memory checking is enabled.

// Scale the vector
__global__ void scale(float *v_d, int n, float scaleFactor)
{
  int tid = blockIdx.x * blockDim.x + threadIdx.x;
  if(tid < n)
    v_d[1000*tid] *= scaleFactor; // deliberately indexes past the end of v_d
}

Figure 8 illustrates how Parallel Nsight flags the out-of-bounds memory errors.

Figure 8

The pop-up window in the lower right of the screen (Figure 9) shows that the memory checker found access violations.

Figure 9
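For comparison, here is a minimal sketch of how the deliberately broken kernel could be made safe. The key point is that the bounds test must be applied to the index that is actually dereferenced (1000*tid), not to tid. The names inBounds, scaleGuarded and ok, the vector length and the launch configuration are hypothetical, and cudaDeviceSynchronize() assumes a CUDA toolkit recent enough to provide it.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: true when idx is a valid index into an n-element array.
__host__ __device__ inline bool inBounds(long long idx, int n)
{
  return idx >= 0 && idx < n;
}

// Guarded variant of the scale kernel: checks the dereferenced index.
__global__ void scaleGuarded(float *v_d, int n, float scaleFactor)
{
  int tid = blockIdx.x * blockDim.x + threadIdx.x;
  long long idx = 1000LL * tid;  // 64-bit to avoid overflow in the product
  if (inBounds(idx, n))
    v_d[idx] *= scaleFactor;
}

// Hypothetical helper: reports a CUDA runtime error and returns false on failure.
static bool ok(cudaError_t err, const char *what)
{
  if (err == cudaSuccess)
    return true;
  fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
  return false;
}

int main()
{
  const int n = 4096;  // hypothetical vector length
  const size_t bytes = n * sizeof(float);
  std::vector<float> v_h(n, 1.0f);
  float *v_d = nullptr;

  if (!ok(cudaMalloc(&v_d, bytes), "cudaMalloc"))
    return 1;
  bool good = ok(cudaMemcpy(v_d, v_h.data(), bytes, cudaMemcpyHostToDevice),
                 "copy to device");

  if (good) {
    scaleGuarded<<<16, 256>>>(v_d, n, 2.0f);
    // Launch errors and execution errors are reported separately.
    good = ok(cudaGetLastError(), "kernel launch") &&
           ok(cudaDeviceSynchronize(), "kernel execution");
  }
  good = good && ok(cudaMemcpy(v_h.data(), v_d, bytes, cudaMemcpyDeviceToHost),
                    "copy to host");
  cudaFree(v_d);
  if (!good)
    return 1;

  int scaled = 0;
  for (float x : v_h)
    if (x == 2.0f)
      ++scaled;
  printf("%d elements were scaled\n", scaled);
  return 0;
}
```

Runtime return codes are not a substitute for the memory checker: an unguarded out-of-bounds access is not guaranteed to produce an error code and may silently corrupt other data, which is the kind of failure the memory checker is designed to expose.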


With version 1.0 of Parallel Nsight, NVIDIA has demonstrated a significant commitment to Visual Studio developers around the world. The design of Parallel Nsight clearly lays the groundwork for a powerful and flexible product that can support developers for many years to come. As with other NVIDIA software, it is expected that Parallel Nsight will evolve quickly in the near future to ease project creation and remote machine package management, and to provide a more unified debugging experience. Look to the release notes to stay abreast of the latest developments. Still, the current 1.0 release is quite useful. In particular, the large amount of information that is displayed interactively can speed debugging efforts.

The next article in this series will focus on the trace capabilities provided by Parallel Nsight, which allow developers to gain "Nsight" into what is happening on the host and GPU processing systems and to pinpoint exactly where time is spent within the various APIs. Support for tracing OpenCL applications within Parallel Nsight will also be covered.


Rob Farber is a senior scientist at Pacific Northwest National Laboratory. He has worked in massively parallel computing at several national laboratories and as co-founder of several startups. He can be reached at [email protected]
