Dr. Dobb's is part of the Informa Tech Division of Informa PLC

CUDA, Supercomputing for the Masses: Part 19

Configuring Visual Studio and Parallel Nsight to Debug an Application

Most readers will want to create a new CUDA application to debug and trace their code with Parallel Nsight. For this reason, we now provide a simple example demonstrating how to create an entirely new project in Visual Studio that can build and debug a CUDA program. The source code from Part 14 that was used to demonstrate cuda-gdb will be built and debugged with Parallel Nsight.

At the moment there is no project wizard, so all configuration of the build, environment variables and remote application packaging must be done by hand. For this reason, accessing the properties menu in the Windows Solution window is important because that is where much of this work is performed. Once a project is configured, the Nsight tab provides access to the Parallel Nsight debugging and analysis capabilities.

There are two important things to note when configuring a Parallel Nsight solution for debugging and analysis:

  • Plan on taking some time to correctly configure a Parallel Nsight/Visual Studio CUDA project. The Parallel Nsight team is working on a true CUDA project system for Visual Studio, which will make CUDA C/C++ a first-class citizen rather than a VC++ project compiling CUDA code. This will considerably ease the configuration and build process and make project setup much easier as it will require many fewer steps plus be significantly more robust against both human error and version changes. However, release 1.0 does not have this capability.
  • Succinctly, package management for the target machine is done by hand. Since Parallel Nsight uses a remote execution model, building the executable is only part of the process required to use Parallel Nsight. The user is also responsible for aggregating everything needed so the application can be shipped to the remote machine and run correctly. DLLs in particular are a challenge as even simple applications can require access to a large number of disparate libraries.
    • Identifying the required DLLs can require a significant effort on the part of the developer for even very simple applications. Third-party DLLs present additional challenges as the user may choose to have more than one copy on their system, with each application using a different version. Similarly, 32- and 64-bit issues are a problem. Even the simple examples provided in this article series require gathering multiple libraries such as glut, glew, cudart and cutil from various system and Internet locations. Considerable time can be wasted by mistakenly copying a DLL with the correct name but incorrect version or type. As a result, the solution will build but fail to run on the target machine.
      • Specification of the appropriate paths to the header files is also complicated by the library/DLL challenges and is exacerbated by the need to separately define the paths for both the Visual Studio and nvcc compilers.

    • Once the DLLs are found, Parallel Nsight is configured so the remote application will find them so long as they are in the same directory as the executable (or perhaps in the working directory), or in the system PATH. There are two ways to gather the DLLs once they have been identified.
      • A kludgy way to handle this issue is to hand copy the DLLs to the project Debug directory once the project is created.
      • A more elegant approach is to specify one or more commands in the Visual Studio project properties field Build Events | Post-Build Event | Command line to gather any needed DLLs. Note: the commands specified in this field are executed after each successful solution build, which can adversely impact build time.
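
Because a DLL with the right name but the wrong bitness lets the solution build and then fail on the target machine, it can help to check each gathered DLL before shipping it. The following is a minimal sketch, not part of Parallel Nsight or the original article: the helper name peMachineType is hypothetical, and it relies only on the documented layout of the Windows PE header (the "MZ" marker, the signature offset stored at 0x3C, and the Machine field that follows the "PE" signature).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reads a little-endian unsigned value of 'size' bytes at 'offset'.
// Returns false if the read would run past the end of the buffer.
static bool readLE(const std::vector<std::uint8_t>& buf, std::size_t offset,
                   std::size_t size, std::uint32_t& out)
{
    if (offset > buf.size() || size > buf.size() - offset) return false;
    out = 0;
    for (std::size_t i = 0; i < size; ++i)
        out |= static_cast<std::uint32_t>(buf[offset + i]) << (8 * i);
    return true;
}

// Hypothetical helper: classifies the raw bytes of a DLL or EXE image.
std::string peMachineType(const std::vector<std::uint8_t>& image)
{
    std::uint32_t magic = 0, peOffset = 0, sig = 0, machine = 0;
    // The DOS header starts with "MZ" (0x5A4D when read little-endian).
    if (!readLE(image, 0, 2, magic) || magic != 0x5A4D) return "not a PE file";
    // Offset 0x3C holds the file offset of the "PE\0\0" signature.
    if (!readLE(image, 0x3C, 4, peOffset)) return "not a PE file";
    if (!readLE(image, peOffset, 4, sig) || sig != 0x00004550) return "not a PE file";
    // The 2-byte Machine field immediately follows the signature.
    if (!readLE(image, static_cast<std::size_t>(peOffset) + 4, 2, machine))
        return "not a PE file";
    if (machine == 0x014C) return "32-bit (x86)";
    if (machine == 0x8664) return "64-bit (x64)";
    return "other";
}
```

To use it, read a candidate DLL into a std::vector of bytes (for example with std::ifstream in binary mode) and compare the result against the platform of the build that will load it.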

Following are the steps required to create a new Parallel Nsight project using the simple debugging example used in Part 14, which discussed debugging with cuda-gdb.

  1. Start Visual Studio
  2. Click on File | New | Win32 Console Application
    1. Specify the name as DDJ014_debug and click OK
    2. Click Finish
  3. Right click on project in the Solution Explorer window and select properties
  4. Add the following to Inherited Project Property Sheets
    1. Click the arrow next to Linker and select General
    2. Add the following to Additional Library Directories:
       $(NSIGHT_CUDA_TOOLKIT)/lib/$(PlatformName)
    3. Click on Input
    4. Add the following to Additional Dependencies: cudart.lib
    5. Click OK
    6. Click on Build Events
    7. Click on Post-Build Event
    8. Copy/paste the following into Command Line:
      copy "$(NSIGHT_CUDA_TOOLKIT)\bin\cudart*_*.dll" "$(TargetDir)"
  5. Copy the following source file (source page in DDJ014) and save to AssignScaleVectorWithError.cu
  6. Right click on Source Files
    1. Add | Existing Item ...
    2. Select AssignScaleVectorWithError.cu in the DDJ014_debug directory
    3. Note: a Matching Custom Build Rules dialog will appear. Hovering the mouse over the names in the Rule File column helps identify the rule files.
    4. Click to select the one that has NsightCudaRuntimeApi.v31.rules (or the latest version) in the name
    5. Click OK
  7. Click on AssignScaleVectorWithError.cu and edit the source to change int main() to void myMain().
  8. Click on DDJ014_debug.cpp and edit the source as follows:
    1. Add the forward declaration extern void myMain(); before int _tmain
    2. Add a call to myMain(); before the line return 0;
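    3. The source should look like this:
// DDJ014_debug.cpp : Defines the entry point for the console application.

#include "stdafx.h"

// Defined in AssignScaleVectorWithError.cu (formerly int main()).
extern void myMain();

int _tmain(int argc, _TCHAR* argv[])
{
	myMain();	// run the CUDA example
	return 0;
}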

  • Right click on the project in the Solution Explorer window and select Nsight User Properties
  1. Change localhost in Connection Name to the IP address or the name of the target machine that is running the Parallel Nsight monitor.

Note: It is important to verify that the code generation flags are identical for both the C/C++ compiler and nvcc; otherwise, link errors can occur.

  • Verify that Properties | C/C++ | Code Generation | Runtime Library is set to /MTd.
  • Right-click on a .cu file and select Properties | Parallel Nsight | Host | Runtime Library and verify the flag is set to /MTd.
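
One way to double-check that both compilers really received the same runtime library setting is to have each translation unit report it. The sketch below is an illustration rather than anything Parallel Nsight provides: the function name runtimeLibraryFlag is hypothetical, and it relies on the Microsoft compiler predefining _DEBUG for the debug runtimes and _DLL for the DLL runtimes. It cannot detect a project that defines these macros by hand.

```cpp
#include <string>

// Reports which C runtime option this translation unit was compiled with,
// inferred from macros predefined by the Microsoft compiler.
std::string runtimeLibraryFlag()
{
#if !defined(_MSC_VER)
    return "unknown (not a Microsoft compiler)";
#elif defined(_DLL) && defined(_DEBUG)
    return "/MDd";
#elif defined(_DLL)
    return "/MD";
#elif defined(_DEBUG)
    return "/MTd";
#else
    return "/MT";
#endif
}
```

Placing a copy of this function (under different names) in the .cpp file and in the host code of the .cu file, and printing both results, should show the same flag when the two property pages agree.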

Build the project: CTRL + Shift + B

If you wish, you can create a 64-bit version. The copy command in the post-build event copies both the 32- and 64-bit versions of the cudart DLL for the target machine to use.

  • Click the dropdown next to Win32

  1. Click Configuration Manager
  2. Click the drop-down next to Win32
  3. Click New
  4. Select x64 in New Platform
  5. Click OK
  6. Click Close
  • Right click on the project in the Solution Explorer window and select Properties
  1. Click on Host
  2. Select Target Machine Platform and from the dropdown menu select x64
  3. Click OK
  • On the top toolbar menu
  1. Build | Build Clean
  2. Build | Build
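
After switching platforms, a quick sanity check is to confirm which platform the host code was actually built for. This is a minimal sketch with a hypothetical helper name, pointerBits, based only on the fact that pointers are 4 bytes wide in a Win32 build and 8 bytes wide in an x64 build.

```cpp
#include <cstddef>

// Width of a pointer in bits for the platform this code was compiled for:
// 32 for a Win32 build, 64 for an x64 build.
std::size_t pointerBits()
{
    return sizeof(void*) * 8;
}
```

Printing this value from the host code makes it easy to spot a build that was rebuilt for the wrong platform.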

Click on the filename AssignScaleVectorWithError.cu and set a breakpoint next to the first line in scale(float *v_d, int n, float scaleFactor). The breakpoint can be set by clicking in the gray area next to the source line. A red breakpoint circle will appear in the gray area next to the source line. (Note: one way to remove a breakpoint is to click on the red circle again.)

Start Parallel Nsight by clicking Nsight | Start CUDA Debugging and a console window will start on the target machine. Meanwhile a yellow arrow will appear in the red breakpoint circle next to the source line as in Figure 3.

Figure 3

The variable v_d can be typed or dragged to the Address field in the Memory 1 window at the bottom of the screen. This specifies the memory address to examine. The user can specify that v_d is a floating-point array by right clicking in the Memory 1 window and selecting 32-bit Floating Point as the type. Scrolling down shows that v_d is filled with zeros after the value of 255. This indicates there is an error in assign(), which should fill v_d with consecutive integers from 0 to n-1. As can be seen in the Locals window, n is 8192. Per the discussion in Part 14, the calculation of tid in assign() is incorrect and must be changed to the form shown in scale() for the program to run correctly.

Figure 4
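
The symptom seen in the Memory window (correct values up to 255, zeros afterwards) is what one would expect if the thread index ignores the block index. The following CPU-only sketch emulates that situation in plain C++. It is not the Part 14 source: the two index formulas, the block size of 256, the one-thread-per-element launch and the zero-initialized array are all assumptions chosen to match the observed output, and the names LaunchIndex, tidBuggy, tidFixed and simulateAssign are hypothetical.

```cpp
#include <vector>

// CPU stand-ins for the CUDA built-ins threadIdx.x, blockIdx.x and blockDim.x.
struct LaunchIndex { int threadIdx; int blockIdx; int blockDim; };

// Assumed buggy form: ignores the block index, so every block writes
// the same first blockDim elements.
int tidBuggy(const LaunchIndex& t) { return t.threadIdx; }

// Assumed corrected form: an index that is unique across the whole grid.
int tidFixed(const LaunchIndex& t) { return t.blockIdx * t.blockDim + t.threadIdx; }

// Emulates launching assign() with one thread per element.
std::vector<float> simulateAssign(int n, int blockDim, int (*tid)(const LaunchIndex&))
{
    std::vector<float> v(n, 0.0f);                 // stands in for v_d, assumed zeroed
    int gridDim = (n + blockDim - 1) / blockDim;   // enough blocks to cover n
    for (int b = 0; b < gridDim; ++b) {
        for (int t = 0; t < blockDim; ++t) {
            LaunchIndex idx = { t, b, blockDim };
            int i = tid(idx);
            if (i < n) v[i] = static_cast<float>(i);
        }
    }
    return v;
}
```

Calling simulateAssign(8192, 256, tidBuggy) leaves every element past index 255 at zero, while passing tidFixed fills all 8192 elements with their own index.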

Pressing F5 (or clicking Debug | Continue on the top toolbar) highlights two items to note in the v1.0 release:

  • Parallel Nsight version 1.0 does not support console redirection of text from the target machine. Thus the "TEST FAILED" message printed by the application is not accessible from Visual Studio or visible to the user on the target console because the window closes so quickly.
  • It is not possible in this example to set a functional breakpoint on the printf() statements in the host side code because AssignScaleVectorWithError.cu was compiled with nvcc.
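
Since the console output is lost when the window closes, one possible workaround (not something the article or Parallel Nsight prescribes) is to also append the test result to a file on the target machine. The sketch below assumes a hypothetical helper, reportResult, and a log path chosen by the developer.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: prints a message and also appends it to a log file,
// so the result can be read after the console window has closed.
// Returns true if the message was written to the file.
bool reportResult(const std::string& message, const std::string& logPath)
{
    std::printf("%s\n", message.c_str());
    std::FILE* log = std::fopen(logPath.c_str(), "a");
    if (log == NULL) return false;
    std::fprintf(log, "%s\n", message.c_str());
    std::fclose(log);
    return true;
}
```

Replacing the final printf() in the example with a call such as reportResult("TEST FAILED", "result.log") would leave the message in a file that can be inspected on the target machine afterwards.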
