Dr. Dobb's is part of the Informa Tech Division of Informa PLC




CUDA, Supercomputing for the Masses: Part 15

In CUDA, Supercomputing for the Masses: Part 14 of this article series, I focused on debugging techniques and the use of CUDA-GDB to effectively diagnose and debug CUDA code -- with an emphasis on how to speed the process when looking through large amounts of data and how to use the thread syntax and semantic changes introduced in CUDA-GDB. In this article, I discuss mixing CUDA and OpenGL by utilizing a PBO (Pixel Buffer Object) to create images with CUDA on a pixel-by-pixel basis and display them using OpenGL. A subsequent article in this series will discuss the use of CUDA to generate 3D meshes and utilize OpenGL VBOs (Vertex Buffer Objects) to efficiently render meshes as a colored surface, wireframe image or set of 3D points. All demonstration code compiles and runs under both Windows and Linux.

The articles in this discussion on mixing CUDA with OpenGL cannot do more than provide a cursory introduction to OpenGL. Interested readers should look to the plethora of excellent books and tutorials that are readily available in bookstores and on the Internet. Here are a few that I have found to be useful:

To focus on CUDA rather than OpenGL, I use an OpenGL framework that can mix CUDA with both pixel and vertex buffer objects. I anticipate that many others will use and adapt this framework as they investigate aspects of mixing CUDA and OpenGL not covered in my articles.

In many cases, only the CUDA kernels that generate the data will need to be modified to create and view your own content -- as will be shown in a second example at the end of this article that generates and allows interactive movement over an artificial landscape. Finally, this same framework will be used in the next article with minor modifications to discuss and demonstrate vertex buffer objects.

In a nutshell, creating a working OpenGL application requires the following steps, each implemented by one file in the framework, as illustrated in the schematic below:

  1. simpleGLmain.cpp: Create an OpenGL window and perform basic OpenGL/GLUT setup.
  2. simplePBO.cpp: Perform CUDA-centric setup; in this case for a Pixel Buffer Object (PBO).
  3. callbacksPBO.cpp: Define keyboard, mouse, and other callbacks.
  4. kernelPBO.cu: The CUDA kernel that calculates the data to be displayed.
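
To make step 4 concrete, here is a minimal sketch of a per-pixel generator. It is not the kernelPBO.cu from the framework: the names Pixel, shadePixel, fillKernel and fillOnHost, and the colour rule itself, are assumptions made for illustration. The per-pixel function compiles with a plain C++ compiler, while the kernel (one thread per pixel, row-major indexing) is only compiled when nvcc is used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Under nvcc the per-pixel function is callable from host and device code;
// with a plain C++ compiler the qualifiers simply vanish.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical 4-byte RGBA pixel (plays the role CUDA's uchar4 often plays).
struct Pixel {
    unsigned char r, g, b, a;
};

// Hypothetical colour rule: red ramps with x, green with y, blue with frame.
HOST_DEVICE inline Pixel shadePixel(int x, int y, int width, int height,
                                    unsigned frame) {
    Pixel p;
    p.r = static_cast<unsigned char>(width  > 1 ? (255 * x) / (width  - 1) : 0);
    p.g = static_cast<unsigned char>(height > 1 ? (255 * y) / (height - 1) : 0);
    p.b = static_cast<unsigned char>(frame % 256u);
    p.a = 255;  // fully opaque
    return p;
}

#ifdef __CUDACC__
// One thread per pixel; 'pixels' would be the device pointer to the mapped PBO.
__global__ void fillKernel(Pixel* pixels, int width, int height,
                           unsigned frame) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {  // guard partial blocks at the image edges
        pixels[y * width + x] = shadePixel(x, y, width, height, frame);
    }
}
#endif

// CPU reference: the same row-major indexing, with loops instead of threads.
inline std::vector<Pixel> fillOnHost(int width, int height, unsigned frame) {
    assert(width > 0 && height > 0);
    std::vector<Pixel> pixels(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            pixels[static_cast<std::size_t>(y) * width + x] =
                shadePixel(x, y, width, height, frame);
        }
    }
    return pixels;
}
```

Keeping the colour rule in a function shared by host and device makes it easy to check the kernel's output against a CPU reference before worrying about the OpenGL side.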

I anticipate that many readers will just copy and paste these four files and build the example. This is fine. Similarly, many readers will also copy and paste the two additional files, perlinCallbacksPBO.cpp and perlinKernelPBO.cu, used in the second example to see the artificial landscape in action.

For many, the source code of these two examples, along with its comments, will be sufficient to provide the basic visualization functionality needed for their work, or to establish a known-working code base that can be adapted to create more advanced CUDA applications.

Following is the source code combined with a discussion of the essential features needed to combine CUDA and OpenGL in the same application to create images.

Interested readers might also like to watch the video of Joe Stam's presentation given at the 2009 NVIDIA GTC (GPU Technology Conference) entitled "What Every CUDA Programmer Needs to Know about OpenGL". Joe's presentation discusses many of the OpenGL concepts covered in my articles and provides a live demonstration of the simplest PBO and VBO examples from this and the follow-on article.

Framework and Rationale for Combining CUDA and OpenGL

Just as in the NVIDIA SDK samples, GLUT (the OpenGL Utility Toolkit, which is window-system independent) is used for Windows and Linux compatibility. Figure 1 illustrates the relationship between the four files used in the framework.

Figure 1

As we will see, CUDA and OpenGL interoperability is very fast!

The reason (aside from the speed of CUDA) is that CUDA maps OpenGL buffer(s) into the CUDA memory space with a call to cudaGLMapBufferObject(). On a single GPU system, no data movement is required! Once provided with a pointer, CUDA programmers are then free to exploit their knowledge of CUDA to write fast and efficient kernels that operate on the mapped OpenGL buffers. However, the separation between OpenGL and CUDA is distinct so OpenGL should not operate on any buffer while it is mapped into the CUDA memory space.
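
The rule that OpenGL must not touch a buffer while it is mapped can be enforced structurally. The sketch below is one possible way to do that, not the framework's code: ScopedBufferMapping, renderFrame, MapFn and UnmapFn are hypothetical names, and the map and unmap steps are passed in as callables so the ordering logic can be exercised without a GPU. In a real build the callables might wrap cudaGLMapBufferObject() and its counterpart cudaGLUnmapBufferObject(); later CUDA releases deprecate these two calls in favor of the cudaGraphics* interoperability functions.

```cpp
#include <cassert>
#include <functional>
#include <utility>

using MapFn   = std::function<void*()>;  // maps the buffer, returns a device pointer
using UnmapFn = std::function<void()>;   // hands the buffer back to OpenGL

// Hypothetical RAII helper: maps on construction and unmaps on destruction,
// so the buffer cannot stay mapped past the end of the enclosing scope.
class ScopedBufferMapping {
public:
    ScopedBufferMapping(MapFn map, UnmapFn unmap)
        : unmap_(std::move(unmap)), devicePtr_(map()) {
        assert(devicePtr_ != nullptr);
    }
    ~ScopedBufferMapping() { unmap_(); }

    ScopedBufferMapping(const ScopedBufferMapping&) = delete;
    ScopedBufferMapping& operator=(const ScopedBufferMapping&) = delete;

    void* get() const { return devicePtr_; }

private:
    UnmapFn unmap_;
    void*   devicePtr_;
};

// One frame: map, let CUDA write the pixels, unmap, and only then let OpenGL draw.
// With the calls named in the text, the callables might look like (assumption):
//   map:   void* p = nullptr; cudaGLMapBufferObject(&p, pbo); return p;
//   unmap: cudaGLUnmapBufferObject(pbo);
template <typename Compute, typename Draw>
void renderFrame(MapFn map, UnmapFn unmap, Compute compute, Draw draw) {
    {
        ScopedBufferMapping mapping(std::move(map), std::move(unmap));
        compute(mapping.get());  // e.g. launch a kernel on the mapped pointer
    }                            // buffer is unmapped here
    draw();                      // OpenGL is now free to use the buffer
}
```

Because the unmap happens in a destructor, an early return or exception inside the compute step still releases the buffer before any drawing code runs.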

There are two very clear benefits of the separation (yet efficient interoperability) between CUDA and OpenGL:

  • From a programming view: When a buffer is not mapped into the CUDA memory space, OpenGL gurus are free to exploit existing legacy code bases, their expertise, and the full power of all the tools available to them, such as GLSL (the OpenGL Shading Language) and Cg.
  • From an investment view: Efficient exploitation of existing legacy OpenGL software investments is probably the most important benefit this mapped approach provides. Essentially, CUDA code can be gradually added into existing legacy libraries and applications just by mapping the buffer into the CUDA memory space. This allows organizations to test CUDA code without significant risk and then enjoy the benefits once they are confident in the performance and productivity rewards delivered by this programming model.
