
Parallel Gets Simpler, Faster, With Nvidia CUDA 5


Nvidia has shipped the production release of CUDA 5, the latest version of its parallel computing platform and programming model. CUDA is targeted at scientific and engineering applications driven by GPU acceleration technology, and the company's developer zone has now seen more than 1.5 million free downloads.

CUDA is already popular, so Nvidia almost certainly knew that it would need to bring something new to the table to keep developers interested in the technology. The release therefore adds support for dynamic parallelism, GPU-callable libraries, the firm's own GPUDirect technology support for RDMA (remote direct memory access), and the Nvidia Nsight Eclipse Edition integrated development environment (IDE).

The firm cites approval from an unnamed pool of developers who it says have witnessed "dramatic application acceleration and improved programmability" with the pre-release version of CUDA 5.

GPU-accelerated applications are already in use in the defense and aerospace industries, where they process images, video, and sensor data such as radar. One customer reports success with streaming sensor data directly into the GPU with low latency using the GPUDirect support for RDMA on new Kepler GPUs.

CUDA 5 has been designed to take advantage of the Nvidia Kepler compute architecture. A technical PDF on the company's website describes Kepler as a GPU composed of 7.1 billion transistors, a "computational workhorse" with teraflops of integer, single-precision, and double-precision performance, and the highest memory bandwidth.

"GPU threads can dynamically spawn new threads, allowing the GPU to adapt to the data. By minimizing the back and forth with the CPU, dynamic parallelism greatly simplifies parallel programming. [It also] enables GPU acceleration of a broader set of popular algorithms, such as those used in adaptive mesh refinement and computational fluid dynamics applications," said the company.
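To make the idea of GPU threads spawning new threads concrete, here is a minimal sketch of dynamic parallelism. The kernel names (`processSegments`, `scaleSegment`), the segment layout, and the data are hypothetical and chosen only for illustration. The sketch assumes a GPU of compute capability 3.5 or later and a build with relocatable device code enabled, for example `nvcc -rdc=true example.cu -lcudadevrt` plus an architecture flag suited to the target GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel (hypothetical): each thread scales one element of a segment.
__global__ void scaleSegment(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Parent kernel (hypothetical): one thread per segment. Each thread sizes
// and launches a child grid to match its segment, with no CPU round trip.
__global__ void processSegments(float *data, const int *offsets, int numSegments)
{
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s >= numSegments) return;

    int begin = offsets[s];
    int count = offsets[s + 1] - begin;
    if (count > 0) {
        int threads = 128;
        int blocks = (count + threads - 1) / threads;
        scaleSegment<<<blocks, threads>>>(data + begin, count, 2.0f);
    }
}

int main()
{
    const int numSegments = 3;
    const int n = 1000;
    // Segments of very different sizes: 5, 295, and 700 elements.
    const int hOffsets[numSegments + 1] = {0, 5, 300, 1000};

    float hData[n];
    for (int i = 0; i < n; ++i) hData[i] = 1.0f;

    float *dData = 0;
    int *dOffsets = 0;
    cudaMalloc(&dData, n * sizeof(float));
    cudaMalloc(&dOffsets, sizeof(hOffsets));
    cudaMemcpy(dData, hData, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dOffsets, hOffsets, sizeof(hOffsets), cudaMemcpyHostToDevice);

    processSegments<<<1, 32>>>(dData, dOffsets, numSegments);

    // The parent grid is not complete until its child grids have finished,
    // so one host-side synchronization covers both levels.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(hData, dData, n * sizeof(float), cudaMemcpyDeviceToHost);

    int wrong = 0;
    for (int i = 0; i < n; ++i)
        if (hData[i] != 2.0f) ++wrong;
    printf("elements with unexpected value: %d\n", wrong);

    cudaFree(dData);
    cudaFree(dOffsets);
    return wrong == 0 ? 0 : 1;
}
```

Without dynamic parallelism, the host would typically have to read the segment sizes back and issue one launch per segment itself, which is the "back and forth with the CPU" the quote refers to.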

Expanded Feature Set

Other new features include a CUDA BLAS library that lets developers use dynamic parallelism in their own GPU-callable libraries. Library authors can design plug-in APIs that allow other developers to extend the functionality of their kernels, and those developers can in turn implement callbacks on the GPU to customize the functionality of third-party GPU-callable libraries.
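One way a GPU-side callback could look is a device function pointer passed into a kernel. The sketch below is an assumption for illustration, not the mechanism Nvidia's libraries necessarily use: `ElementCallback`, `squareElement`, `dSquareElement`, and `transformKernel` are hypothetical names, with `transformKernel` standing in for a library kernel and `squareElement` for user-supplied code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Signature a hypothetical library expects from user callbacks.
typedef float (*ElementCallback)(float);

// User-supplied callback that runs on the GPU.
__device__ float squareElement(float v) { return v * v; }

// Host code cannot take the address of a __device__ function directly,
// so the address is stored in a device variable and copied out later.
__device__ ElementCallback dSquareElement = squareElement;

// Stand-in for a library kernel: applies the callback to every element.
__global__ void transformKernel(float *data, int n, ElementCallback cb)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = cb(data[i]);
}

int main()
{
    const int n = 8;
    float h[n];
    for (int i = 0; i < n; ++i) h[i] = (float)i;

    float *d = 0;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);

    // Fetch the device-side function address so it can be passed as an argument.
    ElementCallback cb = 0;
    cudaMemcpyFromSymbol(&cb, dSquareElement, sizeof(cb));

    transformKernel<<<1, n>>>(d, n, cb);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);

    int wrong = 0;
    for (int i = 0; i < n; ++i)
        if (h[i] != (float)(i * i)) ++wrong;
    printf("elements with unexpected value: %d\n", wrong);

    cudaFree(d);
    return wrong == 0 ? 0 : 1;
}
```

In this single-file sketch the callback and the kernel live together. In the scenario the article describes, they would sit in separately compiled units and be joined at link time.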

The "object linking" capability provides an efficient and familiar process for developing large GPU applications by enabling developers to compile multiple CUDA source files into separate object files, and link them into larger applications and libraries.

GPUDirect technology enables direct communication between GPUs and other PCI-E devices and supports direct memory access between network interface cards and the GPU. It also significantly reduces MPI send/receive latency between GPU nodes in a cluster and improves overall application performance.

Nvidia Nsight Eclipse Edition enables programmers to develop, debug, and profile GPU applications within the Eclipse-based IDE on Linux and Mac OS X platforms. An integrated CUDA editor and CUDA samples speed the generation of CUDA code, and automatic code refactoring enables easy porting of CPU loops to CUDA kernels.
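For readers unfamiliar with what porting a CPU loop to a CUDA kernel involves, the following hand-written sketch shows the general shape of the transformation. It is not output from Nsight's refactoring tool, and the function names (`saxpyCpu`, `saxpyGpu`) and problem size are hypothetical.

```cuda
#include <cstdio>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Original CPU loop: y = a * x + y, one element per iteration.
void saxpyCpu(int n, float a, const float *x, float *y)
{
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

// Kernel version: the loop index becomes the global thread index, and the
// loop bound becomes a guard for threads that fall past the end of the data.
__global__ void saxpyGpu(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main()
{
    const int n = 1 << 16;
    const float a = 2.0f;
    std::vector<float> x(n), yCpu(n, 1.0f), yGpu(n, 1.0f);
    for (int i = 0; i < n; ++i) x[i] = (float)i;

    saxpyCpu(n, a, x.data(), yCpu.data());

    float *dX = 0, *dY = 0;
    cudaMalloc(&dX, n * sizeof(float));
    cudaMalloc(&dY, n * sizeof(float));
    cudaMemcpy(dX, x.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dY, yGpu.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up to cover all n elements
    saxpyGpu<<<blocks, threads>>>(n, a, dX, dY);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(yGpu.data(), dY, n * sizeof(float), cudaMemcpyDeviceToHost);

    // Compare the two versions with a small relative tolerance.
    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (fabsf(yCpu[i] - yGpu[i]) > 1e-5f * fabsf(yCpu[i])) ++mismatches;
    printf("mismatches between CPU and GPU results: %d\n", mismatches);

    cudaFree(dX);
    cudaFree(dY);
    return mismatches == 0 ? 0 : 1;
}
```

The loop body carries over unchanged. What changes is how the index is produced and the surrounding memory transfers, which is why loops with independent iterations are the natural candidates for this kind of port.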

An integrated expert analysis system provides automated performance analysis and step-by-step guidance to fix performance bottlenecks in the code, while syntax highlighting makes it easy to differentiate GPU code from CPU code.

