RapidMind: C++ Meets Multicore


Heterogeneous Cores

Parallelization is only half the battle. The move to multicore is also an opportunity for chip vendors to embrace other architectural changes in an attempt to reach teraflop performance within this decade. We illustrate this by describing the architecture of the Cell BE processor. Each Cell BE contains a total of nine cores. The Power Processing Unit (PPU) is a fairly traditional CPU core, with a PowerPC instruction set architecture and a reasonably sized cache. The other eight cores are referred to as Synergistic Processing Units (SPUs), and use a nontraditional architecture to achieve high performance in a small chip area. Instead of caches, each SPU has 256 KB of on-chip, locally addressable memory (the local store), along with 128 registers, each 128 bits wide. The SPUs use a vector instruction set in which a single instruction can, for example, operate on a group of four floating-point numbers at once. Each SPU also features its own Memory Flow Controller (MFC), which can independently issue DMA transfers between main memory and the SPU local stores. The SPU's predictable, dual-issue pipeline makes it possible to reason precisely about how close a particular sequence of assembly instructions is to optimal. Together, these features let a single Cell BE processor using all eight SPUs perform large matrix multiplications at over 200 Gflops, compared to about 12 Gflops for a single traditional CPU core.
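To make the local-store model concrete, here is a minimal sketch in portable C++ of the pattern an SPU program has to follow: copy a chunk of data from main memory into a small local buffer, process it four floats at a time, and copy the result back. This is not Cell SDK code. The names Float4, madd, kChunkFloats, and saxpy_chunked are hypothetical, std::memcpy merely stands in for the DMA transfers an MFC would issue, and the 16 KB buffer size is an assumed budget chosen only to fit comfortably inside a 256 KB local store.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for one 128-bit vector register holding four floats.
struct Float4 {
    float lane[4];
};

// Models a single vector instruction: a * b + c on all four lanes at once.
inline Float4 madd(const Float4& a, const Float4& b, const Float4& c) {
    Float4 r;
    for (int i = 0; i < 4; ++i) {
        r.lane[i] = a.lane[i] * b.lane[i] + c.lane[i];
    }
    return r;
}

// Assumed buffer size: 4096 floats = 16 KB per buffer (illustrative only).
constexpr std::size_t kChunkFloats = 4096;
static_assert(kChunkFloats % 4 == 0, "chunk must hold whole 4-float groups");

// Computes y[i] = a * x[i] + y[i], staging the data through small local
// buffers instead of touching main memory directly.
void saxpy_chunked(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());

    float local_x[kChunkFloats];  // plays the role of the local store
    float local_y[kChunkFloats];
    const Float4 va = {{a, a, a, a}};

    for (std::size_t base = 0; base < x.size(); base += kChunkFloats) {
        const std::size_t n = std::min(kChunkFloats, x.size() - base);

        // "DMA in": main memory -> local buffers (memcpy is a stand-in).
        std::memcpy(local_x, x.data() + base, n * sizeof(float));
        std::memcpy(local_y, y.data() + base, n * sizeof(float));

        // Vector loop: one madd per group of four floats.
        std::size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            Float4 vx, vy;
            std::memcpy(vx.lane, local_x + i, sizeof vx.lane);
            std::memcpy(vy.lane, local_y + i, sizeof vy.lane);
            const Float4 r = madd(va, vx, vy);
            std::memcpy(local_y + i, r.lane, sizeof r.lane);
        }
        // Scalar tail for a final group of fewer than four elements.
        for (; i < n; ++i) {
            local_y[i] = a * local_x[i] + local_y[i];
        }

        // "DMA out": local buffer -> main memory.
        std::memcpy(y.data() + base, local_y, n * sizeof(float));
    }
}
```

Even in this simplified form, the explicit staging and the four-wide inner loop are the programmer's responsibility. Nothing in the C++ language itself expresses a local store or a vector register, so on real hardware the copies would also have to be overlapped with computation by hand to keep the pipeline busy.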

The same features that make the Cell such a high-performance chip also make it hard for C/C++ compilers to exploit. The assumptions these languages make about memory organization and instruction sets simply do not map well to heterogeneous processors such as the Cell. And if you think this kind of architecture is restricted to the domain of game consoles and special-purpose equipment, consider what vendors such as AMD and Intel are saying about their upcoming architectures. AMD's Fusion project aims to merge GPU-like cores with traditional x86-style cores, which, given the similarities between SPUs and GPU processors, will result in an architecture much like that of the Cell. At its 2006 developer forum, Intel showed off an 80-core processor prototype capable of more than a teraflop of compute power. Those 80 cores are not traditional x86 cores; this is an entirely different processor architecture. Intel also recently announced the Larrabee project, an attempt at converging GPU and multicore CPU architectures.

So wouldn't it be nice to be able to program such processors with familiar tools and languages, without compromising on performance?

