RapidMind: C++ Meets Multicore

Values

With the Value type, you can declare two values, initialize them, add them, and store the result in a third value:


Value1f x(1.0f), y(2.0f);
Value1f z = x + y;

The 1 and the f in the type name Value1f describe what the value holds. The 1 means that this value contains a single scalar. Values can contain any fixed number of scalars, but they usually contain between one and four. The f stands for float. Values can contain any standard C++ type, as well as some nonstandard ones (most notably, half-precision floating-point types). Values are templated; Value1f is actually a typedef for Value<1, float>. The platform provides typedefs for up to four elements of all basic types.
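To make the naming scheme concrete, here is a minimal stand-in for such a template. It is not RapidMind's implementation: the namespace sketch, the std::array storage, the broadcasting constructor, and the element-wise operator+ are all assumptions made for illustration. Only the idea that Value1f names Value<1, float> comes from the text above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

namespace sketch {  // hypothetical namespace; not part of RapidMind

// A fixed-size tuple of N scalars of type T (illustrative stand-in only).
template <std::size_t N, typename T>
struct Value {
    std::array<T, N> data{};

    Value() = default;
    // Assumption: a single scalar is broadcast to all N elements.
    explicit Value(T s) { data.fill(s); }

    T& operator[](std::size_t i) { return data[i]; }
    const T& operator[](std::size_t i) const { return data[i]; }
};

// Element-wise addition of two values with the same size and type.
template <std::size_t N, typename T>
Value<N, T> operator+(const Value<N, T>& a, const Value<N, T>& b) {
    Value<N, T> r;
    for (std::size_t i = 0; i < N; ++i) r[i] = a[i] + b[i];
    return r;
}

// Convenience names in the style the article describes.
typedef Value<1, float> Value1f;
typedef Value<3, float> Value3f;

// Usage: the same three statements as the snippet above, on the stand-in type.
inline void demo_values() {
    Value1f x(1.0f), y(2.0f);
    Value1f z = x + y;
    assert(z[0] == 3.0f);
}

}  // namespace sketch
```

The element count is part of the type, so adding a Value1f to a Value3f in this sketch fails to compile rather than at run time.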

Programs

On its own, the previous example does not make values seem particularly interesting. The computation, if inserted into a C++ function, would simply happen immediately, in the same thread that executes the rest of the function. Values become more interesting when they are combined with the Program type. This code illustrates a program definition using RapidMind:


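Program add_two_numbers = RM_BEGIN {
  In<Value1f> x, y;   // the two inputs of the program
  Out<Value1f> z;     // the single output of the program
  z = x + y;          // collected into the program, not executed here
} RM_END;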

This trivial program captures the same computation we performed directly on values above. However, the computation does not execute immediately. Instead, it is stored in the program object add_two_numbers, and can later be used to compute the sum of two numbers. When a program object is defined, every computation on RapidMind types between the RM_BEGIN and RM_END statements is collected and stored within the object. This process happens at runtime and does not require any special preprocessing or compiler modifications. Programs can be defined in any function, but typically a program is defined in a constructor of a class encapsulating some computation.
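The idea of collecting operations at run time instead of executing them can be sketched in plain C++ with operator overloading. This is a general illustration of deferred recording, not RapidMind's actual mechanism; every name here (Recorder, Sym, input, run) is hypothetical, and only float addition is supported.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {  // hypothetical names throughout; not RapidMind's API

// One recorded instruction: load an input slot, or add two earlier results.
struct Op {
    enum Kind { Input, Add } kind;
    std::size_t a;  // Input: input slot; Add: index of the left operand
    std::size_t b;  // Add: index of the right operand (unused for Input)
};

// The recording under construction. Operators append to it instead of computing.
struct Recorder {
    std::vector<Op> ops;
};

// A handle to the result of one recorded instruction.
struct Sym {
    Recorder* rec;
    std::size_t id;
};

// Declare a program input; nothing is computed here.
inline Sym input(Recorder& r, std::size_t slot) {
    r.ops.push_back(Op{Op::Input, slot, 0});
    return Sym{&r, r.ops.size() - 1};
}

// operator+ records an Add instruction and returns a handle to its result.
inline Sym operator+(Sym x, Sym y) {
    x.rec->ops.push_back(Op{Op::Add, x.id, y.id});
    return Sym{x.rec, x.rec->ops.size() - 1};
}

// Replay the recorded instructions on concrete inputs; returns the last result.
inline float run(const Recorder& r, const std::vector<float>& in) {
    std::vector<float> val(r.ops.size());
    for (std::size_t i = 0; i < r.ops.size(); ++i) {
        const Op& op = r.ops[i];
        val[i] = (op.kind == Op::Input) ? in[op.a] : val[op.a] + val[op.b];
    }
    return val.back();
}

// Usage: record z = x + y once, then execute it later on different inputs.
inline void demo_deferred() {
    Recorder prog;
    Sym x = input(prog, 0), y = input(prog, 1);
    Sym z = x + y;  // recorded, not executed
    (void)z;
    assert(run(prog, {1.0f, 2.0f}) == 3.0f);
    assert(run(prog, {10.0f, 5.0f}) == 15.0f);
}

}  // namespace sketch
```

Because the recording is ordinary data built by ordinary C++ code, no preprocessor or compiler extension is involved, which matches the property the text describes.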

Arrays

Adding two numbers is not a very parallel operation. Adding two arrays of numbers, however, can be parallelized assuming the arrays are large enough. RapidMind lets program objects be called on entire arrays at once:


Array<1, Value1f> a(10000), b(10000), c;
for (int i = 0; i < 10000; i++) {
  a.write_data()[i] = ...;
  b.write_data()[i] = ...;
}
c = add_two_numbers(a, b);


The first line declares three arrays. The arrays a and b are created with space for 10,000 elements each, whereas c is initially empty. Just like Value, Array is a simple class template. The first template parameter specifies the dimensionality of the array (one, two, or three), and the second specifies the type of the elements the array holds.

In the next four lines we simply initialize our input arrays with some data. The write_data() function obtains a plain C++ pointer to the data (in this case, a float*) that can be used to modify the array. A similar read_data() function can be used for read-only access. Distinguishing between the two helps the platform understand when it needs to move data around; for example, to do a copy-on-write operation.
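Why separate read and write accessors help can be seen in a small copy-on-write buffer. This is a sketch of the general technique only, not of how RapidMind's Array is implemented; the class CowBuffer and its member shares_storage_with are hypothetical, and the accessor names are borrowed from the text purely for readability.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

namespace sketch {  // hypothetical class; not RapidMind's Array

class CowBuffer {
public:
    explicit CowBuffer(std::size_t n)
        : store_(std::make_shared<std::vector<float>>(n)) {}

    // Read-only access never copies, even when the storage is shared.
    const float* read_data() const { return store_->data(); }

    // Write access first detaches from any other buffer sharing the storage.
    float* write_data() {
        if (store_.use_count() > 1)
            store_ = std::make_shared<std::vector<float>>(*store_);
        return store_->data();
    }

    bool shares_storage_with(const CowBuffer& o) const {
        return store_ == o.store_;
    }

private:
    // Copies of a CowBuffer share this pointer until one of them writes.
    std::shared_ptr<std::vector<float>> store_;
};

// Usage: copying is cheap; the real copy happens on the first write.
inline void demo_cow() {
    CowBuffer a(4);
    a.write_data()[0] = 1.0f;
    CowBuffer b = a;                 // shares storage with a
    assert(a.shares_storage_with(b));
    b.write_data()[0] = 7.0f;        // b gets its own copy here
    assert(a.read_data()[0] == 1.0f);
    assert(b.read_data()[0] == 7.0f);
}

}  // namespace sketch
```

If there were only one accessor returning a mutable pointer, the buffer would have to assume every access is a write and copy each time.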

The last line of the snippet calls our program object, add_two_numbers, to perform the computation. Note that the program object is used just as though it were a C++ function. Even though the program takes two single values as its input and produces one value as its output, you can call it on entire arrays. The effect is the same as calling the program once for each entry in the arrays; for example, by looping through them. However, applying the program to entire arrays at once makes the parallelism explicit. The platform knows that each output element can be computed independently of the others, and therefore knows that the computation can be split across an arbitrary number of cores.
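The reasoning above, that independent per-element work can be divided among cores, can be sketched with standard C++ threads. This shows one simple way to split such a computation by hand; it says nothing about how the RapidMind platform actually schedules work. The function apply_elementwise and its workers parameter are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

namespace sketch {  // hypothetical helper; not RapidMind's scheduler

// Apply a two-input kernel to every element pair, splitting the index range
// into contiguous chunks and giving each chunk to its own thread.
template <typename Kernel>
std::vector<float> apply_elementwise(Kernel kernel,
                                     const std::vector<float>& a,
                                     const std::vector<float>& b,
                                     unsigned workers) {
    assert(a.size() == b.size());
    std::vector<float> out(a.size());
    workers = std::max(1u, workers);
    const std::size_t chunk = (a.size() + workers - 1) / workers;

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(a.size(), w * chunk);
        const std::size_t end = std::min(a.size(), begin + chunk);
        if (begin == end) break;  // nothing left to hand out
        // Each thread writes a disjoint slice of out, so no locking is needed.
        pool.emplace_back([&, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                out[i] = kernel(a[i], b[i]);
        });
    }
    for (std::thread& t : pool) t.join();
    return out;
}

}  // namespace sketch
```

A call such as sketch::apply_elementwise([](float x, float y) { return x + y; }, a, b, 4) gives the same result for any worker count, precisely because no element depends on another.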

