Dr. Dobb's is part of the Informa Tech Division of Informa PLC




Inside the VSIPL++ API


Performance

Of course, there are other high-productivity environments (such as MATLAB) for prototyping numerical applications. However, VSIPL++ is designed to provide high performance in addition to a convenient programming style. Because VSIPL++ can be used on workstations, supercomputers, or embedded devices, it's easy to prototype an algorithm on a workstation, then move it to the target device.

The VSIPL++ API lets you manually specify the layout of data. For example, it is usually more efficient to arrange a matrix in row-major order if performing computations along the rows, but more efficient to arrange the matrix in column-major order if performing computations along the columns. VSIPL++ programmers can pick whichever arrangement best suits the access pattern of their computation.
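To make the layout trade-off concrete, here is a minimal sketch in plain C++. It does not use the VSIPL++ API: the names Layout, DenseMatrix, offset, and row_sum are hypothetical and exist only for illustration. The sketch shows how the same (row, column) index maps to different positions in memory under each ordering, which is why a walk along a row touches adjacent elements in one layout and strided elements in the other.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration types; these are not part of VSIPL++.
enum class Layout { RowMajor, ColMajor };

struct DenseMatrix {
  std::size_t rows, cols;
  Layout layout;
  std::vector<double> data;  // rows * cols elements in one contiguous buffer

  DenseMatrix(std::size_t r, std::size_t c, Layout l)
      : rows(r), cols(c), layout(l), data(r * c, 0.0) {}

  // Map a (row, column) pair to a position in the flat buffer.
  std::size_t offset(std::size_t r, std::size_t c) const {
    assert(r < rows && c < cols);
    return layout == Layout::RowMajor ? r * cols + c   // a row is contiguous
                                      : c * rows + r;  // a column is contiguous
  }

  double& at(std::size_t r, std::size_t c) { return data[offset(r, c)]; }
};

// Sum along one row. Consecutive accesses are 1 element apart for RowMajor
// and `rows` elements apart for ColMajor.
double row_sum(const DenseMatrix& m, std::size_t r) {
  double s = 0.0;
  for (std::size_t c = 0; c < m.cols; ++c) s += m.data[m.offset(r, c)];
  return s;
}
```

With a 2x3 matrix, element (0, 1) sits at offset 1 in the row-major case and at offset 2 in the column-major case, so row_sum reads memory sequentially only for the row-major layout.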

Sourcery VSIPL++ uses several additional techniques to obtain good performance. Sourcery VSIPL++ can dispatch operations (like FFTs) to math libraries that have been carefully tuned for the target system. For example, on Intel processors, the Intel Integrated Performance Primitives (IPP; http://www.intel.com/cd/software/products/asmo-na/eng/perflib/ipp/index.htm) provide handwritten code for FFTs. If no library is available for a particular operation, Sourcery VSIPL++ falls back to generic routines.
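The dispatch-with-fallback idea can be sketched in a few lines of plain C++. This is only an illustration of the pattern, not a description of how Sourcery VSIPL++ implements it: VmulFn, generic_vmul, and VmulDispatcher are hypothetical names, and the choice is made here at run time for simplicity, whereas a real library may well make it at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Signature shared by every backend in this sketch (hypothetical).
using VmulFn =
    std::function<void(const float*, const float*, float*, std::size_t)>;

// Portable fallback: plain element-wise multiply.
void generic_vmul(const float* a, const float* b, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] * b[i];
}

// Hypothetical dispatcher that holds an optional tuned routine.
struct VmulDispatcher {
  VmulFn tuned;  // left empty when no tuned library is available

  // Returns true if the tuned backend handled the call, false if the
  // generic routine was used instead.
  bool run(const float* a, const float* b, float* out, std::size_t n) const {
    assert(a && b && out);
    if (tuned) {
      tuned(a, b, out, n);
      return true;
    }
    generic_vmul(a, b, out, n);
    return false;
  }
};
```

A caller would assign a wrapper around a vendor routine to tuned when that library is present and leave it empty otherwise; either way, the calling code stays the same.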

The generic routines for some computations (like the *= operator used to perform element-wise multiplication in the example) use expression templates to perform loop fusion and eliminate temporaries. In the *= case, this line of code:

tmp *= filters.row(i); 

is transformed into code like this:

for (length_type j=0; j<N; ++j)
  tmp[j] *= filters[i][j];

This technique is even more effective on code such as:

Vector<T> A, B, C;
A = B + cos(C);

which is transformed into code like:

for (length_type i=0; i<A.size(0); ++i)
  A[i]=B[i]+cos(C[i]);

which can be compiled to very efficient code.
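For readers who have not seen expression templates before, the following is a minimal sketch of the mechanism in standard C++. It is not the VSIPL++ implementation: Expr, Vec, AddExpr, and CosExpr are hypothetical names, and the element type is fixed to double to keep the example short. The operators build a lightweight expression object instead of computing a result, and the assignment operator then evaluates that object element by element in a single loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CRTP base that marks a type as an expression (hypothetical name).
template <typename E>
struct Expr {
  const E& self() const { return static_cast<const E&>(*this); }
};

// Node representing lhs + rhs. Nothing is computed until operator[] is called.
template <typename L, typename R>
struct AddExpr : Expr<AddExpr<L, R>> {
  const L& lhs;
  const R& rhs;
  AddExpr(const L& l, const R& r) : lhs(l), rhs(r) {}
  double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
  std::size_t size() const { return lhs.size(); }
};

// Node representing cos(arg), evaluated lazily per element.
template <typename A>
struct CosExpr : Expr<CosExpr<A>> {
  const A& arg;
  explicit CosExpr(const A& a) : arg(a) {}
  double operator[](std::size_t i) const { return std::cos(arg[i]); }
  std::size_t size() const { return arg.size(); }
};

// Simple owning vector; assigning an expression to it runs the fused loop.
struct Vec : Expr<Vec> {
  std::vector<double> data;
  explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}
  double operator[](std::size_t i) const { return data[i]; }
  double& operator[](std::size_t i) { return data[i]; }
  std::size_t size() const { return data.size(); }

  template <typename E>
  Vec& operator=(const Expr<E>& e) {
    const E& expr = e.self();
    assert(expr.size() == size());
    // The single loop: each element of the whole expression is computed
    // here, with no temporary vectors.
    for (std::size_t i = 0; i < size(); ++i) data[i] = expr[i];
    return *this;
  }
};

template <typename L, typename R>
AddExpr<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
  return AddExpr<L, R>(l.self(), r.self());
}

template <typename A>
CosExpr<A> cos(const Expr<A>& a) {
  return CosExpr<A>(a.self());
}
```

With these definitions, A = B + cos(C) for three Vec objects builds an AddExpr<Vec, CosExpr<Vec>> and evaluates it in one pass. The expression nodes hold references, so in this sketch they must be consumed within the same statement rather than stored for later use.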

Some compilers (the GNU C++ compiler, for instance) have extensions that can be used to explicitly request that Single-Instruction Multiple-Data (SIMD) units on the processor be used to perform several computations at once. Sourcery VSIPL++ uses these extensions when they are available.
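As an illustration of what such an extension looks like, the sketch below uses the GNU vector_size attribute, which GCC and Clang both accept. The article does not say which extensions Sourcery VSIPL++ relies on, so treat this as an assumed example; the function name vmul_inplace and the type name v4sf are hypothetical. On other compilers the code falls back to the scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#if defined(__GNUC__)
// Four packed floats (16 bytes), declared with the GNU vector extension.
typedef float v4sf __attribute__((vector_size(16)));
#endif

// Element-wise dst[i] *= src[i], four elements at a time where possible.
void vmul_inplace(float* dst, const float* src, std::size_t n) {
  assert(dst && src);
  std::size_t i = 0;
#if defined(__GNUC__)
  for (; i + 4 <= n; i += 4) {
    v4sf a, b;
    // memcpy avoids alignment assumptions about the caller's buffers.
    std::memcpy(&a, dst + i, sizeof a);
    std::memcpy(&b, src + i, sizeof b);
    a = a * b;  // one vector multiply covers four elements
    std::memcpy(dst + i, &a, sizeof a);
  }
#endif
  // Scalar loop handles the tail, or everything on other compilers.
  for (; i < n; ++i) dst[i] *= src[i];
}
```

Whether the vector multiply becomes a single SIMD instruction depends on the target and the compiler flags; the extension expresses the intent, and the compiler lowers it to whatever the hardware provides.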

The net result of these techniques is that the Sourcery VSIPL++ performance for an application is usually within a hair's breadth of the performance attained by directly using the underlying low-level math libraries. So, VSIPL++ users get productivity and portability, without sacrificing performance.

On occasion, Sourcery VSIPL++ is able to achieve better performance than the underlying math libraries, due to the use of loop fusion. For example, in the cosine example, using a library like IPP, you would have to perform the addition and cosine operations separately. As a result, you would get code that looks more like this:

for (length_type i = 0; i < A.size(0); ++i)
  A[i] = B[i];
for (length_type i = 0; i < A.size(0); ++i)
  A[i] += cos(C[i]);

Because there are two separate loops, there is more overhead and an inferior cache access pattern. The loop fusion used by Sourcery VSIPL++ collapses these two loops into a single loop.

