C/C++

Inside the VSIPL++ API

By Mark Mitchell, September 05, 2006

VSIPL++ is a C++ API for high-performance computing. One of its distinguishing features is direct support for parallel applications.

Performance

Of course, there are other high-productivity environments (such as Matlab) for prototyping numerical applications. However, VSIPL++ is designed to provide high performance in addition to a convenient programming style. Because VSIPL++ can be used on workstations, supercomputers, or embedded devices, it's easy to prototype an algorithm on a workstation, then move it to the target device.

The VSIPL++ API lets you manually specify the layout of data. For example, it is usually more efficient to arrange a matrix in row-major order if performing computations along the rows, but more efficient to arrange the matrix in column-major order if performing computations along the columns. VSIPL++ programmers can pick whichever arrangement is most convenient.
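To make the layout trade-off concrete, here is a minimal sketch in plain C++. It does not use the VSIPL++ API; the names RowMajor, ColMajor, DenseMatrix, and sum_row are hypothetical and exist only for illustration. The layout is a template parameter, so the same algorithm can be compiled against either storage order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout tags; these are NOT part of the VSIPL++ API.
struct RowMajor {
  static std::size_t index(std::size_t r, std::size_t c,
                           std::size_t /*rows*/, std::size_t cols) {
    return r * cols + c;  // elements of one row are adjacent in memory
  }
};

struct ColMajor {
  static std::size_t index(std::size_t r, std::size_t c,
                           std::size_t rows, std::size_t /*cols*/) {
    return c * rows + r;  // elements of one column are adjacent in memory
  }
};

// Dense matrix whose storage order is chosen at compile time.
template <typename T, typename Layout>
class DenseMatrix {
public:
  DenseMatrix(std::size_t rows, std::size_t cols)
      : rows_(rows), cols_(cols), data_(rows * cols) {}

  T& operator()(std::size_t r, std::size_t c) {
    assert(r < rows_ && c < cols_);
    return data_[Layout::index(r, c, rows_, cols_)];
  }
  const T& operator()(std::size_t r, std::size_t c) const {
    assert(r < rows_ && c < cols_);
    return data_[Layout::index(r, c, rows_, cols_)];
  }
  std::size_t rows() const { return rows_; }
  std::size_t cols() const { return cols_; }

private:
  std::size_t rows_, cols_;
  std::vector<T> data_;
};

// Sums one row. With RowMajor the loop walks memory with stride 1;
// with ColMajor the stride is rows(), which is less cache-friendly.
template <typename T, typename Layout>
T sum_row(const DenseMatrix<T, Layout>& m, std::size_t r) {
  T total = T();
  for (std::size_t c = 0; c < m.cols(); ++c)
    total += m(r, c);
  return total;
}
```

Both instantiations return the same sums; only the memory access pattern differs, which is why the choice of layout should follow the direction of the dominant computation.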

Sourcery VSIPL++ uses several additional techniques to obtain good performance. Sourcery VSIPL++ can dispatch operations (like FFTs) to math libraries that have been carefully tuned for the target system. For example, on Intel processors, the Intel Performance Primitives (IPP; http://www.intel.com/cd/software/products/asmo-na/eng/perflib/ipp/index.htm) provide handwritten code for FFTs. If no library is available for a particular operation, Sourcery VSIPL++ falls back to generic routines.

The generic routines for some computations (like the *= operator used to perform element-wise multiplication in the example) use expression templates to perform loop fusion and eliminate temporaries. In the *= case, this line of code:

tmp *= filters.row(i);

is transformed into code like this:

for (length_type j=0; j<N; ++j)
  tmp[j] *= filters[i][j];

This technique is even more effective on code such as:

Vector<T> A, B, C;
A = B + cos(C);

which is transformed into code like:

for (length_type i=0; i<A.size(0); ++i)
  A[i] = B[i] + cos(C[i]);

which can be compiled to very efficient code.
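For readers unfamiliar with expression templates, the following is a minimal sketch of the general technique, not the Sourcery VSIPL++ implementation. All names (Expr, Vec, Sum, Cos) are hypothetical. The idea is that B + cos(C) builds a lightweight object describing the computation, and the single loop runs only when that object is assigned to a vector.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; a sketch of the technique only.

// CRTP base that marks a type as an expression.
template <typename E>
struct Expr {
  const E& self() const { return static_cast<const E&>(*this); }
};

class Vec : public Expr<Vec> {
public:
  explicit Vec(std::size_t n, double value = 0.0) : data_(n, value) {}

  double operator[](std::size_t i) const { return data_[i]; }
  double& operator[](std::size_t i) { return data_[i]; }
  std::size_t size() const { return data_.size(); }

  // Assigning an expression runs ONE loop over all elements.
  template <typename E>
  Vec& operator=(const Expr<E>& e) {
    const E& expr = e.self();
    assert(expr.size() == size());
    for (std::size_t i = 0; i < size(); ++i)
      data_[i] = expr[i];
    return *this;
  }

private:
  std::vector<double> data_;
};

// Node for "l + r". It stores references and computes nothing until indexed,
// so it must not outlive the statement that created it.
template <typename L, typename R>
class Sum : public Expr<Sum<L, R> > {
public:
  Sum(const L& l, const R& r) : l_(l), r_(r) {}
  double operator[](std::size_t i) const { return l_[i] + r_[i]; }
  std::size_t size() const { return l_.size(); }
private:
  const L& l_;
  const R& r_;
};

// Node for "cos(a)", evaluated element by element on demand.
template <typename A>
class Cos : public Expr<Cos<A> > {
public:
  explicit Cos(const A& a) : a_(a) {}
  double operator[](std::size_t i) const { return std::cos(a_[i]); }
  std::size_t size() const { return a_.size(); }
private:
  const A& a_;
};

template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
  return Sum<L, R>(l.self(), r.self());
}

template <typename A>
Cos<A> cos(const Expr<A>& a) {
  return Cos<A>(a.self());
}
```

With these definitions, a statement such as A = B + cos(C) on three Vec objects allocates no temporary vector: the assignment operator evaluates B[i] + std::cos(C[i]) once per element, which a compiler can inline into a single tight loop.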

Some compilers (the GNU C++ compiler, for instance) have extensions that can be used to explicitly request that Single-Instruction Multiple-Data (SIMD) units on the processor be used to perform several computations at once. Sourcery VSIPL++ uses these extensions when they are available.
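As an illustration of what such an extension looks like, here is a small sketch that uses the GNU vector_size attribute, which GCC and compatible compilers accept. Which extensions Sourcery VSIPL++ actually uses is not stated here, so treat this as an assumption; the function name multiply_in_place is hypothetical. When the extension is unavailable, the code falls back to an ordinary scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#if defined(__GNUC__)
// Four packed floats (16 bytes); arithmetic on this type is element-wise.
typedef float v4sf __attribute__((vector_size(16)));
#endif

// Computes dst[i] *= src[i] for i in [0, n). Hypothetical helper.
void multiply_in_place(float* dst, const float* src, std::size_t n) {
  assert(n == 0 || (dst != 0 && src != 0));
  std::size_t i = 0;
#if defined(__GNUC__)
  // Process four elements per iteration. memcpy avoids alignment
  // assumptions; compilers typically turn it into a vector load/store.
  for (; i + 4 <= n; i += 4) {
    v4sf a, b;
    std::memcpy(&a, dst + i, sizeof a);
    std::memcpy(&b, src + i, sizeof b);
    a *= b;
    std::memcpy(dst + i, &a, sizeof a);
  }
#endif
  // Scalar loop for the remaining elements, or for everything when the
  // extension is not available.
  for (; i < n; ++i)
    dst[i] *= src[i];
}
```

The scalar tail loop keeps the function correct for any length, so callers need not pad their data to a multiple of four.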

The net result of these techniques is that the Sourcery VSIPL++ performance for an application is usually within a hair's breadth of the performance attained by directly using the underlying low-level math libraries. So, VSIPL++ users get productivity and portability, without sacrificing performance.

On occasion, Sourcery VSIPL++ is able to achieve better performance than the underlying math libraries, due to the use of loop fusion. For example, in the cosine example, using a library like IPP, you would have to perform the addition and cosine operations separately. As a result, you would get code that looks more like this:

for (length_type i = 0; i < A.size(0); ++i)
  A[i] = B[i];
for (length_type i = 0; i < A.size(0); ++i)
  A[i] += cos(C[i]);

Because there are two separate loops, there is more overhead and an inferior cache access pattern. The loop fusion used by Sourcery VSIPL++ collapses these two loops into a single loop.
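The difference can be sketched in self-contained C++. The kernels lib_cos and lib_add below are hypothetical stand-ins for separately tuned library routines; they are not IPP functions. Composing them requires two passes over memory and a temporary buffer, whereas the fused version makes one pass and needs no temporary.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> Buffer;

// Stand-ins for separately tuned library kernels (hypothetical names).
void lib_cos(const Buffer& in, Buffer& out) {
  assert(in.size() == out.size());
  for (std::size_t i = 0; i < in.size(); ++i)
    out[i] = std::cos(in[i]);
}

void lib_add(const Buffer& x, const Buffer& y, Buffer& out) {
  assert(x.size() == out.size() && y.size() == out.size());
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = x[i] + y[i];
}

// Library style: one call per operation, so two loops and a temporary.
void add_cos_unfused(const Buffer& B, const Buffer& C, Buffer& A) {
  Buffer tmp(C.size());
  lib_cos(C, tmp);     // first pass: tmp = cos(C)
  lib_add(B, tmp, A);  // second pass: A = B + tmp
}

// Fused style: each element of B and C is read once and A is written once.
void add_cos_fused(const Buffer& B, const Buffer& C, Buffer& A) {
  assert(B.size() == A.size() && C.size() == A.size());
  for (std::size_t i = 0; i < A.size(); ++i)
    A[i] = B[i] + std::cos(C[i]);
}
```

Both functions produce the same values in A. For vectors too large to fit in cache, the unfused version streams the intermediate results through memory a second time, which is the overhead that loop fusion avoids.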
