Performance portability is likely to become a necessary part of programming in the near future. Already, some compilers generate good SSE code for the x86 processor architecture only when stride-one vectors (also known as Array-like layouts) are used. At the same time, older superscalar machines usually perform best when Struct-like layouts are used as often as possible. Without the flexibility to switch memory layouts, performance almost certainly suffers as code is ported, and this tension will only grow as we head into the future.
The technique I describe here is used in a large multiphysics project that encompasses hundreds of thousands of lines of code. The technique was originally used more as a refactoring tool than as a performance-portability tool. At one point, the project was able to quickly refactor the hydrodynamics portion of its physics code for a 42-100 percent speedup, depending on the problem being solved and the machine being used. The hydrodynamics can be the dominant portion of the runtime for many physics applications, so doubling its performance with just a change of data structures is impressive. Part of the performance gain probably came from the compiler recognizing extra optimizations that could be applied (with the same compiler flags), and the rest came from different cache latency characteristics.
Thanks to Brian McCandless for questioning me during a presentation on the sidebar material, and inspiring me to write this article. His group uses a variant of this technique in their software.
Next-Generation Performance Portability
To take full advantage of multicore, NUMA, and heterogeneous technologies, lots of code will likely need to be rewritten, especially in the High Performance Computing (HPC) arena.
To mitigate the impact of this shift to new architectures, I'm investigating the use of source-to-source translation as a possible salvation. I've used a sophisticated tool called ROSE (www.roseCompiler.org) to write a Tiny C language extension that implements the techniques in this article in a way that is transparent to users. All the drawbacks I mention in the article go away, and new features emerge, especially with respect to multidimensional arrays.
Users create a single schema file describing the struct/array grouping of the array names used within a code. The compiler can then use the schema to generate structs or arrays, as appropriate, at compile time. The groupings can be hierarchical, so subtle relationships among groups of arrays are easily captured.
I plan to investigate generating RapidMind, Intel Threading Building Blocks, or other backend code to complement the performance-portable language extension. My hope is that source translation combined with a vector-like language extension might let the power of next-generation architectures be exploited with a minimal amount of effort devoted to code rewrites.