Gastón Hillar

Dr. Dobb's Bloggers

High-Level Programming Languages Should Improve Support for SIMD Instructions

June 13, 2011

Most modern microprocessors can execute the same instruction on multiple data elements at once, a capability that Michael J. Flynn classified as SIMD (short for Single Instruction, Multiple Data) in the taxonomy he proposed in 1966. These vector-processing units can reduce the time needed to execute certain algorithms. However, many popular high-level programming languages don't allow developers to take advantage of SIMD instructions.

Modern microprocessors can execute SIMD instructions, but these instructions are part of different extended instruction sets. Because the need for greater computing performance continues to grow across industry segments, most CPU manufacturers have incorporated extended instruction sets in their new CPU models. For example, the most advanced Intel CPUs added two new SIMD instruction sets: AES-NI (short for Advanced Encryption Standard New Instructions) and AVX (short for Advanced Vector eXtensions).
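Because SIMD instructions live in separate extended instruction sets, code usually has to know which sets it may use. The following is a minimal sketch, assuming a GCC- or Clang-style compiler that predefines macros such as `__SSE2__`, `__AVX__`, `__AES__`, and `__ARM_NEON`; other compilers may use different macros. The function name `report_simd_targets` is hypothetical. Note that it reports what the compiler was told to target at build time (for example via `-mavx`), not what the CPU running the program actually supports.

```c
#include <stdio.h>

/* Prints the instruction-set extensions the compiler is targeting and
   returns how many of the four checked sets are enabled.
   Assumption: GCC/Clang-style predefined macros; this is a compile-time
   view, not a runtime CPU feature check. */
int report_simd_targets(void)
{
    int enabled = 0;

#if defined(__SSE2__)
    puts("SSE2 enabled at compile time");
    enabled++;
#endif
#if defined(__AVX__)
    puts("AVX enabled at compile time");
    enabled++;
#endif
#if defined(__AES__)
    puts("AES-NI enabled at compile time");
    enabled++;
#endif
#if defined(__ARM_NEON)
    puts("ARM NEON enabled at compile time");
    enabled++;
#endif

    if (enabled == 0)
        puts("None of the checked SIMD instruction sets is enabled");
    return enabled;
}
```

Calling `report_simd_targets()` from `main` in builds made with different compiler flags should show how the same source sees different extended instruction sets.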

In addition, the execution units for SIMD instructions usually belong to a physical core, and therefore, it is possible to run as many SIMD instructions in parallel as available physical cores. The use of these vector-processing capabilities in parallel can provide impressive speedups.

If you work with C or C++, it is pretty easy to call any SIMD instruction in your code. In fact, many applications written in C and C++ take advantage of these instruction sets to work on vectors and matrices. They are very useful for improving performance in algorithms that need to perform the same calculations on many data blocks. Most modern C and C++ compilers also optimize loops to take advantage of SIMD instruction sets: they can auto-vectorize loops that operate on arrays, provided the loops are written following certain guidelines.
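Both approaches can be sketched in a few lines of C. The first function below calls SSE intrinsics from `<xmmintrin.h>` directly when the compiler defines `__SSE__`, and falls back to a scalar loop otherwise. The second is a plain loop written so that an optimizing compiler has a good chance of auto-vectorizing it; whether it actually does depends on the compiler and its flags. The function names `add_floats_simd` and `add_floats_autovec` are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

/* Explicit SIMD: adds two float arrays four elements at a time when SSE
   is available, then finishes any leftover elements with scalar code. */
void add_floats_simd(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

#if defined(__SSE__)
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);             /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));  /* one add covers 4 lanes */
    }
#endif

    /* Scalar tail, or the whole array when SSE is not available. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Auto-vectorization candidate: a simple counted loop with no
   loop-carried dependency. The restrict qualifiers promise the compiler
   that the arrays do not overlap, which removes a common obstacle to
   vectorization. */
void add_floats_autovec(const float *restrict a, const float *restrict b,
                        float *restrict out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Both functions compute the same result. The difference is who chooses the instructions: the programmer in the first case, the compiler in the second.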

However, if you write JavaScript or PHP code, there is no way to take advantage of the available SIMD instructions in the underlying hardware. With JavaScript, the engine provided by the Web browser might decide to use SIMD instructions under certain circumstances, but you cannot suggest the usage of even the simplest SIMD instructions via code.

Other modern and popular high-level programming languages such as C#, F#, Ruby, Java, and Scala don't provide direct support for calling SIMD instructions. Two years ago, I wrote an article describing SIMD support for C# via Mono 2.2. Mono has improved the support in its latest release, 2.10, and its release notes describe the Mono.Simd namespace. Sadly, .NET Framework 4 hasn't added any kind of support for SIMD instructions.

A few months ago, Jonathan Parri, Daniel Shapiro, Miodrag Bolic, and Voicu Groza wrote a very interesting article in Queue, the ACM's magazine for practicing software engineers. In "Returning Control to the Programmer: SIMD Intrinsics for Virtual Machines," the authors state that exposing SIMD units within interpreted languages could simplify programs and unleash floods of untapped processor power. They explain why SIMD instructions are greatly underutilized and list the arguments for supporting the inclusion of pre-mapped SIMD vector intrinsics within interpreted languages, which include:

  • Faster application runtime
  • Lower cost
  • Smaller code size
  • Fewer coding errors
  • A more transparent programming experience

In addition, the authors address concerns about SIMD inclusion and provide a complete example of designing an API, called "jSIMD," for mapping Java code to SIMD instructions. The authors believe it is very important to give more control to the programmer. I agree with them: developers should be able to execute SIMD instructions to achieve the best performance in high-level programming languages. We should be able to take full advantage of the processing power offered by modern hardware to create better software and enterprise applications.


Comments:

ubm_techweb_disqus_sso_-46c85115c2f44b77f6279276ef1f8deb
2012-06-26T23:19:36

Very late reply, but I read the Queue article and re-read this article, and I stand by my earlier comments. I also find that our only real points of disagreement have more to do with the problem description than with the proposed solution. The problem as I see it is not lack of access to SIMD instruction sets (as is available in C/C++ via inline assembly), but rather lack of built-in or library functions (or even language constructs) for performing the vector and other parallel operations that SIMD instructions are specifically made for. I find the Mono.Simd and jSIMD libraries that are described to be a perfectly reasonable way to leverage the available hardware to achieve the desired speed-ups, though again I think the naming should be based on the high-level problem being solved (e.g. Mono.VectorProcessing) as opposed to the underlying hardware technology that is used to optimize it. That way it is kept at the appropriate level of abstraction for the high-level language.


AndrewBinstock
2012-03-07T22:41:45

@father_ramon You might want to look at the article that is referred to by Gaston in Queue magazine. It specifically addresses the point you make.


ubm_techweb_disqus_sso_-46c85115c2f44b77f6279276ef1f8deb
2012-03-07T22:33:52

What part of "High-Level" do you not understand? While it certainly would be appropriate for compilers, JITs, VMs, and libraries for a high-level language to make use of machine-level instructions, I find it not at all appropriate to wedge machine instructions into the more abstract model provided by a high-level language.


ubm_techweb_disqus_sso_-f4d555c99e564d1c87bf4177076cfe2f
2012-03-02T14:48:48

Java HotSpot VM generates SIMD instructions on hot loops. At least this is what it looked like the last time I watched it in a disassembler. But of course you are right: the developer should be able to actually rely on and control that feature. Sadly, it appears that the designers of the current major VMs (especially the .NET CLR) do not consider this important enough. That might be because the execution of machine instructions by the processor is often not the bottleneck nowadays. Rather, memory bandwidth is the limiting factor, at least without careful, non-trivial memory management. But for those cases where this can be achieved, SIMD support would be very profitable.



