High-Level Programming Languages Should Improve Support for SIMD Instructions
Most modern microprocessors offer the ability to execute the same instruction on multiple data elements, a capability Michael J. Flynn classified as SIMD (short for Single Instruction, Multiple Data) in the taxonomy he proposed in 1966. These vector-processing units can be used to reduce the time needed to execute certain algorithms. However, many popular high-level programming languages don't allow developers to take advantage of SIMD instructions.
Modern microprocessors can execute SIMD instructions, but these instructions belong to different extended instruction sets. Because the need for greater computing performance continues to grow across industry segments, most CPU manufacturers have incorporated extended instruction sets in their new CPU models. For example, the most recent Intel CPUs added two new instruction sets: AES-NI (short for Advanced Encryption Standard New Instructions) and AVX (short for Advanced Vector Extensions).
In addition, each physical core typically has its own SIMD execution units, so a processor can run as many SIMD instructions in parallel as it has physical cores. Using these vector-processing capabilities in parallel can provide impressive speedups.
If you work with C or C++, it is pretty easy to call SIMD instructions in your code, typically through compiler intrinsics. In fact, many applications written in C and C++ take advantage of these instruction sets to work on vectors and matrices, and they are very useful for improving the performance of algorithms that perform many calculations on large blocks of data. Most modern C and C++ compilers also optimize loops to take advantage of SIMD instruction sets: they can auto-vectorize loops that operate on arrays, provided those loops follow certain guidelines.
Other modern and popular high-level programming languages such as C#, F#, Ruby, Java, and Scala don't provide direct support for calling SIMD instructions. Two years ago, I wrote an article describing SIMD support for C# via Mono 2.2. Mono has improved that support in its latest release, 2.10, and you can read about the Mono.Simd namespace in the Mono 2.10 release notes. Sadly, .NET Framework 4 hasn't added any kind of support for SIMD instructions.
A few months ago, Jonathan Parri, Daniel Shapiro, Miodrag Bolic, and Voicu Groza wrote a very interesting article in Queue, the ACM's magazine for practicing software engineers. In "Returning Control to the Programmer: SIMD Intrinsics for Virtual Machines," the authors state that exposing SIMD units within interpreted languages could simplify programs and unleash floods of untapped processor power. They explain why SIMD instructions are greatly underutilized and list the arguments for supporting the inclusion of pre-mapped SIMD vector intrinsics within interpreted languages, which include:
- Faster application runtime
- Lower cost
- Smaller code size
- Fewer coding errors
- A more transparent programming experience
In addition, the authors address concerns about SIMD inclusion and provide a complete example of designing an API, called "jSIMD," for mapping Java code to SIMD instructions. The authors believe it is very important to give more control to the programmer, and I agree with them: developers should be able to execute SIMD instructions from high-level programming languages to achieve the best possible performance. We should be able to take full advantage of the processing power offered by modern hardware to create better software and enterprise applications.