Many performance fanatics have been excited about the potential of the various hardware accelerators now on the market, from GPUs such as the NVIDIA GeForce series and those from AMD/ATI, to field-programmable gate arrays (FPGAs) such as the Xilinx Virtex series, to application-specific integrated circuits (ASICs) such as those from ClearSpeed. While the performance potential is clear for well-suited algorithms, the logistics of using accelerators in real programs can be daunting for most algorithm developers. Here again, this new generation of productivity languages offers a dramatic step forward. For example, the NVIDIA GeForce has FFTs implemented in its scientific library. Thus, for the example above, users don't need to change their core algorithms; they simply request the GeForce FFT routine, typically running on multiple GeForce chips, instead of the standard one running on the general-purpose cores. This might be requested, at the beginning of a Star-P session, by the following:
Again, compatibility with the desktop language is crucial, so this change can be made conditionally in a set-up portion of the code, leaving the rest of the program oblivious to whether it's running on a single core on the desktop, on multiple general-purpose cores in Star-P, or on multiple accelerators via Star-P.
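The conditional set-up described above can be sketched in plain Python. This is only an illustration of the pattern, not Star-P syntax: the names `FFT_BACKENDS`, `select_fft`, `power_spectrum`, and the `FFT_BACKEND` environment variable are all hypothetical, and the "accelerator" entry is an ordinary radix-2 FFT standing in for a vendor routine.

```python
import cmath
import os


def dft_reference(x):
    """Plain O(n^2) DFT, standing in for the standard general-purpose routine."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 FFT, standing in for an accelerator-backed routine.

    Requires len(x) to be a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# Hypothetical registry of available FFT implementations.
FFT_BACKENDS = {"standard": dft_reference, "accelerator": fft_radix2}


def select_fft(backend=None):
    """Set-up step: choose the FFT routine once, by name.

    Falls back to the (hypothetical) FFT_BACKEND environment variable,
    then to the standard routine.
    """
    name = backend or os.environ.get("FFT_BACKEND", "standard")
    return FFT_BACKENDS[name]


def power_spectrum(signal, fft):
    """Core algorithm: identical whichever FFT routine was selected."""
    return [abs(c) ** 2 for c in fft(signal)]
```

In this sketch only the call to `select_fft` knows which implementation is in use; `power_spectrum` receives the routine as an argument and is otherwise unchanged, which mirrors the idea of keeping the rest of the program oblivious to where the FFT runs.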
Software developers targeting appliances are coping with demanding requirements for adaptability and performance. A new generation of productivity tools, extending existing desktop languages for parallelism, is delivering much higher performance while preserving the productivity of the desktop.