Software Must Mask Hardware Volatility From Most Programmers
Despite the programming difficulties, the new level of performance promised by GPUs is undeniably valuable, and HPC-dependent organizations must reap that potential performance in a way that does not disrupt their overall missions. Thus, beyond key pilot projects, many of these organizations will find themselves waiting for the programming obstacles to be overcome before adopting GPUs broadly.
In addition to the exposure of the memory hierarchy detailed above, another considerable obstacle is uncertainty about the future HPC hardware landscape and about software's ability to mask the details of that landscape. At the hardware level, GPUs could persist as distinct chips and cores, GPU cores could be collocated with CPU cores on hybrid sockets, or GPU-core functionality could be subsumed into CPU cores. At the level of the memory hierarchy, compiler, and language, techniques may evolve that enable (subsets of) standard C and Fortran to run efficiently without today's exposure of the memory hierarchy. Each of these plausible scenarios would lead to a different approach for programming whole applications (i.e., those that need functionality beyond what today's GPUs and their compilers support).
If GPUs persist as devices distinct from CPUs, then splitting work between them will remain enormously difficult: the large granularity required for GPU computations (to amortize the cost of moving data) is beyond the state of the art for compiler recognition. Of course, if GPU performance and functionality are subsumed by CPUs, then most of these problems evaporate. For GPUs to be broadly used, software interfaces or platforms must emerge that mask most of this uncertainty from the typical programmer. Those interfaces could be libraries and run-time support that execute a function on a GPU, if such a version exists, and hide the details of data motion. They could be compilers that prompt the programmer for the extra information needed to use the GPU memory hierarchy effectively, compilers that perform the necessary analysis automatically, or compilers that cleverly map algorithms onto existing libraries of GPU kernels. They could be novel tools for computing on CPU and GPU cores simultaneously. Many of these technical issues are not new and have been deeply explored by HPC researchers. Whatever path(s) emerge, most HPC-dependent organizations will expect stability of software interfaces, which in this case means stability of performance as well as correctness, before making large investments to move more than the first few codes onto GPUs.
A constraint sometimes overlooked in the enthusiasm for a new HPC technology (GPUs or otherwise) is that HPC has become vital to many organizations that do not have in-house HPC expertise, in contrast to the traditional market for HPC in government labs and universities. These non-HPC-expert organizations usually depend on software from third parties, sometimes commercial companies and sometimes open-source communities. Since developers of third-party software are usually obsessed with sustainability and are conservative in adopting new programming methods, even ones that offer productivity benefits, they will need usable, standard interfaces before changing their codes. Of course, the HPC market is not monolithic, and organizations critically dependent on high-end HPC will still take extreme measures to get the maximum absolute performance possible on key applications, going well beyond the level of effort that the bulk of the market will contemplate, and accepting frequent reengineering as the cost of staying at the bleeding edge. But even those organizations will face these issues when considering applications other than their key ones.
Creating This Software Will Require Investments by HPC-Dependent Organizations
Unfortunately, the usable, standard software I just described largely does not exist today. Developing it will require significant investment, whether it takes the form of commercial products or open-source collaborations, and ultimately HPC-dependent organizations will pay for this development. (See Paradigms for Parallel Computation, by Dan Stanzione et al., for related views.) HPC-dependent organizations that primarily use open-source software tools may want to consider chartering or supporting development projects to address these issues. Organizations that use commercial program development tools will need to make considered early purchases of products that solve these problems in practical, sustainable ways.
However the money is spent, HPC-dependent organizations should anticipate investing a larger fraction of their overall budget in application development tools than they traditionally have. Organizations that navigate the adoption of this disruptive technology wisely can expect markedly higher delivered performance for less money and power than their competitors.