Dr. Dobb's Contributing Editor Eric Bruno recently had the opportunity to speak with John Fowler, executive vice president of the Systems Group at Sun Microsystems. In a wide-ranging conversation, Fowler boldly proclaimed that "single-core systems are history." Indeed, as Fowler told Dr. Dobb's, you can't even purchase single-core servers anymore, and they're disappearing from data-centers as though they never existed in the first place. And even if you do happen to locate a vendor that still sells a single-core server, why would you buy it? If you have a legacy application that's hopelessly single-threaded, you can still take advantage of multi-core systems with virtualization software; simply run multiple instances of it, including the OS, and consolidate servers. Here's Eric's report:
To kick off our conversation, Fowler began by recapping the hardware/software relationship prior to multi-core technology. It used to be that compute power was measured by clock rate, where increases in clock speed were common if not predictable. For software vendors, this meant a reliable increase in performance without changing any code. However, as processor manufacturers began to bump up against the barriers of heat generation and power consumption, it looked as though Moore's law would stall along with clock rates. Fortunately, by that time, Sun was well ahead of the game with its SMP-based server products and multi-core CPUs. AMD and Intel took the next step and introduced this technology to the desktop and notebook manufacturers. Fowler made it clear to me that adding cores to a processor is not where it ends. Sun has paid careful attention to scaling other components of the processor, such as the different levels of cache, as well as other components in its servers, to take advantage of the architecture.
To be accurate, the discussion should be around multi-thread capable microprocessors, not just multi-core. For instance, Sun's latest UltraSPARC T2 Plus Niagara processor consists of 8 cores, each of which can run 8 threads, forming a 64-way SMP system-on-a-chip. New for the T2 Plus, when compared to the T2, is that it's now multi-socket aware. This means that UltraSPARC T2 Plus processors can be paired to form a whopping 128-thread capable server, such as Sun's just-announced 1U servers (Sun SPARC Enterprise T5140 and T5240 servers). In fact, by the end of 2008, Sun promises to have systems that are 256-thread capable. More information on the SPARC Enterprise T5140 and T5240 servers.
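The thread counts quoted above follow from simple multiplication: sockets times cores per socket times threads per core. The short Java sketch below works through that arithmetic and then asks the JVM how many processors it can see on whatever machine runs it. The class and method names are hypothetical, chosen only for illustration, and the sketch assumes the operating system presents each hardware thread to the JVM as a processor.

```java
// Hypothetical helper; the class and method names are illustrative only.
final class HardwareThreads {

    // Total hardware threads = sockets x cores per socket x threads per core.
    static int totalThreads(int sockets, int coresPerSocket, int threadsPerCore) {
        return sockets * coresPerSocket * threadsPerCore;
    }

    public static void main(String[] args) {
        // Figures quoted in the article for the UltraSPARC T2 Plus.
        System.out.println("One T2 Plus: " + totalThreads(1, 8, 8) + " threads");
        System.out.println("Two T2 Plus: " + totalThreads(2, 8, 8) + " threads");

        // What the JVM reports on the machine actually running this code.
        int visible = Runtime.getRuntime().availableProcessors();
        System.out.println("Processors visible to this JVM: " + visible);
    }
}
```

On a desktop or notebook the last line will print a far smaller number than 128, which is a quick way to see how wide the gap is between these servers and everyday development machines.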
All of these servers, including the ones announced yesterday, will be available for developers to try for free through Sun's Try-and-Buy program. The intent is to get these systems into the hands of those who will put the cores to good use, and help to make this scale of multi-threading pervasive in the marketplace.
Beyond being multi-core and multi-socket capable, the UltraSPARC T2 Plus offers cryptography and networking functions directly on the chip. The processor is capable of performing MD5 checksum calculations, AES and DES encryption, and other cryptography functions. Additionally, along with PCI Express I/O functionality, 10-Gigabit Ethernet is built into the T2 Plus. This amounts to significant hardware acceleration in areas used heavily by most server applications.
Solaris 10 was not only built for highly multi-threaded environments, such as the 128-thread systems Sun is introducing, but it will also automatically take advantage of the hardware acceleration features listed above. Applications built for Solaris 10, or even earlier versions of Solaris, will therefore see the benefits of hardware acceleration in these areas without any changes to code or even the need to recompile.
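To make the "no code changes" point concrete, here is a minimal Java sketch of application code that computes an MD5 digest through the standard `java.security.MessageDigest` API. Nothing in it names a particular processor or crypto engine; the application simply asks for an algorithm by name. Whether a given platform routes such a call to on-chip hardware depends on the installed security providers and the OS configuration, which this sketch neither checks nor assumes. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical example class; only the java.security calls are standard API.
final class DigestDemo {

    // Returns the MD5 digest of the given text as a lowercase hex string.
    static String md5Hex(String text) {
        try {
            // The application names the algorithm, not the implementation.
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required of every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("abc"));
    }
}
```

Because the code is written against the algorithm name alone, the same source can run unchanged on systems with or without hardware support for the operation.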
Software Developers and Tools
Of course, to take full advantage of highly multi-thread capable systems, development tools need to be updated. Sun is currently working on enhancements to both NetBeans for Java developers and Sun Studio for C/C++/Fortran developers to take full advantage of multi-thread capable systems. In fact, these tools, as well as Java, are ready today for the multi-core world. Java, for instance, ships with garbage collector (GC) technology that takes advantage of additional processors to perform work in parallel with the application it serves. The next version of Java will extend this with its Garbage First garbage collector to offer lower-pause, lower-latency GC functionality. And don't forget Just-In-Time (JIT) compilation in the virtual machine, which runs in the background to compile and continuously optimize your Java code.
Future of Multi-threading
Fowler's prediction on the future of multi-threaded (or multi-core) processors is twofold. From a software perspective, he believes that cores will become like memory today -- a commodity that developers don't need to pay as much attention to as they did decades ago. With gobs of gigabytes of cheap physical memory at their disposal, developers just aren't focused on memory constraints, and rightfully so. The same will occur with threads; developers will one day write applications that spawn all sorts of dedicated threads that go into tight loops with little concern for context switching and other effects because processing cores will be readily available. If you can imagine a system with the ability to concurrently run 256 or more threads, you can begin to see the reality in this statement.
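As a rough illustration of code that treats threads as a plentiful resource, the Java sketch below splits a simple summation across a pool of worker threads sized to whatever the machine offers. It is a minimal sketch, not a tuned implementation: the class name, the method name, and the choice of workload are all hypothetical, and the lambda syntax assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; the workload is chosen only for illustration.
final class ParallelSum {

    // Sums 1..n by giving each worker thread its own contiguous chunk.
    static long sum(long n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long chunk = (n + workers - 1) / workers; // ceiling division
            for (int w = 0; w < workers; w++) {
                final long from = w * chunk + 1;
                final long to = Math.min(n, (w + 1) * chunk);
                parts.add(pool.submit(() -> {
                    long s = 0;
                    for (long i = from; i <= to; i++) {
                        s += i;
                    }
                    return s;
                }));
            }
            // Combine the partial sums once every worker has finished.
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // One worker per processor the JVM can see.
        int workers = Runtime.getRuntime().availableProcessors();
        System.out.println(sum(1_000_000L, workers));
    }
}
```

The code never hard-codes a thread count, so the same program simply uses more workers when it lands on a machine with more hardware threads.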
On the hardware side, Fowler sees dedicated cores, or groups of cores, that are set aside to perform specific tasks. However, you need to look beyond the server, even the desktop, and into the mobile world to truly understand the future of multi-threading. For instance, imagine a cell phone or PDA that uses a mobile multi-core processor, where cores are so readily available that one is dedicated to power management, another to network communication, another to video, another to audio, and a remaining pool is divided amongst OS and application threads. Such a device, with all of these processing units on one piece of silicon, would save on complexity, design and manufacturing costs, and power consumption. It would also result in a very powerful mobile device that's capable of a lot more than today's already powerful smart phones.
The combination of multi-threaded applications and server virtualization running on such highly multi-threaded hardware will result in highly utilized, high-performance parallel systems that represent a data-center-on-a-box solution. The result will be more compute power in ultra-small spaces with an ultra-low consumption of power compared to the data-centers of just a few years ago. And with tomorrow's multi-core mobile devices being as powerful as today's servers, that data-center might one day fit in your pocket.