Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.



The Future of Computing


Maintaining Performance

Well, there are many ways to maintain performance. The first one, exploitation of instruction-level parallelism, resulted in the superscalar processors that we see today. In theory, any modern CPU, whether from Intel, AMD, IBM, or Sun, can process and retire multiple instructions per cycle thanks to multiple parallel internal execution units. Funnily enough, instruction-level parallelism does not yet allow sustained performance of substantially more than 1 instruction per cycle (IPC) on general benchmarks, because memory latency and branch misprediction penalties stall even the fastest CPUs more than half the time (source: Intel). Only highly optimized tests or special-purpose code achieve the 3x to 5x performance boost that multiple execution units promise. Practical gains from architectural improvements in cache coherency or branch prediction generally amount to a mere 5 percent. Long multistage execution pipelines, developed to achieve higher clock speeds, combined with inadequate memory performance to create a situation in which the CPU can process data faster than the data can be supplied. So the trend toward higher clock speeds has already reversed in favor of shorter pipelines and better memory throughput. The best example of pipeline shortening is the UltraSparc T1 processor with its six-stage pipeline, as opposed to the 31-stage pipeline of Pentium 4 models (the Athlon XP has a 10-stage pipeline, and Intel's new "Woodcrest" server chip has only 14 stages). Extrapolating the trend, it is reasonable to expect CPU frequency to remain roughly the same while CPU performance increases through pipeline shortening and an emphasis on memory subsystem improvements.
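To make the stall argument concrete, here is a rough back-of-the-envelope sketch using the textbook cycles-per-instruction model: ideal CPI plus memory stall cycles plus branch stall cycles. The function name and every workload number below (memory references per instruction, miss rate, miss penalty, branch frequency, misprediction rate) are illustrative assumptions, not measurements, and the model simplifies by treating the misprediction penalty as equal to the pipeline depth.

```python
def effective_ipc(issue_width, mem_refs_per_instr, miss_rate, miss_penalty,
                  branches_per_instr, mispredict_rate, pipeline_depth):
    """Estimate sustained IPC from a simple additive stall model.

    All parameters are hypothetical inputs chosen for illustration.
    The branch misprediction penalty is approximated by the pipeline
    depth, since a mispredicted branch forces the pipeline to refill.
    """
    base_cpi = 1.0 / issue_width                      # ideal, stall-free CPI
    memory_stall_cpi = mem_refs_per_instr * miss_rate * miss_penalty
    branch_stall_cpi = branches_per_instr * mispredict_rate * pipeline_depth
    return 1.0 / (base_cpi + memory_stall_cpi + branch_stall_cpi)


# Assumed workload: 3-wide core, 0.3 memory references per instruction,
# 2% cache miss rate, 200-cycle miss penalty, 0.2 branches per
# instruction, 5% of branches mispredicted.
workload = dict(issue_width=3, mem_refs_per_instr=0.3, miss_rate=0.02,
                miss_penalty=200, branches_per_instr=0.2,
                mispredict_rate=0.05)

deep = effective_ipc(pipeline_depth=31, **workload)     # Pentium 4 class depth
shallow = effective_ipc(pipeline_depth=14, **workload)  # Woodcrest class depth

print(f"31-stage pipeline: {deep:.2f} IPC")
print(f"14-stage pipeline: {shallow:.2f} IPC")
```

Under these assumed numbers, both configurations land well below 1 IPC despite a nominal issue width of 3, and the shorter pipeline comes out ahead only because each misprediction costs fewer cycles. The memory term dominates, which is consistent with the emphasis on memory subsystem performance.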

Still, there is a hard limit to instruction-level parallelism, which makes it difficult in practice to keep the individual execution units inside a CPU busy. Thus, two alternative approaches to improving CPU efficiency are currently being pursued. One approach is super-threading (or Hyper-Threading, to use Intel's term), which allows a CPU to process several threads in parallel, switching from one thread to another when a stall occurs. The UltraSparc T1 takes this approach to the extreme by executing four threads on each core (32 threads on an 8-core chip), switching threads in round-robin fashion and whenever a stall occurs. While super-threading certainly boosts the performance of multithreaded applications, speculative threading is being pursued to improve the performance of critical single-threaded applications. Intel is heavily involved in speculative threading research and offers its Mitosis technology, which, with the help of compilers, designates the threads most suitable for speculative execution. AMD is developing similar technology, although the company is more tight-lipped about it. Still, many rumors are circulating about AMD's clandestine "inverse hyper-threading" technology, allegedly capable of uniting two individual CPU cores into a single super-core that would crunch single-threaded applications with a considerable performance boost. Yet the only evidence of AMD's involvement with speculative threading that has surfaced so far is the infamous U.S. patent #6,574,725, which looks like hardware support for speculative threading in the vein of Intel's Mitosis. So, with clock-speed increases effectively curbed by power consumption concerns, the most likely performance gains on upcoming CPUs will come from super-threading (server chips) and speculative threading (desktop chips).
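The effect of switching threads on a stall can be illustrated with a toy simulation. This is only a sketch under strong assumptions, not a model of any real core: each hypothetical thread issues a fixed number of instructions, then stalls for a fixed number of cycles (say, waiting on memory), and the core issues at most one instruction per cycle from the next ready thread in round-robin order. The function name and the cycle counts are made up for illustration.

```python
def core_utilization(num_threads, compute_cycles, stall_cycles,
                     total_cycles=10_000):
    """Fraction of cycles in which a single-issue core does useful work.

    Toy model (all parameters are assumptions): every thread runs
    `compute_cycles` instructions, then stalls for `stall_cycles`.
    Each cycle the core picks the next ready thread in round-robin
    order; if no thread is ready, the cycle is wasted.
    """
    ready_at = [0] * num_threads                 # first cycle each thread may issue
    remaining = [compute_cycles] * num_threads   # instructions left before next stall
    busy = 0
    next_thread = 0
    for cycle in range(total_cycles):
        for offset in range(num_threads):
            t = (next_thread + offset) % num_threads
            if ready_at[t] <= cycle:
                busy += 1
                remaining[t] -= 1
                if remaining[t] == 0:
                    # Thread stalls; it becomes ready again after the stall.
                    ready_at[t] = cycle + 1 + stall_cycles
                    remaining[t] = compute_cycles
                next_thread = (t + 1) % num_threads  # rotate to the next thread
                break
    return busy / total_cycles


# Assumed workload: 1 instruction of work followed by a 3-cycle stall.
for threads in (1, 2, 4):
    print(threads, "thread(s):", core_utilization(threads, 1, 3))
```

With one thread the core idles during every stall; with enough threads to cover the stall length, some thread is always ready and the execution unit stays busy. Real workloads have irregular stalls, so actual gains depend heavily on the application.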

There is another approach to boosting instruction-level parallelism, which has been pursued on and off by various commercial and government entities: very long instruction word (VLIW) or explicitly parallel instruction computing (EPIC). The first successful application of the VLIW concept can be traced back to the early 1980s, when a group of Russian engineers led by Boris Babayan (who is now an Intel fellow) developed a series of Elbrus supercomputers that were produced as part of the anti-ballistic missile defense system deployed around Moscow. The massive performance gains afforded by proper application of the VLIW concept allowed Elbrus machines to overcome manufacturing and technological limitations and beautifully serve their purpose. Remember that these were special-purpose computers running hand-optimized code.

