Sequential Programming: Like Eating Peas with a Straw.

Before the era of multicore chips, performance gains in CPUs were achieved by a combination of ever-increasing clock speed and architectural enhancements. This resulted in more and more power being consumed by the processor -- a situation that could not continue forever.

Something's cooking

I remember a tongue-in-cheek competition, 'alternative uses of the Pentium', that came up with some entertaining suggestions. The winner suggested wiring four Pentiums together and using them as a cooker hob. Very amusing!

The multicore race is here

Rather than making processors faster and faster, the accepted practice now is to get extra performance by adding more cores. Recently I read a news item that said a start-up company was proposing to make the first 100 core CPU -- claiming that they would 'pip Intel to the post.' So rather than the MHz race we had in the '80s and '90s, we are now entering the multicore race. It seems that the multicore era is here to stay -- so we'd better get used to the idea.

Back in 2007, Intel announced its 80 core research chip to the public. It made some programmers wonder how on earth a program could be written to take advantage of so many cores. Writing for 2 or 4 cores seemed manageable, but 80 cores seemed unimaginable.

Figure 1: Intel's 80 core research chip (circa 2007).

Speed (GHz)    Power (Watts)    Perf. (Teraflops)
3.16           62               1.01
5.1            175              1.63
5.7            265              1.81

Figure 2: Performance of the 80 core chip.
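
The Figure 2 numbers also illustrate the power problem mentioned at the start: going from 3.16 GHz to 5.7 GHz costs more than four times the power for less than twice the performance. Here is a rough sketch of that arithmetic in Python. The efficiency metric (gigaflops per watt) and the function names are my own choices for illustration, not something taken from Intel's published results.

```python
# Figures copied from Figure 2: (clock in GHz, power in watts, performance in teraflops).
MEASUREMENTS = [
    (3.16, 62, 1.01),
    (5.1, 175, 1.63),
    (5.7, 265, 1.81),
]


def gflops_per_watt(teraflops, watts):
    """Convert teraflops to gigaflops, then divide by the power draw."""
    return teraflops * 1000 / watts


def efficiency_table(rows):
    """Return (GHz, gigaflops per watt rounded to one decimal) for each row."""
    return [
        (ghz, round(gflops_per_watt(tflops, watts), 1))
        for ghz, watts, tflops in rows
    ]


if __name__ == "__main__":
    for ghz, eff in efficiency_table(MEASUREMENTS):
        print(f"{ghz} GHz: {eff} GFLOPS per watt")
```

On these figures the chip delivers roughly 16 gigaflops per watt at the lowest clock speed and under 7 at the highest, so pushing the clock buys performance at a steadily worsening price in power.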

The performance of this 80 core chip is over 1 Teraflop. Interestingly, the first teraflop computer, ASCI Red, was considerably bigger and was decommissioned in 2006.

Figure 3: ASCI Red, the first Teraflop computer. Try fitting this in your garage!

Software tools are the real challenge

The challenge for the programmer is how to write programs that take advantage of so many cores. Thankfully, companies such as Intel (did I say I work for them?) are putting huge efforts and resources into enabling parallel programming. The Intel Parallel Studio is an example of a tool suite that can be used to write parallel applications. It's good that the semiconductor industry is taking a lead in developing software tools as well as silicon; otherwise programming for these newer devices would be something akin to eating peas with a straw -- entirely possible, but not very efficient.

 

