Keeping Parallelism Balanced is a Must
It is very important to optimize applications for existing and forthcoming multicore microprocessors. However, a lack of balance in parallelism levels could lead to explosive parallelism with resulting slowdown rather than speedup.
An old Chinese proverb says: "An army of a thousand is easy to find, but, ah, how difficult to find a general."
Optimizing many small applications to exploit multicore microprocessors is not a difficult task. However, optimizing a large application for multicore while keeping its parallelism balanced is a complex job.
The real problem begins when an application architecture has many layers running on the same computer. It is very common to have independent developers working on each layer. If each group of developers tries to exploit multicore without strict global design guidelines, each layer will create its own concurrent processes, tasks, actors, threads, etc., sized according to the number of available hardware threads (logical cores).
Besides, sometimes, you do not have control over the components, libraries and frameworks employed to develop a large application. You have to consider the optimizations introduced in these external reusable components.
For example, if you have an application that uses an embedded database engine, you could optimize it for multicore microprocessors by launching many queries concurrently, each in a new independent thread. If the embedded database engine, running on the same computer, does not perform multicore optimizations of its own, you can achieve better performance this way.
However, if a new version of this embedded database engine adds multicore optimizations to solve queries, your application will have an over-optimization problem. On the one hand, your application creates threads to run queries in parallel. On the other hand, the embedded database engine creates many threads to run each individual query in parallel. The number of threads is now roughly the product of the two, far more than the hardware threads available, so you could get a slowdown rather than a speedup. Why? Because each layer tries to parallelize its own work, assuming that the other layer runs sequential code.
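The arithmetic behind this scenario can be sketched in a few lines. This is only an illustration, not a measurement: the function names are hypothetical, and it assumes the worst case in which every application thread triggers a full set of engine threads at the same time.

```python
def total_runnable_threads(outer_workers: int, inner_workers: int) -> int:
    """Worst-case runnable threads when each outer worker starts its own
    inner pool (assumption: outer workers block while inner workers run)."""
    return outer_workers * inner_workers


def oversubscription_factor(outer_workers: int, inner_workers: int,
                            hardware_threads: int) -> float:
    """Runnable threads per hardware thread; values above 1.0 mean the
    scheduler has to time-slice threads across the logical cores."""
    if hardware_threads < 1:
        raise ValueError("hardware_threads must be at least 1")
    return total_runnable_threads(outer_workers, inner_workers) / hardware_threads
```

For example, on a machine with 8 hardware threads where both the application and the engine size their pools to 8, `total_runnable_threads(8, 8)` gives 64 and `oversubscription_factor(8, 8, 8)` gives 8.0: eight runnable threads competing for each logical core.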
Therefore, it is very important to keep parallelism balanced. A software engineer must consider the parallel optimizations found in each component, library and framework employed to develop a large application. There has to be a parallel optimization plan. Groups and developers have to know their multicore optimization goals before going parallel.
If you leave multicore optimizations to each developer's discretion, the applications will not take advantage of multicore microprocessors as expected. Parallelization requires good design, planning and clear rules. You cannot optimize an application for multicore without well-designed rules. Each developer has to understand their concurrency restrictions.
Some code will be optimized at a higher level. If you run queries in parallel at the application level, the individual queries do not need their own multicore optimizations. However, the right choice depends on the number of available hardware threads (logical cores), the global design and the scalability goals. There isn't a silver bullet. Each application should have its own multicore optimization goals and guidelines.
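One possible guideline of this kind is a single thread budget that the higher level divides between itself and the layer below. The following is a minimal sketch under stated assumptions, not a recommended policy: `split_thread_budget` and `run_queries` are hypothetical names, and `run_query(query, inner_workers)` stands in for whatever call your engine exposes, assuming it accepts a limit on its internal parallelism.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def split_thread_budget(hardware_threads: int, outer_tasks: int) -> Tuple[int, int]:
    """Return (outer_workers, inner_workers) such that
    outer_workers * inner_workers never exceeds hardware_threads."""
    if hardware_threads < 1 or outer_tasks < 1:
        raise ValueError("both arguments must be at least 1")
    # Parallelize at the higher level first, up to the hardware limit.
    outer_workers = min(outer_tasks, hardware_threads)
    # Whatever is left per outer worker goes to the lower layer.
    inner_workers = max(1, hardware_threads // outer_workers)
    return outer_workers, inner_workers


def run_queries(queries, run_query, hardware_threads=None):
    """Run queries concurrently, passing each one its inner thread budget.

    run_query is a hypothetical callable: run_query(query, inner_workers).
    """
    queries = list(queries)
    if not queries:
        return []
    budget = hardware_threads or os.cpu_count() or 1
    outer_workers, inner_workers = split_thread_budget(budget, len(queries))
    with ThreadPoolExecutor(max_workers=outer_workers) as pool:
        # map() returns results in the same order as the input queries.
        return list(pool.map(lambda q: run_query(q, inner_workers), queries))
```

With 8 hardware threads and 100 queries, the split is 8 outer workers and 1 inner worker each, so the queries themselves run sequentially. With only 2 queries, the split is 2 outer workers with 4 inner workers each, which leaves room for the engine to parallelize every query.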
There are many libraries, frameworks and components offering new multicore optimizations. Thus, it is very important to consider the aforementioned example. If you have a large application and you replace an old component with a new one, you have to take into account its multicore optimizations before starting to parallelize your code.
Beware of explosive parallelism. It is as dangerous as the lack of parallelism.



