If you want to understand modern multicore hardware, it is a good idea to run CPU-Z, the legendary freeware utility that gathers information on the microprocessor, mainboard, memory, and graphics.
While developing software optimized for parallelism and multicore, you need answers to questions such as:
- Does the microprocessor support SSE 4.1?
- Does it support SSE 4.2?
- Does it offer virtualization extensions?
- How many physical and logical cores does it offer?
- Does it offer a Level 3 cache?
- How many Level 2 caches does it offer?
- What is the speed of the QPI (QuickPath Interconnect) link?
CPU-Z 1.53 answers these questions in a simple freeware application. The new version is compatible with both 64-bit and 32-bit versions of modern Windows, including Windows 7 and Windows Server 2008 R2. It is also available in Chinese.
You can install CPU-Z or run it without installation. Once you run the application, it takes a few seconds to detect the underlying hardware and then displays a window with many tabs. The first tab, CPU, offers valuable information about the microprocessor(s) found in the computer. For example, Figure 1 shows the information for an Intel Core i7 820QM (mobile Core i7):
As you can see in Figure 1, there is valuable information about the supported instruction sets:
Instructions MMX, SSE(1, 2, 3, 3S, 4.1, 4.2), EM64T, VT-x
This line means that the microprocessor offers MMX, SSE up to SSE 4.2, EM64T (x86-64), and the Intel virtualization extensions (VT-x). The HTML report that you can save with this utility shows a more complete line:
Instructions sets MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, EM64T, VT-x
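If you save the report and want to check the instruction sets from a script, the two lines above can be normalized into the same set of names. The following is only a minimal sketch: the function name `parse_instruction_sets` is hypothetical, and it assumes the line layout is exactly the one quoted above (other CPU-Z versions may format the line differently).

```python
import re

# Mapping from the compact suffixes in "SSE(1, 2, 3, 3S, 4.1, 4.2)" to the
# full names, taken from the two lines quoted in the text.
_SSE_SUFFIXES = {
    "1": "SSE", "2": "SSE2", "3": "SSE3",
    "3S": "SSSE3", "4.1": "SSE4.1", "4.2": "SSE4.2",
}


def parse_instruction_sets(line):
    """Return the set of instruction-set names found in an 'Instructions' line.

    Hypothetical helper: assumes the format shown in the article.
    """
    # Drop the leading label ("Instructions" or "Instructions sets").
    body = re.sub(r"^\s*Instructions(\s+sets)?\s+", "", line)
    names = set()

    # Expand the compact "SSE(...)" group, if the line uses it.
    match = re.search(r"SSE\s*\(([^)]*)\)", body)
    if match:
        for suffix in match.group(1).split(","):
            suffix = suffix.strip()
            names.add(_SSE_SUFFIXES.get(suffix, "SSE" + suffix))
        body = body[:match.start()] + body[match.end():]

    # Everything else is a plain comma-separated list of names.
    for token in body.split(","):
        token = token.strip()
        if token:
            names.add(token)
    return names


tab_line = "Instructions MMX, SSE(1, 2, 3, 3S, 4.1, 4.2), EM64T, VT-x"
features = parse_instruction_sets(tab_line)
print("SSE4.2" in features)  # True
print("VT-x" in features)    # True
```

Normalizing to a set makes a question such as "Does it support SSE 4.2?" a simple membership test.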
As you can see in the same tab, the processor offers 4 physical cores (Cores) and 8 logical cores or hardware threads (Threads). Remember that the number of graphs shown in Windows Task Manager corresponds to the number of hardware threads, as in Figure 2:
If you have more than one microprocessor installed in the computer, you can see the information for the other processors by choosing one in the Selection combo box.
The clock information shown corresponds to the first core (Core #0). Because many new microprocessors, such as the Core i7, can change the multiplier individually per core, this information does not necessarily apply to all the cores offered by the Intel Core i7 820QM.
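The relationship between cores and hardware threads described above can be checked from code as well. The sketch below is illustrative only: `threads_per_core` is a hypothetical helper, and it relies on Python's standard `os.cpu_count()`, which reports logical processors for the whole system (it may return `None` when the count cannot be determined). The standard library does not report physical cores, so that figure still has to come from a tool such as CPU-Z.

```python
import os


def threads_per_core(logical_cores, physical_cores):
    """Hardware threads per physical core (hypothetical helper)."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return logical_cores / physical_cores


# Figures reported by CPU-Z for the Core i7 820QM in the text above.
print(threads_per_core(8, 4))  # 2.0

# Logical processors visible to the operating system on this machine.
logical_here = os.cpu_count()
print("Logical processors on this machine:", logical_here)
```

A result of 2.0 means each physical core exposes two hardware threads, which is why Task Manager shows eight graphs for a four-core processor.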
The speed for the QPI link is useful when you're trying to determine potential bottlenecks for multicore designs.
The CPU tab also summarizes the cache memories included in each microprocessor. However, there is more detailed information in the Caches tab, as in Figure 3:
In this case, as you can see, there is an 8-megabyte level 3 cache shared by the four physical cores. In addition, each core has its own 256-kilobyte (0.25-megabyte) level 2 cache, which is why CPU-Z adds an x4 to the cache size.
The tab also offers detailed information about the level 1 data and instruction caches, including their number of ways and line size.
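The cache figures above lend themselves to some quick arithmetic. The sketch below is a rough illustration, not something CPU-Z produces: the helper names are hypothetical, and the level 1 values (32 KB, 8 ways, 64-byte lines) are assumed for the example rather than taken from the article. It uses the standard relation sets = size / (ways x line size).

```python
KIB = 1024
MIB = 1024 * KIB


def total_private_cache(per_core_bytes, cores):
    """Combined size of a cache that every core has its own copy of."""
    return per_core_bytes * cores


def cache_sets(size_bytes, ways, line_bytes):
    """Number of sets in a set-associative cache."""
    if size_bytes % (ways * line_bytes) != 0:
        raise ValueError("size must be a multiple of ways * line size")
    return size_bytes // (ways * line_bytes)


# Level 2: 256 KB per core, four cores (the "x4" shown by CPU-Z).
l2_total = total_private_cache(256 * KIB, 4)
print(l2_total // MIB)  # 1

# Level 1 data cache with ASSUMED values: 32 KB, 8-way, 64-byte lines.
print(cache_sets(32 * KIB, 8, 64))  # 64
```

Note that the 1 MB total is spread across four private caches, so a single thread can count on only 256 KB of level 2 cache before it falls back to the shared level 3 cache.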
This information is important for gauging the multicore power offered by the underlying hardware and the additional instructions available for parallel execution. Sometimes, a small change in the micro-architecture produces significant speedups in certain parallelized algorithms, so it pays to take the underlying hardware into account.
CPU-Z also offers detailed information about the mainboard, the memory subsystem, and the graphics card. The memory timings matter when all three cache levels fail to deliver your data, because main-memory latencies remain high relative to the speed gains in microprocessor architecture. This is one of the most important problems that you are likely to meet while trying to optimize your algorithms as much as possible on modern multicore hardware.
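To get a feel for what the memory timings mean, you can convert a CAS latency expressed in memory clock cycles into nanoseconds and then into processor cycles. The sketch below uses hypothetical round numbers (CL 8 at a 500 MHz DRAM clock, a 2.0 GHz core) that are not taken from the article, and the helper names are illustrative. CAS latency is also only one component of the total delay of a cache miss, so treat the result as a lower bound rather than a measurement.

```python
def cas_latency_ns(cl_cycles, dram_clock_mhz):
    """CAS latency in nanoseconds for a given DRAM clock in MHz."""
    if dram_clock_mhz <= 0:
        raise ValueError("dram_clock_mhz must be positive")
    # One cycle at f MHz lasts 1000 / f nanoseconds.
    return cl_cycles * 1000.0 / dram_clock_mhz


def ns_to_core_cycles(latency_ns, core_clock_ghz):
    """Processor cycles that elapse during latency_ns at the given core clock."""
    return latency_ns * core_clock_ghz


# HYPOTHETICAL values, chosen only to keep the arithmetic easy to follow.
latency = cas_latency_ns(8, 500.0)
print(latency)                          # 16.0
print(ns_to_core_cycles(latency, 2.0))  # 32.0
```

Even with these modest numbers, the core spends dozens of cycles waiting for a single access that misses every cache level, which is why data layout and cache reuse matter so much in parallel code.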
The paradigm shift required to translate multicore power into application performance obliges developers to learn about the hardware that runs their software and systems. Tools like CPU-Z are a great way to understand the features offered by most modern microprocessors and micro-architectures that power different Windows versions.