Intel Adds Features to Parallel Universe Portal
Intel has improved its free cloud service for measuring the scalability of parallelized applications. Architects and developers interested in multicore optimization will welcome the new capabilities.
A few weeks ago, I published a post with a detailed step-by-step explanation of how to use the Intel Parallel Universe Portal to detect scalability problems. The service gives you access to a powerful multicore server in the cloud, and its forum began receiving suggestions from users around the world. I also posted my own ideas for new features and improvements in the forum. In fact, many of them are part of the service's new version.
The most common problem was that an application could take more than 1 minute to run with all the different configurations, which use 1, 2, 4, 8, and 16 logical cores. Therefore, the execution time limit has been raised to 2 minutes. In addition, premium users can get a 5-minute limit.
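To make the idea of measuring scalability across 1, 2, 4, 8, and 16 logical cores more concrete, here is a minimal C++ sketch that turns a set of execution times into speedup and efficiency figures. The `Run` and `Scaling` types and the `compute_scaling` function are hypothetical names chosen for illustration. They are not part of the portal or of any Intel tool, and the sketch assumes the first measurement is the single-core baseline.

```cpp
#include <cassert>
#include <vector>

// One measurement: how many logical cores were used and how long the run took.
// (Hypothetical type for illustration; not part of the portal.)
struct Run {
    int cores;
    double seconds;
};

// Derived scalability figures for one run, relative to the 1-core baseline.
struct Scaling {
    int cores;
    double speedup;     // baseline seconds divided by this run's seconds
    double efficiency;  // speedup divided by cores; 1.0 means linear scaling
};

// Computes speedup and efficiency for every run.
// Assumption: the first entry is the run that used 1 logical core.
std::vector<Scaling> compute_scaling(const std::vector<Run>& runs) {
    assert(!runs.empty() && runs.front().cores == 1);
    const double baseline = runs.front().seconds;
    std::vector<Scaling> result;
    result.reserve(runs.size());
    for (const Run& r : runs) {
        const double speedup = baseline / r.seconds;
        result.push_back({r.cores, speedup, speedup / r.cores});
    }
    return result;
}
```

An efficiency that drops sharply as the core count grows is usually the first hint that the application does not scale, which is the kind of problem the portal's reports are meant to expose.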
If your application cannot run with all the different configurations within the execution time limit, you can now access a partial report. This is very useful because it is sometimes difficult to estimate an accurate execution time for certain complex algorithms on unknown hardware configurations. Remember, too, that unexpected bugs introduced by unexplored concurrency and parallelism conditions could appear. The partial report shows information about the hardware configurations in which the application completed its execution before reaching the time limit, so you still get useful data from a partial execution and can understand the problems in certain hardware configurations. In the previous version, you had to change your code until it could complete its execution in less than 1 minute.
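The reasoning behind a partial report can be illustrated with a small C++ sketch. It is only a model under stated assumptions, not the portal's actual implementation: it assumes the configurations run one after another and share a single time limit. The `ConfigRun` type and the `completed_within_limit` function are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Execution time of the application on one hardware configuration.
// (Hypothetical type for illustration; not part of the portal.)
struct ConfigRun {
    int cores;
    double seconds;
};

// Returns the core counts of the configurations that finish before the
// shared time limit is reached.
// Assumption: the runs execute sequentially, in the order given, and the
// limit applies to their accumulated execution time.
std::vector<int> completed_within_limit(const std::vector<ConfigRun>& runs,
                                        double limit_seconds) {
    assert(limit_seconds > 0.0);
    std::vector<int> completed;
    double elapsed = 0.0;
    for (const ConfigRun& r : runs) {
        elapsed += r.seconds;
        if (elapsed > limit_seconds) {
            break;  // This run and all later ones are missing from the report.
        }
        completed.push_back(r.cores);
    }
    return completed;
}
```

Under this model, runs of 70, 38, 22, 15, and 12 seconds against a 120-second limit would leave only the 1-core and 2-core configurations in the partial report, because the accumulated time passes the limit during the 4-core run.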
Now, it is very easy to compare many reports. You just have to select their checkboxes and then click the Compare button. The new report comparison function lets you see the differences between sessions, as shown in Figures 1 and 2.
Figure 1: A report comparison displaying the graphs with the results of two different sessions using different colors.
Figure 2: A report comparison displaying the charts with the results of two different sessions.
The user interface is simple to use, and you can show or hide sessions in the graphs just by clicking the corresponding checkboxes. You can compare many sessions and check the results for many different versions of the same algorithms. However, the generated Parallel Amplifier report will always give you more detailed information that you can use to improve the design and the code. The comparison is also useful for evaluating the results produced by different threading libraries or task-based programming alternatives for C/C++.
Other improvements allow you to clean up your sessions, which is especially useful when you run dozens of applications. Now, you can remove old sessions. This way, you don't have to handle reports for more than 100 sessions after a few weeks of using this service.
You can also cancel pending jobs at any time before their scheduled execution starts.
All these new features and improvements make this service even more attractive to those trying to translate multicore power into application performance using C/C++. There is also great interest in future versions supporting additional programming languages.