Parallel

High-Performance Computing: RAM versus CPU

By Michael A. Schulman, April 30, 2007

HPC users are still demanding more performance as they try to solve more complex problems and desire faster turnaround.

Real-Life Applications

In Mechanical Computer-Aided Engineering (MCAE), techniques have been developed to use memory efficiently and keep the OS from swapping. Simulating a car crash on a single thin node, for example, takes a significant amount of memory. To get maximum performance, the entire model has to be loaded into memory: the small elements ("finite elements") into which the solid car definition is broken up, the material properties, the external forces, and the like. Depending on the resolution of the model (basically, how many elements are created and how many time steps are solved for), a tremendous amount of memory can be required. Problems like this can be solved either on one system, or broken into smaller parts and solved across a number of horizontally scaled systems.
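As a rough illustration of the sizing question, the sketch below estimates whether a model fits in one node's RAM and, if not, how many nodes would be needed. The per-element size, the fraction of RAM reserved for the OS, and the decomposition overhead are all hypothetical placeholders, not figures from any real crash code; measured values from the actual application should replace them.

```python
import math

GIB = 2 ** 30  # bytes in one gibibyte


def model_bytes(num_elements, bytes_per_element):
    """Estimated in-memory size of the model (hypothetical linear cost per element)."""
    return num_elements * bytes_per_element


def fits_in_ram(num_elements, bytes_per_element, ram_bytes, os_reserve=0.1):
    """True if the whole model fits in one node without pushing the OS into swap.

    os_reserve is an assumed fraction of RAM left for the OS and other processes.
    """
    usable = ram_bytes * (1 - os_reserve)
    return model_bytes(num_elements, bytes_per_element) <= usable


def min_nodes(num_elements, bytes_per_element, ram_bytes,
              overhead=0.05, os_reserve=0.1):
    """Smallest node count whose combined usable RAM holds the model.

    overhead is an assumed extra fraction of memory for duplicated boundary data
    once the model is split across nodes.
    """
    if fits_in_ram(num_elements, bytes_per_element, ram_bytes, os_reserve):
        return 1
    usable = ram_bytes * (1 - os_reserve)
    total = model_bytes(num_elements, bytes_per_element) * (1 + overhead)
    return math.ceil(total / usable)


if __name__ == "__main__":
    # Hypothetical model: 10 million elements at 2,000 bytes each, 4 GiB nodes.
    print(fits_in_ram(10_000_000, 2_000, 4 * GIB))
    print(min_nodes(10_000_000, 2_000, 4 * GIB))
```

The point of the calculation is only that element count drives the footprint; once the estimate exceeds usable RAM, the choice is between a larger node and more nodes.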

Application developers have been able to write software that runs across a number of systems. The problem is divided into smaller pieces, with each piece placed on a different computer. For example, the hood of the car can be simulated on one system, the engine compartment on another, the roof on a third, and so on. Each system needs to communicate its boundary conditions to the systems handling the neighboring pieces, which requires fast communication between nodes. The benefit of this approach is that the simulation runs in much less time, since a number of CPUs work on it in parallel. Also, since each node or core gets only part of the data, the memory requirement on each node is lower. In total, the memory requirements will be similar to (or slightly more than) those of running on one system, but the time to solution will be considerably less. In an era of high competitiveness in many markets, the value of getting results sooner outweighs the additional cost of purchasing more computer systems.
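The decomposition described above can be sketched with a toy one-dimensional diffusion problem standing in for the crash model. This is not how LS-DYNA or any particular MCAE code is implemented; it is a minimal sketch in which the "systems" are plain lists inside one process, and all function names are hypothetical. Each sub-domain keeps one extra "halo" cell on each side that holds its neighbor's boundary value, and those values are exchanged after every time step. In a real cluster the exchange would go over the interconnect through a message-passing library such as MPI.

```python
def step(u, alpha=0.25):
    """One explicit diffusion step; the first and last cells are held fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def solve_single(u, steps):
    """Reference solution: the whole problem on one 'system'."""
    for _ in range(steps):
        u = step(u)
    return u


def decompose(u, parts):
    """Split the interior cells into contiguous chunks.

    Each chunk is padded with one halo cell per side, so a chunk only needs
    its own cells plus its neighbors' boundary values.
    """
    n_interior = len(u) - 2
    bounds = [1 + (n_interior * k) // parts for k in range(parts + 1)]
    return [u[bounds[k] - 1:bounds[k + 1] + 1] for k in range(parts)]


def exchange_halos(subdomains):
    """Copy each sub-domain's edge cell into its neighbor's halo cell."""
    for left, right in zip(subdomains, subdomains[1:]):
        left[-1] = right[1]   # right neighbor's first owned cell
        right[0] = left[-2]   # left neighbor's last owned cell


def solve_decomposed(u, steps, parts):
    """Same problem, split into `parts` pieces that exchange boundary values."""
    subdomains = decompose(u, parts)
    for _ in range(steps):
        subdomains = [step(s) for s in subdomains]  # independent, so parallelizable
        exchange_halos(subdomains)
    result = [u[0]]
    for s in subdomains:
        result.extend(s[1:-1])  # drop halos when gathering the answer
    result.append(u[-1])
    return result


if __name__ == "__main__":
    initial = [100.0] + [0.0] * 19
    whole = solve_single(initial, 50)
    pieces = solve_decomposed(initial, 50, 4)
    print(max(abs(a - b) for a, b in zip(whole, pieces)))
    print(len(initial), [len(s) for s in decompose(initial, 4)])
```

Each piece holds fewer cells than the full problem, but the halo cells are stored twice, which is why the total memory across all nodes ends up slightly higher than on a single system.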


Figure 2: Results from Scaling of an MCAE application.

An example of a crash analysis CAE code is LS-DYNA from Livermore Software Technology Corporation (LSTC). Customers or systems vendors can run one of its test cases, called "Neon Refined," across various machine types and compare different parameters. Figure 2 shows how an MCAE application scales across multiple machines. The system used for this example was the Sun Fire X2100, which contains one socket per system. By dividing the problem and using the memory associated with each node, the LS-DYNA run scales very well.

Computational Fluid Dynamics (CFD) is another area where running an application in a horizontal scaling environment pays off. Since the simulation can be broken up and solved in parallel, the scaling can be very good and even slightly "super-linear" (see Figure 3). This typically happens because multiple CPUs or cores together contain more cache than a single core, so more of the data can be held close to the computational unit. Overall processing is then faster, since there are fewer accesses to main memory.
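To make "super-linear" concrete: speedup is the single-node run time divided by the parallel run time, and parallel efficiency is that speedup divided by the node count. Scaling is super-linear when efficiency exceeds 1. The short sketch below applies these definitions to made-up timings; the numbers are hypothetical and are not taken from Figure 3.

```python
def speedup(t_single, t_parallel):
    """How many times faster the parallel run is than the single-node run."""
    return t_single / t_parallel


def efficiency(t_single, t_parallel, nodes):
    """Speedup per node; a value above 1.0 means super-linear scaling."""
    return speedup(t_single, t_parallel) / nodes


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by node count.
    timings = {1: 1000.0, 2: 480.0, 4: 235.0, 8: 130.0}

    for nodes, seconds in sorted(timings.items()):
        s = speedup(timings[1], seconds)
        e = efficiency(timings[1], seconds, nodes)
        label = "super-linear" if e > 1.0 else "linear or below"
        print(f"{nodes} nodes: speedup {s:.2f}, efficiency {e:.2f} ({label})")
```

With these invented timings the smaller node counts come out super-linear while the largest does not, which is the usual pattern: the cache benefit appears once each node's share of the data shrinks, and communication cost eventually outweighs it.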


Figure 3: Scaling for a CFD example.

Conclusion

The amount of memory is critical to the overall performance of the system. The general rule is not to skimp on memory in order to afford the fastest CPU available. Instead, investigate whether paying more for RAM is more beneficial than paying more for a faster CPU. In addition, scaling horizontally makes more memory addressable in total, which may result in higher overall performance.
