Supercomputer Debugging: More Than Just Breakpoints
There's no question you could set a boatload of breakpoints with a 20-petaflop supercomputer, but you probably need more than that when developing parallelized software for it. You might also resort to fast conditional watchpoints, compiled expressions, asynchronous thread control, and full post-mortem debugging capabilities, and that's just for starters.
Those were just a few of the features scientists at the Lawrence Livermore National Lab (LLNL) were looking for when supporting scalable development efforts on the Advanced Simulation and Computing (ASC) Sequoia. At 20 petaflops, Sequoia will be 34 times as powerful as LLNL's current Blue Gene/L, giving scientists a lot more computing cycles for simulations and basic science research.
"Sequoia represents a major challenge to code developers as the multi-core era demands that we effectively absorb more cores and threads per MPI task," says LLNL's Mark Seager.
The Sequoia effort includes two generations of IBM Blue Gene supercomputers that will deliver the next generation of advanced systems being developed under the ASC program. ASC is a cornerstone of the National Nuclear Security Administration's (NNSA) program to ensure the safety, security, and reliability of the nation's nuclear deterrent without underground testing. These two Blue Gene systems are Dawn, a 500-teraflop system that was accepted by LLNL in March of 2009, and Sequoia, a 20-petaflop system based on future Blue Gene technology, slated for delivery in 2011.
To meet those needs, LLNL turned to TotalView Technologies, a developer of interactive analysis and debugging tools for serial and large-scale parallel software. TotalView is a source code analysis and memory error detection tool designed to simplify the debugging of parallel, data-intensive, multi-process, multi-threaded, or network-distributed applications. In short, its features let it scale to thousands of processes or threads, with applications distributed over multiple machines or processors.
However, the company isn't focused solely on advanced supercomputers like Sequoia; it also provides tools like MemoryScape 3.0 that support platforms like Apple's Mac OS X Snow Leopard. MemoryScape 3.0 introduces support on Snow Leopard for malloc zones, a mechanism for controlling multiple pools of memory on Mac OS X systems. Both the allocator and owner of all heap allocations can be tracked, displayed and used for filtering. MemoryScape also provides the capability of detecting and controlling low available memory conditions in the heap.
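To make the idea of zone and owner tracking concrete, here is a toy bookkeeping model in Python. It is only a sketch under stated assumptions: it does not call Apple's malloc zone API and says nothing about how MemoryScape works internally. The names ZoneTracker, Allocation, select, and low_memory, along with the 90 percent threshold, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Allocation:
    zone: str   # which memory pool the block came from
    owner: str  # hypothetical label for the code that requested the block
    size: int   # bytes requested


@dataclass
class ZoneTracker:
    """Toy model of per-zone heap bookkeeping (not a real allocator)."""
    heap_limit: int                           # assumed total heap budget, in bytes
    live: dict = field(default_factory=dict)  # block id -> Allocation
    next_id: int = 1

    def allocate(self, zone, owner, size):
        # Refuse requests that would push usage past the assumed heap budget.
        if self.bytes_in_use() + size > self.heap_limit:
            raise MemoryError("simulated heap exhausted")
        block_id = self.next_id
        self.next_id += 1
        self.live[block_id] = Allocation(zone, owner, size)
        return block_id

    def free(self, block_id):
        # Drop the record for a block that is no longer live.
        del self.live[block_id]

    def bytes_in_use(self):
        return sum(a.size for a in self.live.values())

    def select(self, zone=None, owner=None):
        # Filter live allocations by zone, by owner, or by both.
        return [
            a for a in self.live.values()
            if (zone is None or a.zone == zone)
            and (owner is None or a.owner == owner)
        ]

    def low_memory(self, threshold=0.9):
        # Flag a low-memory condition once usage reaches the threshold fraction.
        return self.bytes_in_use() >= threshold * self.heap_limit
```

In this sketch, a call such as tracker.select(zone="images") returns only the live blocks recorded against that pool, which is the kind of filtering the paragraph above describes, and low_memory() stands in for a low-available-memory check.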
Snow Leopard completes the Mac's transition to 64-bit, with all key system applications rewritten as 64-bit, enabling the Mac to address massive amounts of memory. Its Grand Central Dispatch handles threads for multicore processing at the operating system level, automatically distributing work across the available cores.

