A Real-Time HPC Approach for Optimizing Multicore Architectures

Case Studies

The following examples show how real-time HPC is being applied today, in many cases producing computational results that were not achievable only five years ago. All of these examples were developed with the LabVIEW programming language to take advantage of multicore technology.

  • The Donghai Bridge (Figure 17) is China's first sea-crossing bridge, stretching across the East China Sea and connecting Shanghai to Yangshan Island. The bridge has a full length of 32.50 km, a 25.32-km portion of which is above water.

    Figure 17 Donghai Bridge. (Source: Wikipedia Commons)

    The monitoring system for the Donghai Bridge is correspondingly large in scale, with a wide variety of quantities to be monitored and transmitted.

    Modal analysis methods can be used to reflect the dynamic properties of the bridge. In fact, modal analysis is a standard engineering practice in today's structural health monitoring (SHM).

    To cope with modal analysis of large structures such as bridges, however, a relatively new type of modal analysis has been developed that works with data gathered while the structure being analyzed is in service. This is operational modal analysis. In this method, no explicit stimulus signal is applied to the structure; rather, the natural forces from the environment and the work load applied to the structure serve as the stimuli, which are random and unknown. Only the signals measured by the sensors mounted on the structure are available, and these serve as the response signals. Within the operational modal analysis domain, one family of methods employs output-only system identification (in other words, time series analysis) techniques, namely stochastic subspace identification (SSI).

    To monitor a bridge's health status better, some informative quantities need to be tracked in real-time. In particular, it is highly desirable to monitor the resonance frequencies in real-time. The challenge is to calculate the resonance frequencies online, which is a topic of current research for a wide range of applications.

    For SSI methods to work online, SSI needs to be reformulated in a recursive fashion to reach the necessary computational efficiency. This is recursive stochastic subspace identification (RSSI). With RSSI, the multichannel sampled data are read and possibly decimated. The decimated data are then fed to the RSSI algorithm. Each time a new decimated data sample is fed in, a new set of resonance frequencies of the system under investigation is produced. That is, the resonance frequencies are updated as the data acquisition process goes on. If the RSSI algorithm is fast enough, this updating procedure runs in real-time. Although further experiments need to be performed to validate the RSSI method, the results so far have shown the feasibility and effectiveness of this method under the real-time requirement. With this method, the important resonance frequencies of the bridge can be tracked in real-time, which is necessary for better bridge health monitoring solutions.
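
    The last step of each RSSI update, turning an identified model into resonance frequencies, can be sketched in a few lines. The Python example below is an illustration only, not the LabVIEW implementation used for the bridge. It assumes that the recursive identification step (not shown) delivers one discrete-time eigenvalue per mode after every decimated sample; the names modal_parameters, track_frequencies, and dt (the decimated sample interval in seconds) are hypothetical. It relies on the standard relation between a discrete-time eigenvalue and the corresponding continuous-time pole, s = ln(lambda) / dt.

```python
import cmath
import math


def modal_parameters(eigenvalue, dt):
    """Convert one discrete-time eigenvalue into (natural frequency in Hz, damping ratio).

    Assumes the oscillation frequency is below the Nyquist limit 1 / (2 * dt),
    so the principal branch of the complex logarithm is the correct one.
    """
    s = cmath.log(eigenvalue) / dt      # continuous-time pole
    omega_n = abs(s)                    # undamped natural frequency in rad/s
    return omega_n / (2.0 * math.pi), -s.real / omega_n


def track_frequencies(eigenvalue_stream, dt):
    """Yield an updated resonance frequency after every model update.

    eigenvalue_stream is a hypothetical iterable that stands in for the
    output of the recursive identification step.
    """
    for eigenvalue in eigenvalue_stream:
        frequency_hz, _ = modal_parameters(eigenvalue, dt)
        yield frequency_hz
```

    Because the conversion is a handful of scalar operations per mode, the cost of an update is dominated by the recursive identification itself, which is where multicore processing would be expected to matter.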

  • In an autonomous vehicle application, TORC Technologies and Virginia Tech used LabVIEW to implement parallel processing while developing vision intelligence in their autonomous vehicle for the 2007 DARPA Urban Challenge. LabVIEW runs on two quad-core servers and performs the primary perception in the vehicle. This type of application is a clear example of where high computational performance must be delivered in an embedded form factor, not only to meet the demands of the application but also to fit within tight power constraints.

  • At the Max Planck Institute for Plasma Physics in Garching, Germany, researchers implemented a tokamak control system to more effectively confine plasma. For the primary processing, they developed a LabVIEW application that split up matrix multiplication operations using a data parallelism technique on an octal-core system.

    Dr. Louis Giannone, the lead researcher on the project, was able to speed up the matrix multiplication operations by a factor of five while meeting the 1-millisecond real-time control loop rate.
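
    The data parallelism technique mentioned above can be illustrated with a row-block decomposition. The Python sketch below is not the LabVIEW code used at the institute: the function names are hypothetical, and the default of eight workers is simply chosen to mirror the octal-core system. One matrix is cut into contiguous row blocks, each block is multiplied independently, and the partial results are concatenated in order.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply_rows(a_rows, b):
    """Multiply a horizontal slice of A by the full matrix B."""
    b_columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_columns]
            for row in a_rows]


def parallel_matmul(a, b, workers=8):
    """Split A into contiguous row blocks and multiply each block independently."""
    block = max(1, -(-len(a) // workers))      # ceiling division
    slices = [a[i:i + block] for i in range(0, len(a), block)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the blocks line up with the rows of A.
        partial = list(pool.map(multiply_rows, slices, [b] * len(slices)))
    return [row for rows in partial for row in rows]
```

    The sketch shows the decomposition only. Python threads do not run pure-Python arithmetic on several cores at once, so a real speedup would need a runtime that executes each block on its own core.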

  • The European Southern Observatory (ESO) is an astronomical research organization supported by 13 European countries, and has expertise in developing and deploying some of the world's most advanced telescopes. The organization is currently working on a USD 1 billion 66-antenna submillimeter telescope scheduled for completion at the Llano de Chajnantor in 2012.

    One current project on their design board is the Extremely Large Telescope. The design for this 42 m primary mirror diameter telescope is in phase B and received USD 100 million in funding for preliminary design and prototyping. After phase B, construction is expected to start in late 2010.

    The system, controlled by LabVIEW software, must read the sensors to determine the mirror segment locations and, if the segments move, use the actuators to realign them. LabVIEW computes a 3,000 by 6,000 matrix by 6,000 vector product and must complete this computation 500 to 1,000 times per second to produce effective mirror adjustments.

    Sensors and actuators also control the M4 adaptive mirror. However, M4 is a thin deformable mirror, 2.5 m in diameter and spread over 8,000 actuators. This problem is similar to the M1 active control, but instead of retaining the shape, we must adapt the shape based on measured wave front image data. The wave front data maps to a 14,000 value vector, and we must update the 8,000 actuators every few milliseconds, creating a matrix-vector multiply of an 8 k by 14 k control matrix by a 14 k vector. Rounding up the computational challenge to 9 k by 15 k, this requires about 15 times the large segmented M1 control computation.

    Jason Spyromilio from the European Southern Observatory describes the challenge as follows: "Our approach is to simulate the layout and design the control matrix and control loop. At the heart of all these operations is a very large LabVIEW matrix-vector function that executes the bulk of the computation. M1 and M4 control requires enormous computational ability, which we approached with multiple multicore systems. Because M4 control represents 15 3 k by 3 k submatrix problems, we require 15 machines that must contain as many cores as possible. Therefore, the control system must command multicore processing."

    Figure 18: Example Section of M1 Mirror, simulated in LabVIEW
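
    The sizes quoted for M1 and M4 can be turned into rough operation counts, and the split into submatrix problems can be sketched directly. The Python example below is a minimal illustration under stated assumptions, not the ESO implementation: the function names are hypothetical, a dense product is assumed to cost one multiply and one add per matrix element, and the blocks are processed in a plain loop where the real system would hand each one to a separate machine.

```python
def matvec_flops(rows, cols, rate_hz):
    """Floating-point operations per second for a dense rows x cols
    matrix-vector product repeated rate_hz times per second."""
    return 2 * rows * cols * rate_hz


def blocked_matvec(matrix, vector, block_rows, block_cols):
    """Compute matrix @ vector one submatrix at a time.

    Each (row block, column block) pair is an independent problem; partial
    results that share a row block are summed into the same output entries.
    Returns the result vector and the number of submatrix problems solved.
    """
    n_rows, n_cols = len(matrix), len(vector)
    result = [0.0] * n_rows
    blocks_used = 0
    for r0 in range(0, n_rows, block_rows):
        for c0 in range(0, n_cols, block_cols):
            blocks_used += 1
            for i in range(r0, min(r0 + block_rows, n_rows)):
                row = matrix[i]
                result[i] += sum(row[j] * vector[j]
                                 for j in range(c0, min(c0 + block_cols, n_cols)))
    return result, blocks_used
```

    Under these assumptions, matvec_flops(3000, 6000, 1000) gives 36,000,000,000, or about 36 billion operations per second for M1 at the upper update rate. Calling blocked_matvec on a 9 by 15 matrix with 3 by 3 blocks, a scaled-down stand-in for the 9 k by 15 k problem, reports 15 submatrix problems, which matches the count in the quotation.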

  • Over the last 15 years, passive safety technologies such as ABS, electronic stability control, and front/side airbags have become ubiquitous features on a wide range of passenger vehicles and trucks. The adoption of these technologies has greatly accelerated the use of simulation software in vehicle engineering. Using a combination of CarSim (Mechanical Simulation's internationally validated, high-fidelity simulation software) and LabVIEW, engineers routinely design, test, optimize, and verify new controller features months before a physical vehicle is available for the test track.

    Now that vehicles are monitoring their environment with several vision and radar sensors and actually communicating with other cars on the road, it is essential that every vehicle in the test plan has a highly accurate performance model because each car and truck will be automatically controlled near physical limitations.

    Figure 19 Simulation of Adaptive Cruise Control using CarSim

    To address these requirements, CarSim has been integrated with National Instruments multicore real-time processors and LabVIEW RT to allow vehicle designers to run as many as 16 high-fidelity vehicles on the same multicore platform. This extraordinary power allows an engineer to design a complex, coordinated traffic scenario involving over a dozen cars with confidence that each vehicle in the test will behave accurately. This type of test would be impossible at a proving ground.

  • Optical coherence tomography (OCT) is a noninvasive imaging technique that provides subsurface, cross-sectional images of translucent or opaque materials. OCT images enable us to visualize tissues or other objects with resolution similar to that of some microscopes. There has been an increasing interest in OCT because it provides much greater resolution than other imaging techniques such as magnetic resonance imaging (MRI) or positron emission tomography (PET). Additionally, the method is extremely safe for the patients.

    To address this challenge, Dr. Kohji Ohbayashi from Kitasato University led a team of researchers to design a system based on LabVIEW and multicore technology. The hardware design utilized a patented light-source technology along with a high-speed (60 MS/s) data acquisition system with 32 NI PXI-5105 digitizers to provide 256 simultaneously sampled channels. The team at Kitasato University was able to create the fastest OCT system in the world, achieving a 60 MHz axial scan rate. From a pure number crunching perspective, 700,000 FFTs were calculated per second. The end goal of this research is to help detect cancer sooner in patients and increase their quality of life.
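
    The FFT workload mentioned above can be illustrated on a single synthetic scan. The Python sketch below is an assumption-laden example, not the Kitasato University processing chain: it assumes that one axial scan is formed from the 256 simultaneously sampled channels treated as a spectral fringe, uses a plain recursive radix-2 FFT, and takes the strongest non-DC frequency bin as the dominant reflector depth. The names fft and strongest_bin are hypothetical.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def strongest_bin(fringe):
    """Return the index of the strongest non-DC bin in the first half of the spectrum."""
    spectrum = fft(fringe)
    half = len(fringe) // 2
    return max(range(1, half), key=lambda k: abs(spectrum[k]))


# Synthetic 256-point fringe with 20 cycles across the channels (illustrative data).
fringe = [1.0 + math.cos(2.0 * math.pi * 20 * n / 256) for n in range(256)]
peak = strongest_bin(fringe)
```

    Each scan is independent of the others, so a stream of such transforms divides naturally across cores, which is the property a multicore design can exploit.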


This article presented findings that demonstrate how a novel approach with Intel hardware and software technology enables real-time HPC to solve, with multicore processing, engineering problems that could not be solved only five years ago. This approach is being deployed in widely varying applications, including the following: structural health monitoring, vehicle perception for autonomous vehicles, tokamak control, "smart car" simulations, control and simulation for the world's largest telescope, and advanced cancer research through optical coherence tomography (OCT).


The authors would like to acknowledge the following for their contributions to this article: Rachel Garcia Granillo, Dr. Jin Hu, Bryan Marker, Rob Dye, Dr. Lothar Wenzel, Mike Cerna, Jason Spyromilio, Dr. Ohbayashi, Dr. Giannone, and Michael Fleming.


This article and more on similar subjects may be found in the Intel Technology Journal, March 2009 Edition, "Advances in Embedded Systems Technology". More information can be found at