Neural Nets for Predicting Behavior

An off-the-shelf neural-network package is used to build a software model that takes sensor input and predicts wind speed and direction as output. Along the way, it overcomes problems such as noisy data.


February 01, 1993
URL:http://www.drdobbs.com/database/neural-nets-for-predicting-behavior/184408944

Figure 1: The final network for determining wind direction.

Figure 2: The network for determining wind speed.


Software that learns from its mistakes

James F. Farley and Peter D. Varhol

James is a project manager at Arm-tech Industries (Manchester, New Hampshire). Peter is an assistant professor of computer science and mathematics at Rivier College in Nashua, New Hampshire.


Neural networks will find use in a decision-making capacity only if the decision-making process inherent in the network can be embedded in a larger software or hardware system. The advantage over traditional embedded approaches is that the network can be developed interactively using modern neural-net development tools, and the complex mathematical interactions between network nodes can provide a better model of a process than conventional software techniques.

Two months of developing different neural-network models of a complex nonlinear process demonstrated the effectiveness of this approach. We investigated several ways of modeling the behavior of an electronic wind sensor.

We generated simulated data based on a behavioral model of a device described in "Development of Subminiature Multi-Sensor Hot-Wire Probes" (see "References"). The data we produced related sensor input to a given wind speed and direction, so that the speed and angle of the wind could be known with great accuracy. Based on these test data, we had to develop an algorithm, to be run on an embedded microprocessor, that could predict wind speed and direction from the sensor inputs.
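To make this concrete, here is a minimal sketch of how one simulated training record might be laid out in C; the struct name, field names, and the count of input channels are illustrative assumptions, not the actual format of our data files.

/* Hypothetical layout for one simulated training record; the names and
   the channel count are illustrative assumptions, not our file format. */
#define NUM_CHANNELS 12      /* hot-wire voltages plus pressure and temperature */

struct training_record {
    float inputs[NUM_CHANNELS];  /* what the network sees */
    float wind_speed;            /* known answer, meters per second */
    float wind_angle;            /* known answer, degrees */
};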

We examined two conventional approaches to determining wind speed and direction from the sensor readings. One was purely computational and used a look-up table residing in memory. This was computationally fast, but the data were too ambiguous to determine results in a straightforward manner. The second approach was to fit the data to an appropriate curve, which yielded some success. Wind speed was not difficult, but after extensive curve fitting we could not predict wind direction to better than ±30 degrees, an unacceptable error for any purpose.
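For illustration, a bare-bones version of the table look-up idea might look like the sketch below; the table layout and the nearest-match search are our own assumptions. The ambiguity shows up when several table entries match a set of measured readings almost equally well.

/* Sketch of the rejected table look-up approach (illustrative only):
   scan a table of known (readings -> speed, angle) pairs and return the
   entry whose readings are closest to what was just measured. */
#define NUM_READINGS 12

struct table_entry {
    float readings[NUM_READINGS];
    float speed, angle;
};

int nearest_entry( const struct table_entry *table, int n, const float *measured )
{
    int   i, j, best = 0;
    float best_dist = 1e30f;

    for ( i = 0; i < n; i++ ) {
        float dist = 0.0f;
        for ( j = 0; j < NUM_READINGS; j++ ) {
            float d = table[i].readings[j] - measured[j];
            dist += d * d;
        }
        if ( dist < best_dist ) {     /* ambiguous data: several entries can tie */
            best_dist = dist;
            best = i;
        }
    }
    return best;                      /* index of the closest match */
}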

Part of the problem was that some of the data were noisy and difficult to interpret. Real-life sensors would have to be protected from flying debris with a wire mesh or other covering, which would create currents and eddies around the sensor interfaces. Furthermore, the sensors' readings would be less accurate at some angles than at others, depending on the configuration of the sensors themselves. (See the accompanying textbox, "Two Configurations for Wind Sensors.") These limitations made for confusing and sometimes contradictory data; in other words, a normal state of affairs in an actual design project.

Investigating Neural-net Technology

Our approach to solving this problem was, to say the least, unconventional. Neural networks are usually described as being useful as models for classification, but at least some of the network structures can also be used for prediction. We thought that it might be possible to construct and train an appropriate network using our generated data, and to produce a model that could take sensor input and produce the appropriate wind speed and direction as output.

Using NeuralWare's NeuralWorks II Plus software, we built literally hundreds of different neural networks, often working into the early hours of the morning as the results of one run gave us ideas for new models or directions to pursue. NeuralWorks' DOS-based graphical interface lets you start building networks almost immediately, and the straightforward design and training process makes it possible to quickly generate large numbers of increasingly complex networks.

Early on, we settled on the backward-propagation model. This model is considered to be a feed-forward model, in that the results of the processing elements (PEs) of one level are not fed back to the PEs in the previous level. Where the backward propagation comes in is with the resulting error. The net determines the difference between the desired output and the actual output, then propagates that error backward through the net so that the weights at each processing element can be adjusted. In this way, the net "learns" from its mistakes. This approach facilitates making predictions rather than classifications.
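As a rough sketch of the weight adjustment (not NeuralWare's actual code), the update at a single PE can be written in a few lines of C; here delta is the error term that has already been propagated back to that PE, and the names are ours.

/* Illustrative back-propagation weight update for one processing element.
   "delta" is the error signal propagated back to this PE; each weight is
   nudged in the direction that reduces the output error. */
void update_weights( float *w, const float *in, int n_in, float delta, float lc )
{
    int i;
    for ( i = 0; i < n_in; i++ )
        w[i] += lc * delta * in[i];   /* lc is the learning coefficient */
}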

Because of the nature of the problem and the curve-fitting techniques used to try to solve it, we viewed a neural network as a mathematical model. We wanted to input certain values and have nonlinear transformations occur on these values to produce the appropriate outputs. In the world before neural networks, we would have had to do it manually, deriving our own set of differential equations, and testing various combinations of equations with the available data. This could have occupied us full time for months or even years.

Surprises Along the Way

Suspecting that determining wind speed might have a simple solution, we broke this component out into a separate network. It had, in fact, a straightforward nonlinear solution without a neural network. However, we let the network do our work for us, and produced a solution with a single sensor input, four processing elements in the hidden layer, and one output. This net delivered an average error of 0.7 meters per second and a maximum error of 2.29 meters per second.

In the process of designing a workable neural network for wind direction, we had one surprise. Because we had generated data for two different sensor designs, we were able to determine which design produced data that improved the neural-network output. We were, in effect, using the neural net to make hardware-design recommendations. This is possible only if systems designers agree to incorporate a neural network into the final product, since the hardware configuration depends on the network to interpret its data. On the other hand, it means that the neural-net approach to analyzing data can be used as an integral part of the overall system-design process.

Our final solution for determining wind direction was unexpected. On the advice of manufacturing engineers who assured us that such a design was realistic, we decided to combine the two data sets to emulate a configuration that incorporated both types of sensor design. This produced by far the best results to date, with an average error of 2.1 degrees and a maximum error of 7.17 degrees on one set of test data. This was our primary breakthrough.

A Filter for Noisy Data

We had yet another surprise in coming up with the best neural-network configuration. The data for one sensor design were taken at 5-degree intervals, while the data for the second were generated at 7.5-degree intervals. To quickly emulate a combined design, we simply combined the two data sets, including data only for the angles they had in common. This left data at 15-degree intervals, which, when used to train a network, began to produce predictions of wind direction within reasonable error rates.

However, in addition to having more data for each angle, we also had fewer angles, at a greater distance from one another, with which to train the net. Was this at least partially responsible for our improved results? As it turns out, the answer was yes. Going back to one of the original, uncombined data sets, we filtered out excess data and trained a similar network at 15-degree intervals, then tested it with data taken at 5-degree intervals. The results were significantly better than those obtained by training the net with data at 5-degree intervals, although not as good as those found with the combined data.
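The thinning itself was simple; a sketch of the kind of filter involved might look like the following, where the record layout and names are illustrative assumptions.

/* Illustrative sketch: keep only training records whose wind angle falls
   on a coarser grid (for example, every 15 degrees). */
#include <math.h>

struct record { float angle; float inputs[12]; float speed; };

int thin_to_interval( const struct record *in, int n, struct record *out, double step )
{
    int i, kept = 0;
    for ( i = 0; i < n; i++ )
        if ( fmod( in[i].angle, step ) < 0.01 )   /* angle lies on the coarse grid */
            out[kept++] = in[i];
    return kept;
}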

NeuralWare technical support expressed surprise at this result, indicating that more data is almost always better than less. Together we speculated that it might be due to the noisy nature of the data. We considered using a Kalman filter on the data prior to running it through the neural net, but a Kalman filter assumes white noise and an autoregressive-moving average (ARMA) time-series model. Neither of these assumptions seemed to fit our problem.


Instead, we let the neural net itself act as a noise filter. In addition to training the net at angle intervals that made it easier to learn the differences between different wind speeds, our final network had three hidden layers, the most permitted by NeuralWorks. Each hidden layer acted as a filter, examining different combinations of inputs, determining which of them were best contributing to an accurate solution, and passing only those combinations along to the next layer.

The Final Network

The final network structure for determining wind direction is diagrammed in Figure 1. Figure 2 shows the net for determining wind speed. Figure 1 shows 12 input-processing elements, which correspond to 12 sensor readings. At each successive layer in the network, we reduced the number of processing elements by three, and had one output-processing element. NeuralWare was once again surprised to discover that more than one hidden layer made a difference, but agreed that the noisy nature of the data was the likely culprit.
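The code NeuralWorks generates (shown later in Listing One) unrolls every PE into straight-line statements; for comparison, the same recall pass over a fully connected stack of layers can be written as a generic loop. The sketch below assumes the 12-9-6-3-1 topology just described and is illustrative, not the NeuralWorks output.

/* Generic feed-forward recall over a stack of fully connected layers
   (illustrative sketch only).  Layer sizes follow the 12-9-6-3-1 topology
   described in the text; tanh matches the transfer function in Listing One. */
#include <math.h>

#define N_LAYERS 5
static const int layer_size[N_LAYERS] = { 12, 9, 6, 3, 1 };

/* weights[l][j][i]: weight from PE i in layer l to PE j in layer l+1;
   bias[l][j]: bias of PE j in layer l+1 */
void recall( const float *input, float *output,
             float weights[N_LAYERS - 1][12][12],
             float bias[N_LAYERS - 1][12] )
{
    float act[2][12];
    int   l, i, j, cur = 0;

    for ( i = 0; i < layer_size[0]; i++ )
        act[cur][i] = input[i];

    for ( l = 0; l < N_LAYERS - 1; l++ ) {
        int nxt = 1 - cur;
        for ( j = 0; j < layer_size[l + 1]; j++ ) {
            float sum = bias[l][j];
            for ( i = 0; i < layer_size[l]; i++ )
                sum += weights[l][j][i] * act[cur][i];
            act[nxt][j] = (float)tanh( sum );
        }
        cur = nxt;
    }
    for ( i = 0; i < layer_size[N_LAYERS - 1]; i++ )
        output[i] = act[cur][i];
}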

Code generation was simply a matter of selecting a menu item, naming a function and source file, and letting the software do its work. The resulting code was generic C, which we compiled without modification, using Turbo C++. Listing One shows an example of code produced from a much simpler network, the one used to determine wind speed alone.

Further Work Is Needed

While the design described above produces results that largely meet the specifications, more refinement is needed. Of some interest is fine-tuning the variables used to get incrementally better results. The learning momentum and learning coefficients seem to make a difference, probably because lowering these values dampens some of the noise in the data.
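For reference, extending the earlier update sketch with a momentum term looks roughly like the fragment below; this follows the conventional back-propagation formulation, not NeuralWorks' internals.

/* Illustrative weight update with a learning coefficient (lc) and momentum
   (alpha).  prev_dw remembers the previous step, so smaller coefficients
   produce smaller, smoother adjustments that damp noisy training data. */
void update_with_momentum( float *w, float in_i, float delta,
                           float lc, float alpha, float *prev_dw )
{
    float dw = lc * delta * in_i + alpha * (*prev_dw);
    *w      += dw;
    *prev_dw = dw;
}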

We also want to cut down the size of the network; currently, NeuralWorks generates just over 10 Kbytes of C source code for the wind-direction network and almost 2 Kbytes for the wind-speed network. NeuralWare claims that using more than a single hidden layer rarely improves performance, but our tests with a single-layer network produced solutions with up to four times the average error rate of the best model. It seems that one is not enough, but maybe the problem can be solved with a less complex network. For embedded-system use, this is a critical consideration.

The input data also needs refining. Our 12 inputs include atmospheric pressure and temperature from both designs' data sets, since the data were generated under slightly different conditions. However, removing pressure and temperature entirely from the input data produced less accurate results. Ideally, we would like to prototype the design our network recommends and generate new test data. With integrated inputs, we suspect that we can refine our network to be both smaller and more accurate than it is now.

The sensor and neural-network technology explored here has applications beyond determining wind speed and direction. The same approach can be used on aircraft and marine vessels to determine the craft's own speed and to help analyze the speed and direction of sonar targets.

Many Features Left Unexplored

As for NeuralWorks Professional II Plus, our two months of exploring models barely scratched the surface of its capabilities. It provides tools for creating over two dozen different types of networks, as well as choosing different learning rules, transfer functions, training schedules, and other variables. It requires extended memory, but can handle networks with up to 8192 processing elements. Using an 80486-based PC with its integrated math coprocessor, it's possible to run 50,000 training trials in just a few minutes.

Its sole weakness was the inability to analyze the results of trial networks from within the package itself, so we turned to SPSS and Quattro Pro. NeuralWare is remedying that weakness with the introduction of the DataSculptor, a Windows-based graphical tool for preparing input data and analyzing results. Since these activities can take up to 80 percent of the time it takes to build a neural network, a good method of manipulating large amounts of data is a necessity. Our early look at a beta copy of DataSculptor indicates that it could easily have saved us some time.

However, we're confident of its ability to help program embedded systems more quickly and accurately. The caveat is that systems designers have to be committed to integrating neural networks into the design process from the beginning of the project. The combination of easily experimenting with very different behavioral models and quickly generating C code makes neural networks and NeuralWorks valuable additions to many design and development projects.

References

Westphal, Russell V., Phillip M. Ligrani, and Fred R. Lemos. "Development of Subminiature Multi-Sensor Hot-Wire Probes." NASA Technical Memorandum 100052, 1988.

Nelson, Marilyn McCord and W.T. Illingworth. A Practical Guide to Neural Nets. Reading, MA: Addison-Wesley, 1991.

Soucek, Branko, and the IRIS Group. Neural and Intelligent Systems Integration. New York, NY: John Wiley and Sons, 1992.

Products Mentioned

NeuralWorks II Plus
NeuralWare Inc.
Penn Center West, Building IV
Pittsburgh, PA 15276
412-787-8222

Two Configurations for Wind Sensors

The sensors envisioned in this article and described in the cited reference were based on hot-wire technology. A wire is electrically heated to maintain a constant temperature. As air moves past the element, heat is drawn off, and more current is required to hold the wire at that temperature. The increased voltage required is correlated with the amount of heat removed. This technology is the subject of many textbooks on fluid dynamics.
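For background, the standard textbook relationship for a hot-wire element is King's law, which in one common form reads E^2 = A + B*U^n, where E is the bridge voltage, U is the velocity of the flow past the wire, n is roughly 0.5, and A and B are constants found by calibration.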

Several atmospheric variables affect the amount of heat taken from the elements. Wind speed and direction, the quantities we are trying to determine, have an effect. This is the effect we hope to be measuring, but there are confounding variables, such as wind currents and eddies swirling around the sensors. Atmospheric pressure and temperature must also be factored in.

The Westphal et al. monograph describes the design of the wire-sensor probe. From it we generated two different sensor configurations. The first design used three elements: two separated by a 90-degree angle in an x,y configuration, with the third perpendicular to that plane, in the z direction. The second used two elements running parallel with one another. It was by combining these two design approaches that we were able to obtain a satisfactory neural-network model for wind direction.

--J.F.F. and P.D.V.





[LISTING ONE]


#if __STDC__
#define  ARGS(x) x
#else
#define  ARGS(x) ()
#endif /* __STDC__ */

/* --- External Routines --- */
extern double tanh  ARGS((double));

#if __STDC__
int NN_Recall( void *NetPtr, float Yin[1], float Yout[1] )
#else
int NN_Recall( NetPtr, Yin, Yout )
void  *NetPtr; /* Network Pointer (not used) */
float Yin[1], Yout[1];  /* Data */
#endif /* __STDC__ */
{
    float  Xout[8];  /* work arrays */
    long   ICmpT;  /* temp for comparisons */

    /* Read and scale input into network */
    Xout[2] = Yin[0] * (1.4204544) + (-3.3721588);
LAB110:

    /* Generating code for PE 0 in layer 3 */
    Xout[3]  = (float)(-0.079340167) + (float)(1.0320195) * Xout[2];
    Xout[3] = tanh( Xout[3] );

    /* Generating code for PE 1 in layer 3 */
    Xout[4]  = (float)(3.1511722) + (float)(-3.2430165) * Xout[2];
    Xout[4] = tanh( Xout[4] );

    /* Generating code for PE 2 in layer 3 */
    Xout[5]  = (float)(-0.095040455) + (float)(1.0205934) * Xout[2];
    Xout[5] = tanh( Xout[5] );

    /* Generating code for PE 3 in layer 3 */
    Xout[6]  = (float)(-0.076126046) + (float)(1.0341249) * Xout[2];
    Xout[6] = tanh( Xout[6] );

    /* Generating code for PE 0 in layer 4 */
    Xout[7]  = (float)(0.41889122) + (float)(0.31151304) * Xout[3] +
        (float)(-0.79348314) * Xout[4] + (float)(0.30291474) * Xout[5] +
        (float)(0.31325051) * Xout[6];
    Xout[7] = tanh( Xout[7] );

    /* De-scale and write output from network */
    Yout[0] = Xout[7] * (35.937499) + (31.25);
    return( 0 );
}
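To show how the generated routine fits into a larger program, here is a hypothetical driver; the sensor value and the surrounding code are made up for illustration and are not part of the NeuralWorks output.

/* Hypothetical driver for the generated wind-speed routine above.
   Link it with the generated source file; the reading is invented. */
#include <stdio.h>

int NN_Recall( void *NetPtr, float Yin[1], float Yout[1] );

int main( void )
{
    float reading[1], speed[1];

    reading[0] = 2.75f;               /* one raw sensor reading (hypothetical) */
    NN_Recall( (void *)0, reading, speed );
    printf( "Predicted wind speed: %.2f meters per second\n", speed[0] );
    return 0;
}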












Copyright © 1993, Dr. Dobb's Journal
