
DDJ Handprinting Recognition Contest Wrap-Up



The envelope, please

This article contains the following executables: HWXDT2.ARC HWXHR2.ARC HWXRES.ARC

Ray Valdes

Ray is senior technical editor at DDJ. He can be reached through the DDJ offices, at 76704, 51 on CompuServe, or at [email protected].


The challenge DDJ posed to you was formidable: Create a recognition engine for handprinted text that rivals the one we published in April 1992. As always, you rose to the occasion and acquitted yourselves honorably.

For those of you who missed previous issues, Ron Avitzur presented the complete code to a handprinting recognizer in "Your Own Recognition Engine" (DDJ, April 1992). In June, we released the code to a test harness for evaluating recognizers, along with sample data files. Contestants had to create a recognition engine that plugs into this test harness, following a specified interface. The test harness is a simple command-line utility that reads in some data during an initial training phase, reads test data during the recognition phase, and finally displays an accuracy score and speed results. Pen computers or pen operating systems weren't required, only an algorithm and your coding chops.
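
To make the plug-in interface concrete, here is a bare-bones skeleton of the entry points a recognizer module exports to the harness. The function names and types (lpList, INT16, LPINT16, and the public storage-class macro) are taken from the harness headers as they appear in Listing One at the end of this article; the bodies are placeholders, not any contestant's code.

#include "h_stddef.h"
#include "h_list.h"
#include "h_recog.h"

public void rec_InitRecognizer(void)       /* called once before training     */
{
}

public void rec_Train(lpList gesture, INT16 char_code)
{
  /* remember this sample of char_code for later comparison */
}

public void rec_EndTraining(void)          /* the training phase is complete  */
{
}

public void rec_Guess(lpList gesture,
            LPINT16 g1, LPINT16 g2, LPINT16 g3,
            LPINT16 w1, LPINT16 w2, LPINT16 w3)
{
  /* compare the unknown gesture against the trained samples and return  */
  /* up to three guesses (g1..g3) with their confidence weights (w1..w3) */
  *g1 = *g2 = *g3 = 0;
  *w1 = *w2 = *w3 = 0;
}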

Apple Computer generously provided a PowerBook 100 as first prize. The runners-up get to pick, in finish-line order, from a grab bag of development tools including C/C++ compilers from Borland, Microsoft, and Watcom, debuggers from Symantec and Nu-Mega, and text editors from Borland and Lugaru.

Start Your Engines

Despite the specialized nature of the challenge, the contest generated much interest. As many as 150 people requested or downloaded the test harness. Requests came in from Japan, Germany, Brazil, Mexico, Belgium, and France.

The contest officially began on June 15th, with a September 15th deadline for submissions. In retrospect, the three months allotted for the contest, which had seemed like a generous amount of time, turned out to be a tight deadline. Many hours were spent during the third month, as contestants followed trails into the blind alleys of disappointing algorithms or the brick walls of machine constraints.

Of the initial group, about 20 readers pursued the challenge past the initial stages. The difficulty of the problem winnowed this down to fewer than ten in the last month of the contest. The final week was a frenzy of activity, similar to the end of a marathon. Just before midnight on September 15th, for example, we received a call from a contestant who was ready to drive to the post office at the local airport so that the submission would be postmarked by the 15th. For those who requested it, we granted extensions (of a couple of days) freely. The extra time could not compensate for those whose particular approach did not pan out.

Ultimately, five entrants made the cut: Don Branson, Philipp Hanes, Kipton Moravec, David Randolph, and Allen Stenger. Their submissions were diverse, in terms of approaches to the problem: neural net (Moravec), fuzzy logic (Hanes), Pearson statistical correlation (Randolph), incremental improvements to Avitzur's recognizer (Stenger), and a range of derived features (Branson). Other interesting approaches that did not make it to the finish line included Fourier analysis, functional link nets, and various statistical tests.

About the only thing the entries had in common was that all were written in C and almost all were developed on PCs. Processing speeds varied enormously, from 100 characters per second (cps) at one extreme to four days just to process the alphabet at the other. Our test platform was a 25-MHz 80386-based PC running the DDJ test harness. In addition to the sample data distributed with the harness, we collected additional data from DDJ staffers.

Table 1 shows the entries ranked by accuracy, Table 2 ranks them by speed, Table 3 shows the size of each program, and Table 4 shows the final rankings. The following sections discuss each of the entries in turn.

Table 1: Rankings by percent accuracy. The Stenger/DDJ entry is Stenger's version modified by DDJ to use all data points rather than filtering some out. Hanes's entry aborted with a divide-by-zero exception on files C and D. Moravec's entry could not process the data within the allotted time.

                                Data Sets
--------------------------------------------------
                Average   A      B      C      D
  Randolph        92.4   95.2   91.7   92.3   90.5
  Branson         91.7   98.6   89.9   87.7   90.6
  Stenger/DDJ     82.5   93.1   74.9   84.9   77.2
  Stenger         65.4   99.7   97.2   38.5   26.3
  DDJ (Avitzur)   62.0   97.3   91.2   37.2   22.2
  Hanes           48.9   98.8   96.8    --     --
  Moravec                 --     --     --     --

Table 2: Rankings by speed (on a 25-MHz 386) in cps. Stenger's entry running on a Mac IIci averages 166 cps. Hanes's entry did not run all the tests; it averaged 14.8 cps on the tests it was able to complete. Moravec's entry did not complete the tests.

  DDJ (Avitzur)    106.5
  Stenger          105.5
  Hanes              7.4
  Randolph           2.6
  Branson            0.8
  Moravec           --

Table 3: Size (in lines of code) of recognizer module.

  Randolph           520
  Hanes              764
  DDJ (Avitzur)      925
  Stenger           1041
  Moravec           1542
  Branson           2282

Table 4: The finish line (weighted rankings).

  Randolph          4.75
  Branson           4.35
  Stenger           4.10
  Hanes             3.25
  Moravec           1.05

Practice Over Theory

Allen Stenger came tantalizingly close to winning, if you consider only the results on the initial data set. Al's first task, however, was to port the DDJ entry to the Macintosh. Actually, Ron Avitzur's recognizer and test harness were originally developed on the Macintosh, but when DDJ ported the code to the PC, certain changes were made that broke the Macintosh-specific code. These incompatibilities were discovered independently by both DDJ and Allen while compiling the code on the Mac. (We distributed these fixes to those who requested Macintosh-format diskettes.)

Al then took a pragmatic, methodical approach, eschewing theory in favor of practice. He began with Ron Avitzur's recognizer, identified specific weaknesses or anomalies, and then fixed each one, boosting accuracy while maintaining speed. He writes, "I made some changes to the original design, added a new position function, and then added some ad hoc checks to distinguish certain characters."

He noticed that the P4 and P8 routines in Avitzur's recognizer do not include the last point of the stroke, which "seems to be an error." Also, Avitzur's recognizer "uses a cumulating bounding-box scheme, in which the Nth stroke carries the bounding box for strokes 1 through N. I modified this calculation to propagate the gesture's bounding box back into all the strokes."
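
The idea behind that change can be sketched in a few lines using the harness types from Listing One (the Box structure here is hypothetical, and this is not Stenger's actual code): compute one box over every point of every stroke, then normalize each stroke against that single box rather than against the cumulative per-stroke boxes.

/* Hypothetical sketch, assuming the harness headers from Listing One. */

typedef struct { INT16 left, top, right, bottom; } Box;

private Box gesture_bounds(lpList gesture)
{
  Box b = { 32767, 32767, -32767, -32767 };
  INT16 i, j;

  for (i = 0; i < gesture->num_items; i++)
  {
    lpList stroke = gesture->items[i];
    for (j = 0; j < stroke->num_items; j++)
    {
      VHPoint p = *(lpVHPoint)&stroke->items[j];
      if (p.h < b.left)   b.left   = p.h;
      if (p.h > b.right)  b.right  = p.h;
      if (p.v < b.top)    b.top    = p.v;
      if (p.v > b.bottom) b.bottom = p.v;
    }
  }
  return b;   /* every stroke is then normalized against this one box */
}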

Avitzur's recognizer weighs each stroke in a character differently; later strokes carry more weight. Stenger changed this scheme to weigh all strokes equally. Al also added a new element to the feature set of a character: the centroid of a stroke. He defines a centroid as "the average of all the points in the (simplified) stroke, which gives an indication of where most of the ink is. The centroid is quantized to a 4x4 grid."

Finally, to push his accuracy to an astonishing 99.7 percent (on the "clean" data file), he added some ad hoc checks: "Although the centroids are not the same for each copy of the character, we can say a priori from the structure of the character that certain centroid values should never appear, and we can use this to disqualify some wrong choices. For example, if the character to guess has a stroke with a centroid in the left column of the 4x4 grid (that is, if the character is 'F') we know the character cannot be an 'I' since I's centroids should always be in the middle two columns."
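
A minimal sketch of the centroid feature (again assuming the harness types from Listing One, with the character's bounding box passed in; this is not Stenger's actual code) might look like the following. The "I"-versus-"F" check above then amounts to examining the column portion of the returned cell.

/* Hypothetical sketch of a stroke-centroid feature quantized to a 4x4 grid. */

private INT16 stroke_centroid_cell(lpList stroke,
                                   INT16 left, INT16 top,
                                   INT16 right, INT16 bottom)
{
  long sum_h = 0, sum_v = 0;
  INT16 j, col, row;

  for (j = 0; j < stroke->num_items; j++)
  {
    VHPoint p = *(lpVHPoint)&stroke->items[j];
    sum_h += p.h;
    sum_v += p.v;
  }
  sum_h /= stroke->num_items;    /* average of all the points in the stroke: */
  sum_v /= stroke->num_items;    /* "where most of the ink is"               */

  /* quantize the centroid to a 4x4 grid over the character's bounding box */
  col = (INT16)((4L * (sum_h - left)) / (right  - left + 1));
  row = (INT16)((4L * (sum_v - top))  / (bottom - top  + 1));

  return (INT16)(row * 4 + col);  /* cell 0..15; the column is cell % 4 */
}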

One reason for the high accuracy of Stenger's recognizer (and of Avitzur's original version, which achieved 97 percent accuracy on the "clean" data file) is that the algorithms were tuned to the sample data. This was a result of circumstance rather than design: when Ron Avitzur was writing his program, the only data he had easy access to was that of his own handwriting.

In the documentation accompanying the test harness, we said: "Our sample recognizer works well with our current set of data, but will likely stumble on other valid data that it has not previously encountered. In judging the contest, therefore, we will attempt to run all recognizers on as broad a data set as possible, including any data that you submit with your entry. Although not a requirement, it is in your interest to submit a sample of handwriting data (in the binary format used by the test harness), so that you can both show your recognizer in the best possible light as well as stump other people's recognizers."

Our new data stumped both Avitzur's and Stenger's versions of the DDJ recognizer. As Table 1 shows, accuracy plummeted from 99.7 percent to below 40 percent on the new samples. As you can tell from the screen shot in Figure 1, the handwriting samples are not radically different from Ron's handwriting, though the writing is a bit more elaborate, with more strokes and more curves than Ron's sparse style. The data was gathered using the same Wacom 510C digitizer used for the original data set; the difference is that, instead of a Mac, we used a PC running Microsoft Windows, which collects points at a different rate than the Mac does.

Just for fun, we modified Stenger's recognizer to not throw away data points during the initial filtering process. The results are shown in Table 1. Our ad hoc modification boosted accuracy on the new data while reducing it on the original data -- which is why Ron Avitzur had created the simplify() routine in the first place. The overall accuracy improved, but not enough to overtake the top two finalists.

The Mother of all Neural Nets

Moving from the 100 cps speeds to a different time scale entirely, we arrive at Kipton Moravec's entry. Kip cheerfully introduced his submission with, "Here it is: the mother of all neural nets!"

Kip's back-prop neural net is massive, using two hidden levels plus an input level and an output level. The input level has 148 nodes representing character features. The two intermediate levels have 440 nodes each, while the final output level has 65 nodes, one for each character.

Kipton freely admits: "The learning is sloooow. There is a lot of floating-point math. A coprocessor can make a big difference! My 33-MHz 386 is a little over 100 times slower than my 33-MHz 486. The other advantage of the 486 is the built-in cache." One reason for the slowness is the size of the neural net, which will not fit into the 640K RAM of a PC running in real mode. Therefore the data must be written out to disk, slowing things by orders of magnitude.
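
A back-of-the-envelope count shows why: with 148 inputs feeding 440 nodes, those 440 feeding another 440, and the second 440 feeding the 65 outputs, the net carries roughly 148x440 + 440x440 + 440x65 = 287,320 weights. Even at four bytes per floating-point weight, that is well over a megabyte before counting activations and training buffers -- far more than a real-mode DOS program can keep in its 640K.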

We initially tried running Moravec's program on the same 386 used for the other tests. After several days of waiting, we took Kip's advice and commandeered a 33-MHz 486 and used 6 Mbytes as a disk cache. After a couple more days, the program halted with a disk error. We suspected this was due to an incompatibility between differing versions of HIMEM.SYS and SMARTDRV.EXE. We updated the software and tried again, but at press time, the program was still grinding away on the initial data set, after seven days.

Despite these problems, Kip's code is well worth looking over, if only for the routines that derive character features. Among the 1600 lines of code are routines that derive seven moments shown to be invariant to rotation, scaling, and translation.
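
These sound like the classic seven moment invariants due to Hu. As a point of reference (and emphatically not Moravec's code), here is a textbook-style sketch of the first two, computed over a w-by-h binary raster of the character; the full set of seven is unchanged by translation, scaling, and rotation of the pattern.

/* Sketch of Hu's first two moment invariants over a binary image. */

static void hu_invariants_1_2(const unsigned char *img, int w, int h,
                              double *phi1, double *phi2)
{
  double m00 = 0, m10 = 0, m01 = 0;         /* raw moments               */
  double mu20 = 0, mu02 = 0, mu11 = 0;      /* central moments           */
  double cx, cy, n20, n02, n11;
  int x, y;

  for (y = 0; y < h; y++)
    for (x = 0; x < w; x++)
      if (img[y * w + x]) { m00 += 1; m10 += x; m01 += y; }

  if (m00 == 0) { *phi1 = *phi2 = 0; return; }   /* empty image          */

  cx = m10 / m00;                           /* centroid gives            */
  cy = m01 / m00;                           /* translation invariance    */

  for (y = 0; y < h; y++)
    for (x = 0; x < w; x++)
      if (img[y * w + x])
      {
        double dx = x - cx, dy = y - cy;
        mu20 += dx * dx;  mu02 += dy * dy;  mu11 += dx * dy;
      }

  n20 = mu20 / (m00 * m00);                 /* eta_pq = mu_pq / mu00^2   */
  n02 = mu02 / (m00 * m00);                 /* for p+q = 2, which gives  */
  n11 = mu11 / (m00 * m00);                 /* scale invariance          */

  *phi1 = n20 + n02;
  *phi2 = (n20 - n02) * (n20 - n02) + 4.0 * n11 * n11;
}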

Fuzzy Logic Becomes Brittle

Handwritten on the disk containing Philipp Hanes's entry was the note: "Recognize this, buster!" His submission looked very promising indeed when processing the original data samples, achieving 98.8 percent accuracy on the "clean" sample.

His recognition engine uses fuzzy logic, setting up an array of 55 attributes per stroke (as well as a few attributes that are global to the character). In his earlier approaches, Hanes followed a few blind alleys: "The choice of values to use is the crucial part. It was pretty easy to recognize 80 percent of the letters. 90 percent wasn't too bad, either. But every percentage point beyond that was an uphill battle. A few things I had considered perfectly reasonable turned out actually to be detrimental to my end results. In particular, any calculations having to do with angles and with changes in angles (first and second derivatives) seemed to make the results worse rather than better.... I tried quite a few statistical techniques, all of which did nothing at all. Most made it much worse." These metrics included variance, standard deviation, squaring errors, and discarding extreme values.

Hanes identified an area that was problematic for other contenders as well: "The other thing that has a surprising effect on recognition accuracy is the simplify() routine.... Simplifying more than a minimal amount has negative effects on recognition. On the other hand, going through all the points does slow things down quite a bit. I went for higher accuracy." Even so, his recognizer is quite fast, bested only by Ron Avitzur's (and Allen Stenger's) blazing speed, as shown in Table 2.

One comment from Hanes, unfortunately, proved to be prescient: "It's quite possible to give it data that will crash the recognizer.... It is vaguely possible that something might, by chance, add up to exactly 0 at some point.... This could be looked at as a major design flaw." Sadly, this is what occurred when processing the additional samples from DDJ, aborting the program with a divide-by-zero error, and thus lowering the previously high score. We examined the code, looking for any obvious errors that could be fixed simply, but such was not the case.

A No-nonsense Approach

Don Branson's entry is among the best performing, contending with David Randolph's for first place in accuracy. Don's submission is also the largest, weighing in at 2200 lines of code. Despite the voluminous size, the approach seems to be reasonably straightforward.

During training, a set of features or attributes is derived for each character. Prior to deriving attributes, each stroke is passed through a smoothing routine. Short strokes are discarded. Branson breaks down the character not just into strokes, but also into enclosed regions, which carry a set of attributes similar to those borne by strokes. The derived features include stroke orientation and position, as well as other attributes that quantify a component's rough spatial relationship to other character components. Like Allen Stenger, Branson calculates the centroid for each stroke.

Although Branson almost tied with David Randolph for accuracy, Branson's entry runs at less than half the speed of Randolph's. Also, Branson's code is four times the size of Randolph's. For these reasons, the winner's nod must go to David Randolph.

Table 4 shows weighted scores for the contestants. Our scoring system ranks all entries in each of three categories: accuracy, speed, and "elegance." All coding styles being roughly equal, not to mention subject to taste, we defined "elegance" as being inversely proportional to program size (that is, less is more). We emphasized accuracy over speed, and both of these over "elegance." Accordingly, our weighting was 80 percent for accuracy, 15 percent for speed, and 5 percent for coding conciseness. First place in a category got 6 points, second place 5 points, and so on down the line. These values were multiplied by category weights.
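
In code, the tally for each entry reduces to a few lines (a sketch of the bookkeeping described above, with the ranks taken from Tables 1 through 3; not the actual worksheet we used):

/* Rank 1 in a category earns 6 points, rank 2 earns 5, and so on;   */
/* the category points are then combined with the 80/15/5 weighting. */
static float weighted_score(int accuracy_rank, int speed_rank, int size_rank)
{
  float accuracy_pts = (float)(7 - accuracy_rank);
  float speed_pts    = (float)(7 - speed_rank);
  float size_pts     = (float)(7 - size_rank);

  return 0.80f * accuracy_pts + 0.15f * speed_pts + 0.05f * size_pts;
}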

Table 5: The prizes.

  First Prize: Apple Powerbook 100

  Subsequent prizes (contestants pick from these, in finish-line order)
  Borland C++ 3.1 with Application Frameworks
  Microsoft C7, including Windows SDK
  Watcom C 9.0/386
  Nu-Mega Bounds Checker
  Multiscope Debugger 2.0 from Symantec
  Visual Basic 2.0 Professional Edition from Microsoft
  Borland Pascal 7.0 with Objects

By the Book, Literally

David Randolph's computer, a PC/AT clone, may well be the most modest hardware configuration used in the contest. He writes, "Because of the primitive nature of my own computer, I have not had very good luck trying to display the samples sent with the contest materials, but that eliminates any preconceived notions about the handwriting that I am analyzing. For all I know, I could have been recognizing Kanji!"

These constraints turned out to be advantageous, because they led to the choice of a simple, fast algorithm that worked well on a variety of data, at speeds fast enough to edge out Don Branson's entry and gain first place in the contest.

David's entry defines a 50-element array for each character. This size was chosen "because it is all my own computer will handle with 640K." For each character, multiple-stroke information is concatenated into a single, variable-length stroke, which is then fitted into a pair of fixed-size arrays (one for the X coordinates, the other for the Y coordinates). The recognizer relies on one test, the Pearson Correlation formula, to determine if two distributions are statistically similar. David writes: "I use it to see if two pen strokes follow a similar path, by comparing pen coordinates which should appear in roughly the same place at the same time." The X arrays are compared using the formula, then the Y arrays, and then the results are averaged.
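
For reference, the Pearson correlation coefficient between two equal-length series x and y is

  r = sum((x[i] - mean_x) * (y[i] - mean_y))
      / sqrt( sum((x[i] - mean_x)^2) * sum((y[i] - mean_y)^2) )

a value between -1 and 1, where values near 1 mean the two series rise and fall together -- here, that the two pen strokes trace similar paths.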

Interestingly, the code implementing the correlation function was borrowed from page 506 of Numerical Recipes in C by Press, Flannery et al. (Cambridge University Press, 1988). Listing One shows this routine, as well as its caller.

David adds: "The more complex the characters, the better it works. If you need to examine especially detailed handwriting, this method will hold its own well.... It is forgiving when the writer makes slips or odd movements of the pen."

Conclusion

Although we constrained the problem significantly by specifying character-at-a-time recognition, the contestants still faced a very difficult task.

Ron Avitzur's original recognizer was disarmingly short and straightforward, comprising 900 lines of code and using easily understood algorithms. However, it was the result of many months of work in chasing down more complicated approaches and then abandoning them. It seems that our contestants had to retrace some of these steps to arrive at the same conclusions.

The contestants' code shows the tremendous variety of approaches to this problem. The blind alleys are just as interesting as the successful approaches.




[LISTING ONE]


/****************************************************************\
|   RECOG.C
|   Recognizer for Handprinted Text.
+-----------------------------------------------------------------
|   This is a handwriting recognition engine for
|   Ray Valdes' contest.
|   This recognizer was written by David Randolph.
|   (c) Copyright 1992 by David Randolph.
|   All rights reserved.
\****************************************************************/

/****************************************************************\
This file is my submission for the handwritten character recognition
contest.  Using the DDJ_HWX.DAT file, it achieves above 90.8%
accuracy when trained on the first three samples depending
upon the setting of the rGA_BUF_SIZE constant, and much higher
accuracy when scanning the CLEAN.DAT file.  Unlike Ron Avitzur's
method, this algorithm does not benefit quite as much from training
on additional samples, but can still perform strongly (85+%) after
training on only one sample.  As one might expect, it performs much
more accurately, even when analyzing poor-quality handwriting, if the
training samples are clean.

Unfortunately, I have no samples of my own to send, but that may be
for the best.  Because of the primitive nature of my own computer, I
have not had very good luck trying to see the samples sent with the
contest materials, but that eliminates any preconceived notions about
the handwriting I am analyzing.  For all I know, I could have been
recognizing Kanji!

The algorithm:
This is an on-line algorithm which cheats by taking advantage of the
direction information inherent in tablet data.  Here is what it does:

1.  The multiple-stroke information is concatenated into a single
    variable-length stroke.

2.  The variable-length stroke is fitted into a pair of fixed-size
    arrays of size rGA_BUF_SIZE.  The left half of the pair is for
    the X coordinates, the right half for the matching Y coordinates.
    I have #defined rGA_BUF_SIZE to 50 because that is all my own
    computer will handle with 640k, but other platforms with more
    memory might be able to increase its size.  Keep in mind that
    the larger rGA_BUF_SIZE is, the longer the program takes to run.

3.  During the training phase, each sample is run through steps one
    and two and inserted into a linked list along with the character
    code and the number of strokes in the character.  The same character
    may appear multiple times in the list.  This allows the program to
    train on multiple styles simultaneously, although I would not
    recommend doing this.

4.  During the guessing phase, each sample is run through steps one and
    two and compared with each sample in the linked list which has the
    same number of strokes.

5.  The comparison:  is done using the Pearson Correlation formula.  The
    X arrays are compared using the formula, then the Y arrays, and then
    the results are averaged.  The correlation formula is a test to see
    if two distributions are similar.  I use it to see if two pen strokes
    follow a similar path by comparing pen coordinates which should appear
    in roughly the same place at the same time.  By using this method, I
    am making some assumptions: A) that the computer has been trained on
    the handwriting of the person it is analyzing, and B) that that person
    writes a given character consistently (ie. always starting at the top or
    bottom, etc.).


ADVANTAGES of the algorithm:
1.  It may be trained on the handwriting of multiple people at the same
    time.
2.  It is generic.  It is not tuned to work on a specific set of data and
    should work consistently regardless of what it is recognizing.
3.  The more complex the characters, the better it works.  If you need
    to examine especially detailed handwriting, this method will hold
    its own well.
4.  The program can return the set of characters which most closely
    match the test character along with their scores.  These characters
    may be used to guide a spelling checker layer which can pick the
    best one based on context.
5.  It is forgiving when the writer makes slips or odd movements with the
    pen.


DISADVANTAGES of the algorithm:
1.  It cannot train on one person's handwriting and then necessarily
    recognize another's.
2.  One of the odd quirks of this algorithm is that the more complex the
    character, the easier it is to recognize because it has more features.
    The reverse is true of simple characters, which give it trouble.  In
    fact, the character in the test data which stumped the program the
    most was the minus sign!  It does best with characters that have
    curves.
3.  The algorithm must work under the assumption that there is some
    foreknowledge about the orientation of a character.  But this is
    an assumption that all analyzers must make, or else one might not
    be able to distinguish between an "N" or a "Z", a "W" or an "M",
    a "p" or a "d", etc.
4.  It is an on-line algorithm, dependent upon the ordering of the coordinates
    in the sample data.  It cannot easily be translated to an off-line
    algorithm, which only uses a scanned picture of the character during the
    training and recognition phase, although it is possible.
5.  The algorithm does not always distinguish upper and lowercase characters
    (ie. it cannot always tell the difference between a "w" and a "W").
    I do not believe that a low-level recognizer should be expected to
    distinguish case without the help of a context-sensitive interpreter.
    I would like to ask that case-sensitivity not be made a requirement
    for this contest, since it cannot be practically achieved at this
    level, at least not when using real-world data.


WHY IT WORKS:
Since I am not mathematically inclined, I really have no technical
explanation for why the Pearson correlation formula works better
than other methods I have tried, other than to say it is highly tolerant
of error, so long as each character it trains on is sufficiently
different.  With a 50-element array of points used to define 65 characters,
the leeway allowed before an input character is mistaken for another is
formidable.  The larger the array and the more detailed the input,
the fewer the mistakes.


Although the contest places results over elegance, I have tried to keep the
code as simple and straightforward as possible to keep the algorithm clear.
I have made no changes to the harness code, so for testing, you only need
to plug in the recog.c file.

ONE NOTE before you compile.  If you have a floating point processor, you
may want to convert all the variables in the "correlate" function which
are of type LONG back to type DOUBLE.  I had to make them integers because
the floating point emulator software was too slow for my taste.

This code was created on a PC using Turbo C++ 1.0 running under DOS 3.2.

\****************************************************************/

/*** Include files ***/

#include "h_config.h"

#ifdef TURBO_C
#include <stdlib.h>
#include <math.h>
#endif
#ifdef MSOFT_C7
#include <search.h>
#include <math.h>
#endif

#include "h_stddef.h"
#include "h_mem.h"
#include "h_list.h"
#include "h_recog.h"

/****************************************************************/

#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <math.h>


/****************************************************************/


/* RECOG internal definitions */

  /** This defines the length of the "gesture_xy_array", in which a         */
  /** character is stored                                                   */

#define rGA_BUF_SIZE 50

  /** This defines the number of dimensions in the gesture_xy_array, which  */
  /** should always be 2, one for X and one for Y                           */

#define rNUM_COORDINATES 2

  /** These constants define where in the array mentioned above the X and Y */
  /** information should be stored                                          */

#define X_VALUE 0
#define Y_VALUE 1


/* The structure which contains the linked list of trained samples */

struct prototype
{
  INT16 char_num,                 /* The character code (ie "A")             */
        num_strokes,              /* The number of strokes in the character  */
        gesture_xy_array [rNUM_COORDINATES] [rGA_BUF_SIZE];  /* the character*/
  struct prototype *proto_next_ptr;     /* a pointer to the next record      */
};

typedef struct prototype PROTOTYPE, *PROTOTYPE_PTR;



private PROTOTYPE_PTR proto_top = NULL;  /* Always points to list top */

private PROTOTYPE_PTR proto_bot = NULL;  /* Always points to list bot */
                                         /* during training           */


/** Definitions of PRIVATE functions **/

private float correlate (INT16 *, INT16 *, int);
private void FatalError (char *);
private INT16 round (float);



/************************* Start of Code ************************/



/****************************************************************/
/* This is a dummy function; it is called by the harness        */
/* program, but never used                                      */

public void rec_InitRecognizer(void)
{

}


/****************************************************************/
/* Below is the TRAIN half of the program.  This function is    */
/* given a character, along with some information about the     */
/* character such as how many strokes it has.  The function     */
/* takes the given character and converts it into its own       */
/* format and adds it to a linked list of sample characters.    */

public void rec_Train(lpList gesture,INT16 char_code)
{
  INT16 i , k , total_input_length = 0 , convert_stroke_length ,
        array_index_ptr ;
  PROTOTYPE_PTR proto_current;


  /* display character values */

#ifdef DEBUG_TRACE
#ifndef USE_BGI_GRAPHICS
  printf("\n>For char %d (%c), number strokes = %d",char_code,char_code,
             gesture->num_items);
#endif
#endif

  /* Allocate memory for a new link */

  proto_current = (PROTOTYPE_PTR) malloc ( sizeof (PROTOTYPE) );
  if (proto_current == (PROTOTYPE_PTR) NULL)
    FatalError("\n\nFATAL ERROR!  Not Enough Memory to Train.\n\n");

  if (proto_top == (PROTOTYPE_PTR) NULL)
    proto_top = proto_current;
  else
    proto_bot->proto_next_ptr = proto_current;

  proto_bot = proto_current;

  /* set current link's next ptr to NULL */

  proto_current->proto_next_ptr = (PROTOTYPE_PTR) NULL;

  /* Load data into to new structure....*/

  proto_current->char_num = char_code;
  proto_current->num_strokes = gesture->num_items;

  /* Get the total length of the set of strokes because we are going to  */
  /* concatenate them into a single fixed-size array                     */

  for (i = 0; i < gesture->num_items; i++)
  {
    lpList stroke = gesture->items[i];
    total_input_length += stroke->num_items;
  }

  /* NOW, for each stroke, for the x and y values separately,            */
  /*   take the stroke and                                               */
  /*   force it to fit in an array of length rGA_BUF_SIZE, either by     */
  /*   stretching or compressing it...                                   */
  /* To stretch, each point in the stroke is copied into multiple        */
  /*   elements of the new array.                                        */
  /* To compress, only a sampling of new points in the stroke are copied */
  /*   to the new array.                                                 */

  array_index_ptr = 0;
  for (i = 0; i < gesture->num_items; i++)
  {

    /* Get next stroke */

    lpList stroke = gesture->items[i];

    /* Fit the stroke into its slot in the fixed array by converting its */
    /* size                                                              */

    convert_stroke_length = round(( (float) stroke->num_items /
                                    (float) total_input_length ) *
                                    (float) rGA_BUF_SIZE);

    /* Fill each empty slot in the gesture_xy_array */

    for (k = 0; (k < convert_stroke_length) &&
                (array_index_ptr < rGA_BUF_SIZE); k++, array_index_ptr++)
    {
      VHPoint p = *(lpVHPoint)&stroke->items[(int) ((float)
                   ((float) stroke->num_items / (float) convert_stroke_length) *
                   (float) (k))];
      proto_current->gesture_xy_array[X_VALUE][array_index_ptr] = p.h;
      proto_current->gesture_xy_array[Y_VALUE][array_index_ptr] = p.v;
    }
  }

}



/******************************************************************/
/* This is the GUESS portion of the program.  It is passed an     */
/* unknown character, which is compared with the linked template  */
/* list.  The best fit is the guess.  The program is passed the   */
/* character and pointers to three guesses are passed back.  This */
/* function only attempts to make a first guess.                  */

public void rec_Guess(lpList gesture,
            LPINT16 g1,LPINT16 g2,LPINT16 g3,
            LPINT16 w1,LPINT16 w2,LPINT16 w3)
{

  PROTOTYPE_PTR compare_ptr = (PROTOTYPE_PTR) NULL;
  float mean_compare;
  INT16 num_strokes, i, j, k, total_input_length = 0, convert_stroke_length,
        array_index_ptr,
        gesture_xy_array[rNUM_COORDINATES][rGA_BUF_SIZE];

  *g1 = *g2 = *g3 = 0;
  *w1 = *w2 = *w3 = 0;

  num_strokes = gesture->num_items;

  /* Get the total length of the set of strokes because we are going to  */
  /* concatenate them into a single fixed-size array                     */

  for (i = 0; i < gesture->num_items; i++)
  {
    lpList stroke = gesture->items[i];
    total_input_length += stroke->num_items;
  }

  /* NOW, for each stroke, for the x and y values separately,            */
  /*   take the stroke and                                               */
  /*   force it to fit in an array of length rGA_BUF_SIZE, either by     */
  /*   stretching or compressing it...                                   */
  /* To stretch, each point in the stroke is copied into multiple        */
  /*   elements of the new array.                                        */
  /* To compress, only a sampling of new points in the stroke are copied */
  /*   to the new array.                                                 */

  array_index_ptr = 0;
  for (i = 0; i < gesture->num_items; i++)
  {

    /* Get next stroke */

    lpList stroke = gesture->items[i];

    /* Fit the stroke into its slot in the fixed array by converting its */
    /* size                                                              */

    convert_stroke_length = round(( (float) stroke->num_items /
                                    (float) total_input_length ) *
                                    (float) rGA_BUF_SIZE);

    /* Fill each empty slot in the gesture_xy_array */

    for (k = 0; (k < convert_stroke_length) &&
                (array_index_ptr < rGA_BUF_SIZE); k++, array_index_ptr++)
    {
      VHPoint p = *(lpVHPoint)&stroke->items[(int) ((float)
                   ((float) stroke->num_items / (float) convert_stroke_length) *
                   (float) (k))];
      gesture_xy_array[X_VALUE][array_index_ptr] = p.h;
      gesture_xy_array[Y_VALUE][array_index_ptr] = p.v;
    }
  }

  compare_ptr = proto_top;

  /* Starting from the top compare the test character with each character */
  /*   in the template linked list                                        */

  while (compare_ptr != (PROTOTYPE_PTR) NULL)
  {
    /* Don't even bother to compare characters if they do not have */
    /*  the same number of strokes                                 */

    if (compare_ptr->num_strokes == num_strokes)
    {

      mean_compare = 0.0;

      mean_compare += correlate (gesture_xy_array[X_VALUE],
                                 compare_ptr->gesture_xy_array[X_VALUE],
                                 (int) (rGA_BUF_SIZE));
      mean_compare += correlate (gesture_xy_array[Y_VALUE],
                                 compare_ptr->gesture_xy_array[Y_VALUE],
                                 (int) (rGA_BUF_SIZE));

      mean_compare /= rNUM_COORDINATES;

      if ((mean_compare * 100) > *w1)
      {
        *w1 = mean_compare * 100;
        *g1 = compare_ptr->char_num;
      }

    } /* end if */

    compare_ptr = compare_ptr->proto_next_ptr;

  } /* end while */

#ifdef DEBUG_TRACE
#ifndef USE_BGI_GRAPHICS
  printf ("\n### CHAR finally evaluated as %c with prob %d",
        *g1, *w1);
#endif
#endif

}





/*
  The correlate function below was borrowed from the "pearsn" function
   found in the book "Numerical Recipies in C", edited by Press, Flannery,
   Teukolsky, and Vetterling (Cambridge University Press, Cambridge 1989)
   on page 506.
*/

/****************************************************************/
/* This function computes the correlation value between two distributions */

private float correlate (x,y,n)
INT16 x[], y[];
int n;
{
  int j;

  /* NOTE: You may want to change the word 'long' below to 'double' if
           you have a floating point processor.  It should speed things up. */

  long yt, xt, syy=0, sxy=0, sxx=0, ay=0, ax=0;
  float r;

  for (j=0; j < n; j++)
  {
    ax += x[j];
    ay += y[j];
  }

  ax /= n;
  ay /= n;

  for (j=0; j < n; j++)
  {
    xt = x[j] - ax;
    yt = y[j] - ay;
    sxx += xt * xt;
    syy += yt * yt;
    sxy += xt * yt;
  }
  r = (double) sxy / sqrt( (double) ((double) sxx * (double) syy) );
  return (float) r;
}



/****************************************************************/
/* This is a dummy function; it is called by the harness        */
/* program, but never used                                      */

public void rec_EndTraining(void)
{
}



/****************************************************************/
/* Fatal Error Handler.  Displays error message and terminates. */

private void FatalError (char* msg)
{
#ifdef USE_BGI_GRAPHICS
  gr_Close();
#endif

  fprintf (stderr,"\n*** Fatal Error:");
  fprintf (stderr, "%s", msg);   /* print the message as data, not a format string */
  exit (1);                      /* return a nonzero status on a fatal error        */
}



/****************************************************************/
/* A small function to round a float to an int                  */

private INT16 round (float x)
{
  return ( (float) (x) - (float) ( (int) (x) ) >= 0.5 ) ? (int) (x) + 1 :
                                                          (int) (x);
}


/************************* END OF RECOG.C ***********************/













Copyright © 1993, Dr. Dobb's Journal

