The Applet itself and a detailed description follow. The accompanying paper is also available at :

http://techhouse.brown.edu/dmorris/JOHN/JOHN.html

Also note that this takes a minute to load, but it really will load. I swear.

From here on I will assume you have the Applet properly loaded, and will make reference to various GUI components. In fact, what follows is a description of each of the components, with the specifics of the network implementation where relevant; this seems an appropriate approach for describing the project. Examples, conclusions, and interesting observations will be interspersed throughout the discussion of the components. If you just want to skip to the primary conclusions, look for the **PROJECT CONCLUSION** fields; you can't miss them.

If you're having trouble loading the Applet, all of the following information and more, along with a screen shot, is available at :

I. The input fields :

- The left grid represents input to the network. Draw all over it with the mouse. Left button to draw, right button to erase (ctrl-button or alt-button will always emulate the second and third buttons in Java). Have a blast. The initial goal of the project was to employ the Hopfield net for character recognition, but there is nothing inherent to the network that makes it better suited for the letters of the English language than for smiley faces or scribbles.

__PROJECT CONCLUSION #1 :__ if I were really implementing a character recognition algorithm, a simple neural network would be inadequate, and a great deal of image parsing would be required (well beyond the simple scaling employed here). Furthermore, even if I were going to implement a neural-net-based OCR procedure, a Hopfield net would likely not be the best choice.

- CLEAR INPUT : Clears the input field. Duh.

- SCALE : Scales the pattern on the input field to approximately fill both the x and y dimensions. This is the most fundamental preprocessing required for any intelligent shape (character) recognition. Note that the x and y proportions are not constrained, so a scaled dot is effectively the same as a scaled line. Nonetheless, this removes many of the peculiarities of human-drawn letters. (I state again : the **original** goal here was character recognition, but the project as implemented is a more abstract study of Hopfield dynamics, and that study was necessary to conclude that the network isn't great for character recognition.)
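As a rough illustration, the SCALE step might look something like the following Python sketch (JOHN itself is Java; the grid layout, the +1/-1 cell coding, and the function name are my assumptions, not the applet's source) :

```python
def scale_to_fill(grid):
    """Stretch the bounding box of the 'on' (+1) cells to fill the grid,
    sampling nearest-neighbor style. x and y scale independently, so a
    dot and a line both blow up to fill the whole field."""
    n = len(grid)
    rows = [r for r in range(n) if any(v == 1 for v in grid[r])]
    cols = [c for c in range(n) if any(grid[r][c] == 1 for r in range(n))]
    if not rows:                       # nothing drawn: nothing to scale
        return [row[:] for row in grid]
    r0, c0 = min(rows), min(cols)
    h, w = max(rows) - r0 + 1, max(cols) - c0 + 1
    return [[grid[r0 + r * h // n][c0 + c * w // n] for c in range(n)]
            for r in range(n)]
```

Note how a single dot scales up to fill the entire field, consistent with the scaled-dot-equals-scaled-line remark above.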

- NOISE % : Adds random noise to the pattern on the input field, according to the percentage provided. Note that 100% noise corresponds to a complete inversion of the image, whereas 50% noise is complete randomness.
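One plausible reading of the NOISE % operation (my reconstruction, not JOHN's source) : with +1/-1 cells, flipping each cell independently with the given probability reproduces both the 100%-inversion and the 50%-randomness behavior described above.

```python
import random

def add_noise(pattern, percent, rng=random.Random()):
    """Flip each +1/-1 cell independently with probability percent/100.
    At 100% every cell flips (full inversion); at 50% each cell ends up
    effectively random."""
    p = percent / 100.0
    return [-v if rng.random() < p else v for v in pattern]
```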

__PROJECT CONCLUSION #2 :__ The system is much better at recognizing a figure that has had considerable noise applied to it than it is at discerning a hand-drawn shape. In fact, it is quite good at reconstructing noisy patterns that are unrecognizable to the human eye, but is completely incompetent at building a figure that differs in shape from the prototype, even for figures easily recognizable to a literate reader.

As a quick example, try drawing a simple figure (my favorite is an 'x' that is 5 squares from corner to corner). Click 'scale' (because it looks nicer that way), then click 'train'. Then set the noise field to 20%, and click the 'noise' button. Likely you would never recognize the figure in front of you as an 'x'. However, clicking 'propagate' should fully reconstruct the figure.

This is, of course, a silly demonstration, because only one figure has been stored. But it does show that on some level, the network is capable of reconstructing badly damaged prototypes that are beyond human recognition (but still really bad at handwritten characters). Similar properties are indeed demonstrated for larger training sets. Perhaps this network would thus be well-suited to recovering damaged typewritten characters into digital text (fuzzy-looking faxes are a great example).

II. The output fields :

- The grid on the right represents the output field for network propagation. You can draw on this one too, but it doesn't affect anything. Any figure on the output field will be cleared before the network propagates.

- TRAIN : This causes the figure on the input grid to be 'stored' in the network's weight matrix. In short, for each element on the field, a connection weight will be **increased** by one for elements with the same value (on or off) and **decreased** by one for elements with the opposite value in the input pattern. An important consequence of this mechanism is that there is no inherent distinction between 'on' and 'off'; the network is learning patterns of 'matching' and 'non-matching' grid elements.
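With cells coded +1/-1, the storage rule just described reduces to a Hebbian outer-product update: the weight between two matching cells goes up by one, between mismatching cells down by one. A minimal Python sketch (my conventions, not JOHN's Java source) :

```python
def train(W, pattern):
    """Hebbian storage: W[i][j] += pattern[i] * pattern[j], which is +1
    when cells i and j match and -1 when they differ. No self-weights."""
    n = len(pattern)
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i][j] += pattern[i] * pattern[j]
```

Training on the inverse of a pattern produces exactly the same weight changes, which is the 'no distinction between on and off' point in action.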

This leads to a unique property of the Hopfield Network : the possibility of a stable state at the **inverse** of a stored pattern. For example, train the network on any arbitrary figure. Set the input field's noise to 50% (complete randomness), and propagate a few times. You'll note that, for random input, the original learned pattern and its inverse occur equally often as output. Hence both represent equally stable states.
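The inverse-pattern effect can also be checked directly in a few lines: with Hebbian weights built from a single stored pattern x, both x and -x satisfy the fixed-point condition of the update rule (toy check in Python, my notation) :

```python
def sign(v):
    return 1 if v >= 0 else -1

x = [1, -1, -1, 1, 1]
n = len(x)
# Hebbian weights from the single stored pattern x, no self-weights
W = [[x[i] * x[j] if i != j else 0 for j in range(n)] for i in range(n)]

def is_stable(p):
    """True if every element already bears the sign of its weighted
    input sum, i.e. p is a fixed point of propagation."""
    return all(sign(sum(W[i][j] * p[j] for j in range(n))) == p[i]
               for i in range(n))
```

Both `is_stable(x)` and `is_stable([-v for v in x])` come out true: the stored pattern and its inverse are equally stable.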

This is an inconvenience, and will affect the performance of the network when the only performance criterion is the absolute number of 'correct' elements (see numerical performance analysis below). But the 'inverse pattern' effect probably does not have tremendous implications for character recognition, where **shape** is the primary parameter. In order to apply the Hopfield Network to character recognition, a post-processing algorithm would be required to compare the network output to prototype characters. It is trivial to deal with inverse patterns at this stage.

- PROPAGATE : This propagates the pattern on the input field through the weight matrix, displaying the output on the output field. This of course represents the heart of the Hopfield algorithm, allowing the network to progress to that trained state which is most accessible to the initial pattern.

For each iteration, a random element *a* will be chosen from the field. The connection weight between *a* and each other element *b* will be multiplied by the value of *b* to obtain a weighted sum of connection influences. Provided the 'binary nonlinearization' option is checked below (more on this later), the new value of *a* will simply be +1 or -1, bearing the sign of the weighted sum. It is relevant to note here that in the Hopfield model, the previous state of an element has no influence whatsoever on its current state. This seems logical, as it would not be sensible for an element to have a connection weight other than 1.0 with itself, and self-weights would thus not influence the network in any useful way.

Iteration ceases when 10 * N (in this case 2250) iterations have passed without a change in an element's value. This is somewhat arbitrary, but provides reasonable confidence that the network is 'finished.' The total number of iterations required (including the 2250 uninteresting iterations at the tail end of propagation) will be displayed in the message window.
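Putting the update rule and the stopping rule together, propagation might be sketched as follows (Python sketch; ties at a weighted sum of zero go to +1 here, which is an assumption on my part) :

```python
import random

def propagate(W, state, rng=random.Random(0)):
    """Asynchronous Hopfield propagation: repeatedly pick a random
    element, set it to the sign of its weighted input sum, and stop
    once 10*N consecutive updates have changed nothing."""
    n = len(state)
    state = state[:]
    quiet = total = 0
    while quiet < 10 * n:
        a = rng.randrange(n)
        s = sum(W[a][b] * state[b] for b in range(n) if b != a)
        new = 1 if s >= 0 else -1      # binary nonlinearization
        quiet = quiet + 1 if new == state[a] else 0
        state[a] = new
        total += 1
    return state, total
```

The returned iteration count includes the 10*N uninteresting iterations at the tail end, just as the message window's count does.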

Virtually all instructions to propagate result in stable states within a few thousand iterations (again provided binary nonlinearization is applied, again more on this later). The required number of iterations for a given pattern may well be an indication of certainty (or proximity of the initial pattern to a stored state), again useful for a character recognition application where a rejection criterion is essential.

Unfortunately, the stable states reached often do not correspond to trained patterns; these 'spurious stable states' are a major drawback of the Hopfield network. One will often observe stable patterns that represent superpositions of trained patterns and/or their inverses. The occurrence of these states is **dramatically** reduced when error-correction is applied (discussed below), especially for larger numbers of stored figures.

III. The animation fields :

- The animation checkbox tells JOHN whether or not you'd like to watch the network change during propagation. This of course slows the process down tremendously, but it allows the user to observe the network's progression. This is especially useful in cases where performance is poor, where one might like to know which facets of the input pattern were responsible for the choice of output state. The ITERATIONS PER DISPLAY and INTERVAL fields specify the details of the animation. Pretty straightforward. No neural network magic here.

IV. The pattern set fields :

- The major extension of the simple Hopfield Network is the ability to apply Widrow-Hoff error-correction to a set of patterns. This set would ideally represent prototypes for an alphabet to be learned.

- STORE PATTERN : Widrow-Hoff correction requires that all patterns to be learned be presented simultaneously; one cannot apply error-correction on a 'one-at-a-time' basis. The STORE PATTERN button stores the current pattern to an internal data structure (I advise scaling them first if you're playing with characters or digits). The field next door displays the number of patterns currently stored, waiting to be learned.

- TRAIN SET : This causes the weight matrix to be trained on the patterns previously stored, with Widrow-Hoff error correction. The parameters required for Widrow-Hoff iteration are entered below, in the LEARNING CONSTANT, LEARNING TRIALS, and RANDOM SEED fields. The details of these values and of the Widrow-Hoff implementation will be discussed below with the numerical portion of the Applet.

It is important to note that large numbers of learning trials can take a while, but will vastly improve the stability of the learned patterns (the autoassociative capabilities of the network). And again, the error-correction process is stable provided binary nonlinearization is applied (I still promise to discuss this later).

__PROJECT CONCLUSION #3:__ Widrow-Hoff correction, or some similar supervised learning algorithm, is so effective as to be ESSENTIAL to any potential memory/recall applications of the Hopfield net. The numerical demonstration discussed below will make it very clear that pattern recognition with simple Hebbian learning is, so to speak, really hokey. Widrow-Hoff learning, however, can allow for perfect autoassociation - and thus good recovery from noise that does not damage figure shape - for numbers of patterns on the same order as the number of letters in the alphabet or digits available in Western numbering.

- CLEAR SET : Clears the stored patterns without training the matrix on them. Note that this does not clear the current matrix; use CLEAR WEIGHTS for that.

V. The file fields :

- These components are useful, and would be **essential** to a useful character recognition application (for loading a prototype set), but since most browsers don't allow file access from Applets, I won't even bother discussing them. This is, however, a good time to note that a Java application class was written to load the Applet and allow file or network i/o. If you'd like to play with that, feel free to download the entirety of JOHN, with compiled classes and source code, at :

http://techhouse.brown.edu/dmorris/JOHN/JOHN.zip

The StinterNetLocal.class file is a Java 1.1 class that will load the above Applet with security restrictions disabled. I'll put a quarter on the fact that no one's interested enough to click that link. If you do, or even if you can just convince your browser to relax security enough to read a URL (which IE4.0 is SUPPOSED to let you do), an error-corrected set of weights representing the 10 Arabic digits in block form is available at :

http://techhouse.brown.edu/dmorris/digitweights.txt

This is essentially the data file that represents my 'character-recognition' prototypes. If you are in a position to read files, simply enter the above URL in the FILENAME field, and click 'READ WEIGHTS'. Then watch digits that you probably didn't intend to draw emerge as stable patterns. This would be a central feature of the program (and I would try much harder to make it more accessible) if character recognition were really all that good. Also, read weights at your own risk... since most browsers won't support it, I haven't been able to fully debug it, and funny things happen sometimes...

VI. The correction trial fields :

- The bottom portion of the Applet represents a numerical demonstration of the error-correcting procedure and of the general limitations on the network. This is necessary to demonstrate that anything useful is actually occurring, as it is very difficult to draw any objective conclusions by playing with the input and output fields. This portion of course has no bearing on the original character recognition scheme, but it can provide a rather impressive demonstration.

- Number of patterns : This field represents the number of patterns that will be presented to the network and correction scheme for autoassociation. Patterns are generated randomly before a "pre-correction" error value is assessed.

- Learning constant : The Widrow-Hoff learning constant, for numerical demonstration and for set training.

- Learning trials : The number of Widrow-Hoff learning trials to be applied to the network. A single trial represents one propagation of a pattern through the network, followed by correction applied individually to each element in the weight matrix.

- Random seed : The seed for the random number generator used for randomly selecting elements during propagation and for random selection of patterns for training. Note that the random number generator is reseeded between the initial evaluation of the uncorrected network and the final error count. Hence the final and initial values actually can be compared.

- Run correction trial : This initiates the actual numerical demonstration. First, random patterns will be generated. The network will be trained on each of them by Hebbian learning alone. Each pattern will then be propagated through the network, and the total number of individual incorrect output elements will be counted and displayed. Learning trials will then be applied as described above (this will take a minute for a few hundred learning trials, the progress indicator should keep moving). The error count will then be repeated, with the same random seed, and the new error rate will be displayed.
The results should be rather convincing, provided sufficient learning trials are applied. For example, try running the simulation with a learning constant of .2, 15 patterns, a random seed of 5555, and 350 learning trials. You might not actually want to try it, since it will take five minutes or so. But you would see that Hebbian learning alone gives 1671 errors (varying slightly, perhaps, with other implementations of the random number generator), and that Widrow-Hoff learning with these parameters corrects the system to **perfect** autoassociation.

See project conclusion #3... I just wanted to reiterate how useless the Hopfield network would be for OCR with no correction, and how much potential is added with Widrow-Hoff weight adjustments.

**PROJECT CONCLUSION #4:** Perfect autoassociation is, of course, a LONG way from useful pattern recognition. But it is certainly a prerequisite, and perhaps the only one of the many prerequisites for OCR that can be achieved solely with the Hopfield net.

Incidentally, for a very small number of learning trials and a small learning constant, you may find that no change at all occurs; weight changes need to be large enough to actually change the sign of elements during propagation. So don't hold it against me if you run a trial with an LC of .2 and 20 learning trials, and nothing happens. In other words, good things come to those who can wait a few iterations.
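For concreteness, a single correction trial as described above might be sketched like this (a delta-rule update per weight row; the desired-minus-actual error form and the +1/-1 coding are my reading, not necessarily JOHN's exact implementation) :

```python
def widrow_hoff_trial(W, pattern, lc):
    """One Widrow-Hoff trial: propagate the pattern through the weights,
    then nudge each weight in proportion to the error between the
    element's desired value and its weighted input sum."""
    n = len(pattern)
    for i in range(n):
        s = sum(W[i][j] * pattern[j] for j in range(n) if j != i)
        err = pattern[i] - s           # desired minus actual
        for j in range(n):
            if j != i:
                W[i][j] += lc * err * pattern[j]
```

Repeated over the stored set, this drives each stored pattern toward being an exact fixed point of propagation, which is where perfect autoassociation comes from.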

VII. Nonlinearization to binary elements :

- Un-checking the binary-nonlinearization box allows element values to vary continuously, instead of 'clipping' all weighted sums to +1 or -1. This has been proposed and explored as a variation on the Hopfield network, although Hopfield's original model was strictly binary. In this case, rather than applying a simple clipping to the weighted sum of connection values for each element during propagation, the hyperbolic tangent of that sum (times a constant) will be taken. This allows a continuous distribution of values, but constrains values between +1 and -1. This 'saturation effect' is both numerically desirable and, if we are concerned with neural modeling, biologically plausible.
Unfortunately, the behavior of the network is VERY unpredictable with continuous-valued units, especially when weights are not constrained to symmetry (see below). As expected, when it works, one sees more effective learning with the Widrow-Hoff algorithm for a given number of iterations. This is because 'corrections' can be made when more information than the sign of the weighted connection sum is available.

I encourage you to try this only at risk of your own patience, as it does have a tendency to result in infinite propagation. This is a consequence of allowing very small values, which can lead to oscillations between signs. Hence the 10*N iterations necessary to declare the network 'stable' will never be reached.
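Concretely, the continuous-valued update replaces the clipping with a saturating squashing function. A sketch (the gain constant here is purely illustrative; I don't know the constant JOHN actually multiplies by) :

```python
import math

GAIN = 0.1  # hypothetical gain constant

def update_continuous(W, state, a):
    """Continuous-valued element update: tanh of (gain * weighted sum)
    keeps the new value inside (-1, 1) while preserving its sign."""
    s = sum(W[a][b] * state[b] for b in range(len(state)) if b != a)
    return math.tanh(GAIN * s)
```

Because small weighted sums produce small values near zero, successive updates can keep flipping sign indefinitely, which is exactly the infinite-propagation hazard just described.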

**PROJECT CONCLUSION #5:** The binary nature of the elements in the traditional Hopfield network may seem a simplification, but it does not prevent high-quality autoassociation, and it avoids potentially oscillatory states.

VIII. Constraining weight symmetry :

- Checking this box will force any corrections made to any weight values to be applied simultaneously to the inverse weight value. That is, weight i,j-->a,b always matches weight a,b-->i,j when this option is selected. This seems rather logical, as the on-off relationship between two elements is always symmetric. Not surprisingly, constraining the symmetry of the network provides a significant performance increase. The perfect autoassociation that was achieved for error correction with the above parameters required ~350 iterations without constraining weights to symmetry; the same performance requires only 200 iterations with the symmetry requirement.

**PROJECT CONCLUSION #6:** While constraining the weights in the network to symmetry may intuitively seem to place limitations on possible weight solutions, it results in an immediately observable performance increase and should be applied whenever there is a limit on possible iterations. Note that Hebbian learning alone inherently results in symmetric weights, with no applied constraints.
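The symmetry constraint amounts to mirroring every weight change. A trivial Python sketch (the function name is mine) :

```python
def adjust_weight(W, i, j, delta, symmetric=True):
    """Apply a correction to W[i][j]; when the symmetry constraint is
    on, mirror the same change onto W[j][i]."""
    W[i][j] += delta
    if symmetric and i != j:
        W[j][i] += delta
```

Hebbian updates are already symmetric (pattern[i] * pattern[j] equals pattern[j] * pattern[i]), so this option only matters during error correction.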

And so that's about it. An interesting exploration, though a disappointment with regard to character recognition. A task for graphics-types...
