Columnist: October 12, 1995

First, let's back up a step. We learn by being taught, but we also learn by observing others doing what we seek to do. A parallel exists in fuzzy-rule-based system design: If an expert isn't available to define system operation, but data exist that demonstrate correct system operation, base your design on those data. The reduction of data to the rules and membership functions (MBFs) that make up the system design may be performed manually by the designer, or automatically, using neural networks, genetic algorithms, or similar techniques.
The data can come from a number of sources: by monitoring the operation of a similar, even flawed, system; by monitoring a system with a human operator; or from an analytic model.
Basics: A system can be considered a mapping of inputs to outputs, with its response represented as a function:
outputs = f(inputs).
This function is the basis for model-based systems, where the model is a mathematical expression of system operation.
USC Professor Bart Kosko has shown that a fuzzy-rule-based system can approximate any continuous function of any input dimensionality to any degree of accuracy. The challenge to the system designer becomes identifying what constitutes the function of an optimally (or perhaps only correctly) operating system.
An obvious question is "If I can collect data representing correct system operation, why can't I use a standard data-reduction technique to identify a function that approximates the data, and then use that function as the system model?" The obvious answer is that you can, but at the expense of representational simplicity and ease of subsequent modification.
The MBFs and rules of a fuzzy-rule-based system tend to "make sense," even when generated automatically. MBFs for each input cover its entire possible range (for example, the input speed has its possible range covered by its values very_slow, slow, medium_fast, fast, and so forth), and rule conditions logically combine these input values. Changing the output of a fuzzy system for a given set of specific input values simply means finding the appropriate rule and changing its output action, or perhaps creating a new rule unique to the desired fuzzy input values. Conversely, changing a high-order multivariable polynomial to change its output in only a small region of its input space is nowhere near as simple.
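To make the point about local changes concrete, here is a minimal Python sketch of a rule table. Everything in it is an assumption chosen for illustration: the second input (load), the labels, the action names, and the use of the minimum as the fuzzy AND. It is not taken from any particular fuzzy tool.

```python
def fire(rules, speed_deg, load_deg):
    """Return the strongest firing strength for each output action.

    rules maps (speed_label, load_label) -> action label (all hypothetical).
    A rule's strength is the minimum of its two input degrees (fuzzy AND).
    """
    strength = {}
    for (speed, load), action in rules.items():
        s = min(speed_deg.get(speed, 0.0), load_deg.get(load, 0.0))
        strength[action] = max(strength.get(action, 0.0), s)
    return strength

# Hypothetical two-input rule table
rules = {
    ("slow", "light"): "accelerate",
    ("slow", "heavy"): "accelerate",
    ("fast", "light"): "brake",
    ("fast", "heavy"): "brake",
}

# Assumed membership degrees for one particular set of input values
speed_deg = {"slow": 0.2, "fast": 0.8}
load_deg = {"light": 0.0, "heavy": 1.0}

before = fire(rules, speed_deg, load_deg)
rules[("fast", "heavy")] = "hold"   # change the output action of one rule
after = fire(rules, speed_deg, load_deg)
```

Only the region where the edited rule fires is affected: the strength assigned to "accelerate" is the same before and after the change. Making an equally local change to a fitted polynomial would generally mean refitting every coefficient.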
Let's get back to the fuzzy system. Fig 1 shows how empirically gathered input/output data points relate to MBFs and rules. For an easy visualization, assume a single-input/single-output system; the horizontal axis is the input, and the vertical axis the output.
As the similar, even flawed, system (or the system with human operator or the system's analytic model) operates, you sample inputs and outputs as ordered pairs, each sample resulting in a single data point in Fig 1. The procedure would be identical for a system with multiple inputs and outputs, but, of course, the number of elements in each sample would be greater.
As the number of sample points increases, the full collection of points represents the desired response function. You can estimate MBFs and rules from a small number of response points, but doing so is likely to result in rather coarse system operation. The greater the number of consistent data points, well distributed over the input space, the better the resulting MBFs and rules.
The next step, which Fig 1 also illustrates, is to draw overlapping ellipsoids around clusters of data points. These ellipsoids are projected into rectangles, with the dimensions of the rectangles defining the dimensions of the MBFs. MBFs overlap regularly; the slopes of crossing edges are equal (but of opposite sign), and edges cross at their 50% value.
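The overlap conditions just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the MBFs are triangular, the cluster centers are already known (the clustering step itself is not shown), each MBF falls to zero at its neighbors' centers, and the two end MBFs are shoulders. The function name and the sample centers are hypothetical.

```python
from bisect import bisect_right

def memberships(x, centers):
    """Return the degree of membership of x in each triangular MBF.

    centers: sorted list of MBF peak positions (assumed cluster centers).
    Each MBF peaks at its own center and falls to zero at the neighboring
    centers; the two end MBFs are treated as shoulders that stay at 1.
    """
    n = len(centers)
    mu = [0.0] * n
    if x <= centers[0]:
        mu[0] = 1.0
        return mu
    if x >= centers[-1]:
        mu[-1] = 1.0
        return mu
    hi = bisect_right(centers, x)   # index of the first center greater than x
    lo = hi - 1
    span = centers[hi] - centers[lo]
    mu[hi] = (x - centers[lo]) / span   # rising edge of the upper MBF
    mu[lo] = 1.0 - mu[hi]               # falling edge of the lower MBF
    return mu

centers = [0.0, 2.0, 6.0]             # hypothetical cluster centers
halfway = memberships(1.0, centers)   # midway between the first two peaks
```

Because the falling edge of one MBF and the rising edge of the next span the same interval, their slopes are equal and opposite, and the edges cross at the midpoint with a value of 0.5, even when the centers are unevenly spaced.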
Two observations shape the output side of the design. First, with input MBFs positioned as shown, no popular defuzzification technique generates a smooth (that is, not "stair-stepped") response function from fuzzy-output MBFs. Second, a weighted-singleton approach can accurately simulate the more popular centroid defuzzification technique. Fig 2 therefore shows the same sample set used to generate the input MBFs, but with output singletons replacing output MBFs. As before, the rules are merely the mapping of inputs to outputs. Fig 2 also shows the resulting response function. The design is complete.
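For readers who want to see the arithmetic, the following Python sketch shows one common form of weighted-singleton defuzzification: the output is the average of the rule singletons, weighted by how strongly each rule fires. The labels, singleton values, and membership degrees are assumptions made up for the example, not values read from Fig 2.

```python
def weighted_singleton(degrees, singletons):
    """Weighted average of output singletons.

    degrees:    {label: degree to which the input belongs to that MBF}
    singletons: {label: output singleton of the rule for that label}
    """
    total = sum(degrees.values())
    if total == 0.0:
        raise ValueError("no rule fired for this input")
    weighted = sum(degrees[label] * singletons[label] for label in degrees)
    return weighted / total

# Hypothetical single-input rule base: "if speed is slow, output is 10," etc.
singletons = {"slow": 10.0, "medium": 30.0, "fast": 40.0}

# An input halfway between the slow and medium peaks fires both rules at 0.5
degrees = {"slow": 0.5, "medium": 0.5, "fast": 0.0}
output = weighted_singleton(degrees, singletons)   # 20.0
```

With input MBFs that cross at 50% and have equal and opposite slopes, this weighted average moves linearly from one singleton to the next as the input moves between peaks, which is why the response is piecewise linear rather than stair-stepped.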
Caution: At a conference, I once observed a three-input fuzzy controller that a neural network had designed from training data. Within the fairly narrow operating limits its designers defined, it did a good job, but the narrow operating limits greatly limited its worth as a system. When I asked why they didn't demonstrate the system over a wider range, I was shown a plot of the full response function (one input was held constant, with the response function viewed as a 3-D projection). The function had regularly spaced waves of varying intensity undulating across its surface. These waves would have resulted in erratic system behavior, which the designers felt inappropriate for their demonstration.
The designers were intent on eliminating their waves, and eventually they did. Later, they reported to me that the problem related to the data used to train the system. The original data set was incomplete, and, in attempting to interpolate between data clusters to fill in the gaps, a designer had made a mistake in the interpolation algorithm he used. The ridges in the response function resulted from correct data; the valleys corresponded to the misinterpolated data. In their rush to get something up and running for demonstration, the designers had failed to see the error.
The lesson, other than leaving more time to work out the wrinkles in a system to be demonstrated (yeah, right), is that a fuzzy-rule-based system designed by a neural net or genetic algorithm is only as good as its training data. Take time to ensure that your data are both correct and complete.
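One simple, if crude, way to look for gaps in a single-input data set before training is to divide the input range into bins and list the bins that contain no samples. The Python sketch below does this; the function name, bin count, and sample values are all hypothetical, and a real multi-input system would need a multidimensional version of the same idea.

```python
def empty_bins(samples, lo, hi, n_bins):
    """Return the (start, end) input intervals that contain no sample.

    samples: iterable of (input, output) pairs
    lo, hi:  the input range the data are supposed to cover
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x, _ in samples:
        if lo <= x <= hi:
            # min() places a sample exactly at hi into the last bin
            i = min(int((x - lo) / width), n_bins - 1)
            counts[i] += 1
    return [(lo + i * width, lo + (i + 1) * width)
            for i, c in enumerate(counts) if c == 0]

# Hypothetical sampled (input, output) pairs
samples = [(0.5, 1.0), (1.5, 2.1), (1.8, 2.4), (4.2, 5.0)]
gaps = empty_bins(samples, lo=0.0, hi=5.0, n_bins=5)
# gaps is [(2.0, 3.0), (3.0, 4.0)]: no data between inputs 2 and 4
```

A check like this only flags where data are missing. Deciding how to fill a gap, whether by collecting more samples, by interpolating, or by asking an expert, remains a design decision.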
The need to be correct is obvious. To get a handle on what constitutes "completeness," consider dividing the input space of a given system into four regions of operation: normal, extended, abnormal, and impossible.
Normal: A system resides in its normal operating region most of the time. The normal operating region centers around a currently commanded operating point and includes the response to most minor perturbations the system will see, such as noise and commands to new operating points. Putting a system through its paces while monitoring an expert's actions will result in a set of input/output response points that largely represent the normal operating region.
Extended: A system enters its extended operating region less frequently than its normal region, but sufficiently often that the designer must still account for its operation. Examples of extended operating regions for a controller are start-up and controlled power-down.
Abnormal: Over its lifetime, a system rarely enters its abnormal region, but you must not equate the decrease in relative frequency with a decrease in importance. Training data for this region are often nonexistent. Lotfi Zadeh's example of an abnormal operating region for the theoretical autopilot of an automobile is when a child steps off the curb in front of the vehicle. In system design, the decision of how to account for an abnormal operating region is part of the design process.
Impossible: An impossible region occurs when a given set of inputs cannot occur over the lifetime of the system. However, remember the maxim, "nothing's impossible."
Typically, you can generate data points for normal- and extended-region operation by correctly driving system inputs and (potentially) externally forcing outputs. For abnormal-region operation, you can extrapolate MBFs and rules either manually or automatically. Handle impossible regions in one of two ways: Either ignore them or treat them as being merely improbable, and assign control actions that drive the system back toward normal.
One final comment: Designing with an expert and designing with data are not mutually exclusive approaches. An expert may formulate initial rules and may fine-tune a system designed with data, as was done in the example cited above. Data may be used both to validate and to fill gaps in an expert's knowledge, or to help resolve differences between conflicting experts.
This discussion has been limited by space constraints. It's likely that we'll return to the topic in the future. Thank you for all your feedback and comments, which are always welcome. I enjoy the interaction.