Rather than relying on researcher-generated stories and subsequent researcher evaluation of performance, my work has used published science-fiction stories and experts in the field of reading evaluation to assess the model's performance against an established set of readers and their abilities. Because the model is capable of reading a number of stories, direct reading evaluation was possible. The methodology should be applicable to other subfields, provided that comparably complex models can be produced. It remains important, however, to isolate aspects of the model's performance that diverge from the human norm because of uncontrolled elements; for example, ISAAC's performance on literal comprehension questions is better than average because of its superior memory retrieval.
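As an illustration only, the following Python sketch shows the general shape of such a comparison: the model's mean score on comprehension questions is measured against the norm established by a pool of human readers, so that divergences (such as the above-average literal comprehension just noted) stand out. The function name and all scores are hypothetical, not taken from the actual ISAAC evaluation.

```python
# Minimal sketch of a direct-reading-evaluation comparison.
# All names and numbers are hypothetical illustrations, not ISAAC's data:
# each entry is a per-question score (1 = correct) on literal
# comprehension questions for one story.

from statistics import mean, stdev

def compare_to_human_norm(model_scores, human_reader_scores):
    """Compare a model's mean question score to a pool of human readers."""
    human_means = [mean(reader) for reader in human_reader_scores]
    model_mean = mean(model_scores)
    norm, spread = mean(human_means), stdev(human_means)
    # A z-score flags performance that diverges from the human norm,
    # e.g. above-average literal comprehension driven by memory retrieval.
    return model_mean, (model_mean - norm) / spread

# Hypothetical scores on ten literal comprehension questions.
model = [1, 1, 1, 1, 1, 1, 1, 0, 1, 1]          # 9/10 correct
humans = [
    [1, 0, 1, 1, 0, 1, 1, 0, 1, 1],             # 7/10 correct
    [1, 1, 0, 1, 1, 1, 1, 1, 0, 1],             # 8/10 correct
    [0, 1, 1, 0, 1, 1, 1, 0, 0, 1],             # 6/10 correct
]

model_mean, z = compare_to_human_norm(model, humans)
print(f"model mean = {model_mean:.2f}, z vs. human norm = {z:+.2f}")
```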
In addition, I have performed what I call a direct theory evaluation, appealing to ideas from ecological psychology to assess the significance of the work from an objective point of view. The theoretical elements of my work can be analyzed directly to determine their overall validity and significance. Part of this determination comes from the model instantiation, while part comes from observing how the theory relates to other theories.