Questions which participants failed to answer

Although I am not willing to draw strong conclusions simply from the types of questions presented in the previous section, it is revealing to examine the questions which ISAAC failed to answer correctly, in order to determine whether a particular style of question is problematic for the model. In most cases, ISAAC's mistakes involved literary concepts which are beyond the scope of this project. For example, questions dealing with the way in which an author ``built suspense'' went completely unanswered. Similarly, questions dealing with irony were answered at a more simplistic level than the evaluators were looking for. In some cases, ISAAC failed to provide the full range of response which the evaluator expected. In other cases, ISAAC answered the question by making an inference which the evaluator felt was unjustified.

Human participants exhibited many of the same characteristics as ISAAC with respect to not providing what the evaluators considered a ``full'' answer. Additionally, the students often missed the literal comprehension questions. I hypothesize that this is due to memory issues: students were not allowed to consult the text while answering the questions, so incorrect memory retrieval could hamper the production of correct answers. While ISAAC's memory is inspired by human memory and is not ``perfect'' in any sense of the word, it is likely that its memory is better than the humans' at retrieving factual information contained in a text.

Kenneth Moorman
11/4/1997