Next: Questions which participants failed Up: Baseline model performance Previous: Repeated questions across evaluators

Question type

Another option for analysis is to consider the effect which question type had on the scores of the participants. For instance, reading teachers and researchers have historically broken comprehension down into three levels (see [read:littext2]). The literal comprehension level is when the reader understands the literal meaning of the text. The next level, reading between the lines, is when the reader can accurately reason about the intents and purposes of the author of the text: the reader identifies with characters, can assign truth values, may recognize various points of view, and is capable of understanding metaphor and irony. Finally, there is the level of reading beyond the lines: the reader fully incorporates the material into their own background, learns new concepts from it, can appreciate its historical relevance, and has a complete understanding of the text.

For my own work, I have chosen to consider five levels of comprehension. Consider this simple story:

John hit Mary. Mary ran into the house, crying. A minute later, Mary's mother came out of the house and scolded John.

The following represents the possible levels of comprehension:

 
1.
Literal comprehension is the simplest form of comprehension. It involves understanding a text sufficiently to answer direct questions concerning it. In the John and Mary story above, literal comprehension would be adequate to answer questions such as Who hit Mary? (John) and Who ran into the house? (Mary). It would also allow questions that require resolving references to be answered. For example, it would allow the reader to see that Mary was crying, that Mary's mother did the scolding, and that the mother came out of the same house that Mary ran into.
2.
Inference is the next level of comprehension and adds conceptual knowledge to the literal level. It would allow the reader to handle questions such as What sex is Mary? (female) and Did John touch Mary? (yes). These answers are not directly in the story but can be inferred from the facts that Mary is generally a girl's name and that in order to hit someone, you have to touch them. Notice that since these are inferred answers, they may not be the correct ones. Later information in the story may contradict these conclusions; beyond the literal level, the answer to any question becomes more interpretive.

 

3.
Scenario comprehension is the next level. This level represents the reader comprehending the entire story as a cohesive set of events and seeing the underlying causal patterns. At this point in the comprehension scale, the reader would be able to supply a title for the story (in this case, perhaps The Quarrel) and to answer more abstract questions, such as Should Mary's mother have scolded John? and What ages do you think the characters are?

  The above levels of comprehension represent roughly what previous story understanding systems were capable of accomplishing. The remaining levels represent higher comprehension levels which human readers possess:

4.
Scenario comprehension can be carried a step further to arrive at the level of incorporation. At this level, the reader comprehends the story as a cohesive set of events, can see how the story relates to other experiences in their own life, and can use the experience for reasoning about similar future events.
5.
There is one final level of comprehension, historical relevance. This is the most elusive of the levels and requires the most sophistication on the part of the reader. Comprehension of some stories relies on the reader understanding the historical period in which they were written. As an example, Orwell's 1984 [book:1984] is better understood if the reader is aware of the sociopolitical climate of the late 1940s, the time in which it was written.

As would be expected, the type of question had a significant impact on the score for that question. To determine this, an ANOVA was performed; the results are shown in Table 4. In this case, the scores for each participant were broken down to a question-by-question level of detail. Literal comprehension questions and incorporation questions were the easiest to answer; inference and scenario comprehension questions were the more difficult ones to handle. Further analysis was not performed on these data because they do not fit the assumptions of the ANOVA model well, which somewhat lessens confidence in the results. The results show that the different question types were at different levels of difficulty; the question type also influenced the overall score of the readers. Table 5 presents the means for each question type for the humans as a set and for ISAAC as a set (i.e., all humans averaged across all stories and all evaluators, and the same for ISAAC).
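To make the F statistic concrete, the following is a minimal sketch of a one-way ANOVA over question type only. It is a simplification, not the analysis reported here: the ANOVA in Table 4 also includes evaluator, story, and agent as factors. The function name `one_way_f` and the scores in `hypothetical_scores` are invented for illustration and are not the study's data.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a group label to a list of numeric scores.
    """
    all_scores = [s for g in groups.values() for s in g]
    grand_mean = mean(all_scores)
    # Between-group sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum((s - mean(g)) ** 2 for g in groups.values() for s in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical per-question scores (NOT the study's data), keyed by question type.
hypothetical_scores = {
    "literal": [0.5, 0.6, 0.7],
    "inference": [0.1, 0.2, 0.3],
    "scenario": [0.2, 0.3, 0.4],
}

f_stat, df_b, df_w = one_way_f(hypothetical_scores)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the question-type means differ by more than the within-type spread of scores would suggest.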


 
Table 4: Results of the ANOVA, question type
Source           DF     F
evaluator         3     1.76
story             2     0.71
agent            10     2.89**
question type     3     7.94***
Error           850
Total           868
**  p < 0.01
*** p < 0.001
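The degrees of freedom in Table 4 can be checked with simple bookkeeping. The sketch below assumes each source enters as a main effect, so that a factor with k levels contributes k - 1 degrees of freedom and the total is the number of observations minus one. The level and observation counts it prints are implied by the table under that assumption rather than stated in the text.

```python
# Degrees of freedom as reported in Table 4.
factor_df = {"evaluator": 3, "story": 2, "agent": 10, "question type": 3}
error_df = 850
total_df = 868

# Assumption: main effects only, so a factor with k levels has k - 1 df.
implied_levels = {name: df + 1 for name, df in factor_df.items()}

# Total df is the number of scored observations minus one.
implied_observations = total_df + 1

# The factor and error df should add up to the total df.
df_sum = sum(factor_df.values()) + error_df

print(implied_levels)
print(implied_observations)
print(df_sum == total_df)
```

Under this reading the table is internally consistent, and the four implied question types match the four rows of Table 5.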


 
Table 5: Means of various question types
Question Type    Human Mean    Human Std. Dev.    ISAAC Mean    ISAAC Std. Dev.
Literal          0.8245        0.02918            0.9674        0.08018
Inference        0.6220        0.04111            0.6786        0.11295
Scenario         0.7276        0.02261            0.8358        0.06213
Incorporation    0.8438        0.03472            0.7717        0.09539
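As a reading aid for Table 5, the sketch below ranks the question types by mean score for each group and computes the ISAAC-minus-human difference. The means are copied from the table; the variable names are illustrative, and the comparison says nothing about statistical significance.

```python
# (human mean, ISAAC mean) per question type, copied from Table 5.
table5_means = {
    "Literal": (0.8245, 0.9674),
    "Inference": (0.6220, 0.6786),
    "Scenario": (0.7276, 0.8358),
    "Incorporation": (0.8438, 0.7717),
}

# Question types ordered from highest to lowest mean score.
human_rank = sorted(table5_means, key=lambda q: table5_means[q][0], reverse=True)
isaac_rank = sorted(table5_means, key=lambda q: table5_means[q][1], reverse=True)

# Positive gap: ISAAC's mean is above the human mean for that question type.
gaps = {q: round(isaac - human, 4) for q, (human, isaac) in table5_means.items()}

print("Human order:", human_rank)
print("ISAAC order:", isaac_rank)
print("ISAAC - human:", gaps)
```

Inference has the lowest mean for both groups. The orderings differ in that incorporation has the highest human mean, while for ISAAC it falls below literal and scenario.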


Kenneth Moorman
11/4/1997