Another option for analysis
is to consider the effect that
question type had on the scores of the participants.
Reading teachers
and researchers have historically divided
comprehension
into three levels (see
[read:littext2]).
The first level,
literal comprehension, is when the reader understands the
literal meaning of the text.
The next level,
reading between the
lines, is when the reader can accurately reason about the intents and
purposes of the author of the text. At this level the reader identifies
with characters, can assign truth values, recognizes various points of
view, and is capable of understanding metaphor and
irony.
Finally, there is the level of
reading beyond the lines: the reader fully incorporates the
material into their own background, learns new concepts from it, can
place it within historical relevance, and has a complete understanding
of the text.
For my own work, I have chosen to consider five levels of
comprehension. Consider this simple story:
John hit Mary. Mary ran into the house, crying. A minute later,
Mary's mother came out of the house and scolded John.
The following represents the possible levels of comprehension:
1. Literal comprehension is the simplest form of comprehension. It
involves understanding a text well enough to answer direct questions
about it. In the John and Mary story above, literal
comprehension would be adequate to answer questions such as "Who
hit Mary?" (John) and "Who ran into the house?" (Mary).
It would also allow questions that require resolving references to be
answered. For example, it would
allow the reader to see that Mary was
crying, that Mary's mother did the scolding, and that the
mother came out of the same house that Mary ran into.
2. Inference is the next level of comprehension and adds
conceptual knowledge to the literal level. It would allow
the reader to handle questions such as "What sex is Mary?"
(female) and "Did John touch Mary?" (yes). These answers
are not stated directly in the story but can be inferred from the facts that
Mary is generally a girl's name and that in order to hit
someone, you have to touch them. Notice that since these
are inferred answers, they may not be the correct ones: later
information in the story may contradict these conclusions. Beyond
the literal level, the answer to any question becomes more
interpretive.
3. Scenario comprehension is the next level. This level represents the
reader comprehending the entire story as a cohesive set of events and
seeing the underlying causal patterns. At this point in the
comprehension scale, the reader would be able to supply a title for
the story (in this case, perhaps "The Quarrel") and to answer more
abstract questions, such as "Should Mary's mother have scolded
John?" and "What ages do you think the characters are?"
The above levels of comprehension represent roughly what previous
story understanding systems were capable of accomplishing. The
remaining levels represent higher comprehension levels which human
readers possess:
4. Scenario comprehension can be carried a step further
to arrive at the level of incorporation. At this level, the
reader comprehends the story as a cohesive set of events, can see how
the story relates to other experiences in their own life, and can use
the experience for reasoning about future events which are similar.
5. There is one final level of comprehension, historical relevance.
This is the most elusive of the levels and requires the most
sophistication on the part of the reader. Comprehension of some stories
relies on the reader understanding the historical period in which they
were written.
As an example, Orwell's 1984 [book:1984] is
better understood if the reader is aware of the sociopolitical climate
of the late 1940s, the time in which it was written.
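The five levels can be sketched as a small data structure. The following is only an illustrative sketch, not part of the system described here: the names `ComprehensionLevel`, `EXAMPLE_QUESTIONS`, `questions_up_to`, and `beyond_earlier_systems` are hypothetical, and treating the levels as a strictly ordered, cumulative scale is an assumption drawn from the way each level is described as building on the one before it. The example questions are the ones given for the John and Mary story.

```python
from enum import IntEnum


class ComprehensionLevel(IntEnum):
    """The five levels from the text, ordered from least to most demanding.

    The strict ordering is an assumption made for this sketch.
    """
    LITERAL = 1
    INFERENCE = 2
    SCENARIO = 3
    INCORPORATION = 4
    HISTORICAL_RELEVANCE = 5


# Example questions about the John and Mary story, taken from the text,
# tagged with the level at which the text says they become answerable.
EXAMPLE_QUESTIONS = {
    "Who hit Mary?": ComprehensionLevel.LITERAL,
    "Who ran into the house?": ComprehensionLevel.LITERAL,
    "What sex is Mary?": ComprehensionLevel.INFERENCE,
    "Did John touch Mary?": ComprehensionLevel.INFERENCE,
    "Should Mary's mother have scolded John?": ComprehensionLevel.SCENARIO,
    "What ages do you think the characters are?": ComprehensionLevel.SCENARIO,
}


def questions_up_to(level):
    """Return the example questions answerable at or below the given level."""
    return [q for q, lvl in EXAMPLE_QUESTIONS.items() if lvl <= level]


def beyond_earlier_systems(level):
    """True for the levels the text places above what earlier systems reached."""
    return level > ComprehensionLevel.SCENARIO


if __name__ == "__main__":
    for level in ComprehensionLevel:
        print(level.name, len(questions_up_to(level)), beyond_earlier_systems(level))
```

Using an ordered enumeration makes the cumulative reading explicit: a reader at the scenario level is taken to answer the literal and inference questions as well.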
As would be expected, the type of question had a significant impact on
the score for that question. To determine this, an ANOVA
was performed; the results are shown in
Table 4. For this analysis, the scores for each participant
were broken down to a question-by-question level of detail.
Literal comprehension questions and incorporation
questions were the easiest to answer; inference and
scenario comprehension questions were the more difficult
ones to handle. Further analysis was not performed
on these data because they do not fit the assumptions
of the ANOVA model well, which somewhat lessens confidence
in the results. The results show that the
question types differed in difficulty and that
question type influenced the overall score of the readers.
I present the means for question type for the humans
as a set and for ISAAC as a set (i.e., all humans averaged across
all stories and all evaluators, and the same for ISAAC) in
Table 5.
Table 4: Results of the ANOVA, question type
Source | DF | F
evaluator | 3 | 1.76
story | 2 | 0.71
agent | 10 | 2.89**
question type | 3 | 7.94***
Error | 850 |
Total | 868 |
** p < 0.01
*** p < 0.001
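The degrees of freedom in Table 4 can be checked with a few lines of arithmetic. This is a minimal sketch that assumes a main-effects-only model under the usual conventions (a factor with k levels contributes k - 1 degrees of freedom, and the total is the number of observations minus one); the function names are hypothetical, and the values are copied from Table 4.

```python
# Degrees of freedom as reported in Table 4.
EFFECT_DF = {"evaluator": 3, "story": 2, "agent": 10, "question type": 3}
ERROR_DF = 850
TOTAL_DF = 868


def levels_per_factor(effect_df):
    """A main effect with k levels has k - 1 degrees of freedom, so k = DF + 1."""
    return {name: df + 1 for name, df in effect_df.items()}


def df_is_consistent(effect_df, error_df, total_df):
    """Effect and error degrees of freedom should add up to the total."""
    return sum(effect_df.values()) + error_df == total_df


def observation_count(total_df):
    """Total DF is N - 1 under the assumed convention, so N = total DF + 1."""
    return total_df + 1


if __name__ == "__main__":
    print(levels_per_factor(EFFECT_DF))
    print(df_is_consistent(EFFECT_DF, ERROR_DF, TOTAL_DF))
    print(observation_count(TOTAL_DF))
```

Under these assumptions the table is internally consistent: the question type row implies four question types, matching the four rows of Table 5, and the total implies 869 individually scored questions.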
Table 5: Means of various question types
Question Type | Human Mean | Human Std. Dev. | ISAAC Mean | ISAAC Std. Dev.
Literal | 0.8245 | 0.02918 | 0.9674 | 0.08018
Inference | 0.6220 | 0.04111 | 0.6786 | 0.11295
Scenario | 0.7276 | 0.02261 | 0.8358 | 0.06213
Incorporation | 0.8438 | 0.03472 | 0.7717 | 0.09539
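The means in Table 5 can be ranked to compare the two groups directly. This is a small sketch over the reported means only; the names `TABLE_5_MEANS` and `rank_by_mean` are hypothetical, and the code performs no significance testing.

```python
# Mean scores copied from Table 5: question type -> (human mean, ISAAC mean).
TABLE_5_MEANS = {
    "Literal": (0.8245, 0.9674),
    "Inference": (0.6220, 0.6786),
    "Scenario": (0.7276, 0.8358),
    "Incorporation": (0.8438, 0.7717),
}

HUMAN, ISAAC = 0, 1


def rank_by_mean(column):
    """Order question types from highest to lowest mean score."""
    return sorted(TABLE_5_MEANS, key=lambda q: TABLE_5_MEANS[q][column], reverse=True)


human_order = rank_by_mean(HUMAN)
isaac_order = rank_by_mean(ISAAC)

# Positive values mean ISAAC's mean is higher than the human mean.
differences = {q: round(isaac - human, 4) for q, (human, isaac) in TABLE_5_MEANS.items()}

if __name__ == "__main__":
    print("Human:", human_order)
    print("ISAAC:", isaac_order)
    print("ISAAC minus human:", differences)
```

On these figures, inference has the lowest mean for both groups, and incorporation is the only question type on which the human mean exceeds ISAAC's.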
Kenneth Moorman
11/4/1997