Expanding the answer choices
The body of a logic puzzle question contains a
(unique) wh-term (typically “which of the following”),
a modality (such as “must be true” or
“could be true”), and (possibly) an added condition.
Each answer choice is expanded by substituting
its SL form for the wh-term in the question
body. For example, the expansion for answer
choice (A) of question 1 in Figure 1 would
be the SL form corresponding to: “If sculpture
D is exhibited . . . , then [Sculpture C is exhibited
in room 1 ] must be true”.
Translating SL to FOL
To translate an SL representation to pure FOL,
we eliminate event variables by replacing an SL
form 9e.P(e)^R1(e, t1)^..^Rn(e, tn) with the
FOL form P(t1, .., tn). An ordering is imposed
on role names to guarantee that arguments are
always used in the same order in relations. Numeric
quantifiers are encoded in FOL in the obvious
way, e.g., Q(¸2, x, ', Ã) is translated to
9x19x2. x1 6= x2 ^('^Ã)[x1/x]^('^Ã)[x2/x].
Each expanded answer choice contains one
modal operator. Modals are moved outward
of negation as usual, and outward of conditionals
by changing A ! ¤B to ¤(A ! B) and
A ! §B to §(A^B). A modal operator in the
outermost scope can then be interpreted as a
directive to the reasoning module to test either
entailment (¤) or consistency (§) between the
preamble and the expanded answer choice.
Using FOL reasoners
There are two reasons for using both theorem
provers and model builders. First, they
are complementary reasoners: while a theorem
prover is designed to demonstrate the inconsistency
of a set of FOL formulas, and so can
find the correct answer to “must be true” questions
through proof by contradiction, a model
builder is designed to find a satisfying model,
and is thus suited to finding the correct answer
to “could be true” questions.7 Second, a
reasoner may take a very long time to halt on
some queries, but the complementary reasoner
may still be used to answer the query in the
context of a multiple-choice question through
a process of elimination. Thus, if the model
builder is able to show that the negations of four
choices are consistent with the preamble (indicating
they are not entailed), then it can be
concluded that the remaining choice is entailed
by the preamble, even if the theorem prover has
not yet found a proof.
We use the Otter 3.3 theorem prover and
the MACE 2.2 model builder (McCune, 1998).8
The reasoning module forks parallel subprocesses,
two per answer choice (one for Otter,
one for MACE). If a reasoner succeeds for an answer
choice, the choice is marked as correct or
incorrect, and the dual sub-process is killed. If
all answer-choices but one are marked incorrect,
the remaining choice is marked correct even if
its sub-processes did not yet terminate.
The body of a logic puzzle question contains a
(unique) wh-term (typically “which of the following”),
a modality (such as “must be true” or
“could be true”), and (possibly) an added condition.
Each answer choice is expanded by substituting
its SL form for the wh-term in the question
body. For example, the expansion for answer
choice (A) of question 1 in Figure 1 would
be the SL form corresponding to: “If sculpture
D is exhibited . . . , then [Sculpture C is exhibited
in room 1 ] must be true”.
Translating SL to FOL
To translate an SL representation to pure FOL,
we eliminate event variables by replacing an SL
form 9e.P(e)^R1(e, t1)^..^Rn(e, tn) with the
FOL form P(t1, .., tn). An ordering is imposed
on role names to guarantee that arguments are
always used in the same order in relations. Numeric
quantifiers are encoded in FOL in the obvious
way, e.g., Q(¸2, x, ', Ã) is translated to
9x19x2. x1 6= x2 ^('^Ã)[x1/x]^('^Ã)[x2/x].
Each expanded answer choice contains one
modal operator. Modals are moved outward
of negation as usual, and outward of conditionals
by changing A ! ¤B to ¤(A ! B) and
A ! §B to §(A^B). A modal operator in the
outermost scope can then be interpreted as a
directive to the reasoning module to test either
entailment (¤) or consistency (§) between the
preamble and the expanded answer choice.
Using FOL reasoners
There are two reasons for using both theorem
provers and model builders. First, they
are complementary reasoners: while a theorem
prover is designed to demonstrate the inconsistency
of a set of FOL formulas, and so can
find the correct answer to “must be true” questions
through proof by contradiction, a model
builder is designed to find a satisfying model,
and is thus suited to finding the correct answer
to “could be true” questions.7 Second, a
reasoner may take a very long time to halt on
some queries, but the complementary reasoner
may still be used to answer the query in the
context of a multiple-choice question through
a process of elimination. Thus, if the model
builder is able to show that the negations of four
choices are consistent with the preamble (indicating
they are not entailed), then it can be
concluded that the remaining choice is entailed
by the preamble, even if the theorem prover has
not yet found a proof.
We use the Otter 3.3 theorem prover and
the MACE 2.2 model builder (McCune, 1998).8
The reasoning module forks parallel subprocesses,
two per answer choice (one for Otter,
one for MACE). If a reasoner succeeds for an answer
choice, the choice is marked as correct or
incorrect, and the dual sub-process is killed. If
all answer-choices but one are marked incorrect,
the remaining choice is marked correct even if
its sub-processes did not yet terminate.