Tuesday, January 28, 2014

Why Logic Puzzles?


Logic puzzles have a number of attractive characteristics
as a target domain for research placing
a premium on precise inference.
First, whereas for humans the language understanding
part of logic puzzles is trivial but
the reasoning is difficult, for computers it is
clearly the reverse. It is straightforward for a
computer to solve a formalized puzzle, so the
research effort is on the NLP parts rather than
a difficult back-end AI problem. Moreover, only
a small core of world knowledge (prominently,
temporal and spatial entailments) is typically
crucial to solving the task.
Second, the texts employ everyday language:
there are no domain-restrictions on syntactic
and semantic constructions, and the situations
described by the texts are diverse.

Preamble: Six sculptures – C, D, E, F, G, and H
– are to be exhibited in rooms 1, 2, and 3 of an art
gallery. The exhibition conforms to the following
conditions:
(1) Sculptures C and E may not be exhibited in
the same room.
(2) Sculptures D and G must be exhibited in the
same room.
(3) If sculptures E and F are exhibited in the same
room, no other sculpture may be exhibited in that
room.
(4) At least one sculpture must be exhibited in each
room, and no more than three sculptures may be
exhibited in any room.
Question 1: If sculpture D is exhibited in room
3 and sculptures E and F are exhibited in room 1,
which of the following may be true?
(A) Sculpture C is exhibited in room 1.
(B) No more than 2 sculptures are exhibited in
room 3.
(C) Sculptures F and H are exhibited in the same
room.
(D) Three sculptures are exhibited in room 2.
(E) Sculpture G is exhibited in room 2.
Question 2: If sculptures C and G are exhibited
in room 1, which of the following may NOT be a
complete list of the sculpture(s) exhibited in room
2?
(A) Sculpture D (B) Sculptures E and H (C). . .

Third, and most crucial, answers to puzzle
questions never explicitly appear in the text and
must be logically inferred from it, so there is
very little opportunity to use existing superficial
analysis methods of information-extraction and
question-answering as a substitute for deep understanding.
A prerequisite for successful inference
is precise understanding of semantic phenomena
like modals and quantifiers, in contrast
with much current NLP work that just ignores
such items. We believe that representations
with a well-defined model-theoretic semantics
are required.
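
To make the point about formalized puzzles concrete, here is a minimal sketch of the sculpture puzzle quoted above, hand-encoded and solved by brute-force enumeration. It is an illustration only, not the system described in the text: the function names (satisfies_conditions, models, may_be_true, must_be_true) and the hand-written encoding of conditions (1)-(4) are assumptions made for this example. It reads "may be true" as "holds in at least one assignment consistent with the conditions" and "must be true" as "holds in all of them", which is one simple model-theoretic reading of the modals.

```python
from itertools import product

SCULPTURES = "CDEFGH"
ROOMS = (1, 2, 3)


def satisfies_conditions(a):
    """Check conditions (1)-(4) for an assignment a: sculpture -> room."""
    if a["C"] == a["E"]:                      # (1) C and E in different rooms
        return False
    if a["D"] != a["G"]:                      # (2) D and G in the same room
        return False
    if a["E"] == a["F"]:                      # (3) E and F together => alone
        if any(a[s] == a["E"] for s in "CDGH"):
            return False
    counts = [sum(1 for s in SCULPTURES if a[s] == r) for r in ROOMS]
    return all(1 <= n <= 3 for n in counts)   # (4) 1 to 3 sculptures per room


def models(premise=lambda a: True):
    """Yield every assignment that meets the conditions and the premise."""
    for rooms in product(ROOMS, repeat=len(SCULPTURES)):
        a = dict(zip(SCULPTURES, rooms))
        if satisfies_conditions(a) and premise(a):
            yield a


def may_be_true(claim, premise):
    """'May be true': the claim holds in at least one model."""
    return any(claim(a) for a in models(premise))


def must_be_true(claim, premise):
    """'Must be true': the claim holds in every model."""
    return all(claim(a) for a in models(premise))


def count_in_room(a, room):
    return sum(1 for s in SCULPTURES if a[s] == room)


# Question 1: D is in room 3, and E and F are in room 1.
def q1_premise(a):
    return a["D"] == 3 and a["E"] == 1 and a["F"] == 1


Q1_CHOICES = {
    "A": lambda a: a["C"] == 1,
    "B": lambda a: count_in_room(a, 3) <= 2,
    "C": lambda a: a["F"] == a["H"],
    "D": lambda a: count_in_room(a, 2) == 3,
    "E": lambda a: a["G"] == 2,
}


def answer_question_1():
    """Return the labels of all choices that may be true under the premise."""
    return [label for label, claim in Q1_CHOICES.items()
            if may_be_true(claim, q1_premise)]


if __name__ == "__main__":
    print(answer_question_1())
```

Under this encoding the enumeration leaves choice (B) as the only option that may be true for Question 1. The search itself is trivial (3^6 = 729 candidate assignments); the hard part, as the text argues, is getting from the English wording to an encoding like satisfies_conditions automatically.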
Finally, the task has a clear evaluation metric
because the puzzle texts are designed to yield
exactly one correct answer to each multiple-choice
question. Moreover, the domain is another
example of “found test material” in the
sense of (Hirschman et al., 1999): puzzle texts
were developed with a goal independent of the
evaluation of natural language processing systems,
and so provide a more realistic evaluation
framework than specially-designed tests such as
TREC QA.
While our current system is not a real-world
application, we believe that the methods being
developed could be used in applications such as
a computerized office assistant that must understand
requests such as: “Put each file containing
a task description in a different directory.”
