Traditional approaches to natural language understanding
(Woods, 1973; Warren and Pereira,
1982; Alshawi, 1992) provided a good account
of mapping from surface forms to semantic representations,
when confined to a very limited
vocabulary, syntax, and world model, and the resulting
low levels of syntactic/semantic ambiguity.
It is, however, difficult to scale these
methods to unrestricted, general-domain natural
language input because of the overwhelming
problems of grammar coverage, unknown words,
unresolvable ambiguities, and incomplete domain
knowledge. Recent work in NLP has
consequently focused on more robust, broad-coverage
techniques, but with the effect of
overall shallower levels of processing. Thus,
state-of-the-art work on probabilistic parsing
(e.g., Collins, 1999) provides a good solution
to robust, broad-coverage parsing with automatic
and frequently successful ambiguity resolution,
but has largely ignored issues of semantic
interpretation. The field of Question Answering
(Pasca and Harabagiu, 2001; Moldovan et al.,
2003) focuses on simple-fact queries. And so-called
semantic parsing (Gildea and Jurafsky,
2002) provides as end output only a flat classification
of semantic arguments of predicates,
ignoring much of the semantic content, such as
quantifiers.

A major research question that remains unanswered
is whether there are methods for getting
from a robust “parse-anything” statistical
parser to a semantic representation precise
enough for knowledge representation and automated
reasoning, without falling afoul of the
same problems that stymied the broad application
of traditional approaches. This paper
presents initial work on a system that addresses
this question. The chosen task is solving logic
puzzles of the sort found in the Law School Admission
Test (LSAT) and the old analytic section
of the Graduate Record Exam (GRE) (see
Figure 1 for a typical example). The system integrates
statistical parsing, “on-the-fly” combinatorial
synthesis of semantic forms, scope- and
reference-resolution, and precise semantic representations
that support the inference required
for solving the puzzles. Our work complements
research in semantic parsing and TREC-style
Question Answering by emphasizing complex
yet robust inference over general-domain
NL texts given relatively minimal lexical and
knowledge-base resources.