Similar papers
December 19, 1994
The paper describes a parser of sequences of (English) part-of-speech labels which utilises a probabilistic grammar trained using the inside-outside algorithm. The initial (meta)grammar is defined by a linguist and further rules compatible with metagrammatical constraints are automatically generated. During training, rules with very low probability are rejected yielding a wide-coverage parser capable of ranking alternative analyses. A series of corpus-based experiments descri...
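To make the rule-rejection step concrete, here is a minimal Python sketch, not the paper's code: after inside-outside re-estimation, rules whose probability falls below a threshold are discarded and the survivors renormalized. The grammar, rule format, and threshold value are invented for the example.

```python
from collections import defaultdict

def prune_low_probability_rules(rules, threshold=1e-3):
    """rules: dict mapping (lhs, rhs) -> probability, e.g. after inside-outside."""
    kept = {r: p for r, p in rules.items() if p >= threshold}
    # Renormalize so the rules for each left-hand side still sum to 1.
    totals = defaultdict(float)
    for (lhs, _), p in kept.items():
        totals[lhs] += p
    return {(lhs, rhs): p / totals[lhs] for (lhs, rhs), p in kept.items()}

grammar = {
    ("S", ("NP", "VP")): 0.9,
    ("S", ("VP",)): 0.0999,
    ("S", ("NP", "NP")): 0.0001,  # falls below the threshold and is dropped
}
print(prune_low_probability_rules(grammar))
```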
March 27, 2007
Exact parsing with finite state automata is deemed inappropriate because of the unbounded non-locality languages overwhelmingly exhibit. We propose a way to structure the parsing task in order to make it amenable to local classification methods. This allows us to build a Dynamic Bayesian Network which uncovers the syntactic dependency structure of English sentences. Experiments with the Wall Street Journal demonstrate that the model successfully learns from labeled data.
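To illustrate the "local classification" framing (the paper itself uses a Dynamic Bayesian Network; the features, weights, and scorer below are invented), a toy sketch in which each word independently picks the head that a local linear classifier scores highest:

```python
def local_features(sent, dep, head):
    # Purely local evidence: the dependent/head tag pair and their distance.
    head_tag = "ROOT" if head == -1 else sent[head][1]
    return {f"pair={sent[dep][1]}>{head_tag}", f"dist={head - dep}"}

def score(features, weights):
    return sum(weights.get(f, 0.0) for f in features)

def parse(sent, weights):
    heads = []
    for dep in range(len(sent)):
        candidates = [-1] + [h for h in range(len(sent)) if h != dep]
        heads.append(max(candidates,
                         key=lambda h: score(local_features(sent, dep, h), weights)))
    return heads

sent = [("the", "DT"), ("cat", "NN"), ("sleeps", "VB")]
weights = {"pair=DT>NN": 1.0, "pair=NN>VB": 1.0, "pair=VB>ROOT": 1.0, "dist=1": 0.1}
print(parse(sent, weights))  # [1, 2, -1]: the -> cat -> sleeps -> ROOT
```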
June 11, 1997
This paper presents a statistical parser for natural language that obtains a parsing accuracy---roughly 87% precision and 86% recall---which surpasses the best previously published results on the Wall St. Journal domain. The parser itself requires very little human intervention, since the information it uses to make parsing decisions is specified in a concise and simple manner, and is combined in a fully automatic way under the maximum entropy framework. The observed running ...
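The combination "under the maximum entropy framework" amounts to a log-linear model over parsing decisions. The sketch below shows only that scoring scheme, with invented feature names and weights rather than the parser's actual feature set:

```python
import math

def maxent_distribution(candidates, active_features, weights):
    # Log-linear score: sum the weights of the features active for each decision.
    scores = [sum(weights.get(f, 0.0) for f in active_features(c))
              for c in candidates]
    z = sum(math.exp(s) for s in scores)   # softmax normalization
    return {c: math.exp(s) / z for c, s in zip(candidates, scores)}

weights = {"action=SHIFT&tag=NN": 0.8, "action=REDUCE&tag=NN": -0.3}
dist = maxent_distribution(["SHIFT", "REDUCE"],
                           lambda a: {f"action={a}&tag=NN"},
                           weights)
print(dist)  # SHIFT receives roughly 75% of the probability mass
```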
August 21, 2000
This paper examines efficient predictive broad-coverage parsing without dynamic programming. In contrast to bottom-up methods, depth-first top-down parsing produces partial parses that are fully connected trees spanning the entire left context, from which any kind of non-local dependency or partial semantic interpretation can in principle be read. We contrast two predictive parsing approaches, top-down and left-corner parsing, and find both to be viable. In addition, we find ...
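A minimal sketch of the depth-first top-down strategy being compared, as a recursive-descent parser over a toy grammar (grammar and input are illustrative). Each prediction expands the leftmost open nonterminal, so the structure built over the consumed prefix is always a connected tree:

```python
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["DT", "NN"]],
    "VP": [["VB", "NP"], ["VB"]],
}

def parse(symbol, tags, i):
    """Yield (tree, next_position) for expansions of `symbol` over tags[i:]."""
    if symbol not in GRAMMAR:                 # preterminal: match the next tag
        if i < len(tags) and tags[i] == symbol:
            yield symbol, i + 1
        return
    for rhs in GRAMMAR[symbol]:               # predict each rule top-down
        def expand(children, j, rest):
            if not rest:
                yield (symbol, children), j
                return
            for child, k in parse(rest[0], tags, j):
                yield from expand(children + [child], k, rest[1:])
        yield from expand([], i, rhs)

for tree, end in parse("S", ["DT", "NN", "VB", "DT", "NN"], 0):
    if end == 5:                               # keep parses spanning the input
        print(tree)
```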
April 1, 2019
We introduce a novel transition system for discontinuous constituency parsing. Instead of storing subtrees in a stack (i.e., a data structure with linear-time sequential access), the proposed system uses a set of parsing items, with constant-time random access. This change makes it possible to construct any discontinuous constituency tree in exactly $4n - 2$ transitions for a sentence of length $n$. At each parsing step, the parser considers every item in the set to be combin...
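A much-simplified sketch of the set-of-items idea (the paper's actual transition inventory differs, which is how it reaches exactly $4n - 2$ transitions): because items live in a set rather than a stack, a combine step may join non-adjacent items, which is what licenses discontinuous constituents. The labels and the scripted action sequence are invented.

```python
def parse(words, actions):
    items, buffer = set(), list(enumerate(words))
    for act in actions:
        if act == "SHIFT":
            i, w = buffer.pop(0)
            items.add((w, frozenset([i])))       # item = (label, token indices)
        else:
            label, a, b = act                    # combine the items labeled a and b
            ia = next(x for x in items if x[0] == a)
            ib = next(x for x in items if x[0] == b)
            items -= {ia, ib}
            items.add((label, ia[1] | ib[1]))    # spans may be non-contiguous
    return items

# "wake ... up" forms a discontinuous constituent around the object NP.
words = ["wake", "your", "friend", "up"]
actions = ["SHIFT", "SHIFT", "SHIFT", "SHIFT",
           ("NP", "your", "friend"),
           ("V", "wake", "up"),                  # joins two non-adjacent items
           ("VP", "V", "NP")]
print(parse(words, actions))
```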
June 28, 1994
This paper describes a grammar learning system that combines model-based and data-driven learning within a single framework. Our results from learning grammars using the Spoken English Corpus (SEC) suggest that combined model-based and data-driven learning can produce a more plausible grammar than either learning style used in isolation.
January 23, 2014
Semantic parsing, i.e., the automatic derivation of a meaning representation for a sentence, such as an instantiated predicate-argument structure, plays a critical role in deep processing of natural language. Unlike other top semantic dependency parsing systems, which rely on a pipeline framework to chain up a series of submodels, each specialized for a specific subtask, the one presented in this article integrates everything into one model, in hopes of achieving des...
August 28, 1995
There are currently two philosophies for building grammars and parsers: statistically induced grammars and wide-coverage grammars. One way to combine the strengths of both approaches is to have a wide-coverage grammar with a heuristic component which is domain-independent but whose contribution is tuned to particular domains. In this paper, we discuss a three-stage approach to disambiguation in the context of a lexicalized grammar, using a variety of domain-independent heur...
May 9, 2001
This thesis presents a broad-coverage probabilistic top-down parser, and its application to the problem of language modeling for speech recognition. The parser builds fully connected derivations incrementally, in a single pass from left-to-right across the string. We argue that the parsing approach that we have adopted is well-motivated from a psycholinguistic perspective, as a model that captures probabilistic dependencies between lexical items, as part of the process of bui...
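Such an incremental parser typically serves as a language model via prefix probabilities: the conditional probability of the next word is the ratio of the total probability mass of partial derivations after and before consuming it. A toy sketch with invented numbers:

```python
def next_word_prob(beam_before, beam_after):
    """beam_*: probabilities of partial derivations over the prefix, before/after the word."""
    return sum(beam_after) / sum(beam_before)

# Toy numbers: three derivations cover the prefix; two survive the next word.
print(next_word_prob([0.04, 0.01, 0.005], [0.02, 0.004]))  # ~0.436
```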
May 2, 1999
Corpus-based grammar induction generally relies on hand-parsed training data to learn the structure of the language. Unfortunately, the cost of building large annotated corpora is prohibitive. This work aims to improve the induction strategy when there are few labels in the training data. We show that the most informative linguistic constituents are the higher nodes in the parse trees, typically denoting complex noun phrases and sentential clauses. They account fo...
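To illustrate what "higher nodes in the parse trees" means operationally, the sketch below collects labeled spans from a bracketed tree and keeps only the wider ones; the tree, labels, and width cutoff are invented for the example.

```python
def constituents(tree, start=0):
    """tree: (label, children) with string leaves; returns (spans, end_position)."""
    label, children = tree
    pos, spans = start, []
    for child in children:
        if isinstance(child, str):
            pos += 1                              # a leaf covers one word
        else:
            sub, pos = constituents(child, pos)
            spans.extend(sub)
    spans.append((label, start, pos))
    return spans, pos

tree = ("S", [("NP", ["the", "cat"]),
              ("VP", ["chased", ("NP", ["a", "mouse"])])])
spans, _ = constituents(tree)
high = [s for s in spans if s[2] - s[1] >= 3]    # keep only the higher, wider nodes
print(high)  # [('VP', 2, 5), ('S', 0, 5)]
```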