May 24, 1995
Similar papers 2
September 10, 2023
Despite recent advances in Natural Language Processing (NLP), hierarchical discourse parsing in the framework of Rhetorical Structure Theory remains challenging, and our understanding of the reasons for this are as yet limited. In this paper, we examine and model some of the factors associated with parsing difficulties in previous work: the existence of implicit discourse relations, challenges in identifying long-distance relations, out-of-vocabulary items, and more. In order...
December 21, 2007
Because of the wide variety of contemporary practices used in the automatic syntactic parsing of natural languages, it has become necessary to analyze and evaluate the strengths and weaknesses of different approaches. This research is all the more necessary because there are currently no genre- and domain-independent parsers that are able to analyze unrestricted text with 100% preciseness (I use this term to refer to the correctness of analyses assigned by a parser). All thes...
August 28, 2017
Discourse parsing has long been treated as a stand-alone problem independent from constituency or dependency parsing. Most attempts at this problem are pipelined rather than end-to-end, sophisticated, and not self-contained: they assume gold-standard text segmentations (Elementary Discourse Units), and use external parsers for syntactic features. In this paper we propose the first end-to-end discourse parser that jointly parses in both syntax and discourse levels, as well as ...
May 9, 2001
This thesis presents a broad-coverage probabilistic top-down parser, and its application to the problem of language modeling for speech recognition. The parser builds fully connected derivations incrementally, in a single pass from left-to-right across the string. We argue that the parsing approach that we have adopted is well-motivated from a psycholinguistic perspective, as a model that captures probabilistic dependencies between lexical items, as part of the process of bui...
April 12, 1996
We describe an implemented system for robust domain-independent syntactic parsing of English, using a unification-based grammar of part-of-speech and punctuation labels coupled with a probabilistic LR parser. We present evaluations of the system's performance along several different dimensions; these enable us to assess the contribution that each individual part is making to the success of the system as a whole, and thus prioritise the effort to be devoted to its further enha...
April 1, 2022
In the PDTB-3, several thousand implicit discourse relations were newly annotated \textit{within} individual sentences, adding to the over 15,000 implicit relations annotated \textit{across} adjacent sentences in the PDTB-2. Given that the position of the arguments to these \textit{intra-sentential implicits} is no longer as well-defined as with \textit{inter-sentential implicits}, a discourse parser must identify both their location and their sense. That is the focus of the ...
June 7, 1994
The availability of large on-line text corpora provides a natural and promising bridge between the worlds of natural language processing (NLP) and machine learning (ML). In recent years, the NLP community has been aggressively investigating statistical techniques to drive part-of-speech taggers, but application-specific text corpora can be used to drive knowledge acquisition at much higher levels as well. In this paper we will show how ML techniques can be used to support kno...
September 2, 1999
This paper describes how robust parsing techniques can be fruitful applied for building a query generation module which is part of a pipelined NLP architecture aimed at process natural language queries in a restricted domain. We want to show that semantic robustness represents a key issue in those NLP systems where it is more likely to have partial and ill-formed utterances due to various factors (e.g. noisy environments, low quality of speech recognition modules, etc...) and...
August 2, 1995
In this paper we present a robust parsing algorithm based on the link grammar formalism for parsing natural languages. Our algorithm is a natural extension of the original dynamic programming recognition algorithm which recursively counts the number of linkages between two words in the input sentence. The modified algorithm uses the notion of a null link in order to allow a connection between any pair of adjacent words, regardless of their dictionary definitions. The algorith...
October 9, 1995
We describe an approach to robust domain-independent syntactic parsing of unrestricted naturally-occurring (English) input. The technique involves parsing sequences of part-of-speech and punctuation labels using a unification-based grammar coupled with a probabilistic LR parser. We describe the coverage of several corpora using this grammar and report the results of a parsing experiment using probabilities derived from bracketed training data. We report the first substantial ...