July 1, 1996
Similar papers 4
June 14, 1999
Grammatical relationships are an important level of natural language processing. We present a trainable approach to find these relationships through transformation sequences and error-driven learning. Our approach finds grammatical relationships between core syntax groups and bypasses much of the parsing phase. On our training and test set, our procedure achieves 63.6% recall and 77.3% precision (f-score = 69.8).
October 16, 2024
Sophisticated grammatical error detection/correction tools are available for a small set of languages such as English and Chinese. However, it is not straightforward -- if not impossible -- to adapt them to morphologically rich languages with complex writing rules like Turkish which has more than 80 million speakers. Even though several tools exist for Turkish, they primarily focus on spelling errors rather than grammatical errors and lack features such as web interfaces, err...
February 9, 1995
In this paper we present an interactive spelling correction system for Modern Greek. The entire system is based on a morphological lexicon. Emphasis is given to the development of the lexicon, especially as far as storage economy, speed efficiency and dictionary coverage is concerned. Extensive research was conducted from both the computer engineering and linguisting fields, in order to describe inflectional morphology as economically as possible.
June 22, 1999
This article investigates the use of Transformation-Based Error-Driven learning for resolving part-of-speech ambiguity in the Greek language. The aim is not only to study the performance, but also to examine its dependence on different thematic domains. Results are presented here for two different test cases: a corpus on "management succession events" and a general-theme corpus. The two experiments show that the performance of this method does not depend on the thematic domai...
May 25, 2023
Grammatical error correction systems improve written communication by detecting and correcting language mistakes. To help language learners better understand why the GEC system makes a certain correction, the causes of errors (evidence words) and the corresponding error types are two key factors. To enhance GEC systems with explanations, we introduce EXPECT, a large dataset annotated with evidence words and grammatical error types. We propose several baselines and analysis to...
November 16, 2000
This paper explores the usefulness of a technique from software engineering, code instrumentation, for the development of large-scale natural language grammars. Information about the usage of grammar rules in test and corpus sentences is used to improve grammar and testsuite, as well as adapting a grammar to a specific genre. Results show that less than half of a large-coverage grammar for German is actually tested by two large testsuites, and that 10--30% of testing time is ...
May 11, 2022
In Grammatical Error Correction, systems are evaluated by the number of errors they correct. However, no one has assessed whether all error types are equally important. We provide and apply a method to quantify the importance of different grammatical error types to humans. We show that some rare errors are considered disturbing while other common ones are not. This affects possible directions to improve both systems and their evaluation.
October 20, 1994
This paper presents the XTAG system, a grammar development tool based on the Tree Adjoining Grammar (TAG) formalism that includes a wide-coverage syntactic grammar for English. The various components of the system are discussed and preliminary evaluation results from the parsing of various corpora are given. Results from the comparison of XTAG against the IBM statistical parser and the Alvey Natural Language Tool parser are also given.
September 23, 2023
We present the latest version of the Spanish Resource Grammar (SRG). The new SRG uses the recent version of Freeling morphological analyzer and tagger and is accompanied by a manually verified treebank and a list of documented issues. We also present the grammar's coverage and overgeneration on a small portion of a learner corpus, an entirely new research line with respect to the SRG. The grammar can be used for linguistic research, such as for empirically driven development ...
March 29, 2011
Existing grammar frameworks do not work out particularly well for controlled natural languages (CNL), especially if they are to be used in predictive editors. I introduce in this paper a new grammar notation, called Codeco, which is designed specifically for CNLs and predictive editors. Two different parsers have been implemented and a large subset of Attempto Controlled English (ACE) has been represented in Codeco. The results show that Codeco is practical, adequate and effi...