An Efficient Compiler for Weighted Rewrite Rules

June 20, 1996

Mehryar AT&T Research Mohri, Richard Bell Laboratories Sproat

Computer Science

Computation and Language

Context-dependent rewrite rules are used in many areas of natural language and speech processing. Work in computational phonology has demonstrated that, given certain conditions, such rewrite rules can be represented as finite-state transducers (FSTs). We describe a new algorithm for compiling rewrite rules into FSTs. We show the algorithm to be simpler and more efficient than existing algorithms. Further, many of our applications demand the ability to compile weighted rules into weighted FSTs, transducers generalized by providing transitions with weights. We have extended the algorithm to allow for this.

Transducers from Rewrite Rules with Backreferences

April 15, 1999

91% Match

Dale Gerdemann, Noord Gertjan van

Computation and Language

Context sensitive rewrite rules have been widely used in several areas of natural language processing, including syntax, morphology, phonology and speech processing. Kaplan and Kay, Karttunen, and Mohri & Sproat have given various algorithms to compile such rewrite rules into finite-state transducers. The present paper extends this work by allowing a limited form of backreferencing in such rules. The explicit use of backreferencing leads to more elegant and general solutions.

Find SimilarView on arXiv

Compilation of Weighted Finite-State Transducers from Decision Trees

June 14, 1996

91% Match

Richard Bell Laboratories Sproat, Michael AT&T Research Riley

Computation and Language

We report on a method for compiling decision trees into weighted finite-state transducers. The key assumptions are that the tree predictions specify how to rewrite symbols from an input string, and the decision at each tree node is stateable in terms of regular expressions on the input string. Each leaf node can then be treated as a separate rule where the left and right contexts are constructable from the decisions made traversing the tree from the root to the leaf. These ru...

Find SimilarView on arXiv

Weighted Automata in Text and Speech Processing

March 29, 2005

89% Match

Mehryar Mohri, Fernando Pereira, Michael Riley

Computation and Language

Human-Computer Interaction

Finite-state automata are a very effective tool in natural language processing. However, in a variety of applications and especially in speech precessing, it is necessary to consider more general machines in which arcs are assigned weights or costs. We briefly describe some of the main theoretical and algorithmic aspects of these machines. In particular, we describe an efficient composition algorithm for weighted transducers, and give examples illustrating the value of determ...

Find SimilarView on arXiv

A Flexible Rule Compiler for Speech Synthesis

March 23, 2004

88% Match

Wojciech Skut, Stefan Ulrich, Kathrine Hammervold

Computation and Language

Artificial Intelligence

We present a flexible rule compiler developed for a text-to-speech (TTS) system. The compiler converts a set of rules into a finite-state transducer (FST). The input and output of the FST are subject to parameterization, so that the system can be applied to strings and sequences of feature-structures. The resulting transducer is guaranteed to realize a function (as opposed to a relation), and therefore can be implemented as a deterministic device (either a deterministic FST o...

Find SimilarView on arXiv

Speech Recognition by Composition of Weighted Finite Automata

March 7, 1996

87% Match

Fernando C. N. AT&T Research Pereira, Michael D. AT&T Research Riley

Computation and Language

We present a general framework based on weighted finite automata and weighted finite-state transducers for describing and implementing speech recognizers. The framework allows us to represent uniformly the information sources and data structures used in recognition, including context-dependent units, pronunciation dictionaries, language models and lattices. Furthermore, general but efficient algorithms can used for combining information sources in actual recognizers and for o...

Find SimilarView on arXiv

A Bimachine Compiler for Ranked Tagging Rules

July 19, 2004

87% Match

Wojciech Skut, Stefan Ulrich, Kathrine Hammervold

Computation and Language

This paper describes a novel method of compiling ranked tagging rules into a deterministic finite-state device called a bimachine. The rules are formulated in the framework of regular rewrite operations and allow unrestricted regular expressions in both left and right rule contexts. The compiler is illustrated by an application within a speech synthesis system.

Find SimilarView on arXiv

Algorithms for Speech Recognition and Language Processing

August 27, 1996

87% Match

Mehryar AT&T Laboratories Mohri, Michael AT&T Laboratories Riley, Richard Bell Laboratories Sproat

Computation and Language

Speech processing requires very efficient methods and algorithms. Finite-state transducers have been shown recently both to constitute a very useful abstract model and to lead to highly efficient time and space algorithms in this field. We present these methods and algorithms and illustrate them in the case of speech recognition. In addition to classical techniques, we describe many new algorithms such as minimization, global and local on-the-fly determinization of weighted a...

Find SimilarView on arXiv

Compact Representations by Finite-State Transducers

July 5, 1994

85% Match

Mehryar IGM-Ladl, Paris, France Mohri

Computation and Language

Finite-state transducers give efficient representations of many Natural Language phenomena. They allow to account for complex lexicon restrictions encountered, without involving the use of a large set of complex rules difficult to analyze. We here show that these representations can be made very compact, indicate how to perform the corresponding minimization, and point out interesting linguistic side-effects of this operation.

Find SimilarView on arXiv

Efficient Semiring-Weighted Earley Parsing

July 6, 2023

85% Match

Andreas Opedal, Ran Zmigrod, Tim Vieira, ... , Eisner Jason

Computation and Language

Data Structures and Algorith...

Formal Languages and Automat...

This paper provides a reference description, in the form of a deduction system, of Earley's (1970) context-free parsing algorithm with various speed-ups. Our presentation includes a known worst-case runtime improvement from Earley's $O (N^3|G||R|)$, which is unworkable for the large grammars that arise in natural language processing, to $O (N^3|G|)$, which matches the runtime of CKY on a binarized version of the grammar $G$. Here $N$ is the length of the sentence, $|R|$ is th...

Find SimilarView on arXiv

Finite-State Approximation of Phrase-Structure Grammars

March 8, 1996

85% Match

Fernando C. N. AT&T Research Pereira, Rebecca N. AT&T Research Wright

Computation and Language

Phrase-structure grammars are effective models for important syntactic and semantic aspects of natural languages, but can be computationally too demanding for use as language models in real-time speech recognition. Therefore, finite-state models are used instead, even though they lack expressive power. To reconcile those two alternatives, we designed an algorithm to compute finite-state approximations of context-free grammars and context-free-equivalent augmented phrase-struc...

Find SimilarView on arXiv