ID: 2106.16171

Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer

June 30, 2021


Similar papers (page 3)

Analyzing Zero-shot Cross-lingual Transfer in Supervised NLP Tasks

January 26, 2021

92% Match
Hyunjin Choi, Judong Kim, Seongho Joe, ..., Youngjune Gwon
Computation and Language
Artificial Intelligence

In zero-shot cross-lingual transfer, a supervised NLP task trained on a corpus in one language is directly applicable to another language without any additional training. A source of cross-lingual transfer can be as straightforward as lexical overlap between languages (e.g., use of the same scripts, shared subwords) that naturally forces text embeddings to occupy a similar representation space. Recently introduced cross-lingual language model (XLM) pretraining brings out neur...


Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?

December 20, 2023

92% Match
Tannon Kew, Florian Schottmann, Rico Sennrich
Computation and Language

The vast majority of today's large language models are English-centric, having been pretrained predominantly on English text. Yet, in order to meet user expectations, models need to be able to respond appropriately in multiple languages once deployed in downstream applications. Given limited exposure to other languages during pretraining, cross-lingual transfer is important for achieving decent performance in non-English settings. In this work, we investigate just how much mu...


mT5: A massively multilingual pre-trained text-to-text transformer

October 22, 2020

92% Match
Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, ..., Colin Raffel
Computation and Language

The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We detail the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual benchmarks. We also describe a ...


When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models

October 24, 2020

91% Match
Benjamin Muller, Antonis Anastasopoulos, ..., Djamé Seddah
Computation and Language

Transfer learning based on pretraining language models on a large amount of raw data has become a new norm to reach state-of-the-art performance in NLP. Still, it remains unclear how this approach should be applied for unseen languages that are not covered by any available large-scale multilingual language model and for which only a small amount of raw data is generally available. In this work, by comparing multilingual and monolingual models, we show that such models behave ...


Language Models are Few-shot Multilingual Learners

September 16, 2021

91% Match
Genta Indra Winata, Andrea Madotto, Zhaojiang Lin, Rosanne Liu, ..., Pascale Fung
Computation and Language
Artificial Intelligence

General-purpose language models have demonstrated impressive capabilities, performing on par with state-of-the-art approaches on a range of downstream natural language processing (NLP) tasks and benchmarks when inferring instructions from very few examples. Here, we evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages without any parameter updates. We show that, given a few English examples as context, pre...


How Do Multilingual Encoders Learn Cross-lingual Representation?

July 12, 2022

91% Match
Shijie Wu
Computation and Language

NLP systems typically require support for more than one language. As different languages have different amounts of supervision, cross-lingual transfer benefits languages with little to no training data by transferring from other languages. From an engineering perspective, multilingual NLP benefits development and maintenance by serving multiple languages with a single system. Both cross-lingual transfer and multilingual NLP rely on cross-lingual representations serving as the...


An Efficient Approach for Studying Cross-Lingual Transfer in Multilingual Language Models

March 29, 2024

91% Match
Fahim Faisal, Antonios Anastasopoulos
Computation and Language

The capacity and effectiveness of pre-trained multilingual models (MLMs) for zero-shot cross-lingual transfer is well established. However, phenomena of positive or negative transfer, and the effect of language choice still need to be fully understood, especially in the complex setting of massively multilingual LMs. We propose an *efficient* method to study transfer language influence in zero-shot performance on another target language. Unlike previous work, our approa...


Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt

February 23, 2022

91% Match
Lianzhe Huang, Shuming Ma, Dongdong Zhang, ..., Houfeng Wang
Computation and Language

Prompt-based tuning has been proven effective for pretrained language models (PLMs). While most of the existing work focuses on the monolingual prompts, we study the multilingual prompts for multilingual PLMs, especially in the zero-shot cross-lingual setting. To alleviate the effort of designing different prompts for multiple languages, we propose a novel model that uses a unified prompt for all languages, called UniPrompt. Different from the discrete prompts and soft prompt...


Few-shot Learning with Multilingual Language Models

December 20, 2021

91% Match
Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona Diab, ..., Xian Li
Computation and Language
Artificial Intelligence

Large-scale generative language models such as GPT-3 are competitive few-shot learners. While these models are known to be able to jointly represent many different languages, their training data is dominated by English, potentially limiting their cross-lingual generalization. In this work, we train multilingual generative language models on a corpus covering a diverse set of languages, and study their few- and zero-shot learning capabilities in a wide range of tasks. Our larg...


The Impact of Cross-Lingual Adjustment of Contextual Word Representations on Zero-Shot Transfer

April 13, 2022

91% Match
Pavel Efimov, Leonid Boytsov, ..., Pavel Braslavski
Computation and Language

Large multilingual language models such as mBERT or XLM-R enable zero-shot cross-lingual transfer in various IR and NLP tasks. Cao et al. (2020) proposed a data- and compute-efficient method for cross-lingual adjustment of mBERT that uses a small parallel corpus to make embeddings of related words across languages similar to each other. They showed it to be effective in NLI for five European languages. In contrast we experiment with a typologically diverse set of languages (S...
