Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer

A Primer on Pretrained Multilingual Language Models

July 1, 2021

93% Match

Sumanth Doddapaneni, Gowtham Ramesh, Mitesh M. Khapra, ... , Pratyush Kumar

Computation and Language

Multilingual Language Models (MLLMs) such as mBERT, XLM, XLM-R, etc. have emerged as a viable option for bringing the power of pretraining to a large number of languages. Given their success in zero-shot transfer learning, there has emerged a large body of work in (i) building bigger MLLMs covering a large number of languages, (ii) creating exhaustive benchmarks covering a wider variety of tasks and languages for evaluating MLLMs, (iii) analysing the performance of...

From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers

May 1, 2020

93% Match

Anne Lauscher, Vinit Ravishankar, ... , Goran Glavaš

Computation and Language

Massively multilingual transformers pretrained with language modeling objectives (e.g., mBERT, XLM-R) have become a de facto default transfer paradigm for zero-shot cross-lingual transfer in NLP, offering unmatched transfer performance. Current downstream evaluations, however, verify their efficacy predominantly in transfer settings involving languages with sufficient amounts of pretraining data, and with lexically and typologically close languages. In this work, we analyze t...

Model Selection for Cross-Lingual Transfer

October 13, 2020

93% Match

Yang Chen, Alan Ritter

Computation and Language

Machine Learning

Transformers that are pre-trained on multilingual corpora, such as mBERT and XLM-RoBERTa, have achieved impressive cross-lingual transfer capabilities. In the zero-shot transfer setting, only English training data is used, and the fine-tuned model is evaluated on another target language. While this works surprisingly well, substantial variance has been observed in target language performance between different fine-tuning runs, and in the zero-shot setup, no target-language d...

Evaluating the Cross-Lingual Effectiveness of Massively Multilingual Neural Machine Translation

September 1, 2019

93% Match

Aditya Siddhant, Melvin Johnson, Henry Tsai, Naveen Arivazhagan, Jason Riesa, Ankur Bapna, ... , Karthik Raman

Computation and Language

The recently proposed massively multilingual neural machine translation (NMT) system has been shown to be capable of translating over 100 languages to and from English within a single model. Its improved translation performance on low resource languages hints at potential cross-lingual transfer capability for downstream tasks. In this paper, we evaluate the cross-lingual effectiveness of representations from the encoder of a massively multilingual NMT model on 5 downstream cl...

A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters

December 31, 2020

93% Match

Mengjie Zhao, Yi Zhu, Ehsan Shareghi, Ivan Vulić, Roi Reichart, ... , Hinrich Schütze

Computation and Language

Few-shot crosslingual transfer has been shown to outperform its zero-shot counterpart with pretrained encoders like multilingual BERT. Despite its growing popularity, little to no attention has been paid to standardizing and analyzing the design of few-shot experiments. In this work, we highlight a fundamental risk posed by this shortcoming, illustrating that the model exhibits a high degree of sensitivity to the selection of few shots. We conduct a large-scale experimental s...

Zero-Shot Cross-Lingual Transfer with Meta Learning

March 5, 2020

93% Match

Farhad Nooralahzadeh, Giannis Bekoulis, ... , Isabelle Augenstein

Computation and Language

Learning what to share between tasks has been a topic of great importance recently, as strategic sharing of knowledge has been shown to improve downstream task performance. This is particularly important for multilingual applications, as most languages in the world are under-resourced. Here, we consider the setting of training models on multiple different languages at the same time, when little or no data is available for languages other than English. We show that this challe...

Analyzing the Evaluation of Cross-Lingual Knowledge Transfer in Multilingual Language Models

February 3, 2024

93% Match

Sara Rajaee, Christof Monz

Computation and Language

Artificial Intelligence

Machine Learning

Recent advances in training multilingual language models on large datasets seem to have shown promising results in knowledge transfer across languages and achieve high performance on downstream tasks. However, we question to what extent the current evaluation benchmarks and setups accurately measure zero-shot cross-lingual knowledge transfer. In this work, we challenge the assumption that high zero-shot performance on target tasks reflects high cross-lingual ability by introd...

Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model

September 15, 2019

93% Match

Tsung-yuan Hsu, Chi-liang Liu, Hung-yi Lee

Computation and Language

Machine Learning

Because it is not feasible to collect training data for every language, there is a growing interest in cross-lingual transfer learning. In this paper, we systematically explore zero-shot cross-lingual transfer learning on reading comprehension tasks with a language representation model pre-trained on multi-lingual corpus. The experimental results show that with pre-trained language representation zero-shot learning is feasible, and translating the source data into the target ...

How multilingual is Multilingual BERT?

June 4, 2019

93% Match

Telmo Pires, Eva Schlinger, Dan Garrette

Computation and Language

Artificial Intelligence

Machine Learning

In this paper, we show that Multilingual BERT (M-BERT), released by Devlin et al. (2018) as a single language model pre-trained from monolingual corpora in 104 languages, is surprisingly good at zero-shot cross-lingual model transfer, in which task-specific annotations in one language are used to fine-tune the model for evaluation in another language. To understand why, we present a large number of probing experiments, showing that transfer is possible even to languages in di...

Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models

May 12, 2022

93% Match

Kabir Ahuja, Shanu Kumar, ... , Monojit Choudhury

Computation and Language

Massively Multilingual Transformer based Language Models have been observed to be surprisingly effective on zero-shot transfer across languages, though the performance varies from language to language depending on the pivot language(s) used for fine-tuning. In this work, we build upon some of the existing techniques for predicting the zero-shot performance on a task, by modeling it as a multi-task learning problem. We jointly train predictive models for different tasks which ...
