Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer

June 30, 2021

Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks

February 19, 2024

91% Match

Nadezhda Chirkova, Vassilina Nikoulina

Computation and Language

Artificial Intelligence

Zero-shot cross-lingual generation involves finetuning a multilingual pretrained language model on a generation task in one language and then using it to make predictions for this task in other languages. Previous works notice a frequent problem of generation in a wrong language and propose approaches to address it, usually using mT5 as a backbone model. In this work, we compare various approaches proposed in the literature in unified settings, also including alternative...

One For All & All For One: Bypassing Hyperparameter Tuning with Model Averaging For Cross-Lingual Transfer

October 16, 2023

91% Match

Fabian David Schmidt, Ivan Vulić, Goran Glavaš

Computation and Language

Multilingual language models enable zero-shot cross-lingual transfer (ZS-XLT): fine-tuned on sizable source-language task data, they perform the task in target languages without labeled instances. The effectiveness of ZS-XLT hinges on the linguistic proximity between languages and the amount of pretraining data for a language. Because of this, model selection based on source-language validation is unreliable: it picks model snapshots with suboptimal target-language performanc...

Probing Multilingual Language Models for Discourse

June 9, 2021

91% Match

Murathan Kurfalı, Robert Östling

Computation and Language

Pre-trained multilingual language models have become an important building block in multilingual natural language processing. In the present paper, we investigate a range of such models to find out how well they transfer discourse-level knowledge across languages. This is done with a systematic evaluation on a broader set of discourse-level tasks than has previously been assembled. We find that the XLM-RoBERTa family of models consistently shows the best performance, by s...

mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations

May 23, 2023

91% Match

Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia, Xinyi Wang, ... , Sebastian Ruder

Computation and Language

Multilingual sequence-to-sequence models perform poorly with increased language coverage and fail to consistently generate text in the correct target language in few-shot settings. To address these challenges, we propose mmT5, a modular multilingual sequence-to-sequence model. mmT5 utilizes language-specific modules during pre-training, which disentangle language-specific information from language-agnostic information. We identify representation drift during fine-tuning as a ...

Multilingual Transfer Learning for QA Using Translation as Data Augmentation

December 10, 2020

91% Match

Mihaela Bornea, Lin Pan, Sara Rosenthal, ... , Avirup Sil

Computation and Language

Prior work on multilingual question answering has mostly focused on using large multilingual pre-trained language models (LM) to perform zero-shot language-wise learning: train a QA model on English and test on other languages. In this work, we explore strategies that improve cross-lingual transfer by bringing the multilingual embeddings closer in the semantic space. Our first strategy augments the original English training data with machine translation-generated data. This r...

On the Cross-lingual Transferability of Monolingual Representations

October 25, 2019

91% Match

Mikel Artetxe, Sebastian Ruder, Dani Yogatama

Computation and Language

Artificial Intelligence

Machine Learning

State-of-the-art unsupervised multilingual models (e.g., multilingual BERT) have been shown to generalize in a zero-shot cross-lingual setting. This generalization ability has been attributed to the use of a shared subword vocabulary and joint training across multiple languages giving rise to deep multilingual abstractions. We evaluate this hypothesis by designing an alternative approach that transfers a monolingual model to new languages at the lexical level. More concretely...

Unsupervised Cross-lingual Representation Learning at Scale

November 5, 2019

91% Match

Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, ... , Veselin Stoyanov

Computation and Language

This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer-based masked language model on one hundred languages, using more than two terabytes of filtered CommonCrawl data. Our model, dubbed XLM-R, significantly outperforms multilingual BERT (mBERT) on a variety of cross-lingual benchmarks, including +14.6% average accuracy on XNLI, +13% average F1 scor...

ColBERT-XM: A Modular Multi-Vector Representation Model for Zero-Shot Multilingual Information Retrieval

February 23, 2024

91% Match

Antoine Louis, Vageesh Saxena, ... , Gerasimos Spanakis

Computation and Language

Information Retrieval

State-of-the-art neural retrievers predominantly focus on high-resource languages like English, which impedes their adoption in retrieval scenarios involving other languages. Current approaches circumvent the lack of high-quality labeled data in non-English languages by leveraging multilingual pretrained language models capable of cross-lingual transfer. However, these models require substantial task-specific fine-tuning across multiple languages, often perform poorly in lang...

Towards Zero-Shot Multilingual Synthetic Question and Answer Generation for Cross-Lingual Reading Comprehension

October 22, 2020

91% Match

Siamak Shakeri, Noah Constant, ... , Linting Xue

Computation and Language

Artificial Intelligence

Machine Learning

We propose a simple method to generate multilingual question and answer pairs on a large scale through the use of a single generative model. These synthetic samples can be used to improve the zero-shot performance of multilingual QA models on target languages. Our proposed multi-task training of the generative model only requires the labeled training samples in English, thus removing the need for such samples in the target languages, making it applicable to far more languages...

Understanding Calibration for Multilingual Question Answering Models

November 15, 2023

91% Match

Yahan Yang, Soham Dan, ... , Insup Lee

Computation and Language

Machine Learning

Multilingual pre-trained language models are incredibly effective at Question Answering (QA), a core task in Natural Language Understanding, achieving high accuracies on several multilingual benchmarks. However, little is known about how well they are calibrated. In this paper, we study the calibration properties of several pre-trained multilingual large language models (LLMs) on a variety of question-answering tasks. We perform extensive experiments, spanning both extractive...
