Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer

When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer

October 27, 2021

Ameet Deshpande, Partha Talukdar, Karthik Narasimhan

Computation and Language

Machine Learning

While recent work on multilingual language models has demonstrated their capacity for cross-lingual zero-shot transfer on downstream tasks, there is a lack of consensus in the community as to what shared properties between languages enable such transfer. Analyses involving pairs of natural languages are often inconclusive and contradictory since languages simultaneously differ in many linguistic aspects. In this paper, we perform a large-scale empirical study to isolate the e...

Cross-lingual Pre-training Based Transfer for Zero-shot Neural Machine Translation

December 3, 2019

Baijun Ji, Zhirui Zhang, Xiangyu Duan, Min Zhang, ... , Weihua Luo

Computation and Language

Transfer learning between different language pairs has shown its effectiveness for Neural Machine Translation (NMT) in low-resource scenario. However, existing transfer methods involving a common target language are far from success in the extreme scenario of zero-shot translation, due to the language space mismatch problem between transferor (the parent model) and transferee (the child model) on the source side. To address this challenge, we propose an effective transfer lea...

First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT

January 26, 2021

Benjamin Muller, Yanai Elazar, ... , Djamé Seddah

Computation and Language

Multilingual pretrained language models have demonstrated remarkable zero-shot cross-lingual transfer capabilities. Such transfer emerges by fine-tuning on a task of interest in one language and evaluating on a distinct language, not seen during the fine-tuning. Despite promising results, we still lack a proper understanding of the source of this transfer. Using a novel layer ablation technique and analyses of the model's internal representations, we show that multilingual BE...

Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings

October 23, 2022

Iker García-Ferrero, Rodrigo Agerri, German Rigau

Computation and Language

Zero-resource cross-lingual transfer approaches aim to apply supervised models from a source language to unlabelled target languages. In this paper we perform an in-depth study of the two main techniques employed so far for cross-lingual zero-resource sequence labelling, based either on data or model transfer. Although previous research has proposed translation and annotation projection (data-based cross-lingual transfer) as an effective technique for cross-lingual sequence l...

The Impact of Language Adapters in Cross-Lingual Transfer for NLU

January 31, 2024

Jenny Kunz, Oskar Holmström

Computation and Language

Modular deep learning has been proposed for the efficient adaption of pre-trained models to new tasks, domains and languages. In particular, combining language adapters with task adapters has shown potential where no supervised data exists for a language. In this paper, we explore the role of language adapters in zero-shot cross-lingual transfer for natural language understanding (NLU) benchmarks. We study the effect of including a target-language adapter in detailed ablation...

BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer

May 24, 2023

Akari Asai, Sneha Kudugunta, Xinyan Velocity Yu, Terra Blevins, Hila Gonen, Machel Reid, Yulia Tsvetkov, ... , Hannaneh Hajishirzi

Computation and Language

Despite remarkable advancements in few-shot generalization in natural language processing, most models are developed and evaluated primarily in English. To facilitate research on few-shot cross-lingual transfer, we introduce a new benchmark, called BUFFET, which unifies 15 diverse tasks across 54 languages in a sequence-to-sequence format and provides a fixed set of few-shot examples and instructions. BUFFET is designed to establish a rigorous and equitable evaluation framewo...

ZGUL: Zero-shot Generalization to Unseen Languages using Multi-source Ensembling of Language Adapters

October 25, 2023

Vipul Rathore, Rajdeep Dhingra, ... , Mausam

Computation and Language

We tackle the problem of zero-shot cross-lingual transfer in NLP tasks via the use of language adapters (LAs). Most of the earlier works have explored training with adapter of a single source (often English), and testing either using the target LA or LA of another related language. Training target LA requires unlabeled data, which may not be readily available for low resource unseen languages: those that are neither seen by the underlying multilingual language model (e.g., mB...

Cross-lingual Transfer of Monolingual Models

September 15, 2021

Evangelia Gogoulou, Ariel Ekgren, ... , Magnus Sahlgren

Computation and Language

Machine Learning

Recent studies in zero-shot cross-lingual learning using multilingual models have falsified the previous hypothesis that shared vocabulary and joint pre-training are the keys to cross-lingual generalization. Inspired by this advancement, we introduce a cross-lingual transfer method for monolingual models based on domain adaptation. We study the effects of such transfer from four different languages to English. Our experimental results on GLUE show that the transferred models ...

Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT

April 19, 2019

Shijie Wu, Mark Dredze

Computation and Language

Pretrained contextual representation models (Peters et al., 2018; Devlin et al., 2018) have pushed forward the state-of-the-art on many NLP tasks. A new release of BERT (Devlin, 2018) includes a model simultaneously pretrained on 104 languages with impressive performance for zero-shot cross-lingual transfer on a natural language inference task. This paper explores the broader cross-lingual potential of mBERT (multilingual) as a zero shot language transfer model on 5 NLP tasks...

Towards Making the Most of Multilingual Pretraining for Zero-Shot Neural Machine Translation

October 16, 2021

Guanhua Chen, Shuming Ma, Yun Chen, Dongdong Zhang, Jia Pan, ... , Furu Wei

Computation and Language

This paper demonstrates that multilingual pretraining and multilingual fine-tuning are both critical for facilitating cross-lingual transfer in zero-shot translation, where the neural machine translation (NMT) model is tested on source languages unseen during supervised training. Following this idea, we present SixT+, a strong many-to-English NMT model that supports 100 source languages but is trained with a parallel dataset in only six source languages. SixT+ initializes the...
