DRAFT: Dense Retrieval Augmented Few-shot Topic classifier Framework

Topic-DPR: Topic-based Prompts for Dense Passage Retrieval

October 10, 2023

92% Match

Qingfa Xiao, Shuangyin Li, Lei Chen

Computation and Language

Artificial Intelligence

Prompt-based learning's efficacy across numerous natural language processing tasks has led to its integration into dense passage retrieval. Prior research has mainly focused on enhancing the semantic understanding of pre-trained language models by optimizing a single vector as a continuous prompt. This approach, however, leads to a semantic space collapse; identical semantic information seeps into all representations, causing their distributions to converge in a restricted re...

ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval

May 18, 2023

92% Match

Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, ... , Chao Zhang

Computation and Language

Information Retrieval

Machine Learning

With the development of large language models (LLMs), zero-shot learning has attracted much attention for various NLP tasks. Different from prior works that generate training data with billion-scale natural language generation (NLG) models, we propose a retrieval-enhanced framework to create training data from a general-domain unlabeled corpus. To realize this, we first conduct contrastive pretraining to learn an unsupervised dense retriever for extracting the most relevant d...

Retrieval Augmented Zero-Shot Text Classification

June 21, 2024

91% Match

Tassallah Abdullahi, Ritambhara Singh, Carsten Eickhoff

Information Retrieval

Zero-shot text learning enables text classifiers to handle unseen classes efficiently, alleviating the need for task-specific training data. A simple approach often relies on comparing embeddings of query (text) to those of potential classes. However, the embeddings of a simple query sometimes lack rich contextual information, which hinders the classification performance. Traditionally, this has been addressed by improving the embedding model with expensive training. We intro...

Multi-task Retrieval for Knowledge-Intensive Tasks

January 1, 2021

91% Match

Jean Maillard, Vladimir Karpukhin, Fabio Petroni, Wen-tau Yih, Barlas Oğuz, ... , Gargi Ghosh

Computation and Language

Retrieving relevant contexts from a large corpus is a crucial step for tasks such as open-domain question answering and fact checking. Although neural retrieval outperforms traditional methods like tf-idf and BM25, its performance degrades considerably when applied to out-of-domain data. Driven by the question of whether a neural retrieval model can be universal and perform robustly on a wide variety of problems, we propose a multi-task trained model. Our approach not only ...

Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers

July 14, 2022

91% Match

Weng Lam Tam, Xiao Liu, Kaixuan Ji, Lilong Xue, Xingjian Zhang, Yuxiao Dong, Jiahua Liu, ... , Jie Tang

Computation and Language

Information Retrieval

Machine Learning

Prompt tuning attempts to update few task-specific parameters in pre-trained models. It has achieved comparable performance to fine-tuning of the full parameter set on both language understanding and generation tasks. In this work, we study the problem of prompt tuning for neural text retrievers. We introduce parameter-efficient prompt tuning for text retrieval across in-domain, cross-domain, and cross-topic settings. Through an extensive analysis, we show that the strategy c...

Meta-training with Demonstration Retrieval for Efficient Few-shot Learning

June 30, 2023

91% Match

Aaron Mueller, Kanika Narang, Lambert Mathias, ... , Hamed Firooz

Computation and Language

Large language models show impressive results on few-shot NLP tasks. However, these models are memory- and computation-intensive. Meta-training allows one to leverage smaller models for few-shot generalization in a domain-general and task-agnostic manner; however, these methods alone result in models that may not have sufficient parameterization or knowledge to adapt quickly to a large variety of tasks. To overcome this issue, we propose meta-training with demonstration retri...

RAFT: A Real-World Few-Shot Text Classification Benchmark

September 28, 2021

91% Match

Neel Alex, Eli Lifland, Lewis Tunstall, Abhishek Thakur, Pegah Maham, C. Jess Riedel, Emmie Hine, Carolyn Ashurst, Paul Sedille, Alexis Carlier, ... , Andreas Stuhlmüller

Computation and Language

Artificial Intelligence

Machine Learning

Large pre-trained language models have shown promise for few-shot learning, completing text-based tasks given only a few task-specific examples. Will models soon solve classification tasks that have so far been reserved for human research assistants? Existing benchmarks are not designed to measure progress in applied settings, and so don't directly answer this question. The RAFT benchmark (Real-world Annotated Few-shot Tasks) focuses on naturally occurring tasks and uses an e...

Retrieval-Augmented Generation: Is Dense Passage Retrieval Retrieving?

February 16, 2024

91% Match

Benjamin Reichman, Larry Heck

Computation and Language

Information Retrieval

Dense passage retrieval (DPR) is the first step in the retrieval augmented generation (RAG) paradigm for improving the performance of large language models (LLM). DPR fine-tunes pre-trained networks to enhance the alignment of the embeddings between queries and relevant textual data. A deeper understanding of DPR fine-tuning will be required to fundamentally unlock the full potential of this approach. In this work, we explore DPR-trained models mechanistically by using a comb...

Promptagator: Few-shot Dense Retrieval From 8 Examples

September 23, 2022

91% Match

Zhuyun Dai, Vincent Y. Zhao, Ji Ma, Yi Luan, Jianmo Ni, Jing Lu, Anton Bakalov, Kelvin Guu, ... , Ming-Wei Chang

Computation and Language

Information Retrieval

Much recent research on information retrieval has focused on how to transfer from one task (typically with abundant supervised data) to various other tasks where supervision is limited, with the implicit assumption that it is possible to generalize from one task to all the rest. However, this overlooks the fact that there are many diverse and unique retrieval tasks, each targeting different search intents, queries, and search domains. In this paper, we suggest to work on Few-...

Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification

November 17, 2021

91% Match

Aleksandra Edwards, Asahi Ushio, Jose Camacho-Collados, ... , Alun Preece

Computation and Language

Data augmentation techniques are widely used for enhancing the performance of machine learning models by tackling class imbalance issues and data sparsity. State-of-the-art generative language models have been shown to provide significant gains across different NLP tasks. However, their applicability to data augmentation for text classification tasks in few-shot settings has not been fully explored, especially for specialised domains. In this paper, we leverage GPT-2 (Radfor...
