Contextual Memory Reweaving in Large Lan...

Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data

October 15, 2024

91% Match

Seiji Maekawa, Hayate Iso, Nikita Bhutani

Computation and Language

The rapid increase in textual information means we need more efficient methods to sift through, organize, and understand it all. While retrieval-augmented generation (RAG) models excel in accessing information from large document collections, they struggle with complex tasks that require aggregation and reasoning over information spanning across multiple documents--what we call holistic reasoning. Long-context language models (LCLMs) have great potential for managing large-sc...

FACT: Examining the Effectiveness of Iterative Context Rewriting for Multi-fact Retrieval

October 28, 2024

91% Match

Jinlin Wang, Suyuchen Wang, Ziwen Xia, Sirui Hong, Yun Zhu, ... , Chenglin Wu

Computation and Language

Artificial Intelligence

Large Language Models (LLMs) are proficient at retrieving single facts from extended contexts, yet they struggle with tasks requiring the simultaneous retrieval of multiple facts, especially during generation. This paper identifies a novel "lost-in-the-middle" phenomenon, where LLMs progressively lose track of critical information throughout the generation process, resulting in incomplete or inaccurate retrieval. To address this challenge, we introduce Find All Crucial Texts ...

Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG

October 8, 2024

91% Match

Bowen Jin, Jinsung Yoon, ... , Sercan O. Arik

Computation and Language

Artificial Intelligence

Machine Learning

Retrieval-augmented generation (RAG) empowers large language models (LLMs) to utilize external knowledge sources. The increasing capacity of LLMs to process longer input sequences opens up avenues for providing more retrieved information, to potentially enhance the quality of generated outputs. It is plausible to assume that a larger retrieval set would contain more relevant information (higher recall), that might result in improved performance. However, our empirical finding...

Lost in the Middle: How Language Models Use Long Contexts

July 6, 2023

91% Match

Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, ... , Percy Liang

Computation and Language

While recent language models have the ability to take long contexts as input, relatively little is known about how well they use longer context. We analyze the performance of language models on two tasks that require identifying relevant information in their input contexts: multi-document question answering and key-value retrieval. We find that performance can degrade significantly when changing the position of relevant information, indicating that current language models do ...

Reliable, Adaptable, and Attributable Language Models with Retrieval

March 5, 2024

90% Match

Akari Asai, Zexuan Zhong, Danqi Chen, Pang Wei Koh, Luke Zettlemoyer, ... , Wen-tau Yih

Computation and Language

Artificial Intelligence

Machine Learning

Parametric language models (LMs), which are trained on vast amounts of web data, exhibit remarkable flexibility and capability. However, they still face practical challenges such as hallucinations, difficulty in adapting to new data distributions, and a lack of verifiability. In this position paper, we advocate for retrieval-augmented LMs to replace parametric LMs as the next generation of LMs. By incorporating large-scale datastores during inference, retrieval-augmented LMs ...

DeepRAG: Thinking to Retrieval Step by Step for Large Language Models

February 3, 2025

90% Match

Xinyan Guan, Jiali Zeng, Fandong Meng, Chunlei Xin, Yaojie Lu, Hongyu Lin, Xianpei Han, ... , Jie Zhou

Artificial Intelligence

Computation and Language

Information Retrieval

Large Language Models (LLMs) have shown remarkable potential in reasoning while they still suffer from severe factual hallucinations due to timeliness, accuracy, and coverage of parametric knowledge. Meanwhile, integrating reasoning with retrieval-augmented generation (RAG) remains challenging due to ineffective task decomposition and redundant retrieval, which can introduce noise and degrade response quality. In this paper, we propose DeepRAG, a framework that models retriev...

Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization

January 15, 2024

90% Match

Ninglu Shao, Shitao Xiao, ... , Peitian Zhang

Computation and Language

Large language models (LLMs) need sufficient context to handle many critical applications, such as retrieval-augmented generation and few-shot learning. However, due to the constrained window size, LLMs can only access the information within a limited context. Although the size of the context window can be extended by fine-tuning, doing so incurs a substantial cost in both the training and inference stages. In this paper, we present Extensible Tokenization as an al...

RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation

December 16, 2024

90% Match

Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yongkang Wu, Zhonghua Li, ... , Zhicheng Dou

Computation and Language

Artificial Intelligence

Information Retrieval

Large language models (LLMs) exhibit remarkable generative capabilities but often suffer from hallucinations. Retrieval-augmented generation (RAG) offers an effective solution by incorporating external knowledge, but existing methods still face several limitations: additional deployment costs of separate retrievers, redundant input tokens from retrieved text chunks, and the lack of joint optimization of retrieval and generation. To address these issues, we propose Ret...

Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models

September 11, 2023

90% Match

Mansi Sakarvadia, Aswathy Ajith, Arham Khan, Daniel Grzenda, Nathaniel Hudson, André Bauer, ... , Ian Foster

Computation and Language

Artificial Intelligence

Machine Learning

Answering multi-hop reasoning questions requires retrieving and synthesizing information from diverse sources. Large Language Models (LLMs) struggle to perform such reasoning consistently. Here we propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LLM attention heads. First, we analyze the per-layer activations of GPT-2 models in response to single and multi-hop prompts. We then propose a mechanism that allows users ...

Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models

February 3, 2024

90% Match

Xindi Wang, Mahsa Salmani, Parsa Omidi, Xiangyu Ren, ... , Armaghan Eshaghi

Computation and Language

Machine Learning

Recently, large language models (LLMs) have shown remarkable capabilities including understanding context, engaging in logical reasoning, and generating responses. However, this is achieved at the expense of stringent computational and memory requirements, hindering their ability to effectively support long input sequences. This survey provides an inclusive review of the recent techniques and methods devised to extend the sequence length in LLMs, thereby enhancing their capac...

Contextual Memory Reweaving in Large Language Models Using Layered Latent State Reconstruction
