In Search of Needles in a 11M Haystack: ...

Emulating Retrieval Augmented Generation via Prompt Engineering for Enhanced Long Context Comprehension in LLMs

February 18, 2025

91% Match

Joon Park, Kyohei Atarashi, ..., Hisashi Kashima

Computation and Language

This paper addresses the challenge of comprehending very long contexts in Large Language Models (LLMs) by proposing a method that emulates Retrieval Augmented Generation (RAG) through specialized prompt engineering and chain-of-thought (CoT) reasoning. While recent LLMs support over 100,000 tokens in a single prompt, simply enlarging context windows has not guaranteed robust multi-hop reasoning when key details are scattered across massive input. Our approach treats the model...

XL$^2$Bench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies

April 8, 2024

91% Match

Xuanfan Ni, Hengyi Cai, Xiaochi Wei, Shuaiqiang Wang, ..., Piji Li

Computation and Language

Large Language Models (LLMs) have demonstrated remarkable performance across diverse tasks but are constrained by their small context window sizes. Various efforts have been proposed to expand the context window to accommodate even up to 200K input tokens. Meanwhile, building high-quality benchmarks with much longer text lengths and more demanding tasks to provide comprehensive evaluations is of immense practical interest to facilitate long context understanding research of L...

Does RAG Really Perform Bad For Long-Context Processing?

February 17, 2025

91% Match

Kun Luo, Zheng Liu, Peitian Zhang, Hongjin Qian, ..., Kang Liu

Computation and Language

The efficient processing of long context poses a serious challenge for large language models (LLMs). Recently, retrieval-augmented generation (RAG) has emerged as a promising strategy for this problem, as it enables LLMs to make selective use of the long context for efficient computation. However, existing RAG approaches lag behind other long-context processing methods due to inherent limitations on inaccurate retrieval and fragmented contexts. To address these challenges, we...

Augmenting Language Models with Long-Term Memory

June 12, 2023

91% Match

Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, ..., Furu Wei

Computation and Language

Existing large language models (LLMs) can only afford fix-sized inputs due to the input length limit, preventing them from utilizing rich long-context information from past inputs. To address this, we propose a framework, Language Models Augmented with Long-Term Memory (LongMem), which enables LLMs to memorize long history. We design a novel decoupled network architecture with the original backbone LLM frozen as a memory encoder and an adaptive residual side-network as a memo...

Contextual Memory Reweaving in Large Language Models Using Layered Latent State Reconstruction

February 4, 2025

90% Match

Frederick Dillon, Gregor Halvorsen, Simon Tattershall, ..., Gareth Vanderpool

Computation and Language

Memory retention challenges in deep neural architectures have ongoing limitations in the ability to process and recall extended contextual information. Token dependencies degrade as sequence length increases, leading to a decline in coherence and factual consistency across longer outputs. A structured approach is introduced to mitigate this issue through the reweaving of latent states captured at different processing layers, reinforcing token representations over extended seq...

MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory

April 17, 2024

90% Match

Ali Modarressi, Abdullatif Köksal, Ayyoob Imani, ..., Hinrich Schütze

Computation and Language

While current large language models (LLMs) demonstrate some capabilities in knowledge-intensive tasks, they are limited by relying on their parameters as an implicit storage mechanism. As a result, they struggle with infrequent knowledge and temporal degradation. In addition, the uninterpretable nature of parametric memorization makes it challenging to understand and prevent hallucination. Parametric memory pools and model editing are only partial solutions. Retrieval Augment...

A Survey on Long Text Modeling with Transformers

February 28, 2023

90% Match

Zican Dong, Tianyi Tang, ..., Wayne Xin Zhao

Computation and Language

Modeling long texts has been an essential technique in the field of natural language processing (NLP). With the ever-growing number of long documents, it is important to develop effective modeling methods that can process and analyze such texts. However, long texts pose important research challenges for existing text models, with more complex semantics and special characteristics. In this paper, we provide an overview of the recent advances on long texts modeling based on Tra...

InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory

February 7, 2024

90% Match

Chaojun Xiao, Pengle Zhang, Xu Han, Guangxuan Xiao, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, ..., Maosong Sun

Computation and Language

Artificial Intelligence

Machine Learning

Large language models (LLMs) have emerged as a cornerstone in real-world applications with lengthy streaming inputs, such as LLM-driven agents. However, existing LLMs, pre-trained on sequences with restricted maximum length, cannot generalize to longer sequences due to the out-of-domain and distraction issues. To alleviate these issues, existing efforts employ sliding attention windows and discard distant tokens to achieve the processing of extremely long sequences. Unfortuna...

Long Context RAG Performance of Large Language Models

November 5, 2024

90% Match

Quinn Leng, Jacob Portes, Sam Havens, ..., Michael Carbin

Machine Learning

Computation and Language

Retrieval Augmented Generation (RAG) has emerged as a crucial technique for enhancing the accuracy of Large Language Models (LLMs) by incorporating external information. With the advent of LLMs that support increasingly longer context lengths, there is a growing interest in understanding how these models perform in RAG scenarios. Can these new long context models improve RAG performance? This paper presents a comprehensive study of the impact of increased context length on RA...

$\text{Memory}^3$: Language Modeling with Explicit Memory

July 1, 2024

90% Match

Hongkang Yang, Zehao Lin, Wenjin Wang, Hao Wu, Zhiyu Li, Bo Tang, Wenqiang Wei, Jinbo Wang, Zeyun Tang, Shichao Song, Chenyang Xi, Yu Yu, Kai Chen, Feiyu Xiong, ..., Weinan E

Computation and Language

Artificial Intelligence

Machine Learning

The training and inference of large language models (LLMs) are together a costly process that transports knowledge from raw data to meaningful computation. Inspired by the memory hierarchy of the human brain, we reduce this cost by equipping LLMs with explicit memory, a memory format cheaper than model parameters and text retrieval-augmented generation (RAG). Conceptually, with most of its knowledge externalized to explicit memories, the LLM can enjoy a smaller parameter size...

In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss
