Contextual Memory Reweaving in Large Language Models Using Layered Latent State Reconstruction

February 4, 2025

Frederick Dillon, Gregor Halvorsen, Simon Tattershall, Magnus Rowntree, Gareth Vanderpool

Computer Science

Computation and Language

Memory retention challenges in deep neural architectures have ongoing limitations in the ability to process and recall extended contextual information. Token dependencies degrade as sequence length increases, leading to a decline in coherence and factual consistency across longer outputs. A structured approach is introduced to mitigate this issue through the reweaving of latent states captured at different processing layers, reinforcing token representations over extended sequences. The proposed Contextual Memory Reweaving framework incorporates a Layered Latent State Reconstruction mechanism to systematically integrate past contextual embeddings without introducing external memory modules. Experimental results demonstrate improvements in recall accuracy across a range of sequence lengths, with notable gains in the retention of rarely occurring tokens and numerical reasoning consistency. Further analysis of computational efficiency indicates that the additional processing overhead remains within acceptable thresholds, enabling scalability across different model sizes. Evaluations in long-form text generation and ambiguous query resolution highlight the capacity of memory reweaving to enhance continuity and reduce inconsistencies over extended outputs. Attention weight distributions reveal more structured allocation patterns, suggesting that reweaved latent states contribute to improved contextual awareness. The findings establish a framework for refining memory retention mechanisms in language models, addressing long-standing challenges in handling complex, multi-step reasoning tasks.

Structured Token Retention and Computational Memory Paths in Large Language Models

February 5, 2025

94% Match

Jonathan Delena, Augustin Moreau, ... , Chatterton Frederick

Computation and Language

Memory retention mechanisms play a central role in determining the efficiency of computational architectures designed for processing extended sequences. Conventional methods for token management often impose fixed retention thresholds or rely on uniform attention weight distributions, leading to inefficient memory utilization and premature information loss in extended sequence modeling. Structured Token Retention (STR) introduces a probabilistic selection framework that dynam...

Find SimilarView on arXiv

Structured Context Recomposition for Large Language Models Using Probabilistic Layer Realignment

January 29, 2025

93% Match

Jonathan Teel, Jocasta Cumberbatch, ... , Baskerville Quentin

Computation and Language

Extended sequence generation often leads to degradation in contextual consistency due to the inability of conventional self-attention mechanisms to effectively retain long-range dependencies. Existing approaches, including memory compression and retrieval-augmented conditioning, introduce computational trade-offs that either increase inference latency or impose additional storage overhead. Structured Context Recomposition (SCR) introduces a probabilistic layer realignment str...

Find SimilarView on arXiv

M+: Extending MemoryLLM with Scalable Long-Term Memory

February 1, 2025

93% Match

Yu Wang, Dmitry Krotov, Yuanzhe Hu, Yifan Gao, Wangchunshu Zhou, Julian McAuley, Dan Gutfreund, ... , He Zexue

Computation and Language

Equipping large language models (LLMs) with latent-space memory has attracted increasing attention as they can extend the context window of existing language models. However, retaining information from the distant past remains a challenge. For example, MemoryLLM (Wang et al., 2024a), as a representative work with latent-space memory, compresses past information into hidden states across all layers, forming a memory pool of 1B parameters. While effective for sequence lengths u...

Find SimilarView on arXiv

R$^3$Mem: Bridging Memory Retention and Retrieval via Reversible Compression

February 21, 2025

92% Match

Xiaoqiang Wang, Suyuchen Wang, ... , Liu Bang

Computation and Language

Artificial Intelligence

Memory plays a key role in enhancing LLMs' performance when deployed to real-world applications. Existing solutions face trade-offs: explicit memory designs based on external storage require complex management and incur storage overhead, while implicit memory designs that store information via parameters struggle with reliable retrieval. In this paper, we propose R$^3$Mem, a memory network that optimizes both information Retention and Retrieval through Reversible context comp...

Find SimilarView on arXiv

MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory

April 17, 2024

92% Match

Ali Modarressi, Abdullatif Köksal, Ayyoob Imani, ... , Schütze Hinrich

Computation and Language

While current large language models (LLMs) demonstrate some capabilities in knowledge-intensive tasks, they are limited by relying on their parameters as an implicit storage mechanism. As a result, they struggle with infrequent knowledge and temporal degradation. In addition, the uninterpretable nature of parametric memorization makes it challenging to understand and prevent hallucination. Parametric memory pools and model editing are only partial solutions. Retrieval Augment...

Find SimilarView on arXiv

MemLong: Memory-Augmented Retrieval for Long Text Modeling

August 30, 2024

92% Match

Weijie Liu, Zecheng Tang, Juntao Li, ... , Zhang Min

Computation and Language

Artificial Intelligence

Recent advancements in Large Language Models (LLMs) have yielded remarkable success across diverse fields. However, handling long contexts remains a significant challenge for LLMs due to the quadratic time and space complexity of attention mechanisms and the growing memory consumption of the key-value cache during generation. This work introduces MemLong: Memory-Augmented Retrieval for Long Text Generation, a method designed to enhance the capabilities of long-context languag...

Find SimilarView on arXiv

Autonomous Structural Memory Manipulation for Large Language Models Using Hierarchical Embedding Augmentation

January 23, 2025

92% Match

Derek Yotheringhay, Alistair Kirkland, ... , Whitesteeple Josiah

Computation and Language

Artificial Intelligence

Transformative innovations in model architectures have introduced hierarchical embedding augmentation as a means to redefine the representation of tokens through multi-level semantic structures, offering enhanced adaptability to complex linguistic inputs. Autonomous structural memory manipulation further advances this paradigm through dynamic memory reallocation mechanisms that prioritize critical contextual features while suppressing less relevant information, enabling scala...

Find SimilarView on arXiv

Context-Aware Semantic Recomposition Mechanism for Large Language Models

January 29, 2025

92% Match

Richard Katrix, Quentin Carroway, ... , Heathfield Matthias

Computation and Language

Artificial Intelligence

Context-aware processing mechanisms have increasingly become a critical area of exploration for improving the semantic and contextual capabilities of language generation models. The Context-Aware Semantic Recomposition Mechanism (CASRM) was introduced as a novel framework designed to address limitations in coherence, contextual adaptability, and error propagation in large-scale text generation tasks. Through the integration of dynamically generated context vectors and attenti...

Find SimilarView on arXiv

Long-Range Tasks Using Short-Context LLMs: Incremental Reasoning With Structured Memories

December 25, 2024

92% Match

Dulhan Jayalath, James Bradley Wendt, Nicholas Monath, ... , Gunel Beliz

Artificial Intelligence

Long-range tasks require reasoning over long inputs. Existing solutions either need large compute budgets, training data, access to model weights, or use complex, task-specific approaches. We present PRISM, which alleviates these concerns by processing information as a stream of chunks, maintaining a structured in-context memory specified by a typed hierarchy schema. This approach demonstrates superior performance to baselines on diverse tasks while using at least 4x smaller ...

Find SimilarView on arXiv

Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons

August 6, 2024

92% Match

Yifei Wang, Yuheng Chen, Wanting Wen, Yu Sheng, ... , Zeng Daniel Dajun

Computation and Language

Artificial Intelligence

In this paper, we investigate whether Large Language Models (LLMs) actively recall or retrieve their internal repositories of factual knowledge when faced with reasoning tasks. Through an analysis of LLMs' internal factual recall at each reasoning step via Knowledge Neurons, we reveal that LLMs fail to harness the critical factual associations under certain circumstances. Instead, they tend to opt for alternative, shortcut-like pathways to answer reasoning questions. By manua...

Find SimilarView on arXiv