Recursive Language Models: A New Approach for Processing Extremely Long Prompts Beyond Standard Context Windows

schmuhblaster

4mo ago · 1 min read · en · Insight

Score: 75 · Type: analysis · Sentiment: positive

Summary

Researchers propose Recursive Language Models (RLMs), an inference strategy that lets large language models process prompts far longer than their context windows. An RLM treats the long prompt as part of an external environment, so the LLM can programmatically examine it, decompose it, and recursively call itself over snippets of it. The authors report that RLMs handle inputs up to two orders of magnitude beyond model context windows and outperform base LLMs and common long-context scaffolds on four diverse long-context tasks, at comparable or lower cost per query.
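The decompose-and-recurse idea can be illustrated with a minimal sketch. This is not the authors' implementation: the `LLM` callable, the character-based `window`, the halving strategy, and the toy model are all assumptions made for illustration.

```python
from typing import Callable

# Hypothetical stand-in for a model call: text in, text out.
LLM = Callable[[str], str]


def recursive_answer(llm: LLM, query: str, prompt: str, window: int) -> str:
    """Answer `query` over `prompt`, recursing while the prompt exceeds `window`.

    `window` is a character budget for the prompt portion only (an assumption;
    a real system would count tokens and reserve room for the query).
    """
    if len(prompt) <= window:
        return llm(f"{query}\n\n{prompt}")

    # Split near the middle, preferring a line boundary so no line is cut.
    mid = len(prompt) // 2
    cut = prompt.rfind("\n", 1, mid)
    if cut == -1:
        cut = mid

    left = recursive_answer(llm, query, prompt[:cut], window)
    right = recursive_answer(llm, query, prompt[cut:], window)
    combined = left + "\n" + right

    # Guard against non-termination: partial answers must shrink the input.
    if len(combined) >= len(prompt):
        raise RuntimeError("partial answers did not shrink the prompt")
    return recursive_answer(llm, query, combined, window)


calls = 0


def toy_llm(text: str) -> str:
    """Toy model (assumption): echoes back lines that start with 'SECRET:'."""
    global calls
    calls += 1
    return "\n".join(l for l in text.splitlines() if l.startswith("SECRET:"))


lines = [f"filler line {i:04d}" for i in range(500)]
lines[250] = "SECRET: 42"
long_prompt = "\n".join(lines)

result = recursive_answer(
    toy_llm, "Return every line that starts with SECRET:", long_prompt, window=200
)
print(result, calls)
```

Each individual call sees at most `window` characters of prompt, yet the overall query covers the whole input. The fixed halving is the simplest possible decomposition; the summary describes the model itself choosing how to examine and split the prompt.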

Key quotes

We propose Recursive Language Models (RLMs), a general inference strategy that treats long prompts as part of an external environment and allows the LLM to programmatically examine, decompose, and recursively call itself over snippets of the prompt.
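One way to picture "long prompts as part of an external environment" is an object that holds the full text outside the model's context and exposes small inspection operations. The sketch below is an assumption-laden illustration, not the paper's interface: `PromptEnvironment`, `peek`, and `search` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptEnvironment:
    """Holds the full prompt; only snippets ever reach the model (hypothetical)."""

    text: str

    def __len__(self) -> int:
        return len(self.text)

    def peek(self, start: int, end: int) -> str:
        """Return a raw slice of the prompt."""
        return self.text[start:end]

    def search(self, pattern: str, context: int = 40) -> list[str]:
        """Return each regex match with `context` characters on either side."""
        snippets = []
        for m in re.finditer(pattern, self.text):
            lo = max(0, m.start() - context)
            hi = min(len(self.text), m.end() + context)
            snippets.append(self.text[lo:hi])
        return snippets


# A 200k-character prompt with one relevant sentence buried in the middle.
env = PromptEnvironment("x" * 100_000 + " launch code: 7421 " + "y" * 100_000)
snippets = env.search(r"launch code: \d+")
print(len(env), len(snippets), len(snippets[0]))
```

A controller could pass only `snippets` to a model call, so the text the model reads stays small no matter how long the stored prompt is.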

We find that RLMs successfully handle inputs up to two orders of magnitude beyond model context windows and, even for shorter prompts, dramatically outperform the quality of base LLMs and common long-context scaffolds across four diverse long-context tasks.

RLMs have comparable (or cheaper) cost per query while handling inputs far beyond standard context limits.

Snippet from the RSS feed

We study allowing large language models (LLMs) to process arbitrarily long prompts through the lens of inference-time scaling. We propose Recursive Language Models (RLMs), a general inference strategy that treats long prompts as part of an external enviro
