Recursive Language Models: A New Approach for Processing Extremely Long Prompts Beyond Standard Context Windows
By
schmuhblaster
Summary
Researchers propose Recursive Language Models (RLMs), an inference strategy that lets large language models process prompts far longer than their context windows. An RLM treats the long prompt as part of an external environment: the model programmatically examines it, decomposes it, and recursively calls itself over snippets of it. According to the authors, the approach handles inputs up to two orders of magnitude beyond model context windows and outperforms base LLMs and common long-context scaffolds across four diverse long-context tasks, at comparable or lower cost per query.
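The decompose-and-recurse idea in the summary can be illustrated with a small Python sketch. This is not the authors' implementation: in the paper's description the model itself decides how to examine and split the prompt, whereas this sketch hard-codes a simple halve-and-merge policy. The names `rlm`, `toy_llm`, and `max_lines` are hypothetical, and `toy_llm` is a keyword filter standing in for a real model call.

```python
from typing import Callable, List

# Hypothetical signature for a model call: (query, context) -> answer text.
LLM = Callable[[str, str], str]


def toy_llm(query: str, context: str) -> str:
    """Stand-in for a real model: return the context lines that mention the query."""
    return "\n".join(line for line in context.splitlines() if query in line)


def rlm(query: str, lines: List[str], llm: LLM, max_lines: int = 50) -> str:
    """Answer `query` over `lines` without showing the model more than `max_lines` at once.

    Assumption: a fixed halving policy replaces the model-written decomposition,
    and partial answers are short enough to be merged in a single call.
    """
    # Base case: the snippet fits the (hypothetical) window, so call the model directly.
    if len(lines) <= max_lines:
        return llm(query, "\n".join(lines))

    # Recursive case: split the prompt and recurse on each half.
    mid = len(lines) // 2
    partials = [
        rlm(query, lines[:mid], llm, max_lines),
        rlm(query, lines[mid:], llm, max_lines),
    ]

    # Merge step: one more model call over the partial answers only.
    merged = [line for part in partials for line in part.splitlines()]
    return llm(query, "\n".join(merged))


if __name__ == "__main__":
    # A 1000-line "prompt" with one relevant line buried in filler.
    prompt = [f"filler line {i}" for i in range(1000)]
    prompt[637] = "the secret code is 4711"
    print(rlm("secret", prompt, toy_llm))
```

The point of the design is that no single call sees the whole prompt: each call works on a snippet or on already-condensed partial answers, which is how inputs larger than the window stay tractable in this toy setting.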
Key quotes
We propose Recursive Language Models (RLMs), a general inference strategy that treats long prompts as part of an external environment and allows the LLM to programmatically examine, decompose, and recursively call itself over snippets of the prompt.
We find that RLMs successfully handle inputs up to two orders of magnitude beyond model context windows and, even for shorter prompts, dramatically outperform the quality of base LLMs and common long-context scaffolds across four diverse long-context tasks.
RLMs have comparable (or cheaper) cost per query while handling inputs far beyond standard context limits.
