kweezar

1 article on Hacker News: Front Page

Articles (1)

TurboQuant: Compressing AI Vectors to 2-4 Bits Using Random Rotations

TurboQuant is a compression technique for AI vectors (KV caches, embeddings, attention keys) that quantizes each coordinate to 2-4 bits without losing accuracy. The key insight is that in high dimensions, a random rotation transforms input vectors into ones with known coordinate distributions, enabling provably near-optimal distortion with

arkaung.github.io, 1mo ago
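
The rotate-then-quantize idea in the summary above can be illustrated with a minimal NumPy sketch. This is not the TurboQuant algorithm itself: the uniform quantizer, the 3-standard-deviation clipping range, and the function names (`random_rotation`, `quantize`, `dequantize`) are assumptions made for illustration. It relies only on the fact that a randomly rotated unit vector in dimension d has coordinates that are approximately Gaussian with variance 1/d, so one fixed scalar quantizer can serve every coordinate.

```python
import numpy as np


def random_rotation(d, rng):
    """Draw a random orthogonal d x d matrix via QR of a Gaussian matrix."""
    g = rng.standard_normal((d, d))
    q, r = np.linalg.qr(g)
    # Flip column signs so the result is uniformly distributed over rotations.
    return q * np.sign(np.diag(r))


def quantize(x, rotation, bits):
    """Rotate x, then map each coordinate to one of 2**bits uniform bins."""
    d = x.shape[0]
    norm = np.linalg.norm(x)
    y = rotation @ (x / norm)      # coordinates are now roughly N(0, 1/d)
    clip = 3.0 / np.sqrt(d)        # assumed range: 3 standard deviations
    levels = 2 ** bits
    step = 2 * clip / levels
    codes = np.clip(np.floor((y + clip) / step), 0, levels - 1)
    return codes.astype(np.uint8), norm


def dequantize(codes, norm, rotation, bits):
    """Map codes back to bin midpoints and undo the rotation."""
    d = codes.shape[0]
    clip = 3.0 / np.sqrt(d)
    step = 2 * clip / 2 ** bits
    y_hat = (codes + 0.5) * step - clip
    return norm * (rotation.T @ y_hat)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 256
    rotation = random_rotation(d, rng)
    x = np.zeros(d)
    x[0] = 5.0                     # a "spiky" vector, hard to quantize directly
    for bits in (2, 3, 4):
        codes, norm = quantize(x, rotation, bits)
        x_hat = dequantize(codes, norm, rotation, bits)
        rel_err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
        print(f"{bits} bits: relative error {rel_err:.3f}")
```

The sketch stores the vector norm separately at full precision and keeps the rotation matrix in memory. A practical system would likely use a structured fast rotation and a quantizer tuned to the known coordinate distribution, but those details are not given in the summary.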