Gemma 4 E2B Browser Tool Generates Diagrams from Text Prompts Using WebGPU
By teamchong
1mo ago · 1 min read · en · News
Summary
The article describes TurboQuant Prompt → Diagram, a browser-based tool that runs Gemma 4 E2B locally in desktop Chrome to generate diagrams from text prompts. The model's output is converted into compact code that is rendered as an Excalidraw diagram. A WGSL compute-shader implementation of the TurboQuant algorithm compresses the KV cache about 2.4×, and the demo reports 30+ tokens per second on M1 hardware with no server and no API key. It requires desktop Chrome 134+ with WebGPU subgroups and shader-f16 support.
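The browser requirement above can be checked before loading the model. The following is a minimal sketch, not the demo's own code: it assumes the standard WebGPU feature names `shader-f16` and `subgroups`, and the helper names `missingFeatures` and `checkWebGpuSupport` are hypothetical.

```typescript
// Feature names as defined by the WebGPU specification (GPUFeatureName).
const REQUIRED_FEATURES: readonly string[] = ["shader-f16", "subgroups"];

// Minimal structural type so the sketch compiles without @webgpu/types.
interface FeatureSet {
  has(name: string): boolean;
}

// Hypothetical helper: returns the required features the adapter does not expose.
function missingFeatures(
  features: FeatureSet,
  required: readonly string[] = REQUIRED_FEATURES
): string[] {
  return required.filter((name) => !features.has(name));
}

// Hypothetical browser entry point: resolves to the list of missing features,
// or null when WebGPU is unavailable or no adapter can be obtained.
async function checkWebGpuSupport(): Promise<string[] | null> {
  const gpu = (globalThis as any).navigator?.gpu;
  if (!gpu) return null;
  const adapter = await gpu.requestAdapter();
  if (!adapter) return null;
  return missingFeatures(adapter.features);
}
```

An empty array from `checkWebGpuSupport()` means both features are present. A page would still need to pass them in `requiredFeatures` when calling `adapter.requestDevice()` before shaders can rely on them.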
Key quotes · 4 pulled
Describe any diagram, Gemma 4 E2B generates it as Excalidraw — entirely in your browser.
The TurboQuant algorithm (polar + QJL) compresses the KV cache ~2.4× so longer conversations fit in GPU memory.
This demo reimplements the TurboQuant algorithm in WGSL compute shaders so it runs on the GPU at 30+ tok/s.
Gemma 4 E2B runs locally in desktop Chrome. KV cache compressed 2.4× via WGSL compute shaders. 30+ tok/s on M1, no server, no API key. Chrome 134+ desktop only — needs WebGPU subgroups + shader-f16.
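To see why a roughly 2.4× smaller KV cache lets longer conversations fit in GPU memory, a back-of-the-envelope calculation helps. The sketch below is illustrative only: the layer count, head count, head dimension, and memory budget are hypothetical placeholders, not Gemma 4 E2B's actual configuration, and the function names are made up for this example.

```typescript
// Hypothetical attention shape; NOT the real Gemma 4 E2B configuration.
interface KvCacheShape {
  layers: number;          // transformer layers that keep a KV cache
  kvHeads: number;         // key/value heads per layer
  headDim: number;         // dimension of each head
  bytesPerElement: number; // 2 for f16
}

// Uncompressed KV cache size for a given number of cached tokens.
function kvCacheBytes(shape: KvCacheShape, tokens: number): number {
  // Factor 2: one key tensor plus one value tensor per layer.
  return 2 * shape.layers * shape.kvHeads * shape.headDim * shape.bytesPerElement * tokens;
}

// How many tokens fit in a memory budget at a given compression ratio.
function tokensThatFit(shape: KvCacheShape, budgetBytes: number, compressionRatio = 1): number {
  const bytesPerToken = kvCacheBytes(shape, 1) / compressionRatio;
  return Math.floor(budgetBytes / bytesPerToken);
}

// Illustrative numbers only.
const exampleShape: KvCacheShape = { layers: 32, kvHeads: 4, headDim: 128, bytesPerElement: 2 };
const exampleBudget = 512 * 1024 * 1024; // 512 MiB reserved for the cache

console.log(tokensThatFit(exampleShape, exampleBudget));      // uncompressed
console.log(tokensThatFit(exampleShape, exampleBudget, 2.4)); // with ~2.4x compression
```

Under these assumed numbers each token costs 64 KiB uncompressed, so the same budget holds about 2.4 times as many tokens once the cache is compressed. The real figures depend on the model's actual shape and on how much GPU memory the browser grants.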
