The Invalid Surrogate Pair Bug: A Software Engineer's Tale of Emoji and Encoding
By meysamazad
Summary
A software engineer recounts their favorite bug story involving invalid surrogate pairs in Unicode/emoji handling. While migrating a legacy editor to a real-time collaborative experience using TipTap, ProseMirror, and Yjs CRDT, they encountered a bug where two emoji characters would cause the editor to break when entered together. The article explores the technical underpinnings of surrogate pairs in UTF-16 encoding, how emoji are represented, and the fascinating edge case that led to data loss. An interactive tool is provided for readers to explore the concepts.
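The surrogate-pair mechanics mentioned in the summary can be illustrated with a short TypeScript sketch. It shows how a code point above U+FFFF becomes two UTF-16 code units, and how cutting a string between those units leaves a lone surrogate. The function names `toSurrogatePair` and `hasLoneSurrogate` are hypothetical helpers written for this illustration. The sketch does not reproduce the article's actual failure inside TipTap, ProseMirror, or Yjs, and the split shown here is only an assumed example of how an invalid pair could arise.

```typescript
// Illustrative sketch only: hypothetical helpers, not code from the article.

// Encode a code point above U+FFFF as a [high, low] UTF-16 surrogate pair.
function toSurrogatePair(codePoint: number): [number, number] {
  const offset = codePoint - 0x10000;
  const high = 0xd800 + (offset >> 10); // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff); // bottom 10 bits
  return [high, low];
}

// Return true if the string contains a surrogate without its partner.
function hasLoneSurrogate(text: string): boolean {
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: the next unit must be a low surrogate.
      const next = text.charCodeAt(i + 1); // NaN at the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the valid partner
        continue;
      }
      return true;
    }
    if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate with no preceding high surrogate.
      return true;
    }
  }
  return false;
}

// U+1F600 (grinning face) occupies two code units in a JavaScript string.
const [high, low] = toSurrogatePair(0x1f600);
const emoji = String.fromCharCode(high, low);

// Assumed failure mode: an edit that splits on code-unit offsets
// cuts the pair in half and leaves an invalid fragment behind.
const brokenHalf = emoji.slice(0, 1);

console.log(high.toString(16), low.toString(16));
console.log(emoji.length);
console.log(hasLoneSurrogate(emoji), hasLoneSurrogate(brokenHalf));
```

Because JavaScript string offsets count UTF-16 code units, any layer that inserts, deletes, or slices at an offset falling between the two halves can produce such a fragment. Runtimes that support `String.prototype.isWellFormed` offer a built-in version of the same check.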
Key quotes
If you're in the business of building things that run on computers long enough, I think you will eventually acquire a favorite bug story.
The bug: two emoji enter, none leave.
TipTap on top (itself a wrapper around ProseMirror), Yjs underneath handling the CRDT magic for real-time syncing. It worked well! Mostly.
