RFC 9839: Defining Safer Unicode Character Subsets for Protocols and Data Structures
By Bogdanp
Summary
RFC 9839 addresses which Unicode characters should be excluded from text fields in data structures and protocols, even though Unicode itself is the right choice for those fields. It explains why certain characters are problematic and proposes three subsets that are safer to adopt in practice.
Key quotes
Unicode is good. If you're designing a data structure or protocol that has text fields, they should contain Unicode characters encoded in UTF-8.
There's another question, though: 'Which Unicode characters?' The answer is 'Not all of them, please exclude some.'
It explains which characters are bad, and why, then offers three plausible less-bad subsets that you might want to use.
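The quotes above do not say which characters are considered bad, so the following is only a rough sketch of what such a check could look like. It assumes the three categories usually named in this context (surrogates, legacy control characters other than tab, line feed and carriage return, and noncharacters); the function names and category labels are hypothetical, and the RFC itself should be consulted for the normative ranges.

```python
from typing import List, Tuple


def is_surrogate(cp: int) -> bool:
    # UTF-16 surrogate code points, which are not valid in well-formed UTF-8.
    return 0xD800 <= cp <= 0xDFFF


def is_legacy_control(cp: int) -> bool:
    # Assumption: tab, line feed and carriage return are treated as acceptable.
    if cp in (0x09, 0x0A, 0x0D):
        return False
    # Remaining C0 controls, DEL, and the C1 controls.
    return cp <= 0x1F or 0x7F <= cp <= 0x9F


def is_noncharacter(cp: int) -> bool:
    # U+FDD0..U+FDEF plus the last two code points of every plane.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def problematic_code_points(text: str) -> List[Tuple[int, str]]:
    """Return (index, category) for each code point the sketch flags."""
    found = []
    for index, char in enumerate(text):
        cp = ord(char)
        if is_surrogate(cp):
            found.append((index, "surrogate"))
        elif is_legacy_control(cp):
            found.append((index, "legacy control"))
        elif is_noncharacter(cp):
            found.append((index, "noncharacter"))
    return found
```

A protocol that wants the strictest behaviour could reject any text field for which `problematic_code_points` returns a non-empty list; a more permissive one could reject only surrogates.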
