Understanding String Length Calculations for Emojis Across Programming Languages

By

program

9mo ago · 64 min read · en · Insight

Summary

This article examines how string length is calculated for emoji and other Unicode text in three programming languages: JavaScript, Swift, and Python. The author argues that JavaScript's approach of returning the number of UTF-16 code units (which gives 7 for "🤦🏼‍♂️") is actually reasonable, that Swift's approach is not unambiguously the best, and that Python 3's approach is the worst of the three. The article explains Unicode encodings and grapheme clusters, and why different languages make different design choices about what a string's length means.
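To make the numbers in the summary concrete, here is a minimal Python sketch that counts the same emoji in three different units. The helper names (`code_points`, `utf16_code_units`, `utf8_bytes`) are illustrative and do not come from the article, and the emoji is written out as explicit code point escapes so that the invisible zero-width joiner and variation selector cannot be lost in copy and paste.

```python
# Facepalm emoji from the summary, spelled out code point by code point:
# U+1F926 PERSON FACEPALMING, U+1F3FC skin tone modifier,
# U+200D ZERO WIDTH JOINER, U+2642 MALE SIGN, U+FE0F VARIATION SELECTOR-16
FACEPALM = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"


def code_points(s: str) -> int:
    """What Python 3's len() reports: the number of Unicode code points."""
    return len(s)


def utf16_code_units(s: str) -> int:
    """What JavaScript's .length reports: the number of UTF-16 code units.

    Each code unit is 2 bytes, so divide the encoded byte length by 2.
    The "-le" variant is used so that no byte order mark is prepended.
    """
    return len(s.encode("utf-16-le")) // 2


def utf8_bytes(s: str) -> int:
    """The storage size of the string when encoded as UTF-8."""
    return len(s.encode("utf-8"))


if __name__ == "__main__":
    print("code points:      ", code_points(FACEPALM))       # 5
    print("UTF-16 code units:", utf16_code_units(FACEPALM))  # 7
    print("UTF-8 bytes:      ", utf8_bytes(FACEPALM))        # 17
```

The two code points above U+FFFF each take a surrogate pair in UTF-16, which is how five code points become seven code units. Counting the emoji as a single user-perceived character, as Swift does, requires extended grapheme cluster segmentation, which the Python standard library does not provide, so it is left out of this sketch.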

Key quotes

From time to time, someone shows that in JavaScript the .length of a string containing an emoji results in a number greater than 1 (typically 2) and then proceeds to the conclusion that haha JavaScript is so broken
I will try to convince you that ridiculing JavaScript for this is less insightful than it first appears
Swift's approach to string length isn't unambiguously the best one
Python 3's approach is unambiguously the worst one, though