How Typing Habits Like Typos and Filler Words Inflate AI Token Counts

ppipada

23d ago · 3 min read

Summary

This article explores how everyday typing habits (typos, shorthand, filler words, pasted IDs, and stray whitespace) can noticeably change token counts in AI language model tokenizers. The author illustrates this with a simple example: a 5-word prompt with 2 spelling mistakes used 13 tokens, while the corrected version used only 6 tokens, including the full stop. The article compares OpenAI's and Claude's tokenizers, noting that in the author's usage Claude tends to produce more tokens for the same text. The core insight is that ordinary typing habits can change token counts without substantially changing intent, which has cost implications because providers bill per token.
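To make the mechanism concrete, here is a minimal sketch of why a misspelled word can cost more tokens. It uses a toy greedy longest-match tokenizer with a tiny hypothetical vocabulary, not OpenAI's or Claude's real tokenizer; the prompt, the vocabulary, and the `tokenize` helper are all assumptions made for illustration, and the counts it produces are not the ones reported in the article. The idea it shows is only that a familiar word matches one vocabulary entry, while a misspelling falls back to several smaller pieces.

```python
# Toy illustration only: real tokenizers use learned vocabularies with
# tens of thousands of entries. This vocabulary is hypothetical.
VOCAB = {"please", " summarize", " this", " report", ".", "pl", "se", " sum"}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split; unknown text falls back to single characters."""
    max_len = max(len(piece) for piece in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until something matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab or length == 1:
                tokens.append(piece)
                i += length
                break
    return tokens


clean_tokens = tokenize("please summarize this report.")
typo_tokens = tokenize("plaese summarzie this report.")

print(len(clean_tokens), clean_tokens)
print(len(typo_tokens), typo_tokens)
```

In this toy setup the two misspelled words break into many short fragments while the untouched words still match whole entries, which is the same shape of effect the author describes. To measure real prompts, count tokens with the provider's own tokenizer or token-counting endpoint instead.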

Key quotes

I started noticing this on a tiny prompt: 5 words, 2 spelling mistakes, 13 tokens. I fixed the spelling and sent it again: 6 tokens, including the full stop.

In general Claude spits out more tokens on the same text compared to OpenAI in my usage.

Ordinary habits like typos, shorthand, filler words, pasted IDs, and stray whitespace can change token counts without changing intent much.

Snippet from the RSS feed

Normal human habits like swapped letters, fillers, shorthand, pasted IDs, boundary whitespace, and nearby word forms can change token count without changing intent much.
