If you like this post, follow my journey on Twitter: https://x.com/random
Anthropic just announced that Claude Sonnet 4 supports a one-million-token context window.
OK, what does that mean exactly?
WTF is a token
A token isn't a word.
Think of it more as a Lego block.
Sometimes a token is a whole word ("dog"). Sometimes it's part of a word ("un" in "unprecedented"). Sometimes it's just punctuation.
On average, one token → 0.75 words in English.
Roughly:
1,000 tokens → 750 words → 2-3 pages
100,000 tokens → 75,000 words → a 300-page novel
1,000,000 tokens → 750,000 words → most of the Harry Potter series
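If you want to see these Lego blocks for yourself, here's a minimal sketch using tiktoken, OpenAI's open-source tokenizer. Claude uses its own tokenizer, so the exact splits and the word-to-token ratio will differ a little, but the idea is the same.

```python
# pip install tiktoken
import tiktoken

# cl100k_base is an OpenAI tokenizer; Claude's is different, but close
# enough to build intuition about how text splits into tokens.
enc = tiktoken.get_encoding("cl100k_base")

text = "The dog did something unprecedented."
token_ids = enc.encode(text)

# Decode each token id back to its text piece: whole words, word
# fragments, and punctuation each get their own token.
pieces = [enc.decode([t]) for t in token_ids]
print(pieces)
print(f"{len(text.split())} words -> {len(token_ids)} tokens")
```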
WTF is context
Context is the model’s working memory, or everything it can "see" while responding.
It isn't just what you typed, but the entire conversation history that’s accessible to the model all at once.
The real magic happens through something called "attention".
The model isn’t reading left-to-right like you do. It’s using attention to access the entire context at once BEFORE generating each token.
When you hit the context limit, the model literally can't take anything more into account. That limit decides whether the model sees your whole problem or only pieces of it.
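That "sees everything at once" trick is just a weighted average. Here's a minimal NumPy sketch of scaled dot-product attention, the core operation inside Claude and every other transformer, stripped of the learned projections, multiple heads, and masking that real models add on top:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: every position gets a weighted mix
    of ALL other positions at once -- no left-to-right scanning."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # how relevant is each token to each other token
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the whole context
    return weights @ V, weights

# A toy "context" of 5 token embeddings, 8 dimensions each (random stand-ins).
rng = np.random.default_rng(0)
context = rng.normal(size=(5, 8))

output, weights = attention(context, context, context)
print(weights.round(2))   # each row sums to 1: one token's view over the entire context
```

Every row of that weight matrix spans the whole context, which is exactly why the size of the context window matters so much.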
Why It Matters
As context limits grow, there’s one important thing to keep in mind:
Bigger context windows make prompting harder, not easier.
Yes, a model can now analyze an entire codebase at once, but it also has to work harder to filter out noise before it can generate a good response.
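To see why, here's a toy illustration of the softmax math alone (not any particular model): give one relevant token a fixed score advantage over a sea of identical noise tokens, and watch its share of attention shrink as the context grows.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# One "relevant" token scores 3.0; every noise token scores 0.0.
for n_noise in (10, 1_000, 100_000):
    scores = np.zeros(n_noise + 1)
    scores[0] = 3.0                          # the token that actually matters
    share = softmax(scores)[0]
    print(f"{n_noise + 1:>7} tokens in context -> relevant token gets {share:.1%} of attention")
```

Real models learn to concentrate their scores far better than this toy, but the underlying pressure is the same: more context means more competition for attention, and your prompt decides what wins it.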
The winners in this new world of more context won't be those with the most data, but those who can craft prompts that take advantage of how attention works at scale.
Subscribe for more on AI, automation, and startups.