LLM Token Counter
Approximate token counts across the four major tokenizer families. All counts are estimates — use the provider's tokenizer for exact billing.
| Family | Tokenizer notes | Chars | Words | Tokens (approx) |
|---|---|---|---|---|
| OpenAI (GPT-4 / o1 / o3) | tiktoken BPE (cl100k_base for GPT-4, o200k_base for o1 / o3); English prose ≈ 4 chars/token | 0 | 0 | 0 |
| Anthropic (Claude 4 family) | Anthropic tokenizer; slightly tighter than GPT | 0 | 0 | 0 |
| Google (Gemini 1.5 / 2.0) | SentencePiece; close to GPT baseline | 0 | 0 | 0 |
| Meta (Llama 3 / 3.3) | Tiktoken-derived BPE with code-friendly merges | 0 | 0 | 0 |
Approximate only. No browser-side library reproduces every provider's exact tokenizer. Numbers here are character-ratio heuristics adjusted for multi-byte text. They're typically within 10% of the real count for English prose, but can drift more on code or CJK input.
About LLM Token Counter
Approximate token counts for the four major tokenizer families — OpenAI (GPT-4/o-series), Anthropic (Claude 4), Google (Gemini 2.0), and Meta (Llama 3.3) — without shipping a multi-megabyte WASM tokenizer. Useful for quick budget checks and prompt sizing before you hit the API.
Accuracy note
Counts are character-ratio heuristics with a multi-byte adjustment. They are typically within 10% for English prose, but can drift more on code-heavy or CJK input. The table labels every count as approximate — never use these numbers for billing estimates without verifying against the provider's tokenizer.
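A character-ratio heuristic with a multi-byte adjustment could look roughly like the sketch below. This is an illustration, not the tool's actual code: the function names, the 4 chars/token default, and the rule that each non-ASCII character costs about one token are all assumptions chosen to match the description above.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0,
                    tokens_per_wide_char: float = 1.0) -> int:
    """Estimate a token count from character counts alone.

    Assumptions (not taken from any provider's tokenizer):
    - ASCII text averages `chars_per_token` characters per token.
    - Every non-ASCII character costs `tokens_per_wide_char` tokens,
      a rough stand-in for CJK and other multi-byte text.
    """
    ascii_chars = sum(1 for ch in text if ord(ch) < 128)
    wide_chars = len(text) - ascii_chars
    estimate = ascii_chars / chars_per_token + wide_chars * tokens_per_wide_char
    # Round up so any non-empty text counts as at least one token.
    return math.ceil(estimate)


def count_row(text: str) -> dict:
    """Build one table row: characters, whitespace-separated words, estimated tokens."""
    return {
        "chars": len(text),
        "words": len(text.split()),
        "tokens": estimate_tokens(text),
    }


print(count_row("hello world"))  # {'chars': 11, 'words': 2, 'tokens': 3}
print(estimate_tokens("日本語"))   # 3
```

Lowering `chars_per_token` toward 2 or 3 models code-heavy input. Accented Latin letters are also non-ASCII, so this simple rule overestimates them slightly.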
Pipeline
- LLM Cost Calculator — pipe the token count here to estimate spend.
- Prompt Template Tester — render a template with variables, then count the result.
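The pipeline above (render a template, count the result, estimate spend) can be sketched in a few lines. Everything here is illustrative: the template text, the helper names, the flat 4 chars/token ratio, and the price of 3.00 USD per million tokens are hypothetical placeholders, not values from any provider's price list.

```python
import math
from string import Template


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (assumed ratio, English prose)."""
    return math.ceil(len(text) / chars_per_token)


def estimate_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Convert a token count into an approximate spend in USD."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Step 1: render a template with variables.
template = Template("Summarize the following text for $audience:\n\n$body")
prompt = template.substitute(
    audience="a new engineer",
    body="Tokens are the units a model reads.",
)

# Step 2: count the rendered result.
tokens = estimate_tokens(prompt)

# Step 3: estimate spend with a hypothetical input price.
cost = estimate_cost(tokens, usd_per_million_tokens=3.00)

print(f"{tokens} tokens, about ${cost:.6f}")
```

Because the token count is itself an estimate, the resulting figure is only a ballpark and should be checked against the provider's tokenizer and current pricing.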
Frequently asked
- Why are the counts approximate?
- Each provider uses a different tokenizer. OpenAI uses tiktoken BPE encodings (cl100k_base for GPT-4, o200k_base for o1 and o3); Anthropic uses a proprietary tokenizer; Google uses SentencePiece; Meta uses a tiktoken-derived BPE. Shipping all four as WASM bundles would add ~4 MB to the page. Instead, this tool uses character-ratio heuristics (adjusted for multi-byte text) that are typically within 10% for English prose. For exact counts, use the provider's SDK.
- Why does code or CJK text have more tokens per character?
- BPE tokenizers learn merges from training data. Common English words become single tokens; rare words, code symbols, and CJK characters are split into smaller pieces. A Python function with many operators and identifiers can tokenize at 2–3 chars/token instead of the ~4 chars/token baseline for English prose.
- What is the context window and why does it matter?
- The context window is the maximum number of tokens a model can process in a single call, input and output combined. If your prompt plus expected output exceeds the window, the API will typically reject the request, and some serving stacks silently truncate the input instead. Always leave headroom for the output.
- How do I get exact token counts?
- For OpenAI: use the tiktoken Python library or the JS port. For Anthropic: call the token counting API endpoint (beta). For Gemini: use the countTokens method in the Google AI SDK. For Llama: use the tokenizer.json from the model repository with the Hugging Face tokenizers library.
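The advice to leave headroom for the output can be turned into a simple pre-flight check. The sketch below is one way to do it under stated assumptions: the function names are hypothetical, the 10% padding mirrors the heuristic's typical error on English prose, and the 8192-token window in the usage lines is a placeholder rather than any particular model's limit.

```python
import math


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 context_window: int, margin: float = 0.10) -> bool:
    """Check whether an estimated prompt plus reserved output fits the window.

    The prompt estimate is padded by `margin` (assumed 10%) because
    heuristic counts can undershoot the real tokenizer.
    """
    padded_prompt = math.ceil(prompt_tokens * (1 + margin))
    return padded_prompt + max_output_tokens <= context_window


def max_prompt_budget(context_window: int, max_output_tokens: int,
                      margin: float = 0.10) -> int:
    """Largest estimated prompt size that still leaves room for the output."""
    return math.floor((context_window - max_output_tokens) / (1 + margin))


# Hypothetical 8192-token window with 1024 tokens reserved for the reply.
print(max_prompt_budget(8192, 1024))   # 6516
print(fits_context(6516, 1024, 8192))  # True
print(fits_context(7500, 1024, 8192))  # False
```

For code-heavy or CJK prompts, where the estimate drifts further, a larger `margin` is the safer choice.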