Add GPT-5 and o-series model support to tiktoken notebook#2515
Open
edenfunf wants to merge 1 commit into openai:main
Conversation
The num_tokens_from_messages() and num_tokens_for_tools() functions raise NotImplementedError for GPT-5, o1, o3, and o4-mini models, which are now widely used.

- Update encoding table to include GPT-5 and o-series in the o200k_base row
- Add GPT-5/o-series branch to num_tokens_from_messages()
- Add GPT-5/o-series to the num_tokens_for_tools() model list
- Add gpt-5 and o4-mini to the verification loops
- Use max_completion_tokens instead of max_tokens (required by o-series)

Addresses openai#2436
Summary
The How_to_count_tokens_with_tiktoken.ipynb notebook currently raises NotImplementedError when trying to count tokens for GPT-5, o1, o3, or o4-mini models. These are among the most commonly used models today, so this is a significant usability gap.
Changes:
- Updated the encoding table to list GPT-5 and o-series models under o200k_base
- Added a gpt-5/o1/o3/o4 branch in num_tokens_from_messages() — these models use the same message token format as gpt-4o (tokens_per_message=3, tokens_per_name=1)
- Added gpt-5, o1, o3, o3-mini, and o4-mini to the model list in num_tokens_for_tools() — same tool token settings as gpt-4o
- Added gpt-5 and o4-mini to both verification loops so they get tested alongside the older models
- Switched from max_tokens to max_completion_tokens in the verification cell, since o-series models require this parameter
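The new model branch described in the bullets above can be sketched as follows. This is a minimal, self-contained illustration of the counting scheme, not the notebook's actual code: `encode` is a stand-in parameter for a tokenizer callable (in the real notebook it would be tiktoken's `o200k_base` encoder), and the model-prefix check is an assumption based on the PR description.

```python
def message_token_settings(model: str) -> tuple[int, int]:
    """Return (tokens_per_message, tokens_per_name) for a chat model.

    Per this PR, GPT-5 and o-series models share gpt-4o's message format:
    3 tokens of per-message overhead, plus 1 token when "name" is present.
    """
    if model.startswith(("gpt-5", "o1", "o3", "o4-mini", "gpt-4o")):
        return 3, 1
    raise NotImplementedError(f"No token-count settings for model {model!r}")


def num_tokens_from_messages(messages, model, encode):
    # `encode` is any callable mapping a string to a list of tokens;
    # injected here so the sketch runs without tiktoken installed.
    tokens_per_message, tokens_per_name = message_token_settings(model)
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens
```

With a whitespace "tokenizer" for demonstration, a single two-word user message counts as 3 (message overhead) + 1 ("user") + 2 (content) + 3 (reply priming) = 9 tokens.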
How I verified
- Confirmed via tiktoken.encoding_for_model() that gpt-5, o1, o3, o3-mini, and o4-mini all resolve to o200k_base
- The token-per-message format for these models matches gpt-4o based on the API behavior
- Cleared the stored outputs for the updated verification cells so they don't show stale results
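The max_completion_tokens switch mentioned in the changes can be sketched as a small request-building helper. This is a hypothetical illustration, not code from the notebook; the prefix list is an assumption covering the models named in this PR, and newer models may accept either parameter.

```python
def completion_params(model: str, limit: int) -> dict:
    """Build the token-limit portion of a chat completion request.

    o-series and GPT-5 models reject `max_tokens` and require
    `max_completion_tokens`; older chat models accept `max_tokens`.
    """
    if model.startswith(("gpt-5", "o1", "o3", "o4")):
        return {"model": model, "max_completion_tokens": limit}
    return {"model": model, "max_tokens": limit}
```

The resulting dict can be splatted into a chat completions call, e.g. `client.chat.completions.create(messages=msgs, **completion_params("o4-mini", 256))`.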
Addresses #2436
Commit Summary
- c6482fc Add GPT-5 and o-series model support to tiktoken token counting notebook
File Changes (1 file)
- M examples/How_to_count_tokens_with_tiktoken.ipynb (89)
Patch Links:
- https://github.com/openai/openai-cookbook/pull/2515.patch
- https://github.com/openai/openai-cookbook/pull/2515.diff