Please share Gemini tokenize information

Hello.

Thank you so much for the recently announced Gemini-Pro API availability.

We use a lot of APIs, and in the case of OpenAI, we expose cl100k_base so that users can pre-calculate the number of tokens and avoid API errors.

But Gemini-Pro doesn’t expose any token information, so we have to rely on the character count. :sweat_smile:

Is it possible to share token information like OpenAI’s tiktoken?

Thank you for creating a good model. :smiley:
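Until token information is available, a character-based estimate is the usual workaround. A minimal sketch: the 4-characters-per-token ratio is just a common rule of thumb for English text (not an official Gemini figure), and `estimate_tokens` is a hypothetical helper name:

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    # Rough token estimate from character count. The ~4 chars/token
    # ratio is an assumption that only roughly holds for English text.
    return max(1, round(len(text) / chars_per_token))

print(estimate_tokens("why is sky blue?"))  # -> 4 (16 characters / 4)
```

This only approximates the real count, so it is best used with a safety margin when checking input limits.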

3 Likes

With the Vertex AI SDK (Python), we compute the number of tokens (and characters) like so:

Token Count Docs
It looks like this:

from vertexai.preview.generative_models import GenerativeModel

gemini_pro_model = GenerativeModel("gemini-pro")
print(gemini_pro_model.count_tokens("why is sky blue?"))
I do miss having a local implementation we could use like tiktoken; it would be great if one existed (I am not aware of any).
I hope it helps.

1 Like

Hey!

  • You can now count tokens locally with the Vertex AI SDK for Python (starting with version 1.57.0).
  • Check out this Medium article for details: Counting Gemini text tokens locally.
4 Likes

Thank you so much

Now I only need to make one request instead of two ^^

1 Like

How can I do this with Node.js?

1 Like