nip.utils.hugging_face.count_tokens#
- nip.utils.hugging_face.count_tokens(rollouts: NestedArrayDict, agent_id: int, model_name: str) TokenCounts[source]#
- Count the number of tokens in the rollouts. - This function counts both the prompt and completion tokens for each rollout and round. It uses the Hugging Face tokenizer for the specified model. - For the prompt, it first takes the chat history and puts it into the chat template for the model. - Parameters:
- rollouts (NestedArrayDict) – - The rollouts nested array dictionary. Has keys: - (“agents”, “prompt”) (rollout round agent message field): The prompt messages passed to each model, as a chat history. 
- (“agents”, “raw_message”) (rollout round agent): The completion messages returned by each model. 
 
- agent_id (int) – The ID of the agent for which to count tokens. This is used to index into the rollouts dictionary. 
- model_name (str) – The name of the model to use for tokenization, typically a Hugging Face identifier. 
 
- Returns:
- token_count (TokenCount) – A dataclass containing the token counts for prompts, completions, and total tokens. The shape of all elements is (rollout round).