nip.code_validation.rollout_analysis.CodeValidationRolloutAnalyser
- class nip.code_validation.rollout_analysis.CodeValidationRolloutAnalyser(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, model_name: str, *, use_dummy_api: bool = False)
Base class for analysing code validation rollouts.
Methods Summary
__init__(hyper_params, settings, ...[, ...])
forward(rollouts[, use_tqdm]): Evaluate the rollouts.
Return an iterator over agent names and channel names to be analysed.
Attributes
client
The OpenAI client to use for interacting with the OpenAI API.
system_prompt_template_filename
The filename of the system prompt template.
name
Methods
- __init__(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, model_name: str, *, use_dummy_api: bool = False)
- abstract forward(rollouts: NestedArrayDict, use_tqdm: bool = False) -> dict[tuple[str, str], MaskedArray]
Evaluate the rollouts.
- Parameters:
rollouts (NestedArrayDict) – The rollouts to evaluate.
use_tqdm (bool) – Whether to use tqdm for progress bars.
- Returns:
evaluations (dict[tuple[str, str], ma.MaskedArray]) – The evaluations. A dictionary indexed by agent name and channel name, where evaluations[agent_name, channel_name] is an array of evaluations of shape (…)
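The return structure described above can be illustrated with a small, self-contained sketch. It does not call the library: the agent names, channel name, array shape, and the summarise helper are all hypothetical, and it assumes that masked entries mark positions with no evaluation. It only shows how a dictionary keyed by (agent_name, channel_name) with NumPy masked arrays as values might be consumed.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical stand-in for the output of `forward`. The agent names
# ("verifier", "prover"), the channel name ("main"), and the 2x3 shape
# are assumptions made for illustration only.
evaluations: dict[tuple[str, str], ma.MaskedArray] = {
    ("verifier", "main"): ma.masked_array(
        data=np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]),
        mask=np.array([[False, False, True], [False, True, True]]),
    ),
    ("prover", "main"): ma.masked_array(
        data=np.array([[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]]),
        mask=np.array([[False, False, False], [False, False, True]]),
    ),
}


def summarise(evaluations):
    """Compute the mean and the number of unmasked entries per key.

    Hypothetical helper, not part of the documented API.
    """
    summary = {}
    for (agent_name, channel_name), values in evaluations.items():
        summary[agent_name, channel_name] = {
            # MaskedArray.mean() ignores masked entries.
            "mean": float(values.mean()),
            # MaskedArray.count() returns the number of unmasked entries.
            "num_valid": int(values.count()),
        }
    return summary


summary = summarise(evaluations)
```

Indexing with a tuple, as in evaluations[agent_name, channel_name], is equivalent to evaluations[(agent_name, channel_name)], which is why the documented form needs no extra parentheses.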