nip.code_validation.rollout_analysis.CodeValidationRolloutAnalyser

class nip.code_validation.rollout_analysis.CodeValidationRolloutAnalyser(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, model_name: str, *, use_dummy_api: bool = False)

Base class for analysing code validation rollouts.

Methods Summary

__init__(hyper_params, settings, ...[, ...])

forward(rollouts[, use_tqdm])

Evaluate the rollouts.

relevant_agents_and_channels()

Return an iterator over agent names and channel names to be analysed.

Attributes

client

The OpenAI client to use for interacting with the OpenAI API.

system_prompt_template_filename

The filename of the system prompt template.

name

Methods

__init__(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, model_name: str, *, use_dummy_api: bool = False)

abstract forward(rollouts: NestedArrayDict, use_tqdm: bool = False) → dict[tuple[str, str], MaskedArray]

Evaluate the rollouts.

Parameters:
  • rollouts (NestedArrayDict) – The rollouts to evaluate.

  • use_tqdm (bool) – Whether to use tqdm for progress bars.

Returns:

evaluations (dict[tuple[str, str], ma.MaskedArray]) – The evaluations, as a dictionary keyed by (agent_name, channel_name) tuples, where evaluations[agent_name, channel_name] is a masked array of evaluations of shape (…).
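The returned dictionary can be consumed like any mapping keyed by (agent_name, channel_name) tuples. The following is a minimal sketch of that, not code from the library: the agent names, channel names, values and the summarise helper are all hypothetical, and the dictionary is built by hand instead of coming from a real forward() call. It relies only on the standard numpy.ma behaviour that masked entries are skipped by mean() and count().

```python
import numpy.ma as ma

# Hypothetical stand-in for the dictionary returned by forward().
# Agent names, channel names and values are invented for illustration.
evaluations = {
    ("verifier", "main"): ma.masked_array(
        [1.0, 0.0, 1.0, 0.0], mask=[False, False, True, False]
    ),
    ("prover", "main"): ma.masked_array(
        [0.0, 1.0, 1.0, 1.0], mask=[False, True, False, False]
    ),
}


def summarise(evaluations):
    """Compute the mean and the number of unmasked entries per key."""
    summary = {}
    for (agent_name, channel_name), values in evaluations.items():
        summary[agent_name, channel_name] = {
            # Masked entries are excluded from the mean.
            "mean": float(values.mean()),
            # count() returns the number of unmasked entries.
            "num_valid": int(values.count()),
        }
    return summary


summary = summarise(evaluations)
for key, stats in summary.items():
    print(key, stats)
```

Indexing with evaluations[agent_name, channel_name] and evaluations[(agent_name, channel_name)] is equivalent in Python, which is why the tuple key can be written without parentheses.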

abstract relevant_agents_and_channels() → Iterator[tuple[str, str]]

Return an iterator over agent names and channel names to be analysed.

Yields:
  • agent_name (str) – The name of the agent.

  • channel_name (str) – The name of the channel.
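Because relevant_agents_and_channels() is abstract, concrete analysers are expected to supply it, typically as a generator. The sketch below illustrates that pattern only. RolloutAnalyserSketch and VerifierOnlyAnalyser are hypothetical stand-ins and not part of nip, the constructor is simplified, and the agent name "verifier" and the channel names are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Iterator


class RolloutAnalyserSketch(ABC):
    """Hypothetical stand-in for the base class; not the real nip class."""

    @abstractmethod
    def relevant_agents_and_channels(self) -> Iterator[tuple[str, str]]:
        """Yield (agent_name, channel_name) pairs to be analysed."""


class VerifierOnlyAnalyser(RolloutAnalyserSketch):
    """Hypothetical subclass that analyses one agent on several channels."""

    def __init__(self, channel_names):
        self.channel_names = list(channel_names)

    def relevant_agents_and_channels(self) -> Iterator[tuple[str, str]]:
        # "verifier" is an assumed agent name used only for illustration.
        for channel_name in self.channel_names:
            yield "verifier", channel_name


analyser = VerifierOnlyAnalyser(["main", "side"])
pairs = list(analyser.relevant_agents_and_channels())
print(pairs)
```

Each yielded pair has the same (agent_name, channel_name) form as the keys of the dictionary returned by forward().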