nip.scenario_base.rollout_analysis.PureTextRolloutAnalyser

class nip.scenario_base.rollout_analysis.PureTextRolloutAnalyser(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, model_name: str, *, use_dummy_api: bool = False)

Base class for rollout analysers which work in pure-text domains by calling APIs.

Parameters:

hyper_params (HyperParameters) – The parameters of the experiment.
settings (ExperimentSettings) – The experiment settings.
protocol_handler (ProtocolHandler) – The protocol handler, which controls the interaction between agents.
model_name (str) – The name of the model to use to analyse the rollouts. This will be accessed using an API.

Methods Summary

`__init__`(hyper_params, settings, ...[, ...])
`forward`(rollouts[, use_tqdm])	Evaluate the rollouts.
`relevant_agents_and_channels`()	Return an iterator over agent names and channel names to be analysed.

Attributes

`system_prompt_template_filename`	The filename of the system prompt template.
`name`

Methods

__init__(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, model_name: str, *, use_dummy_api: bool = False)

abstract forward(rollouts: NestedArrayDict, use_tqdm: bool = False) → dict[tuple[str, str], MaskedArray]

Evaluate the rollouts.

Parameters:

rollouts (NestedArrayDict) – The rollouts to evaluate.
use_tqdm (bool) – Whether to use tqdm for progress bars.

Returns:

evaluations (dict[tuple[str, str], ma.MaskedArray]) – The evaluations. A dictionary indexed by agent name and channel name, where evaluations[agent_name, channel_name] is an array of evaluations of shape (…)
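
The return structure described above can be consumed with ordinary NumPy masked-array operations. The following is a minimal sketch, not the library's own code: the agent names, channel name, scores, and the helper `mean_scores` are all hypothetical and serve only to illustrate a dictionary keyed by (agent_name, channel_name) whose values are masked arrays.

```python
import numpy.ma as ma

# Hypothetical evaluations, shaped like the return value of `forward`.
# The agent names, channel name and scores are made up for illustration.
evaluations: dict[tuple[str, str], ma.MaskedArray] = {
    ("prover", "main"): ma.masked_array(
        [[1.0, 0.0, 1.0], [0.0, 0.0, 0.0]],
        mask=[[False, False, False], [False, True, True]],
    ),
    ("verifier", "main"): ma.masked_array(
        [[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]],
        mask=[[False, False, True], [False, False, True]],
    ),
}


def mean_scores(
    evaluations: dict[tuple[str, str], ma.MaskedArray],
) -> dict[tuple[str, str], float]:
    """Average each (agent, channel) entry over its unmasked elements only."""
    return {key: float(values.mean()) for key, values in evaluations.items()}


scores = mean_scores(evaluations)
for (agent_name, channel_name), score in scores.items():
    print(f"{agent_name}/{channel_name}: {score:.2f}")
```

Masked elements are excluded from the mean, so entries with no evaluation do not pull the average down.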

abstract relevant_agents_and_channels() → Iterator[tuple[str, str]]

Return an iterator over agent names and channel names to be analysed.

Yields:

agent_name (str) – The name of the agent.
channel_name (str) – The name of the channel.
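
One way a subclass might implement this method is as a generator. The sketch below uses stand-in classes rather than the real nip classes: `AnalyserSketch`, `AllPairsAnalyser`, and the agent and channel names are assumptions introduced for illustration only.

```python
from abc import ABC, abstractmethod
from collections.abc import Iterator


class AnalyserSketch(ABC):
    """Stand-in for the abstract interface; not the real nip class."""

    @abstractmethod
    def relevant_agents_and_channels(self) -> Iterator[tuple[str, str]]:
        """Yield (agent_name, channel_name) pairs to be analysed."""


class AllPairsAnalyser(AnalyserSketch):
    """Hypothetical subclass that analyses every agent on every channel."""

    def __init__(self, agent_names: list[str], channel_names: list[str]):
        self.agent_names = agent_names
        self.channel_names = channel_names

    def relevant_agents_and_channels(self) -> Iterator[tuple[str, str]]:
        # Yield one (agent_name, channel_name) tuple per combination.
        for agent_name in self.agent_names:
            for channel_name in self.channel_names:
                yield agent_name, channel_name


analyser = AllPairsAnalyser(["prover", "verifier"], ["main"])
pairs = list(analyser.relevant_agents_and_channels())
print(pairs)
```

A real analyser would more likely restrict the pairs to those it can meaningfully evaluate, rather than yielding every combination.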
