nip.scenario_base.rollout_analysis.PureTextRolloutAnalyser
- class nip.scenario_base.rollout_analysis.PureTextRolloutAnalyser(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, model_name: str, *, use_dummy_api: bool = False)
Base class for rollout analysers which work in pure-text domains by calling APIs.
- Parameters:
hyper_params (HyperParameters) – The parameters of the experiment.
settings (ExperimentSettings) – The experiment settings.
protocol_handler (ProtocolHandler) – The protocol handler, which controls the interaction between agents.
model_name (str) – The name of the model to use to analyse the rollouts. This will be accessed using an API.
Methods Summary
__init__(hyper_params, settings, ...[, ...])
forward(rollouts[, use_tqdm]) – Evaluate the rollouts.
Return an iterator over agent names and channel names to be analysed.
Attributes
system_prompt_template_filename – The filename of the system prompt template.
name
Methods
- __init__(hyper_params: HyperParameters, settings: ExperimentSettings, protocol_handler: ProtocolHandler, model_name: str, *, use_dummy_api: bool = False)
- abstract forward(rollouts: NestedArrayDict, use_tqdm: bool = False) → dict[tuple[str, str], MaskedArray]
Evaluate the rollouts.
- Parameters:
rollouts (NestedArrayDict) – The rollouts to evaluate.
use_tqdm (bool) – Whether to use tqdm for progress bars.
- Returns:
evaluations (dict[tuple[str, str], ma.MaskedArray]) – The evaluations. A dictionary indexed by agent name and channel name, where evaluations[agent_name, channel_name] is an array of evaluations of shape (…)
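To illustrate the return structure described above, the following is a minimal sketch of a subclass-style forward method. It does not use the nip package: SketchAnalyser, MessageLengthAnalyser, agent_channel_pairs, the agent names "prover" and "verifier", the channel name "main", and the plain-dict rollout layout are all hypothetical stand-ins chosen for illustration. Only the shape of the result, a dictionary keyed by (agent_name, channel_name) holding NumPy masked arrays, follows the documentation.

```python
from abc import ABC, abstractmethod
from typing import Iterator

import numpy as np
import numpy.ma as ma


class SketchAnalyser(ABC):
    """Hypothetical stand-in for the base class; not the nip implementation."""

    @abstractmethod
    def forward(
        self, rollouts: dict, use_tqdm: bool = False
    ) -> dict[tuple[str, str], ma.MaskedArray]:
        """Evaluate the rollouts."""


class MessageLengthAnalyser(SketchAnalyser):
    """Toy analyser which scores each message by its length in characters."""

    def __init__(self, agent_names: list[str], channel_names: list[str]):
        self.agent_names = agent_names
        self.channel_names = channel_names

    def agent_channel_pairs(self) -> Iterator[tuple[str, str]]:
        # Hypothetical helper: iterate over the (agent, channel) pairs to analyse.
        for agent_name in self.agent_names:
            for channel_name in self.channel_names:
                yield agent_name, channel_name

    def forward(self, rollouts, use_tqdm=False):
        # use_tqdm is accepted for signature compatibility but ignored here.
        evaluations = {}
        for key in self.agent_channel_pairs():
            messages = rollouts[key]  # assumed layout: rows of str or None
            # Mask positions where the agent sent no message on this channel.
            missing = np.array([[m is None for m in row] for row in messages])
            lengths = np.array(
                [[0 if m is None else len(m) for m in row] for row in messages],
                dtype=float,
            )
            evaluations[key] = ma.masked_array(lengths, mask=missing)
        return evaluations


# Assumed toy rollouts: 2 rollouts, 2 rounds each, None marks "no message".
rollouts = {
    ("prover", "main"): [["hello", None], ["hi", "ok then"]],
    ("verifier", "main"): [[None, "accept"], ["why?", None]],
}
analyser = MessageLengthAnalyser(["prover", "verifier"], ["main"])
evaluations = analyser.forward(rollouts)
```

Masked entries are skipped by reductions such as evaluations["prover", "main"].mean(), which is one plausible reason for returning masked arrays rather than plain arrays when not every agent speaks on every channel in every round.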