# How an Experiment is Built
This is an overview of the main steps taken when the `run_experiment` function is called, which is the main entry point for running an experiment.
We assume that code of the following form has been executed:

```python
from nip import HyperParameters, run_experiment

hyper_params = HyperParameters(...)
run_experiment(hyper_params)
```
1. We check whether the hyper-parameters need to be modified. There are two reasons we might want to do this, both controlled by the `BaseRunParameters` sub-parameters located at `hyper_params.base_run` (a hedged sketch of this logic is given at the end of this page).

   - If `hyper_params.base_run` is `"parameters"`, we copy the parameters from a previous run stored in Weights & Biases. This is useful if we want to resume a run.
   - If `hyper_params.base_run` is `"rerun_tests"`, we copy the parameters from a previous run stored in Weights & Biases, except for the parameters which control how tests are run. This is useful if we have a previous run without tests, and we want to rerun it, doing just the testing loop.
2. We set up Weights & Biases, if the `use_wandb` argument of `run_experiment` is set to `True`.

3. An `ExperimentSettings` object is created, which contains various settings for the experiment that are not relevant to reproducibility (e.g. the GPU device number and the Weights & Biases run).
4. The `ScenarioInstance` object is created, which contains all the components of the experiment. This is done by calling the `build_scenario_instance` function, which executes the following steps.

   1. We build the `ProtocolHandler`, which will handle the interaction protocol. This is done by calling the `build_protocol_handler` function, which looks up the appropriate protocol handler for the parameters in the registry.

   2. The train and test datasets are loaded by initialising the appropriate `Dataset` class.
   3. We build the agents. Agents typically consist of multiple parts, and which parts get built depends on the hyper-parameters. Each agent specified in the `AgentsParameters` object located at `hyper_params.agents` is built in the following steps. Here `agent_params = hyper_params.agents[agent_name]` is an instance of `AgentParameters`. (The agent-building sketch at the end of this page illustrates this loop.)

      1. We set the seed based on `hyper_params.seed` and the agent name.

      2. Agents are either composed of parts, like bodies and heads, or are a single entity (a `WholeAgent`). Which of these options applies, and which parts are built, is determined by the hyper-parameters. For example, TensorDict-based RL trainers require agents consisting of parts, with a policy head and a value head (see Creating a New Trainer for more information). These parts are built by initialising the appropriate `AgentPart` classes.

      3. An instance of the `Agent` dataclass is created, which holds all the parts of the agent.

      4. If we're loading a checkpoint (i.e. `agent_params.load_checkpoint_and_parameters` is `True`), we load the checkpoint and parameters from the Weights & Biases run specified by `agent_params.checkpoint_run_id` (see the checkpoint sketch at the end of this page). Otherwise, we let the agent's weights be initialised randomly.
   4. If set in the hyper-parameters, pretrained embeddings for each agent are loaded into the datasets. This is done by initialising the appropriate `PretrainedModel` class and generating embeddings.

   5. If the trainer and scenario are pure-text based (see TensorDict or Pure Text Trainer? and TensorDict or Pure Text Scenario?), we also build shared model groups (instances of `PureTextSharedModelGroup`). These provide an interface for dealing with agents which share an underlying model, allowing fine-tuning jobs to be run at the group level rather than at the agent level.

   6. For RL trainers, the following additional components are built.
      - The train and test environments are built by initialising the appropriate `Environment` class.
      - The agent parts are combined into combined agent parts (instances of `CombinedAgentPart`). Each combined agent part contains the corresponding parts of all agents, so it can be treated as a single actor in reinforcement learning, with observations and actions indexed by a new agent dimension (illustrated in the final sketch at the end of this page). This allows working easily with the TorchRL library.
5. The trainer is built by initialising the appropriate `Trainer` class.

6. Finally, the trainer is run by calling the `train` method of the trainer.
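
The sketches below illustrate some of the steps above in code. They are hedged, illustrative sketches rather than the actual nip implementation: any helper names, field names and signatures that do not appear in the steps above are assumptions.

First, the base-run logic from step 1. Only the two `base_run` modes (`"parameters"` and `"rerun_tests"`) are taken from the description above; `fetch_run_config`, `TEST_PARAMETER_NAMES` and the Weights & Biases entity/project names are hypothetical.

```python
import wandb

# Hypothetical set of hyper-parameter keys that control how tests are run.
TEST_PARAMETER_NAMES = {"test_scheme", "num_test_iterations"}


def fetch_run_config(run_id: str) -> dict:
    """Fetch the hyper-parameter config of a previous Weights & Biases run."""
    api = wandb.Api()
    return dict(api.run(f"my-entity/my-project/{run_id}").config)


def apply_base_run(params: dict, base_run_mode: str, base_run_id: str) -> dict:
    """Sketch of how a base run could modify the current hyper-parameters."""
    base_config = fetch_run_config(base_run_id)
    if base_run_mode == "parameters":
        # Resume a run: take all hyper-parameters from the previous run.
        return base_config
    if base_run_mode == "rerun_tests":
        # Keep the previous run's parameters, except those controlling how tests
        # are run, which are taken from the current parameters instead.
        return {
            key: params.get(key, value) if key in TEST_PARAMETER_NAMES else value
            for key, value in base_config.items()
        }
    return params
```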
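
Next, the agent-building loop (the third sub-step of building the `ScenarioInstance`). The overall sequence — seed, build the parts or a `WholeAgent`, wrap everything in the `Agent` dataclass — follows the steps above, but the `Agent` fields, the seed derivation and `build_agent_parts` are stand-ins invented for this sketch.

```python
import zlib
from dataclasses import dataclass, field

import torch


@dataclass
class Agent:
    """Simplified stand-in for nip's Agent dataclass, holding an agent's parts."""

    name: str
    parts: dict[str, torch.nn.Module] = field(default_factory=dict)


def build_agent_parts(agent_params) -> dict[str, torch.nn.Module]:
    """Hypothetical helper: build the parts the trainer needs (e.g. policy/value heads)."""
    return {"policy_head": torch.nn.Linear(8, 2), "value_head": torch.nn.Linear(8, 1)}


def build_agents(hyper_params) -> dict[str, Agent]:
    agents: dict[str, Agent] = {}
    for agent_name, agent_params in hyper_params.agents.items():
        # 1. Seed based on the global seed and the agent name (exact derivation assumed).
        torch.manual_seed(hyper_params.seed + zlib.crc32(agent_name.encode()))

        # 2. Build the agent's parts (or a single WholeAgent), as dictated by the
        #    hyper-parameters.
        parts = build_agent_parts(agent_params)

        # 3. Wrap the parts in the Agent dataclass.
        agents[agent_name] = Agent(name=agent_name, parts=parts)

        # 4. If agent_params.load_checkpoint_and_parameters is True, the weights are
        #    restored from the W&B run agent_params.checkpoint_run_id instead of being
        #    left randomly initialised (sketched separately below).
    return agents
```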
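
The checkpoint loading referred to in that loop could, for example, use the public Weights & Biases API as below. The run path, checkpoint file name and state-dict format are assumptions; the actual nip code stores and restores checkpoints in its own way.

```python
from pathlib import Path

import torch
import wandb


def restore_from_wandb(module: torch.nn.Module, run_id: str) -> None:
    """Hedged sketch: download a checkpoint from a previous W&B run and load it."""
    api = wandb.Api()
    run = api.run(f"my-entity/my-project/{run_id}")  # placeholder entity/project
    # Download the (assumed) checkpoint file into a local directory.
    checkpoint_file = run.file("checkpoint.pt").download(root="checkpoints", replace=True)
    state_dict = torch.load(Path(checkpoint_file.name), map_location="cpu")
    module.load_state_dict(state_dict)
```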
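
Finally, the "new agent dimension" used when combining agent parts for RL trainers can be illustrated with the tensordict library that TorchRL builds on. This is a generic example of stacking per-agent tensors along an agent dimension, not the actual `CombinedAgentPart` implementation; the agent names and tensor shapes are arbitrary.

```python
import torch
from tensordict import TensorDict

batch_size = 4
agent_names = ["agent_0", "agent_1"]  # arbitrary example names

# Per-agent observations and actions, as might be produced by each agent's own parts.
per_agent = {
    name: TensorDict(
        {
            "observation": torch.randn(batch_size, 8),
            "action": torch.zeros(batch_size, dtype=torch.long),
        },
        batch_size=[batch_size],
    )
    for name in agent_names
}

# Stacking along a new dimension yields tensors indexed by (batch, agent, ...), so the
# collection of agents can be treated as a single actor by TorchRL.
combined = torch.stack([per_agent[name] for name in agent_names], dim=1)
print(combined["observation"].shape)  # torch.Size([4, 2, 8])
```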