ei_cv.py

Script for running Expert Iteration (EI) with the code validation task.

This script iterates over a grid of hyperparameters, specified in the param_grid dict, and runs one EI experiment on the code validation task for each combination in sequence.
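As a rough illustration of what iterating over a hyperparameter grid involves, the sketch below expands a dict of lists into one dict per combination. The keys `learning_rate` and `num_iterations` and the helper `expand_grid` are hypothetical, chosen only for the example; the actual contents of param_grid and the way the script expands it may differ.

```python
from itertools import product

# Hypothetical grid: the real keys and values live in the script's param_grid.
param_grid = {
    "learning_rate": [1e-4, 1e-5],
    "num_iterations": [4, 8],
}


def expand_grid(grid):
    """Return one dict per combination of the grid's values."""
    keys = list(grid)
    value_lists = (grid[key] for key in keys)
    return [dict(zip(keys, values)) for values in product(*value_lists)]


combos = expand_grid(param_grid)
for combo in combos:
    # The real script would launch one EI experiment here.
    print(combo)
```

With two values for each of two hyperparameters, this yields four combinations, visited in the order produced by `itertools.product` (the last key varies fastest).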

Additional settings, like whether to log to W&B, the number of rollout workers to use, and whether to use the dummy API, can be set via command line arguments. Run the script with the --help flag to see all available arguments.

scripts/ei_cv.py

Run Code Validation experiments with Expert Iteration, running from a hyperparameter grid in sequence.

usage: scripts/ei_cv.py [-h] [-d] [-v] [-q] [--use-wandb]
                        [--wandb-project WANDB_PROJECT]
                        [--wandb-entity WANDB_ENTITY] [--tag TAG]
                        [--gpu-num GPU_NUM] [--ignore-cache] [--no-pretrain]
                        [--combo-groups COMBO_GROUPS] [--combo-num COMBO_NUM]
                        [--num-skip NUM_SKIP]
                        [--num-rollout-workers NUM_ROLLOUT_WORKERS] [--dummy]
                        [run_infix]
run_infix

Infix to add to the run ID to distinguish between different runs. When using the dummy API, defaults to ‘test_{time_now}’; otherwise it must be supplied, and omitting it raises an error.

-h, --help

show this help message and exit

-d, --debug

Print debug messages

-v, --verbose

Print additional info messages

-q, --quiet

Print less output

--use-wandb

Whether to use W&B to log the experiment

--wandb-project <wandb_project>

The name of the W&B project to use

--wandb-entity <wandb_entity>

The name of the W&B entity to use

--tag <tag>

An optional tag for the W&B run

--gpu-num <gpu_num>

The 0-based index of the GPU to use

--ignore-cache

Ignore the dataset and model cache and rebuild from scratch.

--no-pretrain

Don’t pretrain the agents, regardless of the hyperparameters

--combo-groups <combo_groups>

The number of groups into which to split the experiment combinations

--combo-num <combo_num>

Which combo group to run this time

--num-skip <num_skip>

The number of initial combos to skip. Useful to resume a group
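The interplay of --combo-groups, --combo-num and --num-skip can be sketched as follows. This is a minimal illustration under stated assumptions, not the script's actual logic: the helper `select_combos` is hypothetical, groups are assumed to be strided slices (combo i goes to group i modulo the number of groups), and the group number is assumed to be 0-based. The real script may partition the combinations differently.

```python
def select_combos(combos, combo_groups=1, combo_num=0, num_skip=0):
    """Pick the combos belonging to one group, then skip the first few.

    Assumptions (not confirmed by the script's documentation):
    - combo i is assigned to group i % combo_groups (strided split);
    - combo_num is 0-based.
    """
    if combo_groups < 1:
        raise ValueError("combo_groups must be at least 1")
    if not 0 <= combo_num < combo_groups:
        raise ValueError("combo_num must be in [0, combo_groups)")
    if num_skip < 0:
        raise ValueError("num_skip must be non-negative")
    group = combos[combo_num::combo_groups]
    # Dropping the first num_skip entries lets an interrupted group resume.
    return group[num_skip:]


combos = list(range(10))  # stand-ins for hyperparameter combinations
first_pass = select_combos(combos, combo_groups=3, combo_num=1)
resumed = select_combos(combos, combo_groups=3, combo_num=1, num_skip=2)
print(first_pass, resumed)
```

Under these assumptions, running the same group again with --num-skip set to the number of combinations already completed would pick up where the previous run stopped.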

--num-rollout-workers <num_rollout_workers>

Number of workers to use for sampling rollouts. Defaults to 0 when using the dummy API, 8 otherwise.

--dummy

Whether to use the dummy API for the agents. Useful for testing.
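Two of the arguments above, run_infix and --num-rollout-workers, have defaults that depend on --dummy. One common way to express such dependent defaults with `argparse` is to default to `None` and resolve the value after parsing, as in the sketch below. The function `parse_args` and the timestamp format are assumptions made for illustration; the script's own implementation is not shown in this page and may differ.

```python
import argparse
from datetime import datetime


def parse_args(argv=None):
    """Parse a subset of the script's arguments and resolve dependent defaults."""
    parser = argparse.ArgumentParser()
    parser.add_argument("run_infix", nargs="?", default=None)
    parser.add_argument("--num-rollout-workers", type=int, default=None)
    parser.add_argument("--dummy", action="store_true")
    args = parser.parse_args(argv)

    # 0 workers with the dummy API, 8 otherwise, unless set explicitly.
    if args.num_rollout_workers is None:
        args.num_rollout_workers = 0 if args.dummy else 8

    # run_infix is only optional when the dummy API is in use.
    if args.run_infix is None:
        if not args.dummy:
            parser.error("run_infix is required unless --dummy is set")
        # Assumed timestamp format; the real '{time_now}' format is unknown.
        args.run_infix = f"test_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
    return args


dummy_args = parse_args(["--dummy"])
real_args = parse_args(["my_run"])
print(dummy_args.num_rollout_workers, real_args.num_rollout_workers)
```

Resolving the defaults after parsing keeps an explicit value such as `--num-rollout-workers 4` intact in both modes, since only an unset (`None`) value is replaced.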