nip.parameters.trainers.TextRlParameters

class nip.parameters.trainers.TextRlParameters(
    fine_tune_on_all_previous_rollouts: bool = False,
    verifier_guess_replacement_proportion: float = 0.0,
    verifier_guess_replacement_annealing: Literal['none', 'linear', 'exponential'] = 'none',
    verifier_guess_replacement_annealing_rate: float = 0.1,
    save_transcripts: bool = True,
    transcript_format: Literal['json', 'yaml'] = 'yaml',
    test_scheme: Annotated[Literal['none', 'all', 'last', 'first_and_last'], <nip.parameters.base_run.BaseRunPreserve object>] = 'none',
    test_on_whole_dataset: Annotated[bool, <nip.parameters.base_run.BaseRunPreserve object>] = True,
)

Additional parameters for the text-based RL trainers.

Parameters:
  • fine_tune_on_all_previous_rollouts (bool) – Whether to fine-tune the agents on the rollouts from all iterations so far. If False, only the rollouts from the current iteration are used.

  • verifier_guess_replacement_proportion (float) – When fine-tuning on the rollouts, replace the verifier’s guess with the true label for this proportion of the rollouts. This only changes the last message of the verifier, and leaves the rest of the transcript unchanged.

  • verifier_guess_replacement_annealing (Literal["none", "linear", "exponential"]) –

    The annealing schedule for the proportion of rollouts where the verifier’s guess is replaced. Possible values are:

    • "none": No annealing.

    • "linear": Linear annealing with rate verifier_guess_replacement_annealing_rate.

    • "exponential": Exponential annealing with base 1 - verifier_guess_replacement_annealing_rate.

  • verifier_guess_replacement_annealing_rate (float) – The rate of annealing for the proportion of rollouts where the verifier’s guess is replaced.

  • save_transcripts (bool) – Whether to save the transcripts of the rollouts. The raw rollouts are always saved and the transcripts can be extracted from them, so this option is mostly a convenience and adds a small processing overhead.

  • transcript_format (Literal["json", "yaml"]) – The format to save the transcripts in.

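  • test_scheme (TestSchemeType) – When to run the test loop during training. Per the signature, the accepted values are "none", "all", "last" and "first_and_last"; see TestSchemeType for what each option does.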

  • test_on_whole_dataset (bool) – Whether to run the test loop on the whole dataset or only on a single iteration-worth of rollouts.

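  • test_every_iteration (bool) – Whether to run the test loop after every iteration. If False, the test loop is only run after training is complete. This parameter does not appear in the constructor signature of this class, where test_scheme controls when the test loop runs.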
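The annealing options for the verifier guess replacement can be illustrated with a minimal sketch. The helper annealed_proportion is hypothetical and not part of nip. The exponential case follows the description above (base 1 - rate), while the linear case assumes that the rate is subtracted once per iteration and clamped at zero, which the documentation does not confirm.

```python
def annealed_proportion(initial, iteration, schedule="none", rate=0.1):
    """Hypothetical helper: replacement proportion at a given iteration.

    `initial` plays the role of verifier_guess_replacement_proportion and
    `rate` the role of verifier_guess_replacement_annealing_rate.
    """
    if schedule == "none":
        # No annealing: the proportion stays constant.
        return initial
    if schedule == "linear":
        # Assumption: subtract `rate` per iteration, never going below zero.
        return max(0.0, initial - rate * iteration)
    if schedule == "exponential":
        # Multiply by the base (1 - rate) once per iteration.
        return initial * (1.0 - rate) ** iteration
    raise ValueError(f"Unknown annealing schedule: {schedule!r}")
```

Under these assumptions, an initial proportion of 0.5 with rate 0.1 decays geometrically under "exponential" and reaches zero after five iterations under "linear".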

Methods Summary

  • __eq__(other) – Return self==value.

  • __init__([...])

  • __post_init__()

  • __repr__() – Return repr(self).

  • _get_param_class_from_dict(param_dict) – Try to get the parameter class from a dictionary of serialised parameters.

  • construct_test_params() – Construct a set of basic parameters for testing.

  • from_dict(params_dict[, ignore_extra_keys]) – Create a parameters object from a dictionary.

  • get(address) – Get a value from the parameters object using a dot-separated address.

  • to_dict() – Convert the parameters object to a dictionary.

Attributes

fine_tune_on_all_previous_rollouts

save_transcripts

test_on_whole_dataset

test_scheme

transcript_format

verifier_guess_replacement_annealing

verifier_guess_replacement_annealing_rate

verifier_guess_replacement_proportion

Methods

__eq__(other)

Return self==value.

__init__(
    fine_tune_on_all_previous_rollouts: bool = False,
    verifier_guess_replacement_proportion: float = 0.0,
    verifier_guess_replacement_annealing: Literal['none', 'linear', 'exponential'] = 'none',
    verifier_guess_replacement_annealing_rate: float = 0.1,
    save_transcripts: bool = True,
    transcript_format: Literal['json', 'yaml'] = 'yaml',
    test_scheme: Annotated[Literal['none', 'all', 'last', 'first_and_last'], <nip.parameters.base_run.BaseRunPreserve object>] = 'none',
    test_on_whole_dataset: Annotated[bool, <nip.parameters.base_run.BaseRunPreserve object>] = True,
) → None

__post_init__()

__repr__()

Return repr(self).

classmethod _get_param_class_from_dict(param_dict: dict) → type[ParameterValue] | None

Try to get the parameter class from a dictionary of serialised parameters.

Parameters:

param_dict (dict) – A dictionary of parameters, which may have come from a to_dict method. This dictionary may contain a _type key, which is used to determine the class of the parameter.

Returns:

param_class (type[ParameterValue] | None) – The class of the parameter, if it can be determined.

Raises:

ValueError – If the class specified in the dictionary is not a valid parameter class.
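The lookup described above can be pictured with a small sketch. It assumes a plain registry dictionary from type names to classes; PARAM_REGISTRY, DemoParams and get_param_class_from_dict are hypothetical names used only for illustration, and the library's actual mechanism may differ.

```python
class DemoParams:
    """Hypothetical stand-in for a registered parameter class."""


# Assumption: serialised type names map to classes through a registry.
PARAM_REGISTRY = {"DemoParams": DemoParams}


def get_param_class_from_dict(param_dict):
    """Return the class named by the `_type` key, or None if there is none."""
    type_name = param_dict.get("_type")
    if type_name is None:
        # No type information: the class cannot be determined.
        return None
    if type_name not in PARAM_REGISTRY:
        # Mirrors the documented ValueError for an invalid parameter class.
        raise ValueError(f"{type_name!r} is not a valid parameter class")
    return PARAM_REGISTRY[type_name]
```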

classmethod construct_test_params() → BaseHyperParameters

Construct a set of basic parameters for testing.

classmethod from_dict(params_dict: dict, ignore_extra_keys: bool = False) → BaseHyperParameters

Create a parameters object from a dictionary.

Parameters:
  • params_dict (dict) – A dictionary of the parameters.

  • ignore_extra_keys (bool, default=False) – If True, ignore keys in the dictionary that do not correspond to fields in the parameters object.

Returns:

hyper_params (BaseParameters) – The parameters object.
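The effect of ignore_extra_keys can be sketched with a stand-in dataclass. TranscriptParams is a hypothetical class, not part of nip, and raising ValueError on unexpected keys is an assumption: the documentation above only states what happens when ignore_extra_keys is True.

```python
from dataclasses import dataclass, fields


@dataclass
class TranscriptParams:
    """Hypothetical stand-in with two of the fields documented above."""

    save_transcripts: bool = True
    transcript_format: str = "yaml"

    @classmethod
    def from_dict(cls, params_dict, ignore_extra_keys=False):
        field_names = {f.name for f in fields(cls)}
        extra_keys = set(params_dict) - field_names
        if extra_keys and not ignore_extra_keys:
            # Assumption: unexpected keys are rejected unless explicitly ignored.
            raise ValueError(f"Unexpected keys: {sorted(extra_keys)}")
        # Keep only the keys that correspond to fields of the class.
        known = {k: v for k, v in params_dict.items() if k in field_names}
        return cls(**known)
```

With this stand-in, a dictionary carrying an unknown key loads only when ignore_extra_keys=True is passed; fields missing from the dictionary keep their defaults.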

get(address: str) → Any

Get a value from the parameters object using a dot-separated address.

Parameters:

address (str) – The path to the value in the parameters object, separated by dots.

Returns:

value (Any) – The value at the address.

Raises:

KeyError – If the address does not exist.
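Dot-separated addressing can be illustrated with a short sketch over nested dataclasses. InnerParams, OuterParams and get_by_address are hypothetical names, and resolving each segment with getattr is an assumption about how such a lookup might work, not a description of the library's code.

```python
from dataclasses import dataclass, field


@dataclass
class InnerParams:
    rate: float = 0.1


@dataclass
class OuterParams:
    inner: InnerParams = field(default_factory=InnerParams)
    name: str = "demo"


def get_by_address(params, address):
    """Walk a dot-separated address one attribute at a time."""
    value = params
    for part in address.split("."):
        if not hasattr(value, part):
            # Mirrors the documented KeyError for a missing address.
            raise KeyError(address)
        value = getattr(value, part)
    return value
```

Here get_by_address(OuterParams(), "inner.rate") walks from the outer object to its sub-parameters and returns the nested value.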

to_dict() → dict

Convert the parameters object to a dictionary.

Turns enums into strings, and sub-parameters into dictionaries. Includes the is_random parameter if it exists.

Returns:

params_dict (dict) – A dictionary of the parameters.
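The conversion described above (enums to strings, sub-parameters to dictionaries) can be sketched as follows. Colour, SubParams, DemoHyperParams and params_to_dict are hypothetical names, and using the enum's value as its string form is an assumption. The sketch leaves out the is_random handling mentioned above.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from enum import Enum


class Colour(Enum):
    RED = "red"


@dataclass
class SubParams:
    depth: int = 2


@dataclass
class DemoHyperParams:
    colour: Colour = Colour.RED
    sub: SubParams = field(default_factory=SubParams)


def params_to_dict(params):
    """Recursively convert a dataclass instance to a plain dictionary."""
    result = {}
    for f in fields(params):
        value = getattr(params, f.name)
        if isinstance(value, Enum):
            # Assumption: an enum is serialised as its string value.
            value = value.value
        elif is_dataclass(value):
            # Sub-parameters become nested dictionaries.
            value = params_to_dict(value)
        result[f.name] = value
    return result
```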