nip.parameters.trainers.PureTextEiParameters

class nip.parameters.trainers.PureTextEiParameters(rollout_selection_method: Literal['threshold', 'weighted_sampling'] = 'threshold', reward_threshold: float = 0.9, weighting_sample_size_factor: float = 0.5, weighting_minimum: float | None = None, weighting_use_replacement: bool = True, weighting_epsilon: float = 0.01)

Additional parameters for the Expert Iteration (EI) trainer.

See Anthony et al. [ATB17] for more information on EI.

Parameters:
  • rollout_selection_method (Literal["threshold", "weighted_sampling"]) –

    The method to use for selecting rollouts for fine-tuning. Possible values are:

    • "threshold": Rollouts are selected if their reward is above a certain threshold.

    • "weighted_sampling": Rollouts are selected with a probability proportional to their reward.

  • reward_threshold (float) – When using the threshold method, the threshold on the reward for a rollout to be added to the fine-tuning dataset.

  • weighting_sample_size_factor (float) – When using the weighted sampling method, the number of rollouts to sample is computed as this factor times the number of rollouts.

  • weighting_minimum (float | None) – When using the weighted sampling method, all rewards below this value are assigned this value before being used as weights. If None, no minimum is applied.

  • weighting_use_replacement (bool) – Whether to sample with replacement when using the weighted sampling method.

  • weighting_epsilon (float) – When using the weighted sampling method, this value, divided by the number of rollouts, is added to the normalised weights, which are then normalised again. This can be used to prevent the probabilities from becoming zero.
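
The two selection methods described above can be sketched in plain Python. This is an illustration built only from the parameter descriptions, not the trainer's actual code. The function names are hypothetical, and several details are assumptions: rewards are taken to be non-negative, the threshold comparison is strict, the sample size is rounded down, and an all-zero weight vector falls back to uniform weights.

```python
import random


def select_by_threshold(rewards, reward_threshold=0.9):
    """Return indices of rollouts whose reward exceeds the threshold."""
    # Assumption: strict comparison; the description only says "above".
    return [i for i, r in enumerate(rewards) if r > reward_threshold]


def sampling_probabilities(rewards, weighting_minimum=None, weighting_epsilon=0.01):
    """Turn rewards into sampling probabilities, following the parameter docs."""
    n = len(rewards)
    # Rewards below the minimum are raised to the minimum (skipped if None).
    if weighting_minimum is not None:
        weights = [max(r, weighting_minimum) for r in rewards]
    else:
        weights = list(rewards)
    total = sum(weights)
    # Assumption: if every weight is zero, fall back to uniform weights.
    normalised = [w / total for w in weights] if total > 0 else [1.0 / n] * n
    # Add epsilon / n to each normalised weight, then normalise again.
    smoothed = [p + weighting_epsilon / n for p in normalised]
    smoothed_total = sum(smoothed)
    return [p / smoothed_total for p in smoothed]


def select_by_weighted_sampling(
    rewards,
    weighting_sample_size_factor=0.5,
    weighting_minimum=None,
    weighting_use_replacement=True,
    weighting_epsilon=0.01,
    rng=None,
):
    """Sample rollout indices with probability derived from their rewards."""
    rng = rng or random.Random()
    n = len(rewards)
    probs = sampling_probabilities(rewards, weighting_minimum, weighting_epsilon)
    # Assumption: the sample size is rounded down to an integer.
    k = int(weighting_sample_size_factor * n)
    if weighting_use_replacement:
        return rng.choices(range(n), weights=probs, k=k)
    # Without replacement: draw one index at a time and remove it from the pool.
    pool = list(range(n))
    pool_weights = list(probs)
    chosen = []
    for _ in range(min(k, n)):
        j = rng.choices(range(len(pool)), weights=pool_weights, k=1)[0]
        chosen.append(pool.pop(j))
        pool_weights.pop(j)
    return chosen
```

With the default weighting_epsilon of 0.01, a rollout with zero reward still receives a small non-zero probability, which is the effect the weighting_epsilon description refers to.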

Methods Summary

__eq__(other)

Return self==value.

__init__([rollout_selection_method, ...])

__post_init__()

__repr__()

Return repr(self).

_get_param_class_from_dict(param_dict)

Try to get the parameter class from a dictionary of serialised parameters.

construct_test_params()

Construct a set of basic parameters for testing.

from_dict(params_dict[, ignore_extra_keys])

Create a parameters object from a dictionary.

get(address)

Get a value from the parameters object using a dot-separated address.

to_dict()

Convert the parameters object to a dictionary.

Attributes

reward_threshold

rollout_selection_method

weighting_epsilon

weighting_minimum

weighting_sample_size_factor

weighting_use_replacement

Methods

__eq__(other)

Return self==value.

__init__(rollout_selection_method: Literal['threshold', 'weighted_sampling'] = 'threshold', reward_threshold: float = 0.9, weighting_sample_size_factor: float = 0.5, weighting_minimum: float | None = None, weighting_use_replacement: bool = True, weighting_epsilon: float = 0.01) → None

__post_init__()

__repr__()

Return repr(self).

classmethod _get_param_class_from_dict(param_dict: dict) → type[ParameterValue] | None

Try to get the parameter class from a dictionary of serialised parameters.

Parameters:

param_dict (dict) – A dictionary of parameters, which may have come from a to_dict method. This dictionary may contain a _type key, which is used to determine the class of the parameter.

Returns:

param_class (type[ParameterValue] | None) – The class of the parameter, if it can be determined.

Raises:

ValueError – If the class specified in the dictionary is not a valid parameter class.

classmethod construct_test_params() → BaseHyperParameters

Construct a set of basic parameters for testing.

classmethod from_dict(params_dict: dict, ignore_extra_keys: bool = False) → BaseHyperParameters

Create a parameters object from a dictionary.

Parameters:
  • params_dict (dict) – A dictionary of the parameters.

  • ignore_extra_keys (bool, default=False) – If True, ignore keys in the dictionary that do not correspond to fields in the parameters object.

Returns:

hyper_params (BaseParameters) – The parameters object.
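
To illustrate the effect of ignore_extra_keys, the following sketch uses a hypothetical stand-in dataclass, EiParamsSketch, with the same fields and defaults as the class signature at the top of this page. It does not import nip, and it is not the library's implementation. In particular, what the real method does with unexpected keys when ignore_extra_keys is False is not stated here; the stand-in simply lets the dataclass constructor reject them.

```python
from dataclasses import dataclass, fields
from typing import Literal, Optional


@dataclass
class EiParamsSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    rollout_selection_method: Literal["threshold", "weighted_sampling"] = "threshold"
    reward_threshold: float = 0.9
    weighting_sample_size_factor: float = 0.5
    weighting_minimum: Optional[float] = None
    weighting_use_replacement: bool = True
    weighting_epsilon: float = 0.01

    @classmethod
    def from_dict(cls, params_dict, ignore_extra_keys=False):
        if ignore_extra_keys:
            # Keep only keys that correspond to fields of this class.
            known = {f.name for f in fields(cls)}
            params_dict = {k: v for k, v in params_dict.items() if k in known}
        # Assumption: unexpected keys are left for the constructor to reject.
        return cls(**params_dict)


raw = {
    "rollout_selection_method": "weighted_sampling",
    "weighting_epsilon": 0.05,
    "unrelated_key": 123,  # not a field of the parameters object
}
params = EiParamsSketch.from_dict(raw, ignore_extra_keys=True)
```

Fields missing from the dictionary keep their defaults, so params.reward_threshold is still 0.9 in this sketch.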

get(address: str) → Any

Get a value from the parameters object using a dot-separated address.

Parameters:

address (str) – The path to the value in the parameters object, separated by dots.

Returns:

value (Any) – The value at the address.

Raises:

KeyError – If the address does not exist.
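
A dot-separated address is most useful when parameters objects are nested. The sketch below shows one way such a lookup could work, using hypothetical stand-in dataclasses; the field name pure_text_ei and the class names are assumptions made for illustration and are not taken from the library.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class InnerSketch:
    """Hypothetical sub-parameters object."""

    reward_threshold: float = 0.9


@dataclass
class OuterSketch:
    """Hypothetical parent object holding the sub-parameters."""

    pure_text_ei: InnerSketch = field(default_factory=InnerSketch)

    def get(self, address: str) -> Any:
        value: Any = self
        # Walk one attribute per dot-separated component.
        for part in address.split("."):
            if not hasattr(value, part):
                # Mirrors the documented behaviour for a missing address.
                raise KeyError(address)
            value = getattr(value, part)
        return value


params = OuterSketch()
threshold = params.get("pure_text_ei.reward_threshold")
```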

to_dict() → dict

Convert the parameters object to a dictionary.

Turns enums into strings, and sub-parameters into dictionaries. Includes the is_random parameter if it exists.

Returns:

params_dict (dict) – A dictionary of the parameters.
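
The conversion rules above (enums become strings, sub-parameters become dictionaries) can be sketched with stand-in classes. Everything here is hypothetical: the class and function names are invented for illustration, enums are serialised by member name as an assumption (this page only says "strings"), and the is_random and _type entries are left out.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from enum import Enum


class MethodSketch(Enum):
    """Hypothetical enum-valued parameter."""

    THRESHOLD = "threshold"


@dataclass
class SubSketch:
    """Hypothetical sub-parameters object."""

    reward_threshold: float = 0.9


@dataclass
class ParamsSketch:
    """Hypothetical parameters object with an enum and a sub-parameters field."""

    method: MethodSketch = MethodSketch.THRESHOLD
    sub: SubSketch = field(default_factory=SubSketch)


def to_dict_sketch(params):
    """Recursively convert a stand-in parameters object to a plain dict."""
    out = {}
    for f in fields(params):
        value = getattr(params, f.name)
        if isinstance(value, Enum):
            # Assumption: enums are stored by member name.
            out[f.name] = value.name
        elif is_dataclass(value):
            # Sub-parameters become nested dictionaries.
            out[f.name] = to_dict_sketch(value)
        else:
            out[f.name] = value
    return out


serialised = to_dict_sketch(ParamsSketch())
```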