nip.parameters.trainers.PureTextEiParameters
- class nip.parameters.trainers.PureTextEiParameters(rollout_selection_method: Literal['threshold', 'weighted_sampling'] = 'threshold', reward_threshold: float = 0.9, weighting_sample_size_factor: float = 0.5, weighting_minimum: float | None = None, weighting_use_replacement: bool = True, weighting_epsilon: float = 0.01)
Additional parameters for the Expert Iteration (EI) trainer.
See Anthony et al. [ATB17] for more information on EI.
- Parameters:
rollout_selection_method (Literal["threshold", "weighted_sampling"]) –
The method to use for selecting rollouts for fine-tuning. Possible values are:
"threshold": Rollouts are selected if their reward is above a certain threshold.
"weighted_sampling": Rollouts are selected with a probability proportional to their reward.
reward_threshold (float) – When using the threshold method, the threshold on the reward for a rollout to be added to the fine-tuning dataset.
weighting_sample_size_factor (float) – When using the weighted sampling method, the number of rollouts to sample is computed as this factor times the number of rollouts.
weighting_minimum (float | None) – When using the weighted sampling method, all rewards below this value are assigned this value before being used as weights. If None, no minimum is applied.
weighting_use_replacement (bool) – Whether to sample with replacement when using the weighted sampling method.
weighting_epsilon (float) – When using the weighted sampling method, this value, divided by the number of rollouts, is added to the normalised weights, which are then normalised again. This can be used to prevent the probabilities from becoming zero.
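The two selection methods described above can be sketched in plain Python as follows. This is a minimal illustration based only on the parameter descriptions, not the trainer's actual implementation. The function names are hypothetical, and several details are assumptions: the threshold comparison is taken to be strict, the sample size is rounded down, rewards are taken to be non-negative, and the reward list is taken to be non-empty.

```python
import random


def select_by_threshold(rewards, reward_threshold=0.9):
    """Return indices of rollouts whose reward is above the threshold.

    Assumption: the comparison is strict (">").
    """
    return [i for i, r in enumerate(rewards) if r > reward_threshold]


def sampling_probabilities(rewards, weighting_minimum=None, weighting_epsilon=0.01):
    """Turn rewards into sampling probabilities as the parameters describe."""
    weights = [float(r) for r in rewards]
    # Rewards below the minimum are raised to the minimum before weighting.
    if weighting_minimum is not None:
        weights = [max(w, weighting_minimum) for w in weights]
    n = len(weights)
    total = sum(weights)
    # First normalisation (assumption: fall back to uniform if all weights are 0).
    probs = [w / total for w in weights] if total > 0 else [1.0 / n] * n
    # Add epsilon / n to every normalised weight, then normalise again.
    probs = [p + weighting_epsilon / n for p in probs]
    total = sum(probs)
    return [p / total for p in probs]


def select_by_weighted_sampling(
    rewards,
    sample_size_factor=0.5,
    weighting_minimum=None,
    use_replacement=True,
    weighting_epsilon=0.01,
    rng=None,
):
    """Sample rollout indices with probability proportional to adjusted reward."""
    rng = rng or random.Random()
    probs = sampling_probabilities(rewards, weighting_minimum, weighting_epsilon)
    # Assumption: the number of samples is rounded down.
    num_samples = int(sample_size_factor * len(probs))
    indices = list(range(len(probs)))
    if use_replacement:
        return rng.choices(indices, weights=probs, k=num_samples)
    # Without replacement: draw one index at a time and remove it from the pool.
    # This requires num_samples <= len(rewards) and positive remaining weights.
    chosen = []
    for _ in range(num_samples):
        pick = rng.choices(range(len(indices)), weights=probs, k=1)[0]
        chosen.append(indices.pop(pick))
        probs.pop(pick)
    return chosen
```

With the default weighting_epsilon of 0.01, a rollout with zero reward still receives a small non-zero probability, which is the effect the weighting_epsilon description refers to.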
Methods Summary
__eq__(other) – Return self==value.
__init__([rollout_selection_method, ...])
__repr__() – Return repr(self).
_get_param_class_from_dict(param_dict) – Try to get the parameter class from a dictionary of serialised parameters.
construct_test_params() – Construct a set of basic parameters for testing.
from_dict(params_dict[, ignore_extra_keys]) – Create a parameters object from a dictionary.
get(address) – Get a value from the parameters object using a dot-separated address.
to_dict() – Convert the parameters object to a dictionary.
Attributes
reward_threshold
rollout_selection_method
weighting_epsilon
weighting_minimum
weighting_sample_size_factor
weighting_use_replacement
Methods
- __eq__(other)
Return self==value.
- __init__(rollout_selection_method: Literal['threshold', 'weighted_sampling'] = 'threshold', reward_threshold: float = 0.9, weighting_sample_size_factor: float = 0.5, weighting_minimum: float | None = None, weighting_use_replacement: bool = True, weighting_epsilon: float = 0.01) -> None
- __repr__()
Return repr(self).
- classmethod _get_param_class_from_dict(param_dict: dict) -> type[ParameterValue] | None
Try to get the parameter class from a dictionary of serialised parameters.
- Parameters:
param_dict (dict) – A dictionary of parameters, which may have come from a to_dict method. This dictionary may contain a _type key, which is used to determine the class of the parameter.
- Returns:
param_class (type[ParameterValue] | None) – The class of the parameter, if it can be determined.
- Raises:
ValueError – If the class specified in the dictionary is not a valid parameter class.
- classmethod construct_test_params() -> BaseHyperParameters
Construct a set of basic parameters for testing.
- classmethod from_dict(params_dict: dict, ignore_extra_keys: bool = False) -> BaseHyperParameters
Create a parameters object from a dictionary.