nip.parameters.protocol.CommonProtocolParameters
- class nip.parameters.protocol.CommonProtocolParameters(verifier_first: bool = True, randomize_prover_stance: bool = False, prover_reward: float = 1.0, prover_invalid_response_penalty: float | None = None, verifier_reward: float = 1.0, verifier_incorrect_penalty: float = -1.0, verifier_neither_accept_nor_reject_reward: float | None = None, verifier_terminated_penalty: float = -1.0, verifier_no_guess_reward: float = 0.0, shared_reward: bool = False, force_guess: Literal['zero', 'one', 'y'] | None = None, zero_knowledge: bool = False, verifier_decision_spectrum: Literal['accept_reject', 'likert_scale_4', 'likert_scale_5', 'likert_scale_6', 'likert_scale_7', 'likert_scale', 'likert_scale_no_undecided', 'out_of_10', 'out_of_100'] = 'accept_reject')
Common additional parameters for the interaction protocol.
- Parameters:
verifier_first (bool) – Whether the verifier goes first in the protocol.
randomize_prover_stance (bool) – Whether, for each datapoint, the verdict the prover argues for is randomized. This is only relevant when there is a single prover and a text-based protocol is used.
prover_reward (float) – The reward given to the prover when the verifier guesses “accept”.
prover_invalid_response_penalty (float | None) – The reward given to a prover when it gives an invalid response. If None, provers are not penalized for invalid responses. This is only relevant in pure-text scenarios, where the prover is expected to give a text response.
verifier_reward (float) – The reward given to the verifier when it guesses correctly.
verifier_incorrect_penalty (float) – The penalty given to the verifier when it guesses incorrectly.
verifier_neither_accept_nor_reject_reward (float | None) – The reward given to the verifier when it neither accepts nor rejects. If None, the mid-point between verifier_reward and verifier_incorrect_penalty is used. This value is only relevant for text-based scenarios. Note that when using a verifier decision spectrum (see verifier_decision_spectrum), reward for intermediate decisions is computed by interpolating piecewise linearly between verifier_incorrect_penalty, verifier_neither_accept_nor_reject_reward and verifier_reward. So in this case you probably want to set this to None.
verifier_terminated_penalty (float) – The reward given to the verifier if the episode terminates before it guesses.
verifier_no_guess_reward (float) – The reward given to the verifier if it does not make a guess in a round.
shared_reward (bool) – Whether to use a shared reward function, where the prover gets the same reward as the verifier. This overrides prover_reward.
force_guess (GuessType, optional) – The guess to force the verifier to make. If not provided, the verifier makes a guess using its policy.
zero_knowledge (bool) – Whether to use a zero-knowledge version of the protocol.
verifier_decision_spectrum (VerifierDecisionSpectrumType) – The scale used by the verifier to make its decision. This allows for more nuanced decisions than just “accept” or “reject”. This is only relevant for text-based scenarios.
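The piecewise-linear interpolation described under verifier_neither_accept_nor_reject_reward can be illustrated with a short sketch. This is not the library's implementation: the function name `spectrum_reward` and the `correctness` argument (a position in [0, 1], where 0 is a fully wrong decision, 0.5 is undecided and 1 is a fully correct decision) are assumptions made purely for illustration. How each decision spectrum maps its options onto such positions is not specified on this page.

```python
def spectrum_reward(
    correctness,
    verifier_reward=1.0,
    verifier_incorrect_penalty=-1.0,
    verifier_neither_accept_nor_reject_reward=None,
):
    """Hypothetical sketch: interpolate the verifier reward piecewise linearly.

    `correctness` is an assumed position in [0, 1]:
    0 = fully incorrect, 0.5 = undecided, 1 = fully correct.
    """
    if not 0.0 <= correctness <= 1.0:
        raise ValueError("correctness must lie in [0, 1]")

    # Documented default: None means the mid-point of the two extremes.
    mid = verifier_neither_accept_nor_reject_reward
    if mid is None:
        mid = (verifier_reward + verifier_incorrect_penalty) / 2

    if correctness <= 0.5:
        # First segment: from the incorrect penalty up to the undecided reward.
        frac = correctness / 0.5
        return verifier_incorrect_penalty + frac * (mid - verifier_incorrect_penalty)

    # Second segment: from the undecided reward up to the full reward.
    frac = (correctness - 0.5) / 0.5
    return mid + frac * (verifier_reward - mid)
```

With the default of None the two segments have the same slope, so the reward is a single straight line between the penalty and the reward. An explicit value that is not the mid-point introduces a kink at the undecided position, which is presumably why None is the suggested setting when a decision spectrum is in use.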
Methods Summary
__eq__(other) – Return self==value.
__init__([verifier_first, ...])
__repr__() – Return repr(self).
_get_param_class_from_dict(param_dict) – Try to get the parameter class from a dictionary of serialised parameters.
construct_test_params() – Construct a set of basic parameters for testing.
from_dict(params_dict[, ignore_extra_keys]) – Create a parameters object from a dictionary.
get(address) – Get a value from the parameters object using a dot-separated address.
to_dict() – Convert the parameters object to a dictionary.
Attributes
force_guess
prover_invalid_response_penalty
prover_reward
randomize_prover_stance
shared_reward
verifier_decision_spectrum
verifier_first
verifier_incorrect_penalty
verifier_neither_accept_nor_reject_reward
verifier_no_guess_reward
verifier_reward
verifier_terminated_penalty
zero_knowledge
Methods
- __eq__(other)
Return self==value.
- __init__(verifier_first: bool = True, randomize_prover_stance: bool = False, prover_reward: float = 1.0, prover_invalid_response_penalty: float | None = None, verifier_reward: float = 1.0, verifier_incorrect_penalty: float = -1.0, verifier_neither_accept_nor_reject_reward: float | None = None, verifier_terminated_penalty: float = -1.0, verifier_no_guess_reward: float = 0.0, shared_reward: bool = False, force_guess: Literal['zero', 'one', 'y'] | None = None, zero_knowledge: bool = False, verifier_decision_spectrum: Literal['accept_reject', 'likert_scale_4', 'likert_scale_5', 'likert_scale_6', 'likert_scale_7', 'likert_scale', 'likert_scale_no_undecided', 'out_of_10', 'out_of_100'] = 'accept_reject') -> None
- __repr__()
Return repr(self).
- classmethod _get_param_class_from_dict(param_dict: dict) -> type[ParameterValue] | None
Try to get the parameter class from a dictionary of serialised parameters.
- Parameters:
param_dict (dict) – A dictionary of parameters, which may have come from a to_dict method. This dictionary may contain a _type key, which is used to determine the class of the parameter.
- Returns:
param_class (type[ParameterValue] | None) – The class of the parameter, if it can be determined.
- Raises:
ValueError – If the class specified in the dictionary is not a valid parameter class.
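The lookup behaviour described above (return the class named by the _type key, return None when the class cannot be determined, raise ValueError for an unknown class) might be sketched as follows. The registry `PARAM_CLASS_REGISTRY`, the `register` decorator, the stand-in class `SketchProtocolParameters` and the free function `get_param_class_from_dict` are all hypothetical; this page does not show how the library actually resolves the class.

```python
from dataclasses import dataclass

# Hypothetical registry mapping class names to parameter classes.
# The real library's resolution mechanism is not documented here.
PARAM_CLASS_REGISTRY: dict[str, type] = {}


def register(cls):
    """Record a parameter class under its own name (illustrative only)."""
    PARAM_CLASS_REGISTRY[cls.__name__] = cls
    return cls


@register
@dataclass
class SketchProtocolParameters:
    """Stand-in for a real parameter class."""

    verifier_first: bool = True
    shared_reward: bool = False


def get_param_class_from_dict(param_dict):
    """Return the class named by the `_type` key, or None if the key is absent."""
    type_name = param_dict.get("_type")
    if type_name is None:
        # No type information, so the class cannot be determined.
        return None
    try:
        return PARAM_CLASS_REGISTRY[type_name]
    except KeyError:
        raise ValueError(
            f"{type_name!r} is not a registered parameter class"
        ) from None
```

Returning None for a missing key while raising for an unknown name keeps the two cases distinguishable: an untyped dictionary is acceptable input, whereas a dictionary naming a non-existent class is an error.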
- classmethod construct_test_params() -> BaseHyperParameters
Construct a set of basic parameters for testing.
- classmethod from_dict(params_dict: dict, ignore_extra_keys: bool = False) -> BaseHyperParameters
Create a parameters object from a dictionary.
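A round trip through to_dict and from_dict could look roughly like the sketch below. It uses a hypothetical stand-in dataclass, `SketchParams`, rather than the real class, and it assumes that unknown keys raise a ValueError unless ignore_extra_keys is True. That error behaviour is an assumption; this page only states that the flag exists.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class SketchParams:
    """Hypothetical stand-in for a parameters class with two fields."""

    verifier_first: bool = True
    prover_reward: float = 1.0

    def to_dict(self):
        """Convert the parameters object to a plain dictionary."""
        return asdict(self)

    @classmethod
    def from_dict(cls, params_dict, ignore_extra_keys=False):
        """Build an instance from a dictionary of field values."""
        known = {f.name for f in fields(cls)}
        extra = set(params_dict) - known
        # Assumed behaviour: reject unknown keys unless told to ignore them.
        if extra and not ignore_extra_keys:
            raise ValueError(f"Unexpected keys: {sorted(extra)}")
        return cls(**{k: v for k, v in params_dict.items() if k in known})


# Round trip: serialise, then rebuild an equal object.
original = SketchParams(verifier_first=False, prover_reward=0.5)
restored = SketchParams.from_dict(original.to_dict())
```

Because dataclasses generate __eq__ from their fields, `restored == original` holds here, which mirrors the __eq__ method listed for the documented class.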