nip.protocols.main_protocols.MultiChannelTestProtocol#
- class nip.protocols.main_protocols.MultiChannelTestProtocol(hyper_params: HyperParameters, settings: ExperimentSettings, *, verifier_name: str = 'verifier')[source]#
- A protocol for testing multi-channel communication between agents.

Methods Summary

- __init__(hyper_params, settings, *[, ...])
- _get_agent_decision_made_mask(round_id, y, ...) – Get a mask indicating whether an agent has made a decision.
- _get_new_terminated_mask(round_id, ...) – Get a mask indicating whether the episode has been newly terminated.
- _get_verifier_guess_reward_continuous(reward, ...) – Compute the guess reward for the verifier, with continuous_decision.
- _get_verifier_guess_reward_discrete(reward, ...) – Compute the guess reward for the verifier, without continuous_decision.
- _include_prover_rewards(...) – Compute the rewards for the other agents and add them to the current reward.
- Make sure that agents are only active in channels they can see.
- can_agent_be_active(agent_name, round_id, ...) – Specify whether an agent can be active in a given round and channel.
- can_agent_be_active_any_channel(agent_name, ...) – Specify whether an agent can be active in any channel in a given round.
- can_agent_see_channel(agent_name, channel_name) – Determine whether an agent can see a channel.
- get_active_agents_mask_from_rounds_and_seed(round_id, seed) – Get a boolean mask of active agents for a batch of rounds.
- get_agent_visible_channels(agent_name) – Get the names of the channels visible to an agent.
- get_verifier_guess_mask_from_rounds_and_seed(round_id, seed) – Get a boolean mask indicating when the verifier can make a guess.
- is_agent_active(agent_name, round_id, ...) – Specify whether an agent is active in a given round and channel.
- max_reward(agent_name) – Get the maximum possible reward for an agent.
- min_reward(agent_name) – Get the minimum possible reward for an agent.
- reward_mid_point_estimate(agent_name) – Get an estimate of the expected reward if all agents play randomly.
- step_interaction_protocol(env_td) – Take a step in the interaction protocol.

Attributes

- active_agents_by_round – A list of which agent names are active in each round and channel.
- active_agents_mask – A boolean mask indicating which agents are active in each round and channel.
- agent_channel_visibility
- agent_channel_visibility_mask – A boolean mask indicating which agents can see which message channels.
- agent_first_active_round – The first round in which each agent is or can be active.
- agent_names
- can_be_zero_knowledge
- default_stackelberg_sequence – The default Stackelberg sequence for the protocol.
- max_message_rounds
- max_verifier_questions
- message_channel_names
- min_message_rounds
- num_agents – The number of agents in the protocol.
- num_message_channels – The number of message channels in the protocol.
- protocol_common – The common protocol parameters.
- prover_indices – The indices of the provers in the list of agent names.
- prover_names – The names of the provers in the protocol.
- stackelberg_sequence – The actual Stackelberg sequence used in this experiment.
- verifier_index – The index of the verifier in the list of agent names.
- verifier_names – The names of the verifiers in the protocol.
- verifier_neither_accept_nor_reject_reward – The reward for the verifier when they neither accept nor reject.

Methods

- __init__(hyper_params: HyperParameters, settings: ExperimentSettings, *, verifier_name: str = 'verifier')[source]#
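A minimal construction sketch. The import paths below, and the assumption that HyperParameters and ExperimentSettings are default-constructible, are guesses for illustration only, not something this page specifies:

```python
# Sketch only: the import paths here are assumptions and may differ.
from nip.parameters import HyperParameters
from nip.experiment_settings import ExperimentSettings
from nip.protocols.main_protocols import MultiChannelTestProtocol

hyper_params = HyperParameters()  # assumed default-constructible
settings = ExperimentSettings()   # assumed default-constructible

protocol = MultiChannelTestProtocol(
    hyper_params, settings, verifier_name="verifier"
)
print(protocol.agent_names, protocol.message_channel_names)
```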
- _get_agent_decision_made_mask(round_id: Int[Tensor, '...'], y: Int[Tensor, '...'], guess_mask: Bool[Tensor, '...'], decision: Int[Tensor, '...'], *, follow_force_guess: bool = True) → Bool[Tensor, '...'][source]#
- Get a mask indicating whether an agent has made a decision.
- Parameters:
- round_id (Int[Tensor, "..."]) – The round number. 
- y (Int[Tensor, "..."]) – The target value. 
- guess_mask (Bool[Tensor, "..."]) – A mask indicating whether the agent is allowed to make a guess. 
- decision (Int[Tensor, "..."]) – The decision output of the agent. 
- follow_force_guess (bool, default=True) – Whether to follow the force_guess parameter, which forces the agent to make a certain decision.
 
- Returns:
- decision_made (Bool[Tensor, “…”]) – A mask indicating whether the agent has made a decision. 
 
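To make the mask's semantics concrete, here is an illustrative sketch (not the library's implementation), assuming decision value 2 means "no decision", as in the table under _get_verifier_guess_reward_discrete below:

```python
import torch

# Illustrative only. Assumes decision value 2 means "no decision".
guess_mask = torch.tensor([True, True, False])  # allowed to guess?
decision = torch.tensor([1, 2, 0])              # accept, no decision, reject

# A decision counts as "made" where the agent could guess and did.
decision_made = guess_mask & (decision != 2)
print(decision_made)  # tensor([ True, False, False])
```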
- _get_new_terminated_mask(round_id: Int[Tensor, '...'], verifier_decision_made: Bool[Tensor, '...']) → Bool[Tensor, '...'][source]#
- Get a mask indicating whether the episode has been newly terminated.
- “Newly terminated” means that the episode has been terminated this round. This happens when the max number of rounds has been reached and the verifier has not guessed.
- Parameters:
- round_id (Int[Tensor, "..."]) – The round number. 
- verifier_decision_made (Bool[Tensor, "..."]) – A mask indicating whether the verifier has made a decision. 
 
- Returns:
- terminated (Bool[Tensor, “…”]) – A mask indicating whether the episode has been newly terminated. 
 
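A minimal sketch of the stated condition, assuming zero-indexed rounds and using a stand-in for the max_message_rounds attribute:

```python
import torch

max_message_rounds = 3  # stand-in for protocol.max_message_rounds

round_id = torch.tensor([1, 2, 2])
verifier_decision_made = torch.tensor([False, False, True])

# Newly terminated: the final round is reached (rounds assumed
# zero-indexed) and the verifier still has not guessed.
new_terminated = (round_id >= max_message_rounds - 1) & ~verifier_decision_made
print(new_terminated)  # tensor([False,  True, False])
```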
- _get_verifier_guess_reward_continuous(reward: Float[Tensor, '... agent'], y: Int[Tensor, '...'], verifier_decision_made: Bool[Tensor, '...'], verifier_float_decision: Float[Tensor, '...']) → Float[Tensor, '... agent'][source]#
- Compute the guess reward for the verifier, with continuous_decision.
- This computes the reward for the verifier when they make a guess, in the case where the more fine-grained, continuous decision is used.
- Parameters:
- reward (Float[Tensor, "... agent"]) – The tensor of rewards for the agents, which is updated in place. 
- y (Int[Tensor, "..."]) – The target value. 
- verifier_decision_made (Bool[Tensor, "..."]) – A mask indicating whether the verifier has made a decision. 
- verifier_float_decision (Float[Tensor, "..."]) – The verifier’s (continuous) decision. 
 
 
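The exact reward function isn't spelled out on this page; the sketch below only illustrates scoring a continuous decision in [-1, 1] against a binary target y. Both the ±1 mapping of y and the unit scaling are assumptions:

```python
import torch

y = torch.tensor([0, 1, 1])                         # binary target
verifier_decision_made = torch.tensor([True, True, False])
verifier_float_decision = torch.tensor([-0.8, 0.5, 0.9])

# Map y in {0, 1} to a sign in {-1, +1}; mapping and scaling are
# assumptions made for this sketch only.
target_sign = 2.0 * y.float() - 1.0
guess_reward = torch.where(
    verifier_decision_made,
    target_sign * verifier_float_decision,
    torch.zeros_like(verifier_float_decision),
)
print(guess_reward)  # tensor([0.8000, 0.5000, 0.0000])
```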
- _get_verifier_guess_reward_discrete(reward: Float[Tensor, '... agent'], y: Int[Tensor, '...'], verifier_decision_made: Bool[Tensor, '...'], verifier_decision: Int[Tensor, '...']) → Float[Tensor, '... agent'][source]#
- Compute the guess reward for the verifier, without continuous_decision.
- This computes the reward for the verifier when they make a guess, in the case where the more fine-grained decision is not used.
- Parameters:
- reward (Float[Tensor, "... agent"]) – The tensor of rewards for the agents, which is updated in place. 
- y (Int[Tensor, "..."]) – The target value. 
- verifier_decision_made (Bool[Tensor, "..."]) – A mask indicating whether the verifier has made a decision. 
- verifier_decision (Int[Tensor, "...") – The verifier’s (discrete) decision. This has the following possible values:
- 0: reject
- 1: accept 
- 2: no decision 
- 3: end with neither accept nor reject (only relevant for text-based scenarios) 
 
 
 
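A sketch of how the decision codes above might translate into a verifier guess reward. The ±1 magnitudes are assumptions; the real values come from the protocol's hyper-parameters:

```python
import torch

y = torch.tensor([1, 0, 1, 1])                  # target: should accept?
verifier_decision = torch.tensor([1, 1, 2, 3])  # accept, accept, none, neither
verifier_decision_made = verifier_decision != 2

# +1 for a matching discrete decision, -1 for a mismatch (magnitudes
# are assumptions). Decision 3 ("neither") would instead use the
# verifier_neither_accept_nor_reject_reward attribute.
correct = verifier_decision_made & (verifier_decision == y)
wrong = verifier_decision_made & (verifier_decision != y) & (verifier_decision != 3)
guess_reward = correct.float() - wrong.float()
print(guess_reward)  # tensor([ 1., -1.,  0.,  0.])
```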
 - _include_prover_rewards(verifier_decision_made: Bool[Tensor, '...'], verifier_decision: Int[Tensor, '...'], verifier_float_decision: Float[Tensor, '...'] | None, reward: Float[Tensor, '... agent'], env_td: TensorDictBase | NestedArrayDict)[source]#
- Compute the rewards for the other agents and add them to the current reward.
- The default implementation is as follows (see the sketch after the parameter list below):
- If there is one prover, they are rewarded when the verifier guesses “accept”.
- If there are two provers, the first is rewarded when the verifier guesses “reject” and the second is rewarded when the verifier guesses “accept”.
- When the continuous_decision key is present in the environment tensor, a continuous version of this is used instead, where the reward is a linear transformation of the verifier’s decision.
- Implement a custom method for protocols with more than two provers, or for protocols with different reward schemes.
- The reward tensor is updated in place, adding in the rewards for the agents at the appropriate indices.
- Parameters:
- verifier_decision_made (Bool[Tensor, "..."]) – A boolean mask indicating whether the verifier has made a decision. 
- verifier_decision (Int[Tensor, "..."]) – The verifier’s discrete decision. 
- verifier_float_decision (Float[Tensor, "..."] | None) – The verifier’s continuous decision. This is only used if the continuous_decision key is present in the environment tensor. If not None, it is used to compute the reward for the provers instead of the discrete decision.
- reward (Float[Tensor, "... agent"]) – The currently computed reward, which should include the reward for the verifier. This is updated in place. 
- env_td (TensorDictBase | NestedArrayDict) – The current observation and state. If a NestedArrayDict, it is converted to a TensorDictBase.
 
 
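The sketch referenced above, for the two-prover scheme. The unit reward magnitude and the agent ordering are assumptions; in practice the prover columns come from prover_indices:

```python
import torch

reward = torch.zeros(4, 3)                      # batch of 4, 3 agents
verifier_decision = torch.tensor([0, 1, 2, 1])  # reject, accept, none, accept
verifier_decision_made = verifier_decision != 2

# Assumed agent ordering [prover0, prover1, verifier]; use
# protocol.prover_indices for the real columns. The first prover is
# rewarded on "reject" (0), the second on "accept" (1), in place.
reward[:, 0] += (verifier_decision_made & (verifier_decision == 0)).float()
reward[:, 1] += (verifier_decision_made & (verifier_decision == 1)).float()
print(reward[:, :2])  # rows: [1,0], [0,1], [0,0], [0,1]
```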
- can_agent_be_active(agent_name: str, round_id: int, channel_name: str) → bool[source]#
- Specify whether an agent can be active in a given round and channel.
- For deterministic protocols, this is the same as is_agent_active.
- Returns:
- can_be_active (bool) – Whether the agent can be active in the given round and channel. 
 
- can_agent_be_active_any_channel(agent_name: str, round_id: int) → bool[source]#
- Specify whether an agent can be active in any channel in a given round.
- For non-deterministic protocols, this is true if the agent has some probability of being active.
- Returns:
- can_be_active (bool) – Whether the agent can be active in the given round. 
 
- can_agent_see_channel(agent_name: str, channel_name: str) → bool[source]#
- Determine whether an agent can see a channel.
- Returns:
- can_see_channel (bool) – Whether the agent can see the channel. 
 
- get_active_agents_mask_from_rounds_and_seed(round_id: Int[Tensor, '...'], seed: Int[Tensor, '...'] | None) → Bool[Tensor, '... agent channel'][source]#
- Get a boolean mask of active agents for a batch of rounds.
- Given a batch of rounds, returns a boolean mask indicating which agents are active in each round and channel.
- Parameters:
- round_id (Int[Tensor, "..."]) – The round of the protocol. 
- seed (Int[Tensor, "..."] | None) – The per-environment seed. This is ignored for deterministic protocols, so it can be - None.
 
- Returns:
- active_agents (Bool[Tensor, “… agent channel”]) – The boolean mask. active_agents[*batch, agent, channel] is True if the agent sends a message in the channel in round round_id[*batch].
 
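A usage sketch, reusing the protocol object from the construction sketch under __init__ above:

```python
import torch

round_id = torch.arange(protocol.max_message_rounds)

# seed can be None because this protocol is deterministic.
active = protocol.get_active_agents_mask_from_rounds_and_seed(round_id, None)
print(active.shape)  # (max_message_rounds, num_agents, num_message_channels)
```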
- get_agent_visible_channels(agent_name: str) → list[str][source]#
- Get the names of the channels visible to an agent.
- Parameters:
- agent_name (str) – The name of the agent. 
- Returns:
- visible_channels (list[str]) – The names of the channels visible to the agent. 
 
- get_verifier_guess_mask_from_rounds_and_seed(round_id: Int[Tensor, '...'], seed: Int[Tensor, '...'] | None) → Bool[Tensor, '...'][source]#
- Get a boolean mask indicating when the verifier can make a guess.
- Takes as input a tensor of rounds and returns a boolean mask indicating when the verifier can make a guess for each element in the batch.
- Parameters:
- round_id (Int[Tensor, "..."]) – The batch of rounds. 
- seed (Int[Tensor, "..."] | None) – The per-environment seed. This is ignored for deterministic protocols, so it can be - None.
 
- Returns:
- verifier_turn (Bool[Tensor, “…”]) – Which batch items the verifier can make a guess in. 
 
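As with the active-agents mask above, a usage sketch with the protocol object from the __init__ sketch:

```python
import torch

round_id = torch.arange(protocol.max_message_rounds)

# One boolean per round: can the verifier guess in that round?
guess_mask = protocol.get_verifier_guess_mask_from_rounds_and_seed(round_id, None)
print(guess_mask)
```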
 - is_agent_active(agent_name: str, round_id: int, channel_name: str)[source]#
- Specify whether an agent is active in a given round and channel. 
- max_reward(agent_name: str) → float[source]#
- Get the maximum possible reward for an agent.
- For the verifier, this is the maximum reward it gets for guessing plus the bonus for not guessing in each round (if positive).
- For the prover, this is the reward it gets for being accepted by the verifier.
- Parameters:
- agent_name (str) – The name of the agent to get the maximum reward for. 
- Returns:
- max_reward (float) – The maximum possible reward for the agent. 
 
- min_reward(agent_name: str) → float[source]#
- Get the minimum possible reward for an agent.
- For the verifier, this is the minimum reward it gets for guessing plus the bonus for not guessing in each round (if negative).
- For the prover, this is 0.
- Parameters:
- agent_name (str) – The name of the agent to get the minimum reward for.
- Returns:
- min_reward (float) – The minimum possible reward for the agent. 
 
- reward_mid_point_estimate(agent_name: str) → float[source]#
- Get an estimate of the expected reward if all agents play randomly.
- This is used to compute the mid-point of the reward range for the agent.
- For example, if the agent gets reward -1 for a wrong guess and 1 for a correct guess, the mid-point estimate would be 0.
- Parameters:
- agent_name (str) – The name of the agent to get the reward mid-point for. 
- Returns:
- reward_mid_point (float) – The expected reward for the agent if all agents play randomly. 
 
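These three reward-range methods are naturally used together, e.g. to normalise returns; a sketch with the protocol object from earlier:

```python
# Reward range and mid-point for the verifier, e.g. for normalisation.
lo = protocol.min_reward("verifier")
hi = protocol.max_reward("verifier")
mid = protocol.reward_mid_point_estimate("verifier")
print(f"verifier reward range [{lo}, {hi}], mid-point estimate {mid}")
```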
- step_interaction_protocol(env_td: TensorDictBase | NestedArrayDict) → tuple[Bool[Tensor, '...'], Bool[Tensor, '... agent'], Bool[Tensor, '...'], Float[Tensor, '... agent']][source]#
- Take a step in the interaction protocol.
- Computes the done signals and reward.
- Used in the _step method of the environment.
- Parameters:
- env_td (TensorDictBase | NestedArrayDict) – The current observation and state. If a NestedArrayDict, it is converted to a TensorDictBase. Has keys:
- "y" (… 1): The target value.
- "round" (…): The current round.
- ("agents", "decision") (… agent): The decision of each agent.
- ("agents", "continuous_decision") (… agent): (Optional) A more fine-grained version of the decision, which is a float between -1 and 1.
- ("agents", "valid_response") (… agent): (Optional) A boolean mask indicating whether the agent’s response is valid.
- "done" (…): A boolean mask indicating whether the episode is done.
- ("agents", "done") (… agent): A boolean mask indicating whether each agent is done.
- "terminated" (…): A boolean mask indicating whether the episode has been terminated.
 
- Returns:
- shared_done (Bool[Tensor, “…”]) – A boolean mask indicating whether the episode is done because all relevant agents have made a decision. 
- agent_done (Bool[Tensor, “… agent”]) – A boolean mask indicating whether each agent is done, because they have made a decision. This is the same as shared_done for agents which don’t make decisions.
- terminated (Bool[Tensor, “…”]) – A boolean mask indicating whether the episode has been terminated because the max number of rounds has been reached and the verifier has not guessed. 
- reward (Float[Tensor, “… agent”]) – The reward for the agents.
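A sketch of how an environment's _step method might call this. The env_td below is hand-built for illustration (its keys follow the list above); a real environment populates it from its own state:

```python
import torch
from tensordict import TensorDict

batch = 2
env_td = TensorDict(
    {
        "y": torch.ones(batch, 1, dtype=torch.long),
        "round": torch.zeros(batch, dtype=torch.long),
        "done": torch.zeros(batch, dtype=torch.bool),
        "terminated": torch.zeros(batch, dtype=torch.bool),
        "agents": TensorDict(
            {
                # Decision 2 = "no decision" for every agent.
                "decision": torch.full((batch, protocol.num_agents), 2),
                "done": torch.zeros(batch, protocol.num_agents, dtype=torch.bool),
            },
            batch_size=[batch],
        ),
    },
    batch_size=[batch],
)

shared_done, agent_done, terminated, reward = protocol.step_interaction_protocol(env_td)
```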