nip.trainers.malt_pure_text._PartialRolloutNode

class nip.trainers.malt_pure_text._PartialRolloutNode(current_env_state: nip.utils.nested_array_dict.NestedArrayDict, protocol_handler: nip.protocols.protocol_base.ProtocolHandler, ended: bool = False, padding: bool = False, trajectory_env_states: list[nip.utils.nested_array_dict.NestedArrayDict] = <factory>, node_id: int = -1, parent_partial_rollout: nip.trainers.malt_pure_text._PartialRolloutNode | None = None, child_partial_rollouts: list[nip.trainers.malt_pure_text._PartialRolloutNode] = <factory>, num_branches: int = 0, total_reward_per_agent: numpy.ndarray | float = 0.0)

A node in the tree of responses, representing a partially generated rollout.

Methods Summary

__eq__(other)

Return self==value.

__init__(current_env_state, protocol_handler)

__post_init__()

__repr__()

Return repr(self).

_get_next_node_id_and_increment()

Get the next node ID and increment the counter.

clone_as_child()

Clone this node and attach the clone as its child.

has_agent_acted(agent_name)

Check if the given agent has acted at this node.

visualise([include_messages, ...])

Get a recursive string representation of the rollout tree.

Attributes

ended

Whether the rollout has ended from this point onwards.

node_id

The ID of this node, which is unique in the forest of rollouts.

num_branches

The number of branches passing through this node.

padding

Whether this node is a padding node.

parent_partial_rollout

The parent node of this node, or None if this is the root node.

total_reward_per_agent

The total reward for each agent at this node and below.

current_env_state

The state of the environment at this node.

protocol_handler

The protocol handler for the experiment.

trajectory_env_states

A list of the environment states in the trajectory leading up to this node.

child_partial_rollouts

The child nodes of this node, i.e. its one-step continuations.

Methods

__eq__(other)

Return self==value.

__init__(current_env_state: nip.utils.nested_array_dict.NestedArrayDict, protocol_handler: nip.protocols.protocol_base.ProtocolHandler, ended: bool = False, padding: bool = False, trajectory_env_states: list[nip.utils.nested_array_dict.NestedArrayDict] = <factory>, node_id: int = -1, parent_partial_rollout: nip.trainers.malt_pure_text._PartialRolloutNode | None = None, child_partial_rollouts: list[nip.trainers.malt_pure_text._PartialRolloutNode] = <factory>, num_branches: int = 0, total_reward_per_agent: numpy.ndarray | float = 0.0) → None

__post_init__()

__repr__()

Return repr(self).

async _get_next_node_id_and_increment() → int

Get the next node ID and increment the counter.

This uses a class-level lock to ensure that the node IDs are unique across all nodes in the forest.

Returns:

node_id (int) – The next node ID.
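The locking scheme described above can be illustrated with a minimal sketch. It assumes an `asyncio.Lock` and an integer counter, both stored on the class; the names `SketchNode`, `_next_node_id` and `_id_lock` are hypothetical and are not taken from the nip source.

```python
import asyncio
from typing import ClassVar, Optional


class SketchNode:
    """Hypothetical stand-in for a rollout node; not the nip implementation."""

    # Class-level state shared by every instance (assumed names).
    _next_node_id: ClassVar[int] = 0
    _id_lock: ClassVar[Optional[asyncio.Lock]] = None

    @classmethod
    async def get_next_node_id_and_increment(cls) -> int:
        # Create the lock lazily, on first use inside a running event loop.
        if cls._id_lock is None:
            cls._id_lock = asyncio.Lock()
        # Only one coroutine at a time may read and advance the counter.
        async with cls._id_lock:
            node_id = cls._next_node_id
            cls._next_node_id += 1
            return node_id


async def allocate_ids(count: int) -> list[int]:
    # Request several IDs concurrently; each caller receives a distinct value.
    return await asyncio.gather(
        *(SketchNode.get_next_node_id_and_increment() for _ in range(count))
    )


ids = asyncio.run(allocate_ids(5))
```

Because the counter lives on the class rather than on an instance, nodes in different trees draw from the same sequence, which is what makes an ID unique across the whole forest.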

async clone_as_child() → Self

Clone this node and attach the clone as its child.

This creates a new node with the same environment state and adds it to the current node’s list of child nodes.

Returns:

cloned_partial_rollout (_PartialRolloutNode) – The cloned node, which is a child of the current node.
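A simplified sketch of this clone-and-attach pattern is shown below. It is synchronous and omits node ID allocation. The class `SketchNode`, its fields, and the use of a plain `dict` with a deep copy for the state are all assumptions made for illustration; the actual method may copy the environment state differently.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class SketchNode:
    """Hypothetical simplified node; the state is a plain dict here."""

    state: dict
    parent: Optional["SketchNode"] = None
    children: list["SketchNode"] = field(default_factory=list)

    def clone_as_child(self) -> "SketchNode":
        # Copy the state so later edits to the child leave the parent intact
        # (whether the real method copies deeply is an assumption).
        child = SketchNode(state=copy.deepcopy(self.state), parent=self)
        # Register the clone as a one-step continuation of this node.
        self.children.append(child)
        return child


root = SketchNode(state={"round": 0, "messages": []})
child = root.clone_as_child()
child.state["messages"].append("hello")  # Does not affect root.state.
```

Each call adds one more branch under the same parent, so calling it repeatedly on one node yields sibling continuations that start from the same state.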

has_agent_acted(agent_name: str) → bool

Check if the given agent has acted at this node.

An action means either sending a message or making a decision (for verifiers).

Parameters:

agent_name (str) – The name of the agent to check.

Returns:

has_acted (bool) – True if the agent has acted at this node, False otherwise.

visualise(include_messages: bool = True, include_expected_reward: bool = True, include_pair_info: bool = True, include_padding_nodes: bool = False, tab_size: int = 2) → str

Get a recursive string representation of the rollout tree.

Parameters:

  • include_messages (bool, default=True) – Whether to include the messages and decisions sent by each agent at each node.

  • include_expected_reward (bool, default=True) – Whether to include the expected reward for each agent.

  • include_pair_info (bool, default=True) – Whether to indicate whether a node is the positive or negative example in a preference pair.

  • include_padding_nodes (bool, default=False) – Whether to include padding nodes in the output. Padding nodes are nodes which are not part of the tree, but are used to fill in the tree so that it has uniform depth.

  • tab_size (int, default=2) – The number of spaces to indent each level of the tree.

Returns:

tree_string (str) – A representation of the rollout tree starting from this node.
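The recursive, indented rendering that visualise describes can be sketched as follows. This is an illustration only: `SketchNode` and the `Node <id>` line format are hypothetical, and the sketch covers just the `include_padding_nodes` and `tab_size` options, not messages, expected rewards or preference-pair information.

```python
from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Hypothetical simplified node with only the fields the sketch needs."""

    node_id: int
    padding: bool = False
    children: list["SketchNode"] = field(default_factory=list)


def visualise(
    node: SketchNode,
    include_padding_nodes: bool = False,
    tab_size: int = 2,
    depth: int = 0,
) -> str:
    # Render this node, indented by tab_size spaces per tree level.
    lines = [" " * (tab_size * depth) + f"Node {node.node_id}"]
    for child in node.children:
        if child.padding and not include_padding_nodes:
            continue  # Skip padding nodes and everything below them.
        lines.append(visualise(child, include_padding_nodes, tab_size, depth + 1))
    return "\n".join(lines)


tree = SketchNode(0, children=[
    SketchNode(1, children=[SketchNode(3)]),
    SketchNode(2, padding=True),
])
print(visualise(tree))
```

With the defaults, the padding node 2 is omitted and node 3 appears two levels deep, indented by four spaces.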