nip.trainers.malt_pure_text._sample_positive_and_negative_examples
nip.trainers.malt_pure_text._sample_positive_and_negative_examples(partial_rollouts_by_level: list[list[_PartialRolloutNode]], hyper_params: HyperParameters)
Sample positive and negative examples for each node in the tree of responses.
We look at each node and check whether its children include both a positive and a negative example. If so, we set the ("agents", "has_positive_and_negative") field to True. In this case, we randomly sample a positive and a negative example from the children and set the ("agents", "sampled_positive_example") and ("agents", "sampled_negative_example") fields to the corresponding node IDs. Otherwise these fields are set to -1.
Parameters:
partial_rollouts_by_level (list[list[_PartialRolloutNode]]) – The tree of responses, stratified by level. These are modified in-place, where we add ("agents", "has_positive_and_negative"), ("agents", "sampled_positive_example"), and ("agents", "sampled_negative_example") fields to the rollouts.
hyper_params (HyperParameters) – The parameters of the experiment.
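
The sampling rule described above can be sketched roughly as follows. This is a simplified illustration, not the library's implementation: `Node`, its `is_positive` flag, the plain `fields` dict, and `sample_pos_neg` are hypothetical stand-ins for `_PartialRolloutNode` and its rollout data. The criterion that makes a child "positive" is not specified on this page, so the sketch takes it as a given boolean. It also ignores any per-agent dimension that the "agents" key may imply, and it assumes the flag is set to False when a node lacks either kind of child.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for _PartialRolloutNode (not the library's class)."""

    node_id: int
    is_positive: bool  # assumption: the real positivity criterion is not shown here
    children: list["Node"] = field(default_factory=list)
    fields: dict = field(default_factory=dict)  # stand-in for the rollout's fields


def sample_pos_neg(levels: list[list[Node]], rng: random.Random) -> None:
    """Annotate every node in-place with the three fields described above."""
    for level in levels:
        for node in level:
            pos = [c.node_id for c in node.children if c.is_positive]
            neg = [c.node_id for c in node.children if not c.is_positive]
            has_both = bool(pos) and bool(neg)
            node.fields[("agents", "has_positive_and_negative")] = has_both
            # Sample one ID of each kind only when both kinds exist; else use -1.
            node.fields[("agents", "sampled_positive_example")] = (
                rng.choice(pos) if has_both else -1
            )
            node.fields[("agents", "sampled_negative_example")] = (
                rng.choice(neg) if has_both else -1
            )


# A two-level tree: the root has one positive and one negative child.
root = Node(0, True, children=[Node(1, True), Node(2, False)])
levels = [[root], root.children]
sample_pos_neg(levels, random.Random(0))
```

After this call the root carries the IDs of its sampled children, while the leaves, which have no children, get -1 in both sampled fields.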