nip.utils.bugfix.Reward2GoTransform

class nip.utils.bugfix.Reward2GoTransform(gamma: float | Tensor | None = 1.0, in_keys: Sequence[str | Tuple[str, ...]] | None = None, out_keys: Sequence[str | Tuple[str, ...]] | None = None, done_key: str | Tuple[str, ...] | None = 'done')

Calculates the reward to go based on the episode reward and a discount factor.
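As a rough illustration of the quantity described above, the following is a minimal pure-Python sketch of a discounted reward-to-go over a flat trajectory. It is not the torchrl implementation, which operates on batched tensors. The function name `reward_to_go` and the convention that a `done` flag marks the last step of an episode are assumptions made for this example.

```python
from typing import List, Sequence


def reward_to_go(
    rewards: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 1.0,
) -> List[float]:
    """Discounted sum of future rewards for each step of a flat trajectory.

    Assumption for this sketch: dones[t] is True on the last step of an
    episode, so the running sum is reset there and rewards never leak
    across episode boundaries.
    """
    if len(rewards) != len(dones):
        raise ValueError("rewards and dones must have the same length")
    out = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the value of its successor.
    for t in reversed(range(len(rewards))):
        if dones[t]:
            running = 0.0  # step t ends an episode: nothing follows it
        running = rewards[t] + gamma * running
        out[t] = running
    return out


# Two episodes of lengths 3 and 2, concatenated into one trajectory.
example = reward_to_go(
    [1.0, 1.0, 1.0, 2.0, 2.0],
    [False, False, True, False, True],
    gamma=0.5,
)
# example == [1.75, 1.5, 1.0, 3.0, 2.0]
```

With `gamma=1.0` the result is simply the undiscounted sum of the remaining rewards in each episode.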

This is a fixed version of the Reward2GoTransform class from torchrl. The original version had a bug where the reward-to-go was reshaped rather than transposed.

See torchrl.envs.transforms.Reward2GoTransform for more information.
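To see why reshaping instead of transposing matters, the sketch below contrasts the two operations on a small nested list. It only illustrates the general difference and does not reproduce the affected torchrl code. The helper names `transpose` and `reshape` and the 3 x 2 layout are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def transpose(m: Matrix) -> Matrix:
    """Swap the two axes: element [i][j] moves to [j][i]."""
    return [list(col) for col in zip(*m)]


def reshape(m: Matrix, rows: int, cols: int) -> Matrix:
    """Keep the row-major element order and only change where rows break."""
    flat = [x for row in m for x in row]
    if rows * cols != len(flat):
        raise ValueError("new shape must hold the same number of elements")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


# Hypothetical layout: 3 time steps (rows) x 2 trajectories (columns).
values = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
]

transposed = transpose(values)    # [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
reshaped = reshape(values, 2, 3)  # [[1.0, 10.0, 2.0], [20.0, 3.0, 30.0]]
```

Both results have shape 2 x 3, but only the transposed one keeps each trajectory's values together in one row. The reshaped one interleaves values from different trajectories, which is the kind of silent misalignment a reshape-for-transpose mix-up produces.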

Methods Summary

_inv_apply_transform(reward, done)

Attributes

ENV_ERR

T_destination

call_super_init

container: Returns the env containing the transform.

dump_patches

in_keys

in_keys_inv

invertible

missing_tolerance

out_keys

out_keys_inv

parent: Returns the parent env of the transform.

training

Methods

_inv_apply_transform(reward: Tensor, done: Tensor) -> Tensor