nip.utils.bugfix.Reward2GoTransform
- class nip.utils.bugfix.Reward2GoTransform(gamma: float | Tensor | None = 1.0, in_keys: Sequence[str | Tuple[str, ...]] | None = None, out_keys: Sequence[str | Tuple[str, ...]] | None = None, done_key: str | Tuple[str, ...] | None = 'done')
Calculates the reward to go based on the episode reward and a discount factor.
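As a rough illustration of the quantity being computed, the sketch below works out a discounted reward-to-go over a flat trajectory in plain Python. It is not the torchrl or nip implementation: the function name `reward_to_go` and the list-based inputs are assumptions made for this example, and the real transform operates on tensors read from the configured `in_keys` and `done_key`.

```python
from typing import List, Sequence


def reward_to_go(
    rewards: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 1.0,
) -> List[float]:
    """Discounted reward-to-go for a flat trajectory that may hold several episodes.

    Hypothetical helper for illustration only; `dones[t]` marks the last
    step of an episode.
    """
    out = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the value of the step after it.
    for t in reversed(range(len(rewards))):
        if dones[t]:
            # The episode ends here, so nothing from later steps carries over.
            running = 0.0
        running = rewards[t] + gamma * running
        out[t] = running
    return out


# Two episodes stored back to back: steps 0-2 and steps 3-4.
rewards = [1.0, 1.0, 1.0, 2.0, 2.0]
dones = [False, False, True, False, True]
rtg = reward_to_go(rewards, dones, gamma=0.5)
print(rtg)  # [1.75, 1.5, 1.0, 3.0, 2.0]
```

With `gamma=1.0` (the default in the signature above) the value at each step is simply the sum of the remaining rewards in that episode.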
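This is a fixed version of the Reward2GoTransform class from torchrl. The original version had a bug where the reward-to-go was reshaped rather than transposed (a reshape keeps the elements in their original order and only changes the shape, so it does not swap dimensions the way a transpose does).
See torchrl.envs.transforms.Reward2GoTransform for more information.
Methods Summary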
_inv_apply_transform(reward, done)
Attributes
ENV_ERR
T_destination
call_super_init
container
Returns the env containing the transform.
dump_patches
in_keys
in_keys_inv
invertible
missing_tolerance
out_keys
out_keys_inv
parent
Returns the parent env of the transform.
training
Methods