nip.utils.bugfix.Reward2GoTransform

Calculates the reward to go based on the episode reward and a discount factor.
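As a rough illustration of the quantity involved, the sketch below computes a discounted reward-to-go over a flat sequence of steps in pure Python. It is not the torchrl or nip implementation: the function name `reward_to_go`, the list-based inputs, and the convention that `dones[t]` marks the last step of an episode are all assumptions made for this example.

```python
def reward_to_go(rewards, dones, gamma):
    """Hypothetical helper: out[t] = rewards[t] + gamma * out[t + 1],
    with the sum restarted at every step flagged as done."""
    out = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the value computed for t + 1.
    for t in reversed(range(len(rewards))):
        if dones[t]:
            # Assumed convention: step t ends its episode, so nothing
            # from later steps may leak into its reward-to-go.
            running = 0.0
        running = rewards[t] + gamma * running
        out[t] = running
    return out


# Two episodes of two steps each, stored back to back.
print(reward_to_go([1.0, 1.0, 1.0, 1.0], [False, True, False, True], gamma=0.5))
# [1.5, 1.0, 1.5, 1.0]
```

With `gamma=1.0` this reduces to the undiscounted sum of the remaining rewards in each episode.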

This is a fixed version of the Reward2GoTransform class from torchrl. The original version had a bug where the reward-to-go was reshaped rather than transposed.
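To see why reshaping in place of transposing matters, consider the minimal pure-Python sketch below. It only illustrates the general difference between the two operations. The exact tensor shapes involved in the original torchrl bug are not stated here, so the 2 x 3 batch and the helper names `transpose` and `reshape` are assumptions made for the example.

```python
def transpose(rows):
    """Swap the two axes: element [i][j] moves to [j][i]."""
    return [list(col) for col in zip(*rows)]


def reshape(rows, n_rows, n_cols):
    """Keep the flat (row-major) element order and only re-chunk it."""
    flat = [x for row in rows for x in row]
    return [flat[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]


# Assumed layout: 2 trajectories x 3 time steps.
batch = [[1, 2, 3],
         [4, 5, 6]]

transposed = transpose(batch)      # [[1, 4], [2, 5], [3, 6]]
reshaped = reshape(batch, 3, 2)    # [[1, 2], [3, 4], [5, 6]]

# Both results are 3 x 2, but only the transpose keeps each
# trajectory's values together along a single axis.
print(transposed != reshaped)  # True
```

Because both results have the same shape, a reshape used where a transpose was intended raises no error. It silently mixes values from different trajectories.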

See torchrl.envs.transforms.Reward2GoTransform for more information.

Methods Summary

_inv_apply_transform(reward, done)

Attributes

`ENV_ERR`
`T_destination`
`call_super_init`
`container`	Returns the env containing the transform.
`dump_patches`
`in_keys`
`in_keys_inv`
`invertible`
`missing_tolerance`
`out_keys`
`out_keys_inv`
`parent`	Returns the parent env of the transform.
`training`

Methods

_inv_apply_transform(reward: Tensor, done: Tensor) → Tensor
