nip.utils.bugfix

Replacements for buggy parts of libraries we use.

Functions

reward2go(reward, done, gamma[, time_dim])

Compute the discounted cumulative sum of rewards for multiple trajectories.
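As an illustration of what a discounted cumulative sum over several trajectories means, here is a minimal pure-Python sketch. It is not the implementation in this module: the function name reward_to_go_sketch is hypothetical, it works on flat Python lists rather than tensors, it has no time_dim argument, and it assumes that a True entry in dones marks the last step of a trajectory.

```python
def reward_to_go_sketch(rewards, dones, gamma):
    """Discounted reward-to-go for a flat sequence of steps.

    Hypothetical helper for illustration only. dones[t] == True is assumed
    to mark step t as the final step of its trajectory, so rewards from
    later steps are never added to it.
    """
    if len(rewards) != len(dones):
        raise ValueError("rewards and dones must have the same length")

    out = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the value of the step after it.
    for t in reversed(range(len(rewards))):
        if dones[t]:
            # Trajectory boundary: discard the sum from the next trajectory.
            running = 0.0
        running = rewards[t] + gamma * running
        out[t] = running
    return out


# Two trajectories stored back to back: steps 0-2 and steps 3-4.
example = reward_to_go_sketch(
    rewards=[1.0, 1.0, 1.0, 2.0, 2.0],
    dones=[False, False, True, False, True],
    gamma=0.5,
)
print(example)  # [1.75, 1.5, 1.0, 3.0, 2.0]
```

The backward pass keeps the cost linear in the number of steps, and resetting the running sum at each done flag is what keeps the trajectories independent of one another.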

Classes

Reward2GoTransform([gamma, in_keys, ...])

Calculates the reward to go based on the episode reward and a discount factor.