nip.utils.bugfix
Replacements for buggy parts of libraries we use.
Functions
Compute the discounted cumulative sum of rewards for multiple trajectories.
Classes
Calculates the reward to go based on the episode reward and a discount factor.
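
As a rough illustration of what a discounted cumulative sum (reward to go) means, the sketch below computes G_t = r_t + gamma * G_{t+1} backwards over each trajectory in pure Python. The names `discounted_cumsum` and `discounted_cumsum_batch` are hypothetical and are not this module's API. The sketch also assumes each trajectory is a plain list of per-step rewards that ends at its last element.

```python
from typing import List, Sequence


def discounted_cumsum(rewards: Sequence[float], gamma: float) -> List[float]:
    """Return the reward to go G_t = r_t + gamma * G_{t+1} for one trajectory.

    Hypothetical helper for illustration only; not part of nip.utils.bugfix.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return already computed for t + 1.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def discounted_cumsum_batch(
    trajectories: Sequence[Sequence[float]], gamma: float
) -> List[List[float]]:
    """Apply discounted_cumsum to each trajectory independently."""
    return [discounted_cumsum(traj, gamma) for traj in trajectories]


if __name__ == "__main__":
    print(discounted_cumsum_batch([[1.0, 1.0, 1.0], [0.0, 2.0]], gamma=0.5))
```

Processing each trajectory separately keeps rewards from one episode from leaking into the return of another, which is the main thing a batched implementation has to get right.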