nip.rl_objectives

Implementations of RL objectives, extending those of TorchRL.

Classes

ClipPPOLossImproved(*args, **kwargs)

Clipped PPO loss which allows multiple action keys and normalises advantages.
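To illustrate what a clipped PPO loss with several action keys and normalised advantages computes, here is a minimal pure-Python sketch. It is not the code of `ClipPPOLossImproved`. The function names, the dict-of-lists input layout, and the choice to sum per-key log-probabilities into one joint log-ratio are assumptions made for illustration only.

```python
import math


def normalise(advantages, eps=1e-8):
    """Shift advantages to zero mean and scale to (roughly) unit std."""
    n = len(advantages)
    mean = sum(advantages) / n
    std = math.sqrt(sum((a - mean) ** 2 for a in advantages) / n)
    return [(a - mean) / (std + eps) for a in advantages]


def clipped_ppo_loss(new_log_probs, old_log_probs, advantages, clip_epsilon=0.2):
    """Hypothetical helper: clipped surrogate loss over a batch.

    new_log_probs / old_log_probs map each action key to a list of
    per-sample log-probabilities under the new / old policy.
    """
    adv = normalise(advantages)
    total = 0.0
    for i, a in enumerate(adv):
        # Assumption: the action keys are independent given the observation,
        # so the joint log-ratio is the sum of the per-key log-ratios.
        log_ratio = sum(
            new_log_probs[key][i] - old_log_probs[key][i] for key in new_log_probs
        )
        ratio = math.exp(log_ratio)
        clipped = min(max(ratio, 1.0 - clip_epsilon), 1.0 + clip_epsilon)
        # Pessimistic (lower) bound of the unclipped and clipped objectives.
        total += min(ratio * a, clipped * a)
    # Negate: the surrogate is maximised, the loss is minimised.
    return -total / len(adv)
```

When the new and old policies agree, every ratio is 1 and the loss reduces to the negated mean of the normalised advantages, which is zero.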

KLPENPPOLossImproved(*args, **kwargs)

KL penalty PPO loss which allows multiple action keys and normalises advantages.
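The KL penalty variant replaces clipping with a penalty on the divergence between the old and new policies. The sketch below is an illustration under stated assumptions, not the code of `KLPENPPOLossImproved`: the function names and input layout are hypothetical, the coefficient `beta` is held fixed (implementations often adapt it during training), and the KL term is a simple sample-based estimate taken over actions drawn from the old policy.

```python
import math


def normalise(advantages, eps=1e-8):
    """Shift advantages to zero mean and scale to (roughly) unit std."""
    n = len(advantages)
    mean = sum(advantages) / n
    std = math.sqrt(sum((a - mean) ** 2 for a in advantages) / n)
    return [(a - mean) / (std + eps) for a in advantages]


def kl_penalty_ppo_loss(new_log_probs, old_log_probs, advantages, beta=1.0):
    """Hypothetical helper: unclipped surrogate plus a fixed-weight KL penalty.

    new_log_probs / old_log_probs map each action key to a list of
    per-sample log-probabilities under the new / old policy.
    """
    adv = normalise(advantages)
    n = len(adv)
    surrogate = 0.0
    kl_estimate = 0.0
    for i, a in enumerate(adv):
        # Assumption: the joint log-ratio is the sum over action keys.
        log_ratio = sum(
            new_log_probs[key][i] - old_log_probs[key][i] for key in new_log_probs
        )
        surrogate += math.exp(log_ratio) * a
        # Sample estimate of KL(old || new): mean of log(old) - log(new).
        kl_estimate += -log_ratio
    return -surrogate / n + beta * kl_estimate / n
```

With identical policies both terms vanish. As the new policy moves away from the old one, the penalty grows in proportion to `beta`.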

Objective(*args, **kwargs)

Base class for all RL objectives.

PPOLossImproved(*args, **kwargs)

Base PPO loss class which allows multiple action keys and normalises advantages.

ReinforceLossImproved(*args, **kwargs)

REINFORCE loss which allows multiple action keys and normalises advantages.

SpgLoss(*args, **kwargs)

Loss for Stackelberg Policy Gradient and several variants.