nip.utils.torch.DummyOptimizer#

class nip.utils.torch.DummyOptimizer(*args, **kwargs)[source]#

A dummy optimizer which does nothing.
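A minimal usage sketch (not from the library's own docs): assuming DummyOptimizer accepts arbitrary positional and keyword arguments, as its signature suggests, and that step() and zero_grad() are no-ops, it can stand in wherever an optimizer is expected but no parameter updates are wanted.

```python
# Hypothetical usage sketch: dropping DummyOptimizer into a training loop that
# expects an optimizer but should not update parameters. Assumes the
# constructor ignores its arguments and step()/zero_grad() do nothing.
import torch
from nip.utils.torch import DummyOptimizer

model = torch.nn.Linear(4, 2)
optimizer = DummyOptimizer(model.parameters(), lr=1e-3)  # arguments are ignored

inputs = torch.randn(8, 4)
loss = model(inputs).sum()

optimizer.zero_grad()  # no-op
loss.backward()        # gradients are still computed as usual
optimizer.step()       # no-op: parameters are left unchanged
```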

Methods Summary

__init__(*args, **kwargs)

step(*args, **kwargs)

Performs a single optimization step (parameter update).

zero_grad(*args, **kwargs)

Resets the gradients of all optimized torch.Tensor objects.

Attributes

OptimizerPostHook

alias of Callable[[Self, Tuple[Any, ...], Dict[str, Any]], None]

OptimizerPreHook

alias of Callable[[Self, Tuple[Any, ...], Dict[str, Any]], Tuple[Tuple[Any, ...], Dict[str, Any]] | None]

Methods

__init__(*args, **kwargs)[source]#
step(*args, **kwargs)[source]#

Performs a single optimization step (parameter update).

Parameters:

closure (Callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.

Note

Unless otherwise specified, this function should not modify the .grad field of the parameters.
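For context, the closure argument mentioned above follows the standard torch.optim.Optimizer pattern, sketched below with torch.optim.LBFGS (which requires a closure); whether DummyOptimizer evaluates a passed closure at all is not specified here.

```python
# Closure pattern used by optimizers that re-evaluate the loss inside step(),
# e.g. torch.optim.LBFGS. Shown for illustration only; DummyOptimizer.step()
# is presumably a no-op regardless of what is passed.
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.LBFGS(model.parameters())
inputs, targets = torch.randn(8, 4), torch.randn(8, 1)

def closure():
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    return loss

optimizer.step(closure)
```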

zero_grad(*args, **kwargs)[source]#

Resets the gradients of all optimized torch.Tensor objects.

Parameters:

set_to_none (bool) – Instead of setting to zero, set the grads to None. This will in general have a lower memory footprint, and can modestly improve performance. However, it changes certain behaviors. For example:

1. When the user tries to access a gradient and perform manual ops on it, a None attribute and a Tensor full of 0s will behave differently.

2. If the user requests zero_grad(set_to_none=True) followed by a backward pass, .grad attributes are guaranteed to be None for params that did not receive a gradient.

3. torch.optim optimizers behave differently depending on whether the gradient is 0 or None (in one case the step is performed with a gradient of 0, and in the other the step is skipped altogether).
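The sketch below illustrates the documented set_to_none semantics using a standard torch.optim.SGD optimizer; a DummyOptimizer would presumably leave gradients untouched either way.

```python
# Illustration of the set_to_none distinction with a regular optimizer.
import torch

param = torch.nn.Parameter(torch.randn(3))
optimizer = torch.optim.SGD([param], lr=0.1)

param.sum().backward()
optimizer.zero_grad(set_to_none=False)
print(param.grad)  # tensor of zeros: the gradient tensor is kept and zeroed in place

param.sum().backward()
optimizer.zero_grad(set_to_none=True)
print(param.grad)  # None: the gradient tensor is released entirely
```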