Module optim

Re-exports the optimizers, learning rate schedulers, and gradient utilities defined in the submodules below.

Modules

adam
clip
ema
optimizer
radam
rmsprop
scheduler
sgd

Structs

Adam
Adam optimizer (Adaptive Moment Estimation).
AdamW
AdamW optimizer (Adam with decoupled weight decay).
CosineAnnealingLR
Cosine annealing from initial_lr to min_lr over total_steps.
CosineWarmupLR
Linear warmup from 0 to initial_lr over warmup_steps, then cosine decay from initial_lr to min_lr over the remaining steps.
EMA
Exponential Moving Average of model parameters.
ExponentialLR
Multiply the learning rate by gamma every step.
GradAccumulator
Gradient accumulation helper.
LinearLR
Linearly interpolate the learning rate from start_factor * initial_lr to end_factor * initial_lr over total_steps steps.
OptimizerState
A serializable snapshot of an optimizer’s internal state.
RAdam
Rectified Adam (RAdam) optimizer.
RMSProp
RMSProp optimizer.
ReduceLROnPlateau
Reduce the learning rate when a monitored metric plateaus.
SGD
Stochastic Gradient Descent optimizer with optional momentum.
StepLR
Multiply the learning rate by gamma every step_size steps; see the sketch after this list.
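
The scheduler structs above each boil down to a closed-form rule for the learning rate at a given step. As a rough illustration, the sketch below evaluates the StepLR and CosineAnnealingLR rules in plain Rust, reusing the parameter names from the one-line summaries (initial_lr, gamma, step_size, min_lr, total_steps); it is a minimal, self-contained sketch of the formulas, not this module's actual API.

```rust
use std::f64::consts::PI;

/// StepLR rule: multiply the learning rate by `gamma` every `step_size` steps.
fn step_lr(initial_lr: f64, gamma: f64, step_size: usize, step: usize) -> f64 {
    initial_lr * gamma.powi((step / step_size) as i32)
}

/// CosineAnnealingLR rule: anneal from `initial_lr` to `min_lr` over `total_steps`.
fn cosine_annealing_lr(initial_lr: f64, min_lr: f64, total_steps: usize, step: usize) -> f64 {
    let t = step.min(total_steps) as f64 / total_steps as f64;
    min_lr + 0.5 * (initial_lr - min_lr) * (1.0 + (PI * t).cos())
}

fn main() {
    for step in [0, 25, 50, 75, 100] {
        println!(
            "step {step:>3}: step_lr = {:.5}, cosine = {:.5}",
            step_lr(0.1, 0.5, 30, step),
            cosine_annealing_lr(0.1, 1e-4, 100, step),
        );
    }
}
```

At step 0 the cosine rule returns initial_lr and at total_steps it returns min_lr, matching the summary above; the StepLR example halves the rate every 30 steps with gamma = 0.5.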

Traits

LrScheduler
Trait for learning rate schedulers.
Optimizer
Trait that all optimizers implement; see the sketch after this list.
Stateful
Trait for optimizers that can save and restore their internal state.
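
As a hedged sketch of how this trait split could be used, the example below defines stand-in Optimizer and LrScheduler traits over plain f32 slices and wires them together in a toy loop. Every name and signature here (step, set_lr, lr, the Sgd and StepDecay structs) is an illustrative assumption, not this module's real interface.

```rust
// Hypothetical stand-ins for the traits listed above; the real definitions
// in this module will differ.
trait Optimizer {
    /// Apply one update to `params` using `grads`.
    fn step(&mut self, params: &mut [f32], grads: &[f32]);
    /// Let a scheduler override the learning rate.
    fn set_lr(&mut self, lr: f32);
}

trait LrScheduler {
    /// Learning rate to use at the given step.
    fn lr(&mut self, step: usize) -> f32;
}

/// SGD with classical momentum, in the spirit of the `SGD` struct above.
struct Sgd {
    lr: f32,
    momentum: f32,
    velocity: Vec<f32>,
}

impl Optimizer for Sgd {
    fn step(&mut self, params: &mut [f32], grads: &[f32]) {
        if self.velocity.len() != params.len() {
            self.velocity = vec![0.0; params.len()];
        }
        for ((p, g), v) in params.iter_mut().zip(grads).zip(self.velocity.iter_mut()) {
            *v = self.momentum * *v + *g; // accumulate momentum
            *p -= self.lr * *v;           // descend along the smoothed gradient
        }
    }

    fn set_lr(&mut self, lr: f32) {
        self.lr = lr;
    }
}

/// StepLR-like rule: multiply the rate by `gamma` every `step_size` steps.
struct StepDecay {
    initial_lr: f32,
    gamma: f32,
    step_size: usize,
}

impl LrScheduler for StepDecay {
    fn lr(&mut self, step: usize) -> f32 {
        self.initial_lr * self.gamma.powi((step / self.step_size) as i32)
    }
}

fn main() {
    let mut opt = Sgd { lr: 0.1, momentum: 0.9, velocity: Vec::new() };
    let mut sched = StepDecay { initial_lr: 0.1, gamma: 0.5, step_size: 10 };
    let mut params = vec![1.0_f32, -2.0];

    for step in 0..30 {
        let grads = vec![0.1, -0.2]; // stand-in gradients
        opt.set_lr(sched.lr(step));  // the scheduler drives the learning rate
        opt.step(&mut params, &grads);
    }
    println!("params after 30 steps: {params:?}");
}
```

The point of the split is that any LrScheduler can drive any Optimizer: the loop asks the scheduler for the current rate and hands it to the optimizer before each step.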

Functions

clip_grad_norm
Clip gradients by their global L2 norm; see the sketch after this list.
clip_grad_value
Clamp each gradient element to [-max_value, max_value].
grad_norm
Compute the global L2 norm of all gradients without clipping.
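
For orientation, here is a self-contained sketch of the behavior these three functions describe, written against plain Vec<f32> gradients rather than this crate's own gradient types; the real signatures and return values may differ.

```rust
/// Global L2 norm across all gradient buffers (as described for `grad_norm`).
fn grad_norm(grads: &[Vec<f32>]) -> f32 {
    grads.iter()
        .flat_map(|g| g.iter())
        .map(|x| x * x)
        .sum::<f32>()
        .sqrt()
}

/// Scale all gradients down so their global L2 norm is at most `max_norm`
/// (as described for `clip_grad_norm`).
fn clip_grad_norm(grads: &mut [Vec<f32>], max_norm: f32) -> f32 {
    let norm = grad_norm(grads);
    if norm > max_norm {
        let scale = max_norm / norm;
        for g in grads.iter_mut() {
            for x in g.iter_mut() {
                *x *= scale;
            }
        }
    }
    norm // return the pre-clipping norm (a common convention; the real API may differ)
}

/// Clamp each gradient element to [-max_value, max_value]
/// (as described for `clip_grad_value`).
fn clip_grad_value(grads: &mut [Vec<f32>], max_value: f32) {
    for g in grads.iter_mut() {
        for x in g.iter_mut() {
            *x = x.clamp(-max_value, max_value);
        }
    }
}

fn main() {
    let mut grads = vec![vec![3.0, 4.0]]; // global norm = 5
    let before = clip_grad_norm(&mut grads, 1.0);
    println!("norm before: {before}, after: {}", grad_norm(&grads));

    clip_grad_value(&mut grads, 0.5);
    println!("after value clipping: {grads:?}");
}
```

In the example the single gradient [3.0, 4.0] has global norm 5, so clipping to max_norm = 1.0 rescales it to [0.6, 0.8].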