Re-export optimizers.
Structs§
- Adam
- Adam optimizer (Adaptive Moment Estimation).
- AdamW
- AdamW optimizer (Adam with decoupled weight decay).
- CosineAnnealingLR
- Cosine annealing from `initial_lr` to `min_lr` over `total_steps`.
- CosineWarmupLR
- Linear warmup from 0 to `initial_lr` over `warmup_steps`, then cosine decay from `initial_lr` to `min_lr` over the remaining steps (sketched after this list).
- EMA
- Exponential Moving Average of model parameters (update rule sketched after this list).
- ExponentialLR
- Multiply the learning rate by `gamma` every step.
- GradAccumulator
- Gradient accumulation helper.
- LinearLR
- Linearly interpolate the learning rate from `start_factor * initial_lr` to `end_factor * initial_lr` over `total_steps` steps.
- OptimizerState
- A serializable snapshot of an optimizer’s internal state.
- RAdam
- Rectified Adam (RAdam) optimizer.
- RMSProp
- RMSProp optimizer.
- ReduceLROnPlateau
- Reduce the learning rate when a monitored metric plateaus.
- SGD
- Stochastic Gradient Descent optimizer with optional momentum.
- StepLR
- Multiply the learning rate by `gamma` every `step_size` steps.
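
For orientation, here is a minimal, self-contained sketch of the schedule math that the CosineAnnealingLR and CosineWarmupLR entries describe. It is not this crate's API: the free function and its parameter names (`step`, `initial_lr`, `min_lr`, `warmup_steps`, `total_steps`) only mirror the wording of those descriptions, and the actual structs may index, round, or clamp differently.

```rust
use std::f64::consts::PI;

/// Illustrative only: learning rate at `step` for a warmup-then-cosine schedule.
/// Linear warmup from 0 to `initial_lr` over `warmup_steps`, then cosine decay
/// from `initial_lr` down to `min_lr` over the remaining steps.
fn cosine_warmup_lr(
    step: usize,
    initial_lr: f64,
    min_lr: f64,
    warmup_steps: usize,
    total_steps: usize,
) -> f64 {
    if step < warmup_steps {
        // Linear warmup: 0 -> initial_lr.
        initial_lr * step as f64 / warmup_steps as f64
    } else {
        // Cosine decay: initial_lr -> min_lr over the remaining steps.
        let decay_steps = (total_steps - warmup_steps).max(1) as f64;
        let progress = ((step - warmup_steps) as f64 / decay_steps).min(1.0);
        min_lr + 0.5 * (initial_lr - min_lr) * (1.0 + (PI * progress).cos())
    }
}

fn main() {
    // Rough shape check: near 0 at the start, peak after warmup, near min_lr at the end.
    for &step in &[0, 50, 100, 500, 999] {
        println!("step {step:>4}: lr = {:.6}", cosine_warmup_lr(step, 1e-3, 1e-5, 100, 1000));
    }
}
```

With `warmup_steps = 0` this reduces to the plain cosine annealing from `initial_lr` to `min_lr` over `total_steps` that CosineAnnealingLR describes; ExponentialLR, LinearLR, and StepLR are likewise simple closed-form factors of the step count.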
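
Similarly, the EMA entry maintains an Exponential Moving Average of model parameters. The sketch below shows the standard shadow-parameter update rule on plain `f64` buffers; the struct name, fields, and `update` method here are assumptions for illustration, not this module's actual `EMA` API.

```rust
/// Illustrative only: an exponential moving average over flat parameter buffers.
struct EmaSketch {
    decay: f64,       // e.g. 0.999; closer to 1.0 averages over a longer window
    shadow: Vec<f64>, // the averaged ("shadow") copy of the parameters
}

impl EmaSketch {
    fn new(params: &[f64], decay: f64) -> Self {
        Self { decay, shadow: params.to_vec() }
    }

    /// Standard EMA update: shadow = decay * shadow + (1 - decay) * param.
    fn update(&mut self, params: &[f64]) {
        for (s, &p) in self.shadow.iter_mut().zip(params) {
            *s = self.decay * *s + (1.0 - self.decay) * p;
        }
    }
}

fn main() {
    let mut ema = EmaSketch::new(&[0.0, 0.0], 0.9);
    ema.update(&[1.0, 2.0]);
    ema.update(&[1.0, 2.0]);
    // After two updates toward [1.0, 2.0]: shadow ≈ [0.19, 0.38].
    println!("{:?}", ema.shadow);
}
```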
Traits§
- LrScheduler
- Trait for learning rate schedulers.
- Optimizer
- Trait that all optimizers implement.
- Stateful
- Trait for optimizers that can save and restore their internal state.
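
To show how these traits typically fit together (a scheduler maps steps to learning rates, an optimizer consumes them and owns the update rule, and a stateful optimizer can snapshot its internals), here is a purely hypothetical shape for such a trio. Every name and signature below is an assumption for illustration; the real `Optimizer`, `LrScheduler`, and `Stateful` traits in this module are defined over the crate's own parameter and state types and will differ.

```rust
/// Hypothetical sketch only: none of these names or signatures come from this crate.
struct StateSnapshotSketch {
    step: usize,
    // A real snapshot would also carry per-parameter buffers (moments, etc.).
}

trait OptimizerSketch {
    /// Apply one update in place, given parameters and their gradients.
    fn step(&mut self, params: &mut [f64], grads: &[f64]);
    /// Let a scheduler (or the caller) change the learning rate between steps.
    fn set_lr(&mut self, lr: f64);
}

trait LrSchedulerSketch {
    /// Learning rate to use at a given global step.
    fn lr_at(&self, step: usize) -> f64;
}

trait StatefulSketch {
    fn save_state(&self) -> StateSnapshotSketch;
    fn load_state(&mut self, state: StateSnapshotSketch);
}

/// Plain SGD as a tiny implementor, to show how the pieces interact.
struct SgdSketch {
    lr: f64,
    step: usize,
}

impl OptimizerSketch for SgdSketch {
    fn step(&mut self, params: &mut [f64], grads: &[f64]) {
        for (p, &g) in params.iter_mut().zip(grads) {
            *p -= self.lr * g;
        }
        self.step += 1;
    }
    fn set_lr(&mut self, lr: f64) {
        self.lr = lr;
    }
}

impl StatefulSketch for SgdSketch {
    fn save_state(&self) -> StateSnapshotSketch {
        StateSnapshotSketch { step: self.step }
    }
    fn load_state(&mut self, state: StateSnapshotSketch) {
        self.step = state.step;
    }
}

/// A constant schedule, the simplest possible scheduler.
struct ConstantLrSketch(f64);
impl LrSchedulerSketch for ConstantLrSketch {
    fn lr_at(&self, _step: usize) -> f64 {
        self.0
    }
}

fn main() {
    let mut opt = SgdSketch { lr: 0.1, step: 0 };
    let mut params = vec![1.0, -2.0];
    opt.step(&mut params, &[0.5, -0.5]); // params become [0.95, -1.95]

    // Save, pretend to restart, then restore and hand the lr back to a scheduler.
    let snapshot = opt.save_state();
    let mut resumed = SgdSketch { lr: 0.0, step: 0 };
    resumed.load_state(snapshot);
    resumed.set_lr(ConstantLrSketch(0.05).lr_at(resumed.step));
    println!("{params:?}, resumed at step {} with lr {}", resumed.step, resumed.lr);
}
```

The split mirrors the listing above: the scheduler only maps steps to learning rates, the optimizer owns the update rule, and the stateful trait adds the snapshot/restore pair that something like OptimizerState would serialize.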
Functions§
- clip_grad_norm
- Clip gradients by their global L2 norm (sketched below).
- clip_grad_value
- Clamp each gradient element to `[-max_value, max_value]`.
- grad_norm
- Compute the global L2 norm of all gradients without clipping.
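
For reference, the sketch below shows what clipping by a global L2 norm means, on plain `f64` gradient buffers. The function name and signature are illustrative assumptions; the module's own `clip_grad_norm`, `clip_grad_value`, and `grad_norm` operate on the crate's gradient types and are not reproduced here.

```rust
/// Illustrative only: scale all gradients so their global L2 norm is at most `max_norm`.
/// Returns the norm measured before clipping (the quantity `grad_norm` is described
/// as computing, without the clipping step).
fn clip_grad_norm_sketch(grads: &mut [Vec<f64>], max_norm: f64) -> f64 {
    // Global L2 norm across every element of every gradient buffer.
    let total_norm: f64 = grads
        .iter()
        .flat_map(|g| g.iter())
        .map(|x| x * x)
        .sum::<f64>()
        .sqrt();

    // Rescale everything by the same factor when the norm exceeds the budget.
    if total_norm > max_norm {
        let scale = max_norm / total_norm;
        for g in grads.iter_mut() {
            for x in g.iter_mut() {
                *x *= scale;
            }
        }
    }
    total_norm
}

fn main() {
    let mut grads = vec![vec![3.0, 4.0], vec![0.0]]; // global norm = 5.0
    let before = clip_grad_norm_sketch(&mut grads, 1.0);
    println!("norm before: {before}, clipped grads: {grads:?}"); // rescaled to norm 1.0
}
```

By contrast, value clipping as described for clip_grad_value is element-wise: each gradient entry is clamped to `[-max_value, max_value]` independently, bounding individual entries rather than the overall norm.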