Structs
- ELU
- ELU activation: x if x > 0, alpha * (exp(x) - 1) otherwise
- GeLU
- GELU activation (Gaussian Error Linear Unit). Used in Transformers (BERT, GPT, etc.)
- LeakyReLU
- LeakyReLU activation: max(negative_slope * x, x)
- Mish
- Mish activation: x * tanh(softplus(x)) = x * tanh(ln(1 + exp(x)))
- ReLU
- ReLU activation: max(0, x)
- SiLU
- SiLU / Swish activation: x * σ(x). Used in modern architectures (EfficientNet, LLaMA, etc.)
- Sigmoid
- Sigmoid activation: 1 / (1 + e^(-x))
- Tanh
- Tanh activation: (e^x - e^(-x)) / (e^x + e^(-x))
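
The structs above correspond to the element-wise formulas given in their one-line descriptions. As a minimal sketch only (it assumes nothing about the crate's constructors or call conventions, and the function names and signatures below are illustrative, not the crate's API), the following implements those same formulas on `f32`; GELU is shown with its widely used tanh approximation.

```rust
// Illustrative sketch: the formulas from the struct descriptions above,
// written as free functions over f32. Names and signatures here are
// assumptions for demonstration, not the crate's actual struct API.

/// Sigmoid: 1 / (1 + e^(-x))
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// ReLU: max(0, x)
fn relu(x: f32) -> f32 {
    x.max(0.0)
}

/// LeakyReLU: max(negative_slope * x, x)
fn leaky_relu(x: f32, negative_slope: f32) -> f32 {
    (negative_slope * x).max(x)
}

/// ELU: x if x > 0, alpha * (exp(x) - 1) otherwise
fn elu(x: f32, alpha: f32) -> f32 {
    if x > 0.0 { x } else { alpha * (x.exp() - 1.0) }
}

/// GELU: x * Φ(x), shown here with the common tanh approximation.
fn gelu(x: f32) -> f32 {
    let c = (2.0_f32 / std::f32::consts::PI).sqrt();
    0.5 * x * (1.0 + (c * (x + 0.044_715 * x * x * x)).tanh())
}

/// SiLU / Swish: x * σ(x)
fn silu(x: f32) -> f32 {
    x * sigmoid(x)
}

/// Mish: x * tanh(softplus(x)) = x * tanh(ln(1 + exp(x)))
fn mish(x: f32) -> f32 {
    x * x.exp().ln_1p().tanh()
}

fn main() {
    // Compare the activations at a few sample inputs; Tanh is f32::tanh.
    for &x in &[-2.0_f32, -0.5, 0.0, 0.5, 2.0] {
        println!(
            "x={x:+.1} relu={:.3} leaky={:.3} elu={:.3} silu={:.3} gelu={:.3} mish={:.3} tanh={:.3}",
            relu(x),
            leaky_relu(x, 0.01),
            elu(x, 1.0),
            silu(x),
            gelu(x),
            mish(x),
            x.tanh(),
        );
    }
}
```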