pub struct TransformerBlock<B>
where
    B: Backend,
{ /* private fields */ }
A single Transformer block (pre-norm style).
Contains:
- Self-attention with multi-head attention
- Feed-forward network (two linear layers with GELU)
- Two layer normalizations
- Residual connections around both sub-layers
Implementations§
impl<B> TransformerBlock<B>
where
    B: Backend,
pub fn new(
    d_model: usize,
    num_heads: usize,
    d_ff: usize,
    causal: bool,
    dtype: DType,
    device: &<B as Backend>::Device,
) -> Result<TransformerBlock<B>, Error>
Create a new TransformerBlock.
§Arguments
- d_model: model dimension (embedding size)
- num_heads: number of attention heads
- d_ff: feed-forward inner dimension (typically 4 * d_model)
- causal: whether to use a causal (autoregressive) attention mask
- dtype: data type
- device: compute device
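To make the num_heads and causal arguments concrete, here is a minimal, crate-independent sketch. It assumes the common convention that d_model is split evenly across heads, and that a causal mask lets position i attend only to positions j <= i. The helper names head_dim and causal_mask are hypothetical and are not part of this crate; how new actually validates its arguments is not documented here.

```rust
/// Per-head dimension under the common convention that `d_model` is split
/// evenly across heads. Returns `None` when the split is not even.
/// Assumption: hypothetical helper, not this crate's actual validation logic.
fn head_dim(d_model: usize, num_heads: usize) -> Option<usize> {
    if num_heads == 0 || d_model % num_heads != 0 {
        None
    } else {
        Some(d_model / num_heads)
    }
}

/// Boolean causal mask for a sequence of length `seq_len`.
/// `mask[i][j]` is true when query position `i` may attend to key position `j`,
/// which for autoregressive attention means `j <= i` (lower triangle).
/// Assumption: hypothetical helper; the crate's internal mask layout may differ.
fn causal_mask(seq_len: usize) -> Vec<Vec<bool>> {
    (0..seq_len)
        .map(|i| (0..seq_len).map(|j| j <= i).collect())
        .collect()
}
```

With d_model = 512 and num_heads = 8, head_dim gives Some(64). causal_mask(3) is true on and below the diagonal and false above it.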
pub fn d_model(&self) -> usize
Trait Implementations§
impl<B> Module<B> for TransformerBlock<B>
where
    B: Backend,
fn forward(&self, x: &Tensor<B>) -> Result<Tensor<B>, Error>
Forward pass (pre-norm):
x = x + attention(layernorm(x))
x = x + ffn(layernorm(x))
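The two pre-norm update steps can be sketched in plain Rust without this crate's tensor types. This is an illustration under stated assumptions, not the crate's implementation: layer_norm omits the learned scale and shift, the sub-layers are passed in as closures, and everything operates on a single feature vector for brevity (real attention mixes information across sequence positions). The names layer_norm and pre_norm_block are hypothetical.

```rust
/// Layer normalization over one feature vector.
/// Assumption: no learned scale/shift, which a real layer norm usually has.
fn layer_norm(x: &[f32], eps: f32) -> Vec<f32> {
    let n = x.len() as f32;
    let mean = x.iter().sum::<f32>() / n;
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
    let inv_std = 1.0 / (var + eps).sqrt();
    x.iter().map(|v| (v - mean) * inv_std).collect()
}

/// Pre-norm residual block over one feature vector.
/// `attention` and `ffn` are stand-in closures for the two sub-layers.
fn pre_norm_block<A, F>(x: &[f32], attention: A, ffn: F) -> Vec<f32>
where
    A: Fn(&[f32]) -> Vec<f32>,
    F: Fn(&[f32]) -> Vec<f32>,
{
    let eps = 1e-5;
    // Step 1: x = x + attention(layernorm(x))
    let a = attention(&layer_norm(x, eps));
    let x1: Vec<f32> = x.iter().zip(&a).map(|(xi, ai)| xi + ai).collect();
    // Step 2: x = x + ffn(layernorm(x)), using the updated x from step 1
    let f = ffn(&layer_norm(&x1, eps));
    x1.iter().zip(&f).map(|(xi, fi)| xi + fi).collect()
}
```

Because normalization is applied inside each residual branch, the skip path carries x through unchanged: if both sub-layers return zeros, the block is the identity.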
fn parameters(&self) -> Vec<Tensor<B>>
Return all trainable parameters of this module.
The optimizer uses these to update weights during training.
fn named_parameters(&self) -> Vec<(String, Tensor<B>)>
Return all trainable parameters with human-readable names.
fn set_training(&self, _training: bool)
Set training or evaluation mode.
fn is_training(&self) -> bool
Whether the module is in training mode (default: true).
fn num_parameters(&self) -> usize
Total number of scalar parameters in this module.
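As a rough illustration of what a scalar parameter count means, the sketch below sums the element counts of a list of tensor shapes. The function count_scalars and the example shapes are hypothetical: they assume a feed-forward sub-layer made of two biased linear layers, which this page does not confirm for the actual block.

```rust
/// Sum of element counts over a list of tensor shapes.
/// Each shape contributes the product of its dimensions.
/// Assumption: hypothetical helper, not this crate's `num_parameters`.
fn count_scalars(shapes: &[Vec<usize>]) -> usize {
    shapes
        .iter()
        .map(|shape| shape.iter().product::<usize>())
        .sum()
}

/// Shapes of a hypothetical feed-forward sub-layer with biases:
/// weight [d_model, d_ff], bias [d_ff], weight [d_ff, d_model], bias [d_model].
/// Assumption: the real block's parameter layout may differ.
fn ffn_shapes(d_model: usize, d_ff: usize) -> Vec<Vec<usize>> {
    vec![
        vec![d_model, d_ff],
        vec![d_ff],
        vec![d_ff, d_model],
        vec![d_model],
    ]
}
```

For d_model = 4 and d_ff = 16, this hypothetical layout has 64 + 16 + 64 + 4 = 148 scalars.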
fn trainable_params_count(&self) -> usize
Number of trainable (variable) parameters.
Auto Trait Implementations§
impl<B> Freeze for TransformerBlock<B>
impl<B> RefUnwindSafe for TransformerBlock<B>
impl<B> Send for TransformerBlock<B>
impl<B> Sync for TransformerBlock<B>
impl<B> Unpin for TransformerBlock<B>
impl<B> UnwindSafe for TransformerBlock<B>
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.