Struct MultiHeadAttention

pub struct MultiHeadAttention<B>
where
    B: Backend,
{ /* private fields */ }

Multi-Head Self-Attention module.

§Examples

let attn = MultiHeadAttention::<CpuBackend>::new(512, 8, DType::F64, &dev)?;
let x = CpuTensor::rand((2, 10, 512), DType::F64, &dev)?;
let y = attn.forward(&x)?; // [2, 10, 512]

Implementations§

impl<B> MultiHeadAttention<B>
where B: Backend,

pub fn new(
    d_model: usize,
    num_heads: usize,
    dtype: DType,
    device: &B::Device,
) -> Result<MultiHeadAttention<B>, Error>

Create a new Multi-Head Attention module.

§Arguments

d_model: total model dimension (must be divisible by num_heads)
num_heads: number of attention heads
dtype: data type for parameters
device: device to create parameters on
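
The divisibility requirement on d_model can be illustrated with a minimal sketch. It assumes the per-head width is d_model / num_heads, which is the usual multi-head convention but is not confirmed by this page. The helper name `split_head_dim` and its `String` error type are hypothetical and not part of this crate.

```rust
/// Hypothetical helper (not part of this crate): computes the per-head
/// dimension, assuming it is defined as d_model / num_heads.
fn split_head_dim(d_model: usize, num_heads: usize) -> Result<usize, String> {
    // A zero head count would divide by zero, so reject it first.
    if num_heads == 0 {
        return Err("num_heads must be greater than zero".to_string());
    }
    // Every head must receive the same number of features.
    if d_model % num_heads != 0 {
        return Err(format!(
            "d_model ({}) must be divisible by num_heads ({})",
            d_model, num_heads
        ));
    }
    Ok(d_model / num_heads)
}
```

Under that assumption, the values from the example above (d_model = 512, num_heads = 8) would give a head dimension of 64, while num_heads = 7 would be rejected.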

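pub fn with_causal(self, causal: bool) -> MultiHeadAttention<B>

Enable or disable causal (autoregressive) masking. With masking enabled, each position attends only to itself and earlier positions. Builder-style: consumes the module and returns it with the setting applied.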
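
One common way to realize causal masking is an additive mask applied to the attention scores before the softmax. The sketch below shows that general technique on plain vectors. It is not taken from this crate's source, and the function name `causal_mask` is hypothetical.

```rust
/// Hypothetical illustration (not this crate's implementation): builds an
/// additive causal mask of shape [seq_len, seq_len].
/// Entry (i, j) is 0.0 when j <= i (allowed) and -inf when j > i (blocked),
/// so adding it to the scores before softmax gives future positions zero weight.
fn causal_mask(seq_len: usize) -> Vec<Vec<f64>> {
    (0..seq_len)
        .map(|i| {
            (0..seq_len)
                .map(|j| if j <= i { 0.0 } else { f64::NEG_INFINITY })
                .collect()
        })
        .collect()
}
```

For seq_len = 3, the first row blocks positions 1 and 2, and the last row blocks nothing.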

pub fn num_heads(&self) -> usize

pub fn d_model(&self) -> usize

pub fn head_dim(&self) -> usize

Trait Implementations§

impl<B> Module<B> for MultiHeadAttention<B>
where B: Backend,

fn forward(&self, x: &Tensor<B>) -> Result<Tensor<B>, Error>

Forward pass: self-attention on input x.

Input: [batch, seq_len, d_model]
Output: [batch, seq_len, d_model]
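
To make the forward pass more concrete, here is a minimal sketch of single-head scaled dot-product attention on plain vectors. It is a simplification under stated assumptions, not this crate's implementation: it handles one sequence (no batch dimension) and one head, and it omits the learned query, key, value, and output projections. The names `softmax` and `attention` are hypothetical.

```rust
/// Numerically stable softmax over one row of scores.
fn softmax(row: &[f64]) -> Vec<f64> {
    let max = row.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = row.iter().map(|v| (v - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Hypothetical single-head scaled dot-product attention for one sequence.
/// q, k, v: [seq_len, head_dim]; returns [seq_len, head_dim].
/// When `causal` is true, position i attends only to positions j <= i.
fn attention(q: &[Vec<f64>], k: &[Vec<f64>], v: &[Vec<f64>], causal: bool) -> Vec<Vec<f64>> {
    let head_dim = q[0].len();
    let scale = 1.0 / (head_dim as f64).sqrt();
    q.iter()
        .enumerate()
        .map(|(i, qi)| {
            // Scores of query i against every key, scaled by 1/sqrt(head_dim).
            let scores: Vec<f64> = k
                .iter()
                .enumerate()
                .map(|(j, kj)| {
                    if causal && j > i {
                        f64::NEG_INFINITY // blocked future position
                    } else {
                        qi.iter().zip(kj).map(|(a, b)| a * b).sum::<f64>() * scale
                    }
                })
                .collect();
            let weights = softmax(&scores);
            // Output row i is the attention-weighted sum of the value rows.
            (0..v[0].len())
                .map(|d| weights.iter().zip(v).map(|(w, vj)| w * vj[d]).sum::<f64>())
                .collect()
        })
        .collect()
}
```

The output has the same shape as the input, which matches the shape contract stated above. A full multi-head module would additionally project the input, split it into heads, run this step per head, and merge the results.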

fn parameters(&self) -> Vec<Tensor<B>>

Return all trainable parameters of this module. The optimizer uses these to update weights during training.

fn named_parameters(&self) -> Vec<(String, Tensor<B>)>

Return all trainable parameters with human-readable names.

fn set_training(&self, _training: bool)

Set training or evaluation mode.

fn is_training(&self) -> bool

Whether the module is in training mode (default: true).

fn train(&self)

Convenience: set training mode.

fn eval(&self)

Convenience: set evaluation mode.

fn num_parameters(&self) -> usize

Total number of scalar parameters in this module.
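
As a rough sanity check on what this count might look like, the sketch below assumes the conventional layout of four square projections (query, key, value, output), each d_model x d_model, with an optional bias vector per projection. Whether this module uses that layout or includes biases is not stated on this page, so treat the function `mha_param_count` as a hypothetical estimate only.

```rust
/// Hypothetical estimate (layout not confirmed by this page): assumes four
/// square projections (Q, K, V, output), each d_model x d_model, plus one
/// bias vector of length d_model per projection when `bias` is true.
fn mha_param_count(d_model: usize, bias: bool) -> usize {
    let weights = 4 * d_model * d_model;
    let biases = if bias { 4 * d_model } else { 0 };
    weights + biases
}
```

Under these assumptions, d_model = 512 gives 1,048,576 scalars without biases and 1,050,624 with them. The head count does not change the total in this layout.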

fn trainable_params_count(&self) -> usize

Number of trainable (variable) parameters.

fn frozen_parameters(&self) -> Vec<Tensor<B>>

Freeze all parameters: returns new parameter tensors with is_variable = false, preventing gradient accumulation.

fn state_dict(&self) -> Vec<(String, Tensor<B>)>

Returns a state_dict-style map of parameter name → tensor.

Auto Trait Implementations§

impl<B> UnwindSafe for MultiHeadAttention<B>
where <B as Backend>::Device: RefUnwindSafe,

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.