Crate shrew

Expand description

§Shrew

A deep learning library built from scratch in Rust.

This is the top-level facade crate that re-exports everything you need.

use shrew::prelude::*;

Crate	Purpose
`shrew-core`	Tensor, Shape, DType, Layout, Backend trait, Autograd
`shrew-cpu`	CPU backend with SIMD matmul and rayon parallelism
`shrew-nn`	Neural network layers (Linear, Conv2d, RNN, LSTM, Transformer, etc.)
`shrew-optim`	Optimizers (SGD, Adam, AdamW, RAdam, RMSProp), LR schedulers, EMA
`shrew-data`	Dataset, DataLoader, MNIST, transforms
`shrew-cuda`	CUDA GPU backend (feature-gated)
`shrew-ir`	`.sw` IR format: lexer, parser, AST, Graph IR

checkpoint: Checkpoint — save and load model parameters.
distributed: Distributed training — DataParallel, MixedPrecision, Pipeline, gradient sync.
exec: Graph executor — runs .sw programs on the tensor runtime.
ir: Re-export the .sw IR parser and AST.
nn: Re-export neural network modules.
onnx: ONNX — Import/Export for interoperability with other frameworks.
optim: Re-export optimizers.
prelude: Prelude: import this for the most common types.
profiler: Profiling & Benchmarking — op-level timing, memory tracking, model summaries.
quantize: Quantization — INT8/INT4 post-training quantization for inference.
safetensors: Safetensors — interoperable tensor serialization (HuggingFace format).

CpuBackend: Re-export CPU backend. CPU backend. Implements Backend by running operations on CPU via iterators.
CpuDevice: Re-export CPU backend. The CPU device. Since every machine has exactly one CPU (from our perspective), this is a zero-sized type.
CudaBackend: Re-export CUDA backend (requires cuda feature + NVIDIA CUDA Toolkit). The CUDA GPU backend. This is a zero-sized marker type.
CudaDevice: Re-export CUDA backend (requires cuda feature + NVIDIA CUDA Toolkit). A CUDA device handle. Contains the cudarc device and a cuBLAS handle for matrix multiplication. Clonable (uses Arc internally).
GradStore: Re-export core types. Stores gradients for all tensors in a computation graph.
Layout: Re-export core types. Layout describes how a tensor’s logical shape maps to flat storage.
Shape: Re-export core types. N-dimensional shape of a tensor.
Tensor: Re-export core types. An n-dimensional array of numbers on a specific backend.
TensorId: Re-export core types. Unique identifier for a tensor. Used as keys in GradStore.

BinaryOp: Re-export core types. Element-wise binary operations.
CmpOp: Re-export core types. Comparison operations (produce boolean tensors).
CpuStorage: Re-export CPU backend. Storage of tensor data in CPU memory.
CudaStorage: Re-export CUDA backend (requires cuda feature + NVIDIA CUDA Toolkit). GPU-side storage. Each variant wraps a cudarc CudaSlice for the corresponding dtype. F16 and BF16 are stored as CudaSlice<u16> (bit-level representation).
DType: Re-export core types. Enum of all supported element data types.
Error: Re-export core types. All errors that can occur within Shrew.
Op: Re-export core types. Records the operation that produced a tensor, storing references to inputs.
ReduceOp: Re-export core types. Reduction operations along dimension(s).
UnaryOp: Re-export core types. Element-wise unary operations.

Backend: Re-export core types. The main Backend trait. Implementing this for a struct (e.g., CpuBackend) makes that struct a complete compute backend for Shrew.
BackendDevice: Re-export core types. Identifies a compute device (e.g., “CPU”, “CUDA:0”, “CUDA:1”).
BackendStorage: Re-export core types. A storage buffer that holds tensor data on a specific device.
WithDType: Re-export core types. Trait implemented by Rust types that can be stored in a tensor.

CpuTensor: Re-export CPU backend. Convenience type alias for CPU tensors.
CudaTensor: Re-export CUDA backend (requires cuda feature + NVIDIA CUDA Toolkit). Convenience type alias for CUDA tensors.
Result: Re-export core types. Convenience Result type used throughout Shrew.