# Neural Networks
This chapter demonstrates Trueno's neural network primitives using the `activation_functions` example.
## Running the Example

```bash
cargo run --example activation_functions
```
## Activation Functions
Trueno provides 11 activation functions commonly used in neural networks:
### Basic Activations
```rust
use trueno::Vector;

let x = Vector::from_slice(&[0.5, -0.2, 1.2, -0.8, 2.1]);

// ReLU - Rectified Linear Unit
let relu = x.relu()?; // max(0, x)

// Sigmoid - Logistic function
let sigmoid = x.sigmoid()?; // 1 / (1 + exp(-x))

// Tanh - Hyperbolic tangent
let tanh_result = x.tanh_activation()?; // (exp(x) - exp(-x)) / (exp(x) + exp(-x))
```
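For the sample input above, ReLU zeroes the negative entries: `[0.5, -0.2, 1.2, -0.8, 2.1]` becomes `[0.5, 0.0, 1.2, 0.0, 2.1]`. Each call returns a new `Vector` wrapped in a `Result`, so activations chain naturally:

```rust
// ReLU followed by sigmoid, propagating any error with `?`
let hidden = x.relu()?.sigmoid()?;
```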
### Advanced Activations
```rust
// GELU - Gaussian Error Linear Unit (Transformer default)
let gelu = x.gelu()?;

// Swish/SiLU - x * sigmoid(x) (EfficientNet)
let swish = x.swish()?;

// Mish - x * tanh(softplus(x)) (YOLOv4)
let mish = x.mish()?;

// SELU - Self-Normalizing ELU
let selu = x.selu()?;

// Hardswish - Efficient approximation of Swish (MobileNetV3)
let hardswish = x.hardswish()?;

// Softplus - Smooth ReLU approximation
let softplus = x.softplus()?;

// ELU - Exponential Linear Unit
let elu = x.elu(1.0)?; // alpha = 1.0

// Leaky ReLU - ReLU with a small negative slope
let leaky = x.leaky_relu(0.01)?; // alpha = 0.01
```
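Swish is defined as `x * sigmoid(x)`, so it can be cross-checked by composing operations already shown in this chapter; a small sanity-check sketch:

```rust
// swish(x) == x * sigmoid(x), element-wise
let swish_manual = x.mul(&x.sigmoid()?)?;
// swish_manual should match `swish` above, up to float rounding
```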
### Softmax (Probability Distribution)
```rust
let logits = Vector::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0]);
let probs = logits.softmax()?;
// Properties:
// - All values in [0, 1]
// - Sum = 1.0
```
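The sum-to-one property is easy to verify with a dot product against a ones vector, using only operations that appear elsewhere in this chapter:

```rust
// Sum the probabilities via a dot product with a ones vector
let ones = Vector::from_slice(&[1.0; 5]);
let total = probs.dot(&ones)?;
assert!((total - 1.0).abs() < 1e-6);
```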
## When to Use Each Activation
| Network Type | Recommended Activation | Example |
|---|---|---|
| CNN | ReLU | ResNet, VGG |
| Transformer | GELU | BERT, GPT |
| EfficientNet | Swish | EfficientNet-B0 to B7 |
| MobileNet | Hardswish | MobileNetV3 |
| Object Detection | Mish | YOLOv4 |
| Self-Normalizing | SELU | Deep autoencoders |
| Output Layer (classification) | Softmax | Most classifiers |
| Output Layer (regression) | None (linear) | Regression tasks |
## Building a Simple MLP
```rust
use trueno::{Matrix, TruenoError, Vector};

fn mlp_forward(
    input: &Vector,
    weights1: &Matrix,
    bias1: &Vector,
    weights2: &Matrix,
    bias2: &Vector,
) -> Result<Vector, TruenoError> {
    // Layer 1: Linear + ReLU
    let h1 = weights1.matvec(input)?;
    let h1 = h1.add(bias1)?;
    let h1 = h1.relu()?;

    // Layer 2: Linear + Softmax
    let h2 = weights2.matvec(&h1)?;
    let h2 = h2.add(bias2)?;
    h2.softmax()
}
```
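A sketch of calling the forward pass is shown below. `Matrix::from_vec(rows, cols, data)` is an assumed constructor used purely for illustration; substitute Trueno's actual `Matrix` construction API.

```rust
// Hypothetical 4-3-2 network; Matrix::from_vec is an assumed constructor.
let input = Vector::from_slice(&[0.5, -0.2, 1.2, -0.8]);
let w1 = Matrix::from_vec(3, 4, vec![0.1; 12]); // 3x4 layer-1 weights (assumed API)
let b1 = Vector::from_slice(&[0.0; 3]);
let w2 = Matrix::from_vec(2, 3, vec![0.1; 6]); // 2x3 layer-2 weights (assumed API)
let b2 = Vector::from_slice(&[0.0; 2]);
let probs = mlp_forward(&input, &w1, &b1, &w2, &b2)?;
// probs sums to 1.0 thanks to the softmax output layer
```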
## Transformer Building Blocks
### Layer Normalization
```rust
fn layer_norm(x: &Vector, gamma: &Vector, beta: &Vector) -> Result<Vector, TruenoError> {
    let mean = x.mean()?;
    let centered = x.sub_scalar(mean)?;
    let var = centered.dot(&centered)? / x.len() as f32;
    let std = (var + 1e-5).sqrt();
    let normalized = centered.mul_scalar(1.0 / std)?;
    let scaled = normalized.mul(gamma)?;
    scaled.add(beta)
}
```
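With `gamma` set to ones and `beta` to zeros, `layer_norm` reduces to plain standardization, which makes it easy to try out:

```rust
// Identity scale and zero shift: the output is the standardized input
let x = Vector::from_slice(&[1.0, 2.0, 3.0, 4.0]);
let gamma = Vector::from_slice(&[1.0; 4]);
let beta = Vector::from_slice(&[0.0; 4]);
let normalized = layer_norm(&x, &gamma, &beta)?;
// normalized now has mean ~0 and variance ~1
```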
### Attention Scores
```rust
fn attention_scores(query: &Vector, key: &Vector) -> Result<f32, TruenoError> {
    let d_k = query.len() as f32;
    let score = query.dot(key)?;
    Ok(score / d_k.sqrt())
}
```
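Applying `attention_scores` across a set of keys and then softmaxing the scores yields the attention weights; a minimal sketch built from the operations shown above:

```rust
fn attention_weights(query: &Vector, keys: &[Vector]) -> Result<Vector, TruenoError> {
    // One scaled dot-product score per key
    let scores = keys
        .iter()
        .map(|key| attention_scores(query, key))
        .collect::<Result<Vec<f32>, _>>()?;
    // Softmax turns the scores into a probability distribution
    Vector::from_slice(&scores).softmax()
}
```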
## Performance Tips
### Batching for Efficiency
```rust
// Process multiple samples together
let batch: Vec<Vector> = inputs
    .iter()
    .map(|x| x.relu().unwrap())
    .collect();
```
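The `unwrap` above panics on the first failure; collecting into a `Result` propagates errors to the caller instead (standard Rust iterator pattern):

```rust
// Propagate errors instead of panicking
let batch: Vec<Vector> = inputs
    .iter()
    .map(|x| x.relu())
    .collect::<Result<Vec<_>, _>>()?;
```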
### Fused Operations
```rust
// Fusing reduces memory bandwidth.
// Instead of:
let h = x.relu()?.mul_scalar(scale)?;
// prefer folding the scale into the weights, as sketched below.
```
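Folding the scale into the weights works because scaling commutes with ReLU for positive factors: `relu(s * x) == s * relu(x)` for `s > 0`. A sketch of the commuted form, assuming the same `Vector` API as above:

```rust
// For scale > 0 these are equivalent, so the multiply can move
// before the activation and ultimately into the weight matrix:
let fused = x.mul_scalar(scale)?.relu()?;
// fused == x.relu()?.mul_scalar(scale)?  (element-wise)
```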
### GPU Acceleration

For large batch sizes, use the GPU backend:

```bash
cargo run --release --features gpu --example activation_functions
```
## Fused Bias + Activation (GPU PTX)

For GPU inference, trueno-gpu provides a fused kernel that applies bias addition and the activation function in a single pass:
```rust
use trueno_gpu::kernels::{BiasActivationKernel, Kernel};

// Bias + GELU (common in Transformers)
let kernel = BiasActivationKernel::new(4096, 256).with_gelu();

// Bias + ReLU (common in CNNs)
let kernel = BiasActivationKernel::new(4096, 256).with_relu();

let ptx = kernel.emit_ptx();
```
This is typically used as an epilogue after GEMM operations, reducing memory bandwidth by avoiding intermediate writes.
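On the CPU side the unfused equivalent takes two passes over the activations, while the fused kernel computes the same `y = gelu(x + b)` in one. A reference sketch using the `Vector` API from earlier sections, with `x` as the GEMM output and `bias` as the bias vector:

```rust
// Unfused CPU reference: bias add, then GELU (two separate passes)
let y = x.add(&bias)?.gelu()?;
```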
```bash
cargo run -p trueno-gpu --example bias_activation
```
## See Also
- Activation Functions Example - Full source
- ML Similarity - k-NN example
- Performance Demo - SIMD speedups