Full Optimization Pipeline
CLI Equivalent: apr finetune ... && apr prune ... && apr distill ... && apr merge ... && apr quantize ...
What This Demonstrates
Composes the entire optimization pipeline in a single example: LoRA fine-tuning, magnitude pruning, KL distillation, SLERP merge, and 4-bit quantization applied sequentially to produce a smaller, faster model.
Run
cargo run --example optimize_full_pipeline
Key APIs
LoRALayer::new(base, d_out, d_in, rank, alpha) -- fine-tune with low-rank adapters
prune_magnitude(tensor, sparsity) -- remove small weights
DistillationLoss::new(temp, alpha).forward(...) -- transfer knowledge from teacher
slerp_merge(&m1, &m2, &SlerpConfig::new(t)) -- interpolate two models
Quantization::Int4 -- quantize to 4-bit integers
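To make three of these stages concrete, here is a standalone sketch of magnitude pruning, SLERP merging, and symmetric 4-bit quantization over plain f32 slices. These are illustrative re-implementations of the general techniques, not the crate's actual code; the function names mirror the APIs above but the signatures here are simplified assumptions.

```rust
/// Magnitude pruning: zero out roughly the `sparsity` fraction of
/// weights with the smallest absolute value.
fn prune_magnitude(weights: &mut [f32], sparsity: f32) {
    let mut mags: Vec<f32> = weights.iter().map(|w| w.abs()).collect();
    mags.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let k = ((weights.len() as f32) * sparsity) as usize;
    if k == 0 {
        return;
    }
    let threshold = mags[k - 1];
    for w in weights.iter_mut() {
        if w.abs() <= threshold {
            *w = 0.0;
        }
    }
}

/// SLERP between two weight vectors at interpolation factor `t`.
/// Falls back to linear interpolation when the vectors are nearly parallel.
fn slerp_merge(a: &[f32], b: &[f32], t: f32) -> Vec<f32> {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    let cos = (dot / (na * nb)).clamp(-1.0, 1.0);
    let theta = cos.acos();
    if theta.abs() < 1e-4 {
        // Nearly parallel: plain lerp avoids division by sin(theta) ~ 0.
        return a.iter().zip(b).map(|(x, y)| x + t * (y - x)).collect();
    }
    let wa = ((1.0 - t) * theta).sin() / theta.sin();
    let wb = (t * theta).sin() / theta.sin();
    a.iter().zip(b).map(|(x, y)| wa * x + wb * y).collect()
}

/// Symmetric 4-bit quantization: map each weight to an integer in
/// [-8, 7] with a single per-tensor scale.
fn quantize_int4(weights: &[f32]) -> (Vec<i8>, f32) {
    let max = weights.iter().fold(0.0f32, |m, w| m.max(w.abs()));
    let scale = if max > 0.0 { max / 7.0 } else { 1.0 };
    let q = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-8.0, 7.0) as i8)
        .collect();
    (q, scale)
}

fn main() {
    let mut w = vec![0.9, -0.05, 0.4, 0.01, -0.7, 0.2];
    prune_magnitude(&mut w, 0.5); // zeros the smallest-magnitude half
    let merged = slerp_merge(&w, &[0.5; 6], 0.5);
    let (q, scale) = quantize_int4(&merged);
    println!("pruned:    {:?}", w);
    println!("quantized: {:?} (scale {})", q, scale);
}
```

The same shape applies in the real pipeline: each stage consumes the previous stage's weights, so pruning runs on the fine-tuned model and quantization runs last, on the merged model.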