Structured Width Pruning
CLI Equivalent: apr prune --method structured --width-ratio 0.75 model.apr
What This Demonstrates
This example shows structured width pruning (Minitron-style), which removes entire neurons/channels rather than individual weights. The result is genuinely smaller dense weight matrices, which run faster on ordinary dense kernels and need no sparse tensor support.
Run
cargo run --example prune_structured
Key APIs
prune_structured(tensor, width_ratio) -- remove lowest-importance columns
importance_score(tensor, axis) -- rank neurons by L2 norm
reshape_layers(model, new_width) -- adjust downstream layers for reduced width
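
The three steps above (score neurons, drop the weakest columns, shrink the downstream layer to match) can be sketched in plain Rust using only the standard library. This is an illustrative sketch, not the crate's implementation: the Matrix type and the helper names column_l2_norms, columns_to_keep, select_columns, select_rows and prune_width are hypothetical. It also assumes that each column of the first weight matrix is one neuron, that the downstream matrix has one row per neuron, and that the number of kept neurons is width_ratio times the width, rounded to the nearest integer.

```rust
/// Hypothetical row-major dense matrix (`rows` x `cols`), for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f32>,
}

impl Matrix {
    fn new(rows: usize, cols: usize, data: Vec<f32>) -> Self {
        assert_eq!(data.len(), rows * cols, "data length must equal rows * cols");
        Matrix { rows, cols, data }
    }

    fn get(&self, r: usize, c: usize) -> f32 {
        self.data[r * self.cols + c]
    }
}

/// Importance score per neuron: the L2 norm of each column.
fn column_l2_norms(m: &Matrix) -> Vec<f32> {
    (0..m.cols)
        .map(|c| (0..m.rows).map(|r| m.get(r, c).powi(2)).sum::<f32>().sqrt())
        .collect()
}

/// Indices of the highest-scoring columns, returned in ascending order.
/// Assumption: keep round(width_ratio * width) columns, and at least one.
fn columns_to_keep(scores: &[f32], width_ratio: f32) -> Vec<usize> {
    assert!(width_ratio > 0.0 && width_ratio <= 1.0, "width_ratio must be in (0, 1]");
    if scores.is_empty() {
        return Vec::new();
    }
    let keep = ((scores.len() as f32 * width_ratio).round() as usize).clamp(1, scores.len());
    let mut idx: Vec<usize> = (0..scores.len()).collect();
    // Stable descending sort by score; ties keep the lower index first.
    idx.sort_by(|&a, &b| scores[b].total_cmp(&scores[a]));
    idx.truncate(keep);
    idx.sort_unstable();
    idx
}

/// Copy only the selected columns into a new, narrower dense matrix.
fn select_columns(m: &Matrix, keep: &[usize]) -> Matrix {
    let mut data = Vec::with_capacity(m.rows * keep.len());
    for r in 0..m.rows {
        for &c in keep {
            data.push(m.get(r, c));
        }
    }
    Matrix::new(m.rows, keep.len(), data)
}

/// Copy only the selected rows into a new, shorter dense matrix.
fn select_rows(m: &Matrix, keep: &[usize]) -> Matrix {
    let mut data = Vec::with_capacity(keep.len() * m.cols);
    for &r in keep {
        data.extend_from_slice(&m.data[r * m.cols..(r + 1) * m.cols]);
    }
    Matrix::new(keep.len(), m.cols, data)
}

/// Prune the neurons of `w1` (its columns) and drop the matching rows of the
/// downstream matrix `w2`, so the two layers still compose.
fn prune_width(w1: &Matrix, w2: &Matrix, width_ratio: f32) -> (Matrix, Matrix) {
    assert_eq!(w1.cols, w2.rows, "w2 must have one row per neuron of w1");
    let keep = columns_to_keep(&column_l2_norms(w1), width_ratio);
    (select_columns(w1, &keep), select_rows(w2, &keep))
}
```

With a 2x4 first layer and a width ratio of 0.75, prune_width keeps the three columns with the largest L2 norm and returns a 2x3 matrix plus a downstream matrix with three rows. Both outputs are ordinary dense matrices, which is why no sparse kernel is needed afterwards.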