# Category R: Optimize

Model optimization recipes covering the full `apr` CLI optimization surface: fine-tuning, pruning, distillation, merging, and quantization. These examples mirror the subcommands available in `apr finetune`, `apr prune`, `apr distill`, `apr merge`, and `apr quantize`.

## Full Pipeline

| Recipe | Example | Description |
|---|---|---|
| Full Pipeline | `optimize_full_pipeline` | Composed finetune, prune, distill, merge, quantize pipeline |

## Fine-Tuning (`apr finetune`)

| Recipe | Example | Description |
|---|---|---|
| LoRA Fine-Tuning | `finetune_lora` | LoRA adapter training with rank/alpha control |
| QLoRA Fine-Tuning | `finetune_qlora` | Quantized LoRA for memory-efficient fine-tuning |
| Merge Adapter | `finetune_merge_adapter` | Merge and unmerge LoRA adapters with a base model |
| Plan VRAM | `finetune_plan_vram` | VRAM estimation and memory planning |
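The core idea behind the LoRA recipes is that a frozen base weight `W` is augmented by a low-rank product `B @ A` scaled by `alpha / rank`. A minimal pure-Python sketch of that update (illustrative only, with hypothetical helper names — not `apr`'s implementation):

```python
# Illustrative LoRA merge: W' = W + (alpha / rank) * B @ A.
# A is (rank x d_in), B is (d_out x rank); both are trainable
# while W stays frozen during fine-tuning.

def matmul(a, b):
    """Naive matrix multiply for small illustration matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_merge(w, a, b, rank, alpha):
    """Return the merged weight W + (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(b, a)  # (d_out x rank) @ (rank x d_in)
    return [[w[i][j] + scale * delta[i][j]
             for j in range(len(w[0]))] for i in range(len(w))]

# 2x2 base weight with a rank-1 adapter
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]           # rank x d_in
B = [[0.5], [0.25]]        # d_out x rank
merged = lora_merge(W, A, B, rank=1, alpha=2.0)
```

Because the update is a separate low-rank product, merging is reversible: subtracting the same scaled `B @ A` recovers the base weight, which is what an unmerge step relies on.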

## Pruning (`apr prune`)

| Recipe | Example | Description |
|---|---|---|
| Magnitude Pruning | `prune_magnitude` | Magnitude-based unstructured weight pruning |
| Structured Pruning | `prune_structured` | Width pruning (Minitron-style) |
| Depth Pruning | `prune_depth` | Layer removal (Minitron-style) |
| Wanda Pruning | `prune_wanda` | Pruning with calibration data (Wanda method) |
| Gradual Schedule | `prune_gradual_schedule` | Cubic and gradual pruning schedules |
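Unstructured magnitude pruning, the simplest of the methods above, zeros the fraction of weights with the smallest absolute value. A minimal sketch of the idea (hypothetical helper name, not `apr`'s code):

```python
# Magnitude pruning sketch: drop the `sparsity` fraction of
# weights with the smallest |w|, keeping the rest unchanged.

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of `weights`."""
    n_prune = int(len(weights) * sparsity)
    # Indices of the n_prune smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
pruned = magnitude_prune(w, sparsity=0.5)
```

The structured and Wanda variants differ in the scoring rule (whole rows/layers vs. per-weight, and weight-only vs. weight-times-activation from calibration data), but the select-and-zero loop has the same shape.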

## Distillation (`apr distill`)

| Recipe | Example | Description |
|---|---|---|
| Standard KL | `distill_standard_kl` | Standard KL-divergence knowledge distillation |
| Progressive | `distill_progressive` | Layer-wise progressive distillation |
| Ensemble | `distill_ensemble` | Multi-teacher ensemble distillation |
| Checkpoint | `distill_checkpoint` | Distillation with checkpoint saving/resuming |
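Standard KL distillation trains the student to match the teacher's temperature-softened output distribution. A self-contained sketch of that loss, following the usual Hinton-style formulation with a T² scale (illustrative; `apr`'s exact loss may differ):

```python
# KL distillation loss sketch: KL(teacher || student) over
# temperature-softened softmax distributions, scaled by T^2.
import math

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from student to teacher distribution, times T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl

loss_same = kl_distill_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
loss_diff = kl_distill_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0])
```

The loss is zero when the student matches the teacher exactly and grows as the distributions diverge; the ensemble recipe averages multiple teacher distributions before the same comparison.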

## Merging (`apr merge`)

| Recipe | Example | Description |
|---|---|---|
| Average Merge | `merge_average` | Uniform average of model weights |
| Weighted Merge | `merge_weighted` | Weighted-average merge with custom ratios |
| SLERP Merge | `merge_slerp` | Spherical linear interpolation merge |
| TIES Merge | `merge_ties` | TIES merge with density parameter |
| DARE Merge | `merge_dare` | DARE merge with drop probability |
| Hierarchical Merge | `merge_hierarchical` | Multi-model hierarchical merge strategy |
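SLERP, unlike a plain weighted average, interpolates along the arc between two weight vectors, preserving their norm geometry. A minimal sketch on flattened weight vectors (assumptions, not `apr`'s implementation):

```python
# SLERP sketch: spherical linear interpolation between two
# flattened weight vectors v0 and v1 at ratio t in [0, 1].
import math

def slerp(v0, v1, t):
    """Interpolate along the great-circle arc from v0 to v1."""
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    cos_theta = max(-1.0, min(1.0, dot / (n0 * n1)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

merged = slerp([1.0, 0.0], [0.0, 1.0], t=0.5)
```

For two orthogonal unit vectors at `t=0.5`, SLERP returns a unit-norm midpoint, whereas a plain average would shrink the norm to √2/2 of the inputs — the main motivation for using it on normalized weight deltas.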

## Quantization (`apr quantize`)

| Recipe | Example | Description |
|---|---|---|
| 4-bit Quantization | `quantize_4bit` | Int4 weight quantization |
| Fake QAT | `quantize_fake_qat` | Fake quantization-aware training |
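As an illustration of what int4 weight quantization involves, here is a sketch of a symmetric per-tensor scheme: floats are mapped to integers in [-8, 7] via a single scale. This is one common choice; `apr`'s actual scheme (per-group scales, zero points, etc.) may differ:

```python
# Symmetric int4 quantization sketch: q = clamp(round(w / scale)),
# with scale chosen so the largest |w| maps near the int4 maximum 7.

def quantize_int4(weights):
    """Return (int4 values in [-8, 7], per-tensor scale)."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from int4 values."""
    return [v * scale for v in q]

w = [0.7, -0.3, 0.1, 0.0]
q, scale = quantize_int4(w)
recon = dequantize(q, scale)
```

"Fake" QAT uses exactly this quantize-then-dequantize round trip inside the forward pass during training, so the model learns weights that survive the rounding error, while the stored weights remain floats until the final export.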