# Category R: Optimize

Model optimization recipes covering the full `apr` CLI optimization surface: fine-tuning, pruning, distillation, merging, and quantization. These examples mirror the subcommands available in `apr finetune`, `apr prune`, `apr distill`, `apr merge`, and `apr quantize`.

## Full Pipeline

| Recipe | Example | Description |
|---|---|---|
| Full Pipeline | `optimize_full_pipeline` | Composed finetune, prune, distill, merge, quantize pipeline |

## Fine-Tuning (`apr finetune`)

| Recipe | Example | Description |
|---|---|---|
| LoRA Fine-Tuning | `finetune_lora` | LoRA adapter training with rank/alpha control |
| QLoRA Fine-Tuning | `finetune_qlora` | Quantized LoRA for memory-efficient fine-tuning |
| Merge Adapter | `finetune_merge_adapter` | Merge and unmerge LoRA adapters with a base model |
| Plan VRAM | `finetune_plan_vram` | VRAM estimation and memory planning |
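The core idea behind the LoRA recipes is that a frozen base weight `W` is augmented by a low-rank product `B @ A` scaled by `alpha / rank`. A minimal pure-Python sketch of that update (illustrative only, with hypothetical helper names — not `apr`'s implementation):

```python
# Illustrative LoRA merge: W' = W + (alpha / rank) * B @ A.
# A is (rank x d_in), B is (d_out x rank); both are trainable
# while W stays frozen during fine-tuning.

def matmul(a, b):
    """Naive matrix multiply for small illustration matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_merge(w, a, b, rank, alpha):
    """Return the merged weight W + (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(b, a)  # (d_out x rank) @ (rank x d_in)
    return [[w[i][j] + scale * delta[i][j]
             for j in range(len(w[0]))] for i in range(len(w))]

# 2x2 base weight with a rank-1 adapter
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]           # rank x d_in
B = [[0.5], [0.25]]        # d_out x rank
merged = lora_merge(W, A, B, rank=1, alpha=2.0)
```

Because the update is a separate low-rank product, merging is reversible: subtracting the same scaled `B @ A` recovers the base weight, which is what an unmerge step relies on.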

## Pruning (`apr prune`)

| Recipe | Example | Description |
|---|---|---|
| Magnitude Pruning | `prune_magnitude` | Magnitude-based unstructured weight pruning |
| Structured Pruning | `prune_structured` | Width pruning (Minitron-style) |
| Depth Pruning | `prune_depth` | Layer removal (Minitron-style) |
| Wanda Pruning | `prune_wanda` | Pruning with calibration data (Wanda method) |
| Gradual Schedule | `prune_gradual_schedule` | Cubic and gradual pruning schedules |
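Unstructured magnitude pruning, the simplest of the methods above, zeros the fraction of weights with the smallest absolute value. A minimal sketch of the idea (hypothetical helper name, not `apr`'s code):

```python
# Magnitude pruning sketch: drop the `sparsity` fraction of
# weights with the smallest |w|, keeping the rest unchanged.

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of `weights`."""
    n_prune = int(len(weights) * sparsity)
    # Indices of the n_prune smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
pruned = magnitude_prune(w, sparsity=0.5)
```

The structured and Wanda variants differ in the scoring rule (whole rows/layers vs. per-weight, and weight-only vs. weight-times-activation from calibration data), but the select-and-zero loop has the same shape.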

## Distillation (`apr distill`)

| Recipe | Example | Description |
|---|---|---|
| Standard KL | `distill_standard_kl` | Standard KL-divergence knowledge distillation |
| Progressive | `distill_progressive` | Layer-wise progressive distillation |
| Ensemble | `distill_ensemble` | Multi-teacher ensemble distillation |
| Checkpoint | `distill_checkpoint` | Distillation with checkpoint saving/resuming |
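Standard KL distillation trains the student to match the teacher's temperature-softened output distribution. A self-contained sketch of that loss, following the usual Hinton-style formulation with a T² scale (illustrative; `apr`'s exact loss may differ):

```python
# KL distillation loss sketch: KL(teacher || student) over
# temperature-softened softmax distributions, scaled by T^2.
import math

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from student to teacher distribution, times T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl

loss_same = kl_distill_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
loss_diff = kl_distill_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0])
```

The loss is zero when the student matches the teacher exactly and grows as the distributions diverge; the ensemble recipe averages multiple teacher distributions before the same comparison.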

## Merging (`apr merge`)

| Recipe | Example | Description |
|---|---|---|
| Average Merge | `merge_average` | Uniform average of model weights |
| Weighted Merge | `merge_weighted` | Weighted-average merge with custom ratios |
| SLERP Merge | `merge_slerp` | Spherical linear interpolation merge |
| TIES Merge | `merge_ties` | TIES merge with density parameter |
| DARE Merge | `merge_dare` | DARE merge with drop probability |
| Hierarchical Merge | `merge_hierarchical` | Multi-model hierarchical merge strategy |
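SLERP, unlike a plain weighted average, interpolates along the arc between two weight vectors, preserving their norm geometry. A minimal sketch on flattened weight vectors (assumptions, not `apr`'s implementation):

```python
# SLERP sketch: spherical linear interpolation between two
# flattened weight vectors v0 and v1 at ratio t in [0, 1].
import math

def slerp(v0, v1, t):
    """Interpolate along the great-circle arc from v0 to v1."""
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    cos_theta = max(-1.0, min(1.0, dot / (n0 * n1)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

merged = slerp([1.0, 0.0], [0.0, 1.0], t=0.5)
```

For two orthogonal unit vectors at `t=0.5`, SLERP returns a unit-norm midpoint, whereas a plain average would shrink the norm to √2/2 of the inputs — the main motivation for using it on normalized weight deltas.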

## Quantization (`apr quantize`)

| Recipe | Example | Description |
|---|---|---|
| 4-bit Quantization | `quantize_4bit` | Int4 weight quantization |
| Fake QAT | `quantize_fake_qat` | Fake quantization-aware training |
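As an illustration of what int4 weight quantization involves, here is a sketch of a symmetric per-tensor scheme: floats are mapped to integers in [-8, 7] via a single scale. This is one common choice; `apr`'s actual scheme (per-group scales, zero points, etc.) may differ:

```python
# Symmetric int4 quantization sketch: q = clamp(round(w / scale)),
# with scale chosen so the largest |w| maps near the int4 maximum 7.

def quantize_int4(weights):
    """Return (int4 values in [-8, 7], per-tensor scale)."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from int4 values."""
    return [v * scale for v in q]

w = [0.7, -0.3, 0.1, 0.0]
q, scale = quantize_int4(w)
recon = dequantize(q, scale)
```

"Fake" QAT uses exactly this quantize-then-dequantize round trip inside the forward pass during training, so the model learns weights that survive the rounding error, while the stored weights remain floats until the final export.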