QLoRA Fine-Tuning

CLI Equivalent: apr finetune --method qlora --rank 8 --alpha 16 model.apr

What This Demonstrates

Quantized LoRA (QLoRA) fine-tuning: the base model weights stay frozen in 4-bit precision while small full-precision LoRA adapters are trained on top of them. Because only the adapters need gradients and optimizer state, large models can be fine-tuned on limited VRAM.
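
To make the mechanism concrete, below is a minimal, self-contained sketch of the idea in plain Rust. It is not the crate's API: the QuantizedLinear and LoraAdapter names and the naive per-row absmax int4 scheme are illustrative. The base weight is dequantized on the fly in the forward pass, while the low-rank A and B matrices stay in full precision and are the only tensors that would receive gradients.

    // Sketch of the QLoRA forward path: frozen 4-bit base + full-precision LoRA adapter.
    // All names and the quantization scheme are illustrative, not the crate's API.

    /// A d_out x d_in weight stored as 4-bit integers plus one absmax scale per row.
    struct QuantizedLinear {
        packed: Vec<i8>,   // one value per weight, each in [-8, 7]
        scales: Vec<f32>,  // absmax scale per output row
        d_out: usize,
        d_in: usize,
    }

    impl QuantizedLinear {
        fn quantize(weight: &[Vec<f32>]) -> Self {
            let d_out = weight.len();
            let d_in = weight[0].len();
            let mut packed = Vec::with_capacity(d_out * d_in);
            let mut scales = Vec::with_capacity(d_out);
            for row in weight {
                let absmax = row.iter().fold(0f32, |m, w| m.max(w.abs())).max(1e-8);
                let scale = absmax / 7.0; // map the row into the int4 range
                scales.push(scale);
                packed.extend(row.iter().map(|w| (w / scale).round().clamp(-8.0, 7.0) as i8));
            }
            Self { packed, scales, d_out, d_in }
        }

        /// y = dequant(W_q) * x -- the base weights are frozen, only read here.
        fn forward(&self, x: &[f32]) -> Vec<f32> {
            (0..self.d_out)
                .map(|i| {
                    let row = &self.packed[i * self.d_in..(i + 1) * self.d_in];
                    row.iter().zip(x).map(|(&q, &xj)| q as f32 * self.scales[i] * xj).sum()
                })
                .collect()
        }
    }

    /// LoRA adapter: y += (alpha / rank) * B (A x), with A: rank x d_in, B: d_out x rank.
    /// A and B are the only trainable tensors.
    struct LoraAdapter {
        a: Vec<Vec<f32>>,
        b: Vec<Vec<f32>>,
        scaling: f32,
    }

    impl LoraAdapter {
        fn forward(&self, x: &[f32]) -> Vec<f32> {
            let ax: Vec<f32> = self.a.iter()
                .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum())
                .collect();
            self.b.iter()
                .map(|row| self.scaling * row.iter().zip(&ax).map(|(a, b)| a * b).sum::<f32>())
                .collect()
        }
    }

    fn main() {
        // Tiny 3x4 layer with a rank-2 adapter (the real example uses rank 8, alpha 16).
        let base = vec![vec![0.1, -0.2, 0.3, 0.05]; 3];
        let q = QuantizedLinear::quantize(&base);
        let lora = LoraAdapter {
            a: vec![vec![0.01; 4]; 2],
            b: vec![vec![0.0; 2]; 3], // zero-init: the adapter starts as a no-op
            scaling: 16.0 / 8.0,
        };
        let x = vec![1.0, 2.0, 3.0, 4.0];
        let y: Vec<f32> = q.forward(&x).iter().zip(lora.forward(&x)).map(|(a, b)| a + b).collect();
        println!("{y:?}");
    }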

Run

cargo run --example finetune_qlora

Key APIs

  • LoRAConfig::new(rank, alpha).target_qv_projections() -- configure adapter rank and scaling, targeting the Q/V projections
  • Quantization::Int4 -- quantize base weights to 4-bit
  • LoRALayer::new(quantized_base, d_out, d_in, rank, alpha) -- attach adapters to quantized base
  • .trainable_params() -- only adapter weights are trainable
  • MemoryPlanner::estimate_vram(config) -- predict peak VRAM usage before launching a run (see the arithmetic sketch after this list)
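
The VRAM savings come from the split described above: the frozen base sits in 4-bit storage, and only the small adapters plus their optimizer state need full precision. Below is a back-of-the-envelope sketch of that arithmetic (weights and optimizer state only; activations, KV cache, and framework overhead come on top). The model shape -- 7B parameters, 32 layers, hidden size 4096, adapters on the Q and V projections at rank 8, adapters and Adam moments in fp32 -- is an illustrative assumption, not taken from the example.

    // Rough VRAM arithmetic behind QLoRA; all model-shape numbers are illustrative.
    fn main() {
        let n_base_params: f64 = 7.0e9;
        let hidden = 4096.0;
        let layers = 32.0;
        let rank = 8.0;

        // Frozen base weights in 4-bit: 0.5 bytes per parameter.
        let base_bytes = n_base_params * 0.5;

        // Each adapter is A (rank x hidden) plus B (hidden x rank) = 2 * hidden * rank
        // params; two adapted projections (Q and V) per layer.
        let adapter_params = layers * 2.0 * (2.0 * hidden * rank);

        // Adapters in fp32 (4 bytes) plus two Adam moment buffers of the same size.
        let adapter_bytes = adapter_params * 4.0 * 3.0;

        println!("base (4-bit):     {:>7.2} GiB", base_bytes / 1024f64.powi(3));
        println!("adapters + Adam:  {:>7.2} MiB", adapter_bytes / 1024f64.powi(2));
        println!("trainable params: {:.2}% of total", 100.0 * adapter_params / n_base_params);
    }

Under these assumptions the trainable adapters amount to well under 0.1% of the base parameters, so the dominant memory cost is the 4-bit copy of the frozen weights -- which is why the quantized base fits where a full-precision fine-tune would not.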

Source

examples/optimize/finetune_qlora.rs