Case Study: Evolutionary Merge Optimization

Ticket: GH-444 | Contract: merge.md §evolutionary

Overview

Evolutionary merge uses CMA-ES (Covariance Matrix Adaptation Evolution Strategy) to automatically find optimal merge weights when combining multiple model checkpoints. Instead of manually tuning weights, the optimizer explores the weight space and finds the combination that minimizes a user-defined objective (e.g., perplexity on calibration data).

API

use aprender::format::converter::evolutionary_merge::{
    evolutionary_merge, EvolutionaryMergeConfig,
};
use aprender::format::MergeStrategy;

let config = EvolutionaryMergeConfig {
    num_models: 3,
    max_evaluations: 100,
    strategy: MergeStrategy::Weighted,
    sigma: 0.3,    // CMA-ES step size
    seed: 42,      // Reproducibility
    ..Default::default()
};

let result = evolutionary_merge(&models, &config, |merged| {
    // Return score to MINIMIZE (e.g., perplexity)
    evaluate_perplexity(&merged)
});

println!("Optimal weights: {:?}", result.weights);
println!("Best score: {:.4}", result.best_score);

// Use result.merge_options directly with apr_merge

Key Design Decisions

Softmax parameterization: Raw CMA-ES parameters are mapped through softmax to ensure weights are always positive and sum to 1.0.
In-memory merging: Tensor merging happens in-memory without file I/O, enabling fast objective function evaluation during optimization.
Strategy-agnostic: Works with Average, Weighted, and SLERP strategies. TIES/DARE density and drop_rate can also be optimized.

Falsification Tests

Test	Property
FALSIFY-EVOL-MERGE-001	Weights always sum to 1.0
FALSIFY-EVOL-MERGE-002	CMA-ES improves over equal-weight baseline
FALSIFY-EVOL-MERGE-003	Result produces valid MergeOptions

Run the Example

cargo run --example evolutionary_merge

Aprender — Pure Rust ML Framework

Case Study: Evolutionary Merge Optimization

Overview

API

Key Design Decisions

Falsification Tests

Run the Example