Building Custom Error Classifiers

This chapter demonstrates how to build ML-powered error classification systems using aprender, based on the real-world depyler-oracle implementation.

The Problem

Compiler errors are painful. Developers waste hours deciphering cryptic messages. What if we could:

  1. Classify errors into actionable categories
  2. Predict fixes based on historical patterns
  3. Learn from successful resolutions

Architecture Overview

Error Message → Feature Extraction → Classification → Fix Prediction
                     ↓                    ↓               ↓
              TF-IDF + Handcrafted   DecisionTree    N-gram Matching

Step 1: Define Error Categories

use serde::{Deserialize, Serialize};

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub enum ErrorCategory {
    TypeMismatch,
    BorrowChecker,
    MissingImport,
    SyntaxError,
    LifetimeError,
    TraitBound,
    Other,
}

impl ErrorCategory {
    pub fn index(&self) -> usize {
        match self {
            Self::TypeMismatch => 0,
            Self::BorrowChecker => 1,
            Self::MissingImport => 2,
            Self::SyntaxError => 3,
            Self::LifetimeError => 4,
            Self::TraitBound => 5,
            Self::Other => 6,
        }
    }

    pub fn from_index(idx: usize) -> Self {
        match idx {
            0 => Self::TypeMismatch,
            1 => Self::BorrowChecker,
            2 => Self::MissingImport,
            3 => Self::SyntaxError,
            4 => Self::LifetimeError,
            5 => Self::TraitBound,
            _ => Self::Other,
        }
    }
}
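
A quick round-trip test, using only the enum above, guards the index mapping that the classifier's numeric labels depend on:

#[cfg(test)]
mod tests {
    use super::ErrorCategory;

    #[test]
    fn category_index_round_trips() {
        // Every category must survive the usize round trip used for labels.
        let all = [
            ErrorCategory::TypeMismatch,
            ErrorCategory::BorrowChecker,
            ErrorCategory::MissingImport,
            ErrorCategory::SyntaxError,
            ErrorCategory::LifetimeError,
            ErrorCategory::TraitBound,
            ErrorCategory::Other,
        ];
        for cat in all {
            assert_eq!(ErrorCategory::from_index(cat.index()), cat);
        }
    }
}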

Step 2: Feature Extraction

Combine hand-crafted domain features with TF-IDF vectorization:

use aprender::text::vectorize::TfidfVectorizer;
use aprender::text::tokenize::WhitespaceTokenizer;

/// Hand-crafted features for error messages
pub struct ErrorFeatures {
    pub message_length: f32,
    pub type_keywords: f32,
    pub borrow_keywords: f32,
    pub has_error_code: f32,
    // ... more domain-specific features
}

impl ErrorFeatures {
    pub const DIM: usize = 12;

    pub fn from_message(msg: &str) -> Self {
        let lower = msg.to_lowercase();
        Self {
            message_length: (msg.len() as f32 / 500.0).min(1.0),
            type_keywords: Self::count_keywords(&lower, &[
                "expected", "found", "mismatched", "type"
            ]),
            borrow_keywords: Self::count_keywords(&lower, &[
                "borrow", "move", "ownership"
            ]),
            has_error_code: if msg.contains("E0") { 1.0 } else { 0.0 },
            // ... remaining domain-specific features elided for brevity
        }
    }

    fn count_keywords(text: &str, keywords: &[&str]) -> f32 {
        let count = keywords.iter().filter(|k| text.contains(*k)).count();
        (count as f32 / keywords.len() as f32).min(1.0)
    }
}
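
Given the constructor above, a short check (sketched as a unit test) shows the keyword features firing on a representative rustc message:

#[test]
fn keyword_features_fire_on_type_error() {
    let features = ErrorFeatures::from_message(
        "error[E0308]: mismatched types, expected `i32`, found `&str`",
    );
    assert!(features.has_error_code > 0.5);  // the E0308 code is detected
    assert!(features.type_keywords > 0.9);   // all four type keywords are present
    assert!(features.borrow_keywords < 0.1); // no borrow vocabulary
}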

TF-IDF Feature Extraction

pub struct TfidfFeatureExtractor {
    vectorizer: TfidfVectorizer,
    is_fitted: bool,
}

impl TfidfFeatureExtractor {
    pub fn new() -> Self {
        Self {
            vectorizer: TfidfVectorizer::new()
                .with_tokenizer(Box::new(WhitespaceTokenizer::new()))
                .with_ngram_range(1, 3)  // unigrams, bigrams, trigrams
                .with_sublinear_tf(true)
                .with_max_features(500),
            is_fitted: false,
        }
    }

    pub fn fit(&mut self, documents: &[&str]) -> Result<(), AprenderError> {
        self.vectorizer.fit(documents)?;
        self.is_fitted = true;
        Ok(())
    }

    pub fn transform(&self, documents: &[&str]) -> Result<Matrix<f64>, AprenderError> {
        self.vectorizer.transform(documents)
    }
}
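
A usage sketch, assuming only the fit/transform signatures shown above (the vectorize_errors wrapper is purely illustrative):

fn vectorize_errors() -> Result<(), AprenderError> {
    // Fit on the training corpus, then vectorize new error messages.
    let mut extractor = TfidfFeatureExtractor::new();
    let corpus = vec![
        "error[E0308]: mismatched types, expected `i32`, found `&str`",
        "error[E0382]: use of moved value",
        "error[E0106]: missing lifetime specifier",
    ];
    extractor.fit(&corpus)?;

    // Each row of the returned matrix is one message's TF-IDF vector.
    let _tfidf = extractor.transform(&["error[E0308]: expected `String`, found `&str`"])?;
    Ok(())
}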

Step 3: N-gram Fix Predictor

Learn error→fix patterns from training data:

use std::collections::HashMap;

pub struct FixPattern {
    pub error_pattern: String,
    pub fix_template: String,
    pub category: ErrorCategory,
    pub frequency: usize,
    pub success_rate: f32,
}

pub struct NgramFixPredictor {
    patterns: HashMap<ErrorCategory, Vec<FixPattern>>,
    min_similarity: f32,
}

impl NgramFixPredictor {
    pub fn new() -> Self {
        Self {
            patterns: HashMap::new(),
            min_similarity: 0.1,
        }
    }

    /// Learn a new error-fix pattern
    pub fn learn_pattern(
        &mut self,
        error_message: &str,
        fix_template: &str,
        category: ErrorCategory,
    ) {
        let normalized = self.normalize(error_message);
        let patterns = self.patterns.entry(category).or_default();

        if let Some(existing) = patterns.iter_mut()
            .find(|p| p.error_pattern == normalized)
        {
            existing.frequency += 1;
        } else {
            patterns.push(FixPattern {
                error_pattern: normalized,
                fix_template: fix_template.to_string(),
                category,
                frequency: 1,
                success_rate: 0.0,
            });
        }
    }

    /// Predict fixes for an error
    pub fn predict(&self, error_message: &str, top_k: usize) -> Vec<FixSuggestion> {
        let normalized = self.normalize(error_message);
        let mut suggestions = Vec::new();

        for (category, patterns) in &self.patterns {
            for pattern in patterns {
                let similarity = self.jaccard_similarity(&normalized, &pattern.error_pattern);
                if similarity >= self.min_similarity {
                    suggestions.push(FixSuggestion {
                        fix: pattern.fix_template.clone(),
                        confidence: similarity * (1.0 + (pattern.frequency as f32).ln()),
                        category: *category,
                    });
                }
            }
        }

        suggestions.sort_by(|a, b| b.confidence.partial_cmp(&a.confidence).unwrap());
        suggestions.truncate(top_k);
        suggestions
    }

    fn normalize(&self, msg: &str) -> String {
        msg.to_lowercase()
            .replace(|c: char| c.is_ascii_digit(), "N")
            .replace("error:", "")
            .trim()
            .to_string()
    }

    fn jaccard_similarity(&self, a: &str, b: &str) -> f32 {
        let tokens_a: Vec<&str> = a.split_whitespace().collect();
        let tokens_b: Vec<&str> = b.split_whitespace().collect();

        let set_a: std::collections::HashSet<_> = tokens_a.iter().collect();
        let set_b: std::collections::HashSet<_> = tokens_b.iter().collect();

        let intersection = set_a.intersection(&set_b).count();
        let union = set_a.union(&set_b).count();

        if union == 0 { 0.0 } else { intersection as f32 / union as f32 }
    }
}

pub struct FixSuggestion {
    pub fix: String,
    pub confidence: f32,
    pub category: ErrorCategory,
}
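
Exercising the predictor end to end with the types above (sketched as a unit test):

#[test]
fn predictor_ranks_similar_pattern_first() {
    let mut predictor = NgramFixPredictor::new();
    predictor.learn_pattern(
        "error[E0308]: mismatched types, expected `i32`, found `&str`",
        "Use .parse() or type conversion",
        ErrorCategory::TypeMismatch,
    );
    predictor.learn_pattern(
        "error[E0382]: use of moved value",
        "Clone the value or use references",
        ErrorCategory::BorrowChecker,
    );

    // A new error sharing most tokens with the first pattern should rank its fix first.
    let suggestions = predictor.predict(
        "error[E0308]: mismatched types, expected `u64`, found `&str`",
        3,
    );
    assert_eq!(suggestions[0].category, ErrorCategory::TypeMismatch);
}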

Step 4: Training Data

Curate real-world error patterns:

pub struct TrainingSample {
    pub message: String,
    pub category: ErrorCategory,
    pub fix: Option<String>,
}

pub fn rustc_training_data() -> Vec<TrainingSample> {
    vec![
        // Type mismatches
        TrainingSample {
            message: "error[E0308]: mismatched types, expected `i32`, found `&str`".into(),
            category: ErrorCategory::TypeMismatch,
            fix: Some("Use .parse() or type conversion".into()),
        },
        TrainingSample {
            message: "error[E0308]: expected `String`, found `&str`".into(),
            category: ErrorCategory::TypeMismatch,
            fix: Some("Use .to_string() to create owned String".into()),
        },

        // Borrow checker
        TrainingSample {
            message: "error[E0382]: use of moved value".into(),
            category: ErrorCategory::BorrowChecker,
            fix: Some("Clone the value or use references".into()),
        },
        TrainingSample {
            message: "error[E0502]: cannot borrow as mutable because also borrowed as immutable".into(),
            category: ErrorCategory::BorrowChecker,
            fix: Some("Separate mutable and immutable operations".into()),
        },

        // Lifetimes
        TrainingSample {
            message: "error[E0106]: missing lifetime specifier".into(),
            category: ErrorCategory::LifetimeError,
            fix: Some("Add lifetime parameter: fn foo<'a>(x: &'a str) -> &'a str".into()),
        },

        // Trait bounds
        TrainingSample {
            message: "error[E0277]: the trait bound `T: Clone` is not satisfied".into(),
            category: ErrorCategory::TraitBound,
            fix: Some("Add #[derive(Clone)] or implement Clone".into()),
        },

        // ... add 50+ samples for robust training
    ]
}
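
Before training, it is worth checking class balance, since a skewed label distribution biases the tree. A small sketch over the samples above:

use std::collections::HashMap;

fn print_label_distribution() {
    let samples = rustc_training_data();
    let mut counts: HashMap<ErrorCategory, usize> = HashMap::new();
    for sample in &samples {
        *counts.entry(sample.category).or_insert(0) += 1;
    }
    for (category, count) in &counts {
        println!("{category:?}: {count} samples");
    }
}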

Step 5: Putting It Together

use aprender::tree::DecisionTreeClassifier;
use aprender::metrics::drift::{DriftDetector, DriftConfig};

pub struct ErrorOracle {
    classifier: DecisionTreeClassifier,
    predictor: NgramFixPredictor,
    tfidf: TfidfFeatureExtractor,
    drift_detector: DriftDetector,
}

impl ErrorOracle {
    pub fn new() -> Self {
        Self {
            classifier: DecisionTreeClassifier::new().with_max_depth(10),
            predictor: NgramFixPredictor::new(),
            tfidf: TfidfFeatureExtractor::new(),
            drift_detector: DriftDetector::new(DriftConfig::default()),
        }
    }

    /// Train the oracle on labeled data
    pub fn train(&mut self, samples: &[TrainingSample]) -> Result<(), AprenderError> {
        // Extract messages for TF-IDF
        let messages: Vec<&str> = samples.iter().map(|s| s.message.as_str()).collect();
        self.tfidf.fit(&messages)?;

        // Train N-gram predictor
        for sample in samples {
            if let Some(fix) = &sample.fix {
                self.predictor.learn_pattern(&sample.message, fix, sample.category);
            }
        }

        // Train classifier (simplified - real impl uses Matrix)
        // self.classifier.fit(&features, &labels)?;

        Ok(())
    }

    /// Classify an error and suggest fixes
    pub fn analyze(&self, error_message: &str) -> Analysis {
        // Handcrafted features would feed the trained classifier; this sketch
        // falls back to the N-gram predictor's top match for the category.
        let _features = ErrorFeatures::from_message(error_message);
        let suggestions = self.predictor.predict(error_message, 3);

        Analysis {
            category: suggestions.first()
                .map(|s| s.category)
                .unwrap_or(ErrorCategory::Other),
            confidence: suggestions.first()
                .map(|s| s.confidence)
                .unwrap_or(0.0),
            suggestions,
        }
    }
}

pub struct Analysis {
    pub category: ErrorCategory,
    pub confidence: f32,
    pub suggestions: Vec<FixSuggestion>,
}
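
The classifier fit that train leaves as a comment needs a numeric feature matrix and integer labels. A minimal sketch of that data preparation, using plain Vecs before conversion to aprender's Matrix (the build_training_rows helper is hypothetical, not part of depyler-oracle):

// Hypothetical helper: assemble per-sample rows and integer labels for the tree.
// Converting rows into aprender's Matrix and calling `classifier.fit` follows
// the commented-out line in `train`; the exact constructor is library-specific.
fn build_training_rows(samples: &[TrainingSample]) -> (Vec<Vec<f32>>, Vec<usize>) {
    let mut rows = Vec::with_capacity(samples.len());
    let mut labels = Vec::with_capacity(samples.len());
    for sample in samples {
        let f = ErrorFeatures::from_message(&sample.message);
        // Handcrafted features first; TF-IDF columns would be appended here.
        rows.push(vec![
            f.message_length,
            f.type_keywords,
            f.borrow_keywords,
            f.has_error_code,
        ]);
        labels.push(sample.category.index());
    }
    (rows, labels)
}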

Usage Example

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create and train oracle
    let mut oracle = ErrorOracle::new();
    oracle.train(&rustc_training_data())?;

    // Analyze an error
    let error = "error[E0308]: mismatched types
      --> src/main.rs:10:5
       |
    10 |     foo(bar)
       |         ^^^ expected `i32`, found `&str`";

    let analysis = oracle.analyze(error);

    println!("Category: {:?}", analysis.category);
    println!("Confidence: {:.2}", analysis.confidence);
    println!("\nSuggested fixes:");
    for (i, suggestion) in analysis.suggestions.iter().enumerate() {
        println!("  {}. {} (confidence: {:.2})",
            i + 1, suggestion.fix, suggestion.confidence);
    }

    Ok(())
}

Output:

Category: TypeMismatch
Confidence: 0.85

Suggested fixes:
  1. Use .parse() or type conversion (confidence: 0.85)
  2. Use .to_string() to create owned String (confidence: 0.72)
  3. Check function signature for expected type (confidence: 0.65)

Extending to Your Domain

This pattern works for any error-classification task:

Domain          Categories                                   Features
SQL errors      Syntax, Permission, Connection, Constraint   Query structure, error codes
HTTP errors     4xx, 5xx, Timeout, Auth                      Status codes, headers, timing
Build errors    Dependency, Config, Resource, Toolchain      Package names, paths, versions
Test failures   Assertion, Timeout, Setup, Flaky             Test names, stack traces
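
For instance, an SQL-flavored variant of Step 1 might start like this (a sketch with hypothetical names, mirroring ErrorCategory and ErrorFeatures above):

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum SqlErrorCategory {
    Syntax,
    Permission,
    Connection,
    Constraint,
    Other,
}

/// Domain features in the spirit of ErrorFeatures: error codes plus query structure.
pub struct SqlErrorFeatures {
    pub has_sqlstate: f32,        // e.g. a five-character SQLSTATE like "42601"
    pub permission_keywords: f32, // "denied", "privilege", "grant"
    pub connection_keywords: f32, // "refused", "timeout", "host"
}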

Key Takeaways

  1. Combine features: Hand-crafted domain knowledge + TF-IDF captures both explicit and latent patterns
  2. N-gram matching: Simple but effective for text similarity
  3. Feedback loops: Track success rates to improve predictions over time (see the sketch below)
  4. Drift detection: Monitor model performance and retrain when accuracy drops

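One way to close the feedback loop from takeaway 3, using only the NgramFixPredictor defined earlier (the record_outcome method is a hypothetical addition, not part of depyler-oracle):

impl NgramFixPredictor {
    /// Hypothetical feedback hook: report whether a suggested fix actually worked,
    /// nudging the matching pattern's success rate with an exponential moving average.
    pub fn record_outcome(&mut self, category: ErrorCategory, error_message: &str, success: bool) {
        let normalized = self.normalize(error_message);
        if let Some(pattern) = self
            .patterns
            .get_mut(&category)
            .and_then(|patterns| patterns.iter_mut().find(|p| p.error_pattern == normalized))
        {
            let outcome = if success { 1.0 } else { 0.0 };
            pattern.success_rate = 0.9 * pattern.success_rate + 0.1 * outcome;
        }
    }
}
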
The full implementation is available in depyler-oracle (128 tests, 4,399 LOC).