depyler: Python to Rust Transpilation

Toyota Way Principle (Kaizen): Continuous improvement. Transform Python ML code to faster, safer Rust.

Status: Complete

The Problem: Python’s Limitations for Production ML

Python dominates ML development but has critical production issues:

GIL (Global Interpreter Lock): Only one thread executes at a time
Dynamic types: Errors discovered at runtime
Slow execution: Interpreter overhead
Memory management: GC pauses

depyler Solution: Transpile to Safe, Fast Rust

┌─────────────────────────────────────────────────────────┐
│                   depyler Pipeline                       │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Python Code → AST → Type Inference → Rust Code        │
│       │                    │                            │
│       ↓                    ↓                            │
│  Dynamic types        Static types                      │
│  GIL bottleneck       True parallelism                  │
│  Runtime errors       Compile-time errors               │
│                                                         │
└─────────────────────────────────────────────────────────┘

Validation

Run all chapter examples:

make run-ch10          # Run all examples
make run-ch10-python   # Python transpilation
make run-ch10-ml       # ML patterns
make test-ch10         # Run all tests

Type Mapping

Python Type	Rust Type
`int`	`i64`
`float`	`f64`
`str`	`String`
`bool`	`bool`
`list[T]`	`Vec<T>`
`dict[K, V]`	`HashMap<K, V>`
`Optional[T]`	`Option<T>`

Type Inference

# Python (implicit types)
def calculate_mean(values):
    total = sum(values)
    return total / len(values)

#![allow(unused)]
fn main() {
// Rust (explicit types via inference)
fn calculate_mean(values: Vec<f64>) -> f64 {
    let total: f64 = values.iter().sum();
    total / values.len() as f64
}
}

GIL Elimination

The Python Problem

import threading

def compute(data):
    # Only ONE thread runs at a time!
    # GIL blocks true parallelism
    return sum(x*x for x in data)

threads = [threading.Thread(...) for _ in range(4)]
# 4 threads, but effectively 1 CPU used

The Rust Solution

#![allow(unused)]
fn main() {
use rayon::prelude::*;

fn compute(data: &[f64]) -> f64 {
    data.par_iter()  // TRUE parallelism
        .map(|x| x * x)
        .sum()
}
// All CPUs utilized, no GIL!
}

NumPy to trueno Mapping

NumPy	Rust (trueno)
`np.array([1, 2, 3])`	`Vector::from_slice(&[1.0, 2.0, 3.0])`
`np.zeros((3, 3))`	`Matrix::zeros(3, 3)`
`np.dot(a, b)`	`a.dot(&b)`
`a + b` (element-wise)	`a.add(&b)`
`np.sum(a)`	`a.sum()`
`np.mean(a)`	`a.mean()`
`a.reshape((2, 3))`	`a.reshape(2, 3)`

List Comprehension Transpilation

Python	Rust
`[x*2 for x in data]`	`data.iter().map(\|x\| x * 2).collect()`
`[x for x in data if x > 0]`	`data.iter().filter(\|&x\| x > 0).collect()`
`[x*2 for x in data if x > 0]`	`data.iter().filter(\|&x\| x > 0).map(\|x\| x * 2).collect()`
`sum([x*x for x in data])`	`data.iter().map(\|x\| x * x).sum()`

Example

# Python
squares = [x*x for x in range(10) if x % 2 == 0]

#![allow(unused)]
fn main() {
// Rust
let squares: Vec<i32> = (0..10)
    .filter(|x| x % 2 == 0)
    .map(|x| x * x)
    .collect();
}

ML Training Patterns

Python (scikit-learn)

from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
mse = mean_squared_error(y_test, predictions)

Rust (aprender)

#![allow(unused)]
fn main() {
use aprender::LinearRegression;

let model = LinearRegression::new();
let trained = model.fit(&x_train, &y_train)?;
let predictions = trained.predict(&x_test);
let mse = predictions.mse(&y_test);
}

Memory Safety

Python (Runtime Errors)

data = [1, 2, 3]
value = data[10]  # IndexError at runtime!

Rust (Compile-time Safety)

#![allow(unused)]
fn main() {
let data = vec![1, 2, 3];

// Option 1: Checked access (returns Option)
if let Some(value) = data.get(10) {
    // Use value safely
}

// Option 2: Panic-safe with default
let value = data.get(10).unwrap_or(&0);
}

Performance Comparison

Operation	Python	Rust	Speedup
Matrix mult (1000x1000)	50ms	3ms	16.7x
List iteration	100ms	5ms	20x
JSON parsing	25ms	2ms	12.5x
File I/O	15ms	3ms	5x

Key factors:

No GIL contention
No interpreter overhead
Direct SIMD access
Zero-cost abstractions

EU AI Act Compliance

Article 10: Data Governance

#![allow(unused)]
fn main() {
// No dynamic import of untrusted code
// All dependencies compiled and verified
use approved_ml_lib::Model;
}

Article 13: Transparency

Type annotations make behavior explicit
Source-to-source mapping preserved
All transformations documented

Article 15: Robustness

Memory-safe execution
Type-safe operations
No GIL-related race conditions

Testing

#![allow(unused)]
fn main() {
#[test]
fn test_numpy_pattern_dot_product() {
    let a = vec![1.0, 2.0, 3.0];
    let b = vec![4.0, 5.0, 6.0];

    let dot: f64 = a.iter()
        .zip(b.iter())
        .map(|(x, y)| x * y)
        .sum();

    // 1*4 + 2*5 + 3*6 = 32
    assert!((dot - 32.0).abs() < 1e-10);
}

#[test]
fn test_list_comprehension_filter_map() {
    // [x*2 for x in data if x > 2]
    let data = vec![1, 2, 3, 4, 5];
    let result: Vec<i32> = data.iter()
        .filter(|&x| *x > 2)
        .map(|x| x * 2)
        .collect();

    assert_eq!(result, vec![6, 8, 10]);
}
}

Key Takeaways

GIL eliminated: True parallelism with Rayon
Type safety: Compile-time error detection
ML patterns preserved: NumPy → trueno, sklearn → aprender
Performance gains: 5-20x faster execution
EU compliant: Auditable, transparent, robust

Next Steps

Chapter 11: decy - TypeScript to Rust transpilation
Chapter 12: aprender - ML training with Rust

Source Code

Full implementation: examples/ch10-depyler/

# Verify all claims
make test-ch10

# Run examples
make run-ch10

Keyboard shortcuts

Sovereign AI Stack: EXTREME TDD for EU-Compliant AI Systems