depyler: Python to Rust Transpilation
Toyota Way Principle (Kaizen): Continuous improvement. Transform Python ML code to faster, safer Rust.
Status: Complete
The Problem: Python’s Limitations for Production ML
Python dominates ML development but has critical production issues:
- GIL (Global Interpreter Lock): Only one thread executes at a time
- Dynamic types: Errors discovered at runtime
- Slow execution: Interpreter overhead
- Memory management: GC pauses
depyler Solution: Transpile to Safe, Fast Rust
┌─────────────────────────────────────────────────────────┐
│ depyler Pipeline │
├─────────────────────────────────────────────────────────┤
│ │
│ Python Code → AST → Type Inference → Rust Code │
│ │ │ │
│ ↓ ↓ │
│ Dynamic types Static types │
│ GIL bottleneck True parallelism │
│ Runtime errors Compile-time errors │
│ │
└─────────────────────────────────────────────────────────┘
Validation
Run all chapter examples:
make run-ch10 # Run all examples
make run-ch10-python # Python transpilation
make run-ch10-ml # ML patterns
make test-ch10 # Run all tests
Type Mapping
| Python Type | Rust Type |
|---|---|
int | i64 |
float | f64 |
str | String |
bool | bool |
list[T] | Vec<T> |
dict[K, V] | HashMap<K, V> |
Optional[T] | Option<T> |
Type Inference
# Python (implicit types)
def calculate_mean(values):
total = sum(values)
return total / len(values)
#![allow(unused)]
fn main() {
// Rust (explicit types via inference)
fn calculate_mean(values: Vec<f64>) -> f64 {
let total: f64 = values.iter().sum();
total / values.len() as f64
}
}
GIL Elimination
The Python Problem
import threading
def compute(data):
# Only ONE thread runs at a time!
# GIL blocks true parallelism
return sum(x*x for x in data)
threads = [threading.Thread(...) for _ in range(4)]
# 4 threads, but effectively 1 CPU used
The Rust Solution
#![allow(unused)]
fn main() {
use rayon::prelude::*;
fn compute(data: &[f64]) -> f64 {
data.par_iter() // TRUE parallelism
.map(|x| x * x)
.sum()
}
// All CPUs utilized, no GIL!
}
NumPy to trueno Mapping
| NumPy | Rust (trueno) |
|---|---|
np.array([1, 2, 3]) | Vector::from_slice(&[1.0, 2.0, 3.0]) |
np.zeros((3, 3)) | Matrix::zeros(3, 3) |
np.dot(a, b) | a.dot(&b) |
a + b (element-wise) | a.add(&b) |
np.sum(a) | a.sum() |
np.mean(a) | a.mean() |
a.reshape((2, 3)) | a.reshape(2, 3) |
List Comprehension Transpilation
| Python | Rust |
|---|---|
[x*2 for x in data] | data.iter().map(|x| x * 2).collect() |
[x for x in data if x > 0] | data.iter().filter(|&x| x > 0).collect() |
[x*2 for x in data if x > 0] | data.iter().filter(|&x| x > 0).map(|x| x * 2).collect() |
sum([x*x for x in data]) | data.iter().map(|x| x * x).sum() |
Example
# Python
squares = [x*x for x in range(10) if x % 2 == 0]
#![allow(unused)]
fn main() {
// Rust
let squares: Vec<i32> = (0..10)
.filter(|x| x % 2 == 0)
.map(|x| x * x)
.collect();
}
ML Training Patterns
Python (scikit-learn)
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
mse = mean_squared_error(y_test, predictions)
Rust (aprender)
#![allow(unused)]
fn main() {
use aprender::LinearRegression;
let model = LinearRegression::new();
let trained = model.fit(&x_train, &y_train)?;
let predictions = trained.predict(&x_test);
let mse = predictions.mse(&y_test);
}
Memory Safety
Python (Runtime Errors)
data = [1, 2, 3]
value = data[10] # IndexError at runtime!
Rust (Compile-time Safety)
#![allow(unused)]
fn main() {
let data = vec![1, 2, 3];
// Option 1: Checked access (returns Option)
if let Some(value) = data.get(10) {
// Use value safely
}
// Option 2: Panic-safe with default
let value = data.get(10).unwrap_or(&0);
}
Performance Comparison
| Operation | Python | Rust | Speedup |
|---|---|---|---|
| Matrix mult (1000x1000) | 50ms | 3ms | 16.7x |
| List iteration | 100ms | 5ms | 20x |
| JSON parsing | 25ms | 2ms | 12.5x |
| File I/O | 15ms | 3ms | 5x |
Key factors:
- No GIL contention
- No interpreter overhead
- Direct SIMD access
- Zero-cost abstractions
EU AI Act Compliance
Article 10: Data Governance
#![allow(unused)]
fn main() {
// No dynamic import of untrusted code
// All dependencies compiled and verified
use approved_ml_lib::Model;
}
Article 13: Transparency
- Type annotations make behavior explicit
- Source-to-source mapping preserved
- All transformations documented
Article 15: Robustness
- Memory-safe execution
- Type-safe operations
- No GIL-related race conditions
Testing
#![allow(unused)]
fn main() {
#[test]
fn test_numpy_pattern_dot_product() {
let a = vec![1.0, 2.0, 3.0];
let b = vec![4.0, 5.0, 6.0];
let dot: f64 = a.iter()
.zip(b.iter())
.map(|(x, y)| x * y)
.sum();
// 1*4 + 2*5 + 3*6 = 32
assert!((dot - 32.0).abs() < 1e-10);
}
#[test]
fn test_list_comprehension_filter_map() {
// [x*2 for x in data if x > 2]
let data = vec![1, 2, 3, 4, 5];
let result: Vec<i32> = data.iter()
.filter(|&x| *x > 2)
.map(|x| x * 2)
.collect();
assert_eq!(result, vec![6, 8, 10]);
}
}
Key Takeaways
- GIL eliminated: True parallelism with Rayon
- Type safety: Compile-time error detection
- ML patterns preserved: NumPy → trueno, sklearn → aprender
- Performance gains: 5-20x faster execution
- EU compliant: Auditable, transparent, robust
Next Steps
- Chapter 11: decy - TypeScript to Rust transpilation
- Chapter 12: aprender - ML training with Rust
Source Code
Full implementation: examples/ch10-depyler/
# Verify all claims
make test-ch10
# Run examples
make run-ch10