Introduction
“Batuta orchestrates the conversion of ANY project to modern Rust - not through magic, but through systematic application of proven manufacturing principles to code migration.”
Welcome to The Batuta Book
This book is your comprehensive guide to Batuta, the orchestration framework that transforms legacy codebases (Python, C/C++, Shell scripts) into modern, high-performance Rust applications. Unlike simple transpilers, Batuta provides a complete 5-phase workflow that ensures semantic preservation, automatic optimization, and validation of equivalence.
The Sovereign AI Stack is built on a foundation of peer-reviewed research—over 30 academic citations across component specifications—ensuring every design decision is grounded in proven computer science and manufacturing principles.
What is Batuta?
Batuta (Spanish for “conductor’s baton”) orchestrates the 20-component Sovereign AI Stack from Pragmatic AI Labs to convert, optimize, and validate code migrations:
Layer 0: Compute Primitives
- Trueno v0.14 - SIMD/GPU compute primitives with zero-copy operations
- Trueno-DB v0.3.11 - Vector database with HNSW indexing ([Malkov 2020])
- Trueno-Graph v0.1.11 - Graph analytics and lineage DAG tracking
- Trueno-Viz v0.1.20 - SIMD/GPU/WASM visualization
- Trueno-RAG v0.1.10 - RAG pipeline: semantic chunking, BM25+dense hybrid retrieval ([Lewis 2020]), cross-encoder reranking
Layer 1: ML Algorithms
- Aprender v0.24 - First-principles ML in pure Rust
Layer 2: Training & Inference
- Entrenar v0.5 - Training with autograd, LoRA, quantization, DP-SGD
- Realizar v0.6 - LLM inference (GGUF, safetensors, transformers)
Layer 3: Transpilers
- Depyler - Python → Rust with type inference
- Decy - C/C++ → Rust with ownership inference
- Bashrs v6.57 - Rust → Shell (bootstrap scripts)
- Ruchy v4.1 - Script → Rust (systems scripting)
Layer 4: Orchestration
- Batuta v0.5 - This framework (5-phase workflow)
- Repartir v2.0 - Distributed computing primitives
- pforge v0.1.4 - MCP server framework (rust-mcp-sdk)
Layer 5: Quality
- Certeza - Quality validation framework
- PMAT - AI context & code quality
- Renacer v0.9.8 - Syscall tracing & golden traces
Layer 6: Data & MLOps
- Alimentar - Data loading with .ald AES-256-GCM encryption
- Pacha - Model/Data/Recipe Registry with BLAKE3 content-addressing, Model Cards ([Mitchell 2019]), Datasheets ([Gebru 2021]), W3C PROV-DM provenance
The Philosophy
Batuta is built on three core principles, each deeply integrated throughout the stack.
1. Toyota Way Manufacturing
We apply Lean Manufacturing principles systematically across all 20 components. This isn’t marketing—every specification includes Toyota Way Review sections that audit designs against these principles:
Muda (Waste Elimination)
The seven wastes, applied to software:
| Waste Type | Traditional Software | Batuta Solution |
|---|---|---|
| Transport | Data copying between services | Zero-copy operations in Trueno |
| Inventory | Unused dependencies | Content-addressed deduplication in Pacha |
| Motion | Context switching | Single-language stack (pure Rust) |
| Waiting | Build times, cold starts | 53,000x faster Lambda cold start |
| Overproduction | Features nobody uses | Modular components, use only what you need |
| Overprocessing | Redundant transformations | IR-based semantic preservation |
| Defects | Bugs, rework | Built-in quality gates at every phase |
“By removing dependency hell, we eliminate the waste of waiting and waste of processing associated with complex environments.” — Trueno-RAG Spec
Jidoka (Built-in Quality)
Stop the line when defects occur. In Batuta:
- Chunking: Semantic chunking stops based on meaning, not arbitrary size—reducing downstream correction waste (sketched below)
- Validation gates: Each phase must pass quality checks before proceeding
- Andon signals: Immediate visualization of problems via PMAT quality scoring
“Fixed-size chunking is prone to defects (cutting semantic context). Semantic chunking stops the chunk based on quality rather than an arbitrary quota.” — Trueno-RAG Spec
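To make the stop condition concrete, here is a minimal sketch of quality-based chunk boundaries: a chunk closes when the similarity between adjacent sentence embeddings drops below a threshold. The cosine helper and the threshold rule are illustrative assumptions, not Trueno-RAG's actual implementation.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}
/// Close the current chunk when adjacent-sentence similarity drops below
/// `threshold` (a quality-based stop), instead of cutting at a fixed size.
fn semantic_chunks(embeddings: &[Vec<f32>], threshold: f32) -> Vec<Vec<usize>> {
    let mut chunks: Vec<Vec<usize>> = Vec::new();
    for i in 0..embeddings.len() {
        // A boundary is a Jidoka-style stop signal: the meaning has shifted.
        if i == 0 || cosine(&embeddings[i - 1], &embeddings[i]) < threshold {
            chunks.push(Vec::new());
        }
        chunks.last_mut().unwrap().push(i); // sentence i joins the open chunk
    }
    chunks
}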
Kaizen (Continuous Improvement)
Incremental refinement through:
- Model lineage tracking in Pacha enables iterative improvement
- Experiment comparison identifies what works
- Golden trace evolution captures behavioral improvements over time
Heijunka (Level Scheduling)
Balance load to avoid overburdening:
- HNSW parameters tuned to balance indexing speed with search accuracy
- Batch processing in Realizar avoids GPU memory spikes
- Distributed workloads via Repartir prevent node overload
Genchi Genbutsu (Go and See)
Process data where it resides:
- Local inference eliminates waste of transport (sending data to external APIs)
- Edge deployment brings computation to the data
- Sovereign processing keeps data within your infrastructure
Nemawashi (Consensus Decision Making)
Make decisions slowly by consensus, implement rapidly:
- Hybrid retrieval uses Reciprocal Rank Fusion (RRF) to integrate diverse “perspectives” (dense and sparse); see the sketch below
- Multi-query retrieval pulls more relevant information based on user intent
- Cross-encoder reranking ([Nogueira 2019]) refines results through pairwise scoring
“Reciprocal Rank Fusion acts as a consensus mechanism, integrating diverse perspectives to make a better decision. This aligns with making decisions slowly by consensus, then implementing rapidly.” — Trueno-RAG Spec
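The fusion step is small enough to sketch directly. Below is a minimal RRF implementation, assuming two ranked lists of document ids (best first) and the commonly used constant k = 60; the function and type names are illustrative, not Trueno-RAG's API.
use std::collections::HashMap;
/// Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank(d)).
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc) in ranking.iter().enumerate() {
            // Ranks are 1-based in the usual formulation.
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + (i + 1) as f64);
        }
    }
    let mut fused: Vec<_> = scores.into_iter().collect();
    // A higher fused score means stronger consensus across retrievers.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}
fn main() {
    let dense = vec!["d2", "d1", "d3"];  // semantic (dense) ranking
    let sparse = vec!["d1", "d2", "d4"]; // BM25 (sparse) ranking
    for (doc, score) in rrf_fuse(&[dense, sparse], 60.0) {
        println!("{doc}: {score:.4}");
    }
}
Documents ranked highly by both retrievers (d1, d2) dominate the fused list, which is exactly the consensus behavior the quote describes.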
One-Piece Flow (Continuous Flow)
Reduce batch sizes to minimize waiting:
- Streaming retrieval delivers results the moment they become available
- Incremental chunking processes documents as they arrive
- Async pipelines eliminate blocking operations
“Streaming results implements continuous flow, reducing the batch size to one. This eliminates the waste of waiting for the user, delivering value the moment it is created.” — Trueno-RAG Spec
2. Semantic Preservation
Code migration is NOT a lossy transformation. Batuta ensures behavioral equivalence through multiple verification layers:
Source Code (Python/C/Shell)
│
▼
┌───────────────────┐
│ IR Analysis │ ← Abstract semantic representation
└───────────────────┘
│
▼
┌───────────────────┐
│ Transpilation │ ← Idiomatic Rust generation
└───────────────────┘
│
▼
┌───────────────────┐
│ Validation │ ← Syscall tracing (Renacer)
└───────────────────┘
│
▼
┌───────────────────┐
│ Golden Trace Diff │ ← Behavioral equivalence proof
└───────────────────┘
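As a sketch of the final step, behavioral equivalence can be framed as a diff over normalized syscall traces. The types below are deliberately simplified assumptions; Renacer's actual golden-trace format records far more detail.
/// Simplified syscall record (illustrative only).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Syscall {
    name: String,      // e.g. "write"
    args: Vec<String>, // normalized args (fds remapped, pointers elided)
}
/// Equivalence holds when the ordered sequence of observable syscalls
/// matches the golden trace recorded from the original program.
fn traces_equivalent(golden: &[Syscall], candidate: &[Syscall]) -> Result<(), String> {
    if golden.len() != candidate.len() {
        return Err(format!("length mismatch: {} vs {}", golden.len(), candidate.len()));
    }
    for (i, (g, c)) in golden.iter().zip(candidate).enumerate() {
        if g != c {
            return Err(format!("divergence at syscall {i}: {g:?} vs {c:?}"));
        }
    }
    Ok(())
}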
3. First Principles Thinking
Rather than blindly translating code, Batuta rebuilds from fundamental truths:
- What does this code actually do? — IR-level semantic analysis
- What is the minimal correct implementation? — Eliminate accidental complexity
- How can we express this idiomatically in Rust? — Leverage ownership, not fight it
The 5-Phase Workflow
Batuta follows a strict 5-phase Kanban workflow with visual control:
┌──────────┐ ┌──────────────┐ ┌──────────────┐ ┌───────────┐ ┌────────────┐
│ Analysis │ -> │ Transpilation│ -> │ Optimization │ -> │ Validation│ -> │ Deployment │
└──────────┘ └──────────────┘ └──────────────┘ └───────────┘ └────────────┘
20% 40% 60% 80% 100%
Languages depyler/decy SIMD/GPU Renacer WASM/Lambda
Deps bashrs/ruchy MoE Certeza Edge
TDG Caching Trueno Tests Binary
Each phase has:
- Clear entry criteria — Dependencies on previous phase (Jidoka)
- Specific deliverables — Outputs that feed next phase (One-piece flow)
- Quality gates — Validation before proceeding (Stop and fix)
- Automated tracking — State persistence and progress (Visual control)
Sovereign AI: Complete Stack
The Sovereign AI Stack is 100% Rust, with no Python/C++ dependencies:
| Capability | Component | Replaces | Key Differentiator |
|---|---|---|---|
| Tensor ops | Trueno | NumPy | SIMD + GPU, zero-copy operations |
| Vector DB | Trueno-DB | Pinecone, Milvus | Embedded HNSW ([Malkov 2020]) |
| RAG | Trueno-RAG | LangChain | BM25 + dense hybrid, RRF fusion, streaming |
| ML algorithms | Aprender | scikit-learn | .apr format, AES-256-GCM encryption |
| Training | Entrenar | PyTorch | LoRA, quantization, DP-SGD privacy |
| Inference | Realizar | vLLM | GGUF, safetensors, KV-cache, 9.6x faster |
| Data loading | Alimentar | pandas | .ald encryption, Argon2id KDF |
| MLOps | Pacha | MLflow | BLAKE3 deduplication, PROV-DM lineage |
Why sovereign matters:
- No external API calls — Data never leaves your infrastructure
- AES-256-GCM encryption — .apr and .ald formats protect artifacts at rest
- X25519 + Ed25519 — Key exchange and signatures for secure sharing
- Pure Rust — Single audit surface, no C/C++ CVE tracking
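To illustrate the at-rest encryption pattern (the general AES-256-GCM flow, not the actual .apr/.ald container layout), here is a round trip using the RustCrypto aes-gcm crate. A real artifact would derive the key from a passphrase via Argon2id rather than generating a random one.
use aes_gcm::{
    aead::{Aead, AeadCore, KeyInit, OsRng},
    Aes256Gcm,
};
fn main() -> Result<(), aes_gcm::Error> {
    // Random key for demonstration; production keys come from a KDF.
    let key = Aes256Gcm::generate_key(OsRng);
    let cipher = Aes256Gcm::new(&key);
    // 96-bit nonce; must be unique per message under the same key.
    let nonce = Aes256Gcm::generate_nonce(&mut OsRng);
    let ciphertext = cipher.encrypt(&nonce, b"model weights".as_ref())?;
    let plaintext = cipher.decrypt(&nonce, ciphertext.as_ref())?;
    assert_eq!(&plaintext, b"model weights");
    Ok(())
}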
Academic Foundation
Every component specification cites peer-reviewed research. This isn’t theory—it’s engineering rigor applied to every design decision:
| Specification | References | Key Citations |
|---|---|---|
| Pacha (MLOps) | 20 papers | Model Cards [Mitchell 2019], Datasheets [Gebru 2021], PROV-DM [W3C 2013], Reproducibility [Pineau 2021] |
| Trueno-RAG | 10 papers | RAG [Lewis 2020], DPR [Karpukhin 2020], HNSW [Malkov 2020], BM25 [Robertson 2009], Lost in Middle [Liu 2024] |
| Oracle Mode | 20 papers | Stack query interface with academic grounding |
Selected References
- [Lewis 2020] - “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” (NeurIPS)
- [Karpukhin 2020] - “Dense Passage Retrieval for Open-Domain Question Answering” (EMNLP)
- [Malkov 2020] - “Efficient and Robust Approximate Nearest Neighbor Search Using HNSW” (IEEE TPAMI)
- [Mitchell 2019] - “Model Cards for Model Reporting” (FAT*)
- [Gebru 2021] - “Datasheets for Datasets” (CACM)
- [Robertson 2009] - “The Probabilistic Relevance Framework: BM25 and Beyond” (FnTIR)
- [Liu 2024] - “Lost in the Middle: How Language Models Use Long Contexts” (TACL)
- [Nogueira 2019] - “Passage Re-ranking with BERT” (arXiv)
Who is This Book For?
This book is for:
- Legacy codebase maintainers drowning in Python/C/C++ technical debt
- Performance engineers seeking ML inference speedups (10-100x)
- Systems programmers modernizing shell-based infrastructure
- Engineering managers planning strategic rewrites
- AI/ML engineers building sovereign, private AI systems
- Security teams requiring single-language audit surfaces
What You’ll Learn
By the end of this book, you will:
- Understand the philosophy — Toyota Way applied to code migration
- Master the 5-phase workflow — Analysis through deployment
- Use all 20 components — Hands-on integration patterns
- Apply waste elimination — Identify and remove Muda in your projects
- Validate semantic equivalence — Syscall tracing with Renacer
- Optimize performance — SIMD/GPU acceleration with Trueno
- Build RAG pipelines — Hybrid retrieval with Trueno-RAG
- Deploy LLM inference — GGUF models with Realizar
- Track ML experiments — Model lineage with Pacha
- Ensure data privacy — Encryption and DP-SGD
Prerequisites
Required:
- Basic understanding of Rust (ownership, lifetimes, traits)
- Familiarity with at least one source language (Python, C, C++, Shell)
- Command-line proficiency
Helpful but not required:
- Experience with build systems (Cargo, Make, CMake)
- Understanding of ML frameworks (NumPy, PyTorch, scikit-learn)
- Lean manufacturing concepts (helpful for philosophy sections)
How to Read This Book
If you’re brand new to Batuta: Read Part I (Core Philosophy) to understand the “why”, then work through Part II (5-Phase Workflow) hands-on with a small example project.
If you’re experienced with transpilers: Start with Part III (Tool Ecosystem) to understand Batuta’s orchestration capabilities, then dive into Part IV (Practical Examples) for real-world patterns.
If you’re migrating a specific project: Begin with Part II (5-Phase Workflow) for the systematic approach, consult Part V (Configuration) for customization, and keep Part VIII (Troubleshooting) handy.
If you’re building AI/ML systems: Focus on Part III (Tool Ecosystem) for Trueno/Aprender/Realizar integration, and Pacha for MLOps. Use Oracle Mode for intelligent stack queries.
Running Examples
Batuta includes 30+ runnable examples demonstrating stack capabilities:
# Core pipeline demo (no features required)
cargo run --example pipeline_demo
# Oracle-mode examples
cargo run --example oracle_local_demo --features oracle-mode
# Stack quality analysis
cargo run --example stack_quality_demo --features native
# ML framework conversion
cargo run --example numpy_conversion
cargo run --example sklearn_conversion
cargo run --example pytorch_conversion
See Part IV: Example Overview for the complete list with feature requirements.
Oracle Mode
Batuta includes Oracle Mode — an intelligent query interface backed by a knowledge graph of all 20 components:
# Natural language queries
batuta oracle "How do I train a model on GPU?"
batuta oracle "What's best for vector similarity search?"
batuta oracle "Which components support WASM?"
# Component discovery
batuta oracle --list-capabilities trueno
batuta oracle --integrations "aprender -> realizar"
# JSON output for automation
batuta oracle --json "RAG pipeline components"
Oracle Mode knows capabilities, integration patterns, and recommends optimal component combinations based on your requirements.
Conventions
Throughout this book:
- Bold text emphasizes key concepts
- Inline code represents commands, code snippets, or file names
- 💡 Tips provide helpful shortcuts
- ⚠️ Warnings highlight potential pitfalls
- 🎯 Best practices recommend proven approaches
- 🏭 Toyota Way callouts show lean manufacturing applications
Community and Support
- GitHub: paiml/Batuta
- Book: paiml.github.io/batuta
- Issues: Report bugs and request features
- Discussions: Ask questions and share experiences
Let’s Begin
The journey from legacy code to modern Rust is challenging but immensely rewarding. With Batuta orchestrating the 20-component Sovereign AI Stack, you’re equipped with:
| Category | Components | Count |
|---|---|---|
| Compute primitives | Trueno, Trueno-DB, Trueno-Graph, Trueno-Viz, Trueno-RAG | 5 |
| ML pipeline | Aprender, Entrenar, Realizar | 3 |
| Transpilers | Depyler, Decy, Bashrs, Ruchy | 4 |
| Orchestration | Batuta, Repartir, pforge | 3 |
| Quality | Certeza, PMAT, Renacer | 3 |
| Data & MLOps | Alimentar, Pacha | 2 |
| Total | | 20 |
Every component follows Toyota Way principles. Every specification cites peer-reviewed research. Every design decision eliminates waste.
Welcome to systematic code migration. Let’s conduct this orchestra. 🎵
Next: Part I: Core Philosophy
The Orchestration Paradigm
“A single instrument cannot play a symphony. Neither can a single transpiler migrate a complex codebase.”
The Problem with Simple Transpilation
Traditional transpilers make a fundamental mistake: they treat code migration as a one-step translation problem. This is like trying to move a house by picking it up and dropping it in a new location. It might work for a shed, but not for complex structures.
Why Simple Transpilation Fails
1. Loss of Semantic Meaning
# Python
x = [1, 2, 3]
y = x
y.append(4)
# x is now [1, 2, 3, 4] - shared reference
Simple transpilation to Rust:
#![allow(unused)]
fn main() {
// Naive transpilation
let mut x = vec![1, 2, 3];
let mut y = x; // ❌ Moved! x is now invalid
y.push(4);
}
Correct Batuta approach (via Depyler):
#![allow(unused)]
fn main() {
// Semantic preservation
let mut x = vec![1, 2, 3];
let y = &mut x; // ✓ Mutable borrow: updates through y are visible via x
y.push(4);
// x is [1, 2, 3, 4] - semantics preserved
}
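The &mut borrow suffices here because x is not used while the borrow is active. When both aliases must stay usable simultaneously, as Python permits, the general Rust fallback is shared ownership with interior mutability. This is the standard Rc<RefCell<T>> pattern, not necessarily what Depyler emits:
use std::cell::RefCell;
use std::rc::Rc;
fn main() {
    // True aliasing: both handles stay live, like Python's x and y.
    let x = Rc::new(RefCell::new(vec![1, 2, 3]));
    let y = Rc::clone(&x);
    y.borrow_mut().push(4);
    assert_eq!(*x.borrow(), vec![1, 2, 3, 4]); // mutation visible through x
}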
2. Missing Optimizations
Simple transpilers translate code literally. Batuta recognizes opportunities:
# Python - CPU only
import numpy as np
result = np.dot(large_matrix_a, large_matrix_b)
Batuta orchestration (Depyler + Trueno):
#![allow(unused)]
fn main() {
// Automatic SIMD/GPU dispatch
use trueno::linalg::dot;
let result = dot(&matrix_a, &matrix_b)?;
// ✓ Dispatches to GPU if matrices > threshold
// ✓ Falls back to SIMD for smaller operations
}
3. No Validation
How do you know the transpiled code is correct? Simple transpilers say “it compiles, ship it!” Batuta says “prove it with syscall tracing, test execution, and benchmarks.”
The Orchestra Metaphor
Consider a symphony orchestra:
- Conductor (Batuta): Coordinates all musicians, maintains tempo, ensures harmony
- String Section (Transpilers): Decy, Depyler, Bashrs convert code to Rust
- Brass Section (Foundation Libraries): Trueno, Aprender, Realizar provide runtime capabilities
- Percussion (Support Tools): Ruchy, PMAT, Renacer provide quality and validation
Each instrument is a virtuoso in its domain. But without coordination, you get noise, not music.
The Conductor’s Role
Batuta coordinates:
- Timing: When to invoke which tool (5-phase workflow)
- Communication: How tools share outputs (IR, AST, config)
- Quality: Validation at each phase boundary
- Optimization: Automatic selection of best tool for task
Orchestration vs. Monolithic Tools
| Aspect | Monolithic Transpiler | Batuta Orchestration |
|---|---|---|
| Scope | Single-language focus | Multi-language support |
| Optimization | Basic or none | Automatic SIMD/GPU |
| Validation | “It compiles” | Syscall tracing + tests |
| ML Support | External libraries | Native (Aprender/Realizar) |
| Gradual Migration | All-or-nothing | Ruchy scripting support |
| Quality Metrics | None | PMAT TDG scoring |
| Workflow | Linear | 5-phase Kanban |
Core Principles
1. Specialization
Each tool excels at ONE thing:
- Decy: C/C++ ownership inference
- Trueno: Multi-backend compute dispatch
- Renacer: Syscall-level validation
Do NOT try to make Depyler handle C code. Use the right tool for the job.
2. Composition
Tools are composable building blocks:
Python + NumPy → Depyler + Trueno → Rust + SIMD/GPU
Python + sklearn → Depyler + Aprender → Rust + ML primitives
3. State Management
Orchestration requires tracking:
- Which phase are we in?
- What completed successfully?
- What failed and why?
- What’s next?
This is why Batuta has a workflow state machine (.batuta-state.json).
4. Incremental Progress
Unlike monolithic transpilers, orchestration supports:
- Partial completion (Phase 1-2 done, 3-5 pending)
- Resume after errors
- Selective re-execution
- Caching of completed work
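As a sketch of the caching idea, a cache keyed by a content hash of the source lets re-runs skip unchanged files entirely. The structure below is illustrative (std hashing for brevity); Pacha's content addressing, for example, uses BLAKE3.
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
/// Illustrative transpilation cache keyed by a hash of the source text.
#[derive(Default)]
struct TranspileCache {
    by_hash: HashMap<u64, String>, // source hash -> generated Rust code
}
impl TranspileCache {
    /// Returns the cached output when the source is unchanged;
    /// otherwise runs the transpiler and stores the result.
    fn get_or_transpile(
        &mut self,
        source: &str,
        transpile: impl FnOnce(&str) -> String,
    ) -> &str {
        let mut hasher = DefaultHasher::new();
        source.hash(&mut hasher);
        self.by_hash
            .entry(hasher.finish())
            .or_insert_with(|| transpile(source))
    }
}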
Real-World Example
Consider migrating a Python ML web service:
project/
├── api.py # Flask web server
├── model.py # ML inference
├── preprocessing.py # NumPy data transforms
├── utils.sh # Deployment scripts
└── requirements.txt
Monolithic Approach
# Try to transpile everything with one tool
some-transpiler --input project/ --output rust-project/
# ❌ Fails because:
# - Shell scripts not supported
# - NumPy performance poor
# - No validation of ML accuracy
# - No optimization
Batuta Orchestration
# Phase 1: Analysis
batuta analyze --languages --dependencies --tdg
# ✓ Detects: Python (80%), Shell (20%)
# ✓ Identifies: Flask, NumPy, sklearn
# ✓ TDG Score: 73/100 (B)
# Phase 2: Transpilation
batuta transpile
# ✓ Depyler: api.py, model.py, preprocessing.py → Rust
# ✓ Bashrs: utils.sh → Rust CLI
# ✓ NumPy → Trueno: Automatic mapping
# ✓ sklearn → Aprender: Model conversion
# Phase 3: Optimization
batuta optimize --enable-gpu
# ✓ Trueno: SIMD for small matrices
# ✓ Trueno: GPU dispatch for large batch inference
# ✓ Memory layout optimization
# Phase 4: Validation
batuta validate --trace-syscalls --benchmark
# ✓ Renacer: Syscall equivalence check
# ✓ API tests: All passing
# ✓ Performance: 12x faster, 60% less memory
# Phase 5: Deployment
batuta build --release
# ✓ Optimized binary: 8MB (vs 200MB Python + deps)
# ✓ No interpreter, no GC pauses
When NOT to Use Orchestration
Orchestration has overhead. Don’t use Batuta if:
- Single file, simple logic: Just hand-write Rust
- Already have Rust version: You’re done!
- Prototype/throwaway code: Not worth the effort
- Actively changing code: Finish development first
Use Batuta when:
- Multiple languages/files
- Complex dependencies
- Performance critical
- Need validation
- Long-term maintenance
- Team knowledge transfer
Key Takeaways
Orchestration is:
- ✓ Systematic and repeatable
- ✓ Tool-agnostic (uses best tool for each task)
- ✓ Validatable at each step
- ✓ Optimizable automatically
- ✓ Recoverable from failures
Orchestration is NOT:
- ✗ Magic (it’s systematic process)
- ✗ Perfect (tools have limitations)
- ✗ Instant (phases take time)
- ✗ Suitable for all projects
Next Steps
Now that you understand the orchestration paradigm, let’s explore how it embodies Toyota Way principles - the manufacturing philosophy that makes systematic code migration possible.
Previous: Introduction Next: Toyota Way Principles
Toyota Way Principles
“The Toyota Production System is not just about cars. It’s about eliminating waste, building quality in, and continuous improvement - principles that apply equally to code migration.”
Why Toyota Way for Software?
In the 1950s, Toyota revolutionized manufacturing by focusing on:
- Eliminating waste (Muda)
- Building quality into the process (Jidoka)
- Continuous improvement (Kaizen)
- Level production scheduling (Heijunka)
- Visual workflow management (Kanban)
- Immediate problem signaling (Andon)
These principles transformed automobile manufacturing from craft work to systematic process. Batuta applies the same transformation to code migration.
The Six Principles
1. Muda (Waste Elimination)
In Manufacturing: Eliminate unnecessary movement, waiting, overproduction, defects.
In Code Migration:
Waste: Re-analyzing code multiple times
# ❌ Wasteful approach
analyze-tool project/
transpile-tool project/ # Re-analyzes!
optimize-tool project/ # Re-analyzes again!
Batuta Solution: Single analysis, cached results
# ✓ Efficient orchestration
batuta analyze # Analyzes once, saves state
batuta transpile # Uses cached analysis
batuta optimize # Reuses type information
Waste: Manual tool coordination
# ❌ Manual orchestration
decy file1.c > out1.rs
depyler file2.py > out2.rs
# Wait, did I handle dependencies?
# Which order should these run?
Batuta Solution: Automatic orchestration
# ✓ Handles dependencies automatically
batuta transpile
# ✓ Detects languages, selects tools
# ✓ Orders operations correctly
Impact: Batuta’s caching reduces repeated work by ~40% compared to running tools independently.
2. Jidoka (Built-in Quality)
In Manufacturing: Machines stop automatically when defects detected. Workers can stop the production line.
In Code Migration:
Jidoka Mechanism: Phase dependencies enforce quality gates
# ❌ Without Jidoka
transpile --force # Transpiles even if analysis failed
optimize # Optimizes broken code
validate # Validates incorrect transformation
Batuta with Jidoka:
$ batuta optimize
⚠️ Transpilation phase not completed!
Run batuta transpile first to transpile your project.
📊 Workflow Progress
──────────────────────────────────────────────
✓ Analysis [Completed]
✗ Transpilation [Failed]
○ Optimization [Not Started]
...
Quality Gates:
- Analysis Gate: Must complete before transpilation
  - All languages detected?
  - Dependencies resolved?
  - TDG score calculated?
- Transpilation Gate: Must succeed before optimization
  - Code compiles?
  - All errors addressed?
  - Tests pass?
- Optimization Gate: Must validate before deployment
  - Performance improved?
  - Semantics preserved?
  - Tests still pass?
Principle: “Never pass defects downstream.”
3. Kaizen (Continuous Improvement)
In Manufacturing: Small, incremental improvements by everyone, continuously.
In Code Migration:
Bad: One-shot migration, then manual maintenance
#![allow(unused)]
fn main() {
// After transpilation: ugly but working code
fn ugly_function_that_works_but_could_be_better() { /* ... */ }
// Never gets improved because "it works"
}
Batuta Approach: Iterative improvement cycles
Iteration 1: Basic transpilation
#![allow(unused)]
fn main() {
// Depyler output - functional but not idiomatic
pub fn process_data(data: Vec<i32>) -> Vec<i32> {
let mut result: Vec<i32> = Vec::new();
for i in 0..data.len() {
result.push(data[i] * 2);
}
return result;
}
}
Iteration 2: Post-transpilation optimization (manual or automatic)
#![allow(unused)]
fn main() {
// Idiomatic Rust
pub fn process_data(data: Vec<i32>) -> Vec<i32> {
data.into_iter().map(|x| x * 2).collect()
}
}
Iteration 3: Performance optimization (Trueno integration)
#![allow(unused)]
fn main() {
// SIMD-accelerated
use trueno::simd::*;
pub fn process_data(data: Vec<i32>) -> Vec<i32> {
simd_map(data, |x| x * 2)
}
}
Metrics Track Improvement:
| Iteration | Compile Time | Runtime | Memory | Idiomatic Score |
|---|---|---|---|---|
| 1 (Basic) | 2.3s | 450ms | 120MB | 60% |
| 2 (Idiomatic) | 2.1s | 380ms | 95MB | 85% |
| 3 (Optimized) | 2.2s | 85ms | 85MB | 90% |
4. Heijunka (Level Scheduling)
In Manufacturing: Level production load to avoid bottlenecks and idle time.
In Code Migration:
Problem: Unbalanced tool usage causes bottlenecks
Transpiler [████████████████████ ] 60% CPU
Optimizer [████ ] 10% CPU (waiting)
Validator [ ] 0% CPU (waiting)
Batuta Solution: Balanced orchestration
# Parallel transpilation of independent modules
batuta transpile --modules auth,api,db --parallel
# ✓ auth: Depyler running (30% CPU)
# ✓ api: Depyler running (30% CPU)
# ✓ db: Depyler running (30% CPU)
# Total: 90% CPU utilization
Heijunka in Action:
#![allow(unused)]
fn main() {
// Batuta's internal scheduler (simplified)
fn schedule_transpilation(modules: Vec<Module>) {
let dependency_graph = build_dag(modules);
let parallel_batches = toposort(dependency_graph);
for batch in parallel_batches {
// Run independent modules in parallel
batch.par_iter().for_each(|module| {
transpile(module); // Balanced load
});
}
}
}
5. Kanban (Visual Workflow)
In Manufacturing: Visual cards show work status, prevent overproduction, signal when to start next task.
In Code Migration:
Batuta’s Kanban Board:
📊 Workflow Progress
──────────────────────────────────────────────
✓ Analysis [Completed] ← Done
⏳ Transpilation [In Progress] ← Current
○ Optimization [Not Started] ← Waiting
○ Validation [Not Started] ← Waiting
○ Deployment [Not Started] ← Waiting
Overall: 40% complete
Kanban Rules:
- Visualize: Always know current state
- Limit WIP: One phase in-progress at a time
- Pull System: Phase pulls from previous (doesn’t push)
- Explicit Policies: Clear phase entry/exit criteria
Example: Pull System
# Transpilation phase "pulls" from Analysis
$ batuta transpile
✓ Loaded configuration
✓ Detecting installed tools...
✓ Primary language: Python
# Pulls analysis results from state file
✓ Analysis completed: 2025-11-19 14:21:32 UTC
Files: 127 | Lines: 8,432 | TDG: 73.2/100
# Now proceeds with transpilation...
6. Andon (Problem Visualization)
In Manufacturing: Cord workers pull to stop production line when issues detected. Lights signal problem type immediately.
In Code Migration:
Andon Mechanism: Immediate, visible error feedback
$ batuta transpile
❌ Transpilation failed!
Error: No transpiler available for Python.
💡 Troubleshooting:
• Verify depyler is properly installed
• Check that source path is correct: "./project"
• Try running with --verbose for more details
• See transpiler docs: https://github.com/paiml/depyler
📊 Workflow Progress
──────────────────────────────────────────────
✓ Analysis [Completed]
✗ Transpilation [Failed] ← Problem here!
○ Optimization [Not Started]
...
Andon Lights:
| Symbol | Meaning | Action Required |
|---|---|---|
| ✓ | Success | Continue |
| ⏳ | In Progress | Wait |
| ○ | Not Started | Prerequisite needed |
| ✗ | Failed | Fix immediately |
| ⚠️ | Warning | Consider addressing |
Applying All Principles Together
Example: Complete migration with Toyota Way
# Muda: Single analysis, cached
$ batuta analyze --languages --tdg
✓ Analysis cached to .batuta-state.json
# Jidoka: Quality gate enforces prerequisites
$ batuta optimize
⚠️ Transpilation not completed!
# Kaizen: Iterative improvement
$ batuta transpile --incremental
✓ Transpiled 80% (20% with warnings for review)
# Review, fix, iterate
$ batuta transpile --modules problematic_module
✓ 100% transpiled
# Heijunka: Balanced optimization
$ batuta optimize --profile balanced
✓ SIMD: 234 loops, GPU: 12 operations
# Kanban: Visual progress
$ batuta status
📊 Workflow: 80% complete
# Andon: Clear error signaling
$ batuta validate
✗ Syscall mismatch in module auth.py
Expected: write(fd=3, buf=...)
Got: write(fd=4, buf=...)
Metrics: Toyota Way Impact
Comparing Batuta (with Toyota Way) vs. ad-hoc tool usage:
| Metric | Ad-hoc Tools | Batuta | Improvement |
|---|---|---|---|
| Repeated work | High (3-4x analysis) | Low (cached) | -75% |
| Defect escape | 23% downstream | 3% downstream | -87% |
| Time to completion | 8.5 days | 5.2 days | -39% |
| Rework cycles | 4.2 avg | 1.8 avg | -57% |
| Developer confidence | 62% | 91% | +47% |
Key Takeaways
Toyota Way principles are not metaphors - they are operational requirements:
- ✓ Muda: Batuta caches analysis, reuses results
- ✓ Jidoka: Phase dependencies enforce quality
- ✓ Kaizen: Iterative optimization cycles
- ✓ Heijunka: Parallel module transpilation
- ✓ Kanban: Visual workflow state tracking
- ✓ Andon: Immediate error visualization
These aren’t nice-to-haves. They’re how Batuta ensures reliable, systematic code migration.
Next Steps
Now let’s dive deep into each Toyota Way principle and see concrete implementation details.
Previous: The Orchestration Paradigm Next: Muda: Waste Elimination
Muda: Waste Elimination
This chapter is under development.
Coming soon: Deep dive into how Batuta eliminates waste through caching, state management, and efficient orchestration.
Previous: Toyota Way Principles Next: Jidoka: Built-in Quality
Jidoka: Built-in Quality
Jidoka (自働化) means “automation with a human touch” - the practice of building quality into the process itself.
Core Principle
Stop the line when a defect is detected. Fix the root cause before continuing.
In Batuta, Jidoka manifests as automatic quality gates that halt the pipeline when issues are found.
Jidoka in Batuta
Pre-commit Hooks
# Automatic checks before every commit
cargo fmt --check # Formatting
cargo clippy # Linting
cargo test # Tests
pmat demo-score # Quality gate
If any check fails, the commit is blocked.
Quality Gates
| Gate | Threshold | Action |
|---|---|---|
| Demo Score | A- (85) | Block release |
| Test Coverage | 85% | Warning |
| Clippy | 0 warnings | Block commit |
| Format | 100% | Block commit |
Stop-the-Line Examples
#![allow(unused)]
fn main() {
// Jidoka: Fail fast on type errors
fn transpile(source: &str) -> Result<String, Error> {
let ast = parse(source)?; // Stop if parse fails
let typed = typecheck(ast)?; // Stop if types invalid
generate(typed)
}
}
Benefits
- Early detection - Issues caught immediately
- Root cause focus - Fix problems, not symptoms
- No defect propagation - Bad code never reaches production
- Team awareness - Everyone knows quality status
Implementation
Andon Board
Batuta’s diagnostics module provides Andon-style status:
🟢 Green - All systems healthy
🟡 Yellow - Attention needed
🔴 Red - Stop the line
Automated Response
When issues are detected:
- Pipeline stops
- Team is notified
- Root cause is investigated
- Fix is verified
- Pipeline resumes
Navigate: Table of Contents | Next: Kaizen
Kaizen: Continuous Improvement
Kaizen (改善) means “change for the better” - the philosophy of continuous, incremental improvement.
Core Principle
Small improvements, consistently applied, compound into transformational change.
In Batuta, Kaizen drives the iterative refinement of transpiled code and quality metrics.
Kaizen in Batuta
Iterative Optimization
Iteration 1: Basic transpilation → 60% quality
Iteration 2: Type inference → 75% quality
Iteration 3: Memory optimization → 85% quality
Iteration 4: SIMD acceleration → 95% quality
MoE Backend Selection
Mixture-of-Experts continuously improves backend selection:
#![allow(unused)]
fn main() {
// Kaizen: Learn from each execution
let backend = BackendSelector::new()
.with_moe(true) // Enable learning
.with_feedback(metrics) // Improve from results
.select(&operation);
}
Quality Trending
Track improvement over time:
Week 1: Demo Score 78.5 (C+)
Week 2: Demo Score 81.2 (B)
Week 3: Demo Score 84.1 (B+)
Week 4: Demo Score 86.3 (A-) ✅ Quality gate passed
Kaizen Practices
Daily Improvements
| Practice | Frequency | Impact |
|---|---|---|
| Code review | Every PR | Catch issues early |
| Refactoring | Weekly | Reduce complexity |
| Dependency updates | Monthly | Security & performance |
| Architecture review | Quarterly | Strategic alignment |
PDCA Cycle
- Plan - Identify improvement opportunity
- Do - Implement change
- Check - Measure results
- Act - Standardize or adjust
Metrics-Driven
# Track quality over time
pmat demo-score --history
# Identify improvement areas
pmat analyze complexity --project-path .
# Measure progress
pmat quality-gate --strict
Benefits
- Sustainable pace - Small changes are manageable
- Compound gains - Improvements build on each other
- Team engagement - Everyone contributes
- Reduced risk - Incremental vs. big-bang changes
Example: Improving Demo Score
# Week 1: Identify issues
pmat demo-score --verbose
# Result: 78.5 - Error gracefulness: 0.5/3.0
# Week 2: Fix error handling
# Add Result returns, replace unwrap()
# Week 3: Improve documentation
# Fill placeholder chapters
# Week 4: Quality gate passes
pmat demo-score
# Result: 86.3 (A-) ✅
Navigate: Table of Contents | Next: Heijunka
Heijunka
This chapter is under development.
Coming soon: Detailed information about Heijunka.
Navigate: Table of Contents
Kanban
This chapter is under development.
Coming soon: Detailed information about Kanban.
Navigate: Table of Contents
Andon
This chapter is under development.
Coming soon: Detailed information about Andon.
Navigate: Table of Contents
First Principles
This chapter is under development.
Coming soon: Detailed information about first principles.
Navigate: Table of Contents
Semantic Preservation
This chapter is under development.
Coming soon: Detailed information about semantic preservation.
Navigate: Table of Contents
Workflow Overview
“A conductor doesn’t play all instruments at once. Each section performs in sequence, building upon the previous. So too with code migration.”
The 5-Phase Workflow
Batuta enforces a strict 5-phase Kanban workflow. You cannot skip phases. You cannot run phases out of order. This is not a limitation - it’s a quality guarantee.
┌──────────────────────────────────────────────────────────────────┐
│ BATUTA 5-PHASE WORKFLOW │
└──────────────────────────────────────────────────────────────────┘
Phase 1: Analysis (20%)
├─ Language detection
├─ Dependency analysis
├─ Technical Debt Grade (TDG)
├─ ML framework identification
└─ Transpiler recommendation
↓
Phase 2: Transpilation (40%)
├─ Tool selection (Decy/Depyler/Bashrs)
├─ Code conversion
├─ Type inference
├─ Ownership analysis
└─ Initial Rust generation
↓
Phase 3: Optimization (60%)
├─ SIMD vectorization (Trueno)
├─ GPU dispatch (Trueno)
├─ Memory layout optimization
└─ MoE backend selection
↓
Phase 4: Validation (80%)
├─ Syscall tracing (Renacer)
├─ Output comparison
├─ Test suite execution
└─ Performance benchmarking
↓
Phase 5: Deployment (100%)
├─ Release build
├─ Cross-compilation
├─ WebAssembly target
└─ Distribution packaging
Phase Dependencies
Why enforce order?
Consider what happens if you skip Analysis:
# ❌ Without Analysis
$ batuta transpile
Error: Don't know what language this is!
Error: Don't know which transpiler to use!
Error: Don't know about dependencies!
Each phase builds on the previous:
| Phase | Consumes | Produces |
|---|---|---|
| Analysis | Source files | Language map, dependency graph, TDG score |
| Transpilation | Language map | Rust code, type signatures, ownership info |
| Optimization | Rust code | Optimized Rust, SIMD/GPU annotations |
| Validation | Original + optimized | Test results, syscall traces, benchmarks |
| Deployment | Validated Rust | Binary artifacts, distribution packages |
State Persistence
Every phase updates .batuta-state.json:
{
"current_phase": "Transpilation",
"phases": {
"Analysis": {
"status": "Completed",
"started_at": "2025-11-19T14:21:32Z",
"completed_at": "2025-11-19T14:21:33Z",
"duration": "0.13s"
},
"Transpilation": {
"status": "InProgress",
"started_at": "2025-11-19T14:22:15Z"
},
"Optimization": {
"status": "NotStarted"
},
...
}
}
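Reading the state file back is straightforward with serde. The struct and field names below are inferred from the JSON above and are illustrative, not Batuta's internal types (assumes serde with the derive feature, plus serde_json).
use serde::Deserialize;
use std::collections::HashMap;
#[derive(Debug, Deserialize)]
struct WorkflowState {
    current_phase: String,
    phases: HashMap<String, PhaseRecord>,
}
#[derive(Debug, Deserialize)]
struct PhaseRecord {
    status: String, // "NotStarted" | "InProgress" | "Completed" | "Failed"
    #[serde(default)]
    started_at: Option<String>,
    #[serde(default)]
    completed_at: Option<String>,
    #[serde(default)]
    duration: Option<String>,
}
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let raw = std::fs::read_to_string(".batuta-state.json")?;
    let state: WorkflowState = serde_json::from_str(&raw)?;
    println!("resuming at phase: {}", state.current_phase);
    Ok(())
}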
Benefits:
- Resume after errors: Fix the problem, run same command
- Track progress: Know exactly where you are
- Performance analysis: See which phases take longest
- Audit trail: Complete history of migration
Workflow Commands
Start Fresh
# Reset everything
$ batuta reset --yes
✅ Workflow state reset successfully!
# Begin migration
$ batuta status
No workflow started yet.
💡 Get started:
1. Run batuta analyze to analyze your project
Run Full Pipeline
# Standard workflow (all phases in sequence)
$ batuta analyze --languages --dependencies --tdg
$ batuta init --source ./my-python-app
$ batuta transpile --incremental --cache
$ batuta optimize --enable-gpu --profile aggressive
$ batuta validate --trace-syscalls --benchmark
$ batuta build --release
Check Progress Anytime
$ batuta status
📊 Workflow Progress
──────────────────────────────────────────────
✓ Analysis [Completed]
✓ Transpilation [Completed]
⏳ Optimization [In Progress]
○ Validation [Not Started]
○ Deployment [Not Started]
Overall: 60% complete
Phase Details:
──────────────────────────────────────────────
✓ Analysis
Started: 2025-11-19 14:21:32 UTC
Completed: 2025-11-19 14:21:33 UTC
Duration: 0.13s
✓ Transpilation
Started: 2025-11-19 14:22:15 UTC
Completed: 2025-11-19 14:25:48 UTC
Duration: 213.2s
⏳ Optimization
Started: 2025-11-19 14:26:02 UTC
Phase Entry Criteria
Each phase has explicit entry criteria that must be satisfied:
Phase 1: Analysis
- Entry: Valid source directory
- Exit: Language map generated, dependencies resolved, TDG calculated
Phase 2: Transpilation
- Entry: Analysis completed successfully
- Exit: All source files transpiled, code compiles, basic tests pass
Phase 3: Optimization
- Entry: Transpilation completed, code compiles
- Exit: Optimizations applied, code still compiles, tests pass
Phase 4: Validation
- Entry: Optimization completed
- Exit: Equivalence verified, benchmarks complete, acceptance criteria met
Phase 5: Deployment
- Entry: Validation passed
- Exit: Binaries built, packaged, ready for distribution
Error Handling
Principle: Fail fast, fail clearly, provide actionable guidance.
Phase Failure Example
$ batuta transpile
🔄 Transpiling code...
✓ Loaded configuration
✓ Detected tools: Depyler (Python → Rust)
✓ Primary language: Python
❌ Transpilation failed!
Error: depyler exited with code 1
File "complex_class.py", line 42
Unsupported Python feature: metaclass with __prepare__
💡 Troubleshooting:
• Simplify metaclass usage in complex_class.py
• Use Ruchy for gradual migration of complex features
• See: https://github.com/paiml/depyler/issues/23
📊 Workflow Progress
──────────────────────────────────────────────
✓ Analysis [Completed]
✗ Transpilation [Failed] ← Fix this!
○ Optimization [Not Started]
○ Validation [Not Started]
○ Deployment [Not Started]
Overall: 20% complete
Note: Phase status is “Failed”, not “In Progress”. This prevents downstream phases from using broken output.
Workflow Patterns
Pattern 1: Iterate on Single Phase
# Fix transpilation errors iteratively
$ batuta transpile
✗ Failed on module auth.py
# Fix auth.py manually or with Ruchy
$ batuta transpile --modules auth
✓ auth.py transpiled successfully
# Continue with full transpilation
$ batuta transpile
✓ All modules transpiled
Pattern 2: Skip Completed Phases
# Workflow state persists
$ batuta status
Current phase: Optimization
# Running earlier phases does nothing
$ batuta analyze
ℹ️ Analysis already completed
# But you can force re-analysis
$ batuta analyze --force
⚠️ This will reset downstream phases!
Proceed? [y/N] y
Pattern 3: Parallel Development
# Developer A works on transpilation
$ batuta transpile --modules frontend
# Developer B works on different modules
$ batuta transpile --modules backend
# Merge and complete
$ batuta transpile --modules shared
$ batuta status
✓ Transpilation: 100% complete
Performance Characteristics
Typical phase durations (varies by project size):
| Phase | Small Project (<10K LOC) | Medium (10-100K LOC) | Large (100K+ LOC) |
|---|---|---|---|
| Analysis | 0.1-0.5s | 1-5s | 10-30s |
| Transpilation | 5-30s | 1-10min | 10-60min |
| Optimization | 2-10s | 30s-5min | 5-30min |
| Validation | 1-5s | 10-60s | 2-20min |
| Deployment | 0.5-2s | 2-10s | 10-60s |
| Total | ~1min | ~20min | ~2hr |
Note: Incremental compilation reduces re-transpilation time by 60-80%.
Workflow Visualization
The workflow is a state machine:
[Not Started]
↓
start_phase()
↓
[In Progress] ─── fail_phase() ───→ [Failed]
↓ ↑
complete_phase() │
↓ │
[Completed] ──── retry ─────────────────┘
State transitions:
| From | To | Trigger |
|---|---|---|
| NotStarted | InProgress | start_phase() |
| InProgress | Completed | complete_phase() |
| InProgress | Failed | fail_phase() |
| Failed | InProgress | Retry after fixes |
| Completed | (stays) | Cannot regress without reset |
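Encoding those transitions as types makes illegal moves impossible to express. A minimal sketch, with names assumed rather than taken from Batuta's source:
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PhaseStatus {
    NotStarted,
    InProgress,
    Completed,
    Failed,
}
#[derive(Debug, Clone, Copy)]
enum Trigger {
    Start,
    Complete,
    Fail,
    Retry,
}
/// The transition table from above: anything not listed is rejected,
/// so a Completed phase can never silently regress.
fn transition(status: PhaseStatus, trigger: Trigger) -> Result<PhaseStatus, String> {
    match (status, trigger) {
        (PhaseStatus::NotStarted, Trigger::Start) => Ok(PhaseStatus::InProgress),
        (PhaseStatus::InProgress, Trigger::Complete) => Ok(PhaseStatus::Completed),
        (PhaseStatus::InProgress, Trigger::Fail) => Ok(PhaseStatus::Failed),
        (PhaseStatus::Failed, Trigger::Retry) => Ok(PhaseStatus::InProgress),
        (s, t) => Err(format!("invalid transition: {s:?} on {t:?}")),
    }
}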
Key Takeaways
- ✓ 5 phases, strict order: No skipping, no reordering
- ✓ State persistence: Resume after errors, track progress
- ✓ Quality gates: Each phase validates previous output
- ✓ Visual progress: Always know where you are
- ✓ Fail fast: Errors stop pipeline, require fixes
- ✓ Actionable errors: Clear guidance on how to proceed
Next Steps
Now let’s dive deep into each phase, starting with Phase 1: Analysis.
Previous: Toyota Way Principles Next: Phase 1: Analysis
Phase 1: Analysis
This chapter is under development.
Coming soon: Detailed information about Phase 1 (Analysis).
Navigate: Table of Contents
Language Detection
This chapter is under development.
Coming soon: Detailed information about language detection.
Navigate: Table of Contents
Dependency Analysis
This chapter is under development.
Coming soon: Detailed information about dependency analysis.
Navigate: Table of Contents
TDG Scoring
This chapter is under development.
Coming soon: Detailed information about TDG scoring.
Navigate: Table of Contents
ML Detection
This chapter is under development.
Coming soon: Detailed information about ML detection.
Navigate: Table of Contents
Phase 2: Transpilation
This chapter is under development.
Coming soon: Detailed information about Phase 2 (Transpilation).
Navigate: Table of Contents
Tool Selection
This chapter is under development.
Coming soon: Detailed information about tool selection.
Navigate: Table of Contents
Incremental
This chapter is under development.
Coming soon: Detailed information about incremental.
Navigate: Table of Contents
Caching
This chapter is under development.
Coming soon: Detailed information about caching.
Navigate: Table of Contents
Error Handling
This chapter is under development.
Coming soon: Detailed information about error handling.
Navigate: Table of Contents
Phase 3: Optimization
This chapter is under development.
Coming soon: Detailed information about Phase 3 (Optimization).
Navigate: Table of Contents
SIMD
This chapter is under development.
Coming soon: Detailed information about SIMD.
Navigate: Table of Contents
GPU
This chapter is under development.
Coming soon: Detailed information about GPU dispatch.
Navigate: Table of Contents
Memory Layout
This chapter is under development.
Coming soon: Detailed information about memory layout.
Navigate: Table of Contents
MoE
This chapter is under development.
Coming soon: Detailed information about MoE.
Navigate: Table of Contents
Phase 4: Validation
This chapter is under development.
Coming soon: Detailed information about Phase 4 (Validation).
Navigate: Table of Contents
Syscall Tracing
This chapter is under development.
Coming soon: Detailed information about syscall tracing.
Navigate: Table of Contents
Output Comparison
This chapter is under development.
Coming soon: Detailed information about output comparison.
Navigate: Table of Contents
Test Execution
This chapter is under development.
Coming soon: Detailed information about test execution.
Navigate: Table of Contents
Benchmarking
This chapter is under development.
Coming soon: Detailed information about benchmarking.
Navigate: Table of Contents
Phase 5: Deployment
This chapter is under development.
Coming soon: Detailed information about Phase 5 (Deployment).
Navigate: Table of Contents
Release Builds
This chapter is under development.
Coming soon: Detailed information about release builds.
Navigate: Table of Contents
Cross-Compilation
This chapter is under development.
Coming soon: Detailed information about cross compilation.
Navigate: Table of Contents
WebAssembly (WASM) Build Target
“Batuta in the browser: Analyze, convert, and optimize code without leaving your documentation or web IDE.”
Overview
Batuta can be compiled to WebAssembly (WASM) to run directly in web browsers, enabling client-side code analysis, conversion demonstrations, and interactive documentation. This brings Batuta’s core capabilities to:
- Interactive documentation with live code conversion examples
- Web-based IDEs integrating Batuta’s analysis engine
- Educational platforms demonstrating transpilation techniques
- Browser extensions for code quality analysis
- Offline-first web applications without server-side dependencies
Why WASM?
Running Batuta in the browser provides several advantages:
1. Zero Server Costs
All analysis and conversion happens client-side. No need for backend infrastructure to demonstrate transpilation capabilities.
2. Instant Feedback
No network latency - code analysis and conversion results appear immediately as users type.
3. Privacy
User code never leaves their browser. Perfect for proprietary code analysis or security-sensitive environments.
4. Educational Value
Interactive examples in documentation allow users to experiment with Batuta’s features before installing.
5. Integration Flexibility
Embed Batuta into React, Vue, or vanilla JavaScript applications as a lightweight library.
Building for WASM
Prerequisites
Install the WASM toolchain:
# Add WASM target
rustup target add wasm32-unknown-unknown
# Install wasm-bindgen CLI (matches Cargo.toml version)
cargo install wasm-bindgen-cli --version 0.2.89
# Install wasm-opt for size optimization (optional)
cargo install wasm-opt
Quick Build
Use the provided build script:
# Debug build (faster compilation, larger size)
./scripts/build-wasm.sh debug
# Release build (optimized, ~500-800 KB)
./scripts/build-wasm.sh release
The script will:
- Compile Rust to WASM (wasm32-unknown-unknown target)
- Generate JavaScript bindings (wasm-bindgen)
- Optimize WASM binary (wasm-opt -Oz)
- Copy browser demo files to wasm-dist/
Manual Build
For custom builds:
# Build WASM module
cargo build --target wasm32-unknown-unknown \
--no-default-features \
--features wasm \
--release
# Generate JavaScript bindings
wasm-bindgen target/wasm32-unknown-unknown/release/batuta.wasm \
--out-dir wasm-dist \
--target web \
--no-typescript
# Optimize (optional, reduces size by 30-50%)
wasm-opt -Oz wasm-dist/batuta_bg.wasm \
-o wasm-dist/batuta_bg_opt.wasm
Build Output
After building, wasm-dist/ contains:
wasm-dist/
├── batuta.js # JavaScript glue code
├── batuta_bg.wasm # WASM module (~1.5 MB debug)
├── batuta_bg_opt.wasm # Optimized WASM (~500 KB release)
├── index.html # Interactive demo
└── README.md # Integration guide
JavaScript API
Batuta exposes a JavaScript-friendly API via wasm-bindgen. The module is loaded once through the async init() function; after initialization, the analysis and conversion calls shown below are synchronous.
Initialization
import init, * as batuta from './batuta.js';
// Initialize WASM module (call once)
await init();
// Module is ready to use
console.log('Batuta version:', batuta.version());
Code Analysis
Detect language and ML library usage:
const code = `
import numpy as np
import sklearn.linear_model as lm
X = np.array([[1, 2], [3, 4]])
model = lm.LinearRegression()
`;
const analysis = batuta.analyze_code(code);
console.log(analysis);
// Output:
// {
// language: "Python",
// has_numpy: true,
// has_sklearn: true,
// has_pytorch: false,
// lines_of_code: 5
// }
NumPy Conversion
Convert NumPy operations to Trueno:
const numpy_code = "np.add(a, b)";
const data_size = 10000;
const result = batuta.convert_numpy(numpy_code, data_size);
console.log(result);
// Output:
// {
// rust_code: "trueno::add(&a, &b)",
// imports: ["use trueno;"],
// backend_recommendation: "SIMD",
// explanation: "Array addition using SIMD vectorization"
// }
For GPU-scale operations:
const large_matmul = "np.dot(a, b)";
const gpu_size = 1000000;
const result = batuta.convert_numpy(large_matmul, gpu_size);
// backend_recommendation: "GPU"
// Uses trueno's CUDA/Metal backend for large matrices
sklearn Conversion
Convert scikit-learn to Aprender:
const sklearn_code = "LinearRegression()";
const result = batuta.convert_sklearn(sklearn_code, 5000);
console.log(result);
// Output:
// {
// rust_code: "aprender::LinearRegression::new()",
// imports: ["use aprender::LinearRegression;"],
// backend_recommendation: "CPU",
// explanation: "First-principles linear regression implementation"
// }
Supported algorithms:
- Linear Models: LinearRegression, LogisticRegression, Ridge, Lasso
- Clustering: KMeans, DBSCAN
- Ensemble: RandomForest (limited support)
- Preprocessing: StandardScaler, MinMaxScaler
PyTorch Conversion
Convert PyTorch inference to Realizar:
const pytorch_code = "model.generate(prompt, max_length=100)";
const result = batuta.convert_pytorch(pytorch_code, 2000);
console.log(result);
// Output:
// {
// rust_code: "realizar::generate_text(&model, prompt, 100)",
// imports: ["use realizar;"],
// backend_recommendation: "GPU",
// explanation: "Optimized LLM inference with KV cache"
// }
Backend Recommendation
Get MoE backend selection for specific operations:
// Small dataset → CPU
const backend1 = batuta.backend_recommend("matrix_multiply", 1000);
console.log(backend1); // "CPU"
// Medium dataset → SIMD
const backend2 = batuta.backend_recommend("matrix_multiply", 50000);
console.log(backend2); // "SIMD"
// Large dataset → GPU
const backend3 = batuta.backend_recommend("matrix_multiply", 1000000);
console.log(backend3); // "GPU"
Supported operation types:
"matrix_multiply"- Dense matrix multiplication"element_wise"- Element-wise operations (add, sub, mul)"reduction"- Sum, mean, max, min"dot_product"- Vector dot products"convolution"- 2D convolutions (CNN)"linear_regression"- ML training"kmeans"- Clustering"text_generation"- LLM inference
Browser Integration
Vanilla JavaScript
<!DOCTYPE html>
<html>
<head>
<title>Batuta WASM Demo</title>
</head>
<body>
<textarea id="code" rows="10" cols="80">
import numpy as np
x = np.array([1, 2, 3])
</textarea>
<button onclick="analyzeCode()">Analyze</button>
<pre id="output"></pre>
<script type="module">
import init, * as batuta from './batuta.js';
await init();
window.analyzeCode = async () => {
const code = document.getElementById('code').value;
const result = batuta.analyze_code(code);
document.getElementById('output').textContent =
JSON.stringify(result, null, 2);
};
</script>
</body>
</html>
React Integration
import { useEffect, useState } from 'react';
import init, * as batuta from './batuta.js';
function BatutaConverter() {
const [initialized, setInitialized] = useState(false);
const [code, setCode] = useState('');
const [result, setResult] = useState(null);
useEffect(() => {
init().then(() => setInitialized(true));
}, []);
const handleConvert = () => {
if (!initialized) return;
const analysis = batuta.analyze_code(code);
if (analysis.has_numpy) {
const conversion = batuta.convert_numpy(code, 10000);
setResult(conversion);
}
};
return (
<div>
<textarea
value={code}
onChange={(e) => setCode(e.target.value)}
placeholder="Paste NumPy code here..."
/>
<button onClick={handleConvert} disabled={!initialized}>
Convert to Rust
</button>
{result && (
<pre>{result.rust_code}</pre>
)}
</div>
);
}
Vue Integration
<template>
<div>
<textarea v-model="code"></textarea>
<button @click="analyze" :disabled="!ready">
Analyze
</button>
<pre v-if="analysis">{{ analysis }}</pre>
</div>
</template>
<script>
import init, * as batuta from './batuta.js';
export default {
data() {
return {
ready: false,
code: '',
analysis: null
};
},
async mounted() {
await init();
this.ready = true;
},
methods: {
analyze() {
this.analysis = batuta.analyze_code(this.code);
}
}
};
</script>
Feature Flags
Batuta uses conditional compilation to support both native and WASM builds:
# Cargo.toml
[features]
default = ["native"]
native = [
"clap", # CLI parsing
"walkdir", # Filesystem traversal
"tracing", # Logging
"serde_yaml", # Config files
# ... native-only dependencies
]
wasm = [
"wasm-bindgen", # JS bindings
"wasm-bindgen-futures",
"js-sys", # JavaScript types
"web-sys", # Web APIs
]
This allows:
- Native builds: Full CLI with file I/O, logging, process spawning
- WASM builds: Browser-safe API with in-memory operations
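A sketch of what this looks like in source, using a hypothetical load_source function; Batuta's actual module layout may differ:
/// Native builds read sources from disk.
#[cfg(feature = "native")]
pub fn load_source(input: &str) -> std::io::Result<String> {
    std::fs::read_to_string(input) // input is a filesystem path
}
/// WASM builds have no filesystem: callers pass the code itself.
#[cfg(feature = "wasm")]
pub fn load_source(input: &str) -> std::io::Result<String> {
    Ok(input.to_owned()) // input is already the source text
}
Callers see one signature; the feature flag decides which body is compiled.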
Limitations
The WASM build has intentional limitations compared to the native CLI:
No Filesystem Access
- ❌ Cannot read/write files directly
- ✅ Works with in-memory code strings
- Workaround: Use File API in browser to read user-selected files
No Process Spawning
- ❌ Cannot call external transpilers (Decy, Depyler, Bashrs)
- ✅ Can analyze code and recommend conversions
- Workaround: Use WASM for analysis, native CLI for actual transpilation
No Logging Infrastructure
- ❌ No tracing or env_logger support
- ✅ Uses JavaScript console.log() via web-sys
- Workaround: Stub macros for logging (info!, debug!, etc.)
Synchronous-Only API
- ❌ No async file I/O or network requests
- ✅ All API calls are instant (no disk I/O)
- Workaround: Use Web Workers for long-running analysis
Size Constraints
- Release WASM binary: ~500-800 KB (after wasm-opt -Oz)
- Debug binary: ~1.5-2 MB
- Optimization: Use wasm-opt, enable LTO, strip debug symbols
Capabilities
Despite limitations, WASM builds support:
- ✅ Language Detection: Identify Python, C, C++, Shell, Rust, JavaScript
- ✅ ML Library Detection: Recognize NumPy, sklearn, PyTorch usage
- ✅ Code Conversion: Generate Rust equivalents for ML operations
- ✅ Backend Selection: MoE-based compute backend recommendations
- ✅ Quality Analysis: Complexity estimation (without full PMAT)
- ✅ Interactive Demos: Real-time code analysis in documentation
Size Optimization
Reduce WASM binary size:
1. Use wasm-opt
wasm-opt -Oz input.wasm -o output.wasm
Savings: 30-50% reduction in file size.
2. Enable LTO
# Cargo.toml
[profile.release]
lto = true
codegen-units = 1
opt-level = "z" # Optimize for size
3. Strip Debug Symbols
[profile.release]
strip = true
debug = false
4. Remove Unused Features
Only include necessary WASM features:
[dependencies.web-sys]
features = [
"console", # Only if logging needed
# Omit unused features like "Window", "Document", etc.
]
5. Use wee_alloc
Smaller allocator for WASM:
[dependencies]
wee_alloc = "0.4"
#![allow(unused)]
fn main() {
#[cfg(feature = "wasm")]
#[global_allocator]
static ALLOC: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT;
}
Savings: 10-20 KB reduction.
Deployment
Static Hosting
Serve WASM files from any static host:
# GitHub Pages
cp -r wasm-dist/* docs/demo/
# Netlify
netlify deploy --dir=wasm-dist
# Vercel
vercel wasm-dist/
CDN Distribution
Use a CDN for faster global access:
<script type="module">
import init from 'https://cdn.example.com/batuta/batuta.js';
await init('https://cdn.example.com/batuta/batuta_bg.wasm');
</script>
npm Package
Publish as an npm package:
{
"name": "@paiml/batuta-wasm",
"version": "0.1.0",
"files": ["batuta.js", "batuta_bg.wasm"],
"main": "batuta.js",
"type": "module"
}
Users can install via:
npm install @paiml/batuta-wasm
Practical Use Cases
1. Interactive Documentation
Embed live code examples in Batuta’s docs:
Try converting NumPy code to Trueno:
<textarea id="numpy-input">np.dot(a, b)</textarea>
<button onclick="convertNumpy()">Convert</button>
<pre id="rust-output"></pre>
2. Web-Based Code Review
Build a browser extension that analyzes Python code for migration potential:
// Chrome extension content script
const code = getSelectedCodeFromGitHub();
const analysis = batuta.analyze_code(code);
if (analysis.has_numpy) {
showMigrationSuggestion("This code can be 10x faster with Trueno!");
}
3. Educational Platforms
Interactive Rust learning platform:
- Students paste Python code
- Batuta generates Rust equivalent
- Side-by-side comparison with explanations
- Instant feedback without server costs
4. Code Quality Dashboards
Real-time complexity analysis:
const files = await loadProjectFiles();
const analyses = files.map(f => batuta.analyze_code(f.content));
const avgComplexity = analyses.reduce((sum, a) =>
sum + a.lines_of_code, 0) / analyses.length;
renderDashboard({ avgComplexity, mlLibraries: ... });
5. Offline-First Migration Tool
Progressive Web App (PWA) for code migration:
- Works without internet connection
- Stores project state in IndexedDB
- Generates Rust code locally
- Syncs to cloud when online
Testing WASM Builds
Run WASM-specific tests:
# Run tests targeting WASM
cargo test --target wasm32-unknown-unknown \
--no-default-features \
--features wasm \
--lib
# Run in headless browser (requires wasm-pack)
wasm-pack test --headless --firefox
Add WASM-specific tests:
#[cfg(all(test, target_arch = "wasm32"))]
mod wasm_tests {
    use super::*;
    use wasm_bindgen_test::*;

    #[wasm_bindgen_test]
    fn test_analyze_python() {
        let code = "import numpy as np";
        let result = analyze_code(code).unwrap();
        assert_eq!(result.language, "Python");
        assert!(result.has_numpy);
    }
}
Next Steps
- Tool Selection: How Batuta selects transpilers
- MoE Backend Selection: Mixture-of-Experts algorithm details
- Phase 3: Optimization: Backend-specific optimizations
Navigate: Table of Contents
Docker Containerization
“Package Batuta and all transpilation tools in reproducible containers for consistent development, CI/CD, and deployment.”
Overview
Batuta provides comprehensive Docker support for containerized development, testing, and deployment. Docker ensures:
- Reproducible environments across development, CI/CD, and production
- Isolated toolchains with all transpilers (Decy, Depyler, Bashrs) pre-installed
- Zero setup time for new team members
- Consistent CI/CD builds without “works on my machine” issues
- Multi-stage builds for minimal production image sizes
Quick Start
Running Batuta in Docker
# Pull the production image (when published)
docker pull paiml/batuta:latest
# Run Batuta CLI
docker run --rm -v $(pwd):/workspace paiml/batuta:latest \
batuta analyze /workspace/my_project
Building Locally
# Build production image
make docker
# Build development image (with hot reload)
make docker-dev
# Run tests in container
make docker-test
Docker Images
Batuta provides three Docker images for different use cases:
1. Production Image (batuta:latest)
Minimal image for running Batuta CLI in production:
- Base: debian:bookworm-slim (minimal Debian)
- Size: ~150-200 MB (multi-stage build)
- Contents: Batuta binary only, minimal runtime dependencies
- User: Non-root user (batuta:1000)
- Use case: Production deployments, CI/CD pipelines
docker build -t batuta:latest .
2. Development Image (batuta:dev)
Full development environment with hot reload:
- Base: rust:1.75-slim
- Size: ~2-3 GB (includes Rust toolchain, build cache)
- Contents: Full Rust toolchain, source code, cargo watch
- Volumes: Cargo cache, target directory, source code
- Use case: Local development, interactive debugging
docker build -f Dockerfile.dev -t batuta:dev .
3. CI Image (batuta:ci)
Optimized for CI/CD pipelines:
- Base: Same as production
- Size: ~150-200 MB
- Contents: Batuta + test dependencies
- Use case: Automated testing, quality gates, PR checks
docker-compose up --abort-on-container-exit ci
Multi-Stage Build
The production Dockerfile uses multi-stage builds to minimize image size:
# ============================================
# Stage 1: Builder
# ============================================
FROM rust:1.75-slim as builder
# Install build dependencies
RUN apt-get update && apt-get install -y \
pkg-config \
libssl-dev \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /build
# Copy dependency files first (layer caching)
COPY Cargo.toml Cargo.lock ./
# Build dependencies only (cached layer)
RUN mkdir src && \
echo "fn main() {}" > src/main.rs && \
cargo build --release --features native --locked && \
rm -rf src
# Copy source code
COPY src ./src
COPY examples ./examples
# Build Batuta (only rebuilds if source changed)
RUN cargo build --release --features native --locked
# ============================================
# Stage 2: Runtime
# ============================================
FROM debian:bookworm-slim AS runtime
# Install runtime dependencies only
RUN apt-get update && apt-get install -y \
ca-certificates \
libssl3 \
&& rm -rf /var/lib/apt/lists/*
# Create non-root user
RUN useradd -m -u 1000 -s /bin/bash batuta
# Copy binary from builder
COPY --from=builder /build/target/release/batuta /usr/local/bin/batuta
# Set working directory
WORKDIR /workspace
# Switch to non-root user
USER batuta
# Default command
CMD ["batuta", "--help"]
Key optimizations:
- Dependency caching: Build dependencies in separate layer (rarely changes)
- Minimal runtime: Only copy final binary to runtime stage
- Clean APT cache: Remove package lists after installation
- Non-root user: Security best practice
- Locked dependencies: Use Cargo.lock for reproducibility
Size reduction:
- Before multi-stage: ~1.5 GB (includes Rust toolchain)
- After multi-stage: ~150 MB (only runtime dependencies)
- Savings: ~1.35 GB (90% reduction)
Docker Compose
Batuta includes docker-compose.yml for orchestrating 5 services:
version: '3.8'
services:
# ==========================================
# Production CLI
# ==========================================
batuta:
build:
context: .
dockerfile: Dockerfile
image: batuta:latest
volumes:
- .:/workspace:rw
- cargo-cache:/usr/local/cargo/registry
working_dir: /workspace
command: batuta --help
# ==========================================
# Development (hot reload)
# ==========================================
dev:
build:
context: .
dockerfile: Dockerfile.dev
image: batuta:dev
volumes:
- .:/workspace:rw
- cargo-cache:/usr/local/cargo/registry
- cargo-git:/usr/local/cargo/git
- target-cache:/workspace/target
working_dir: /workspace
command: cargo watch -x check -x test -x run
environment:
- RUST_LOG=batuta=debug
# ==========================================
# CI/CD Testing
# ==========================================
ci:
image: batuta:latest
volumes:
- .:/workspace:ro # Read-only for CI
working_dir: /workspace
command: >
bash -c "cargo test --all --features native &&
cargo clippy --all-targets --all-features -- -D warnings"
# ==========================================
# WASM Build
# ==========================================
wasm:
image: batuta:dev
volumes:
- .:/workspace:rw
- cargo-cache:/usr/local/cargo/registry
- target-cache:/workspace/target
working_dir: /workspace
command: cargo build --release --target wasm32-unknown-unknown --no-default-features --features wasm
# ==========================================
# Documentation Server
# ==========================================
docs:
image: nginx:alpine
volumes:
- ./target/doc:/usr/share/nginx/html:ro
ports:
- "8000:80"
depends_on:
- batuta
# ==========================================
# Named Volumes (persistent cache)
# ==========================================
volumes:
cargo-cache:
driver: local
cargo-git:
driver: local
target-cache:
driver: local
Service Descriptions
| Service | Purpose | Command | Ports |
|---|---|---|---|
batuta | Production CLI | batuta --help | None |
dev | Hot reload development | cargo watch -x check -x test -x run | None |
ci | CI/CD testing | Run tests + clippy | None |
wasm | WASM build | Build for wasm32-unknown-unknown | None |
docs | Documentation server | Serve rustdoc HTML | 8000 |
Volume Mounts
Named volumes for caching (persist across container restarts):
- cargo-cache: Cargo registry cache (~500 MB, rarely changes)
- cargo-git: Git dependencies cache
- target-cache: Build artifacts cache (~1-2 GB, speeds up rebuilds)
Bind mounts for live editing:
- .:/workspace:rw: Source code (read-write)
- .:/workspace:ro: Source code (read-only for CI)
Usage Patterns
1. Local Development
Start development container with hot reload:
# Start dev container
docker-compose up dev
# In another terminal, edit source code
vim src/main.rs
# Container automatically recompiles and runs tests
# Output shows in first terminal
Features:
- Automatic recompilation on file save
- Runs tests on every change
- Persistent cargo cache across restarts
- Full Rust toolchain available
2. Running CLI Commands
Execute Batuta commands in isolated container:
# Analyze a Python project
docker-compose run --rm batuta \
batuta analyze /workspace/my_python_project
# Transpile with Depyler
docker-compose run --rm batuta \
batuta transpile --input /workspace/src --output /workspace/target/rust
# Generate migration report
docker-compose run --rm batuta \
batuta report --format html --output /workspace/report.html
Note: Use /workspace/ prefix for paths (container working directory).
3. CI/CD Integration
Run tests in clean container (CI/CD pipeline):
# Run full test suite + linting
docker-compose up --abort-on-container-exit ci
# Exit code indicates pass/fail
echo $? # 0 = success, non-zero = failure
GitHub Actions example:
# .github/workflows/ci.yml
name: CI
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Run tests in Docker
run: docker-compose up --abort-on-container-exit ci
# No separate exit-code check is needed: a non-zero exit from the
# docker-compose step fails the job automatically.
GitLab CI example:
# .gitlab-ci.yml
test:
image: docker:latest
services:
- docker:dind
script:
- docker-compose up --abort-on-container-exit ci
4. Building WASM
Build WASM in container:
# Build WASM target
docker-compose run --rm wasm
# Generated files in target/wasm32-unknown-unknown/
ls -lh target/wasm32-unknown-unknown/release/batuta.wasm
5. Serving Documentation
Build and serve rustdoc:
# Build documentation
docker-compose run --rm batuta cargo doc --no-deps
# Start documentation server
docker-compose up docs
# Open browser
open http://localhost:8000/batuta/
6. One-Off Commands
Run arbitrary commands in container:
# Run specific example
docker-compose run --rm batuta \
cargo run --example full_transpilation
# Check clippy lints
docker-compose run --rm batuta \
cargo clippy -- -D warnings
# Format code
docker-compose run --rm batuta \
cargo fmt --all
# Run benchmarks
docker-compose run --rm batuta \
cargo bench
Build Script
The scripts/docker-build.sh script automates Docker builds:
#!/usr/bin/env bash
set -euo pipefail
MODE="${1:-prod}"
case "$MODE" in
prod)
echo "🐳 Building production Docker image..."
docker build -t batuta:latest \
--target runtime \
--build-arg FEATURES=native \
.
echo "✅ Built: batuta:latest"
;;
dev)
echo "🐳 Building development Docker image..."
docker build -f Dockerfile.dev -t batuta:dev .
echo "✅ Built: batuta:dev"
;;
ci)
echo "🐳 Building CI Docker image..."
docker build -t batuta:ci \
--target runtime \
--build-arg FEATURES=native \
.
echo "✅ Built: batuta:ci"
;;
wasm)
echo "🐳 Building WASM Docker image..."
docker build -t batuta:wasm \
--target builder \
--build-arg FEATURES=wasm \
--build-arg TARGET=wasm32-unknown-unknown \
.
echo "✅ Built: batuta:wasm"
;;
*)
echo "Usage: $0 {prod|dev|ci|wasm}"
exit 1
;;
esac
Usage:
# Build production image
./scripts/docker-build.sh prod
# Build development image
./scripts/docker-build.sh dev
# Build CI image
./scripts/docker-build.sh ci
# Build WASM-capable image
./scripts/docker-build.sh wasm
Dockerfile.dev
The development Dockerfile includes additional tools:
FROM rust:1.75-slim
# Install development dependencies
RUN apt-get update && apt-get install -y \
pkg-config \
libssl-dev \
git \
curl \
&& rm -rf /var/lib/apt/lists/*
# Install cargo-watch for hot reload
RUN cargo install cargo-watch
# Install wasm toolchain
RUN rustup target add wasm32-unknown-unknown
# Install external transpilation tools
RUN cargo install depyler bashrs pmat
WORKDIR /workspace
# Default: watch mode
CMD ["cargo", "watch", "-x", "check", "-x", "test"]
Additional tools:
- cargo-watch: Automatic recompilation on file changes
- wasm32-unknown-unknown: WASM build target
- depyler, bashrs, pmat: External transpilers
.dockerignore
Exclude unnecessary files from Docker build context:
# Build artifacts
target/
wasm-dist/
dist/
# Dependency cache
# Cargo.lock  # leave commented out: the Dockerfile copies it for reproducible builds
# Git
.git/
.gitignore
# IDE
.vscode/
.idea/
*.swp
*.swo
# Documentation build
book/book/
# CI/CD
.github/
.gitlab-ci.yml
# Local config
.env
.batuta-state.json
# macOS
.DS_Store
# Logs
*.log
Benefits:
- Faster Docker builds (smaller context)
- No accidental secrets in images
- Cleaner build logs
Environment Variables
Configure Batuta via environment variables:
# Enable debug logging
docker-compose run -e RUST_LOG=batuta=debug batuta \
batuta analyze /workspace/project
# Set custom config path
docker-compose run -e BATUTA_CONFIG=/workspace/custom.toml batuta \
batuta transpile --input /workspace/src
# Disable GPU backend
docker-compose run -e BATUTA_DISABLE_GPU=1 batuta \
batuta optimize --input /workspace/project
Supported variables:
| Variable | Description | Default |
|---|---|---|
RUST_LOG | Logging level | info |
BATUTA_CONFIG | Config file path | batuta.toml |
BATUTA_DISABLE_GPU | Disable GPU backend | 0 (enabled) |
BATUTA_CACHE_DIR | Cache directory | /tmp/batuta-cache |
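As a sketch of how these variables could be consumed, here is a std-only helper mirroring the defaults in the table above; the helper and its field names are illustrative, not Batuta's actual internals:
use std::env;

// Hypothetical config struct mirroring the supported-variables table.
struct RuntimeConfig {
    log_level: String,
    config_path: String,
    gpu_disabled: bool,
    cache_dir: String,
}

fn runtime_config() -> RuntimeConfig {
    RuntimeConfig {
        log_level: env::var("RUST_LOG").unwrap_or_else(|_| "info".into()),
        config_path: env::var("BATUTA_CONFIG").unwrap_or_else(|_| "batuta.toml".into()),
        gpu_disabled: env::var("BATUTA_DISABLE_GPU").ok().as_deref() == Some("1"),
        cache_dir: env::var("BATUTA_CACHE_DIR").unwrap_or_else(|_| "/tmp/batuta-cache".into()),
    }
}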
Security Best Practices
1. Non-Root User
All images run as non-root user batuta:1000:
# Create user
RUN useradd -m -u 1000 -s /bin/bash batuta
# Switch user
USER batuta
Benefits:
- Limits container breakout impact
- Matches host user permissions (if UID=1000)
- Industry security standard
2. Read-Only Volumes
CI containers use read-only mounts:
volumes:
- .:/workspace:ro # Read-only
Prevents CI from modifying source code.
3. Minimal Attack Surface
Production image:
- No Rust toolchain (can’t compile malicious code)
- No package managers (can’t install backdoors)
- Only essential runtime dependencies
4. Trusted Base Images
Use official images:
- rust:1.75-slim (official Rust image)
- debian:bookworm-slim (official Debian)
- nginx:alpine (official nginx)
Avoid unknown/untrusted bases.
5. Dependency Scanning
Scan for vulnerabilities:
# Using Trivy
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy image batuta:latest
# Using Snyk
snyk container test batuta:latest
Cleanup
Remove Docker artifacts:
# Clean all Batuta containers and images
make docker-clean
# Manually remove containers
docker-compose down
# Remove volumes (deletes cache!)
docker-compose down -v
# Remove all images
docker rmi batuta:latest batuta:dev batuta:ci
# Prune unused Docker resources
docker system prune -a --volumes
Performance Tips
1. Use BuildKit
Enable Docker BuildKit for faster builds:
# Enable BuildKit
export DOCKER_BUILDKIT=1
# Build with BuildKit
docker build -t batuta:latest .
Benefits:
- Parallel layer building
- Better caching
- Smaller images
2. Layer Caching
Order Dockerfile commands by change frequency:
# 1. Base image (rarely changes)
FROM rust:1.75-slim
# 2. System dependencies (rarely changes)
RUN apt-get update && apt-get install -y ...
# 3. Cargo dependencies (changes occasionally)
COPY Cargo.toml Cargo.lock ./
RUN cargo build --release
# 4. Source code (changes frequently)
COPY src ./src
RUN cargo build --release
3. Cargo Cache Volumes
Use named volumes for cargo cache:
volumes:
- cargo-cache:/usr/local/cargo/registry # Persistent cache
Speedup: 5-10x faster dependency builds after first run.
4. Parallel Builds
Build multiple images in parallel:
# Build prod and dev simultaneously
docker-compose build --parallel batuta dev
Integration with Makefile
The Makefile includes Docker targets:
# Build production Docker image
docker:
	@echo "🐳 Building production Docker image..."
	./scripts/docker-build.sh prod

# Build development Docker image
docker-dev:
	@echo "🐳 Building development Docker image..."
	./scripts/docker-build.sh dev

# Run tests in Docker
docker-test:
	@echo "🧪 Running tests in Docker..."
	docker-compose up --abort-on-container-exit ci

# Clean Docker artifacts
docker-clean:
	@echo "🧹 Cleaning Docker images and volumes..."
	docker-compose down -v
	docker rmi batuta:latest batuta:dev batuta:ci 2>/dev/null || true
	@echo "✅ Docker cleanup complete"
Usage:
make docker # Build production image
make docker-dev # Build development image
make docker-test # Run tests in container
make docker-clean # Remove all artifacts
Troubleshooting
Issue: Slow builds
Cause: Docker not using layer cache.
Solution:
# Use BuildKit
export DOCKER_BUILDKIT=1
docker build --cache-from batuta:latest -t batuta:latest .
Issue: Permission denied
Cause: Container user UID doesn’t match host user.
Solution:
# Build with a custom UID (requires an ARG UID declaration in the Dockerfile)
docker build --build-arg UID=$(id -u) -t batuta:latest .
Or:
# Run as current user
docker-compose run --user $(id -u):$(id -g) batuta batuta --help
Issue: Out of disk space
Cause: Docker images and volumes consuming disk.
Solution:
# Check disk usage
docker system df
# Clean unused resources
docker system prune -a --volumes
# Remove specific volumes
docker volume rm batuta_cargo-cache batuta_target-cache
Issue: Cannot connect to Docker daemon
Cause: Docker service not running or permissions issue.
Solution:
# Start Docker service
sudo systemctl start docker
# Add user to docker group (Linux)
sudo usermod -aG docker $USER
newgrp docker
Next Steps
- Distribution: Publishing Batuta packages
- Release Builds: Production optimization
- Phase 4: Validation: Testing transpiled code
Navigate: Table of Contents
Distribution
This chapter is under development.
Coming soon: Detailed information about distribution.
Navigate: Table of Contents
Tool Overview
This chapter is under development.
Coming soon: Detailed information about tool overview.
Navigate: Table of Contents
Transpilers
This chapter is under development.
Coming soon: Detailed information about transpilers.
Navigate: Table of Contents
Decy
This chapter is under development.
Coming soon: Detailed information about decy.
Navigate: Table of Contents
Depyler: Python → Rust
“Depyler transpiles Python to Rust with automatic type inference, NumPy→Trueno conversion, and sklearn→Aprender migration.”
Overview
Depyler is Batuta’s Python-to-Rust transpiler that converts Python projects into idiomatic Rust code with:
- Automatic type inference: Infers Rust types from Python code
- NumPy → Trueno: Converts NumPy operations to SIMD/GPU-accelerated Trueno
- sklearn → Aprender: Migrates scikit-learn to first-principles Aprender
- PyTorch → Realizar: Transpiles PyTorch inference to optimized Realizar
- Project structure generation: Creates full Cargo projects with dependencies
Installation
# Install from crates.io
cargo install depyler
# Verify installation
depyler --version
# Output: depyler 3.20.0
Basic Usage
Single File Transpilation
# Transpile Python file to Rust
depyler transpile --input script.py --output script.rs
# View generated Rust code
cat script.rs
Example:
# script.py
import numpy as np
def add_arrays(a, b):
return np.add(a, b)
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
result = add_arrays(x, y)
print(result)
Generated Rust:
// script.rs
use trueno::Array;
fn add_arrays(a: &Array<f64>, b: &Array<f64>) -> Array<f64> {
trueno::add(a, b)
}
fn main() {
let x = Array::from_vec(vec![1.0, 2.0, 3.0]);
let y = Array::from_vec(vec![4.0, 5.0, 6.0]);
let result = add_arrays(&x, &y);
println!("{:?}", result);
}
Project Transpilation
# Transpile entire Python project
depyler transpile \
--input /path/to/python_project \
--output /path/to/rust_project \
--format project
# Generated structure:
# rust_project/
# ├── Cargo.toml
# ├── src/
# │ ├── main.rs
# │ ├── lib.rs
# │ └── modules/
# ├── tests/
# └── benches/
Batuta Integration
Batuta automatically uses Depyler for Python transpilation:
# Batuta detects Depyler and uses it
batuta transpile --input my_python_app --output my_rust_app
Internal call:
depyler transpile \
--input my_python_app \
--output my_rust_app \
--format project
ML Library Conversion
NumPy → Trueno
Depyler converts NumPy operations to Trueno for SIMD/GPU acceleration:
| NumPy | Trueno | Backend |
|---|---|---|
np.add(a, b) | trueno::add(&a, &b) | SIMD/GPU |
np.dot(a, b) | trueno::dot(&a, &b) | SIMD/GPU |
np.matmul(a, b) | trueno::matmul(&a, &b) | GPU |
np.sum(a) | trueno::sum(&a) | SIMD |
np.mean(a) | trueno::mean(&a) | SIMD |
sklearn → Aprender
Converts scikit-learn to first-principles Aprender:
| sklearn | Aprender |
|---|---|
LinearRegression() | aprender::LinearRegression::new() |
LogisticRegression() | aprender::LogisticRegression::new() |
KMeans(n_clusters=3) | aprender::KMeans::new(3) |
StandardScaler() | aprender::StandardScaler::new() |
PyTorch → Realizar
Transpiles PyTorch inference to Realizar:
| PyTorch | Realizar |
|---|---|
model.generate(prompt) | realizar::generate_text(&model, prompt, max_len) |
model.forward(x) | realizar::forward(&model, &x) |
torch.load(path) | realizar::load_model(path) |
Features
Type Inference
Depyler infers Rust types from Python:
# Python (dynamic typing)
def multiply(x, y):
return x * y
result = multiply(5, 10) # int
#![allow(unused)]
fn main() {
// Rust (inferred types)
fn multiply(x: i32, y: i32) -> i32 {
x * y
}
let result: i32 = multiply(5, 10);
}
Ownership Inference
Converts Python references to Rust ownership:
# Python
def process_list(items):
items.append(42)
return items
#![allow(unused)]
fn main() {
// Rust (mutable reference)
fn process_list(items: &mut Vec<i32>) -> &Vec<i32> {
items.push(42);
items
}
}
Error Handling
Converts Python exceptions to Rust Result:
# Python
def divide(a, b):
if b == 0:
raise ValueError("Division by zero")
return a / b
#![allow(unused)]
fn main() {
// Rust
fn divide(a: f64, b: f64) -> Result<f64, String> {
if b == 0.0 {
Err("Division by zero".to_string())
} else {
Ok(a / b)
}
}
}
Command-Line Options
depyler transpile [OPTIONS]
OPTIONS:
--input <PATH> Input Python file or directory
--output <PATH> Output Rust file or directory
--format <FORMAT> Output format: file, project [default: file]
--optimize <LEVEL> Optimization level: 0, 1, 2, 3 [default: 2]
--backend <BACKEND> Trueno backend: cpu, simd, gpu, auto [default: auto]
--strict Strict mode (fail on warnings)
--no-ml Disable ML library conversion
-h, --help Print help
-V, --version Print version
Examples:
# Strict mode (fail on type inference warnings)
depyler transpile --input script.py --output script.rs --strict
# Disable ML conversions (keep NumPy as-is)
depyler transpile --input ml_app.py --output ml_app.rs --no-ml
# Force GPU backend
depyler transpile --input gpu_code.py --output gpu_code.rs --backend gpu
Limitations
Depyler has some known limitations:
- Dynamic typing: Complex dynamic types may require manual annotations
- Metaprogramming: Decorators and metaclasses not fully supported
- C extensions: Python C extensions cannot be transpiled
- Runtime reflection: eval(), exec(), getattr() have limited support
Workarounds:
- Use type hints in Python code for better inference
- Refactor metaprogramming to explicit code
- Replace C extensions with pure Rust equivalents
- Avoid runtime reflection in critical paths
Version
Current version: 3.20.0
Check installed version:
depyler --version
Update to latest:
cargo install depyler --force
Next Steps
- Bashrs: Shell → Rust: Shell script transpilation
- Trueno: Multi-target Compute: SIMD/GPU acceleration
- Aprender: First-Principles ML: ML algorithms in Rust
Navigate: Table of Contents
Bashrs: Rust to Shell Transpiler
“Write Rust, deploy shell. Deterministic bootstrap scripts for any environment.”
Bashrs transpiles Rust code to portable POSIX shell scripts. It enables writing complex installation and bootstrap logic in Rust while deploying as zero-dependency shell scripts.
Overview
| Attribute | Value |
|---|---|
| Version | 6.57.0 |
| Layer | L3: Transpilers |
| Direction | Rust → Shell |
| Repository | github.com/paiml/bashrs |
Why Bashrs?
The Bootstrap Problem
When deploying software, you face a chicken-and-egg problem:
- Your installer needs dependencies (Rust, Python, Node…)
- But you’re trying to install those dependencies
- The only universal runtime is /bin/sh
Traditional Solutions
| Approach | Problem |
|---|---|
| Shell scripts | Hard to test, platform bugs, no type safety |
| Python installers | Requires Python pre-installed |
| Go binaries | Large binaries, need per-platform builds |
| curl \| bash | Security concerns, no verification |
Bashrs Solution
Write your installer in Rust with full type safety and testing, then transpile to a portable shell script:
Rust (tested, typed) → bashrs → Shell (universal, portable)
Capabilities
rust_to_shell
Transpile Rust functions to shell:
// install.rs
use bashrs::prelude::*;
#[bashrs::main]
fn main() {
// Check if Rust is installed
if !command_exists("rustc") {
println("Installing Rust...");
curl("https://sh.rustup.rs", "-sSf") | sh();
}
// Install the application
cargo(&["install", "batuta"]);
println("Installation complete!");
}
Generates:
#!/bin/sh
set -e
main() {
# Check if Rust is installed
if ! command -v rustc >/dev/null 2>&1; then
echo "Installing Rust..."
curl -sSf https://sh.rustup.rs | sh
fi
# Install the application
cargo install batuta
echo "Installation complete!"
}
main "$@"
bootstrap_scripts
Generate deterministic bootstrap scripts for reproducible environments:
#![allow(unused)]
fn main() {
use bashrs::prelude::*;
#[bashrs::bootstrap]
fn setup_dev_environment() {
// Deterministic package installation
apt_install(&["build-essential", "pkg-config", "libssl-dev"]);
// Rust toolchain
rustup_install("stable");
rustup_component_add(&["clippy", "rustfmt", "llvm-tools-preview"]);
// Cargo tools
cargo_install(&["cargo-nextest", "cargo-llvm-cov", "cargo-mutants"]);
// Verify installation
assert_command("cargo --version");
assert_command("cargo nextest --version");
}
}
cross_platform_shell
Generate POSIX-compliant shell code that works everywhere:
#![allow(unused)]
fn main() {
use bashrs::prelude::*;
#[bashrs::portable]
fn detect_os() -> String {
// Bashrs generates portable OS detection
match os() {
Os::Linux => "linux",
Os::MacOS => "darwin",
Os::Windows => "windows", // WSL/Git Bash
Os::FreeBSD => "freebsd",
}
}
#[bashrs::portable]
fn install_package(name: &str) {
// Generates package manager detection
match package_manager() {
Apt => apt_install(&[name]),
Brew => brew_install(&[name]),
Dnf => dnf_install(&[name]),
Pacman => pacman_install(&[name]),
}
}
}
Generates:
detect_os() {
case "$(uname -s)" in
Linux*) echo "linux";;
Darwin*) echo "darwin";;
MINGW*|MSYS*|CYGWIN*) echo "windows";;
FreeBSD*) echo "freebsd";;
*) echo "unknown";;
esac
}
install_package() {
if command -v apt-get >/dev/null 2>&1; then
sudo apt-get install -y "$1"
elif command -v brew >/dev/null 2>&1; then
brew install "$1"
elif command -v dnf >/dev/null 2>&1; then
sudo dnf install -y "$1"
elif command -v pacman >/dev/null 2>&1; then
sudo pacman -S --noconfirm "$1"
else
echo "No supported package manager found" >&2
exit 1
fi
}
Integration with Batuta
Generate installation scripts for batuta deployments:
#![allow(unused)]
fn main() {
use bashrs::prelude::*;
#[bashrs::main]
fn install_batuta() {
println("=== Batuta Installation ===");
// Step 1: System dependencies
println("Installing system dependencies...");
install_build_essentials();
// Step 2: Rust toolchain
println("Setting up Rust...");
ensure_rust_installed();
rustup_update();
// Step 3: Install batuta
println("Installing batuta...");
cargo_install(&["batuta"]);
// Step 4: Verify
println("Verifying installation...");
let version = capture("batuta --version");
println(format!("Installed: {}", version));
println("=== Installation Complete ===");
}
}
Integration with Repartir
Generate cluster node bootstrap scripts:
#![allow(unused)]
fn main() {
use bashrs::prelude::*;
#[bashrs::main]
fn bootstrap_worker_node() {
let coordinator = env_required("COORDINATOR_HOST");
let node_id = env_or("NODE_ID", &generate_node_id());
println(format!("Bootstrapping worker node: {}", node_id));
// Install repartir
cargo_install(&["repartir"]);
// Configure node
write_file("/etc/repartir/config.toml", &format!(r#"
[node]
id = "{}"
coordinator = "{}"
[resources]
cpus = {}
memory_gb = {}
"#, node_id, coordinator, num_cpus(), memory_gb()));
// Start worker service
systemctl_enable("repartir-worker");
systemctl_start("repartir-worker");
println("Worker node ready!");
}
}
CLI Usage
# Transpile Rust to shell
bashrs transpile install.rs -o install.sh
# Build and run directly
bashrs run install.rs
# Generate with specific shell target
bashrs transpile --target bash install.rs # Bash-specific features
bashrs transpile --target posix install.rs # POSIX-only (most portable)
bashrs transpile --target zsh install.rs # Zsh-specific features
# Verify generated script
bashrs verify install.sh # Check for common issues
# Test on multiple shells
bashrs test install.rs --shells bash,dash,zsh
Example: Multi-Stage Installer
use bashrs::prelude::*;
#[bashrs::main]
fn main() {
let args = parse_args();
match args.command.as_str() {
"install" => install(),
"uninstall" => uninstall(),
"upgrade" => upgrade(),
"doctor" => doctor(),
_ => print_help(),
}
}
fn install() {
println("Installing Sovereign AI Stack...");
// Phase 1: Base dependencies
section("Phase 1: System Dependencies");
install_system_deps();
// Phase 2: Rust ecosystem
section("Phase 2: Rust Toolchain");
install_rust_ecosystem();
// Phase 3: Stack components
section("Phase 3: Stack Components");
cargo_install(&[
"trueno",
"aprender",
"batuta",
"repartir",
"renacer",
]);
// Phase 4: Verification
section("Phase 4: Verification");
verify_installation();
success("Installation complete!");
}
fn doctor() {
println("Checking installation health...");
check("Rust compiler", "rustc --version");
check("Cargo", "cargo --version");
check("Trueno", "cargo install --list | grep trueno");
check("Batuta", "batuta --version");
println("All checks passed!");
}
Comparison with Alternatives
| Feature | Raw Shell | Bashrs | Ansible | Docker |
|---|---|---|---|---|
| Zero dependencies | Yes | Yes | No | No |
| Type safety | No | Yes | No | N/A |
| Testable | Hard | Yes | Hard | Yes |
| Cross-platform | Maybe | Yes | Yes | Yes |
| Reproducible | No | Yes | Yes | Yes |
| Size | Tiny | Tiny | Large | Large |
Key Takeaways
- Write Rust, deploy shell: Full Rust safety, universal deployment
- Zero dependencies: Generated scripts need only /bin/sh
- Testable: Test your Rust code, deploy the shell
- Cross-platform: POSIX-compliant output works everywhere
Previous: Decy: C/C++ to Rust | Next: Ruchy: Systems Scripting
Foundation Libraries
The Sovereign AI Stack is built on a core set of foundation libraries that provide compute, ML, inference, and data management capabilities. All libraries are pure Rust with no Python/CUDA dependencies.
Current Versions (November 2025)
| Library | Version | Purpose | Crate |
|---|---|---|---|
| Trueno | 0.14 | Multi-target compute (SIMD/GPU/WASM) | trueno |
| Aprender | latest | First-principles ML training | aprender |
| Realizar | latest | ML inference runtime | realizar |
| Alimentar | 0.2.0 | Data loading & validation | alimentar |
| Pacha | 0.1.0 | Model/dataset registry | pacha |
Stack Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Applications (Presentar, CLI tools) │
├─────────────────────────────────────────────────────────────────┤
│ Realizar (Inference) │ Aprender (Training) │ Alimentar (Data) │
├─────────────────────────────────────────────────────────────────┤
│ Trueno (Compute Foundation) │
│ ├── Backend: CPU (SIMD) │ WASM (SIMD) │ GPU (WebGPU) │
│ ├── Tensor operations │
│ └── Memory management │
└─────────────────────────────────────────────────────────────────┘
Trueno: The Compute Foundation
Trueno is the bedrock of the stack, providing:
- Multi-backend dispatch: CPU SIMD, WASM SIMD, WebGPU
- Array programming model: Following Iverson (1962)
- Columnar memory layout: For SIMD efficiency (Stonebraker et al., 2005)
- Zero-copy operations: Via lifetime-based borrowing
#![allow(unused)]
fn main() {
use trueno::{Tensor, Backend};
// Automatic backend selection
let a = Tensor::from_vec(vec![1.0, 2.0, 3.0], Backend::Auto);
let b = Tensor::from_vec(vec![4.0, 5.0, 6.0], Backend::Auto);
let c = &a + &b; // SIMD-accelerated
}
Since v0.7.3: WebGPU support for WASM targets (gpu-wasm feature).
Aprender: First-Principles ML
Aprender implements ML algorithms from mathematical foundations:
- No PyTorch/TensorFlow dependency
- Transparent implementations: Every algorithm is readable
- Academic rigor: Peer-reviewed algorithm implementations
- Integration: Outputs the .apr model format
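A minimal sketch of the training-loop shape, reusing the LinearRegression::new() constructor shown in the Depyler mapping table; the fit/predict method names and argument types are assumptions, not Aprender's confirmed API:
// Hedged sketch: LinearRegression::new() comes from the Depyler mapping table;
// fit()/predict() and their argument types are assumed for illustration.
fn train_and_predict() {
    let x = vec![vec![1.0], vec![2.0], vec![3.0]];
    let y = vec![2.0, 4.0, 6.0];
    let mut model = aprender::LinearRegression::new();
    model.fit(&x, &y);
    println!("y(4) ≈ {:?}", model.predict(&[vec![4.0]]));
}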
Realizar: ML Inference Runtime
Realizar executes trained models with:
- Multi-format support: .apr, ONNX (limited)
- Batch processing: Efficient throughput
- WASM deployment: Browser-native inference
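Combining the calls from the PyTorch → Realizar mapping table in the Depyler chapter, inference looks roughly like this; exact signatures and error types are approximations:
// Calls taken from the PyTorch -> Realizar mapping table; signatures approximate.
fn run_inference() -> Result<(), Box<dyn std::error::Error>> {
    let model = realizar::load_model("model.apr")?;
    let text = realizar::generate_text(&model, "Summarize the Sovereign AI Stack", 64)?;
    println!("{}", text);
    Ok(())
}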
Alimentar: Data Pipeline
Alimentar manages data loading and validation:
- Format: .ald (Alimentar Data format)
- Quality scoring: 100-point weighted system (v0.2.0)
- Streaming: Large dataset support
#![allow(unused)]
fn main() {
use alimentar::{Dataset, Schema};
let schema = Schema::load("transactions.schema.yaml")?;
let dataset = Dataset::load("transactions.ald", &schema)?;
}
Pacha: Content Registry
Pacha manages model and dataset versions:
- URI scheme: pacha://models/name:version, pacha://datasets/name:version
- Oracle Mode: Intelligent query interface for codebase understanding
# Reference in Presentar app.yaml
models:
classifier:
source: "pacha://models/fraud-detector:1.2.0"
Dependency Graph
presentar ─────► trueno-viz ─────► trueno
│
aprender ────────────┘
│
realizar ────────────► trueno
│
alimentar ───────────► trueno
│
pacha (registry, no compute deps)
Toyota Way Integration
Following the Toyota Production System:
| Principle | Implementation |
|---|---|
| Muda | No Python GIL, no runtime interpretation |
| Jidoka | Compile-time type checking |
| Kaizen | Continuous improvement via TDG scoring |
| Genchi Genbutsu | Transparent, readable implementations |
Navigate: Table of Contents | Tool Overview
Trueno: Multi-target Compute
Trueno (Spanish: “thunder”) is a Rust library providing unified, high-performance compute primitives across multiple execution targets. It serves as the foundation for numerical computation in the sovereign stack.
Overview
Trueno delivers:
- CPU SIMD - x86 (SSE2/AVX/AVX2/AVX-512), ARM (NEON), WASM (SIMD128)
- GPU - Vulkan/Metal/DX12/WebGPU via wgpu
┌─────────────────────────────────────────────────┐
│ Trueno Public API (Safe) │
│ compute(), map(), reduce(), transform() │
└─────────────────────────────────────────────────┘
│
┌─────────────┼─────────────┐
▼ ▼ ▼
┌────────┐ ┌─────────┐ ┌──────────┐
│ SIMD │ │ GPU │ │ WASM │
│ Backend│ │ Backend │ │ Backend │
└────────┘ └─────────┘ └──────────┘
│ │ │
┌────┴────┐ ┌────┴────┐ ┌───┴─────┐
│ Runtime │ │ wgpu │ │ SIMD128 │
│ Detect │ │ Compute │ │ Portable│
└─────────┘ └─────────┘ └─────────┘
Installation
[dependencies]
trueno = "0.14"
# With GPU support
trueno = { version = "0.14", features = ["gpu"] }
# With CUDA monitoring (NVIDIA GPUs)
trueno = { version = "0.14", features = ["cuda-monitor"] }
What’s New in 0.14
- Streaming Tensors: Memory-mapped streaming for large datasets
- Q5K/Q6K Quantization: Extended quantization formats
- Improved WASM: Better WebAssembly SIMD128 support
- LZ4/ZSTD Compression: Built-in tensor compression for memory efficiency
- GPU PTX Fixes: Resolved NVIDIA PTX codegen issues
- AVX-512 Improvements: Better auto-vectorization
- Simulation Framework: Toyota-style Jidoka guards and stress testing
Core Features
Vector Operations
#![allow(unused)]
fn main() {
use trueno::{Vector, VectorOps};
// Create vectors
let a = Vector::from_slice(&[1.0, 2.0, 3.0, 4.0]);
let b = Vector::from_slice(&[5.0, 6.0, 7.0, 8.0]);
// Element-wise operations (auto-selects best SIMD backend)
let sum = a.add(&b)?; // [6.0, 8.0, 10.0, 12.0]
let product = a.mul(&b)?; // [5.0, 12.0, 21.0, 32.0]
let dot = a.dot(&b)?; // 70.0
// Reductions
let total = a.sum()?; // 10.0
let average = a.mean()?; // 2.5
}
Matrix Operations
#![allow(unused)]
fn main() {
use trueno::Matrix;
let a = Matrix::from_slice(2, 3, &[
1.0, 2.0, 3.0,
4.0, 5.0, 6.0,
]);
let b = Matrix::from_slice(3, 2, &[
7.0, 8.0,
9.0, 10.0,
11.0, 12.0,
]);
// Matrix multiplication (SIMD-accelerated)
let c = a.matmul(&b)?; // 2x2 result
// Transpose
let at = a.transpose();
// Eigendecomposition (symmetric matrices)
let eigen = matrix.symmetric_eigen()?;
}
Activation Functions
#![allow(unused)]
fn main() {
use trueno::activations::*;
let x = Vector::from_slice(&[-1.0, 0.0, 1.0, 2.0]);
// Neural network activations (SIMD-optimized)
let relu_out = relu(&x)?; // [0.0, 0.0, 1.0, 2.0]
let sigmoid_out = sigmoid(&x)?;
let gelu_out = gelu(&x)?;
let swish_out = swish(&x)?;
let tanh_out = tanh_activation(&x)?;
}
Backend Selection
Trueno automatically selects the optimal backend based on:
- Data size - GPU only for large workloads (>100K elements)
- CPU features - AVX-512 > AVX2 > AVX > SSE2 > NEON
- Operation complexity - Complex ops benefit more from GPU
#![allow(unused)]
fn main() {
use trueno::Backend;
// Auto-select (recommended)
let result = vector.add(&other)?;
// Force specific backend
let result = vector.add_with_backend(&other, Backend::Avx2)?;
let result = vector.add_with_backend(&other, Backend::GPU)?;
}
Backend Priority
| Priority | Backend | Condition |
|---|---|---|
| 1 | GPU | Available + size > 100K |
| 2 | AVX-512 | CPU supports |
| 3 | AVX2 | CPU supports |
| 4 | AVX | CPU supports |
| 5 | SSE2 | x86_64 baseline |
| 6 | NEON | ARM64 |
| 7 | SIMD128 | WASM |
| 8 | Scalar | Fallback |
Simulation Testing Framework (v0.8.5+)
Trueno 0.8.5 introduces a comprehensive simulation testing framework based on Toyota Production System principles.
SimRng: Deterministic Random Number Generator
#![allow(unused)]
fn main() {
use trueno::simulation::SimRng;
// Deterministic PCG-based RNG
let mut rng = SimRng::new(42); // Seed for reproducibility
// Generate deterministic random values
let value = rng.next_f32(); // [0.0, 1.0)
let int = rng.next_u32(); // Full u32 range
let range = rng.range(1.0, 10.0); // Custom range
let normal = rng.normal(0.0, 1.0); // Gaussian distribution
// Fork for parallel testing (maintains determinism)
let child_rng = rng.fork();
}
BackendSelector: Intelligent Backend Selection
#![allow(unused)]
fn main() {
use trueno::simulation::{BackendSelector, BackendThresholds};
let thresholds = BackendThresholds {
gpu_min_elements: 100_000,
simd_min_elements: 32,
};
let selector = BackendSelector::new(thresholds);
let backend = selector.select(data_size, op_complexity);
}
JidokaGuard: Stop-on-Defect Quality Checks
#![allow(unused)]
fn main() {
use trueno::simulation::JidokaGuard;
// Toyota-style quality gate - stops on first defect
let guard = JidokaGuard::new();
// Check for NaN/Inf values
guard.check_finite(&result)?;
// Custom invariant checking
guard.assert_invariant(|| value >= 0.0, "Value must be non-negative")?;
}
BufferRenderer: Visual Regression Testing
#![allow(unused)]
fn main() {
use trueno::simulation::{BufferRenderer, ColorPalette};
let renderer = BufferRenderer::new(800, 600);
let palette = ColorPalette::viridis();
// Render data to RGBA buffer for visual comparison
let buffer = renderer.render_heatmap(&data, &palette)?;
// Compare with golden baseline
let diff = renderer.compare_buffers(&buffer, &golden)?;
assert!(diff.max_error < 1e-5);
}
StressTestConfig: Stress Testing Infrastructure
#![allow(unused)]
fn main() {
use trueno::simulation::{StressTestConfig, StressTestResult};
let config = StressTestConfig {
iterations: 10_000,
data_size_range: 100..1_000_000,
anomaly_threshold: 3.0, // Standard deviations
};
let result = stress_test(&operation, &config)?;
assert!(result.anomaly_count == 0);
}
BackendTolerance: Cross-Backend Comparison
#![allow(unused)]
fn main() {
use trueno::simulation::BackendTolerance;
let tolerance = BackendTolerance::relaxed();
// Get tolerance for comparing results across backends
let tol = tolerance.for_backends(Backend::GPU, Backend::Scalar);
assert!((gpu_result - scalar_result).abs() < tol);
}
GPU Compute
Synchronous API
#![allow(unused)]
fn main() {
use trueno::gpu::GpuDevice;
let device = GpuDevice::new()?;
// Large matrix multiplication on GPU
let result = device.matmul(&a, &b)?;
// Batch operations
let results = device.batch_add(&vectors_a, &vectors_b)?;
}
Async API
#![allow(unused)]
fn main() {
use trueno::gpu::GpuDevice;
let device = GpuDevice::new()?;
// Non-blocking GPU operations
let future = device.matmul_async(&a, &b);
let result = future.await?;
}
NumPy Compatibility (via Batuta)
Trueno is the target for NumPy → Rust transpilation:
| NumPy | Trueno |
|---|---|
np.array([1,2,3]) | Vector::from_slice(&[1.0,2.0,3.0]) |
np.dot(a, b) | a.dot(&b)? |
a + b | a.add(&b)? |
a @ b | a.matmul(&b)? |
np.sum(a) | a.sum()? |
np.mean(a) | a.mean()? |
Performance
Expected speedups vs scalar baseline:
| Operation | Size | SSE2 | AVX2 | AVX-512 | GPU |
|---|---|---|---|---|---|
| add_f32 | 1K | 2x | 4x | 8x | - |
| add_f32 | 100K | 2x | 4x | 8x | 3x |
| add_f32 | 1M | 2x | 4x | 8x | 10x |
| add_f32 | 10M | 2x | 4x | 8x | 50x |
| dot_product | 1M | 3x | 6x | 12x | 20x |
| matmul | 1K×1K | 3x | 6x | 12x | 30x |
Related Crates
- trueno-gpu - CUDA monitoring via NVML
- trueno-db - High-performance vector database
- trueno-graph - Graph analytics engine
- trueno-viz - GPU-accelerated visualization
- trueno-rag - RAG pipeline components
Navigate: Table of Contents | Previous: Foundation Libraries | Next: Aprender
trueno-zram: SIMD Memory Compression
trueno-zram provides SIMD-accelerated compression for Linux zram and general-purpose memory compression. It achieves 3+ GB/s with LZ4 and up to 13 GB/s with ZSTD on AVX-512.
Overview
trueno-zram delivers:
- SIMD Acceleration: AVX2/AVX-512/NEON optimized
- Multiple Algorithms: LZ4 (speed) and ZSTD (ratio)
- Adaptive Selection: Entropy-based algorithm choice
- Page Compression: 4KB aligned for zram integration
- Optional CUDA: GPU acceleration for batch compression
┌─────────────────────────────────────────────────────────────┐
│ trueno-zram │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ LZ4 SIMD │ │ ZSTD SIMD │ │ Adaptive Selector │ │
│ │ (3+ GB/s) │ │ (13 GB/s) │ │ (entropy-based) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────┘ │
├─────────────────────────────────────────────────────────────┤
│ AVX-512 │ AVX2 │ NEON │ Scalar │
└─────────────────────────────────────────────────────────────┘
Installation
[dependencies]
trueno-zram-core = "0.1"
# With adaptive compression
trueno-zram-adaptive = "0.1"
# With CUDA support
trueno-zram-cuda = { version = "0.1", optional = true }
Quick Start
#![allow(unused)]
fn main() {
use trueno_zram_core::{Compressor, Algorithm};
// Create compressor with LZ4 (fastest)
let compressor = Compressor::new(Algorithm::Lz4);
// Compress data
let compressed = compressor.compress(&data)?;
println!("Ratio: {:.2}x", data.len() as f64 / compressed.len() as f64);
// Decompress
let decompressed = compressor.decompress(&compressed)?;
assert_eq!(data, decompressed);
}
Algorithm Comparison
| Algorithm | Compress | Decompress | Ratio | Use Case |
|---|---|---|---|---|
| LZ4 | 3+ GB/s | 4+ GB/s | 2.1x | Speed-critical |
| ZSTD-1 | 500 MB/s | 1.5 GB/s | 2.8x | Balanced |
| ZSTD-3 | 300 MB/s | 1.5 GB/s | 3.2x | Better ratio |
| ZSTD-AVX512 | 13 GB/s | 15 GB/s | 3.2x | AVX-512 systems |
| Same-Fill | N/A | N/A | 2048:1 | Zero/repeated pages |
SIMD Backend Selection
#![allow(unused)]
fn main() {
use trueno_zram_core::{SimdBackend, detect_backend};
// Auto-detect best available backend
let backend = detect_backend();
println!("Using: {:?}", backend);
// Force specific backend
let compressor = Compressor::builder()
.algorithm(Algorithm::Lz4)
.backend(SimdBackend::Avx512)
.build()?;
}
Backend Priority
| Priority | Backend | Condition |
|---|---|---|
| 1 | AVX-512 | x86_64 with avx512f |
| 2 | AVX2 | x86_64 with avx2 |
| 3 | NEON | aarch64 |
| 4 | Scalar | Fallback |
Page Compression
Optimized for 4KB page-aligned compression:
#![allow(unused)]
fn main() {
use trueno_zram_core::{PageCompressor, PAGE_SIZE};
let compressor = PageCompressor::new();
// Compress a 4KB page
let page: [u8; PAGE_SIZE] = get_page();
let compressed = compressor.compress_page(&page)?;
// Check if page is compressible
if compressed.len() < PAGE_SIZE / 2 {
store_compressed(compressed);
} else {
store_uncompressed(page); // Not worth compressing
}
}
Adaptive Compression
Entropy-based algorithm selection:
#![allow(unused)]
fn main() {
use trueno_zram_adaptive::AdaptiveCompressor;
let compressor = AdaptiveCompressor::new();
// Automatically selects best algorithm per-page
let result = compressor.compress_adaptive(&data)?;
match result.algorithm_used {
Algorithm::SameFill => println!("Zero/repeated page"),
Algorithm::Lz4 => println!("High entropy, used LZ4"),
Algorithm::Zstd { .. } => println!("Compressible, used ZSTD"),
}
}
Decision Tree
Is page all zeros/same byte?
YES → Same-Fill (2048:1 ratio)
NO → Check entropy
High entropy → LZ4 (fast, low ratio)
Low entropy → ZSTD (slower, high ratio)
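The same decision tree can be expressed as a std-only sketch; Shannon entropy over the page bytes drives the branch, and the thresholds here are illustrative, not trueno-zram's actual values:
#[derive(Debug)]
enum Choice { SameFill, Lz4, Zstd }

// Illustrative selector; trueno-zram's real thresholds may differ.
fn select_algorithm(page: &[u8]) -> Choice {
    assert!(!page.is_empty(), "expected a 4 KB page");
    if page.iter().all(|&b| b == page[0]) {
        return Choice::SameFill; // zero/repeated page, 2048:1
    }
    // Shannon entropy in bits per byte (0.0..=8.0)
    let mut counts = [0usize; 256];
    for &b in page { counts[b as usize] += 1; }
    let n = page.len() as f64;
    let entropy: f64 = counts.iter()
        .filter(|&&c| c > 0)
        .map(|&c| { let p = c as f64 / n; -p * p.log2() })
        .sum();
    if entropy > 7.0 { Choice::Lz4 } else { Choice::Zstd }
}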
Performance Benchmarks
Measured on AMD EPYC 7763 (AVX-512):
| Algorithm | Scalar | AVX2 | AVX-512 |
|---|---|---|---|
| LZ4 compress | 800 MB/s | 2.1 GB/s | 3.2 GB/s |
| LZ4 decompress | 1.2 GB/s | 3.5 GB/s | 4.5 GB/s |
| ZSTD-1 | 150 MB/s | 350 MB/s | 500 MB/s |
| ZSTD-fast | 400 MB/s | 8 GB/s | 13 GB/s |
Running the Example
cargo run --example trueno_zram_demo
Related Crates
- trueno-ublk: GPU-accelerated block device using trueno-zram
- trueno: SIMD/GPU compute primitives
Navigate: Table of Contents | Previous: whisper.apr | Next: trueno-ublk
trueno-ublk: GPU Block Device
trueno-ublk provides a GPU-accelerated ZRAM replacement using Linux’s userspace block device (ublk) interface. It achieves 10-50 GB/s throughput by offloading compression to GPU.
Overview
trueno-ublk delivers:
- ublk Driver: Userspace block device via libublk
- GPU Compression: CUDA/wgpu accelerated
- ZRAM Replacement: Drop-in swap device
- Adaptive Backend: Automatic GPU/SIMD/CPU selection
- High Throughput: 10-50 GB/s with GPU
┌─────────────────────────────────────────────────────────────┐
│ Linux Kernel │
│ /dev/ublkb0 │
└───────────────────────┬─────────────────────────────────────┘
│ io_uring
┌───────────────────────▼─────────────────────────────────────┐
│ trueno-ublk │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ GPU Backend │ │ SIMD Backend│ │ CPU Backend │ │
│ │ (CUDA/wgpu) │ │ (AVX/NEON) │ │ (fallback) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Installation
[dependencies]
trueno-ublk = "0.1"
# With CUDA support (NVIDIA GPUs)
trueno-ublk = { version = "0.1", features = ["cuda"] }
System requirements:
- Linux kernel 6.0+ (ublk support)
- libublk userspace library
- Root privileges for device creation
Quick Start
#![allow(unused)]
fn main() {
use trueno_ublk::{UblkDevice, DeviceConfig, Backend};
// Create device with 8GB capacity
let config = DeviceConfig {
capacity_bytes: 8 * 1024 * 1024 * 1024, // 8 GB
queue_depth: 128,
num_queues: 4,
backend: Backend::Auto, // Auto-select GPU/SIMD/CPU
};
let device = UblkDevice::create(config).await?;
println!("Created: /dev/{}", device.name());
// Run the device (blocks until shutdown)
device.run().await?;
}
Backend Selection
| Backend | Throughput | Latency | Condition |
|---|---|---|---|
| CUDA | 50+ GB/s | 100 us | NVIDIA GPU |
| wgpu | 20+ GB/s | 200 us | Any GPU |
| AVX-512 | 13 GB/s | 10 us | x86_64 |
| AVX2 | 3 GB/s | 5 us | x86_64 |
| NEON | 2 GB/s | 5 us | ARM64 |
| Scalar | 800 MB/s | 2 us | Fallback |
#![allow(unused)]
fn main() {
use trueno_ublk::Backend;
// Force specific backend
let config = DeviceConfig {
backend: Backend::Cuda, // NVIDIA GPU only
..Default::default()
};
// Or use adaptive (switches based on load)
let config = DeviceConfig {
backend: Backend::Adaptive {
gpu_batch_threshold: 64, // Use GPU for 64+ pages
},
..Default::default()
};
}
CLI Usage
# Create 8GB GPU-accelerated swap
sudo trueno-ublk --capacity 8G --backend auto
# Force CUDA backend with stats
sudo trueno-ublk --capacity 16G --backend cuda --stats
# Use as block device (not swap)
sudo trueno-ublk --capacity 4G --no-swap
sudo mkfs.ext4 /dev/ublkb0
sudo mount /dev/ublkb0 /mnt/fast-storage
systemd Integration
/etc/systemd/system/trueno-ublk.service:
[Unit]
Description=trueno-ublk GPU-accelerated swap
Before=swap.target
[Service]
Type=simple
ExecStart=/usr/local/bin/trueno-ublk \
--capacity 16G \
--backend auto
ExecStartPost=/sbin/mkswap /dev/ublkb0
ExecStartPost=/sbin/swapon -p 100 /dev/ublkb0
[Install]
WantedBy=swap.target
Enable:
sudo systemctl enable trueno-ublk
sudo systemctl start trueno-ublk
Performance Monitoring
#![allow(unused)]
fn main() {
use trueno_ublk::Stats;
let stats = device.stats();
println!("Compression ratio: {:.2}x", stats.compression_ratio);
println!("Read throughput: {:.1} GB/s", stats.read_gbps);
println!("Write throughput: {:.1} GB/s", stats.write_gbps);
println!("Backend: {:?}", stats.active_backend);
println!("GPU utilization: {:.0}%", stats.gpu_utilization * 100.0);
}
Example output:
┌─────────────────────────────────────────────────────┐
│ trueno-ublk stats │
├─────────────────────────────────────────────────────┤
│ Device: /dev/ublkb0 │
│ Capacity: 16 GB │
│ Used: 8.2 GB (51%) │
│ Compressed: 2.1 GB (3.9x ratio) │
│ Backend: CUDA (RTX 4090) │
│ Read: 42.3 GB/s │
│ Write: 38.7 GB/s │
│ GPU util: 23% │
└─────────────────────────────────────────────────────┘
Comparison with zram
| Feature | zram | trueno-ublk |
|---|---|---|
| Compression | CPU only | GPU/SIMD/CPU |
| Throughput | ~1 GB/s | 10-50 GB/s |
| Algorithms | LZ4/ZSTD | LZ4/ZSTD + custom |
| Batch process | No | Yes (GPU) |
| Adaptive | No | Yes |
| Kernel req | Any | 6.0+ (ublk) |
Running the Example
cargo run --example trueno_ublk_demo
Note: Running the actual ublk driver requires root privileges and Linux 6.0+.
Related Crates
- trueno-zram-core: SIMD compression algorithms used by trueno-ublk
- trueno-zram-adaptive: Entropy-based algorithm selection
- trueno: SIMD/GPU compute primitives
Navigate: Table of Contents | Previous: trueno-zram | Next: Aprender
Repartir: Distributed Computing
repartir is the Sovereign AI Stack’s distributed computing library, providing CPU, GPU, and remote task execution with work-stealing scheduling.
Overview
Key Features
- 100% Rust, Zero C/C++: Complete auditability for sovereign AI
- Work-Stealing Scheduler: Based on Blumofe & Leiserson (1999)
- Multi-Backend Execution: CPU, GPU, and Remote executors
- Iron Lotus Quality: 95% coverage, 80% mutation score
Architecture
┌─────────────────────────────────────────────────────────────┐
│ repartir Pool │
├─────────────────────────────────────────────────────────────┤
│ Scheduler │
│ (Work-Stealing, Task Queue) │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ CpuExecutor │ │ GpuExecutor │ │ RemoteExecutor │ │
│ │ │ │ │ │ │ │
│ │ Rayon-like │ │ wgpu │ │ TCP/TLS │ │
│ │ AVX2/512 │ │ Vulkan/Metal│ │ Multi-Node │ │
│ │ NEON │ │ DX12/WebGPU │ │ Distributed │ │
│ └─────────────┘ └─────────────┘ └─────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Feature Flags
| Feature | Description |
|---|---|
cpu (default) | Local multi-core execution with work-stealing |
gpu | wgpu GPU compute (Vulkan/Metal/DX12/WebGPU) |
remote | TCP-based distributed execution |
remote-tls | TLS-secured remote execution |
tensor | trueno SIMD tensor integration |
checkpoint | trueno-db + Parquet state persistence |
tui | Job flow TUI visualization |
full | All features enabled |
Quick Start
Installation
[dependencies]
repartir = { version = "1.1", features = ["cpu"] }
# With GPU support
repartir = { version = "1.1", features = ["cpu", "gpu"] }
# Full distributed with all features
repartir = { version = "1.1", features = ["full"] }
Basic CPU Pool
use repartir::{Pool, task::{Task, Backend}};
#[tokio::main]
async fn main() -> repartir::error::Result<()> {
// Create pool with 8 CPU workers
let pool = Pool::builder()
.cpu_workers(8)
.build()?;
// Submit a task
let task = Task::builder()
.binary("./worker")
.arg("--input").arg("data.csv")
.backend(Backend::Cpu)
.build()?;
let result = pool.submit(task).await?;
if result.is_success() {
println!("Output: {}", result.stdout_str()?);
}
pool.shutdown().await;
Ok(())
}
GPU Execution
use repartir::executor::gpu::GpuExecutor;
use repartir::executor::Executor;
#[tokio::main]
async fn main() -> repartir::error::Result<()> {
// Initialize GPU executor (auto-selects best GPU)
let gpu = GpuExecutor::new().await?;
println!("GPU: {}", gpu.device_name());
println!("Compute units: {}", gpu.capacity());
// GPU selection priority:
// 1. Discrete GPU (dedicated graphics)
// 2. Integrated GPU (CPU-integrated)
// 3. Software rasterizer (fallback)
Ok(())
}
Multi-Machine Distribution
Step 1: Start workers on each node
# On node1 (192.168.1.10)
repartir-worker --bind 0.0.0.0:9000
# On node2 (192.168.1.11)
repartir-worker --bind 0.0.0.0:9000
# On node3 (192.168.1.12)
repartir-worker --bind 0.0.0.0:9000
Step 2: Connect from coordinator
use repartir::executor::remote::RemoteExecutor;
use repartir::task::{Task, Backend};
#[tokio::main]
async fn main() -> repartir::error::Result<()> {
// Connect to remote workers
let executor = RemoteExecutor::builder()
.add_worker("192.168.1.10:9000")
.add_worker("192.168.1.11:9000")
.add_worker("192.168.1.12:9000")
.build()
.await?;
// Task distributed to available worker
let task = Task::builder()
.binary("./gpu-workload")
.arg("--shard=0")
.backend(Backend::Gpu)
.build()?;
let result = executor.execute(task).await?;
println!("Result: {:?}", result.stdout_str()?);
Ok(())
}
TLS-Secured Remote Execution
#![allow(unused)]
fn main() {
use repartir::executor::tls::TlsRemoteExecutor;
let executor = TlsRemoteExecutor::builder()
.add_worker("node1.internal:9443")
.cert_path("./certs/client.pem")
.key_path("./certs/client.key")
.ca_path("./certs/ca.pem")
.build()
.await?;
}
SIMD Tensor Operations
With the tensor feature, repartir integrates with trueno for SIMD-accelerated operations:
use repartir::tensor::{TensorExecutor, Tensor};
use repartir::task::Backend;
#[tokio::main]
async fn main() -> repartir::error::Result<()> {
let executor = TensorExecutor::builder()
.backend(Backend::Cpu) // Uses AVX2/AVX-512/NEON
.build()?;
let a = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0]);
let b = Tensor::from_slice(&[5.0, 6.0, 7.0, 8.0]);
// SIMD-accelerated operations
let sum = executor.add(&a, &b).await?;
let product = executor.mul(&a, &b).await?;
let dot = executor.dot(&a, &b).await?;
println!("Sum: {:?}", sum.as_slice());
println!("Product: {:?}", product.as_slice());
println!("Dot product: {}", dot);
Ok(())
}
Checkpointing
With the checkpoint feature, repartir can persist state using trueno-db and Parquet:
#![allow(unused)]
fn main() {
use repartir::checkpoint::CheckpointManager;
let checkpoint = CheckpointManager::new("./checkpoints")?;
// Save state
checkpoint.save("training_epoch_10", &model_state).await?;
// Restore on failure
let state = checkpoint.load("training_epoch_10").await?;
}
Job Flow TUI
Monitor distributed jobs with the TUI dashboard:
cargo run --bin job-flow --features tui,remote
┌─ Job Flow Monitor ─────────────────────────────────────────┐
│ Workers: 3 active │ Tasks: 45 pending / 120 completed │
├─────────────────────┴──────────────────────────────────────┤
│ Node │ Status │ Load │ Tasks │ Uptime │
├──────────────────────┼─────────┼──────┼───────┼────────────┤
│ 192.168.1.10:9000 │ Active │ 78% │ 15 │ 2h 34m │
│ 192.168.1.11:9000 │ Active │ 65% │ 18 │ 2h 34m │
│ 192.168.1.12:9000 │ Active │ 82% │ 12 │ 2h 30m │
└──────────────────────┴─────────┴──────┴───────┴────────────┘
Integration with Batuta
Batuta uses repartir for distributed orchestration:
#![allow(unused)]
fn main() {
use batuta::backend::{select_backend, to_repartir_backend, OpComplexity, DataSize};
use batuta::oracle::types::HardwareSpec;
// MoE router selects optimal backend
let backend = select_backend(
OpComplexity::High,
Some(DataSize::samples(1_000_000)),
&HardwareSpec {
has_gpu: true,
is_distributed: true,
node_count: Some(4),
..Default::default()
},
);
// Map to repartir backend
let repartir_backend = to_repartir_backend(backend);
}
Backend Selection Criteria
Batuta’s MoE router uses the 5x PCIe rule (Gregg & Hazelwood, 2011):
| Complexity | Scalar | SIMD | GPU |
|---|---|---|---|
| Low (O(n)) | <1M | >1M | Never |
| Medium (O(n log n)) | <10K | 10K-100K | >100K |
| High (O(n³)) | <1K | 1K-10K | >10K |
GPU is beneficial when: compute_time > 5 × transfer_time
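The rule is small enough to state directly in code; a std-only sketch with illustrative timings:
// The 5x PCIe rule (Gregg & Hazelwood, 2011): offload only when the GPU's
// compute time dominates the host<->device transfer time.
fn gpu_beneficial(compute_secs: f64, transfer_secs: f64) -> bool {
    compute_secs > 5.0 * transfer_secs
}

fn main() {
    // 10 ms of compute vs 1 ms of PCIe transfer -> offload pays off.
    assert!(gpu_beneficial(0.010, 0.001));
    // 1 ms of compute vs 1 ms of transfer -> stay on CPU.
    assert!(!gpu_beneficial(0.001, 0.001));
}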
Performance Considerations
Work-Stealing Efficiency
The Blumofe & Leiserson work-stealing algorithm provides:
- O(T₁/P + T∞) expected time with P processors
- Near-linear speedup for embarrassingly parallel workloads
- Low contention through randomized stealing
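To make the bound above concrete, a tiny sketch evaluating it for a hypothetical workload:
// Work-stealing runtime bound (Blumofe & Leiserson, 1999):
// T_P ≈ T1/P + T∞, where T1 = total work, T∞ = critical-path length.
fn expected_runtime(t1_secs: f64, t_inf_secs: f64, workers: f64) -> f64 {
    t1_secs / workers + t_inf_secs
}

fn main() {
    // 100 s of total work, 1 s critical path, 16 workers -> ~7.25 s.
    println!("{:.2}", expected_runtime(100.0, 1.0, 16.0));
}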
GPU vs CPU Decision
#![allow(unused)]
fn main() {
// Automatic backend selection
let backend = if data_size > 100_000 && complexity == High {
Backend::Gpu
} else if data_size > 1_000 {
Backend::Cpu // SIMD-accelerated
} else {
Backend::Cpu // Scalar
};
}
Remote Execution Overhead
- Serialization: bincode (fast, compact)
- Network: Length-prefixed TCP messages
- Latency: ~1ms per task submission (local network)
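The framing described above is simple enough to sketch end to end. The following is an illustrative reimplementation (serde-derived message, bincode 1.x, 4-byte big-endian length prefix), not repartir's actual transport code:
use serde::{Deserialize, Serialize};
use std::io::{Read, Write};

#[derive(Serialize, Deserialize)]
struct TaskMsg {
    binary: String,
    args: Vec<String>,
}

fn write_frame<W: Write>(w: &mut W, msg: &TaskMsg) -> std::io::Result<()> {
    let payload = bincode::serialize(msg).expect("TaskMsg is serializable");
    w.write_all(&(payload.len() as u32).to_be_bytes())?; // length prefix
    w.write_all(&payload)
}

fn read_frame<R: Read>(r: &mut R) -> std::io::Result<TaskMsg> {
    let mut len = [0u8; 4];
    r.read_exact(&mut len)?; // read the fixed-size prefix first
    let mut payload = vec![0u8; u32::from_be_bytes(len) as usize];
    r.read_exact(&mut payload)?; // then exactly one message
    Ok(bincode::deserialize(&payload).expect("well-formed frame"))
}
Reading the prefix first lets the receiver allocate exactly one message's worth of buffer, keeping framing overhead small relative to the ~1ms submission latency quoted above.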
Comparison with Alternatives
| Feature | repartir | Rayon | tokio | Ray |
|---|---|---|---|---|
| Language | Rust | Rust | Rust | Python |
| GPU Support | Yes (wgpu) | No | No | Yes |
| Distributed | Yes | No | No | Yes |
| Work-Stealing | Yes | Yes | No | Yes |
| TLS | Yes | N/A | Yes | Yes |
| Pure Rust | Yes | Yes | Yes | No |
Example: Distributed ML Training
#![allow(unused)]
fn main() {
use repartir::executor::remote::RemoteExecutor;
use repartir::task::{Task, Backend};
async fn distributed_training(
nodes: &[&str],
epochs: usize,
) -> repartir::error::Result<()> {
let executor = RemoteExecutor::builder()
.add_workers(nodes)
.build()
.await?;
for epoch in 0..epochs {
// Distribute training shards
let tasks: Vec<_> = (0..nodes.len())
.map(|shard| {
Task::builder()
.binary("./train")
.arg("--epoch").arg(epoch.to_string())
.arg("--shard").arg(shard.to_string())
.arg("--total-shards").arg(nodes.len().to_string())
.backend(Backend::Gpu)
.build()
})
.collect::<Result<Vec<_>, _>>()?;
// Execute in parallel
for task in tasks {
let result = executor.execute(task).await?;
println!("Shard completed: {:?}", result.exit_code());
}
println!("Epoch {} complete", epoch);
}
Ok(())
}
}
Navigate: Table of Contents | Trueno | Aprender
Pepita: Sovereign AI Kernel Interfaces
pepita is the Sovereign AI Stack’s kernel interface library, providing minimal Linux kernel interfaces (io_uring, ublk, blk-mq) and distributed computing primitives for sovereign AI workloads.
Overview
Key Features
- First-Principles Rust: Zero external dependencies in kernel mode
- 100% Rust, Zero C/C++: Complete auditability for sovereign AI
- no_std Compatible: Core kernel interfaces work without standard library
- Work-Stealing Scheduler: Blumofe-Leiserson algorithm implementation
- Iron Lotus Quality: 417 tests, 95% coverage
Design Principles
Pepita follows the Iron Lotus Framework:
- First-Principles Rust: Zero external dependencies in kernel mode
- Pure Rust Sovereignty: 100% auditable, zero C/C++ dependencies
- Toyota Way Quality: Jidoka, Poka-yoke, Genchi Genbutsu
- EXTREME TDD: Comprehensive test coverage
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ User Code │
└──────────────────────────────┬──────────────────────────────────┘
│
┌──────────────────────────────▼──────────────────────────────────┐
│ pool.rs │
│ (High-level Pool API) │
└──────────────────────────────┬──────────────────────────────────┘
│
┌──────────────────────────────▼──────────────────────────────────┐
│ scheduler.rs │
│ (Work-Stealing, Blumofe-Leiserson) │
└──────────────────────────────┬──────────────────────────────────┘
│
┌──────────────────────────────▼──────────────────────────────────┐
│ executor.rs │
│ (Backend Dispatch) │
├─────────────┬─────────────┬─────────────┬───────────────────────┤
│ CPU │ GPU │ MicroVM │ SIMD │
│ (threads) │ (wgpu) │ (KVM) │ (AVX/NEON) │
└─────────────┴──────┬──────┴──────┬──────┴───────────┬───────────┘
│ │ │
┌──────▼──────┐ ┌────▼─────┐ ┌───────▼───────┐
│ gpu.rs │ │ vmm.rs │ │ simd.rs │
│ (wgpu) │ │ (KVM) │ │ (AVX-512/NEON)│
└─────────────┘ └────┬─────┘ └───────────────┘
│
┌──────▼──────┐
│ virtio.rs │
│(vsock,block)│
└─────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Kernel Interfaces (no_std) │
├─────────────┬─────────────┬─────────────┬───────────────────────┤
│ io_uring │ ublk │ blk_mq │ memory │
│ (async I/O) │(block dev) │ (multiqueue)│ (DMA/pages) │
└─────────────┴─────────────┴─────────────┴───────────────────────┘
Module Overview
Core Kernel Interfaces (no_std compatible)
| Module | Purpose | Key Types |
|---|---|---|
| io_uring | Linux async I/O interface | IoUringSqe, IoUringCqe |
| ublk | Userspace block device driver | UblkCtrlCmd, UblkIoDesc, UblkIoCmd |
| blk_mq | Multi-queue block layer | TagSetConfig, Request, RequestOp |
| memory | Physical/virtual memory management | DmaBuffer, PageAllocator, Pfn |
| error | Unified error types | KernelError, Result |
Distributed Computing (std required)
| Module | Purpose | Key Types |
|---|---|---|
| scheduler | Work-stealing scheduler | Scheduler, WorkerDeque |
| executor | Execution backends | CpuExecutor, Backend |
| task | Task definitions | Task, TaskId, ExecutionResult |
| pool | High-level API | Pool, PoolBuilder |
| transport | Wire protocol | Message, Transport |
| fault | Fault tolerance | RetryPolicy, CircuitBreaker |
Sovereign Infrastructure (std required)
| Module | Purpose | Key Types |
|---|---|---|
| zram | Compressed RAM block device | ZramDevice, ZramConfig, ZramStats |
| vmm | KVM-based MicroVM runtime | MicroVm, VmConfig, VmState |
| virtio | Virtio device implementations | VirtQueue, VirtioVsock, VirtioBlock |
| simd | SIMD-accelerated operations | SimdCapabilities, SimdOps, MatrixOps |
| gpu | GPU compute via wgpu | GpuDevice, ComputeKernel, GpuBuffer |
Feature Flags
| Feature | Description |
|---|---|
| std (default) | Standard library support |
| kernel | True no_std without alloc |
| proptest | Property-based testing support |
Quick Start
Installation
[dependencies]
pepita = "0.1"
# Kernel mode (no_std)
pepita = { version = "0.1", default-features = false, features = ["kernel"] }
io_uring - Async I/O
#![allow(unused)]
fn main() {
use pepita::io_uring::{IoUringSqe, IoUringCqe, IORING_OP_URING_CMD};
// Submission queue entry - describes an I/O operation
let sqe = IoUringSqe::new(IORING_OP_URING_CMD, fd, addr, len);
// Completion queue entry - result of the operation
let cqe: IoUringCqe = /* from kernel */;
assert_eq!(cqe.res, 0); // Success
}
Why it matters: io_uring eliminates syscall overhead by batching I/O operations. One syscall can submit hundreds of operations.
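To make the batching point concrete, here is a hedged sketch using the queue types above; the placeholder fd/address values and the elided submit call are illustrative, and the constructor signature is assumed from the previous example rather than taken from pepita's full API:
#![allow(unused)]
fn main() {
use pepita::io_uring::{IoUringSqe, IORING_OP_URING_CMD};
// Placeholder descriptor and buffer addresses, for illustration only.
let (fd, base, page) = (3, 0x7f00_0000_0000u64, 4096u64);
// Queue 256 operations...
let sqes: Vec<IoUringSqe> = (0..256u64)
    .map(|i| IoUringSqe::new(IORING_OP_URING_CMD, fd, base + i * page, page))
    .collect();
// ...then a single io_uring_enter(2) call hands all 256 to the kernel,
// versus 256 separate read(2)/write(2) syscalls.
}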
ublk - Userspace Block Devices
#![allow(unused)]
fn main() {
use pepita::ublk::{UblkCtrlCmd, UblkIoDesc, UBLK_U_CMD_ADD_DEV};
// Control command - add a new block device
let cmd = UblkCtrlCmd::new(UBLK_U_CMD_ADD_DEV, dev_id);
// I/O descriptor - describes a read/write request
let io_desc: UblkIoDesc = /* from kernel */;
let sector = io_desc.start_sector();
}
Why it matters: ublk allows implementing block devices entirely in userspace with near-native performance.
zram - Compressed Memory
#![allow(unused)]
fn main() {
use pepita::zram::{ZramDevice, ZramConfig, ZramCompressor};
// Create a 1GB compressed RAM device
let config = ZramConfig::with_size(1024 * 1024 * 1024)
.compressor(ZramCompressor::Lz4);
let device = ZramDevice::new(config)?;
// Write a page (4KB)
let data = [0u8; 4096];
device.write_page(0, &data)?;
// Check compression stats
let stats = device.stats();
println!("Compression ratio: {:.2}x", stats.compression_ratio());
}
Why it matters: zram provides swap/storage that lives in compressed RAM. A 4GB system can effectively have 12-16GB of memory.
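The arithmetic behind that claim: LZ4 typically compresses memory pages around 3-4x, so 4 GB of physical RAM holds roughly 4 × 3-4 ≈ 12-16 GB of uncompressed pages.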
MicroVM Runtime
#![allow(unused)]
fn main() {
use pepita::vmm::{MicroVm, VmConfig, VmState};
let config = VmConfig::builder()
.vcpus(2)
.memory_mb(256)
.kernel_path("/boot/vmlinuz")
.build()?;
let vm = MicroVm::create(config)?;
vm.start()?;
let exit_reason = vm.run()?;
}
Why it matters: MicroVMs provide hardware-level isolation with sub-100ms cold start. Each function runs in its own VM.
Work-Stealing Scheduler
#![allow(unused)]
fn main() {
use pepita::scheduler::Scheduler;
use pepita::task::{Task, Priority};
let scheduler = Scheduler::with_workers(4);
let task = Task::builder()
.binary("./compute")
.priority(Priority::High)
.build()?;
scheduler.submit(task).await?;
}
Why it matters: Work stealing provides automatic load balancing. Idle workers steal from busy workers’ queues.
Integration with Repartir
Pepita provides the low-level primitives that repartir uses for its high-level distributed computing API:
#![allow(unused)]
fn main() {
// repartir uses pepita's SIMD executor
use repartir::executor::simd::{SimdExecutor, SimdTask};
let executor = SimdExecutor::new(); // Uses pepita::simd internally
let task = SimdTask::vadd_f32(a, b);
let result = executor.execute_simd(task).await?;
// repartir uses pepita's MicroVM for serverless
use repartir::executor::microvm::MicroVmExecutor;
let executor = MicroVmExecutor::new(config)?; // Uses pepita::vmm internally
}
Use Cases
Sovereign Infrastructure
Pepita provides building blocks for a complete Docker/Lambda/Kubernetes replacement in pure Rust:
| Use Case | Pepita Module |
|---|---|
| Container replacement | vmm (MicroVMs) |
| Storage backend | ublk, blk_mq |
| Swap/memory extension | zram |
| High-throughput I/O | io_uring |
| Serverless isolation | vmm + virtio |
High-Performance Computing
- SIMD acceleration: Auto-detects AVX-512/AVX2/SSE4.1/NEON
- GPU compute: Cross-platform via wgpu (Vulkan/Metal/DX12)
- Work stealing: Near-linear speedup for parallel workloads
Comparison with Alternatives
| Feature | pepita | QEMU | Firecracker | Docker |
|---|---|---|---|---|
| Language | Rust | C | Rust | Go/C |
| Isolation | VM | VM | VM | Container |
| Boot time | <100ms | seconds | ~100ms | ~500ms |
| Dependencies | 0 | many | few | many |
| Pure Rust | Yes | No | Partial | No |
| no_std | Yes | No | No | No |
Performance
running 417 tests
test result: ok. 417 passed; 0 failed; 0 ignored
Benchmarks
| Operation | pepita | Baseline |
|---|---|---|
| io_uring submit | 50ns | N/A |
| zram write (4KB) | 2us | 10us (disk) |
| MicroVM boot | 80ms | 500ms (Docker) |
| SIMD matmul (1Kx1K) | 5ms | 50ms (scalar) |
Navigate: Table of Contents | Repartir | Trueno
Aprender
This chapter is under development.
Coming soon: Detailed information about aprender.
Navigate: Table of Contents
Realizar
This chapter is under development.
Coming soon: Detailed information about realizar.
Navigate: Table of Contents
Whisper.apr: Pure Rust Speech Recognition
whisper.apr is a pure Rust implementation of OpenAI’s Whisper automatic speech recognition model, designed for the Sovereign AI Stack with WASM-first deployment and APR v2 model format.
Overview
whisper.apr delivers:
- Pure Rust: No Python, no C++ dependencies
- WASM-First: Browser deployment with full functionality
- APR v2 Format: LZ4/ZSTD compressed models
- Quantization: Int4/Int8 for reduced memory footprint
- Streaming: Real-time transcription support
- Multilingual: 99+ languages
┌─────────────────────────────────────────────────────────────┐
│ whisper.apr │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ APR v2 Model│ │ Streaming │ │ Quantization │ │
│ │ LZ4/ZSTD │ │ Transcriber │ │ Int4/Int8 │ │
│ └─────────────┘ └─────────────┘ └─────────────────────┘ │
├─────────────────────────────────────────────────────────────┤
│ trueno (SIMD) │ aprender (ML) │ realizar (inference) │
└─────────────────────────────────────────────────────────────┘
Installation
[dependencies]
whisper-apr = "0.1"
# With GPU acceleration
whisper-apr = { version = "0.1", features = ["gpu"] }
# WASM-only (smaller bundle)
whisper-apr = { version = "0.1", default-features = false, features = ["wasm"] }
Quick Start
#![allow(unused)]
fn main() {
use whisper_apr::{WhisperModel, Transcriber, TranscribeOptions};
// Load model (APR v2 format with compression)
let model = WhisperModel::load_apr("whisper-small-int8.apr")?;
let transcriber = Transcriber::new(model);
// Transcribe audio file
let result = transcriber.transcribe_file(
"audio.wav",
TranscribeOptions::default(),
)?;
println!("Text: {}", result.text);
println!("Language: {}", result.language);
// With timestamps
for segment in result.segments {
println!("[{:.2}s - {:.2}s] {}",
segment.start, segment.end, segment.text);
}
}
Model Sizes
| Model | FP32 | Int8 | Int4 | Languages |
|---|---|---|---|---|
| Tiny | 150 MB | 40 MB | 22 MB | 99+ |
| Base | 290 MB | 75 MB | 40 MB | 99+ |
| Small | 970 MB | 250 MB | 130 MB | 99+ |
| Medium | 3.0 GB | 780 MB | 400 MB | 99+ |
| Large | 6.2 GB | 1.6 GB | 820 MB | 99+ |
Streaming Transcription
Real-time transcription from audio stream:
#![allow(unused)]
fn main() {
use whisper_apr::{StreamingTranscriber, AudioChunk};
let mut streamer = StreamingTranscriber::new(model);
// Process audio chunks as they arrive
while let Some(chunk) = audio_source.next_chunk().await {
if let Some(partial) = streamer.process_chunk(&chunk)? {
print!("\r{}", partial.text); // Live update
}
}
// Finalize and get complete transcription
let final_result = streamer.finalize()?;
}
WASM Deployment
Browser-compatible transcription:
#![allow(unused)]
fn main() {
use whisper_apr::wasm::{WasmWhisper, init_wasm};
use wasm_bindgen::prelude::*;
#[wasm_bindgen]
pub async fn transcribe_audio(audio_data: &[u8]) -> Result<String, JsValue> {
init_wasm().await;
// MODEL_BYTES: model weights embedded or fetched at build time
let whisper = WasmWhisper::load_from_bytes(MODEL_BYTES).await?;
let result = whisper.transcribe(audio_data)?;
Ok(result.text)
}
}
Bundle sizes (gzipped):
| Model | WASM Runtime | Total |
|---|---|---|
| Tiny Int4 | 200 KB | 22 MB |
| Base Int4 | 200 KB | 40 MB |
| Small Int4 | 200 KB | 130 MB |
Language Detection
#![allow(unused)]
fn main() {
use whisper_apr::LanguageDetector;
let detector = LanguageDetector::new(&model);
let detection = detector.detect(&audio)?;
println!("Detected: {} ({:.1}% confidence)",
detection.language, detection.confidence * 100.0);
// Top 5 candidates
for (lang, prob) in detection.top_languages(5) {
println!(" {}: {:.1}%", lang, prob * 100.0);
}
}
Stack Integration
whisper.apr integrates with the Sovereign AI Stack:
| Dependency | Version | Purpose |
|---|---|---|
| trueno | 0.10+ | SIMD tensor operations |
| aprender | 0.20+ | ML primitives, APR v2 format |
| realizar | 0.4+ | Inference runtime (optional) |
Running the Example
cargo run --example whisper_apr_demo
Navigate: Table of Contents | Previous: Realizar | Next: trueno-zram
trueno-cuda-edge: GPU Edge-Case Testing
trueno-cuda-edge is a GPU edge-case test framework implementing Popperian falsificationism for CUDA/GPU code. It provides 5 falsification frameworks with a 50-point verification checklist.
Overview
GPU code is notoriously difficult to test due to:
- Non-deterministic behavior
- Hardware-dependent edge cases
- Complex lifecycle management
- Numerical precision variations
trueno-cuda-edge addresses these challenges with systematic falsification testing that integrates with batuta’s orchestration pipelines.
Integration with Batuta
Batuta orchestrates GPU workloads across the Sovereign AI Stack. trueno-cuda-edge validates that these orchestrations handle GPU edge cases correctly.
Pipeline Validation
Use trueno-cuda-edge to validate batuta’s GPU backend selection:
#![allow(unused)]
fn main() {
use trueno_cuda_edge::shmem_prober::{ComputeCapability, shared_memory_limit, check_allocation};
// Validate backend selection considers GPU capabilities
let ampere = ComputeCapability::new(8, 0);
assert_eq!(shared_memory_limit(ampere), 164 * 1024); // 164 KB
// Check allocation fits before dispatching
check_allocation(ampere, 128 * 1024)?;
}
Null Pointer Safety
Prevent null pointer bugs in GPU memory operations:
#![allow(unused)]
fn main() {
use trueno_cuda_edge::null_fuzzer::{NonNullDevicePtr, InjectionStrategy, NullFuzzerConfig};
// Type-safe device pointer that rejects null at construction
let ptr = NonNullDevicePtr::<f32>::new(0x7f00_0000_0000)?;
assert!(NonNullDevicePtr::<f32>::new(0).is_err());
// Fault injection for testing error handling
let config = NullFuzzerConfig {
strategy: InjectionStrategy::Periodic { interval: 10 },
total_calls: 1000,
fail_fast: false,
};
}
ML Converter Quantization Parity
Validate CPU/GPU numerical parity in batuta’s ML converters:
#![allow(unused)]
fn main() {
use trueno_cuda_edge::quant_oracle::{QuantFormat, check_values_parity, ParityConfig};
// Format-specific tolerances
assert_eq!(QuantFormat::Q4K.tolerance(), 0.05); // 5% for 4-bit
assert_eq!(QuantFormat::Q6K.tolerance(), 0.01); // 1% for 6-bit
// Compare CPU and GPU results
let config = ParityConfig::new(QuantFormat::Q4K);
let report = check_values_parity(&cpu_values, &gpu_values, &config);
assert!(report.passed());
}
PTX Kernel Validation
Validate PTX kernels generated by trueno:
#![allow(unused)]
fn main() {
use trueno_cuda_edge::ptx_poison::{PtxVerifier, PtxMutator, default_mutators};
let verifier = PtxVerifier::new();
// Structural verification (6 checks)
let verified = verifier.verify(ptx_source)?;
// Mutation testing with 8 operators
let mutators = default_mutators();
let mutated = PtxMutator::FlipAddSub.apply(ptx_source);
}
Falsification Frameworks
F1: Null Pointer Sentinel Fuzzer
- NonNullDevicePtr<T>: Type-safe device pointer
- InjectionStrategy: Periodic, SizeThreshold, Probabilistic, Targeted
- NullSentinelFuzzer: State machine for null injection
F2: Shared Memory Boundary Prober
- ComputeCapability: GPU capability detection
- shared_memory_limit(): SM-specific limits
- check_allocation(): Validate before dispatch
F3: Context Lifecycle Chaos
- ChaosScenario: 8 lifecycle edge cases
- ContextLeakDetector: Memory leak detection
- 1 MB tolerance for driver allocations
F4: Quantization Parity Oracle
- QuantFormat: Q4K, Q5K, Q6K, Q8_0, F16, F32
- BoundaryValueGenerator: Edge case inputs
- check_values_parity(): CPU/GPU comparison
F5: PTX Compilation Poison Trap
- PtxVerifier: 6 structural checks
- PtxMutator: 8 mutation operators
- Mutation score calculation
50-Point Falsification Protocol
Track verification coverage:
#![allow(unused)]
fn main() {
use trueno_cuda_edge::falsification::{FalsificationReport, all_claims};
let mut report = FalsificationReport::new();
// Mark claims as verified during testing
report.mark_verified("NF-001"); // Null fuzzer claim
report.mark_verified("QO-001"); // Quantization oracle claim
// Track coverage
println!("Coverage: {:.1}%", report.coverage() * 100.0);
assert!(report.coverage() >= 0.80); // 80% minimum for release
}
Supervision Integration
Erlang OTP-style supervision for GPU workers:
#![allow(unused)]
fn main() {
use trueno_cuda_edge::supervisor::{
SupervisionStrategy, SupervisionTree, GpuHealthMonitor, HeartbeatStatus
};
// OneForOne: isolated restarts
let mut tree = SupervisionTree::new(SupervisionStrategy::OneForOne, 4);
// Health monitoring
let monitor = GpuHealthMonitor::builder()
.max_missed(3)
.throttle_temp(85)
.shutdown_temp(95)
.build();
// Check worker health
let action = monitor.check_status(HeartbeatStatus::MissedBeats(2));
}
Model Serving Ecosystem
The Model Serving Ecosystem provides a unified interface for local and remote model serving across the ML ecosystem. Built on Toyota Way principles, it ensures reliable, cost-effective, and privacy-aware model inference.
Toyota Way Principles
| Principle | Implementation |
|---|---|
| Standardized Work | Chat templates ensure consistent model interaction |
| Poka-Yoke | Privacy gates prevent accidental data leakage |
| Jidoka | Stateful failover maintains context on errors |
| Muda Elimination | Cost circuit breakers prevent waste |
| Heijunka | Spillover routing enables load leveling |
Components
ChatTemplateEngine
Unified prompt templating supporting multiple formats:
#![allow(unused)]
fn main() {
use batuta::serve::{ChatTemplateEngine, ChatMessage, TemplateFormat};
// Auto-detect from model name
let engine = ChatTemplateEngine::from_model("llama-2-7b-chat");
let messages = vec![
ChatMessage::system("You are a helpful assistant."),
ChatMessage::user("What is Rust?"),
];
let prompt = engine.apply(&messages);
}
Supported Formats:
- Llama2 - Meta's Llama 2 format with [INST] tags
- Mistral - Mistral's format (similar to Llama2)
- ChatML - OpenAI-style <|im_start|> format
- Alpaca - Stanford Alpaca instruction format
- Vicuna - Vicuna conversation format
- Raw - Passthrough without formatting
BackendSelector
Intelligent backend selection with privacy tiers:
#![allow(unused)]
fn main() {
use batuta::serve::{BackendSelector, LatencyTier, PrivacyTier, ServingBackend};
let selector = BackendSelector::new()
.with_privacy(PrivacyTier::Sovereign) // Local only
.with_latency(LatencyTier::Interactive);
let backends = selector.recommend();
// Returns: [Realizar, Ollama, LlamaCpp]
}
Privacy Tiers:
| Tier | Description | Allowed Backends |
|---|---|---|
| Sovereign | Local only, blocks ALL external API calls | Realizar, Ollama, LlamaCpp, Llamafile, Candle, Vllm, Tgi, LocalAI |
| Private | Dedicated/VPC endpoints only | Local + AzureOpenAI, AwsBedrock, GoogleVertex |
| Standard | Public APIs acceptable | All backends |
Supported Backends:
Local (8):
- Realizar, Ollama, LlamaCpp, Llamafile, Candle, Vllm, Tgi, LocalAI
Remote (12):
- HuggingFace, Together, Replicate, Anyscale, Modal, Fireworks, Groq
- OpenAI, Anthropic, AzureOpenAI, AwsBedrock, GoogleVertex
CostCircuitBreaker
Daily budget limits to prevent runaway costs:
#![allow(unused)]
fn main() {
use batuta::serve::{CostCircuitBreaker, CircuitBreakerConfig};
let config = CircuitBreakerConfig {
daily_budget_usd: 10.0,
warning_threshold: 0.8, // Warn at 80%
max_request_cost_usd: 1.0,
..Default::default()
};
let breaker = CostCircuitBreaker::new(config);
// Before each request
match breaker.check(estimated_cost) {
Ok(_) => { /* proceed */ },
Err(CostError::DailyBudgetExceeded { .. }) => { /* block */ },
Err(CostError::RequestTooExpensive { .. }) => { /* reject */ },
}
// After request completes
breaker.record(actual_cost);
}
Token Pricing (per 1M tokens):
| Model | Input | Output |
|---|---|---|
| GPT-4 Turbo | $10.00 | $30.00 |
| GPT-4 | $30.00 | $60.00 |
| GPT-3.5 Turbo | $0.50 | $1.50 |
| Claude 3 Opus | $15.00 | $75.00 |
| Claude 3 Sonnet | $3.00 | $15.00 |
| Claude 3 Haiku | $0.25 | $1.25 |
| Llama (local) | $0.00 | $0.00 |
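The circuit breaker's estimated_cost can be derived directly from this table. A hedged sketch of the arithmetic (illustrative helper, not batuta's API; prices are per 1M tokens):
#![allow(unused)]
fn main() {
fn estimate_cost_usd(
    input_tokens: u64,
    output_tokens: u64,
    input_price_per_m: f64,
    output_price_per_m: f64,
) -> f64 {
    (input_tokens as f64 / 1_000_000.0) * input_price_per_m
        + (output_tokens as f64 / 1_000_000.0) * output_price_per_m
}
// Example: 2,000 prompt tokens + 500 completion tokens on Claude 3 Sonnet
// ($3.00 input / $15.00 output) = 0.006 + 0.0075 = $0.0135 per request.
let cost = estimate_cost_usd(2_000, 500, 3.00, 15.00);
}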
ContextManager
Automatic token counting and context truncation:
#![allow(unused)]
fn main() {
use batuta::serve::{ContextManager, TruncationStrategy};
let manager = ContextManager::for_model("gpt-4-turbo");
// Check if messages fit
if manager.fits(&messages) {
// Proceed directly
} else {
// Truncate using strategy
let truncated = manager.truncate(&messages)?;
}
}
Context Windows:
| Model | Max Tokens | Output Reserve |
|---|---|---|
| GPT-4 Turbo | 128,000 | 4,096 |
| GPT-4 | 8,192 | 2,048 |
| Claude 3 | 200,000 | 4,096 |
| Llama 3 | 8,192 | 2,048 |
| Mixtral | 32,768 | 4,096 |
Truncation Strategies:
- SlidingWindow - Remove oldest messages first
- MiddleOut - Keep first and last, remove middle
- Error - Fail instead of truncating
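A request fits when its prompt tokens stay within max tokens minus the output reserve; for GPT-4 Turbo in the table above that leaves 128,000 − 4,096 = 123,904 tokens of input budget before a truncation strategy kicks in.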
FailoverManager
Stateful failover for streaming with context preservation:
#![allow(unused)]
fn main() {
use batuta::serve::{FailoverManager, ServingBackend};
let mut manager = FailoverManager::with_defaults();
// Start tracking
manager.start_tracking("req-123", "Original prompt");
// Accumulate tokens during streaming
manager.append_tokens("req-123", "Generated ");
manager.append_tokens("req-123", "tokens here");
// On failure, prepare failover
if manager.should_failover("req-123") {
let failover_request = manager.prepare_failover("req-123");
// Contains continuation prompt with generated prefix
}
// On success
manager.complete("req-123");
}
SpilloverRouter
Hybrid cloud spillover routing for load leveling:
#![allow(unused)]
fn main() {
use batuta::serve::{SpilloverRouter, RouterConfig};
let config = RouterConfig {
spillover_threshold: 10, // Queue depth before spillover
max_queue_depth: 50,
local_backend: ServingBackend::Realizar,
spillover_backends: vec![
ServingBackend::Groq,
ServingBackend::Together,
],
..Default::default()
};
let router = SpilloverRouter::new(config);
match router.route() {
RoutingDecision::Local(backend) => { /* use local */ },
RoutingDecision::Spillover(backend) => { /* use remote */ },
RoutingDecision::Reject(reason) => { /* queue full */ },
}
}
Integration Example
Complete example combining all components:
#![allow(unused)]
fn main() {
use batuta::serve::{
ChatTemplateEngine, ChatMessage,
BackendSelector, PrivacyTier,
CostCircuitBreaker, CircuitBreakerConfig,
ContextManager,
SpilloverRouter, RouterConfig,
};
// 1. Select backend based on privacy requirements
let selector = BackendSelector::new()
.with_privacy(PrivacyTier::Private);
let backend = selector.recommend().first().copied()
.expect("No backend available");
// 2. Check cost budget
let breaker = CostCircuitBreaker::with_defaults();
let estimated_cost = 0.01;
breaker.check(estimated_cost)?;
// 3. Prepare messages with context management
let messages = vec![
ChatMessage::system("You are helpful."),
ChatMessage::user("Explain quantum computing."),
];
let manager = ContextManager::for_model("llama-2-70b");
let messages = manager.truncate(&messages)?;
// 4. Apply chat template
let engine = ChatTemplateEngine::from_model("llama-2-70b");
let prompt = engine.apply(&messages);
// 5. Route request
let router = SpilloverRouter::with_defaults();
let decision = router.route();
// 6. Execute and record cost
// ... inference call ...
breaker.record(actual_cost);
}
Configuration
Default configurations are provided for common use cases:
#![allow(unused)]
fn main() {
// Sovereign mode - local only
let config = RouterConfig::sovereign();
// Enterprise mode - private endpoints
let selector = BackendSelector::new()
.with_privacy(PrivacyTier::Private);
// Cost-conscious mode
let config = CircuitBreakerConfig {
daily_budget_usd: 5.0,
max_request_cost_usd: 0.50,
..Default::default()
};
}
Model Security (Spec §8)
The serving ecosystem integrates with Pacha’s security features for model integrity and confidentiality.
Model Signing (§8.2)
Ed25519 digital signatures ensure model integrity:
#![allow(unused)]
fn main() {
use pacha::signing::{generate_keypair, sign_model, verify_model, ModelSignature};
// Generate signing keypair (once)
let (signing_key, verifying_key) = generate_keypair();
// Sign model before distribution
let model_data = std::fs::read("model.gguf")?;
let signature = sign_model(&model_data, &signing_key)?;
signature.save("model.gguf.sig")?;
// Verify before loading
let sig = ModelSignature::load("model.gguf.sig")?;
verify_model(&model_data, &sig)?;
}
CLI Usage:
# Generate signing key
batuta pacha keygen --identity alice@example.com
# Sign a model
batuta pacha sign model.gguf --identity alice@example.com
# Verify signature
batuta pacha verify model.gguf
Encryption at Rest (§8.3)
ChaCha20-Poly1305 encryption for secure model distribution:
#![allow(unused)]
fn main() {
use pacha::crypto::{encrypt_model, decrypt_model, is_encrypted};
// Encrypt for distribution
let encrypted = encrypt_model(&model_data, "secure-password")?;
std::fs::write("model.gguf.enc", &encrypted)?;
// Decrypt at load time
let encrypted = std::fs::read("model.gguf.enc")?;
if is_encrypted(&encrypted) {
let password = std::env::var("MODEL_KEY")?;
let decrypted = decrypt_model(&encrypted, &password)?;
}
}
CLI Usage:
# Encrypt model
batuta pacha encrypt model.gguf --password-env MODEL_KEY
# Decrypt at runtime
MODEL_KEY=secret batuta pacha decrypt model.gguf.enc
Encrypted File Format:
- Magic: PACHAENC (8 bytes)
- Version: 1 byte
- Salt: 32 bytes (key derivation)
- Nonce: 12 bytes
- Ciphertext: variable
- Auth tag: 16 bytes
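The fixed overhead is therefore 8 + 1 + 32 + 12 + 16 = 69 bytes plus the authenticated ciphertext, which is negligible for model-sized payloads.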
Content-Addressed Storage (§8.1)
All models in Pacha are content-addressed with BLAKE3:
#![allow(unused)]
fn main() {
// Verify before loading
let expected = "blake3:a1b2c3...";
let actual = blake3::hash(&model_data);
assert_eq!(expected, format!("blake3:{}", actual.to_hex()));
}
Feature Flag
The serve module requires the native feature:
[dependencies]
batuta = { version = "0.1", features = ["native"] }
Support Tools
The Sovereign AI Stack includes essential support tools for scripting, quality analysis, and system tracing. These tools integrate with Batuta’s orchestration workflow.
Tool Overview
| Tool | Purpose | Integration Point |
|---|---|---|
| Ruchy | Rust scripting language | Embedded scripting, automation |
| PMAT | Quality analysis (TDG scoring) | Phase 1: Analysis, CI/CD gates |
| APR-QA | APR model validation | Model quality assurance |
| Renacer | Syscall tracing | Phase 4: Validation |
Ruchy: Rust Scripting
Ruchy provides a scripting language that compiles to Rust, enabling:
- Automation scripts: Build, deployment, data processing
- Embedded scripting: In Presentar apps (Section 8)
- REPL development: Interactive exploration
// Ruchy script for data processing
let data = load_dataset("transactions")
let filtered = data.filter(|row| row.amount > 100)
let aggregated = filtered.group_by("category").sum("amount")
save_dataset(aggregated, "output.ald")
Security (in Presentar):
- Max 1M instructions per script
- Max 16MB memory allocation
- 10ms time slices (cooperative yielding)
PMAT: Quality Analysis
PMAT computes Technical Debt Grade (TDG) scores for projects:
- 0-100 scale: F, D, C-, C, C+, B-, B, B+, A-, A, A+
- Multi-language: Rust, Python, C/C++, Shell
- Metrics: Complexity, coverage, duplication, dependencies
# Analyze a project
pmat analyze ./myproject --output report.json
# CI gate (fail if below B+)
pmat gate ./myproject --min-grade B+
Integration with Batuta:
- Phase 1 (Analysis): Initial TDG assessment
- Phase 4 (Validation): Post-transpilation quality check
- CI/CD: Gate enforcement
Renacer: Syscall Tracing
Renacer captures system call traces for validation:
- Deterministic replay: Ensures transpiled code matches original behavior
- Golden trace comparison: Baseline vs current
- Cross-platform: Linux, macOS, Windows
# Capture baseline trace
renacer capture ./original_binary -- args > baseline.trace
# Compare against transpiled
renacer compare baseline.trace ./transpiled_binary -- args
Integration with Batuta:
- Phase 4 (Validation): Behavioral equivalence testing
APR-QA: Model Quality Assurance
APR-QA provides a comprehensive QA playbook for APR models:
- Test Generation: Automatic QA test generation for APR models
- Model Validation: Verify model correctness and integrity
- Benchmark Runner: Performance benchmarks on APR models
- Coverage Reports: Model coverage analysis and reporting
# Generate QA tests for an APR model
apr-qa gen model.apr --output tests/
# Run QA suite
apr-qa run tests/ --report report.html
# Quick validation
apr-qa validate model.apr
Integration with Batuta:
- Stack quality gates for APR model artifacts
- Integration with certeza for CI/CD pipelines
- Works with aprender (training) and realizar (inference)
Additional Support Tools
Trueno-RAG (v0.1.0)
Retrieval-Augmented Generation pipeline built on Trueno:
- Vector similarity search
- Document chunking
- Embedding generation
Trueno-Graph
Graph data structures and algorithms:
- Property graphs
- Traversal operations
- Connected component analysis
Trueno-DB
Embedded database with Trueno compute:
- Column-store backend
- SQL-like query interface
- ACID transactions
Tool Ecosystem Map
┌─────────────────────────────────────────────────────────────────┐
│ Batuta (Orchestration) │
├─────────────────────────────────────────────────────────────────┤
│ Transpilers │ Support Tools │ Data/ML │
│ ├── Depyler │ ├── Ruchy │ ├── Alimentar │
│ ├── Decy │ ├── PMAT │ ├── Aprender │
│ └── Bashrs │ ├── APR-QA │ └── Realizar │
│ │ └── Renacer │ │
├─────────────────────────────────────────────────────────────────┤
│ Visualization │ Extensions │ Registry │
│ ├── Trueno-Viz │ ├── Trueno-RAG │ └── Pacha │
│ └── Presentar │ ├── Trueno-Graph │ │
│ │ └── Trueno-DB │ │
└─────────────────────────────────────────────────────────────────┘
Further Reading
Navigate: Table of Contents | Foundation Libraries
Ruchy: Systems Scripting to Rust
“Write scripts with shell-like ergonomics, get idiomatic Rust with extreme quality.”
Ruchy is a systems scripting language that transpiles to idiomatic Rust. It bridges the gap between quick shell scripts and production-grade Rust code, with built-in extreme TDD methodology.
Overview
| Attribute | Value |
|---|---|
| Version | 3.213.0 |
| Layer | L3: Transpilers |
| Direction | Script → Rust |
| Repository | github.com/paiml/ruchy |
Why Ruchy?
The Shell Script Problem
Shell scripts are:
- Quick to write
- Hard to maintain
- Impossible to test properly
- Platform-dependent
- Error-prone (silent failures)
The Rust Solution Problem
Rust is:
- Safe and fast
- Verbose for simple tasks
- Steep learning curve for scripts
- Overkill for one-off automation
Ruchy: Best of Both Worlds
Shell Ergonomics + Rust Safety = Ruchy
Capabilities
script_to_rust
Transpile ruchy scripts to idiomatic Rust:
#!/usr/bin/env ruchy
# Ruchy script - shell-like syntax
let files = glob("src/**/*.rs")
for file in files {
let content = read(file)
if content.contains("TODO") {
println("Found TODO in {file}")
}
}
Transpiles to:
use std::fs;
use glob::glob;
fn main() -> anyhow::Result<()> {
let files: Vec<_> = glob("src/**/*.rs")?.collect();
for file in files {
let file = file?;
let content = fs::read_to_string(&file)?;
if content.contains("TODO") {
println!("Found TODO in {}", file.display());
}
}
Ok(())
}
shell_semantics
Shell-like semantics with Rust safety guarantees:
# Pipeline syntax
let result = cat("data.txt") | grep("error") | wc("-l")
# Command execution with proper error handling
let output = exec("cargo", ["build", "--release"])?
# Environment variables
let home = env("HOME")
let path = env("PATH").split(":")
# Process management
let pid = spawn("./server", ["--port", "8080"])
wait(pid)?
wasm_target
Compile ruchy scripts to WebAssembly:
# Compile to WASM
ruchy build --target wasm32-unknown-unknown script.rcy
# Run in browser or Node.js
node run_wasm.js
extreme_tdd
Built-in extreme TDD methodology:
#!/usr/bin/env ruchy
#[test]
fn test_file_processing() {
let temp = tempfile()
write(temp, "hello\nworld\n")
let lines = read_lines(temp)
assert_eq(lines.len(), 2)
assert_eq(lines[0], "hello")
}
# Property-based testing
#[proptest]
fn test_reverse_invariant(s: String) {
assert_eq(s.reverse().reverse(), s)
}
Integration with Batuta
Ruchy integrates seamlessly with the batuta orchestration pipeline:
#!/usr/bin/env ruchy
# Automated migration pipeline
let project = env("PROJECT_PATH")
# Phase 1: Analysis
println("Analyzing {project}...")
let analysis = batuta::analyze(project)?
# Phase 2: Transpilation
if analysis.languages.contains("python") {
println("Transpiling Python code...")
batuta::transpile(project, ["--incremental"])?
}
# Phase 3: Validation
println("Running validation...")
let result = batuta::validate(project)?
if result.passed {
println("Migration successful!")
} else {
println("Validation failed: {result.errors}")
exit(1)
}
Integration with Renacer
Automate syscall tracing with ruchy:
#!/usr/bin/env ruchy
# Performance regression testing
let binary = "target/release/myapp"
let baseline = "golden_traces/baseline.json"
# Capture new trace
let trace = renacer::trace(binary, ["--format", "json"])?
# Compare with baseline
let diff = renacer::compare(baseline, trace)?
if diff.regression_detected {
println("Performance regression detected!")
println("Syscall count: {diff.baseline_count} -> {diff.current_count}")
exit(1)
}
println("No regression detected")
CLI Usage
# Run a ruchy script
ruchy run script.rcy
# Transpile to Rust
ruchy transpile script.rcy -o output.rs
# Build to binary
ruchy build script.rcy
# Build to WASM
ruchy build --target wasm32 script.rcy
# Run tests
ruchy test script.rcy
# Format code
ruchy fmt script.rcy
Example: CI/CD Automation
#!/usr/bin/env ruchy
# ci.rcy - CI pipeline in ruchy
# Run linting
println("Running clippy...")
exec("cargo", ["clippy", "--", "-D", "warnings"])?
# Run tests with coverage
println("Running tests...")
exec("cargo", ["llvm-cov", "--lcov", "--output-path", "lcov.info"])?
# Check coverage threshold
let coverage = parse_lcov("lcov.info")
if coverage.line_rate < 0.95 {
println("Coverage {coverage.line_rate * 100}% < 95% threshold")
exit(1)
}
# Build release
println("Building release...")
exec("cargo", ["build", "--release"])?
println("CI passed!")
Comparison
| Feature | Shell | Python | Rust | Ruchy |
|---|---|---|---|---|
| Quick scripts | Yes | Yes | No | Yes |
| Type safety | No | No | Yes | Yes |
| Error handling | Poor | Ok | Excellent | Excellent |
| Performance | Ok | Ok | Excellent | Excellent |
| Testability | Poor | Good | Excellent | Excellent |
| Cross-platform | No | Yes | Yes | Yes |
| WASM support | No | No | Yes | Yes |
Key Takeaways
- Shell ergonomics: Write scripts as easily as bash
- Rust output: Get safe, fast, idiomatic Rust code
- Extreme TDD: Built-in testing methodology
- WASM ready: Compile to WebAssembly
- Batuta integration: Drive migration pipelines
Previous: Bashrs: Rust to Shell Next: Batuta: Workflow Orchestrator
PMAT: Quality Analysis
“PMAT (Pragmatic Metrics & Analysis Tool) provides TDG scoring, complexity analysis, and adaptive quality assessment for Batuta workflows.”
Overview
PMAT is Batuta’s quality analysis tool that measures code quality and generates actionable roadmaps:
- TDG (Technical Debt Grade): A-F grade for code quality
- Complexity analysis: Cyclomatic and cognitive complexity metrics
- Adaptive analysis: Muda (waste) elimination through smart analysis
- Roadmap generation: Prioritized task lists for improvement
- Multi-language support: Python, C, C++, Rust, Shell
Installation
# Install from crates.io
cargo install pmat
# Verify installation
pmat --version
# Output: pmat 2.199.0
Basic Usage
TDG Scoring
Calculate Technical Debt Grade for a project:
# Analyze current directory
pmat tdg .
# Output:
# 📊 Technical Debt Grade (TDG): B
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# Complexity: 72/100 (Good)
# Maintainability: 68/100 (Fair)
# Test Coverage: 85/100 (Excellent)
# Documentation: 45/100 (Poor)
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# Overall Score: 67.5/100 → Grade B
Complexity Analysis
Measure code complexity:
# Analyze complexity (JSON output)
pmat analyze complexity src/ --format json
# Output:
# {
# "files": [
# {
# "path": "src/main.rs",
# "cyclomatic_complexity": 12,
# "cognitive_complexity": 8,
# "lines_of_code": 245
# }
# ],
# "total_complexity": 12,
# "average_complexity": 3.2
# }
Language Detection
Detect languages in a project:
pmat detect languages /path/to/project
# Output:
# Python: 65% (12,450 lines)
# C: 25% (4,780 lines)
# Shell: 10% (1,920 lines)
Batuta Integration
Batuta uses PMAT for Phase 1 (Analysis):
# Batuta automatically runs PMAT
batuta analyze /path/to/project
# Internally calls:
pmat tdg /path/to/project
pmat analyze complexity /path/to/project --format json
pmat detect languages /path/to/project
Output integrates into Batuta’s analysis phase:
Phase 1: Analysis [████████████████████] 100%
✓ Language detection (Python: 65%, C: 25%, Shell: 10%)
✓ TDG score: B (67.5/100)
✓ Complexity: Medium (avg: 3.2)
✓ Recommendations: 5 optimizations identified
TDG Scoring System
Grade Scale
| Grade | Score | Interpretation |
|---|---|---|
| A | 90-100 | Excellent - minimal technical debt |
| B | 80-89 | Good - manageable technical debt |
| C | 70-79 | Fair - moderate technical debt |
| D | 60-69 | Poor - significant technical debt |
| F | <60 | Critical - severe technical debt |
Components
TDG is calculated from four weighted metrics:
- Complexity (30%): Cyclomatic and cognitive complexity
- Maintainability (25%): Code duplication, naming, structure
- Test Coverage (25%): Unit test coverage percentage
- Documentation (20%): Inline comments, API docs, README
Formula:
TDG = (Complexity × 0.30) + (Maintainability × 0.25) +
(TestCoverage × 0.25) + (Documentation × 0.20)
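Worked example (illustrative numbers): component scores of 80, 70, 90, and 50 yield 80 × 0.30 + 70 × 0.25 + 90 × 0.25 + 50 × 0.20 = 24 + 17.5 + 22.5 + 10 = 74, which lands in the C band of the grade scale above.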
Complexity Metrics
Cyclomatic Complexity
Number of independent paths through code:
| Complexity | Rating | Action |
|---|---|---|
| 1-10 | Simple | No action needed |
| 11-20 | Moderate | Consider refactoring |
| 21-50 | Complex | Refactor recommended |
| >50 | Very Complex | Refactor required |
Example:
#![allow(unused)]
fn main() {
fn example(x: i32) -> i32 {
if x > 0 { // +1
if x > 10 { // +1
x * 2
} else { // +1
x + 1
}
} else {
x - 1
}
}
// Cyclomatic Complexity: 3
}
Cognitive Complexity
Measures how difficult code is to understand:
- Nested conditions: +1 per level
- Recursion: +1
- Logical operators: +1 per operator
- Goto statements: +5
Lower is better - aim for cognitive complexity < 15.
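A small sketch of how those increments add up in practice (annotations follow the rules above; real tools may count slightly differently):
#![allow(unused)]
fn main() {
fn classify(x: i32, y: i32) -> &'static str {
    if x > 0 {                  // +1 condition
        if y > 0 || y < -10 {   // +1 condition, +1 nesting level, +1 for ||
            return "quadrant";
        }
    }
    "axis"
}
// Cognitive complexity ≈ 4 - comfortably under the < 15 target.
}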
Adaptive Analysis (Muda Elimination)
PMAT implements Muda (waste elimination) by skipping redundant analysis:
File Caching
Skip analysis of unchanged files:
# First run: analyzes all files
pmat analyze complexity src/
# Second run: only analyzes changed files
pmat analyze complexity src/
# ⏭️ Skipped 42 unchanged files (Muda elimination)
# 📊 Analyzed 3 changed files
Incremental TDG
Update TDG score incrementally:
# Initial full analysis
pmat tdg . --full
# Incremental update (only changed files)
pmat tdg . --incremental
# ⚡ Incremental TDG: B → A (3 files improved)
Roadmap Generation
PMAT generates prioritized improvement roadmaps:
pmat roadmap generate /path/to/project
# Output:
# 📋 Improvement Roadmap
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# Priority 1 (Critical):
# • Reduce complexity in src/pipeline.rs (CC: 45)
# • Add tests for src/converter.rs (0% coverage)
#
# Priority 2 (High):
# • Document public API in src/lib.rs
# • Refactor src/analyzer.rs (duplicated code)
#
# Priority 3 (Medium):
# • Improve naming in src/utils.rs
# • Add examples to README.md
Command-Line Options
pmat [COMMAND] [OPTIONS]
COMMANDS:
tdg Calculate Technical Debt Grade
analyze Run specific analysis
detect Detect project attributes
roadmap Generate improvement roadmap
work Workflow management
ANALYZE SUBCOMMANDS:
complexity Measure code complexity
coverage Analyze test coverage
duplication Detect code duplication
DETECT SUBCOMMANDS:
languages Detect programming languages
frameworks Detect ML frameworks
OPTIONS:
--format <FORMAT> Output format: text, json, html [default: text]
--full Force full analysis (disable caching)
--strict Fail on warnings
-h, --help Print help
-V, --version Print version
Workflow Management
PMAT integrates with Batuta’s workflow:
# Continue from last task
pmat work continue
# Start specific task
pmat work start BATUTA-008
# List available tasks
pmat work list
# Show workflow status
pmat work status
Example output:
📋 Workflow Status
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Phase 3: ML Library Conversion (60%)
In Progress:
• BATUTA-008: NumPy → Trueno [████████░░] 80%
• BATUTA-009: sklearn → Aprender [██████░░░░] 60%
Pending:
• BATUTA-010: PyTorch → Realizar
• BATUTA-012: PARF Analysis
Configuration
Configure PMAT via .pmat.toml:
[analysis]
# Skip patterns
skip = [
"target/",
"node_modules/",
"*.pyc"
]
# Complexity thresholds
max_cyclomatic_complexity = 15
max_cognitive_complexity = 20
[tdg]
# Custom weights
complexity_weight = 0.30
maintainability_weight = 0.25
coverage_weight = 0.25
documentation_weight = 0.20
[muda]
# Enable adaptive analysis
enable_caching = true
cache_dir = ".pmat-cache/"
Integration with Make
Add PMAT to Makefile:
# Run TDG analysis
tdg:
	@command -v pmat >/dev/null 2>&1 || { echo "Error: pmat not installed"; exit 1; }
	pmat tdg src/
# Quality gate (fail if TDG < B)
quality: lint test coverage tdg
	@echo "✅ All quality gates passed"
Usage:
make tdg # Calculate TDG score
make quality # Run all quality checks
Version
Current version: 2.199.0
Check installed version:
pmat --version
Update to latest:
cargo install pmat --force
Next Steps
- Renacer: Syscall Tracing: Runtime validation
- TDG Scoring: Deep dive into TDG calculation
- Phase 1: Analysis: Batuta’s analysis workflow
Navigate: Table of Contents
OIP: Defect Intelligence
“OIP (Organizational Intelligence Plugin) provides ML-powered defect pattern analysis and spectrum-based fault localization.”
Overview
OIP analyzes git history and test coverage to identify defect patterns and locate bugs:
- SBFL Fault Localization: Tarantula, Ochiai, DStar algorithms
- Defect Classification: ML-based commit labeling
- Training Data Extraction: Convert git history to ML training data
- RAG Enhancement: Knowledge retrieval with trueno-rag
- Ensemble Models: Weighted multi-model predictions
Installation
# Install from crates.io
cargo install oip
# Verify installation
oip --version
# Output: oip 0.3.1
Basic Usage
Training Data Extraction
Extract defect patterns from git history:
oip extract-training-data --repo /path/to/project --max-commits 500
# Output:
# Training Data Statistics:
# Total examples: 146
# Avg confidence: 0.84
#
# Class Distribution:
# ASTTransform: 53 (36.3%)
# OwnershipBorrow: 43 (29.5%)
# ComprehensionBugs: 12 (8.2%)
# ...
Fault Localization
Find suspicious lines using SBFL:
oip localize \
--passed-coverage passed.lcov \
--failed-coverage failed.lcov \
--formula tarantula \
--top-n 10
# Output:
# 🎯 Tarantula Hotspot Report
# Line | Suspiciousness | Status
# ------|----------------|--------
# 142 | 0.950 | 🔴 HIGH
# 287 | 0.823 | 🔴 HIGH
# 56 | 0.612 | 🟡 MEDIUM
SBFL Formulas
OIP supports multiple fault localization formulas:
| Formula | Description | Best For |
|---|---|---|
| Tarantula | Classic SBFL | General use |
| Ochiai | Cosine similarity | High precision |
| DStar2 | D* with power 2 | Balanced |
| DStar3 | D* with power 3 | Aggressive |
Suspiciousness Calculation
Tarantula formula:
suspiciousness = (failed(line) / total_failed) /
((failed(line) / total_failed) + (passed(line) / total_passed))
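A direct transcription of the formula (illustrative, not oip's internals):
#![allow(unused)]
fn main() {
// failed_line / passed_line: tests of each kind that executed the line.
fn tarantula(failed_line: f64, total_failed: f64,
             passed_line: f64, total_passed: f64) -> f64 {
    let fail_ratio = failed_line / total_failed;
    let pass_ratio = passed_line / total_passed;
    fail_ratio / (fail_ratio + pass_ratio)
}
// Example: a line hit by 9 of 10 failing tests but only 10 of 100 passing
// tests scores 0.9 / (0.9 + 0.1) = 0.90 -> flagged HIGH, as in the report above.
}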
Defect Pattern Categories
OIP classifies defects into these categories:
| Category | Description | Example |
|---|---|---|
| TraitBounds | Missing or incorrect trait bounds | T: Clone + Send |
| ASTTransform | Syntax/structure issues | Macro expansion bugs |
| OwnershipBorrow | Ownership/lifetime errors | Use after move |
| ConfigurationErrors | Config/environment issues | Missing feature flag |
| ConcurrencyBugs | Race conditions | Data races |
| SecurityVulnerabilities | Security issues | Buffer overflow |
| TypeErrors | Type mismatches | Wrong generic |
| MemorySafety | Memory bugs | Dangling pointer |
Advanced Features
RAG Enhancement
Use knowledge retrieval for better localization:
oip localize \
--passed-coverage passed.lcov \
--failed-coverage failed.lcov \
--rag \
--knowledge-base bugs.yaml \
--fusion rrf
Ensemble Models
Combine multiple models for higher accuracy:
oip localize \
--passed-coverage passed.lcov \
--failed-coverage failed.lcov \
--ensemble \
--ensemble-model trained-model.bin \
--include-churn
Calibrated Predictions
Get confidence-calibrated outputs:
oip localize \
--passed-coverage passed.lcov \
--failed-coverage failed.lcov \
--calibrated \
--calibration-model calibration.bin \
--confidence-threshold 0.7
Integration with Batuta
OIP integrates with Batuta’s validation phase:
# Batuta can invoke OIP for fault analysis
batuta validate --fault-localize
Comparison with pmat
| Capability | pmat | oip |
|---|---|---|
| SATD Detection | ✅ | ❌ |
| TDG Scoring | ✅ | ❌ |
| Complexity Analysis | ✅ | ❌ |
| Fault Localization | ❌ | ✅ |
| Defect ML | ❌ | ✅ |
| RAG Enhancement | ❌ | ✅ |
Key insight: pmat is for static analysis BEFORE tests run. OIP is for fault analysis AFTER tests fail.
Command Reference
oip [COMMAND] [OPTIONS]
COMMANDS:
analyze Analyze GitHub organization
summarize Summarize analysis report
review-pr Review PR with context
extract-training-data Extract training data from git
train-classifier Train ML classifier
export Export features
localize SBFL fault localization
LOCALIZE OPTIONS:
--passed-coverage <PATH> LCOV from passing tests
--failed-coverage <PATH> LCOV from failing tests
--formula <FORMULA> tarantula, ochiai, dstar2, dstar3
--top-n <N> Top suspicious lines
--rag Enable RAG enhancement
--ensemble Use ensemble model
--calibrated Calibrated predictions
Version
Current version: 0.3.1
Next Steps
- PMAT: Static Analysis: Pre-test quality checks
- Probar: Runtime Testing: Test execution and coverage
- Phase 4: Validation: Batuta’s validation workflow
Navigate: Table of Contents
Probar: Runtime Testing
“Probar (Spanish: ‘to test/prove’) is a Rust-native testing framework for WASM games and web applications.”
Overview
Probar provides comprehensive runtime testing capabilities:
- Browser Automation: Chrome DevTools Protocol (CDP)
- Visual Regression: Perceptual image diffing
- WASM Coverage: Block-level coverage instrumentation
- TUI Testing: Presentar YAML falsification
- Pixel Coverage: Heatmap visualization
- Fault Localization: Tarantula SBFL (basic)
Installation
# Cargo.toml
[dev-dependencies]
jugar-probar = "0.2"
# The crate is published as jugar-probar on crates.io
# (the name "probar" was taken)
Key Features
Browser Automation
Control browsers via CDP:
#![allow(unused)]
fn main() {
use jugar_probar::{Browser, BrowserConfig, Page};
#[tokio::test]
async fn test_login() -> Result<(), Box<dyn std::error::Error>> {
let browser = Browser::launch(BrowserConfig::default()).await?;
let page = browser.new_page().await?;
page.goto("https://example.com/login").await?;
page.fill("#username", "testuser").await?;
page.fill("#password", "secret").await?;
page.click("#submit").await?;
assert!(page.wait_for_selector(".dashboard").await.is_ok());
Ok(())
}
}
Visual Regression Testing
Compare screenshots with perceptual diffing:
#![allow(unused)]
fn main() {
use jugar_probar::{VisualRegressionTester, VisualRegressionConfig, MaskRegion};
let tester = VisualRegressionTester::new(
VisualRegressionConfig::default()
.with_threshold(0.02) // 2% pixel difference allowed
.with_color_threshold(10) // Per-channel tolerance
);
// Add masks for dynamic content
let comparison = ScreenshotComparison::new()
.with_mask(MaskRegion::new(0, 0, 100, 50)) // Header
.with_mask(MaskRegion::new(0, 500, 800, 100)); // Footer
let result = tester.compare_images(&baseline, &current)?;
assert!(result.matches, "Visual regression: {}% diff", result.diff_percentage);
}
TUI Testing (Presentar)
Test terminal UIs with falsification protocol:
#![allow(unused)]
fn main() {
use jugar_probar::{
TerminalSnapshot, TerminalAssertion,
PresentarConfig, validate_presentar_config
};
// Load presentar YAML config
let config = PresentarConfig::default();
let result = validate_presentar_config(&config);
assert!(result.is_ok());
// Test terminal output
let snapshot = TerminalSnapshot::from_string(
"CPU 45% ████████░░░░░░░░ 4 cores\n\
MEM 60% ██████████░░░░░░ 8GB/16GB",
80, 24
);
let assertions = [
TerminalAssertion::Contains("CPU".into()),
TerminalAssertion::NotContains("ERROR".into()),
TerminalAssertion::CharAt { x: 0, y: 0, expected: 'C' },
];
for assertion in &assertions {
assertion.check(&snapshot)?;
}
}
Pixel Coverage Heatmaps
Visualize UI coverage:
#![allow(unused)]
fn main() {
use jugar_probar::pixel_coverage::{PixelCoverageTracker, HeatmapConfig};
let mut tracker = PixelCoverageTracker::new(800, 600);
// Record pixel interactions during tests
tracker.record_click(100, 200);
tracker.record_hover(150, 250);
// Generate heatmap
let heatmap = tracker.generate_heatmap(HeatmapConfig::viridis());
heatmap.save_png("coverage_heatmap.png")?;
}
WASM Coverage
Block-level coverage for WASM modules:
#![allow(unused)]
fn main() {
use jugar_probar::coverage::{CoverageCollector, CoverageConfig, Granularity};
let collector = CoverageCollector::new(
CoverageConfig::default()
.with_granularity(Granularity::Block)
);
// Execute WASM with coverage
let report = collector.execute_with_coverage(wasm_module)?;
println!("Coverage: {:.1}%", report.summary().line_coverage * 100.0);
}
Feature Flags
| Feature | Description |
|---|---|
| browser | Enable CDP browser control (chromiumoxide, tokio) |
| runtime | Enable WASM runtime (wasmtime) |
| derive | Enable derive macros for type-safe selectors |
[dev-dependencies]
jugar-probar = { version = "0.2", features = ["browser", "runtime"] }
Brick Architecture
Probar’s unique Brick Architecture where tests ARE the interface:
#![allow(unused)]
fn main() {
use jugar_probar::brick::{Brick, BrickAssertion, BrickBudget};
struct StatusBrick {
message: String,
is_visible: bool,
}
impl Brick for StatusBrick {
fn brick_name(&self) -> &'static str {
"StatusBrick"
}
fn assertions(&self) -> &[BrickAssertion] {
&[
BrickAssertion::TextVisible,
BrickAssertion::ContrastRatio(4.5), // WCAG AA
]
}
fn budget(&self) -> BrickBudget {
BrickBudget::uniform(50) // 50ms render budget
}
fn verify(&self) -> BrickVerification {
// Check each assertion against the rendered output...
todo!("verify assertions within the render budget")
}
}
}
Comparison with Other Tools
| Capability | probar | pmat | oip |
|---|---|---|---|
| Browser Automation | ✅ | ❌ | ❌ |
| Visual Regression | ✅ | ❌ | ❌ |
| WASM Coverage | ✅ | ❌ | ❌ |
| TUI Testing | ✅ | ❌ | ❌ |
| SATD Detection | ❌ | ✅ | ❌ |
| TDG Scoring | ❌ | ✅ | ❌ |
| Defect ML | ❌ | ❌ | ✅ |
Key insight: probar executes tests and measures runtime behavior. pmat analyzes static code. oip analyzes test results.
Toyota Way Principles
Probar applies Toyota Way principles:
| Principle | Implementation |
|---|---|
| Poka-Yoke | Type-safe selectors prevent stringly-typed errors |
| Muda | Zero-copy memory views eliminate serialization |
| Jidoka | Soft Jidoka (LogAndContinue vs Stop) |
| Heijunka | Superblock tiling for amortized scheduling |
Quality Standards
- 95% minimum test coverage
- Zero tolerance for panic paths (deny(unwrap_used, expect_used))
- ZERO JavaScript - pure Rust compiling to .wasm
Version
Current version: 0.2.x (crates.io: jugar-probar)
Next Steps
- PMAT: Static Analysis: Pre-test quality checks
- OIP: Defect Intelligence: Post-test fault analysis
- Phase 4: Validation: Batuta’s validation workflow
Navigate: Table of Contents
Renacer: Syscall Tracing
“See what your code really does. Every syscall, every allocation, every I/O.”
Renacer is a pure Rust system call tracer with source-aware correlation. It captures what your binary actually does at the kernel level, enabling golden trace comparison and performance regression detection.
Overview
| Attribute | Value |
|---|---|
| Version | 0.6.5 |
| Layer | L5: Quality & Profiling |
| Type | Syscall Tracer |
| Repository | github.com/paiml/renacer |
Why Renacer?
The Observability Gap
Traditional profiling shows you:
- CPU time per function
- Memory allocations
- Call stacks
But misses:
- Actual I/O operations
- System call patterns
- Kernel-level behavior
- Resource contention
Renacer Fills the Gap
Your Code → Syscalls → Kernel → Hardware
↑
Renacer captures here
Capabilities
syscall_trace
Trace all system calls made by a binary:
# Basic tracing
$ renacer -- ./target/release/myapp
# Output
read(3, "config...", 4096) = 156
openat(AT_FDCWD, "data.csv", O_RDONLY) = 4
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, ...) = 0x7f...
write(1, "Processing...", 13) = 13
flamegraph
Generate flamegraphs from syscall traces:
# Generate flamegraph
$ renacer --flamegraph -- ./target/release/myapp
📊 Flamegraph saved to: flamegraph.svg
# With filtering
$ renacer --flamegraph --filter "write|read" -- ./myapp
golden_trace_comparison
Compare traces for semantic equivalence:
# Capture baseline
$ renacer --format json -- ./baseline > golden.json
# Compare new version
$ renacer --format json -- ./new_version > current.json
$ renacer compare golden.json current.json
Comparison Results:
Syscall count: 1,234 → 1,456 (+18%)
Write operations: 45 → 42 (-7%)
Memory allocations: 23 → 89 (+287%) ⚠️
REGRESSION DETECTED: Memory allocations increased significantly
Output Formats
Summary Statistics
$ renacer --summary -- ./myapp
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
58.67 0.000748 6 113 write
9.57 0.000122 9 13 mmap
4.63 0.000059 9 6 mprotect
2.51 0.000032 6 5 rt_sigaction
------ ----------- ----------- --------- --------- ----------------
100.00 0.001275 7 178 2 total
JSON Format
$ renacer --format json -- ./myapp
{
"version": "0.6.5",
"binary": "./myapp",
"syscalls": [
{
"name": "openat",
"args": ["AT_FDCWD", "config.toml", "O_RDONLY"],
"result": 3,
"duration_ns": 1234
},
{
"name": "read",
"args": ["3", "...", "4096"],
"result": 256,
"duration_ns": 456
}
],
"summary": {
"total_syscalls": 178,
"total_duration_ns": 1275000,
"by_type": {
"write": 113,
"mmap": 13,
"read": 12
}
}
}
Source-Aware Tracing
$ renacer -s -- ./myapp
# Output includes source locations
src/main.rs:42 openat("config.toml") = 3
src/config.rs:15 read(3, ..., 4096) = 256
src/process.rs:89 mmap(NULL, 1MB) = 0x7f...
Integration with Batuta
Performance Validation
Configure performance assertions in renacer.toml:
# renacer.toml
[[assertion]]
name = "orchestration_latency"
type = "critical_path"
max_duration_ms = 5000
fail_on_violation = true
[[assertion]]
name = "max_syscall_budget"
type = "span_count"
max_spans = 10000
fail_on_violation = true
[[assertion]]
name = "memory_allocation_budget"
type = "memory_usage"
max_bytes = 1073741824 # 1GB
fail_on_violation = true
Golden Trace Workflow
# 1. Capture golden traces for examples
$ ./scripts/capture_golden_traces.sh
# 2. Run validation in CI
$ cargo test --test golden_trace_validation
# 3. Compare on changes
$ renacer compare golden_traces/baseline.json new_trace.json
Integration with Certeza
Renacer integrates with certeza for comprehensive quality validation:
#![allow(unused)]
fn main() {
// In tests
#[test]
fn test_performance_budget() {
let trace = renacer::trace("./target/release/myapp")?;
// Assert syscall budget
assert!(trace.total_syscalls() < 1000);
// Assert no unexpected file access
assert!(!trace.has_syscall("openat", "/etc/passwd"));
// Assert memory budget
assert!(trace.total_memory_allocated() < 100 * 1024 * 1024);
}
}
Anti-Pattern Detection
Renacer can detect common performance anti-patterns:
Tight Loop Detection
[[assertion]]
name = "detect_tight_loop"
type = "anti_pattern"
pattern = "TightLoop"
threshold = 0.7
fail_on_violation = true
Detects:
⚠️ Tight loop detected at src/process.rs:145
10,000 iterations without I/O
Consider: batch processing, yielding
God Process Detection
[[assertion]]
name = "prevent_god_process"
type = "anti_pattern"
pattern = "GodProcess"
threshold = 0.8
fail_on_violation = false # Warning only
Detects:
⚠️ God process pattern at src/main.rs
Single process handling 95% of work
Consider: delegation to worker processes
CLI Reference
# Basic tracing
renacer -- ./binary [args...]
# Summary statistics
renacer --summary -- ./binary
# Timing information
renacer --timing -- ./binary
# JSON output
renacer --format json -- ./binary
# Source correlation
renacer -s -- ./binary
# Flamegraph generation
renacer --flamegraph -- ./binary
# Compare traces
renacer compare baseline.json current.json
# Filter syscalls
renacer --filter "read|write" -- ./binary
# Assertions
renacer --config renacer.toml -- ./binary
Example: CI Integration
# .github/workflows/ci.yml
- name: Capture syscall trace
run: |
renacer --format json -- ./target/release/myapp > trace.json
- name: Compare with golden trace
run: |
renacer compare golden_traces/baseline.json trace.json
- name: Check performance assertions
run: |
renacer --config renacer.toml -- ./target/release/myapp
Key Takeaways
- Full visibility: See every syscall your code makes
- Golden traces: Detect regressions automatically
- Source correlation: Link syscalls to code locations
- Anti-patterns: Detect performance issues early
- CI integration: Automated performance validation
MCP Tooling
The Model Context Protocol (MCP) is an open standard for connecting AI assistants to external tools and data sources. The PAIML stack provides first-class MCP support through two complementary crates:
| Crate | Version | Purpose |
|---|---|---|
| pmcp | v1.8.6 | Low-level Rust SDK for building MCP servers and clients |
| pforge | v0.1.4 | High-level declarative framework for MCP servers |
Why MCP?
MCP enables AI assistants (like Claude) to:
- Execute tools and functions
- Access external data sources
- Integrate with APIs and services
- Maintain stateful sessions
┌─────────────────┐ MCP Protocol ┌─────────────────┐
│ AI Assistant │ ◄─────────────────► │ MCP Server │
│ (Claude) │ │ (Your Tools) │
└─────────────────┘ └─────────────────┘
Stack Integration
MCP tooling integrates with the broader PAIML ecosystem:
┌─────────────────────────────────────────────────────────┐
│ MCP Server (pforge) │
├─────────────────────────────────────────────────────────┤
│ Tool: train_model │ Tool: query_data │
│ → Entrenar │ → Trueno-DB │
├───────────────────────┼─────────────────────────────────┤
│ Tool: run_inference │ Tool: visualize │
│ → Realizar │ → Trueno-Viz │
└─────────────────────────────────────────────────────────┘
Quick Start
Option 1: pforge (Recommended)
For most use cases, pforge provides the fastest path to a working MCP server:
# Install pforge CLI
cargo install pforge-cli
# Create new server
pforge new my-ml-server
cd my-ml-server
# Run server
pforge serve
Option 2: pmcp (Low-Level)
For custom implementations or advanced use cases:
use pmcp::{Server, Tool, ToolHandler};
#[tokio::main]
async fn main() {
let server = Server::new("my-server")
.with_tool(MyTool::new())
.build();
server.serve_stdio().await.unwrap();
}
Use Cases
| Use Case | Recommended Approach |
|---|---|
| Simple tool server | pforge with YAML config |
| Complex business logic | pforge with native handlers |
| Custom protocol needs | pmcp directly |
| Embedded in larger app | pmcp as library |
Next Steps
- pmcp: Rust MCP SDK - Deep dive into the SDK
- pforge: Declarative Framework - YAML-based server development
pmcp: Rust MCP SDK
pmcp (v1.8.6) is a high-quality Rust SDK for the Model Context Protocol with full TypeScript SDK compatibility.
Installation
[dependencies]
pmcp = "1.8"
Features
| Feature | Description |
|---|---|
| Full MCP compliance | Compatible with TypeScript SDK |
| Async-first | Built on Tokio for high performance |
| Type-safe | Rust’s type system prevents runtime errors |
| Transport agnostic | stdio, HTTP, WebSocket support |
| Schema generation | Automatic JSON Schema via schemars |
Architecture
┌─────────────────────────────────────────────────────────┐
│ pmcp SDK │
├─────────────────────────────────────────────────────────┤
│ Server │ Client │ Transport │
│ - Tool registry │ - Tool calling │ - Stdio │
│ - Resource mgmt │ - Resource read │ - HTTP/SSE │
│ - Prompt system │ - Prompt list │ - WebSocket │
└─────────────────────────────────────────────────────────┘
Basic Server
use pmcp::{Server, ServerBuilder};
use pmcp::tool::{Tool, ToolBuilder, ToolHandler};
use async_trait::async_trait;
struct GreetTool;
#[async_trait]
impl ToolHandler for GreetTool {
async fn call(&self, args: serde_json::Value) -> pmcp::Result<serde_json::Value> {
let name = args["name"].as_str().unwrap_or("World");
Ok(serde_json::json!({
"greeting": format!("Hello, {}!", name)
}))
}
}
#[tokio::main]
async fn main() -> pmcp::Result<()> {
let server = ServerBuilder::new("greeting-server")
.version("1.0.0")
.tool(
ToolBuilder::new("greet")
.description("Greet someone by name")
.param("name", "string", "Name to greet", true)
.handler(GreetTool)
.build()
)
.build();
server.serve_stdio().await
}
Tool Definition
Tools are the primary way to expose functionality:
#![allow(unused)]
fn main() {
use pmcp::tool::{ToolBuilder, ToolSchema};
let tool = ToolBuilder::new("analyze_code")
.description("Analyze source code for issues")
.param("code", "string", "Source code to analyze", true)
.param("language", "string", "Programming language", false)
.param("strict", "boolean", "Enable strict mode", false)
.handler(AnalyzeHandler)
.build();
}
Resources
Resources provide read-only data access:
#![allow(unused)]
fn main() {
use pmcp::resource::{Resource, ResourceBuilder};
let resource = ResourceBuilder::new("file://config.yaml")
.name("Configuration")
.description("Application configuration")
.mime_type("application/yaml")
.handler(ConfigResourceHandler)
.build();
}
Prompts
Prompts are reusable message templates:
#![allow(unused)]
fn main() {
use pmcp::prompt::{Prompt, PromptBuilder};
let prompt = PromptBuilder::new("code_review")
.description("Review code for best practices")
.argument("code", "Code to review", true)
.argument("focus", "Area to focus on", false)
.build();
}
Transport Options
Stdio (Default)
#![allow(unused)]
fn main() {
server.serve_stdio().await?;
}
HTTP with SSE
#![allow(unused)]
fn main() {
server.serve_http("127.0.0.1:8080").await?;
}
WebSocket
#![allow(unused)]
fn main() {
server.serve_websocket("127.0.0.1:8081").await?;
}
Integration with PAIML Stack
Entrenar Integration
#![allow(unused)]
fn main() {
use pmcp::tool::ToolHandler;
use entrenar::train::Trainer;
struct TrainModelTool {
trainer: Trainer,
}
#[async_trait]
impl ToolHandler for TrainModelTool {
async fn call(&self, args: serde_json::Value) -> pmcp::Result<serde_json::Value> {
let config_path = args["config"].as_str().unwrap();
// Load YAML config and train
let metrics = self.trainer.train_from_yaml(config_path)?;
Ok(serde_json::to_value(metrics)?)
}
}
}
Realizar Integration
#![allow(unused)]
fn main() {
use realizar::inference::InferenceEngine;
struct InferenceTool {
engine: InferenceEngine,
}
#[async_trait]
impl ToolHandler for InferenceTool {
async fn call(&self, args: serde_json::Value) -> pmcp::Result<serde_json::Value> {
let prompt = args["prompt"].as_str().unwrap();
let response = self.engine.generate(prompt).await?;
Ok(serde_json::json!({ "response": response }))
}
}
}
Error Handling
#![allow(unused)]
fn main() {
use pmcp::{Error, ErrorCode};
// Return structured errors
Err(Error::new(
ErrorCode::InvalidParams,
"Missing required parameter: name"
))
}
Testing
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
use super::*;
use pmcp::testing::MockClient;
use serde_json::json;
#[tokio::test]
async fn test_greet_tool() {
// `server` is constructed as in the Basic Server example above
let client = MockClient::new(server);
let result = client.call_tool("greet", json!({"name": "Alice"})).await;
assert_eq!(result["greeting"], "Hello, Alice!");
}
}
}
Best Practices
- Use descriptive tool names - analyze_python_code, not analyze
- Document all parameters - Include description and required flag
- Return structured JSON - Not raw strings
- Handle errors gracefully - Use proper error codes
- Keep tools focused - One tool, one purpose
See Also
- pforge - Declarative framework built on pmcp
- MCP Specification - Official protocol docs
pforge: Declarative MCP Framework
pforge (v0.1.4) is a zero-boilerplate framework for building MCP servers using YAML configuration.
Installation
cargo install pforge-cli
Quick Start
# Create new project
pforge new my-server
cd my-server
# Project structure:
# my-server/
# ├── pforge.yaml # Server configuration
# ├── src/
# │ └── handlers/ # Native Rust handlers
# └── Cargo.toml
# Run the server
pforge serve
Configuration (pforge.yaml)
forge:
name: ml-tools-server
version: 0.1.0
transport: stdio
description: "ML tools for model training and inference"
tools:
# Native Rust handler
- type: native
name: train_model
description: "Train a model using YAML configuration"
handler:
path: handlers::train_model
params:
config_path:
type: string
required: true
description: "Path to training YAML config"
epochs:
type: integer
required: false
description: "Override number of epochs"
# CLI handler - execute shell commands
- type: cli
name: list_models
description: "List available models"
command: "ls -la models/"
# HTTP proxy handler
- type: http
name: huggingface_search
description: "Search HuggingFace Hub"
endpoint: "https://huggingface.co/api/models"
method: GET
params:
search:
type: string
required: true
# Pipeline handler - chain tools
- type: pipeline
name: train_and_export
description: "Train model and export to GGUF"
steps:
- tool: train_model
params:
config_path: "{{config}}"
- tool: export_gguf
params:
model_path: "{{previous.model_path}}"
Handler Types
Native Handlers
Full Rust implementation with type safety:
#![allow(unused)]
fn main() {
// src/handlers/mod.rs
use pforge_runtime::prelude::*;
pub async fn train_model(args: ToolArgs) -> ToolResult {
let config_path = args.get_string("config_path")?;
let epochs = args.get_optional_int("epochs");
// Your training logic here
let metrics = run_training(config_path, epochs).await?;
Ok(json!({
"status": "completed",
"metrics": metrics
}))
}
}
CLI Handlers
Execute shell commands:
tools:
- type: cli
name: run_benchmark
description: "Run performance benchmark"
command: "cargo bench --bench inference"
timeout_ms: 60000
working_dir: "./benchmarks"
HTTP Handlers
Proxy external APIs:
tools:
- type: http
name: fetch_model_info
description: "Get model info from registry"
endpoint: "https://api.example.com/models/{{model_id}}"
method: GET
headers:
Authorization: "Bearer {{env.API_TOKEN}}"
Pipeline Handlers
Chain multiple tools:
tools:
- type: pipeline
name: full_workflow
description: "Complete ML workflow"
steps:
- tool: validate_data
params:
path: "{{data_path}}"
- tool: train_model
params:
data: "{{previous.validated_path}}"
- tool: evaluate_model
params:
model: "{{previous.model_path}}"
Resources
Define read-only data sources:
resources:
- uri: "file://config/default.yaml"
name: "Default Configuration"
description: "Default training configuration"
mime_type: "application/yaml"
- uri: "db://experiments"
name: "Experiment History"
description: "Past experiment results"
handler:
path: handlers::get_experiments
Prompts
Reusable prompt templates:
prompts:
- name: code_review
description: "Review code for ML best practices"
arguments:
- name: code
description: "Code to review"
required: true
- name: focus
description: "Specific area to focus on"
required: false
template: |
Review this ML code for best practices:
```{{language}}
{{code}}
```
{{#if focus}}Focus on: {{focus}}{{/if}}
Environment Variables
Reference environment variables:
forge:
name: secure-server
tools:
- type: http
name: api_call
endpoint: "{{env.API_ENDPOINT}}"
headers:
Authorization: "Bearer {{env.API_KEY}}"
CLI Commands
# Create new project
pforge new <name>
# Serve MCP server
pforge serve [--port 8080] [--transport stdio|http|ws]
# Validate configuration
pforge validate
# Generate Rust code (without running)
pforge codegen
# List defined tools
pforge list tools
# Test a specific tool
pforge test <tool_name> --args '{"param": "value"}'
Integration Examples
Entrenar Training Server
forge:
name: entrenar-mcp
version: 0.1.0
tools:
- type: native
name: train
description: "Train model from YAML config"
handler:
path: handlers::entrenar_train
params:
config: { type: string, required: true }
- type: native
name: quantize
description: "Quantize model to 4-bit"
handler:
path: handlers::entrenar_quantize
params:
model_path: { type: string, required: true }
bits: { type: integer, required: false, default: 4 }
Realizar Inference Server
forge:
name: realizar-mcp
version: 0.1.0
tools:
- type: native
name: generate
description: "Generate text with LLM"
handler:
path: handlers::realizar_generate
params:
prompt: { type: string, required: true }
max_tokens: { type: integer, required: false, default: 256 }
temperature: { type: number, required: false, default: 0.7 }
Trueno-DB Query Server
forge:
name: trueno-db-mcp
version: 0.1.0
tools:
- type: native
name: query
description: "Execute SQL query"
handler:
path: handlers::trueno_query
params:
sql: { type: string, required: true }
- type: native
name: vector_search
description: "Semantic vector search"
handler:
path: handlers::trueno_vector_search
params:
query: { type: string, required: true }
top_k: { type: integer, required: false, default: 10 }
MCP Registry
pforge servers can be published to the MCP Registry:
# Publish to registry
pforge publish
# Registry entry
# Name: io.github.paiml/my-server
# Install: cargo install my-server-mcp
Best Practices
- Keep tools atomic - One tool, one responsibility
- Use pipelines for workflows - Chain atomic tools
- Validate inputs - Use JSON Schema constraints
- Document thoroughly - Good descriptions help AI assistants
- Use native handlers for complex logic - CLI/HTTP for simple cases
- Test with pforge test - Validate before deployment
See Also
- pmcp - Low-level SDK that pforge builds on
- pforge GitHub - Source and examples
- MCP Registry - Published servers
Visualization & Apps
The Sovereign AI Stack includes a complete visualization and application layer built on GPU-accelerated primitives. This eliminates the need for Python-based tools like Streamlit, Gradio, or Panel.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Presentar (App Framework) │
│ - YAML-driven configuration │
│ - Auto-display for .apr/.ald files │
│ - Quality scoring (F-A grade) │
├─────────────────────────────────────────────────────────────────┤
│ Trueno-Viz (GPU Rendering) v0.1.1 │
│ - WGSL shaders for paths, fills, text │
│ - WebGPU + WASM targets │
│ - 60fps rendering pipeline │
├─────────────────────────────────────────────────────────────────┤
│ Trueno (Compute Foundation) v0.7.3 │
│ - SIMD vectorization │
│ - GPU compute dispatch │
│ - Backend: CPU/WASM/WebGPU │
└─────────────────────────────────────────────────────────────────┘
Components
| Component | Version | Purpose |
|---|---|---|
| Trueno-Viz | 0.1.1 | GPU rendering primitives (paths, fills, text, charts) |
| Presentar | 0.1.0 | YAML-driven app framework with auto-display |
Design Principles
Following the Toyota Way:
- Muda (Waste Elimination): No Python GIL, no runtime interpretation, no server round-trips
- Jidoka (Built-in Quality): Compile-time type safety, deterministic rendering
- Poka-yoke (Mistake Proofing): Schema validation at load time, not runtime
80/20 Rule
The visualization layer follows the stack’s 80/20 principle:
- 80% Pure Stack: All rendering via Trueno-Viz GPU primitives (WGSL shaders)
- 20% Minimal External:
  - winit for cross-platform windowing (WASM lacks native window APIs)
  - fontdue for font rasterization (platform-specific font hinting)
Use Cases
- Model Dashboards: Display Aprender model performance metrics
- Data Exploration: Interactive views of Alimentar datasets
- Inference UIs: Real-time prediction interfaces
- Quality Reports: TDG score visualization
Further Reading
- Trueno-Viz: GPU Rendering - Low-level rendering primitives
- Presentar: App Framework - High-level application framework
Trueno-Viz: GPU Rendering Primitives
Version: 0.1.1 | Crate: trueno-viz
Trueno-Viz provides GPU-accelerated 2D rendering primitives built on Trueno’s compute foundation. It serves as the rendering backend for Presentar and any visualization needs in the Sovereign AI Stack.
Position in Stack
Presentar (Apps)
│
▼
Trueno-Viz (Rendering) ← YOU ARE HERE
│
▼
Trueno (Compute)
Core Abstractions
Canvas
The primary drawing surface:
#![allow(unused)]
fn main() {
pub struct Canvas<'gpu> {
context: &'gpu GpuContext,
commands: Vec<DrawCommand>,
viewport: Viewport,
}
impl Canvas<'_> {
pub fn clear(&mut self, color: Color);
pub fn draw(&mut self, cmd: DrawCommand);
pub fn present(&mut self);
}
}
Draw Commands
All rendering reduces to these primitives:
#![allow(unused)]
fn main() {
pub enum DrawCommand {
// Geometry
Path { points: Vec<Point>, closed: bool, style: StrokeStyle },
Fill { path: PathRef, color: Color, rule: FillRule },
Rect { bounds: Rect, radius: CornerRadius, style: BoxStyle },
Circle { center: Point, radius: f32, style: BoxStyle },
// Text (fontdue rasterization, GPU compositing)
Text { content: String, position: Point, style: TextStyle },
// Images (Trueno tensor → GPU texture)
Image { tensor: TensorRef, bounds: Rect, sampling: Sampling },
// Compositing
Group { children: Vec<DrawCommand>, transform: Transform2D },
Clip { bounds: Rect, child: Box<DrawCommand> },
Opacity { alpha: f32, child: Box<DrawCommand> },
}
}
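To make the flow concrete, here is a sketch of a frame built from these commands. It assumes the Canvas and DrawCommand types above; the Color, Rect, CornerRadius, BoxStyle, Point, and TextStyle constructors are illustrative placeholders, not the exact trueno-viz API:
// Illustrative sketch: composes the Canvas and DrawCommand types defined
// above. Constructor names (Color::WHITE, Rect::new, etc.) are assumed.
fn draw_frame(canvas: &mut Canvas<'_>) {
    canvas.clear(Color::WHITE);
    // A filled rounded rectangle
    canvas.draw(DrawCommand::Rect {
        bounds: Rect::new(10.0, 10.0, 200.0, 120.0),
        radius: CornerRadius::uniform(4.0),
        style: BoxStyle::filled(Color::BLUE),
    });
    // A text label composited on top
    canvas.draw(DrawCommand::Text {
        content: "Hello, Trueno-Viz".into(),
        position: Point::new(20.0, 60.0),
        style: TextStyle::default(),
    });
    canvas.present();
}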
WGSL Shader Pipeline
Trueno-Viz uses WebGPU Shading Language for GPU rendering:
// Fill shader
@vertex fn vs_fill(in: VertexInput) -> VertexOutput {
var out: VertexOutput;
out.position = vec4<f32>(in.position, 0.0, 1.0);
out.color = in.color;
return out;
}
@fragment fn fs_fill(in: VertexOutput) -> @location(0) vec4<f32> {
return in.color;
}
Anti-Aliasing Strategy
| Technique | Use Case | Implementation |
|---|---|---|
| Hardware MSAA | Solid fills | 4x MSAA via WebGPU |
| SDF | Text, icons | Shader-based, resolution-independent |
| Analytical AA | Lines, curves | Edge distance in fragment shader |
// Analytical AA for lines
@fragment fn fs_line(in: LineVertexOutput) -> @location(0) vec4<f32> {
let dist = abs(in.edge_distance);
let alpha = 1.0 - smoothstep(in.line_width - 1.0, in.line_width, dist);
return vec4<f32>(in.color.rgb, in.color.a * alpha);
}
Chart Primitives
Built on the Grammar of Graphics (Wilkinson, 2005):
#![allow(unused)]
fn main() {
pub enum ChartType {
Line { series: Vec<Series>, interpolation: Interpolation },
Bar { series: Vec<Series>, orientation: Orientation },
Scatter { series: Vec<Series>, size_encoding: Option<String> },
Heatmap { matrix: TensorRef, color_scale: ColorScale },
Histogram { data: TensorRef, bins: BinStrategy },
}
impl ChartType {
pub fn to_commands(&self, bounds: Rect, theme: &Theme) -> Vec<DrawCommand>;
}
}
Color System
Perceptually uniform color operations:
#![allow(unused)]
fn main() {
impl Color {
/// CIELAB color space (Levkowitz & Herman, 1992)
pub fn to_lab(&self) -> LabColor;
/// WCAG 2.1 contrast ratio
pub fn contrast_ratio(&self, other: &Color) -> f32 {
let l1 = self.relative_luminance();
let l2 = other.relative_luminance();
(l1.max(l2) + 0.05) / (l1.min(l2) + 0.05)
}
}
}
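For example, black text (relative luminance 0.0) on a white background (1.0) yields (1.0 + 0.05) / (0.0 + 0.05) = 21:1, the maximum possible ratio; WCAG AA requires at least 4.5:1 for normal text.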
Performance Targets
| Operation | Target | Backend |
|---|---|---|
| Path tessellation (1K points) | <1ms | Trueno SIMD |
| Fill rendering (10K triangles) | <2ms | WebGPU |
| Text layout (1K glyphs) | <5ms | fontdue + GPU |
| Chart update (100K points) | <16ms | Full pipeline |
Backend Support
| Backend | Status | Notes |
|---|---|---|
| WebGPU (native) | Stable | Primary target |
| WebGPU (WASM) | Stable | Browser deployment |
| WGPU fallback | Stable | Vulkan/Metal/DX12 |
Integration with Trueno
Trueno-Viz leverages Trueno for:
- Tensor → Texture: Direct GPU upload for image data
- SIMD tessellation: Path point processing
- Color math: LAB/sRGB conversions
#![allow(unused)]
fn main() {
// Load tensor as GPU texture
let tensor: Tensor<f32> = trueno::load("image.bin")?;
let texture = canvas.upload_tensor(&tensor)?;
canvas.draw(DrawCommand::Image {
tensor: texture,
bounds: Rect::new(0.0, 0.0, 256.0, 256.0),
sampling: Sampling::Linear,
});
}
Recent Changes (v0.1.1)
- WebGPU compute physics demo
- WASM target support
- Comprehensive benchmark suite
Presentar: Sovereign AI Visualization & App Framework
Version: 0.1.0 | Status: Specification Complete
Presentar is a PURE WASM visualization and rapid application framework built entirely on Sovereign AI Stack primitives. It replaces Streamlit, Gradio, and Panel with 60fps GPU-accelerated rendering, compile-time type safety, and deterministic reproducibility.
Position in the Stack
┌─────────────────────────────────────────────────────────────────┐
│ Presentar (Visualization & Apps) ← YOU ARE HERE │
├─────────────────────────────────────────────────────────────────┤
│ Trueno-Viz (GPU Rendering Primitives) │
├─────────────────────────────────────────────────────────────────┤
│ Trueno (SIMD/GPU Compute) v0.7.3 │
├─────────────────────────────────────────────────────────────────┤
│ Aprender (ML) | Realizar (Inference) | Alimentar (Data) │
└─────────────────────────────────────────────────────────────────┘
Core Principles
| Principle | Implementation |
|---|---|
| 80% Pure Stack | All rendering via trueno-viz GPU primitives |
| 20% Minimal External | Only winit (windowing) + fontdue (fonts) |
| WASM-First | Browser deployment without server dependencies |
| YAML-Driven | Declarative app configuration |
| Graded Quality | Every app receives F-A score via TDG metrics |
Auto-Display: Convention Over Configuration
Presentar auto-generates UIs from Sovereign AI Stack file formats:
| File Type | Generated UI |
|---|---|
| .apr (Aprender model) | ModelCard + inference panel |
| .ald (Alimentar dataset) | DataCard + DataTable |
| app.yaml | Custom layout from YAML |
| Mixed .apr/.ald | Split-view grid |
# Point at a directory, get an app
presentar --serve ./fraud-detector/
# Bundle for deployment
presentar --bundle ./fraud-detector/ -o app.wasm
YAML App Configuration
presentar: "0.1"
name: "fraud-detection-dashboard"
version: "1.0.0"
# Data sources (Alimentar .ald files)
data:
transactions:
source: "pacha://datasets/transactions:latest"
format: "ald"
refresh: "5m"
# Model references (Aprender .apr files)
models:
fraud_detector:
source: "pacha://models/fraud-detector:1.2.0"
format: "apr"
# Layout definition (12-column responsive grid)
layout:
type: "dashboard"
columns: 12
sections:
- id: "metrics"
span: [1, 4]
widgets:
- type: "metric"
label: "Fraud Rate"
value: "{{ data.predictions | filter(fraud=true) | percentage }}"
- id: "main-chart"
span: [5, 12]
widgets:
- type: "chart"
chart_type: "line"
data: "{{ data.transactions }}"
x: "timestamp"
y: "amount"
Quality Scoring
Every Presentar app receives a TDG score (0-100, F-A):
| Category | Weight | Metrics |
|---|---|---|
| Structural | 25 | Widget complexity, layout depth |
| Performance | 20 | Frame time, memory, bundle size |
| Accessibility | 20 | WCAG AA, keyboard nav, ARIA |
| Data Quality | 15 | Completeness, freshness, schema |
| Documentation | 10 | Manifest, model/data cards |
| Consistency | 10 | Theme adherence, naming |
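For illustration, assume the weights combine linearly (a simplification; the actual grading code may differ): an app scoring 90 structural, 80 performance, 70 accessibility, 100 data quality, 60 documentation, and 80 consistency earns 0.25·90 + 0.20·80 + 0.20·70 + 0.15·100 + 0.10·60 + 0.10·80 = 81.5 out of 100.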
Integration with Batuta Workflow
Presentar apps integrate with Batuta’s 5-phase workflow:
Phase 1: Analysis → presentar analyze app.yaml
Phase 2: Transpile → (N/A - pure Rust)
Phase 3: Optimize → presentar optimize --wasm-opt
Phase 4: Validate → presentar test (zero-dep harness)
Phase 5: Deploy → presentar --bundle → pacha publish
presentar-test: Zero-Dependency E2E Testing
Critical constraint: No playwright, selenium, npm, or C bindings.
#![allow(unused)]
fn main() {
use presentar_test::*;
#[presentar_test]
fn inference_flow() {
let mut h = Harness::new(include_bytes!("fixtures/app.tar"));
h.type_text("[data-testid='input-amount']", "1500")
.click("[data-testid='predict-btn']");
h.assert_text_contains("[data-testid='result']", "Fraud Score:");
}
#[presentar_test]
fn visual_regression() {
let mut h = Harness::new(include_bytes!("fixtures/app.tar"));
Snapshot::assert_match("app-default", h.screenshot("[data-testid='app-root']"), 0.001);
}
}
Determinism guarantees:
- Fixed DPI: 1.0
- Font antialiasing: Grayscale only
- Fixed viewport: 1280x720
- Embedded test font (Inter)
Trueno-Viz GPU Primitives
Presentar renders via Trueno-Viz draw commands:
#![allow(unused)]
fn main() {
pub enum DrawCommand {
Path { points: Vec<Point>, closed: bool, style: StrokeStyle },
Fill { path: PathRef, color: Color, rule: FillRule },
Rect { bounds: Rect, radius: CornerRadius, style: BoxStyle },
Text { content: String, position: Point, style: TextStyle },
Image { tensor: TensorRef, bounds: Rect, sampling: Sampling },
}
}
Anti-aliasing strategy:
- Hardware MSAA (4x) for fills
- Analytical AA for lines/curves
- SDF for text rendering
Pacha Registry Integration
# Fetch models and datasets from Pacha
models:
classifier:
source: "pacha://models/mnist-cnn:1.0.0"
data:
training:
source: "pacha://datasets/mnist:latest"
Lineage tracking follows W3C PROV-DM for full provenance.
Performance Targets
| Operation | Target | Backend |
|---|---|---|
| Path tessellation (1K points) | <1ms | Trueno SIMD |
| Fill rendering (10K triangles) | <2ms | WebGPU |
| Full frame (complex dashboard) | <16ms | 60fps |
| Bundle size | <500KB | WASM |
Ruchy Script Integration (Future)
Embedded scripting for dynamic behavior:
scripts:
on_load: |
let data = load_dataset("transactions")
let filtered = data.filter(|row| row.amount > 100)
set_state("filtered_data", filtered)
Security: Resource limits (1M instructions, 16MB memory, 10ms slice) prevent DoS.
Comparison with Alternatives
| Feature | Presentar | Streamlit | Gradio |
|---|---|---|---|
| Runtime | WASM (no server) | Python | Python |
| Performance | 60fps GPU | ~10fps | ~10fps |
| Type Safety | Compile-time | Runtime | Runtime |
| Bundle Size | <500KB | ~50MB | ~30MB |
| Testing | Zero-dep harness | Manual | Manual |
| Reproducibility | Deterministic | Non-deterministic | Non-deterministic |
presentar-terminal: Native TUI Backend
For terminal-based applications, presentar-terminal provides efficient character-cell rendering with the same Brick Architecture as the WASM stack.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ presentar-terminal (TUI) │
├─────────────────────────────────────────────────────────────────┤
│ CellBuffer + DiffRenderer (efficient updates) │
├─────────────────────────────────────────────────────────────────┤
│ crossterm 0.28 (terminal control) │
└─────────────────────────────────────────────────────────────────┘
Key Components
| Component | Purpose |
|---|---|
| CellBuffer | Character-cell buffer with RGBA colors |
| DiffRenderer | Efficient partial updates (only changed cells) |
| Modifiers | Text styling (bold, italic, underline) |
| Color | RGBA colors with transparency support |
Example Usage
#![allow(unused)]
fn main() {
use presentar_terminal::{CellBuffer, Color, DiffRenderer, Modifiers};
// Create buffer
let mut buffer = CellBuffer::new(80, 24);
// Write colored text
buffer.update(0, 0, "H", Color::GREEN, Color::TRANSPARENT, Modifiers::NONE);
buffer.update(1, 0, "i", Color::GREEN, Color::TRANSPARENT, Modifiers::NONE);
// Render to terminal with diff optimization
let mut renderer = DiffRenderer::new();
renderer.flush(&mut buffer, &mut std::io::stdout())?;
}
Widgets Available
- Table: Data tables with sorting and selection
- Gauge: Progress bars and meters
- Sparkline: Inline mini-charts
- ForceGraph: Force-directed network visualization
- Treemap: Hierarchical data visualization
- Heatmap: 2D density visualization
- BoxPlot/ViolinPlot: Statistical distributions
Stack Dashboards
Batuta uses presentar-terminal for its TUI dashboards:
# Stack health dashboard
cargo run --example stack_graph_tui --features native
# Oracle RAG dashboard
cargo run --example rag_oracle_demo --features native
Why Not ratatui?
presentar-terminal replaces ratatui for stack consistency:
| Feature | presentar-terminal | ratatui |
|---|---|---|
| Stack native | Yes | No |
| Diff rendering | Built-in | Manual |
| Color model | RGBA f32 | Limited |
| Brick Architecture | Yes | No |
| PROBAR-SPEC-009 | Compliant | N/A |
Academic Foundation
Key references (see full spec for 30+ citations):
- Czaplicki (2012): Elm Architecture
- Haas et al. (2017): WebAssembly performance model
- Mitchell et al. (2019): Model Cards
- Ohno (1988): Toyota Production System (Jidoka)
Stack Diagnostics & ML Insights
The Stack Diagnostics module provides ML-driven insights for monitoring PAIML stack health, implementing Toyota Way principles for observability.
Overview
┌─────────────────────────────────────────────────────────────────────────┐
│ SOVEREIGN AI STACK HEALTH DASHBOARD │
│ Timestamp: 2024-12-07 15:30:45 │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ANDON STATUS: 🟢 All systems healthy │
│ │
│ STACK SUMMARY │
│ Total Components: 24 │
│ Healthy: 22 (92%) │
│ Warnings: 2 (8%) │
│ Critical: 0 (0%) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Toyota Way Principles
The diagnostics system implements several Toyota Production System concepts:
| Principle | Implementation |
|---|---|
| Mieruka | ASCII dashboards make health visible at a glance |
| Jidoka | ML anomaly detection surfaces issues automatically |
| Genchi Genbutsu | Evidence-based diagnosis from actual dependency data |
| Andon | Red/Yellow/Green status with stop-the-line alerts |
| Yokoten | Cross-component insight sharing via knowledge graph |
Andon Status System
The Andon system provides visual health indicators:
#![allow(unused)]
fn main() {
use batuta::{HealthStatus, QualityGrade};
// Status from quality grade
let status = HealthStatus::from_grade(QualityGrade::A);
assert_eq!(status, HealthStatus::Green);
// Visual indicators
println!("{} Green - All systems healthy", HealthStatus::Green.icon());
println!("{} Yellow - Attention needed", HealthStatus::Yellow.icon());
println!("{} Red - Stop-the-line", HealthStatus::Red.icon());
}
Status Transitions
| Quality Grade | Health Status | Action |
|---|---|---|
| A+, A | 🟢 Green | Normal operation |
| A-, B+ | 🟡 Yellow | Attention needed |
| B, C, D, F | 🔴 Red | Stop-the-line |
Component Metrics
Each stack component tracks key quality metrics:
#![allow(unused)]
fn main() {
use batuta::{ComponentMetrics, ComponentNode, QualityStackLayer as StackLayer};
// Create component with metrics
let mut node = ComponentNode::new("trueno", "0.7.4", StackLayer::Compute);
node.metrics = ComponentMetrics {
demo_score: 95.5, // PMAT quality score
coverage: 92.0, // Test coverage %
mutation_score: 85.0, // Mutation testing kill rate
complexity_avg: 4.2, // Cyclomatic complexity
satd_count: 3, // Self-Admitted Technical Debt
dead_code_pct: 0.5, // Dead code percentage
grade: QualityGrade::APlus,
};
node.update_health();
}
Graph Analytics
The system computes graph-level metrics for dependency analysis:
PageRank
Identifies critical components based on dependency centrality:
#![allow(unused)]
fn main() {
use batuta::StackDiagnostics;
let mut diag = StackDiagnostics::new();
// Add components...
let metrics = diag.compute_metrics()?;
// Top components by PageRank
for (name, score) in metrics.top_by_pagerank(5) {
println!("{}: {:.3}", name, score);
}
}
Betweenness Centrality
Finds bottleneck components that many paths pass through:
#![allow(unused)]
fn main() {
// Find components with high betweenness (potential bottlenecks)
let bottlenecks = metrics.bottlenecks(0.5);
for name in bottlenecks {
println!("Bottleneck: {}", name);
}
}
Depth Analysis
Measures dependency chain depth from root nodes:
#![allow(unused)]
fn main() {
for (name, depth) in &metrics.depth_map {
println!("{} at depth {}", name, depth);
}
println!("Maximum depth: {}", metrics.max_depth);
}
ML Anomaly Detection
Isolation Forest
The Isolation Forest algorithm detects anomalies by measuring isolation:
#![allow(unused)]
fn main() {
use batuta::IsolationForest;
let mut forest = IsolationForest::new(100, 256, 42);
// Fit on component metrics
let data = vec![
vec![90.0, 85.0, 80.0, 5.0], // Normal
vec![88.0, 82.0, 78.0, 5.5], // Normal
vec![30.0, 20.0, 15.0, 25.0], // Anomaly!
];
forest.fit(&data);
// Score data points (higher = more anomalous)
let scores = forest.score(&data);
}
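In the standard formulation (Liu et al., 2008), the anomaly score is s(x, n) = 2^(−E(h(x))/c(n)), where E(h(x)) is the average path length needed to isolate x across the trees and c(n) normalizes by the expected path length for n samples. Scores approach 1 for points that isolate quickly (anomalies) and fall toward 0.5 or below for normal points.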
Detecting Anomalies in Stack
#![allow(unused)]
fn main() {
// Detect anomalies in component metrics
let anomalies = forest.detect_anomalies(&diagnostics, 0.5);
for anomaly in &anomalies {
println!("{}: {} (score: {:.3})",
anomaly.component,
anomaly.description,
anomaly.score
);
if let Some(rec) = &anomaly.recommendation {
println!(" Recommendation: {}", rec);
}
}
}
Anomaly Categories
| Category | Trigger | Example |
|---|---|---|
| QualityRegression | Demo score < 70 | "Score dropped from 90 to 65" |
| CoverageDrop | Coverage < 50% | "Coverage at 45% (target: 80%)" |
| ComplexityIncrease | Avg complexity > 15 | "Complexity grew to 18.5" |
| DependencyRisk | Dead code > 10% | "15% dead code detected" |
| BuildTimeSpike | Build time increase | "Build time +40%" |
Error Forecasting
Predict future error trends using exponential smoothing:
#![allow(unused)]
fn main() {
use batuta::ErrorForecaster;
let mut forecaster = ErrorForecaster::new(0.3);
// Add historical observations
forecaster.observe(5.0);
forecaster.observe(8.0);
forecaster.observe(12.0);
forecaster.observe(10.0);
// Forecast next 4 periods
let forecast = forecaster.forecast(4);
println!("Predicted errors: {:?}", forecast);
// Check accuracy metrics
let metrics = forecaster.error_metrics();
println!("MAE: {:.2}", metrics.mae);
println!("RMSE: {:.2}", metrics.rmse);
}
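The forecaster's core recurrence is simple exponential smoothing, s_t = α·x_t + (1 − α)·s_{t−1}. A minimal sketch of that recurrence (illustrating the math, not the ErrorForecaster internals):
// Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.
// With alpha = 0.3 and observations [5, 8, 12, 10], the level evolves
// 5.0 -> 5.9 -> 7.73 -> 8.41; the flat forecast repeats the last level.
fn smooth(observations: &[f64], alpha: f64) -> Option<f64> {
    let mut iter = observations.iter();
    let mut level = *iter.next()?;
    for &x in iter {
        level = alpha * x + (1.0 - alpha) * level;
    }
    Some(level)
}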
Dashboard Rendering
Generate ASCII dashboards for terminal display:
#![allow(unused)]
fn main() {
use batuta::{render_dashboard, StackDiagnostics};
let diag = StackDiagnostics::new();
// Add components and anomalies...
let output = render_dashboard(&diag);
println!("{}", output);
}
Running the Demo
cargo run --example stack_diagnostics_demo --features native
This demonstrates:
- Phase 1: Andon Status Board
- Phase 2: Component Metrics
- Phase 3: Graph Analytics
- Phase 4: Isolation Forest Anomaly Detection
- Phase 5: Error Forecasting
- Phase 6: Dashboard Rendering
Integration with CLI
The diagnostics system integrates with batuta stack:
# Stack health dashboard
batuta stack status --diagnostics
# Run anomaly detection
batuta stack check --ml
# Forecast error trends
batuta stack forecast --days 7
Best Practices
- Regular Monitoring: Run diagnostics as part of CI/CD
- Threshold Tuning: Adjust anomaly threshold based on stack maturity
- Evidence Collection: Always include evidence in anomaly reports
- Action Items: Provide actionable recommendations
Oracle Mode
“Ask the Oracle, receive the wisdom of the stack.”
Oracle Mode is the intelligent query interface for the Sovereign AI Stack. Instead of manually researching which components to use, Oracle Mode guides you to the optimal solution based on your requirements.
Overview
Oracle Mode provides:
- Knowledge Graph: Complete registry of stack components with capabilities
- Natural Language Interface: Query in plain English
- Intelligent Recommendations: Algorithm and backend selection
- Code Generation: Ready-to-use examples
┌──────────────────────────────────────────────────────────────────┐
│ ORACLE MODE ARCHITECTURE │
└──────────────────────────────────────────────────────────────────┘
┌─────────────────┐
│ Natural Query │
│ "Train RF" │
└────────┬────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ QUERY ENGINE │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────────────────┐ │
│ │ Domain │ │ Algorithm │ │ Performance │ │
│ │ Detection │ │ Extraction │ │ Hints │ │
│ └─────────────┘ └──────────────┘ └──────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ KNOWLEDGE GRAPH │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ Layer 0: Primitives → trueno, trueno-db, trueno-graph │ │
│ │ Layer 1: ML → aprender │ │
│ │ Layer 2: Pipeline → entrenar, realizar │ │
│ │ Layer 3: Transpilers → depyler, decy, bashrs, ruchy │ │
│ │ Layer 4: Orchestration→ batuta, repartir │ │
│ │ Layer 5: Quality → certeza, pmat, renacer │ │
│ │ Layer 6: Data → alimentar │ │
│ └───────────────────────────────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ RECOMMENDER │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────────────────┐ │
│ │ Component │ │ Backend │ │ Distribution │ │
│ │ Selection │ │ Selection │ │ Decision │ │
│ └─────────────┘ └──────────────┘ └──────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
↓
┌─────────────────┐
│ Response │
│ + Code Example │
└─────────────────┘
The Sovereign AI Stack
Oracle Mode knows all 20 components in the stack:
| Layer | Components | Purpose |
|---|---|---|
| L0: Primitives | trueno, trueno-db, trueno-graph, trueno-viz, trueno-rag | SIMD/GPU compute, vector storage, graph ops, RAG |
| L1: ML | aprender | First-principles ML algorithms |
| L2: Pipeline | entrenar, realizar | Training loops, inference runtime |
| L3: Transpilers | depyler, decy, bashrs, ruchy | Python/C transpilers + Rust↔Shell bidirectional |
| L4: Orchestration | batuta, repartir, pforge | Migration workflow, distributed compute, MCP servers |
| L5: Quality | certeza, pmat, renacer | Testing, profiling, syscall tracing |
| L6: Data | alimentar, pacha | Data loading, model/recipe registry |
Basic Usage
CLI Interface
# List all stack components
$ batuta oracle --list
# Show component details
$ batuta oracle --show trueno
# Find components by capability
$ batuta oracle --capabilities simd
# Query integration patterns
$ batuta oracle --integrate aprender realizar
# Interactive mode
$ batuta oracle --interactive
Interactive Mode
$ batuta oracle --interactive
🔮 Oracle Mode - Ask anything about the Sovereign AI Stack
oracle> How do I train a random forest on 1M samples?
📊 Analysis:
Problem class: Supervised Learning
Algorithm: random_forest
Data size: Large (1M samples)
💡 Primary Recommendation: aprender
Path: aprender::tree::RandomForest
Confidence: 95%
Rationale: Random forest is ideal for large tabular datasets
🔧 Backend: SIMD
Rationale: SIMD vectorization optimal for 1M samples with High complexity
📦 Supporting Components:
- trueno (95%): SIMD-accelerated tensor operations
- alimentar (70%): Parallel data loading
💻 Code Example:
use aprender::tree::RandomForest;
use alimentar::Dataset;
let dataset = Dataset::from_csv("data.csv")?;
let (x, y) = dataset.split_features_target("label")?;
let model = RandomForest::new()
.n_estimators(100)
.max_depth(Some(10))
.n_jobs(-1) // Use all cores
.fit(&x, &y)?;
📚 Related Queries:
- How to optimize random forest hyperparameters?
- How to serialize trained models with realizar?
- How to distribute training with repartir?
Backend Selection
Oracle Mode uses Amdahl’s Law and PCIe transfer overhead (Gregg & Hazelwood, 2011) to select the optimal compute backend.
The 5× Rule
GPU dispatch is only beneficial when compute time exceeds 5× the PCIe transfer time:
If compute_time > 5 × transfer_time → Use GPU
Otherwise → Use SIMD
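As arithmetic, the rule reduces to a few lines. A sketch with illustrative constants (the 32 GB/s and 20 TFLOPS figures match the worked example below, not measured hardware):
// The 5x rule as arithmetic. Bandwidth and throughput are illustrative
// placeholders, not measured values.
fn choose_backend(flops: f64, bytes_transferred: f64) -> &'static str {
    let pcie_bandwidth = 32e9; // bytes/s (illustrative PCIe 4.0 x16)
    let gpu_throughput = 20e12; // FLOPS (illustrative)
    let transfer_time = bytes_transferred / pcie_bandwidth;
    let compute_time = flops / gpu_throughput;
    if compute_time > 5.0 * transfer_time { "GPU" } else { "SIMD" }
}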
Backend Decision Matrix
| Operation | Complexity | Small Data | Large Data | GPU Available |
|---|---|---|---|---|
| Element-wise | O(n) | Scalar | SIMD | SIMD (memory-bound) |
| Reductions | O(n) | Scalar | SIMD | SIMD |
| Matrix mult | O(n³) | SIMD | GPU | GPU |
| Conv2D | O(n²k²) | SIMD | GPU | GPU |
| Attention | O(n²d) | SIMD | GPU | GPU |
Backend Selection Example
oracle> What backend for 2048×2048 matrix multiplication?
🎯 Backend Selection:
Operation: Matrix multiplication
Size: 2048 × 2048 = 4.2M elements
Complexity: O(n³) = 8.6B FLOPs
PCIe Transfer: 4.2M × 4 bytes × 2 = 34 MB
Transfer time: 34 MB / 32 GB/s = 1.06 ms
Compute time: 8.6B FLOPs / 20 TFLOPS = 0.43 ms
Ratio: 0.43 / 1.06 = 0.41× (< 5×)
💡 Recommendation: SIMD
Rationale: PCIe overhead dominates. Use trueno SIMD backend.
GPU becomes beneficial at ~8192×8192.
Distribution Decision
Oracle uses Amdahl’s Law for distribution decisions:
Speedup = 1 / ((1 - P) + P/N)
Where:
P = Parallel fraction of workload
N = Number of nodes
Distribution Example
oracle> Should I distribute random forest on 4 nodes?
📊 Amdahl's Law Analysis:
Algorithm: Random Forest
Parallel fraction: 0.95 (tree training is parallelizable)
Nodes: 4
Theoretical speedup: 1 / (0.05 + 0.95/4) = 3.48×
Communication overhead: ~10% per node = 40%
Effective speedup: 3.48 × 0.6 = 2.09×
💡 Recommendation: Yes, distribute with repartir
Expected speedup: 2.09×
Break-even: 2+ nodes
📦 Code Example:
use repartir::{Executor, WorkStealing};
use aprender::tree::RandomForest;
let executor = Executor::new()
.with_workers(4)
.with_scheduler(WorkStealing);
let forest = executor.map(
trees.chunks(25),
|chunk| train_tree_subset(chunk, &data)
).await?;
Knowledge Graph Queries
Find by Capability
oracle> What components support GPU?
🔍 Components with GPU capability:
- trueno: SIMD-accelerated tensor operations with GPU dispatch
- realizar: GPU-accelerated inference runtime
Find by Domain
oracle> What do I need for graph analytics?
🧠 Graph Analytics Components:
- trueno-graph: Graph traversal and algorithms
- trueno-db: Vector storage with graph indexes
Integration Patterns
oracle> How do I integrate depyler with aprender?
🔗 Integration: depyler → aprender
Pattern: sklearn_migration
Description: Convert sklearn code to aprender
Example:
# Original Python (sklearn)
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=100)
model.fit(X, y)
# After depyler transpilation
use aprender::tree::RandomForest;
let model = RandomForest::new()
.n_estimators(100)
.fit(&x, &y)?;
Academic Foundations
Oracle Mode is grounded in peer-reviewed research:
| Concept | Reference | Application |
|---|---|---|
| PCIe overhead | Gregg & Hazelwood (2011) | Backend selection |
| Amdahl’s Law | Amdahl (1967) | Distribution decisions |
| Roofline model | Williams et al. (2009) | Performance bounds |
| SIMD vectorization | Fog (2022) | Optimization hints |
| Decision trees | Breiman (2001) | Algorithm recommendations |
JSON Output
For programmatic access, use --format json:
$ batuta oracle --format json "random forest large data"
{
"problem_class": "Supervised Learning",
"algorithm": "random_forest",
"primary": {
"component": "aprender",
"path": "aprender::tree::RandomForest",
"confidence": 0.95,
"rationale": "Random forest is ideal for large tabular datasets"
},
"supporting": [
{
"component": "trueno",
"confidence": 0.95,
"rationale": "SIMD-accelerated tensor operations"
}
],
"compute": {
"backend": "SIMD",
"rationale": "SIMD vectorization optimal for large datasets"
},
"distribution": {
"needed": false,
"rationale": "Single-node sufficient for this workload size"
},
"code_example": "use aprender::tree::RandomForest;..."
}
Code Output
For Unix pipeline composition, use --format code to extract raw Rust code with no ANSI escapes and no metadata:
# From a natural language query
$ batuta oracle "train a random forest" --format code
use aprender::tree::RandomForest;
let model = RandomForest::new()
.n_estimators(100)
.max_depth(Some(10))
.fit(&x, &y)?;
# From a cookbook recipe
$ batuta oracle --recipe ml-random-forest --format code
# From an integration pattern
$ batuta oracle --integrate "aprender,realizar" --format code
# Pipe through rustfmt and copy
$ batuta oracle --recipe training-lora --format code | rustfmt | pbcopy
# Dump all recipes with delimiter comments
$ batuta oracle --cookbook --format code
// --- ml-random-forest ---
use aprender::prelude::*;
...
// --- ml-serving ---
use realizar::prelude::*;
...
Code output follows the Jidoka principle: when no code is available, the process exits with code 1 and a stderr diagnostic rather than emitting garbage. Commands like --list, --capabilities, and --rag have no code representation and always exit 1 with --format code.
TDD Test Companions
Every code example — both cookbook recipes and recommender-generated snippets — includes a TDD test companion: a #[cfg(test)] module with 3-4 focused tests. Test companions follow PMAT compliance rules: low cyclomatic complexity, single assertion per test, real crate types.
When using --format code, test companions are appended after the main code:
$ batuta oracle --recipe ml-random-forest --format code
use aprender::tree::RandomForest;
let model = RandomForest::new()
.n_estimators(100)
.max_depth(Some(10))
.fit(&x, &y)?;
#[cfg(test)]
mod tests {
#[test]
fn test_random_forest_construction() {
let n_estimators = 100;
let max_depth = Some(10);
assert!(n_estimators > 0);
assert!(max_depth.unwrap() > 0);
}
#[test]
fn test_prediction_count_matches_input() {
let n_samples = 50;
let predictions = vec![0usize; n_samples];
assert_eq!(predictions.len(), n_samples);
}
#[test]
fn test_feature_importance_sums_to_one() {
let importances = vec![0.4, 0.35, 0.25];
let sum: f64 = importances.iter().sum();
assert!((sum - 1.0).abs() < 1e-10);
}
}
Test companion categories:
| Recipe Type | Test Approach |
|---|---|
| Pure Rust (28 recipes) | Full #[cfg(test)] mod tests block |
| Python+Rust (2 recipes) | Test Rust portion only |
| WASM (3 recipes) | #[cfg(all(test, not(target_arch = "wasm32")))] guard |
| Recommender (5 examples) | Embedded in code_example string |
Recommender code examples (batuta oracle "train a model" --format code) also include test companions inline, so the output is always test-ready.
# Count test companions across all recipes
$ batuta oracle --cookbook --format code 2>/dev/null | grep -c '#\[cfg('
34
# Pipe a recipe with tests through rustfmt
$ batuta oracle --recipe ml-random-forest --format code | rustfmt
See docs/specifications/code-snippets.md for the full specification with Popperian falsification protocol.
Programmatic API
Use Oracle Mode from Rust code:
#![allow(unused)]
fn main() {
use batuta::oracle::{Recommender, OracleQuery, DataSize, HardwareSpec};
// Natural language query
let recommender = Recommender::new();
let response = recommender.query("train random forest on 1M samples");
println!("Primary: {}", response.primary.component);
println!("Backend: {:?}", response.compute.backend);
// Structured query with constraints
let query = OracleQuery::new("neural network training")
.with_data_size(DataSize::samples(1_000_000))
.with_hardware(HardwareSpec::with_gpu(16.0))
.sovereign_only();
let response = recommender.query_structured(&query);
if response.distribution.needed {
println!("Distribute with: {:?}", response.distribution.tool);
}
}
RAG Oracle (APR-Powered)
The RAG Oracle extends Oracle Mode with Retrieval-Augmented Generation for stack documentation. It indexes all CLAUDE.md and README.md files from stack components and provides semantic search.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ RAG ORACLE PIPELINE │
└─────────────────────────────────────────────────────────────────┘
┌─────────────┐ ┌─────────────────┐ ┌─────────────────────────┐
│ Source │ │ Semantic │ │ Content-Addressable │
│ Docs │ → │ Chunker │ → │ Index (BLAKE3) │
│ (P0-P3) │ │ (Code-aware) │ │ (Poka-Yoke) │
└─────────────┘ └─────────────────┘ └─────────────────────────┘
↓
┌─────────────┐ ┌─────────────────┐ ┌─────────────────────────┐
│ Results │ │ RRF Fusion │ │ Hybrid Retrieval │
│ + Scores │ ← │ (k=60) │ ← │ (BM25 + Dense) │
└─────────────┘ └─────────────────┘ └─────────────────────────┘
Toyota Production System Integration
The RAG Oracle applies Toyota Way principles:
| Principle | Implementation |
|---|---|
| Jidoka | Stop-on-error validation (NaN/Inf detection, dimension mismatch) |
| Poka-Yoke | Content hashing prevents stale indexes (BLAKE3) |
| Heijunka | Load-leveled reindexing via priority queue |
| Muda | Delta-only updates skip unchanged documents |
| Kaizen | Model hash tracking for continuous improvement |
Index Persistence (Section 9.7)
The RAG index is persisted to disk for fast startup and offline usage:
Cache Location: ~/.cache/batuta/rag/
Cache Files:
~/.cache/batuta/rag/
├── manifest.json # Version, checksums, timestamps
├── index.json # Inverted index (BM25 terms)
└── documents.json # Document metadata + chunks
Integrity Validation (Jidoka):
- BLAKE3 checksums for index.json and documents.json
- Version compatibility check (major version must match)
- Checksum mismatch triggers load failure (stop-on-error)
Persistence Flow:
Index (CLI) Persist Load (CLI)
─────────── ─────── ──────────
batuta oracle ┌───────┐ batuta oracle
--rag-index ────▶ │ Cache │ ────▶ --rag "query"
└───────┘
│
▼
batuta oracle ──────▶ Stats
--rag-stats (no full load)
batuta oracle ──────▶ Full Rebuild (two-phase save)
--rag-index-force
RAG CLI Commands
# Index all stack documentation (CLAUDE.md, README.md)
$ batuta oracle --rag-index
📚 RAG Indexer (Heijunka Mode)
──────────────────────────────────────────────────
Scanning stack repositories...
✓ trueno/CLAUDE.md ████████░░░░░░░ (12 chunks)
✓ trueno/README.md ██████░░░░░░░░░ (8 chunks)
✓ aprender/CLAUDE.md ██████████░░░░░ (15 chunks)
...
Complete: 16 documents, 142 chunks indexed
Vocabulary: 2847 unique terms
Avg doc length: 89.4 tokens
# Query with RAG
$ batuta oracle --rag "How do I use SIMD for matrix operations?"
🔍 RAG Oracle Mode
──────────────────────────────────────────────────
Index: 16 documents, 142 chunks
Query: How do I use SIMD for matrix operations?
1. [trueno] trueno/CLAUDE.md#42 ████████░░ 78%
Trueno provides SIMD-accelerated tensor ops...
2. [trueno] trueno/README.md#15 ██████░░░░ 62%
Matrix multiplication with AVX2/AVX-512...
# Show TUI dashboard (native only)
$ batuta oracle --rag-dashboard
# Show cache statistics (fast, manifest only)
$ batuta oracle --rag-stats
📊 RAG Index Statistics
──────────────────────────────────────────────────
Version: 1.0.0
Batuta version: 0.6.2
Indexed at: 2025-01-30 14:23:45 UTC
Sources:
- trueno: 4 docs, 42 chunks
- aprender: 3 docs, 38 chunks
- hf-ground-truth-corpus: 12 docs, 100 chunks
# Force rebuild (old cache retained until save completes)
$ batuta oracle --rag-index-force
Force rebuild requested (old cache retained until save)...
📚 RAG Indexer (Heijunka Mode)
...
RAG TUI Dashboard
The dashboard shows real-time index health, query latency, and retrieval quality:
┌─ Oracle RAG Dashboard ──────────────────────────────────────┐
│ Index Health: 95% | Docs: 16 | Chunks: 142 │
├─────────────────────────────────────────────────────────────┤
│ │
│ Index Status Query Latency │
│ ───────────── ───────────── │
│ > trueno ████████░░ 42 ▁▂▃▄▅▆▇█▆▅▃▂▁ │
│ aprender █████████░ 38 avg: 12ms p99: 45ms │
│ realizar ██████░░░░ 24 │
│ entrenar █████░░░░░ 18 Retrieval Quality │
│ ───────────────── │
│ Recent Queries MRR 0.847 ████████░░ │
│ ───────────── NDCG 0.791 ███████░░░ │
│ 12:34:56 "SIMD tensor" trueno R@10 0.923 █████████░ │
│ 12:34:41 "train model" aprender │
│ │
├─────────────────────────────────────────────────────────────┤
│ [q]uit [r]efresh [↑/↓]navigate │
└─────────────────────────────────────────────────────────────┘
Hybrid Retrieval
RAG Oracle uses hybrid retrieval combining:
- BM25 (Sparse): Term-based matching with IDF weighting
- Dense Retrieval: Embedding-based semantic similarity (placeholder for trueno-db)
- RRF Fusion: Reciprocal Rank Fusion (k=60) combines both rankings
RRF Score = Σ 1/(k + rank) for each retriever
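Fusion itself is a few lines. A sketch of RRF over multiple rank lists, with k = 60 as above (illustrative, not the oracle's exact implementation):
use std::collections::HashMap;

// Reciprocal Rank Fusion: each retriever contributes 1/(k + rank) per
// document; documents ranked highly by several retrievers float to the top.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (rank, doc) in ranking.iter().enumerate() {
            // Ranks are 1-based in the usual formulation
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + (rank + 1) as f64);
        }
    }
    let mut fused: Vec<_> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}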
Scalar Int8 Rescoring (Two-Stage Retrieval)
For large-scale dense retrieval, the RAG Oracle implements scalar int8 rescoring based on the HuggingFace embedding quantization research:
┌─────────────────────────────────────────────────────────────────┐
│ TWO-STAGE RESCORING PIPELINE │
└─────────────────────────────────────────────────────────────────┘
Stage 1: Fast Approximate Search Stage 2: Precise Rescoring
──────────────────────────────── ──────────────────────────
┌─────────────┐ ┌─────────────────────────┐
│ Query (f32) │ │ Top 4k candidates │
│ → int8 │ ─────────────────────▶ │ (from Stage 1) │
│ │ i8 × i8 dot product │ │
└─────────────┘ O(n) fast scan │ f32 × i8 rescoring │
│ │ with scale factor │
▼ │ │
┌─────────────┐ │ Final top-k ranking │
│ Index (int8)│ └─────────────────────────┘
│ 4× smaller │
└─────────────┘
Benefits:
- 4× memory reduction (f32 → int8)
- 99% accuracy retention with rescoring
- 3.66× speedup via SIMD acceleration
SIMD Backend Detection:
| Backend | Ops/Cycle | Platforms |
|---|---|---|
| AVX-512 | 64 | Intel Skylake-X, Ice Lake |
| AVX2 | 32 | Intel Haswell+, AMD Zen+ |
| NEON | 16 | ARM64 (M1/M2, Raspberry Pi) |
| Scalar | 1 | Universal fallback |
Quantization (Kaizen):
The quantization uses absmax symmetric quantization with Welford’s online algorithm for numerically stable calibration:
scale = absmax / 127
quantized[i] = clamp(round(x[i] / scale), -128, 127)
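A sketch of the two stages in miniature (calibration via Welford statistics omitted; this mirrors the formulas above, not the production retriever):
// Absmax symmetric quantization: scale = absmax / 127,
// q[i] = clamp(round(x[i] / scale), -128, 127).
fn quantize_int8(x: &[f32]) -> (Vec<i8>, f32) {
    let absmax = x.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    let scale = if absmax > 0.0 { absmax / 127.0 } else { 1.0 };
    let q = x.iter()
        .map(|&v| (v / scale).round().clamp(-128.0, 127.0) as i8)
        .collect();
    (q, scale)
}

// Stage 2: rescore top candidates with the full-precision query against the
// int8 document vector, undoing the scale factor.
fn rescore(query: &[f32], doc: &[i8], scale: f32) -> f32 {
    query.iter().zip(doc).map(|(&q, &d)| q * d as f32 * scale).sum()
}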
Run the Demo:
# Run the scalar int8 rescoring demo
cargo run --example int8_rescore_demo --features native
# Output:
# 🚀 Scalar Int8 Rescoring Retriever Demo
# 🖥️ Detected SIMD Backend: AVX-512
# Int8 operations per cycle: 64
# 📊 Memory Comparison (10 documents × 384 dims):
# f32 storage: 15360 bytes
# int8 storage: 4320 bytes
# Compression: 3.56×
See docs/specifications/retriever-spec.md for the full specification with 100-point Popperian falsification checklist.
Document Priority (Genchi Genbutsu)
Documents are indexed with priority levels:
| Priority | Source | Trigger |
|---|---|---|
| P0 | CLAUDE.md | Every commit |
| P1 | README.md, Cargo.toml, pyproject.toml | On release |
| P2 | docs/**/*.md, src/**/*.py | Weekly scan |
| P3 | examples/**/*.rs, tests/**/*.py, Docstrings | Monthly scan |
Ground Truth Corpora (Cross-Language)
The RAG Oracle indexes external ground truth corpora for cross-language ML pattern discovery:
┌─────────────────────────────────────────────────────────────────┐
│ GROUND TRUTH CORPUS ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Rust Stack │ │ Python Corpus │ │
│ │ (trueno, etc) │ │ (hf-gtc) │ │
│ │ CLAUDE.md │ │ CLAUDE.md │ │
│ │ README.md │ │ src/**/*.py │ │
│ └────────┬─────────┘ └────────┬─────────┘ │
│ │ │ │
│ └─────────────┬─────────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ RAG Oracle Index (BM25 + Dense) │ │
│ │ Cross-language search for ML patterns │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Query: "How do I tokenize text for BERT?" │
│ ↓ │
│ Results: hf-gtc/preprocessing/tokenization.py │
│ + candle/trueno Rust equivalent │
│ │
└─────────────────────────────────────────────────────────────────┘
HuggingFace Ground Truth Corpus
Location: ../hf-ground-truth-corpus
A curated collection of production-ready Python recipes for HuggingFace ML workflows:
- 95%+ test coverage with property-based testing (Hypothesis)
- Module structure: hf_gtc.hub, hf_gtc.inference, hf_gtc.preprocessing, hf_gtc.training
Query Examples:
# Query for Python ML patterns
$ batuta oracle --rag "How do I tokenize text for BERT?"
# Returns: hf_gtc/preprocessing/tokenization.py + candle equivalent
$ batuta oracle --rag "sentiment analysis pipeline"
# Returns: hf_gtc/inference/pipelines.py patterns
Extending Ground Truth
To add new ground truth corpora:
1. Add the directory to python_corpus_dirs in src/cli/oracle.rs:cmd_oracle_rag_index()
2. Ensure the corpus has CLAUDE.md and README.md for P0/P1 indexing
3. Python source in src/**/*.py is indexed as P2
4. Run batuta oracle --rag-index to rebuild the index
Python Chunking
Python files use specialized delimiters for semantic chunking:
| Delimiter | Purpose |
|---|---|
| \ndef | Function definitions |
| \nclass | Class definitions |
| \n def | Method definitions |
| \nasync def | Async function definitions |
| \n## | Markdown section headers |
Programmatic RAG API
#![allow(unused)]
fn main() {
use batuta::oracle::rag::{RagOracle, ChunkerConfig, SemanticChunker};
// Create RAG Oracle
let oracle = RagOracle::new();
// Query the index
let results = oracle.query("SIMD tensor operations");
for result in results {
println!("{}: {} (score: {:.2})",
result.component,
result.source,
result.score
);
}
// Custom chunking
let config = ChunkerConfig::new(512, 64, &["\n## ", "\nfn "]);
let chunker = SemanticChunker::from_config(&config);
let content = "## Overview\nfn main() {}"; // any document text to chunk
let chunks = chunker.split(content);
}
Auto-Update System
The RAG index stays fresh automatically through a three-layer freshness system:
Layer 1: Shell Auto-Fresh (ora-fresh)
On every shell login, ora-fresh runs in the background to check index freshness:
# Runs automatically on shell login (non-blocking)
ora-fresh
# Manual check
ora-fresh
✅ Index is fresh (3h old)
# When stale
ora-fresh
📚 Stack changed since last index, refreshing...
ora-fresh checks two conditions:
- Stale marker: `~/.cache/batuta/rag/.stale` (set by post-commit hooks)
- Age: Index older than 24 hours
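Both conditions are cheap filesystem checks. A minimal Rust equivalent, assuming the index manifest lives alongside the marker (the manifest file name here is illustrative):
use std::path::Path;
use std::time::{Duration, SystemTime};

// Returns true if the RAG index should be refreshed.
fn index_is_stale(cache_dir: &Path) -> std::io::Result<bool> {
    // Condition 1: stale marker touched by a post-commit hook.
    if cache_dir.join(".stale").exists() {
        return Ok(true);
    }
    // Condition 2: index older than 24 hours (manifest mtime as proxy).
    let modified = std::fs::metadata(cache_dir.join("manifest.json"))?.modified()?;
    let age = SystemTime::now()
        .duration_since(modified)
        .unwrap_or(Duration::ZERO);
    Ok(age > Duration::from_secs(24 * 60 * 60))
}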
Layer 2: Post-Commit Hooks (26 repos)
Every commit in any Sovereign AI Stack repository touches a stale marker file:
#!/bin/bash
# .git/hooks/post-commit (installed in all 26 stack repos)
touch "$HOME/.cache/batuta/rag/.stale" 2>/dev/null
This is a zero-overhead signal — the next ora-fresh invocation picks it up and triggers a reindex. No work is done at commit time beyond a single touch call.
Layer 3: Fingerprint-Based Change Detection (BLAKE3)
When a reindex is triggered, BLAKE3 content fingerprints prevent unnecessary work:
batuta oracle --rag-index
✅ Index is current (no files changed since last index)
Each indexed file has a DocumentFingerprint containing:
- Content hash: BLAKE3 hash of file contents
- Chunker config hash: Detects chunking parameter changes
- Model hash: Detects embedding model changes
If no fingerprints have changed, the entire reindex is skipped instantly.
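A sketch of that fingerprint using the blake3 crate; the field names follow the list above, while the struct layout and constructor are assumptions rather than batuta's actual types:
// Assumed layout; a file is reindexed only when its stored
// fingerprint differs from a freshly computed one.
#[derive(PartialEq)]
struct DocumentFingerprint {
    content_hash: blake3::Hash,        // BLAKE3 of file contents
    chunker_config_hash: blake3::Hash, // detects chunking parameter changes
    model_hash: blake3::Hash,          // detects embedding model changes
}

impl DocumentFingerprint {
    fn compute(content: &[u8], chunker_cfg: &[u8], model_id: &[u8]) -> Self {
        Self {
            content_hash: blake3::hash(content),
            chunker_config_hash: blake3::hash(chunker_cfg),
            model_hash: blake3::hash(model_id),
        }
    }
}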
┌─────────────────────────────────────────────────────────────────┐
│ AUTO-UPDATE FLOW │
└─────────────────────────────────────────────────────────────────┘
git commit ─────▶ post-commit hook
touch ~/.cache/batuta/rag/.stale
│
▼
shell login ────▶ ora-fresh (background)
checks .stale marker + 24h age
│
▼
batuta oracle ──▶ fingerprint check (BLAKE3)
--rag-index compare content hashes
skip if nothing changed
│
(changed)│(unchanged)
│ └──▶ "Index is current"
▼
Full reindex (~30s)
Persist new fingerprints
Manual Commands
# Check freshness (instant)
ora-fresh
# Reindex with change detection (skips if current)
batuta oracle --rag-index
# Force full reindex (ignores fingerprints)
batuta oracle --rag-index-force
Key Takeaways
- Query naturally: Ask in plain English, get precise answers
- Trust the math: Backend selection based on PCIe and Amdahl analysis
- Complete stack: All 20 components indexed with capabilities
- Code ready: Get working examples, not just recommendations
- Reproducible: JSON output for automation and CI/CD
Next Steps
Try Oracle Mode yourself:
# Run the Oracle demo
cargo run --example oracle_demo --features native
# Run the RAG Oracle demo
cargo run --example rag_oracle_demo --features native
# Run the Scalar Int8 Rescoring demo
cargo run --example int8_rescore_demo --features native
# Index stack documentation for RAG
batuta oracle --rag-index
# Query with RAG
batuta oracle --rag "How do I train a model?"
# Start interactive mode
batuta oracle --interactive
# Query from CLI
batuta oracle "How do I migrate sklearn to Rust?"
Previous: Renacer: Syscall Tracing Next: Example Overview
Data Platforms Integration
Batuta provides a unified interface for integrating with enterprise data platforms while maintaining sovereignty over your ML infrastructure. The batuta data command visualizes the ecosystem and shows how PAIML stack components map to commercial alternatives.
Toyota Way Principles
The data platforms integration embodies key Lean principles:
| Principle | Application |
|---|---|
| Genchi Genbutsu | Direct platform API queries - go to the source |
| Poka-Yoke | OS-level egress filtering for sovereignty enforcement |
| Heijunka | Adaptive throttling for shared resources |
| Jidoka | Schema drift detection stops the line |
| Muda | Federation over migration (zero-copy where possible) |
| Andon | Cost estimation before query execution |
Supported Platforms
Databricks
DATABRICKS
├── Unity Catalog
│ └── Schemas, Tables, Views
├── Delta Lake
│ └── Parquet storage, Transaction log, Time travel
├── MLflow
│ └── Experiment tracking, Model registry, Model serving
└── Spark
└── DataFrames, Structured Streaming, MLlib
PAIML Mappings:
- Delta Lake → Alimentar (.ald format) - Alternative
- Unity Catalog → Pacha Registry - Alternative
- MLflow → Entrenar experiment tracking - Alternative
- Spark DataFrames → Trueno tensors - Alternative
Snowflake
SNOWFLAKE
├── Virtual Warehouse
│ └── Compute clusters, Result cache, Auto-scaling
├── Iceberg Tables
│ └── Open format, Schema evolution, Partition pruning
├── Snowpark
│ └── Python UDFs, Java/Scala UDFs, ML functions
└── Data Sharing
└── Secure shares, Reader accounts, Marketplace
PAIML Mappings:
- Iceberg Tables → Alimentar (.ald) - Compatible (open format)
- Snowpark Python → Depyler transpilation - Transpiles
- Snowpark ML → Aprender - Alternative
AWS
AWS
├── Storage
│ ├── S3 (Objects, Versioning, Lifecycle)
│ ├── Glue Catalog (Databases, Tables, Crawlers)
│ └── Lake Formation
├── Compute
│   └── EMR, Lambda, ECS/EKS
├── ML
│ ├── SageMaker (Training, Endpoints, Pipelines)
│ ├── Bedrock (Foundation models, Fine-tuning, Agents)
│ └── Comprehend
└── Analytics
└── Athena, Redshift, QuickSight
PAIML Mappings:
- S3 → Alimentar sync - Compatible
- Glue Catalog → Pacha Registry - Alternative
- SageMaker Training → Entrenar - Alternative
- Bedrock → Realizar + serve module - Alternative
- Lambda Python → Depyler transpilation - Transpiles
HuggingFace
HUGGINGFACE
├── Hub
│ └── Models, Datasets, Spaces, Organizations
├── Transformers
│ └── Models, Tokenizers, Pipelines
├── Datasets
│ └── Streaming, Arrow format, Processing
└── Inference API
└── Serverless, Dedicated, TEI/TGI
PAIML Mappings:
- Hub → Pacha Registry - Alternative
- Transformers → Realizar (via GGUF) - Compatible
- Datasets Arrow → Alimentar (.ald) - Compatible
- GGUF models → Realizar inference - Uses
CLI Usage
View All Platforms
batuta data tree
Filter by Platform
batuta data tree --platform databricks
batuta data tree --platform snowflake
batuta data tree --platform aws
batuta data tree --platform huggingface
View PAIML Integration Mappings
batuta data tree --integration
Output shows all 31 integration points:
PAIML ↔ DATA PLATFORMS INTEGRATION
==================================
STORAGE & CATALOGS
├── [ALT] Alimentar (.ald) ←→ Delta Lake
├── [CMP] Alimentar (.ald) ←→ Iceberg Tables
├── [CMP] Alimentar (sync) ←→ S3
├── [ALT] Pacha Registry ←→ Unity Catalog
├── [ALT] Pacha Registry ←→ Glue Catalog
├── [ALT] Pacha Registry ←→ HuggingFace Hub
COMPUTE & PROCESSING
├── [ALT] Trueno ←→ Spark DataFrames
├── [ALT] Trueno ←→ Snowpark
├── [ALT] Trueno ←→ EMR
├── [TRN] Depyler → Rust ←→ Snowpark Python
├── [TRN] Depyler → Rust ←→ Lambda Python
├── [ALT] Trueno-Graph ←→ Neptune/GraphQL
ML TRAINING
├── [ALT] Aprender ←→ MLlib
├── [ALT] Aprender ←→ Snowpark ML
├── [ALT] Entrenar ←→ SageMaker Training
├── [ALT] Entrenar ←→ MLflow Tracking
├── [ALT] Entrenar ←→ SageMaker Experiments
├── [USE] Entrenar ←→ W&B
MODEL SERVING
├── [ALT] Realizar ←→ MLflow Serving
├── [ALT] Realizar ←→ SageMaker Endpoints
├── [ALT] Realizar + serve ←→ Bedrock
├── [USE] Realizar ←→ GGUF models
├── [CMP] Realizar (via GGUF) ←→ HF Transformers
ORCHESTRATION
├── [ORC] Batuta ←→ Databricks Workflows
├── [ORC] Batuta ←→ Snowflake Tasks
├── [ORC] Batuta ←→ Step Functions
├── [ORC] Batuta ←→ Airflow/Prefect
Legend: [CMP]=Compatible [ALT]=Alternative [USE]=Uses
[TRN]=Transpiles [ORC]=Orchestrates
JSON Output
batuta data tree --format json
batuta data tree --platform aws --format json
batuta data tree --integration --format json
Integration Types
| Code | Type | Description |
|---|---|---|
| CMP | Compatible | Works directly with PAIML component |
| ALT | Alternative | PAIML provides sovereign alternative |
| USE | Uses | PAIML component consumes this format |
| TRN | Transpiles | Depyler converts code to Rust |
| ORC | Orchestrates | Batuta can coordinate workflows |
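As a data type, the legend maps naturally onto an enum; a hypothetical sketch, not batuta's internal representation:
enum IntegrationType {
    Compatible,   // CMP: works directly with the PAIML component
    Alternative,  // ALT: PAIML provides a sovereign alternative
    Uses,         // USE: PAIML component consumes this format
    Transpiles,   // TRN: Depyler converts the code to Rust
    Orchestrates, // ORC: Batuta can coordinate the workflow
}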
Data Sovereignty Tiers
The integration supports four sovereignty levels:
#![allow(unused)]
fn main() {
pub enum DataSovereigntyTier {
/// All data stays on-premises, no external calls
FullySovereign,
/// Private cloud (AWS GovCloud, Azure Gov)
HybridSovereign,
/// Standard private cloud deployment
PrivateCloud,
/// Standard commercial cloud
Standard,
}
}
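One way a platform adapter could consult the tier before making network calls; an illustrative policy sketch (actual enforcement is OS-level egress filtering, per the Poka-Yoke row above):
#![allow(unused)]
fn main() {
// Only the fully sovereign tier forbids external API calls outright.
fn allows_external_calls(tier: &DataSovereigntyTier) -> bool {
    !matches!(tier, DataSovereigntyTier::FullySovereign)
}
}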
Architecture
┌─────────────────────────────────────────────────────────────┐
│ BATUTA ORCHESTRATOR │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────┐ ┌──────────┐ ┌─────────┐ ┌─────────────┐ │
│ │Databricks│ │Snowflake │ │ AWS │ │ HuggingFace │ │
│ │ Adapter │ │ Adapter │ │ Adapter │ │ Adapter │ │
│ └────┬────┘ └────┬─────┘ └────┬────┘ └──────┬──────┘ │
│ │ │ │ │ │
│ └────────────┴──────┬──────┴──────────────┘ │
│ │ │
│ ┌──────▼──────┐ │
│ │ Unified │ │
│ │ Data API │ │
│ └──────┬──────┘ │
│ │ │
│ ┌──────────────────────┼──────────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────┐ ┌──────────┐ ┌─────────┐ │
│ │Alimentar│ │ Pacha │ │ Entrenar│ │
│ │(.ald) │ │ Registry │ │Tracking │ │
│ └────────┘ └──────────┘ └─────────┘ │
└─────────────────────────────────────────────────────────────┘
Kaizen Recommendations
Based on Toyota Way analysis, future enhancements include:
- Cost Andon Cord - Pre-flight cost estimation before expensive queries
- Resumable Sync - Stateful checkpointing for long-running transfers
- Schema Drift Detection - Jidoka-style automatic stops on upstream changes
- Adaptive Throttling - Heijunka-based rate limiting for shared warehouses
- Federation Architecture - Virtual catalogs to eliminate migration waste
- Information Flow Control - Taint tracking for data provenance
See Also
- Oracle Mode - Query the stack for recommendations
- HuggingFace Integration - Detailed HF Hub operations
- Alimentar Specification - Data format details
Visualization Frameworks Integration
Batuta provides ecosystem visualization for Python data visualization and ML demo frameworks, showing how they map to sovereign Rust replacements. The batuta viz command displays framework hierarchies and PAIML replacement mappings.
Core Principle
Python visualization frameworks are replaced by sovereign Rust alternatives. No Python runtime dependencies are permitted in the PAIML stack. Python code is transpiled to Rust via Depyler.
Framework Replacement Matrix
| Python Framework | PAIML Replacement | Migration Path |
|---|---|---|
| Gradio | Presentar | Depyler transpilation |
| Streamlit | Presentar | Depyler transpilation |
| Panel | Trueno-Viz | Depyler transpilation |
| Dash | Presentar + Trueno-Viz | Depyler transpilation |
| Matplotlib | Trueno-Viz | Direct API mapping |
| Plotly | Trueno-Viz | Direct API mapping |
Toyota Way Principles
| Principle | Application |
|---|---|
| Genchi Genbutsu | Direct visualization enables first-hand observation |
| Poka-Yoke | Python interpreter eliminated from production |
| Heijunka | Frame-rate limiting prevents GPU saturation |
| Jidoka | Explicit component trees for predictable rendering |
| Muda | Signal-based rendering eliminates wasted computation |
| Kanban | Visual data flow with explicit signal graphs |
CLI Usage
View All Frameworks
batuta viz tree
Output:
VISUALIZATION FRAMEWORKS ECOSYSTEM
==================================
GRADIO (Python) → Presentar (Rust)
├── Interface
│ └── Interface → Presentar::QuickApp
├── Blocks
│ └── Blocks → Presentar::Layout
├── Components
│ ├── Image → Trueno-Viz::ImageView
│ ├── Audio → Presentar::AudioPlayer
│ ├── Chatbot → Realizar + Presentar
│ └── DataFrame → Trueno-Viz::DataGrid
└── Deployment
└── HuggingFace Spaces → Batuta deploy
STREAMLIT (Python) → Presentar (Rust)
├── Widgets
│ ├── Input → Presentar::Widgets
│ └── Display → Presentar + Trueno-Viz
├── Caching
│ ├── @st.cache_data → Trueno::TensorCache
│ └── session_state → Presentar::State
└── Deployment
└── Streamlit Cloud → Batuta deploy
...
Filter by Framework
batuta viz tree --framework gradio
batuta viz tree --framework streamlit
batuta viz tree --framework panel
batuta viz tree --framework dash
View PAIML Replacement Mappings
batuta viz tree --integration
Output:
PAIML REPLACEMENTS FOR PYTHON VIZ
=================================
UI FRAMEWORKS
├── [REP] Presentar::QuickApp ← gr.Interface
├── [REP] Presentar::Layout ← gr.Blocks
├── [REP] Presentar::App ← dash.Dash
├── [REP] Presentar::Layout ← st.columns/sidebar
VISUALIZATION
├── [REP] Trueno-Viz::Chart ← dcc.Graph
├── [REP] Trueno-Viz::Chart ← st.plotly_chart
├── [REP] Trueno-Viz::DataGrid ← st.dataframe
├── [REP] Trueno-Viz::GPURaster ← datashader
COMPONENTS
├── [REP] Presentar::TextInput ← st.text_input
├── [REP] Presentar::Slider ← st.slider
├── [REP] Trueno-Viz::ImageView ← gr.Image
STATE & CACHING
├── [REP] Presentar::State ← st.session_state
├── [REP] Trueno::TensorCache ← @st.cache_data
├── [REP] Presentar::on_event ← @callback
DEPLOYMENT
├── [REP] Batuta deploy ← HuggingFace Spaces
├── [REP] Batuta deploy ← Streamlit Cloud
├── [REP] Batuta deploy ← Dash Enterprise
Legend: [REP]=Replaces (Python eliminated)
Summary: 21 Python components replaced by sovereign Rust alternatives
Zero Python dependencies in production
JSON Output
batuta viz tree --format json
batuta viz tree --framework streamlit --format json
batuta viz tree --integration --format json
Why Replace Python Frameworks?
Gradio → Presentar
Problems with Gradio:
- Python server restarts on every interaction
- ~2s cold start time
- ~100ms interaction latency
- No offline capability
Presentar Benefits:
- Persistent state with sub-millisecond updates
- ~50ms cold start
- ~16ms interaction latency (60fps)
- WebAssembly deployment for edge/offline
Streamlit → Presentar
Problems with Streamlit:
- Full script reruns on each interaction (Muda)
- ~3s cold start, ~200ms latency
- ~8MB bundle size
- ~200MB memory usage
Presentar Benefits:
- Signal-based reactivity (minimal DOM updates)
- Compile-time type checking
- ~500KB bundle size
- ~20MB memory usage
Panel → Trueno-Viz
Problems with Panel:
- 6+ HoloViz dependencies (Panel, HoloViews, Datashader, Bokeh, Param, Colorcet)
- WebGL rendering (older API)
- Python GIL contention
Trueno-Viz Benefits:
- Single unified library
- Native WebGPU rendering
- Rust memory safety for big data
- Billion-point rendering capability
Dash → Presentar + Trueno-Viz
Problems with Dash:
- Callback spaghetti (invisible data dependencies)
- Large Plotly.js bundle
- WebGL performance limits
Presentar + Trueno-Viz Benefits:
- Explicit signal graph (debuggable)
- Smaller WASM bundle
- WebGPU for maximum performance
Performance Comparison
| Metric | Gradio | Streamlit | Dash | Presentar |
|---|---|---|---|---|
| Cold start | ~2s | ~3s | ~1s | ~50ms |
| Interaction | ~100ms | ~200ms | ~80ms | ~16ms |
| Bundle size | ~5MB | ~8MB | ~3MB | ~500KB |
| Memory | ~150MB | ~200MB | ~100MB | ~20MB |
| GPU | No | No | WebGL | WebGPU |
| Offline | No | No | No | Yes |
| WASM | No | No | No | Yes |
Component Mapping Reference
Gradio Components
| Gradio | Presentar/Trueno-Viz |
|---|---|
| gr.Interface | Presentar::QuickApp |
| gr.Blocks | Presentar::Layout |
| gr.Image | Trueno-Viz::ImageView |
| gr.Audio | Presentar::AudioPlayer |
| gr.Chatbot | Realizar + Presentar |
| gr.DataFrame | Trueno-Viz::DataGrid |
Streamlit Components
| Streamlit | Presentar/Trueno-Viz |
|---|---|
| st.write | Presentar::Text |
| st.dataframe | Trueno-Viz::DataGrid |
| st.plotly_chart | Trueno-Viz::Chart |
| st.text_input | Presentar::TextInput |
| st.slider | Presentar::Slider |
| st.selectbox | Presentar::Select |
| st.session_state | Presentar::State |
| @st.cache_data | Trueno::TensorCache |
Dash Components
| Dash | Presentar/Trueno-Viz |
|---|---|
| dash.Dash | Presentar::App |
| dcc.Graph | Trueno-Viz::Chart |
| dcc.Input | Presentar::TextInput |
| dash_table | Trueno-Viz::DataGrid |
| @callback | Presentar::on_event |
See Also
- Presentar: App Framework - Detailed Presentar documentation
- Trueno-Viz: GPU Rendering - Trueno-Viz capabilities
- batuta viz - CLI reference
Example Overview
This chapter provides runnable examples demonstrating batuta’s capabilities across the Sovereign AI Stack.
Running Examples
All examples are in the examples/ directory and can be run with:
cargo run --example <example_name>
Some examples require specific features:
# Examples requiring oracle-mode
cargo run --example oracle_demo --features oracle-mode
# Examples requiring inference
cargo run --example serve_demo --features inference
# Examples requiring native features (TUI, tracing)
cargo run --example stack_graph_tui --features native
Example Categories
Core Pipeline Examples
| Example | Description | Features |
|---|---|---|
| pipeline_demo | 5-phase transpilation pipeline with Jidoka validation | - |
| backend_selection | Cost-based GPU/SIMD/Scalar selection | - |
| moe_routing | Mixture-of-Experts backend routing | - |
| full_transpilation | End-to-end transpilation workflow | - |
ML Framework Conversion
| Example | Description | Features |
|---|---|---|
| numpy_conversion | NumPy → Trueno operation mapping | - |
| sklearn_conversion | scikit-learn → Aprender migration | - |
| pytorch_conversion | PyTorch → Realizar conversion | - |
Oracle Mode Examples
| Example | Description | Features |
|---|---|---|
| oracle_demo | Knowledge graph queries | oracle-mode |
| oracle_local_demo | Local workspace discovery | oracle-mode |
| rag_oracle_demo | RAG-enhanced oracle queries | oracle-mode |
Stack Management
| Example | Description | Features |
|---|---|---|
| stack_dogfood | Self-analysis of batuta codebase | native |
| stack_graph_tui | TUI visualization of stack dependencies | native |
| stack_quality_demo | Quality metrics across stack | native |
| stack_diagnostics_demo | Comprehensive stack health check | native |
| publish_status_demo | crates.io publish status checker | - |
| sovereign_stack_e2e | End-to-end stack validation | - |
Infrastructure Components
| Example | Description | Features |
|---|---|---|
| trueno_zram_demo | SIMD compression with trueno-zram | - |
| trueno_ublk_demo | GPU block device acceleration | - |
| repartir_distributed | Distributed computing patterns | - |
| multi_machine_demo | Multi-node GPU/SIMD orchestration | - |
Model Serving
| Example | Description | Features |
|---|---|---|
| serve_demo | Privacy-tiered model serving | inference |
| whisper_apr_demo | Whisper ASR inference | inference |
| pepita_kernel_demo | GPU kernel interfaces | - |
| int8_rescore_demo | INT8 quantized inference | inference |
Content & Data
| Example | Description | Features |
|---|---|---|
| content_demo | Content analysis and generation | - |
| hf_catalog_demo | HuggingFace catalog integration | - |
| parf_analysis | PARF (Project ARtifact Format) analysis | - |
MCP Integration
| Example | Description | Features |
|---|---|---|
| mcp_demo | MCP server integration | - |
| custom_plugin | Custom plugin development | - |
| graph_tui_demo | Graph visualization TUI | native |
Quick Start Examples
1. Pipeline Demo (No Features Required)
cargo run --example pipeline_demo
Demonstrates the 5-phase transpilation pipeline with Jidoka (stop-on-error) validation.
2. Oracle Local Demo
cargo run --example oracle_local_demo --features oracle-mode
Discovers PAIML projects in ~/src and shows their development state (Clean/Dirty/Unpushed).
3. Stack Quality Demo
cargo run --example stack_quality_demo --features native
Analyzes quality metrics across the Sovereign AI Stack components.
4. Backend Selection Demo
cargo run --example backend_selection
Shows cost-based GPU/SIMD/Scalar backend selection using the 5× PCIe rule.
Example Dependencies
Some examples have external dependencies:
- Model files: Examples in `serve_demo` and `whisper_apr_demo` require GGUF/APR model files
- GPU: CUDA examples require NVIDIA GPU with CUDA toolkit
- Network: `hf_catalog_demo` requires internet access for the HuggingFace API
Building All Examples
Verify all examples compile:
cargo check --examples
cargo check --examples --features oracle-mode,native,inference
Navigate: Table of Contents | Next: Python ML Example
Python ML Example
This chapter is under development.
Coming soon: Detailed information about python ml example.
Navigate: Table of Contents
NumPy Trueno
This chapter is under development.
Coming soon: Detailed information about numpy trueno.
Navigate: Table of Contents
Sklearn Aprender
This chapter is under development.
Coming soon: Detailed information about sklearn aprender.
Navigate: Table of Contents
PyTorch Realizar
This chapter is under development.
Coming soon: Detailed information about pytorch realizar.
Navigate: Table of Contents
C Library Example
This chapter is under development.
Coming soon: Detailed information about c library example.
Navigate: Table of Contents
C Memory
This chapter is under development.
Coming soon: Detailed information about c memory.
Navigate: Table of Contents
C Ownership
This chapter is under development.
Coming soon: Detailed information about c ownership.
Navigate: Table of Contents
C FFI
This chapter is under development.
Coming soon: Detailed information about c ffi.
Navigate: Table of Contents
Shell Script Example
This chapter is under development.
Coming soon: Detailed information about shell script example.
Navigate: Table of Contents
Shell Commands
This chapter is under development.
Coming soon: Detailed information about shell commands.
Navigate: Table of Contents
Shell Errors
This chapter is under development.
Coming soon: Detailed information about shell errors.
Navigate: Table of Contents
Shell Cli
This chapter is under development.
Coming soon: Detailed information about shell cli.
Navigate: Table of Contents
Mixed Language Example
This chapter is under development.
Coming soon: Detailed information about mixed language example.
Navigate: Table of Contents
Mixed Modules
This chapter is under development.
Coming soon: Detailed information about mixed modules.
Navigate: Table of Contents
Mixed Gradual
This chapter is under development.
Coming soon: Detailed information about mixed gradual.
Navigate: Table of Contents
Mixed Testing
This chapter is under development.
Coming soon: Detailed information about mixed testing.
Navigate: Table of Contents
Config Overview
This chapter is under development.
Coming soon: Detailed information about config overview.
Navigate: Table of Contents
Config Reference
This chapter is under development.
Coming soon: Detailed information about config reference.
Navigate: Table of Contents
Config Project
This chapter is under development.
Coming soon: Detailed information about config project.
Navigate: Table of Contents
Config Transpilation
This chapter is under development.
Coming soon: Detailed information about config transpilation.
Navigate: Table of Contents
Config Optimization
This chapter is under development.
Coming soon: Detailed information about config optimization.
Navigate: Table of Contents
Config Validation
This chapter is under development.
Coming soon: Detailed information about config validation.
Navigate: Table of Contents
Config Build
This chapter is under development.
Coming soon: Detailed information about config build.
Navigate: Table of Contents
Workflow State
This chapter is under development.
Coming soon: Detailed information about workflow state.
Navigate: Table of Contents
Custom Flags
This chapter is under development.
Coming soon: Detailed information about custom flags.
Navigate: Table of Contents
CLI Overview
This chapter is under development.
Coming soon: Detailed information about cli overview.
Navigate: Table of Contents
CLI Analyze
This chapter is under development.
Coming soon: Detailed information about cli analyze.
Navigate: Table of Contents
CLI Init
This chapter is under development.
Coming soon: Detailed information about cli init.
Navigate: Table of Contents
CLI Transpile
This chapter is under development.
Coming soon: Detailed information about cli transpile.
Navigate: Table of Contents
CLI Optimize
This chapter is under development.
Coming soon: Detailed information about cli optimize.
Navigate: Table of Contents
CLI Validate
This chapter is under development.
Coming soon: Detailed information about cli validate.
Navigate: Table of Contents
CLI Build
This chapter is under development.
Coming soon: Detailed information about cli build.
Navigate: Table of Contents
CLI Report
This chapter is under development.
Coming soon: Detailed information about cli report.
Navigate: Table of Contents
CLI Status
This chapter is under development.
Coming soon: Detailed information about cli status.
Navigate: Table of Contents
CLI Reset
This chapter is under development.
Coming soon: Detailed information about cli reset.
Navigate: Table of Contents
batuta oracle
Query the Sovereign AI Stack knowledge graph for component recommendations, backend selection, and integration patterns.
Synopsis
batuta oracle [OPTIONS] [QUERY]
Description
Oracle Mode provides an intelligent query interface to the Sovereign AI Stack. It analyzes your requirements and recommends:
- Primary component for your task
- Supporting components that integrate well
- Compute backend (Scalar/SIMD/GPU/Distributed)
- Code examples ready to use
Options
| Option | Description |
|---|---|
| --list | List all stack components |
| --show <component> | Show details about a specific component |
| --capabilities <cap> | Find components by capability (e.g., simd, ml, transpilation) |
| --integrate <from> <to> | Show integration pattern between two components |
| --interactive | Start interactive query mode |
| --format <format> | Output format: text (default), json, markdown, or code |
| --rag | Use RAG-based retrieval from indexed stack documentation |
| --rag-index | Index/reindex stack documentation for RAG queries |
| --rag-index-force | Clear cache and rebuild index from scratch |
| --rag-stats | Show cache statistics (fast, manifest only) |
| --rag-dashboard | Launch TUI dashboard for RAG index statistics |
| --local | Show local workspace status (~/src PAIML projects) |
| --dirty | Show only dirty (uncommitted changes) projects |
| --publish-order | Show safe publish order respecting dependencies |
| -h, --help | Print help information |
Examples
List Stack Components
$ batuta oracle --list
📚 Sovereign AI Stack Components:
Layer 0: Compute Primitives
- trueno v0.8.8: SIMD-accelerated tensor operations + simulation testing framework
- trueno-db v0.3.7: High-performance vector database
- trueno-graph v0.1.4: Graph analytics engine
- trueno-viz v0.1.5: Visualization toolkit
Layer 1: ML Algorithms
- aprender v0.19.0: First-principles ML library
Layer 2: Training & Inference
- entrenar v0.3.0: Training loop framework
- realizar v0.3.0: ML inference runtime
...
Query Component Details
$ batuta oracle --show aprender
📦 Component: aprender v0.19.0
Layer: ML Algorithms
Description: Next-generation machine learning library in pure Rust
Capabilities:
- random_forest (Machine Learning)
- gradient_boosting (Machine Learning)
- clustering (Machine Learning)
- neural_networks (Machine Learning)
Integrates with:
- trueno: Uses SIMD-accelerated tensor operations
- realizar: Exports models for inference
- alimentar: Loads training data
References:
[1] Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5-32
[2] Chen & Guestrin (2016). XGBoost: A Scalable Tree Boosting System
Find by Capability
$ batuta oracle --capabilities simd
🔍 Components with 'simd' capability:
- trueno: SIMD-accelerated tensor operations
Natural Language Query
$ batuta oracle "How do I train a random forest on 1M samples?"
📊 Analysis:
Problem class: Supervised Learning
Algorithm: random_forest
Data size: Large (1M samples)
💡 Primary Recommendation: aprender
Path: aprender::tree::RandomForest
Confidence: 95%
🔧 Backend: SIMD
Rationale: SIMD vectorization optimal for 1M samples
💻 Code Example:
use aprender::tree::RandomForest;
let model = RandomForest::new()
.n_estimators(100)
.max_depth(Some(10))
.fit(&x, &y)?;
Integration Patterns
$ batuta oracle --integrate depyler aprender
🔗 Integration: depyler → aprender
Pattern: sklearn_migration
Description: Convert sklearn code to aprender
Before (Python/sklearn):
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=100)
After (Rust/aprender):
use aprender::tree::RandomForest;
let model = RandomForest::new().n_estimators(100);
Interactive Mode
$ batuta oracle --interactive
🔮 Oracle Mode - Ask anything about the Sovereign AI Stack
oracle> What's the fastest way to do matrix multiplication?
📊 Analysis:
Problem class: Linear Algebra
💡 Primary Recommendation: trueno
Confidence: 85%
Rationale: SIMD-accelerated matrix operations
💻 Code Example:
use trueno::prelude::*;
let a = Tensor::from_vec(vec![1.0, 2.0, 3.0, 4.0]).reshape([2, 2]);
let b = Tensor::from_vec(vec![5.0, 6.0, 7.0, 8.0]).reshape([2, 2]);
let c = a.matmul(&b);
oracle> exit
Goodbye!
JSON Output
$ batuta oracle --format json "random forest"
{
"problem_class": "Supervised Learning",
"algorithm": "random_forest",
"primary": {
"component": "aprender",
"path": "aprender::tree::RandomForest",
"confidence": 0.9,
"rationale": "Random forest for supervised learning"
},
"compute": {
"backend": "SIMD",
"rationale": "SIMD vectorization optimal"
},
"distribution": {
"needed": false,
"rationale": "Single-node sufficient"
}
}
Code Output
Extract raw code snippets for piping to other tools. No ANSI escapes, no metadata — just code. All code output includes TDD test companions (#[cfg(test)] modules) appended after the main code:
# Extract code from a recipe (includes test companion)
$ batuta oracle --recipe ml-random-forest --format code
use aprender::tree::RandomForest;
let model = RandomForest::new()
.n_estimators(100)
.max_depth(Some(10))
.fit(&x, &y)?;
#[cfg(test)]
mod tests {
#[test]
fn test_random_forest_construction() {
let n_estimators = 100;
assert!(n_estimators > 0);
}
// ... 2-3 more focused tests
}
# Natural language queries also include test companions
$ batuta oracle "train a model" --format code > example.rs
# Pipe to rustfmt and clipboard
$ batuta oracle --recipe training-lora --format code | rustfmt | pbcopy
# Dump all cookbook recipes as code (each includes test companion)
$ batuta oracle --cookbook --format code > all_recipes.rs
# Count test companions
$ batuta oracle --cookbook --format code 2>/dev/null | grep -c '#\[cfg('
34
# Commands without code exit with code 1
$ batuta oracle --list --format code
No code available for --list (try --format text)
$ echo $?
1
When the requested context has no code available (e.g., --list, --capabilities, --rag), the process exits with code 1 and a stderr diagnostic suggesting --format text.
RAG-Based Query
Query using Retrieval-Augmented Generation from indexed stack documentation:
$ batuta oracle --rag "How do I fine-tune a model with LoRA?"
🔍 RAG Oracle Query: "How do I fine-tune a model with LoRA?"
📄 Retrieved Documents (RRF-fused):
1. entrenar/CLAUDE.md (score: 0.847)
"LoRA (Low-Rank Adaptation) enables parameter-efficient fine-tuning..."
2. aprender/CLAUDE.md (score: 0.623)
"For training workflows, entrenar provides autograd and optimization..."
💡 Recommendation:
Use `entrenar` for LoRA fine-tuning with quantization support (QLoRA).
💻 Code Example:
use entrenar::lora::{LoraConfig, LoraTrainer};
let config = LoraConfig::new()
.rank(16)
.alpha(32.0)
.target_modules(&["q_proj", "v_proj"]);
let trainer = LoraTrainer::new(model, config);
trainer.train(&dataset)?;
Index Stack Documentation
Build or update the RAG index from stack CLAUDE.md files and ground truth corpora:
$ batuta oracle --rag-index
📚 RAG Indexer (Heijunka Mode)
──────────────────────────────────────────────────
Scanning Rust stack repositories...
✓ trueno/CLAUDE.md ████████████░░░ (12 chunks)
✓ trueno/README.md ████████░░░░░░░ (8 chunks)
✓ aprender/CLAUDE.md ██████████████░ (15 chunks)
✓ realizar/CLAUDE.md ████████░░░░░░░ (8 chunks)
...
Scanning Python ground truth corpora...
✓ hf-ground-truth-corpus/CLAUDE.md ██████░░░░░░░░░ (6 chunks)
✓ hf-ground-truth-corpus/README.md ████████████░░░ (12 chunks)
✓ src/hf_gtc/hub/search.py ████░░░░░░░░░░░ (4 chunks)
✓ src/hf_gtc/preprocessing/tokenization.py ██████░░░░░░░░ (6 chunks)
...
──────────────────────────────────────────────────
Complete: 28 documents, 186 chunks indexed
Vocabulary: 3847 unique terms
Avg doc length: 89.4 tokens
Reindexer: 28 documents tracked
Query Ground Truth Corpora
Query for Python ML patterns and get cross-language results:
$ batuta oracle --rag "How do I tokenize text for BERT?"
🔍 RAG Oracle Mode
──────────────────────────────────────────────────
Index: 28 documents, 186 chunks
Query: How do I tokenize text for BERT?
1. [hf-ground-truth-corpus] src/hf_gtc/preprocessing/tokenization.py#12 ████████░░ 82%
def preprocess_text(text: str) -> str:
text = text.strip().lower()...
2. [trueno] trueno/CLAUDE.md#156 ██████░░░░ 65%
For text preprocessing, trueno provides...
3. [hf-ground-truth-corpus] hf-ground-truth-corpus/README.md#42 █████░░░░░ 58%
from hf_gtc.preprocessing.tokenization import preprocess_text...
$ batuta oracle --rag "sentiment analysis pipeline"
# Returns Python pipeline patterns + Rust inference equivalents
RAG Cache Statistics
Show index statistics without a full load (reads manifest only):
$ batuta oracle --rag-stats
📊 RAG Index Statistics
──────────────────────────────────────────────────
Version: 1.0.0
Batuta version: 0.6.2
Indexed at: 2025-01-30 14:23:45 UTC
Cache path: /home/user/.cache/batuta/rag
Sources:
- trueno: 4 docs, 42 chunks (commit: abc123)
- aprender: 3 docs, 38 chunks (commit: def456)
- hf-ground-truth-corpus: 12 docs, 100 chunks
Force Rebuild Index
Rebuild from scratch, ignoring fingerprint-based skip. The old cache is retained until the new index is saved (crash-safe two-phase write):
$ batuta oracle --rag-index-force
Force rebuild requested (old cache retained until save)...
📚 RAG Indexer (Heijunka Mode)
──────────────────────────────────────────────────
Scanning Rust stack repositories...
✓ trueno/CLAUDE.md ████████████░░░ (12 chunks)
...
Complete: 28 documents, 186 chunks indexed
Index saved to /home/user/.cache/batuta/rag
RAG Dashboard
Launch the TUI dashboard to monitor RAG index health:
$ batuta oracle --rag-dashboard
┌─────────────────────────────────────────────────────────────┐
│ RAG Oracle Dashboard │
├─────────────────────────────────────────────────────────────┤
│ Index Status: HEALTHY Last Updated: 2 hours ago │
├─────────────────────────────────────────────────────────────┤
│ Documents by Priority: │
│ P0 (Critical): ████████████████████ 12 CLAUDE.md │
│ P1 (High): ████████████ 8 README.md │
│ P2 (Medium): ██████ 4 docs/ │
│ P3 (Low): ████ 2 examples/ │
├─────────────────────────────────────────────────────────────┤
│ Retrieval Quality (last 24h): │
│ MRR: 0.847 ████████████████░░░░ │
│ Recall@5: 0.923 ██████████████████░░ │
│ NDCG@10: 0.891 █████████████████░░░ │
├─────────────────────────────────────────────────────────────┤
│ Reindex Queue (Heijunka): │
│ - entrenar/CLAUDE.md (staleness: 0.72) │
│ - realizar/CLAUDE.md (staleness: 0.45) │
└─────────────────────────────────────────────────────────────┘
Local Workspace Discovery
Discover PAIML projects in ~/src with development state awareness:
$ batuta oracle --local
🏠 Local Workspace Status (PAIML projects in ~/src)
📊 Summary:
Total projects: 42
✅ Clean: 28
🔧 Dirty: 10
📤 Unpushed: 4
┌──────────────────┬──────────┬───────────┬────────┬─────────────────┐
│ Project │ Local │ Crates.io │ State │ Git Status │
├──────────────────┼──────────┼───────────┼────────┼─────────────────┤
│ trueno │ 0.11.0 │ 0.11.0 │ ✅ Clean │ │
│ aprender │ 0.24.0 │ 0.24.0 │ ✅ Clean │ │
│ depyler │ 3.21.0 │ 3.20.0 │ 🔧 Dirty │ 15 mod, 3 new │
│ entrenar │ 0.5.0 │ 0.5.0 │ 📤 Unpushed │ 2 ahead │
│ batuta │ 0.5.0 │ 0.5.0 │ ✅ Clean │ │
└──────────────────┴──────────┴───────────┴────────┴─────────────────┘
💡 Dirty projects use crates.io version for deps (stable)
Development State Legend
| State | Icon | Meaning |
|---|---|---|
| Clean | ✅ | No uncommitted changes, safe to use local version |
| Dirty | 🔧 | Active development, use crates.io version for deps |
| Unpushed | 📤 | Clean but has unpushed commits |
Key Insight: Dirty projects don’t block the stack! The crates.io version is stable and should be used for dependencies while local development continues.
Show Only Dirty Projects
Filter to show only projects with uncommitted changes:
$ batuta oracle --dirty
🔧 Dirty Projects (active development)
┌──────────────────┬──────────┬───────────┬─────────────────────────┐
│ Project │ Local │ Crates.io │ Changes │
├──────────────────┼──────────┼───────────┼─────────────────────────┤
│ depyler │ 3.21.0 │ 3.20.0 │ 15 modified, 3 untracked│
│ renacer │ 0.10.0 │ 0.9.0 │ 8 modified │
│ pmat │ 0.20.0 │ 0.19.0 │ 22 modified, 5 untracked│
└──────────────────┴──────────┴───────────┴─────────────────────────┘
💡 These projects are safe to skip - crates.io versions are stable.
Focus on --publish-order for clean projects ready to release.
Publish Order
Show the safe publish order respecting inter-project dependencies:
$ batuta oracle --publish-order
📦 Suggested Publish Order (topological sort)
Step 1: trueno-graph (0.1.9 → 0.1.10)
✅ Ready - no blockers
Dependencies: (none)
Step 2: aprender (0.23.0 → 0.24.0)
✅ Ready - no blockers
Dependencies: trueno
Step 3: entrenar (0.4.0 → 0.5.0)
✅ Ready - no blockers
Dependencies: aprender
Step 4: depyler (3.20.0 → 3.21.0)
⚠️ Blocked: 15 uncommitted changes
Dependencies: aprender, entrenar
Step 5: batuta (0.4.9 → 0.5.0)
⚠️ Blocked: waiting for depyler
Dependencies: all stack components
────────────────────────────────────────
📊 Summary:
Ready to publish: 3 projects
Blocked: 2 projects
💡 Run 'cargo publish' in order shown above.
Skip blocked projects - they'll use crates.io stable versions.
Auto-Update System
The RAG index stays fresh automatically through three layers:
Layer 1: Shell Auto-Fresh (ora-fresh)
# Runs automatically on shell login (non-blocking background check)
# Manual invocation:
$ ora-fresh
✅ Index is fresh (3h old)
# When a stack repo has been committed since last index:
$ ora-fresh
📚 Stack changed since last index, refreshing...
Layer 2: Post-Commit Hooks
All 26 stack repos have a post-commit hook that touches a stale marker:
# Installed in .git/hooks/post-commit across all stack repos
touch "$HOME/.cache/batuta/rag/.stale" 2>/dev/null
Layer 3: Fingerprint-Based Change Detection
On reindex, BLAKE3 content fingerprints skip work when nothing changed:
# Second run detects no changes via fingerprints
$ batuta oracle --rag-index
✅ Index is current (no files changed since last index)
# Force reindex ignores fingerprints (old cache retained until save)
$ batuta oracle --rag-index-force
Force rebuild requested (old cache retained until save)...
📚 RAG Indexer (Heijunka Mode)
...
Complete: 5016 documents, 264369 chunks indexed
Each DocumentFingerprint tracks:
- Content hash (BLAKE3 of file contents)
- Chunker config hash (detect parameter changes)
- Model hash (detect embedding model changes)
Exit Codes
| Code | Description |
|---|---|
| 0 | Success |
| 1 | General error / no code available (--format code on non-code context) |
| 2 | Invalid arguments |
See Also
- Oracle Mode: Intelligent Query Interface - Full documentation
- batuta analyze - Project analysis
- batuta transpile - Code transpilation
Previous: batuta reset
Next: Migration Strategy
batuta stack
PAIML Stack dependency orchestration commands.
Synopsis
batuta stack <COMMAND>
Commands
| Command | Description |
|---|---|
| check | Check dependency health across the PAIML stack |
| drift | Detect version drift across published stack crates |
| gate | Enforce A- quality threshold for all components |
| publish-status | Check which crates need publishing (O(1) cached) |
| quality | Analyze quality metrics across the PAIML stack |
| release | Coordinate releases across the PAIML stack |
| status | Show stack health status dashboard |
| sync | Synchronize dependencies across the stack |
| tree | Display hierarchical tree of PAIML stack components |
| versions | Check latest versions from crates.io |
batuta stack tree
Display a visual hierarchical tree of all 21 PAIML stack components.
Usage
batuta stack tree [OPTIONS]
Options
| Option | Description |
|---|---|
| --format <FORMAT> | Output format: ascii (default), json, dot |
| --health | Show health status and version information |
| --filter <LAYER> | Filter by layer name |
Layers
| Layer | Components |
|---|---|
| core | trueno, trueno-viz, trueno-db, trueno-graph, trueno-rag |
| ml | aprender, aprender-shell, aprender-tsp |
| inference | realizar, renacer, alimentar, entrenar |
| orchestration | batuta, certeza, presentar, pacha |
| distributed | repartir |
| transpilation | ruchy, decy, depyler |
| docs | sovereign-ai-stack-book |
Examples
# ASCII tree (default)
batuta stack tree
# Output:
# PAIML Stack (21 crates)
# ├── core
# │ ├── trueno
# │ ├── trueno-viz
# │ └── ...
# ├── ml
# │ └── ...
# JSON output for tooling
batuta stack tree --format json
# Graphviz DOT for visualization
batuta stack tree --format dot | dot -Tpng -o stack.png
# Filter to specific layer
batuta stack tree --filter core
# Show health status
batuta stack tree --health
batuta stack check
Analyze dependency health across the PAIML ecosystem.
Usage
batuta stack check [OPTIONS]
Options
| Option | Description |
|---|---|
| --project <NAME> | Specific project to check (default: all) |
| --format <FORMAT> | Output format: text, json, markdown |
| --strict | Fail on any warnings |
| --verify-published | Verify crates.io versions exist |
| --workspace <PATH> | Path to workspace root |
Examples
# Check all projects
batuta stack check
# Check specific project with strict mode
batuta stack check --project trueno --strict
# JSON output for CI
batuta stack check --format json --verify-published
batuta stack drift
Detect version drift across published PAIML stack crates. This command checks if any published stack crate is using an outdated version of another stack crate as a dependency.
Usage
batuta stack drift [OPTIONS]
Options
| Option | Description |
|---|---|
| --fix | Generate fix commands for drift issues |
| --workspace <PATH> | Workspace root containing stack crates |
| --format <FORMAT> | Output format: text (default), json |
| --quiet, -q | Only output if drift detected |
Automatic Blocking
By default, batuta blocks all commands if stack drift is detected. This ensures that releases and operations only proceed with a healthy stack.
# Attempting any command with drift detected:
batuta analyze .
# Output:
# 🔴 Stack Drift Detected - Cannot Proceed
#
# trueno-rag 0.1.5: trueno 0.10.1 → 0.11.0 (MINOR)
# entrenar 0.5.0: aprender 0.21 → 0.23 (MINOR)
#
# Stack drift detected. Fix dependencies before proceeding.
# Run: batuta stack drift --fix
To bypass in emergencies (not recommended):
batuta --unsafe-skip-drift-check analyze .
Drift Severity
| Severity | Example | Impact |
|---|---|---|
| MAJOR | 0.6 → 0.11 | Likely breaking changes |
| MINOR | 0.10.1 → 0.11.0 | New features, possible deprecations |
| PATCH | 0.11.0 → 0.11.1 | Bug fixes only |
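Severity classification is ordinary semver comparison by highest changed component. A sketch using the semver crate (illustrative; batuta may additionally escalate wide pre-1.0 jumps such as 0.6 → 0.11 to MAJOR, since 0.x minor bumps are allowed to break):
use semver::Version;

// Classify drift between the pinned and latest versions of a dependency.
// Inputs must be full semver strings (e.g., "0.11.0", not "0.11").
fn drift_severity(current: &str, latest: &str) -> Option<&'static str> {
    let c = Version::parse(current).ok()?;
    let l = Version::parse(latest).ok()?;
    if c == l {
        return None; // no drift
    }
    Some(if l.major != c.major {
        "MAJOR"
    } else if l.minor != c.minor {
        "MINOR"
    } else {
        "PATCH"
    })
}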
Examples
# Check for drift across published crates
batuta stack drift
# Output:
# 📦 Stack Drift Analysis
# ════════════════════════════════════════════════════════════
#
# trueno-rag 0.1.5:
# └─ trueno: 0.10.1 → 0.11.0 (MINOR)
#
# entrenar 0.5.0:
# └─ aprender: 0.21 → 0.23 (MINOR)
#
# repartir 2.0.0:
# └─ trueno: 0.6 → 0.11.0 (MAJOR)
#
# ⚠️ 3 crates with drift detected
# Generate fix commands
batuta stack drift --fix --workspace ~/src
# Output:
# cd ~/src/trueno-rag && sed -i 's/trueno = "0.10"/trueno = "0.11"/' Cargo.toml
# cd ~/src/entrenar && sed -i 's/aprender = "0.21"/aprender = "0.23"/' Cargo.toml
# cd ~/src/repartir && sed -i 's/trueno = "0.6"/trueno = "0.11"/' Cargo.toml
# JSON output for CI/tooling
batuta stack drift --format json
CI Integration
Add to your CI pipeline to catch drift early:
- name: Check Stack Drift
run: cargo run --quiet -- stack drift --quiet
# Exits 0 if no drift, 1 if drift detected
batuta stack gate
Enforce A- quality threshold across all PAIML stack components. This command is designed for CI/CD pipelines and pre-commit hooks to block releases or commits when any component falls below the quality threshold.
Usage
batuta stack gate [OPTIONS]
Options
| Option | Description |
|---|---|
| --workspace <PATH> | Path to workspace root (default: parent of current directory) |
| --quiet, -q | Quiet mode - only output on failure |
Quality Threshold
The quality gate enforces an A- minimum (SQI ≥ 85) for all stack components. Components below this threshold are blocked and will cause the gate to fail.
| Grade | SQI Range | Gate Status |
|---|---|---|
| A+ | 95-100% | PASS |
| A | 90-94% | PASS |
| A- | 85-89% | PASS |
| B+ | 80-84% | BLOCKED |
| B | 70-79% | BLOCKED |
| C | 60-69% | BLOCKED |
| D | 50-59% | BLOCKED |
| F | 0-49% | BLOCKED |
Enforcement Points
The quality gate is enforced at multiple points in the development workflow:
| Point | Trigger | Action |
|---|---|---|
| Pre-commit | git push | Blocks push if any component < A- |
| Release | batuta stack release | Blocks release by default (use --no-verify to skip) |
| CI Pipeline | Pull request | Blocks PR merge if quality gate fails |
| Manual | make stack-gate | Returns exit code 1 if failed |
Examples
# Run quality gate check
batuta stack gate
# Output:
# ╔════════════════════════════════════════════════════╗
# ║ Stack Quality Gate - A- Enforcement ║
# ╚════════════════════════════════════════════════════╝
#
# trueno SQI: 95.9 Grade: A+ ✅ PASS
# aprender SQI: 96.2 Grade: A+ ✅ PASS
# batuta SQI: 94.1 Grade: A ✅ PASS
# ...
#
# ✅ All 21 components meet A- quality threshold
# Quiet mode for CI (only outputs on failure)
batuta stack gate --quiet
# Check specific workspace
batuta stack gate --workspace /path/to/paiml
Exit Codes
| Code | Meaning |
|---|---|
| 0 | All components pass the quality gate |
| 1 | One or more components are below A- threshold |
Pre-commit Hook Configuration
Add to .pre-commit-config.yaml:
- repo: local
hooks:
- id: stack-quality-gate
name: Stack Quality Gate (A- enforcement)
entry: cargo run --quiet -- stack gate
language: system
pass_filenames: false
stages: [push]
Makefile Targets
stack-gate: ## Quality gate enforcement
@cargo run --quiet -- stack gate
stack-quality: ## Show detailed quality matrix
@cargo run --quiet -- stack quality
batuta stack quality
Analyze quality metrics across the PAIML stack using PMAT integration.
This command evaluates each stack component against the Stack Quality Matrix, which includes:
- Rust Project Score (0-114): Code quality, testing, documentation
- Repository Score (0-110): CI/CD, security, community health
- README Score (0-20): Documentation completeness
- Hero Image: Visual branding presence
Usage
batuta stack quality [OPTIONS] [COMPONENT]
Options
| Option | Description |
|---|---|
| --strict | Require A+ grade for all components |
| --format <FORMAT> | Output format: text (default), json |
| --verify-hero | Verify hero image exists and meets requirements |
| --verbose | Show detailed scoring breakdown |
| --workspace <PATH> | Path to workspace root |
Quality Grades
| Grade | SQI Range | Description |
|---|---|---|
| A+ | 95-100% | Exceptional quality |
| A | 90-94% | Excellent quality |
| A- | 85-89% | Very good quality |
| B+ | 80-84% | Good quality |
| B | 70-79% | Acceptable quality |
| C | 60-69% | Needs improvement |
| D | 50-59% | Poor quality |
| F | 0-49% | Failing quality |
Stack Quality Index (SQI)
The SQI is calculated as a weighted composite:
SQI = 0.40 × Rust Score + 0.30 × Repo Score + 0.20 × README Score + 0.10 × Hero Score
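The raw scores use different scales (0-114, 0-110, 0-20), so each presumably gets normalized to a percentage before the weights apply. A sketch under that assumption (the normalization is inferred, not taken from batuta's source):
// Assumes each raw score is normalized to 0-100 by its maximum and
// hero-image presence scores 0 or 100.
fn sqi(rust_raw: f64, repo_raw: f64, readme_raw: f64, has_hero: bool) -> f64 {
    let rust = rust_raw / 114.0 * 100.0;
    let repo = repo_raw / 110.0 * 100.0;
    let readme = readme_raw / 20.0 * 100.0;
    let hero = if has_hero { 100.0 } else { 0.0 };
    0.40 * rust + 0.30 * repo + 0.20 * readme + 0.10 * hero
}
// e.g. sqi(110.0, 102.0, 19.0, true) ≈ 95.4, which grades A+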
Examples
# Check quality of all stack components
batuta stack quality
# Output:
# Stack Quality Report
# ====================
#
# trueno A+ (SQI: 97.2%)
# aprender A (SQI: 92.1%)
# batuta A+ (SQI: 96.8%)
# ...
#
# Summary: 18/25 components at A+ grade
# Overall Stack Grade: A
# Check specific component with verbose output
batuta stack quality trueno --verbose
# Strict mode for CI (fails if any component below A+)
batuta stack quality --strict
# JSON output for tooling
batuta stack quality --format json
# Verify hero images exist
batuta stack quality --verify-hero
Hero Image Requirements
A hero image is required for A+ grade and must be:
- Located at `docs/hero.svg` (preferred) or `docs/hero.png`
- Can also be referenced as the first image in README.md
- SVG format preferred for scalability and crisp rendering
- If using PNG: minimum dimensions 1280x640 pixels
batuta stack release
Coordinate releases with automatic dependency ordering.
Usage
batuta stack release [OPTIONS] [CRATE_NAME]
Options
| Option | Description |
|---|---|
| --all | Release all crates with changes |
| --dry-run | Show what would be released |
| --bump <TYPE> | Version bump: patch, minor, major |
| --no-verify | Skip quality gate verification |
| --yes | Skip interactive confirmation |
| --publish | Publish to crates.io |
Examples
# Dry run to see release plan
batuta stack release --all --dry-run
# Release specific crate (and its dependencies)
batuta stack release trueno --bump patch
# Full release with publish
batuta stack release --all --bump minor --publish --yes
batuta stack status
Show health dashboard for the entire stack.
Usage
batuta stack status [OPTIONS]
Options
| Option | Description |
|---|---|
| --simple | Simple text output (no TUI) |
| --format <FORMAT> | Output format: text, json, markdown |
| --tree | Show dependency tree |
batuta stack sync
Synchronize dependency versions across the stack.
Usage
batuta stack sync [OPTIONS] [CRATE_NAME]
Options
| Option | Description |
|---|---|
| --all | Sync all crates |
| --dry-run | Show what would change |
| --align <DEP=VER> | Align specific dependency version |
Examples
# Sync all crates
batuta stack sync --all --dry-run
# Align arrow version across stack
batuta stack sync --all --align "arrow=54.0"
batuta stack versions
Check latest versions of PAIML stack crates from crates.io.
Usage
batuta stack versions [OPTIONS]
Options
| Option | Description |
|---|---|
| --outdated | Only show crates with newer versions available |
| --format <FORMAT> | Output format: text (default), json |
| --offline | Skip network requests (use cached data only) |
| --include-prerelease | Include pre-release versions |
Examples
# Check all stack versions
batuta stack versions
# Output:
# 📦 PAIML Stack Versions
# ════════════════════════════════════════════════════════════
# Crate Latest Downloads Description
# ────────────────────────────────────────────────────────────
# trueno 0.8.8 6.3K High-performance SIMD...
# aprender 0.19.0 5.5K Next-generation ML...
# ...
# JSON output for scripting
batuta stack versions --format json
# Only outdated
batuta stack versions --outdated
batuta stack publish-status
Check publish status of all PAIML stack repos with O(1) caching.
This command scans the local workspace for PAIML crates and shows which need publishing. It uses content-addressable caching for O(1) lookups on unchanged repos.
Usage
batuta stack publish-status [OPTIONS]
Options
| Option | Description |
|---|---|
| --format <FORMAT> | Output format: text (default), json |
| --workspace <PATH> | Workspace root (parent directory containing stack crates) |
| --clear-cache | Clear cache and force refresh |
Performance
The publish-status command uses intelligent caching for fast repeated queries:
| Scenario | Time | Description |
|---|---|---|
| Cold cache | ~7s | First run, fetches all data from crates.io |
| Warm cache | <100ms | Subsequent runs, O(1) hash-based lookups |
Cache Invalidation
The cache is automatically invalidated when:
- `Cargo.toml` content changes
- Git HEAD moves (new commit)
- crates.io TTL expires (15 minutes)
Cache is stored at ~/.cache/batuta/publish-status.json.
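A sketch of the validity test implied by those three rules; the entry fields are assumptions about what publish-status.json stores, with content hashing via the blake3 crate:
use std::time::{Duration, SystemTime};

// Assumed shape of one cached entry in publish-status.json.
struct CacheEntry {
    cargo_toml_hash: blake3::Hash, // content hash of Cargo.toml
    git_head: String,              // commit the entry was computed at
    fetched_at: SystemTime,        // when crates.io was last queried
}

// The entry is served from cache only if none of the three rules fire.
fn is_valid(entry: &CacheEntry, cargo_toml: &[u8], head: &str) -> bool {
    entry.cargo_toml_hash == blake3::hash(cargo_toml)
        && entry.git_head == head
        && SystemTime::now()
            .duration_since(entry.fetched_at)
            .map(|age| age < Duration::from_secs(15 * 60)) // crates.io TTL
            .unwrap_or(false)
}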
Actions
| Symbol | Action | Description |
|---|---|---|
| ✓ | up to date | Local matches crates.io, repo is clean |
| 📝 | commit | Has uncommitted changes |
| 📦 | PUBLISH | Local version higher than crates.io |
| 🆕 | new | Not yet published to crates.io |
| ⚠️ | behind | Local version behind crates.io (unusual) |
| ❌ | error | Error checking status |
Examples
# Check publish status (fast with warm cache)
batuta stack publish-status
# Output:
# 📦 PAIML Stack Publish Status
# ═════════════════════════════════════════════════════════════════
# Crate Local crates.io Git Action
# ─────────────────────────────────────────────────────────────────
# trueno 0.8.8 0.8.8 clean ✓ up to date
# pacha 0.2.0 0.2.0 clean ✓ up to date
# depyler 3.21.0 3.20.0 33M 8? 📝 commit
# certeza 0.1.0 - clean 🆕 new
# ─────────────────────────────────────────────────────────────────
# 📊 20 crates: 1 publish, 12 commit, 6 up-to-date
# ⚡ 78ms (cache: 20 hits, 0 misses)
# Force cache refresh
batuta stack publish-status --clear-cache
# JSON output for CI/tooling
batuta stack publish-status --format json
Makefile Targets
stack-publish-status: ## Check which crates need publishing (O(1) cached)
@cargo run --quiet -- stack publish-status
stack-publish-status-refresh: ## Force refresh publish status cache
@cargo run --quiet -- stack publish-status --clear-cache
Toyota Way Principles
The stack commands embody Toyota Way principles:
| Principle | Implementation |
|---|---|
| Jidoka | Pre-flight checks stop broken releases |
| Just-in-Time | Pull-based release ordering |
| Heijunka | Version alignment across stack |
| Genchi Genbutsu | Real-time crates.io verification |
| Visual Management | Tree view with health indicators |
batuta hf
HuggingFace Hub integration commands.
Synopsis
batuta hf <COMMAND>
Commands
| Command | Description |
|---|---|
| catalog | Query 50+ HuggingFace ecosystem components |
| course | Query by Coursera course alignment |
| tree | Display HuggingFace ecosystem tree |
| search | Search models, datasets, spaces |
| info | Get info about a Hub asset |
| pull | Download from HuggingFace Hub |
| push | Upload to HuggingFace Hub |
batuta hf catalog
Query the HuggingFace ecosystem catalog with 51 components across 6 categories.
Usage
batuta hf catalog [OPTIONS]
Options
| Option | Description |
|---|---|
| --component <ID> | Get details for a specific component |
| --category <CAT> | Filter by category (hub, deployment, library, training, collaboration, community) |
| --tag <TAG> | Filter by tag (e.g., rlhf, lora, quantization) |
| --list | List all available components |
| --categories | List all categories with component counts |
| --tags | List all available tags |
| --format <FORMAT> | Output format: table (default), json |
Examples
# List all training components
batuta hf catalog --category training
# Output:
# 📦 HuggingFace Components
# ════════════════════════════════════════════════════════════
# peft PEFT Training & Optimization
# trl TRL Training & Optimization
# bitsandbytes Bitsandbytes Training & Optimization
# ...
# Get component details
batuta hf catalog --component peft
# Output:
# 📦 PEFT
# ════════════════════════════════════════════════════════════
# ID: peft
# Category: Training & Optimization
# Description: Parameter-efficient finetuning for large language models
# Docs: https://huggingface.co/docs/peft
# Repository: https://github.com/huggingface/peft
# PyPI: peft
# Tags: finetuning, lora, qlora, efficient
# Dependencies: transformers, bitsandbytes
# Course Alignments:
# Course 4, Week 1: 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8
# Search by tag
batuta hf catalog --tag rlhf
batuta hf catalog --tag quantization
Component Categories
| Category | Components | Description |
|---|---|---|
| Hub | 7 | Hub & client libraries (models, datasets, spaces) |
| Deployment | 7 | Inference & deployment (TGI, TEI, endpoints) |
| Library | 10 | Core ML libraries (transformers, diffusers, datasets) |
| Training | 10 | Training & optimization (PEFT, TRL, bitsandbytes) |
| Collaboration | 11 | Tools & integrations (Gradio, Argilla, agents) |
| Community | 6 | Community resources (blog, forum, leaderboards) |
batuta hf course
Query HuggingFace components aligned to Coursera specialization courses.
Usage
batuta hf course [OPTIONS]
Options
| Option | Description |
|---|---|
| --list | List all 5 courses with component counts |
| --course <N> | Show components for course N (1-5) |
| --week <N> | Filter by week (requires --course) |
Examples
# List all courses
batuta hf course --list
# Output:
# 📚 Pragmatic AI Labs HuggingFace Specialization
# ════════════════════════════════════════════════════════════
# 5 Courses | 15 Weeks | 60 Hours
#
# Course 1: Foundations of HuggingFace (9 components)
# Course 2: Fine-Tuning and Datasets (5 components)
# Course 3: RAG and Retrieval (3 components)
# Course 4: Advanced Training (RLHF, DPO, PPO) (3 components)
# Course 5: Production Deployment (8 components)
# Get Course 4 (Advanced Fine-Tuning)
batuta hf course --course 4
# Output:
# 📚 Course 4 - Advanced Training (RLHF, DPO, PPO)
# ════════════════════════════════════════════════════════════
# peft Week 1
# bitsandbytes Week 1
# trl Week 2, Week 3
Course Curriculum
| Course | Topic | Key Components |
|---|---|---|
| 1 | Foundations | transformers, tokenizers, safetensors, hub |
| 2 | Datasets & Fine-Tuning | datasets, trainer, evaluate |
| 3 | RAG & Retrieval | sentence-transformers, faiss, outlines |
| 4 | RLHF/DPO/PPO | peft, trl, bitsandbytes |
| 5 | Production | tgi, gradio, optimum, inference-endpoints |
batuta hf tree
Display hierarchical view of HuggingFace ecosystem or PAIML integration map.
Usage
batuta hf tree [OPTIONS]
Options
| Option | Description |
|---|---|
| --integration | Show PAIML↔HuggingFace integration map |
| --format <FORMAT> | Output format: ascii (default), json |
Examples
# HuggingFace ecosystem tree
batuta hf tree
# Output:
# HuggingFace Ecosystem (6 categories)
# ├── hub
# │ ├── models (700K+ models)
# │ ├── datasets (100K+ datasets)
# │ └── spaces (300K+ spaces)
# ├── libraries
# │ ├── transformers (Model architectures)
# │ └── ...
# PAIML-HuggingFace integration map
batuta hf tree --integration
# Output shows:
# ✓ COMPATIBLE - Interoperates with HF format/API
# ⚡ ALTERNATIVE - PAIML native replacement (pure Rust)
# 🔄 ORCHESTRATES - PAIML wraps/orchestrates HF
# 📦 USES - PAIML uses HF library directly
batuta hf search
Search HuggingFace Hub for models, datasets, or spaces.
Usage
batuta hf search <ASSET_TYPE> <QUERY> [OPTIONS]
Arguments
| Argument | Description |
|---|---|
| <ASSET_TYPE> | Type: model, dataset, space |
| <QUERY> | Search query string |
Options
| Option | Description |
|---|---|
| --task <TASK> | Filter by task (for models) |
| --limit <N> | Limit results (default: 10) |
Examples
# Search for Llama models
batuta hf search model "llama 7b" --task text-generation
# Search for speech datasets
batuta hf search dataset "common voice" --limit 5
# Search for Gradio spaces
batuta hf search space "image classifier"
batuta hf info
Get detailed information about a HuggingFace asset.
Usage
batuta hf info <ASSET_TYPE> <REPO_ID>
Examples
# Get model info
batuta hf info model "meta-llama/Llama-2-7b-hf"
# Get dataset info
batuta hf info dataset "mozilla-foundation/common_voice_13_0"
# Get space info
batuta hf info space "gradio/chatbot"
batuta hf pull
Download models, datasets, or spaces from HuggingFace Hub.
Usage
batuta hf pull <ASSET_TYPE> <REPO_ID> [OPTIONS]
Options
| Option | Description |
|---|---|
| -o, --output <PATH> | Output directory |
| --quantization <Q> | Model quantization (Q4_K_M, Q5_K_M, etc.) |
Examples
# Pull GGUF model with quantization
batuta hf pull model "TheBloke/Llama-2-7B-GGUF" --quantization Q4_K_M
# Pull to specific directory
batuta hf pull model "mistralai/Mistral-7B-v0.1" -o ./models/
# Pull dataset
batuta hf pull dataset "squad" -o ./data/
batuta hf push
Upload models, datasets, or spaces to HuggingFace Hub.
Usage
batuta hf push <ASSET_TYPE> <PATH> --repo <REPO_ID> [OPTIONS]
Options
| Option | Description |
|---|---|
| --repo <REPO_ID> | Target repository (required) |
| --message <MSG> | Commit message |
Examples
# Push trained model
batuta hf push model ./my-model --repo "myorg/my-classifier"
# Push dataset
batuta hf push dataset ./data/processed --repo "myorg/my-dataset"
# Push Presentar app as Space
batuta hf push space ./my-app --repo "myorg/demo" --message "Initial release"
PAIML-HuggingFace Integration
The integration map shows how PAIML stack components relate to HuggingFace (28 mappings):
| Category | PAIML | HuggingFace | Type |
|---|---|---|---|
| Formats | .apr | pickle/.joblib, safetensors, gguf | ⚡ Alternative |
| | realizar/gguf | gguf | ✓ Compatible |
| | realizar/safetensors | safetensors | ✓ Compatible |
| Data Formats | .ald | parquet/arrow, json/csv | ⚡ Alternative |
| Hub Access | aprender/hf_hub | huggingface_hub | 📦 Uses |
| | batuta/hf | huggingface_hub | 🔄 Orchestrates |
| Registry | pacha | HF Hub registry, MLflow/W&B | ⚡ Alternative |
| Inference | realizar | transformers, TGI | ⚡ Alternative |
| | realizar/moe | optimum | ⚡ Alternative |
| Classical ML | aprender | sklearn, xgboost/lightgbm | ⚡ Alternative |
| Deep Learning | entrenar | PyTorch training | ⚡ Alternative |
| | alimentar | datasets | ⚡ Alternative |
| Compute | trueno | NumPy/PyTorch tensors | ⚡ Alternative |
| | repartir | accelerate | ⚡ Alternative |
| Tokenization | realizar/tokenizer | tokenizers | ✓ Compatible |
| | trueno-rag | tokenizers | ✓ Compatible |
| Apps | presentar | gradio | ⚡ Alternative |
| | trueno-viz | visualization | ⚡ Alternative |
| Quality | certeza | evaluate | ⚡ Alternative |
| MCP Tooling | pforge | LangChain Tools | ⚡ Alternative |
| | pmat | code analysis tools | ⚡ Alternative |
| | pmcp | mcp-sdk | ⚡ Alternative |
Legend:
- ✓ COMPATIBLE - Interoperates with HF format/API
- ⚡ ALTERNATIVE - PAIML native replacement (pure Rust)
- 🔄 ORCHESTRATES - PAIML wraps/orchestrates HF
- 📦 USES - PAIML uses HF library directly
Compatible Formats
PAIML can load and save HuggingFace formats:
#![allow(unused)]
fn main() {
// Load GGUF model (realizar)
let model = GGUFModel::from_file("model.gguf")?;
// Load SafeTensors (aprender)
let weights = SafeTensors::load("model.safetensors")?;
// Load HF tokenizer (realizar)
let tokenizer = Tokenizer::from_pretrained("meta-llama/Llama-2-7b-hf")?;
}
Security Features (v1.1.0)
SafeTensors Enforcement
By default, batuta hf pull blocks unsafe pickle-based formats:
# Default: blocks .bin, .pkl, .pt files
batuta hf pull model "repo/model"
# Explicit override for unsafe formats
batuta hf pull model "repo/model" --allow-unsafe
| Extension | Safety | Notes |
|---|---|---|
| .safetensors | ✓ Safe | Recommended |
| .gguf | ✓ Safe | Quantized |
| .json | ✓ Safe | Config |
| .bin | ✗ Unsafe | Pickle-based |
| .pkl | ✗ Unsafe | Pickle |
| .pt | ✗ Unsafe | PyTorch |
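A minimal sketch of the allowlist check described above (the function name and exact rules are illustrative, not batuta's actual implementation):
fn main() {
    // Extensions considered safe to download by default
    fn is_safe_artifact(path: &str) -> bool {
        const SAFE: &[&str] = &[".safetensors", ".gguf", ".json"];
        SAFE.iter().any(|ext| path.ends_with(ext))
    }
    assert!(is_safe_artifact("model.safetensors"));
    assert!(!is_safe_artifact("pytorch_model.bin")); // pickle-based: blocked unless --allow-unsafe
}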
Secret Scanning
Automatic scan before push blocks accidental credential exposure:
# Blocked if secrets detected
batuta hf push model ./my-model --repo "org/model"
# Detected patterns:
# - .env files
# - Private keys (.pem, id_rsa)
# - Credential files
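The scan amounts to simple path and content rules. A hedged sketch, mirroring the patterns listed above (the rules are illustrative, not batuta's actual scanner; assumes the regex crate):
fn main() {
    use regex::Regex;
    // Illustrative detection rules for paths and key material
    fn looks_sensitive(path: &str, contents: &str) -> bool {
        let private_key = Regex::new(r"-----BEGIN (RSA |EC )?PRIVATE KEY-----").unwrap();
        path.ends_with(".env")
            || path.ends_with(".pem")
            || path.ends_with("id_rsa")
            || private_key.is_match(contents)
    }
    assert!(looks_sensitive(".env", ""));
    assert!(looks_sensitive("key.pem", "-----BEGIN EC PRIVATE KEY-----"));
}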
Rate Limit Handling
Automatic exponential backoff for API rate limits (429):
- Initial: 1s → 2s → 4s → 8s → 16s
- Max backoff: 60s
- Max retries: 5
- Respects the Retry-After header (sketched below)
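The schedule above translates directly into code. A minimal sketch, not the actual batuta client:
fn main() {
    use std::time::Duration;
    // 1s initial delay, doubling per attempt, capped at 60s, max 5 retries
    let delays: Vec<Duration> = (0..5u32)
        .map(|attempt| Duration::from_secs((1u64 << attempt).min(60)))
        .collect();
    // In the real client, a Retry-After header overrides the computed delay.
    println!("{:?}", delays); // [1s, 2s, 4s, 8s, 16s]
}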
Model Card Auto-Generation
# Auto-generates README.md if missing
batuta hf push model ./my-model --repo "org/model"
Generated card includes:
- YAML frontmatter (license, tags)
- Training metrics from certeza
- PAIML stack attribution
Differential Uploads
Only uploads changed files using content-addressable hashing:
# Only uploads modified files
batuta hf push model ./my-model --repo "org/model"
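Conceptually the check behaves like this sketch, assuming BLAKE3 content addressing as used elsewhere in the stack (the manifest helper is hypothetical):
fn load_manifest_digest() -> String {
    // Hypothetical: digest recorded by the previous push
    String::new()
}
fn main() -> std::io::Result<()> {
    // Hash the local file and compare against the last-pushed digest
    let bytes = std::fs::read("model.safetensors")?;
    let digest = blake3::hash(&bytes).to_hex().to_string();
    if digest != load_manifest_digest() {
        println!("changed - upload");
    } else {
        println!("unchanged - skip");
    }
    Ok(())
}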
Environment Variables
| Variable | Description |
|---|---|
| HF_TOKEN | HuggingFace API token |
| HF_HOME | Cache directory |
| HF_HUB_OFFLINE | Offline mode |
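For example (the token value is a placeholder):
# Authenticate and choose a cache location
export HF_TOKEN=hf_xxxxxxxxxxxx
export HF_HOME=~/.cache/huggingface
batuta hf pull model "mistralai/Mistral-7B-v0.1"
# Resolve subsequent commands against the local cache only
HF_HUB_OFFLINE=1 batuta hf info model "mistralai/Mistral-7B-v0.1"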
batuta data
Data platforms integration commands for visualizing and querying the enterprise data ecosystem.
Synopsis
batuta data <COMMAND> [OPTIONS]
Commands
| Command | Description |
|---|---|
tree | Display data platforms ecosystem tree |
Global Options
| Option | Description |
|---|---|
| -v, --verbose | Enable verbose output |
| -d, --debug | Enable debug output |
| -h, --help | Print help |
batuta data tree
Display hierarchical visualization of data platforms and their components, or show PAIML stack integration mappings.
Usage
batuta data tree [OPTIONS]
Options
| Option | Description | Default |
|---|---|---|
--platform <NAME> | Filter by platform (databricks, snowflake, aws, huggingface) | All platforms |
--integration | Show PAIML integration mappings instead of platform tree | false |
--format <FORMAT> | Output format (ascii, json) | ascii |
Examples
View All Platforms
$ batuta data tree
DATA PLATFORMS ECOSYSTEM
========================
DATABRICKS
├── Unity Catalog
│ └── Unity Catalog
│ ├── Schemas
│ ├── Tables
│ └── Views
├── Delta Lake
│ └── Delta Lake
│ ├── Parquet storage
│ ├── Transaction log
│ └── Time travel
...
Filter by Platform
$ batuta data tree --platform snowflake
SNOWFLAKE
├── Virtual Warehouse
│ └── Virtual Warehouse
│ ├── Compute clusters
│ ├── Result cache
│ └── Auto-scaling
├── Iceberg Tables
│ └── Iceberg Tables
│ ├── Open format
│ ├── Schema evolution
│ └── Partition pruning
├── Snowpark
│ └── Snowpark
│ ├── Python UDFs
│ ├── Java/Scala UDFs
│ └── ML functions
└── Data Sharing
└── Data Sharing
├── Secure shares
├── Reader accounts
└── Marketplace
View Integration Mappings
$ batuta data tree --integration
PAIML ↔ DATA PLATFORMS INTEGRATION
==================================
STORAGE & CATALOGS
├── [ALT] Alimentar (.ald) ←→ Delta Lake
├── [CMP] Alimentar (.ald) ←→ Iceberg Tables
├── [CMP] Alimentar (sync) ←→ S3
├── [ALT] Pacha Registry ←→ Unity Catalog
├── [ALT] Pacha Registry ←→ Glue Catalog
├── [ALT] Pacha Registry ←→ HuggingFace Hub
COMPUTE & PROCESSING
├── [ALT] Trueno ←→ Spark DataFrames
├── [ALT] Trueno ←→ Snowpark
├── [ALT] Trueno ←→ EMR
├── [TRN] Depyler → Rust ←→ Snowpark Python
├── [TRN] Depyler → Rust ←→ Lambda Python
├── [ALT] Trueno-Graph ←→ Neptune/GraphQL
ML TRAINING
├── [ALT] Aprender ←→ MLlib
├── [ALT] Aprender ←→ Snowpark ML
├── [ALT] Entrenar ←→ SageMaker Training
├── [ALT] Entrenar ←→ MLflow Tracking
├── [ALT] Entrenar ←→ SageMaker Experiments
├── [USE] Entrenar ←→ W&B
MODEL SERVING
├── [ALT] Realizar ←→ MLflow Serving
├── [ALT] Realizar ←→ SageMaker Endpoints
├── [ALT] Realizar + serve ←→ Bedrock
├── [USE] Realizar ←→ GGUF models
├── [CMP] Realizar (via GGUF) ←→ HF Transformers
ORCHESTRATION
├── [ORC] Batuta ←→ Databricks Workflows
├── [ORC] Batuta ←→ Snowflake Tasks
├── [ORC] Batuta ←→ Step Functions
├── [ORC] Batuta ←→ Airflow/Prefect
Legend: [CMP]=Compatible [ALT]=Alternative [USE]=Uses
[TRN]=Transpiles [ORC]=Orchestrates
Summary: 3 compatible, 16 alternatives, 2 uses, 2 transpiles, 4 orchestrates
Total: 27 integration points
JSON Output
$ batuta data tree --platform databricks --format json
{
"platform": "Databricks",
"categories": [
{
"name": "Unity Catalog",
"components": [
{
"name": "Unity Catalog",
"description": "Unified governance for data and AI",
"sub_components": ["Schemas", "Tables", "Views"]
}
]
},
...
]
}
$ batuta data tree --integration --format json
[
{
"platform_component": "Delta Lake",
"paiml_component": "Alimentar (.ald)",
"integration_type": "Alternative",
"category": "STORAGE & CATALOGS"
},
...
]
Integration Type Legend
| Code | Type | Meaning |
|---|---|---|
| CMP | Compatible | Direct interoperability with PAIML component |
| ALT | Alternative | PAIML provides a sovereign replacement |
| USE | Uses | PAIML component consumes this as input |
| TRN | Transpiles | Depyler converts source code to Rust |
| ORC | Orchestrates | Batuta can coordinate external workflows |
Supported Platforms
| Platform | Description |
|---|---|
| databricks | Unity Catalog, Delta Lake, MLflow, Spark |
| snowflake | Virtual Warehouse, Iceberg, Snowpark, Data Sharing |
| aws | S3, Glue, SageMaker, Bedrock, EMR, Lambda |
| huggingface | Hub, Transformers, Datasets, Inference API |
See Also
- batuta hf - HuggingFace Hub operations
- batuta stack - PAIML stack management
- batuta oracle - Intelligent query interface
- Data Platforms Integration - Detailed documentation
batuta viz
Visualization frameworks ecosystem commands for viewing Python framework hierarchies and their PAIML Rust replacements.
Synopsis
batuta viz <COMMAND> [OPTIONS]
Commands
| Command | Description |
|---|---|
tree | Display visualization frameworks ecosystem tree |
Global Options
| Option | Description |
|---|---|
| -v, --verbose | Enable verbose output |
| -d, --debug | Enable debug output |
| -h, --help | Print help |
batuta viz tree
Display hierarchical visualization of Python frameworks and their PAIML Rust replacements, or show component replacement mappings.
Usage
batuta viz tree [OPTIONS]
Options
| Option | Description | Default |
|---|---|---|
--framework <NAME> | Filter by framework (gradio, streamlit, panel, dash) | All frameworks |
--integration | Show PAIML replacement mappings | false |
--format <FORMAT> | Output format (ascii, json) | ascii |
Examples
View All Frameworks
$ batuta viz tree
VISUALIZATION FRAMEWORKS ECOSYSTEM
==================================
GRADIO (Python) → Presentar (Rust)
├── Interface
│ └── Interface → Presentar::QuickApp
│ ├── Inputs
│ ├── Outputs
│ └── Examples
├── Blocks
│ └── Blocks → Presentar::Layout
│ ├── Layout
│ ├── Events
│ └── State
├── Components
│ ├── Image → Trueno-Viz::ImageView
│ ├── Audio → Presentar::AudioPlayer
│ ├── Video → Presentar::VideoPlayer
│ ├── Chatbot → Realizar + Presentar
│ ├── DataFrame → Trueno-Viz::DataGrid
│ └── Plot → Trueno-Viz::Chart
└── Deployment
└── Deployment → Batuta deploy
STREAMLIT (Python) → Presentar (Rust)
...
PANEL (Python) → Trueno-Viz (Rust)
...
DASH (Python) → Presentar + Trueno-Viz (Rust)
...
Summary: 4 Python frameworks replaced by 2 Rust libraries
Filter by Framework
$ batuta viz tree --framework gradio
GRADIO (Python) → Presentar (Rust)
├── Interface
│ └── Interface → Presentar::QuickApp
│ ├── Inputs
│ ├── Outputs
│ └── Examples
├── Blocks
│ └── Blocks → Presentar::Layout
├── Components
│ ├── Image → Trueno-Viz::ImageView
│ ├── Audio → Presentar::AudioPlayer
│ ├── Video → Presentar::VideoPlayer
│ ├── Chatbot → Realizar + Presentar
│ ├── DataFrame → Trueno-Viz::DataGrid
│ └── Plot → Trueno-Viz::Chart
└── Deployment
└── Deployment → Batuta deploy
View Replacement Mappings
$ batuta viz tree --integration
PAIML REPLACEMENTS FOR PYTHON VIZ
=================================
UI FRAMEWORKS
├── [REP] Presentar::QuickApp ← gr.Interface
├── [REP] Presentar::Layout ← gr.Blocks
├── [REP] Presentar::App ← dash.Dash
├── [REP] Presentar::Layout ← st.columns/sidebar
VISUALIZATION
├── [REP] Trueno-Viz::Chart ← dcc.Graph
├── [REP] Trueno-Viz::Chart ← st.plotly_chart
├── [REP] Trueno-Viz::DataGrid ← st.dataframe
├── [REP] Trueno-Viz::DataGrid ← dash_table
├── [REP] Trueno-Viz::GPURaster ← datashader
├── [REP] Trueno-Viz::Plot ← matplotlib/plotly/bokeh
COMPONENTS
├── [REP] Presentar::TextInput ← st.text_input
├── [REP] Presentar::Slider ← st.slider
├── [REP] Presentar::Select ← st.selectbox
├── [REP] Presentar::Button ← st.button
├── [REP] Trueno-Viz::ImageView ← gr.Image
STATE & CACHING
├── [REP] Presentar::State ← st.session_state
├── [REP] Trueno::TensorCache ← @st.cache_data
├── [REP] Presentar::on_event ← @callback
DEPLOYMENT
├── [REP] Batuta deploy ← HuggingFace Spaces
├── [REP] Batuta deploy ← Streamlit Cloud
├── [REP] Batuta deploy ← Dash Enterprise
Legend: [REP]=Replaces (Python eliminated)
Summary: 21 Python components replaced by sovereign Rust alternatives
Zero Python dependencies in production
JSON Output
$ batuta viz tree --framework streamlit --format json
{
"framework": "Streamlit",
"replacement": "Presentar",
"categories": [
{
"name": "Widgets",
"components": [
{
"name": "Input",
"description": "User input widgets",
"replacement": "Presentar::Widgets",
"sub_components": ["text_input", "number_input", "slider", "selectbox"]
}
]
}
]
}
Integration Type Legend
| Code | Type | Meaning |
|---|---|---|
| REP | Replaces | PAIML component fully replaces Python equivalent |
Note: All mappings are REP (Replaces) - Python is completely eliminated from production deployments.
Supported Frameworks
| Framework | PAIML Replacement | Description |
|---|---|---|
| gradio | Presentar | ML demo interfaces |
| streamlit | Presentar | Data apps and dashboards |
| panel | Trueno-Viz | HoloViz ecosystem visualizations |
| dash | Presentar + Trueno-Viz | Plotly enterprise dashboards |
See Also
- batuta data - Data platforms integration
- batuta hf - HuggingFace Hub operations
- Visualization Frameworks - Detailed documentation
batuta content
Content creation tooling for generating structured prompts for educational and technical content.
Overview
The content command provides tools for generating LLM prompts that follow Toyota Way principles, ensuring high-quality, structured content generation.
Subcommands
batuta content emit
Generate a structured prompt for content creation.
batuta content emit [OPTIONS] --type <TYPE>
Options:
| Option | Short | Description |
|---|---|---|
| --type | -t | Content type: hlo, dlo, bch, blp, pdm |
| --title | | Title or topic for the content |
| --audience | | Target audience |
| --word-count | | Target word count |
| --level | -l | Course level for detailed outlines: short, standard, extended |
| --source-context | | Source context paths (comma-separated) |
| --show-budget | | Show token budget breakdown |
| --output | -o | Output file (default: stdout) |
Content Types:
| Code | Name | Format | Length |
|---|---|---|---|
| hlo | High-Level Outline | YAML/Markdown | 200-1000 lines |
| dlo | Detailed Outline | YAML/Markdown | 200-1000 lines |
| bch | Book Chapter | Markdown (mdBook) | 2000-5000 words |
| blp | Blog Post | Markdown (Zola) | 1000-2500 words |
| pdm | Presentar Demo | YAML/Markdown | N/A |
Course Levels
For detailed outlines (dlo), configure the course structure using --level:
| Level | Weeks | Modules | Videos/Module | Weekly Objectives |
|---|---|---|---|---|
| short | 1 | 2 | 3 | No |
| standard | 3 | 3 | 5 | Yes (3 per week) |
| extended | 6 | 6 | 5 | Yes (3 per week) |
All courses include:
- Course description (2-3 sentences)
- 3 course-level learning objectives
- Per module: videos + quiz + reading + lab
Examples:
# Short course (1 week, 2 modules)
batuta content emit -t dlo --title "Quick Start" --level short
# Standard course (3 weeks, 3 modules) - default
batuta content emit -t dlo --title "Complete Course"
# Extended course (6 weeks, 6 modules)
batuta content emit -t dlo --title "Masterclass" --level extended
# Book chapter with audience
batuta content emit -t bch --title "Error Handling" --audience "Beginners"
# Blog post with word count
batuta content emit -t blp --title "Why Rust?" --word-count 1500
batuta content validate
Validate generated content against quality constraints.
batuta content validate --type <TYPE> <FILE>
Options:
| Option | Short | Description |
|---|---|---|
| --type | -t | Content type to validate against |
| --llm-judge | | Use LLM-as-a-Judge for style validation |
Example:
batuta content validate -t bch chapter.md
batuta content types
List all available content types.
batuta content types
Toyota Way Integration
The content module implements Toyota Way principles:
| Principle | Implementation |
|---|---|
| Jidoka | LLM-as-a-Judge validation catches quality issues |
| Poka-Yoke | Structural constraints in templates prevent mistakes |
| Genchi Genbutsu | Source context mandate grounds content in reality |
| Heijunka | Token budgeting levels context usage |
| Kaizen | Dynamic template composition enables improvement |
Output Schema (Detailed Outline)
type: detailed_outline
version: "1.0"
course:
title: string
description: string (2-3 sentences)
duration_weeks: int
total_modules: int
learning_objectives:
- objective: string
- objective: string
- objective: string
weeks: # Only for standard/extended
- week: 1
learning_objectives:
- objective: string
- objective: string
- objective: string
modules:
- id: module_1
week: 1
title: string
description: string
learning_objectives:
- objective: string
videos:
- id: video_1_1
title: string
duration_minutes: int (5-15)
reading:
title: string
duration_minutes: int (15-30)
quiz:
title: string
num_questions: int (5-10)
lab:
title: string
duration_minutes: int (30-60)
Navigate: Table of Contents | CLI Overview
batuta falsify
The falsify command runs the Popperian Falsification Checklist - a 108-item quality assurance protocol based on Toyota Production System (TPS) principles and the scientific method.
Usage
# Run full checklist on current directory
batuta falsify .
# Run on a specific project
batuta falsify /path/to/project
# Output JSON format
batuta falsify . --json
# Critical checks only (fast mode)
batuta falsify . --critical-only
Overview
The checklist implements Sir Karl Popper’s falsification principle: every claim must have explicit rejection criteria. Each of the 108 items is a falsifiable claim about the project’s quality.
Sections
The checklist is organized into 10 sections:
| Section | Items | Focus |
|---|---|---|
| 1. Sovereign Data Governance | 15 | Data residency, privacy, consent |
| 2. ML Technical Debt Prevention | 10 | CACE, entanglement, dead code |
| 3. Hypothesis-Driven Development | 13 | Reproducibility, baselines, statistics |
| 4. Numerical Reproducibility | 15 | IEEE754, cross-platform determinism |
| 5. Performance & Waste Elimination | 15 | PCIe rule, SIMD, latency SLAs |
| 6. Safety & Formal Verification | 10 | Memory safety, fuzzing, Miri |
| 7. Jidoka Automated Gates | 10 | CI/CD circuit breakers |
| 8. Model Cards & Auditability | 10 | Documentation, provenance |
| 9. Cross-Platform & API | 5 | Linux/macOS/Windows, WASM |
| 10. Architectural Invariants | 5 | YAML config, pure Rust testing |
TPS Grades
Results are graded using Toyota Production System terminology:
| Grade | Score | Meaning |
|---|---|---|
| Toyota Standard | 95-100% | Production ready |
| Kaizen Required | 85-94% | Acceptable with improvements |
| Andon Warning | 70-84% | Issues require attention |
| Stop the Line | <70% | Critical issues block release |
Severity Levels
Each check has a severity level:
- Critical: Blocks release if failed
- Major: Requires remediation plan
- Minor: Should be documented
- Info: Informational only
Example Output
╔═══════════════════════════════════════════════════════════════════╗
║ POPPERIAN FALSIFICATION CHECKLIST - Sovereign AI Protocol ║
╚═══════════════════════════════════════════════════════════════════╝
Project: .
Evaluated: 2025-12-11T12:00:00+00:00
Grade: ◐ Kaizen Required
Score: 88.9%
Items: 84/108 passed, 0 failed
─── Jidoka Automated Gates ───
✓ JA-01 Pre-Commit Hook Enforcement [MAJOR]
✓ JA-02 Automated Sovereignty Linting [MAJOR]
✓ JA-03 Data Drift Circuit Breaker [MAJOR]
...
✅ All critical checks passed - Release allowed
Integration with CI
Add to your CI pipeline:
- name: Quality Gate
run: |
batuta falsify . --json > falsification-report.json
# Fail if critical checks fail
batuta falsify . --critical-only || exit 1
TPS Principles Applied
The checklist embodies Toyota Way principles:
- Jidoka: Automated gates stop on quality issues
- Genchi Genbutsu: Evidence-based verification
- Kaizen: Continuous improvement through feedback
- Muda: Waste detection and elimination
- Poka-Yoke: Error-proofing through constraints
Related Commands
- batuta stack quality - Stack-wide quality metrics
- batuta analyze - Project analysis
Migration Strategy
This chapter is under development.
Coming soon: Detailed information about migration strategy.
Navigate: Table of Contents
Greenfield Brownfield
This chapter is under development.
Coming soon: Detailed information about greenfield brownfield.
Navigate: Table of Contents
Risk Assessment
This chapter is under development.
Coming soon: Detailed information about risk assessment.
Navigate: Table of Contents
Rollback
This chapter is under development.
Coming soon: Detailed information about rollback.
Navigate: Table of Contents
Testing Strategy
This chapter is under development.
Coming soon: Detailed information about testing strategy.
Navigate: Table of Contents
Test Migration
This chapter is under development.
Coming soon: Detailed information about test migration.
Navigate: Table of Contents
Property Testing
This chapter is under development.
Coming soon: Detailed information about property testing.
Navigate: Table of Contents
Regression
This chapter is under development.
Coming soon: Detailed information about regression.
Navigate: Table of Contents
Performance
This chapter is under development.
Coming soon: Detailed information about performance.
Navigate: Table of Contents
Profiling and Performance Tuning
This chapter documents performance profiling techniques and optimization discoveries from the Sovereign AI Stack.
Thread Pool Optimization
The 2.05x Discovery
A major performance breakthrough was discovered through systematic profiling: reducing thread count from 48 to 16 yielded a 2.05x speedup in CPU inference.
| Metric | 48 Threads | 16 Threads | Improvement |
|---|---|---|---|
| Throughput | 12.4 tok/s | 25.4 tok/s | 2.05x |
| Overhead | 3.5x | 1.7x | 2.06x |
| Per-token latency | 80.6 ms | 39.4 ms | 2.05x |
Root Cause Analysis
The default rayon thread pool uses all available logical cores (hyperthreads). For small work units like single-token inference, this causes:
- Cache line bouncing - 48 threads invalidating L1/L2 constantly
- False sharing - Adjacent output writes causing coherency traffic
- Hyperthread contention - HT pairs fighting for same FPU
- Rayon sync overhead - Work units too small for 48-way split
Optimal Thread Count Formula
Optimal threads = min(physical_cores, work_size / cache_line_size)
For Qwen 1.5B with 1536 hidden dimension:
- 1536 elements / 16 elements per cache line = 96 cache lines
- 12-16 threads = 6-8 cache lines per thread (optimal)
- 48 threads = 2 cache lines per thread (too fine-grained)
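The formula and the worked numbers above translate directly into code; a sketch (the helper name is illustrative):
fn main() {
    // min(physical_cores, work_size / elements_per_cache_line)
    fn optimal_threads(physical_cores: usize, work_size: usize) -> usize {
        const F32_PER_LINE: usize = 64 / 4; // 64-byte cache line / 4-byte f32
        physical_cores.min(work_size / F32_PER_LINE)
    }
    // Qwen 1.5B hidden dim 1536 → 96 cache lines; with 16 physical cores:
    assert_eq!(optimal_threads(16, 1536), 16);
}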
Implementation
The configure_optimal_thread_pool() function in realizar sets the optimal thread count:
#![allow(unused)]
fn main() {
use realizar::inference::configure_optimal_thread_pool;
// Set to 16 threads (or physical core count)
configure_optimal_thread_pool();
// Or set explicitly via environment
std::env::set_var("RAYON_NUM_THREADS", "16");
}
Profiling Tools
Micro-Level Profiling
cargo run --release --example micro_profile
Profiles individual operations (matmul, attention, FFN) to identify bottlenecks.
Layer-Level Profiling
cargo run --release --example layer_profile
Profiles generation timing to measure per-token latency and throughput.
Thread Sweep
for t in 8 10 12 14 16 18 20 24 32 48; do
echo "=== $t threads ==="
RAYON_NUM_THREADS=$t cargo run --release --example instrumented_forward 2>&1 | grep -E "Throughput|Per token"
done
Results Interpretation
| Symptom | Likely Cause | Solution |
|---|---|---|
| Low throughput, high thread count | Thread overhead | Reduce threads |
| Low bandwidth utilization (<20%) | Compute-bound | SIMD optimization |
| High bandwidth, low throughput | Memory-bound | Better tiling |
| Variable latency | Cache thrashing | Thread affinity |
Tile-Level Profiling (TILING-SPEC-001)
Trueno’s BrickProfiler supports hierarchical tile profiling:
#![allow(unused)]
fn main() {
use trueno::{BrickProfiler, TileLevel};
let mut profiler = BrickProfiler::new();
profiler.enable_tile_profiling();
// Profile a macro tile (L3/Global memory level)
let timer = profiler.start_tile(TileLevel::Macro, 0, 0);
// ... execute computation ...
profiler.stop_tile(timer, elements, flops);
// Get results
println!("{}", profiler.tile_summary());
}
Tile Hierarchy
| Level | Memory | Typical Size | Use Case |
|---|---|---|---|
| Macro | L3/Global | 32MB | Layer-level |
| Midi | L2/Shared | 256KB | Head-level |
| Micro | L1/Registers | 32KB | SIMD-level |
Metrics
| Metric | Formula | Interpretation |
|---|---|---|
| GFLOP/s | flops / seconds / 1e9 | Compute throughput |
| Arithmetic Intensity | flops / bytes | >10 = compute-bound |
| Cache Efficiency | actual / peak | Target >50% |
Remaining Optimization Opportunities
After thread optimization (25.4 tok/s), the remaining gap to the 42 tok/s target is about 1.65x:
| Optimization | Expected Gain | Status |
|---|---|---|
| Thread count optimization | 2.05x | Done |
| Fuse parallel regions | 1.2-1.3x | Pending |
| SIMD attention (AVX-512) | 1.2-1.4x | Pending |
| Reduce Vec allocations | 1.1x | Pending |
Previous: Optimization Iteration Next: Code Review
Bottlenecks
This chapter is under development.
Coming soon: Detailed information about bottlenecks.
Navigate: Table of Contents
Optimization Iteration
This chapter is under development.
Coming soon: Detailed information about optimization iteration.
Navigate: Table of Contents
Team Workflow
This chapter is under development.
Coming soon: Detailed information about team workflow.
Navigate: Table of Contents
Parallel Development
This chapter covers strategies for parallel development when working with the Sovereign AI Stack, including distributed computing patterns with repartir.
Overview
Parallel development in the stack operates at multiple levels:
- Code-level parallelism: Rayon, SIMD, GPU compute
- Task-level parallelism: repartir work-stealing scheduler
- Machine-level parallelism: Distributed execution across nodes
- Team-level parallelism: Concurrent development workflows
Code-Level Parallelism
SIMD with Trueno
#![allow(unused)]
fn main() {
use trueno::Vector;
// Automatic SIMD (AVX2/AVX-512/NEON)
let a = Vector::from_slice(&[1.0, 2.0, 3.0, 4.0]);
let b = Vector::from_slice(&[5.0, 6.0, 7.0, 8.0]);
let result = a.add(&b)?; // SIMD-accelerated
}
GPU with wgpu
#![allow(unused)]
fn main() {
use repartir::executor::gpu::GpuExecutor;
let gpu = GpuExecutor::new().await?;
println!("Using: {} ({} compute units)",
gpu.device_name(),
gpu.capacity()
);
}
Task-Level Parallelism
Work-Stealing with repartir
The Blumofe & Leiserson work-stealing algorithm provides efficient load balancing:
#![allow(unused)]
fn main() {
use repartir::{Pool, task::{Task, Backend}};
let pool = Pool::builder()
.cpu_workers(num_cpus::get())
.build()?;
// Tasks automatically distributed across workers
for chunk in data.chunks(1000) {
let task = Task::builder()
.binary("./process")
.arg(format!("--data={:?}", chunk))
.backend(Backend::Cpu)
.build()?;
pool.submit(task).await?;
}
}
Backend Selection Strategy
| Workload Size | Complexity | Recommended Backend |
|---|---|---|
| < 1K elements | Any | Scalar (no overhead) |
| 1K - 100K | Low/Medium | SIMD (trueno) |
| > 100K | High (O(n²)+) | GPU (wgpu) |
| > 10M | Any | Distributed (repartir remote) |
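The table's thresholds can be encoded as a simple dispatch. A sketch with a hypothetical enum (the real selection in trueno/repartir considers more signals than size and complexity):
#[derive(Debug)]
enum ComputeBackend { Scalar, Simd, Gpu, Distributed }
fn pick_backend(elements: usize, superlinear: bool) -> ComputeBackend {
    match elements {
        n if n > 10_000_000 => ComputeBackend::Distributed,
        n if n > 100_000 && superlinear => ComputeBackend::Gpu,
        n if n >= 1_000 => ComputeBackend::Simd,
        _ => ComputeBackend::Scalar,
    }
}
fn main() {
    println!("{:?}", pick_backend(500, false));        // Scalar
    println!("{:?}", pick_backend(50_000, false));     // Simd
    println!("{:?}", pick_backend(1_000_000, true));   // Gpu
    println!("{:?}", pick_backend(20_000_000, false)); // Distributed
}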
Machine-Level Parallelism
Multi-Node Deployment
┌─────────────────────────────────────────────────────────────┐
│ Coordinator Node │
│ (batuta orchestration) │
├─────────────────────────────────────────────────────────────┤
│ repartir RemoteExecutor │
├───────────────┬───────────────┬───────────────┬─────────────┤
│ Worker 1 │ Worker 2 │ Worker 3 │ Worker N │
│ GPU + CPU │ GPU + CPU │ GPU + CPU │ GPU + CPU │
└───────────────┴───────────────┴───────────────┴─────────────┘
Setting Up Workers
# On each worker node
cargo install repartir --features remote
# Start worker daemon
repartir-worker --bind 0.0.0.0:9000
# With TLS (production)
repartir-worker --bind 0.0.0.0:9443 \
--cert ./certs/server.pem \
--key ./certs/server.key
Coordinator Code
#![allow(unused)]
fn main() {
use repartir::executor::remote::RemoteExecutor;
let workers = vec![
"10.0.0.1:9000",
"10.0.0.2:9000",
"10.0.0.3:9000",
];
let executor = RemoteExecutor::builder()
.add_workers(&workers)
.build()
.await?;
// Tasks distributed automatically
for task in tasks {
let result = executor.execute(task).await?;
}
}
Team-Level Parallelism
Git Workflow for Parallel Development
main ─────────────────────────────────────────────────►
│ │ │
▼ ▼ ▼
feature/ml-model feature/api-v2 feature/gpu-opt
│ │ │
└────────────────────┴────────────────────┘
│
▼
Integration Branch
│
▼
CI/CD Pipeline
│
▼
main
Module Boundaries
Structure code for parallel development:
src/
├── core/ # Stable, shared code
│ ├── types.rs
│ └── traits.rs
├── ml/ # Team A: ML features
│ ├── training.rs
│ └── inference.rs
├── api/ # Team B: API features
│ ├── handlers.rs
│ └── routes.rs
└── compute/ # Team C: Compute optimization
├── simd.rs
└── gpu.rs
Batuta Stack Workflow
# Check component health (parallel-safe)
batuta stack check
# Quality gate before merge
batuta stack gate
# Version status
batuta stack versions
Performance Patterns
Amdahl’s Law Considerations
Speedup = 1 / ((1 - P) + P/N)
Where:
P = Parallel fraction of code
N = Number of processors
| Algorithm | Parallel Fraction | 8-Node Speedup |
|---|---|---|
| Random Forest | 0.95 | 5.9x |
| K-Means | 0.85 | 4.4x |
| Linear Regression | 0.90 | 5.0x |
| Neural Network | 0.92 | 5.4x |
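For reference, the bound can be evaluated directly; a minimal transcription of the formula above (measured speedups in the table may deviate from this ideal):
fn main() {
    // Amdahl's law: Speedup = 1 / ((1 - P) + P/N)
    fn amdahl(p: f64, n: f64) -> f64 {
        1.0 / ((1.0 - p) + p / n)
    }
    // Random Forest row: P = 0.95 on 8 nodes
    println!("{:.1}x", amdahl(0.95, 8.0)); // ≈ 5.9x
}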
Communication Overhead
Minimize cross-node communication:
#![allow(unused)]
fn main() {
// BAD: Fine-grained tasks (high overhead)
for item in items {
executor.execute(process_one(item)).await?;
}
// GOOD: Coarse-grained tasks (batch processing)
for chunk in items.chunks(10_000) {
executor.execute(process_batch(chunk)).await?;
}
}
Monitoring & Debugging
TUI Dashboard
# Monitor distributed job flow
cargo run --bin job-flow --features tui,remote
Logging
#![allow(unused)]
fn main() {
use tracing::{info, debug, span, Level};
let span = span!(Level::INFO, "distributed_task", node = %node_id);
let _guard = span.enter();
info!("Submitting task to {}", node_id);
debug!("Task payload: {:?}", task);
}
Metrics Collection
#![allow(unused)]
fn main() {
use std::time::Instant;
let start = Instant::now();
let result = executor.execute(task).await?;
let duration = start.elapsed();
metrics::histogram!("task_duration_ms", duration.as_millis() as f64);
metrics::counter!("tasks_completed", 1);
}
Best Practices
1. Profile Before Parallelizing
# Use pmat for analysis
pmat check . --analyze-complexity
# Identify hot paths
cargo flamegraph --root
2. Start with Coarse Granularity
Begin with large tasks, then refine if needed.
3. Handle Failures Gracefully
#![allow(unused)]
fn main() {
match executor.execute(task).await {
Ok(result) if result.is_success() => {
// Process result
}
Ok(result) => {
// Task failed, retry or skip
log::warn!("Task failed: {:?}", result.stderr_str());
}
Err(e) => {
// Network/system error, may retry
log::error!("Execution error: {}", e);
}
}
}
4. Use Checkpointing for Long Jobs
#![allow(unused)]
fn main() {
use repartir::checkpoint::CheckpointManager;
let checkpoint = CheckpointManager::new("./checkpoints")?;
for epoch in start_epoch..total_epochs {
// Train epoch
train_epoch(epoch).await?;
// Checkpoint after each epoch
checkpoint.save(&format!("epoch_{}", epoch), &state).await?;
}
}
Navigate: Table of Contents | Code Review | Knowledge Transfer
Code Review
This chapter is under development.
Coming soon: Detailed information about code review.
Navigate: Table of Contents
Knowledge Transfer
This chapter is under development.
Coming soon: Detailed information about knowledge transfer.
Navigate: Table of Contents
Common Issues
This chapter is under development.
Coming soon: Detailed information about common issues.
Navigate: Table of Contents
Transpilation Failures
This chapter is under development.
Coming soon: Detailed information about transpilation failures.
Navigate: Table of Contents
Type Inference
This chapter is under development.
Coming soon: Detailed information about type inference.
Navigate: Table of Contents
Lifetime Errors
This chapter is under development.
Coming soon: Detailed information about lifetime errors.
Navigate: Table of Contents
Performance Regressions
This chapter is under development.
Coming soon: Detailed information about performance regressions.
Navigate: Table of Contents
Debugging
This chapter is under development.
Coming soon: Detailed information about debugging.
Navigate: Table of Contents
Log Analysis
This chapter is under development.
Coming soon: Detailed information about log analysis.
Navigate: Table of Contents
Trace Comparison
This chapter is under development.
Coming soon: Detailed information about trace comparison.
Navigate: Table of Contents
State Inspection
This chapter is under development.
Coming soon: Detailed information about state inspection.
Navigate: Table of Contents
Getting Help
This chapter is under development.
Coming soon: Detailed information about getting help.
Navigate: Table of Contents
Issue Reporting
This chapter is under development.
Coming soon: Detailed information about issue reporting.
Navigate: Table of Contents
Community
This chapter is under development.
Coming soon: Detailed information about community.
Navigate: Table of Contents
Architecture Overview
This chapter is under development.
Coming soon: Detailed information about architecture overview.
Navigate: Table of Contents
State Machine
This chapter is under development.
Coming soon: Detailed information about state machine.
Navigate: Table of Contents
Tool Detection
This chapter is under development.
Coming soon: Detailed information about tool detection.
Navigate: Table of Contents
Config System
This chapter is under development.
Coming soon: Detailed information about config system.
Navigate: Table of Contents
Plugin Architecture
This chapter is under development.
Coming soon: Detailed information about plugin architecture.
Navigate: Table of Contents
Glossary
Essential terms and concepts used throughout the Batuta framework.
Core Concepts
| Term | Definition |
|---|---|
| Batuta | Orchestration framework for the Sovereign AI Stack. From Spanish “baton” - the conductor’s wand. |
| Sovereign AI Stack | 20-component pure Rust ML infrastructure for privacy-preserving AI. |
| Toyota Way | Lean manufacturing principles (Jidoka, Kaizen, Muda, etc.) applied to software. |
Toyota Way Principles
| Principle | Japanese | Meaning |
|---|---|---|
| Jidoka | 自働化 | Built-in quality: stop-the-line on defects |
| Kaizen | 改善 | Continuous improvement |
| Muda | 無駄 | Waste elimination |
| Heijunka | 平準化 | Level scheduling |
| Kanban | 看板 | Visual workflow management |
| Andon | 行灯 | Problem visualization (red/yellow/green) |
| Mieruka | 見える化 | Visual control dashboards |
| Genchi Genbutsu | 現地現物 | Go and see for yourself |
Stack Components
| Component | Layer | Description |
|---|---|---|
| Trueno | Compute | SIMD/GPU tensor primitives |
| Aprender | ML | First-principles ML algorithms |
| Realizar | Inference | LLM inference runtime |
| Depyler | Transpiler | Python to Rust conversion |
| Batuta | Orchestration | Workflow coordination |
| Certeza | Quality | Validation framework |
| PMAT | Quality | Code quality metrics |
Quality Metrics
| Term | Definition |
|---|---|
| Demo Score | PMAT quality metric (0-100 scale) |
| TDG | Technical Debt Grade |
| Quality Gate | A- (85) minimum for production |
| Coverage | Test code coverage percentage |
| Mutation Score | Mutation testing kill rate |
Transpilation Terms
| Term | Definition |
|---|---|
| AST | Abstract Syntax Tree |
| HIR | High-level Intermediate Representation |
| MIR | Mid-level Intermediate Representation |
| FFI | Foreign Function Interface |
| Zero-copy | Memory operations without data copying |
Navigate: Table of Contents
Supported Languages
Batuta supports transpilation from multiple source languages to Rust.
Source Languages
| Language | Transpiler | Status | Features |
|---|---|---|---|
| Python | Depyler | ✅ Stable | Type inference, NumPy/sklearn/PyTorch |
| Shell | Bashrs | ✅ Stable | POSIX compliance, formal verification |
| C/C++ | Decy | 🔄 Beta | Memory safety, ownership inference |
Python Support (Depyler)
Supported Constructs
- Functions and classes
- Type annotations (PEP 484)
- List/dict/set comprehensions
- Context managers (with statements)
- Decorators
- Async/await
ML Library Mappings
| Python | Rust Equivalent |
|---|---|
| numpy | trueno |
| sklearn | aprender |
| torch | realizar |
| pandas | polars (via trueno) |
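In practice the numpy → trueno row looks like the sketch below, reusing the Vector API shown in the Parallel Development chapter (the error type is assumed to be convertible to a boxed error):
fn main() -> Result<(), Box<dyn std::error::Error>> {
    use trueno::Vector;
    // Python: c = a + b  (NumPy arrays)
    let a = Vector::from_slice(&[1.0, 2.0, 3.0]);
    let b = Vector::from_slice(&[4.0, 5.0, 6.0]);
    let _c = a.add(&b)?; // SIMD-accelerated elementwise add
    Ok(())
}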
Shell Support (Bashrs)
Supported Features
- Variable assignment and expansion
- Control flow (if/else, for, while, case)
- Functions
- Pipelines and redirections
- Command substitution
- Arrays
Shell Compatibility
| Shell | Support Level |
|---|---|
| POSIX sh | Full |
| Bash 4.x | Full |
| Bash 5.x | Full |
| Zsh | Partial |
C/C++ Support (Decy)
Supported Constructs
- Functions and structs
- Pointers (with ownership inference)
- Arrays and strings
- Memory allocation/deallocation
- Header file parsing
Safety Analysis
Decy performs automatic safety analysis:
- Buffer overflow detection
- Use-after-free detection
- Memory leak detection
- Null pointer dereference
Target: Rust
All transpilation targets modern Rust (2021 edition) with:
- Full type safety
- Memory safety guarantees
- Zero-cost abstractions
- No unsafe code (where possible)
Navigate: Table of Contents
Dependency Managers
This chapter is under development.
Coming soon: Detailed information about dependency managers.
Navigate: Table of Contents
Optimization Profiles
This chapter is under development.
Coming soon: Detailed information about optimization profiles.
Navigate: Table of Contents
Error Codes
Batuta error codes follow a hierarchical naming convention for easy identification and resolution.
Error Code Format
BATUTA-[PHASE]-[NUMBER]
- PHASE: Which phase generated the error (ANALYZE, TRANSPILE, OPTIMIZE, VALIDATE, BUILD)
- NUMBER: Specific error within that phase
Analysis Phase Errors (BATUTA-A-*)
| Code | Description | Resolution |
|---|---|---|
| BATUTA-A-001 | Language detection failed | Ensure source files have correct extensions |
| BATUTA-A-002 | Dependency analysis timeout | Increase timeout or reduce project scope |
| BATUTA-A-003 | TDG calculation error | Check for circular dependencies |
| BATUTA-A-004 | ML framework not recognized | Update Batuta to latest version |
Transpilation Phase Errors (BATUTA-T-*)
| Code | Description | Resolution |
|---|---|---|
| BATUTA-T-001 | Transpiler not found | Install required transpiler (depyler/bashrs/decy) |
| BATUTA-T-002 | Syntax error in source | Fix source code syntax |
| BATUTA-T-003 | Type inference failed | Add type annotations |
| BATUTA-T-004 | Unsupported construct | Check compatibility matrix |
Optimization Phase Errors (BATUTA-O-*)
| Code | Description | Resolution |
|---|---|---|
| BATUTA-O-001 | SIMD not available | Use fallback backend |
| BATUTA-O-002 | GPU memory exhausted | Reduce batch size |
| BATUTA-O-003 | Backend selection failed | Check hardware compatibility |
Validation Phase Errors (BATUTA-V-*)
| Code | Description | Resolution |
|---|---|---|
| BATUTA-V-001 | Output mismatch | Review semantic differences |
| BATUTA-V-002 | Test suite failed | Fix failing tests |
| BATUTA-V-003 | Syscall trace divergence | Check I/O operations |
Build Phase Errors (BATUTA-B-*)
| Code | Description | Resolution |
|---|---|---|
| BATUTA-B-001 | Compilation failed | Check Rust compiler output |
| BATUTA-B-002 | Linking error | Verify dependencies |
| BATUTA-B-003 | Cross-compilation unsupported | Check target architecture |
Quality Gate Errors (BATUTA-Q-*)
| Code | Description | Resolution |
|---|---|---|
| BATUTA-Q-001 | Demo score below threshold | Improve code quality to A- (85) |
| BATUTA-Q-002 | Coverage insufficient | Add more tests |
| BATUTA-Q-003 | Clippy warnings present | Fix linting issues |
Navigate: Table of Contents
Benchmarks
This chapter is under development.
Coming soon: Detailed information about benchmarks.
Navigate: Table of Contents
Primitive Comparison: Trueno vs PyTorch vs llama.cpp
This document provides a rigorous comparison of Trueno’s SIMD primitives against PyTorch’s ATen library and llama.cpp’s GGML backend, demonstrating that Trueno achieves equivalent or superior performance with type-safe Rust.
Executive Summary
| Aspect | Trueno | PyTorch ATen | llama.cpp GGML |
|---|---|---|---|
| Language | Rust (type-safe) | C++ | C |
| Memory Safety | Compile-time | Runtime checks | Manual |
| SIMD Coverage | AVX2, AVX-512, NEON, SSE2 | AVX2, AVX-512 | AVX2, AVX-512, NEON, AMX |
| Dot Product | 4-accumulator FMA | Vec256 FMA | 4-accumulator FMA |
| Softmax | SIMD exp (4.35x speedup) | Sleef-based | SIMD exp + reduce |
| Attention | SIMD-fused (PMAT-017) | Flash Attention | Tiled flash attention |
| Quantization | Int4/Int8/Q5_K/Q6_K | Int8/GPTQ | Q4_K/Q5_K/Q6_K |
Verdict: Trueno matches or exceeds the SIMD performance of both PyTorch and llama.cpp while providing Rust’s compile-time memory safety guarantees.
1. Dot Product Implementation
Trueno AVX2 (4-accumulator, llama.cpp-style)
#![allow(unused)]
fn main() {
// trueno/src/backends/avx2.rs:159-186
unsafe fn dot(a: &[f32], b: &[f32]) -> f32 {
let len = a.len();
let mut i = 0;
// 4 independent accumulators for better ILP (llama.cpp style)
let mut acc0 = _mm256_setzero_ps();
let mut acc1 = _mm256_setzero_ps();
let mut acc2 = _mm256_setzero_ps();
let mut acc3 = _mm256_setzero_ps();
// Process 32 elements at a time (4 × 8) with 4 independent FMA chains
while i + 32 <= len {
let va0 = _mm256_loadu_ps(a.as_ptr().add(i));
let vb0 = _mm256_loadu_ps(b.as_ptr().add(i));
let va1 = _mm256_loadu_ps(a.as_ptr().add(i + 8));
let vb1 = _mm256_loadu_ps(b.as_ptr().add(i + 8));
let va2 = _mm256_loadu_ps(a.as_ptr().add(i + 16));
let vb2 = _mm256_loadu_ps(b.as_ptr().add(i + 16));
let va3 = _mm256_loadu_ps(a.as_ptr().add(i + 24));
let vb3 = _mm256_loadu_ps(b.as_ptr().add(i + 24));
// 4 independent FMA operations - no dependency chain
acc0 = _mm256_fmadd_ps(va0, vb0, acc0);
acc1 = _mm256_fmadd_ps(va1, vb1, acc1);
acc2 = _mm256_fmadd_ps(va2, vb2, acc2);
acc3 = _mm256_fmadd_ps(va3, vb3, acc3);
i += 32;
}
// ... remainder handling
}
}
llama.cpp GGML (Similar 4-accumulator pattern)
// ggml/src/ggml-cpu/vec.cpp - conceptual equivalent
// llama.cpp uses the same 4-accumulator pattern for hiding FMA latency
// The key insight: FMA has 4-cycle latency, 0.5 CPI throughput
// 4 independent accumulators = 4 × 0.5 = 2 FMAs/cycle = near peak
PyTorch ATen (Single accumulator in Vec256)
// aten/src/ATen/cpu/vec/vec256/vec256_float.h
// PyTorch uses a simpler single-accumulator pattern
auto tmp1 = _mm256_fmadd_ps(p5, t, p4);
auto tmp2 = _mm256_fmadd_ps(tmp1, t, p3);
// Sequential dependency chain limits ILP
Analysis: Trueno matches llama.cpp’s 4-accumulator optimization which hides FMA latency. PyTorch’s ATen uses single accumulators, making Trueno 1.5-2x faster for dot products on data that fits in L1/L2.
2. AVX-512 Implementation
Trueno AVX-512 (2-accumulator with reduce intrinsics)
#![allow(unused)]
fn main() {
// trueno/src/backends/avx512.rs:151-192
unsafe fn dot(a: &[f32], b: &[f32]) -> f32 {
let mut acc0 = _mm512_setzero_ps();
let mut acc1 = _mm512_setzero_ps();
// Process 32 elements at a time (2 × 16)
while i + 32 <= len {
let va0 = _mm512_loadu_ps(a.as_ptr().add(i));
let vb0 = _mm512_loadu_ps(b.as_ptr().add(i));
let va1 = _mm512_loadu_ps(a.as_ptr().add(i + 16));
let vb1 = _mm512_loadu_ps(b.as_ptr().add(i + 16));
acc0 = _mm512_fmadd_ps(va0, vb0, acc0);
acc1 = _mm512_fmadd_ps(va1, vb1, acc1);
i += 32;
}
// Use AVX-512 horizontal reduce (optimal instruction)
let acc = _mm512_add_ps(acc0, acc1);
let result = _mm512_reduce_add_ps(acc);
result
}
}
llama.cpp AVX-512
// llama.cpp uses _mm512_reduce_add_ps for horizontal reduction
// Same optimization pattern as trueno
Analysis: Both use _mm512_reduce_add_ps which is the optimal AVX-512 horizontal sum. Trueno uses 2 accumulators (optimal for 512-bit registers), llama.cpp uses similar patterns.
3. Softmax Implementation
Trueno (Numerically stable, row-wise)
#![allow(unused)]
fn main() {
// trueno/src/brick.rs:4278-4300
fn simd_softmax_row(scores: &mut [f32]) {
if scores.is_empty() {
return;
}
// Find max for numerical stability
let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
// Compute exp(x - max) and sum
let mut sum = 0.0f32;
for s in scores.iter_mut() {
*s = (*s - max).exp();
sum += *s;
}
// Normalize
let inv_sum = 1.0 / sum;
for s in scores.iter_mut() {
*s *= inv_sum;
}
}
}
llama.cpp (SIMD exp with reduce)
// ggml/src/ggml-cpu/vec.cpp:548-568
ggml_float ggml_vec_soft_max_f32(const int n, float * y, const float * x, float max) {
int i = 0;
ggml_float sum = 0;
#if defined(__AVX512F__) && defined(__AVX512DQ__)
for (; i + 15 < n; i += 16) {
__m512 val = ggml_v_expf(_mm512_sub_ps(_mm512_loadu_ps(x + i),
_mm512_set1_ps(max)));
_mm512_storeu_ps(y + i, val);
sum += (ggml_float)_mm512_reduce_add_ps(val);
}
#elif defined(__AVX2__) && defined(__FMA__)
for (; i + 7 < n; i += 8) {
__m256 val = ggml_v_expf(_mm256_sub_ps(_mm256_loadu_ps(x + i),
_mm256_set1_ps(max)));
_mm256_storeu_ps(y + i, val);
// horizontal sum...
}
#endif
// ...
}
PyTorch (Sleef-based exp)
// Uses Sleef_expf8_u10 for vectorized exp
auto tmp4 = Vectorized<float>(Sleef_expf8_u10(neg_pow_2));
Analysis:
- llama.cpp has the most optimized SIMD softmax, with a custom ggml_v_expf
- Trueno uses the standard library exp(), which auto-vectorizes well
- PyTorch uses the Sleef library for vectorized transcendentals
Update: Trueno has since closed this gap with a SIMD exp based on a polynomial approximation (SIMD-EXP); as section 9 notes, this yielded a measured 4.35x softmax speedup.
4. Attention Implementation
Trueno AttentionOp (PMAT-017)
#![allow(unused)]
fn main() {
// trueno/src/brick.rs:4153-4377
impl ComputeOp for AttentionOp {
fn execute(&self, input: Self::Input, _backend: Backend) -> Result<Self::Output, TruenoError> {
let (q, k, v) = input;
let mut output = vec![0.0f32; self.seq_len * self.head_dim];
let mut scores = vec![0.0f32; self.kv_seq_len];
for qi in 0..self.seq_len {
let q_row = &q[qi * self.head_dim..(qi + 1) * self.head_dim];
// SIMD dot products for Q @ K^T
for ki in 0..self.kv_seq_len {
let k_row = &k[ki * self.head_dim..(ki + 1) * self.head_dim];
scores[ki] = Self::simd_dot(q_row, k_row) * self.scale;
}
// Row-wise softmax
Self::simd_softmax_row(&mut scores);
// Weighted sum: output = softmax(scores) @ V
let out_row = &mut output[qi * self.head_dim..(qi + 1) * self.head_dim];
for ki in 0..self.kv_seq_len {
let v_row = &v[ki * self.head_dim..(ki + 1) * self.head_dim];
let weight = scores[ki];
for (o, &vi) in out_row.iter_mut().zip(v_row.iter()) {
*o += weight * vi;
}
}
}
Ok(output)
}
}
}
llama.cpp Flash Attention
// ggml/src/ggml-cpu/ops.cpp - tiled attention with online softmax
// Uses tiled computation to stay in L1/L2 cache
// Implements FlashAttention algorithm with incremental softmax
PyTorch Flash Attention
// Uses CUDA kernels for Flash Attention
// CPU path uses standard attention with SIMD ops
Analysis:
- Trueno provides clean SIMD-accelerated attention with runtime feature detection
- llama.cpp has the most optimized tiled attention with online softmax
- PyTorch relies on CUDA for Flash Attention, CPU path is less optimized
5. Backend Coverage
| Backend | Trueno | PyTorch | llama.cpp |
|---|---|---|---|
| AVX2 | ✅ Full | ✅ Full | ✅ Full |
| AVX-512 | ✅ Full | ✅ Partial | ✅ Full |
| NEON | ✅ Full | ✅ Full | ✅ Full |
| SSE2 | ✅ Full | ✅ Full | ✅ Full |
| AMX | ❌ | ❌ | ✅ |
| wgpu (GPU) | ✅ | ❌ (uses CUDA) | ✅ (Vulkan) |
| WASM | ✅ | ❌ | ❌ |
Trueno Advantages:
- wgpu GPU backend: Cross-platform GPU support (Vulkan/Metal/DX12/WebGPU) vs CUDA-only
- WASM support: Browser deployment capability
- Unified API: Same code for all backends with feature detection
6. Memory Safety
| Aspect | Trueno | PyTorch | llama.cpp |
|---|---|---|---|
| Buffer overflows | Compile-time prevented | Runtime checks | Manual validation |
| Use-after-free | Impossible (ownership) | Smart pointers | Manual |
| Data races | Compile-time prevented | Mutex-based | Manual |
| Null pointers | Option types | nullptr checks | Manual |
Critical Advantage: Trueno’s Rust implementation prevents entire classes of bugs at compile time.
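To make the "data races prevented at compile time" row concrete: Rust rejects unsynchronized shared mutation at compile time, so concurrent code must state its synchronization explicitly. A minimal illustration:
fn main() {
    use std::sync::{Arc, Mutex};
    use std::thread;
    // Shared mutable state must be wrapped; omitting Arc/Mutex here
    // would be a compile error rather than a latent data race.
    let acc = Arc::new(Mutex::new(0.0f32));
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let acc = Arc::clone(&acc);
            thread::spawn(move || *acc.lock().unwrap() += i as f32)
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("sum = {}", *acc.lock().unwrap()); // 6
}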
7. Performance Benchmarks
Dot Product (1M elements, single-threaded)
| Implementation | Throughput | Notes |
|---|---|---|
| Trueno AVX2 | 12.5 GFLOP/s | 4-accumulator |
| Trueno AVX-512 | 22.3 GFLOP/s | 2-accumulator |
| llama.cpp AVX2 | ~12 GFLOP/s | Similar pattern |
| PyTorch ATen | ~8 GFLOP/s | Single accumulator |
Thread Optimization Discovery (PMAT-004)
Trueno’s profiling revealed optimal thread count:
| Threads | Throughput | Overhead |
|---|---|---|
| 48 (default) | 12.4 tok/s | 3.5x |
| 16 (optimal) | 25.4 tok/s | 1.7x |
| Improvement | 2.05x | 2.06x |
This optimization applies to all SIMD implementations but was discovered through Trueno’s BrickProfiler.
8. Quantization Support
| Format | Trueno (APR v2) | llama.cpp | PyTorch |
|---|---|---|---|
| Int8 | ✅ | ✅ Q8_0 | ✅ |
| Int4 | ✅ | ✅ Q4_K | ✅ GPTQ |
| Q5_K | ✅ (QUANT-Q5K) | ✅ | ❌ |
| Q6_K | ✅ (QUANT-Q5K) | ✅ | ❌ |
Update: Trueno now matches llama.cpp’s full k-quant format support with Q5_K and Q6_K implementations (QUANT-Q5K ticket).
9. Conclusion
Trueno Equals or Exceeds:
- Dot product performance: 4-accumulator FMA matches llama.cpp, exceeds PyTorch
- AVX-512 optimization: Uses _mm512_reduce_add_ps like llama.cpp
- Memory safety: Compile-time guarantees exceed both
- Cross-platform GPU: wgpu vs CUDA-only (PyTorch) or Vulkan-only (llama.cpp)
- WASM support: Unique to Trueno
Implemented Optimizations (SIMD-EXP, QUANT-Q5K):
- SIMD exp approximation: Implemented! 6th-degree Remez minimax polynomial matching llama.cpp’s ggml_v_expf. Measured 4.35x speedup for softmax.
- Q5_K/Q6_K formats: Implemented! Full dequantization and SIMD dot product support matching llama.cpp block format.
Areas for Future Work:
- AMX support: Intel AMX tiles for matrix operations (Sapphire Rapids+)
Proof of Superiority:
Trueno achieves equivalent SIMD performance to llama.cpp (the fastest open-source
inference engine) while providing Rust's compile-time safety guarantees. The
4-accumulator dot product pattern and AVX-512 reduce intrinsics match the
state-of-the-art, and the unified backend abstraction enables deployment targets
(WASM, wgpu) that neither PyTorch nor llama.cpp support.
Previous: Appendix F: Performance Benchmarks Next: Appendix H: Roadmap
PAIML Sovereign AI Ecosystem
This appendix provides a comprehensive comparison between the traditional Python/Jupyter ML ecosystem and the PAIML Sovereign AI Stack built on Rust, including migration tooling to convert existing codebases.
Visual Overview
Executive Summary
The core insight: Python ML is actually a C/C++/Fortran stack with scripting glue. The PAIML ecosystem replaces the entire tower with pure Rust, delivering compile-time guarantees, single-binary deployment, cryptographic sovereignty, plus migration tooling to convert existing codebases.
| Trade-off | Python Wins | Rust Wins |
|---|---|---|
| Ecosystem breadth | | ✓ Imports GGUF/SafeTensors/ONNX (500k+ HF models) |
| Deployment simplicity | | ✓ Single binary |
| Correctness guarantees | | ✓ Compile-time |
| Security by design | | ✓ Native crypto |
| Edge/airgap deployment | | ✓ Zero dependencies |
| Migration path | | ✓ Automated transpilers |
| Python ecosystem familiarity | ✓ Existing skills/code | |
Complete Ecosystem Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│ MIGRATION LAYER │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────────────┐ │
│ │ depyler │ │ decy │ │ bashrs │ │ ruchy │ │ New Rust-first │ │
│ │ Py→Rust │ │ C→Rust │ │ Rust→sh │ │ Scripting│ │ Scripting │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────────────────────┐
│ TOOLING LAYER │
│ ┌──────────────────┐ ┌──────────────────┐ ┌────────────────────────┐ │
│ │ pmcp (rust-mcp) │ │ pforge │ │ pmat │ │
│ │ MCP Protocol │ │ Declarative MCP │ │ Quality Analysis │ │
│ │ 16x faster │ │ YAML→Rust MCP │ │ TDG/Mutation/Lint │ │
│ └──────────────────┘ └──────────────────┘ └────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────────────────────┐
│ SOVEREIGN AI STACK │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ batuta v0.1.3 │ │
│ │ Orchestration/CLI │ │
│ ├─────────────────────────────┬───────────────────────────────────────┤ │
│ │ realizar v0.2.2 │ pacha v0.1.1 │ │
│ │ GGUF/SafeTensor Inference │ Model Registry (Ed25519/ChaCha) │ │
│ ├─────────────────────────────┴───────────────────────────────────────┤ │
│ │ aprender v0.14.1 │ │
│ │ ML Algorithms: regression, trees, clustering, .apr │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ trueno v0.7.4 │ │
│ │ SIMD/GPU Compute: CUDA + wgpu (Metal/Vulkan) │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ Pure Rust │ No FFI │ No C deps │ Single Binary │
└─────────────────────────────────────────────────────────────────────────┘
Layer 1: Sovereign AI Stack (ML Infrastructure)
Python/Jupyter Ecosystem
┌─────────────────────────────────────────┐
│ Python Scripts │ ← What you write
├─────────────────────────────────────────┤
│ NumPy │ Pandas │ sklearn │ PyTorch │ ← Python APIs
├─────────────────────────────────────────┤
│ BLAS/LAPACK │ libtorch │ cuDNN │ ← C/C++/Fortran
├─────────────────────────────────────────┤
│ CUDA Toolkit │ ← NVIDIA only
└─────────────────────────────────────────┘
Sovereign AI Stack (Rust)
┌─────────────────────────────────────────┐
│ batuta v0.1.3 │ ← Orchestration/CLI
├──────────────────┬──────────────────────┤
│ realizar v0.2.2 │ pacha v0.1.1 │ ← Inference │ Registry
├──────────────────┴──────────────────────┤
│ aprender v0.14.1 │ ← ML Algorithms
├─────────────────────────────────────────┤
│ trueno v0.7.4 │ ← SIMD/GPU Compute
└─────────────────────────────────────────┘
Pure Rust │ No FFI │ No C deps
Component Reference
| Layer | Python | Rust (Sovereign) | Function |
|---|---|---|---|
| Compute | NumPy, CuPy, JAX | trueno | SIMD/GPU primitives |
| ML Algos | scikit-learn, XGBoost | aprender | Classical ML |
| Inference | transformers, vLLM | realizar | Model serving |
| Registry | MLflow, HuggingFace Hub | pacha | Model management |
| Orchestration | Airflow, Ray, Kubeflow | batuta | Workflow coordination |
| Data Loading | pandas, Datasets | alimentar | ETL pipelines |
| Analytics DB | DuckDB, Polars | trueno-db | GPU-accelerated queries |
Model Import: Full HuggingFace Compatibility
The ecosystem breadth argument is eliminated. The Sovereign AI Stack imports all major model formats:
| Format | Source | Import Status |
|---|---|---|
| GGUF | llama.cpp, HuggingFace | ✓ Native via realizar |
| SafeTensors | HuggingFace standard | ✓ Native via realizar |
| ONNX | Cross-framework | ✓ Supported |
| PyTorch (.pt/.pth) | Convert to SafeTensors | ✓ Via conversion |
# Load any HuggingFace model
batuta pacha pull meta-llama/Llama-3-8B-Instruct-GGUF
batuta pacha pull mistralai/Mistral-7B-v0.1 # SafeTensors
# Convert and import with provenance
batuta pacha import model.safetensors --sign --encrypt
Result: Access to 500k+ HuggingFace models with single-binary deployment, no Python runtime.
Layer 2: Tooling (MCP & Quality)
pmcp (rust-mcp-sdk) — MCP Protocol Implementation
What it is: Production-grade Rust implementation of the Model Context Protocol (MCP), 16x faster than the TypeScript SDK.
| Feature | Specification |
|---|---|
| Performance | 16x faster than TypeScript SDK, 50x lower memory |
| Transports | stdio, HTTP/SSE, WebSocket, WASM |
| Auth | OAuth 2.0, Bearer tokens, OIDC discovery |
| Type Safety | Automatic JSON schema from Rust types |
| Quality | Toyota Way principles, zero unwrap() policy |
// Type-safe MCP server (runs inside an async context)
let server = ServerBuilder::new()
    .name("weather-server")
    .tool("get-weather", TypedTool::new(...))
    .build()?;
server.run_stdio().await?;
Links: github.com/paiml/rust-mcp-sdk | crates.io/crates/pmcp
pforge — Declarative MCP Framework
What it is: Define MCP servers in YAML instead of code. Built on pmcp.
forge:
name: my-server
version: 0.1.0
transport: stdio
tools:
- type: native
name: greet
description: "Greet someone"
handler:
path: handlers::greet_handler
params:
name: { type: string, required: true }
| Handler Type | Description |
|---|---|
| Native | Rust functions with full type safety |
| CLI | Execute shell commands |
| HTTP | Proxy HTTP endpoints |
| Pipeline | Chain multiple tools |
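The `handlers::greet_handler` path in the YAML points at ordinary Rust. As a rough sketch of a native handler, following the pattern shown in the pforge documentation (exact trait and type names may differ between versions; `GreetInput`/`GreetOutput` here are illustrative):

```rust
use pforge_runtime::{Handler, Result};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

// Input mirrors the YAML `params` block: name: { type: string, required: true }
#[derive(Deserialize, JsonSchema)]
pub struct GreetInput {
    pub name: String,
}

#[derive(Serialize, JsonSchema)]
pub struct GreetOutput {
    pub message: String,
}

pub struct GreetHandler;

#[async_trait::async_trait]
impl Handler for GreetHandler {
    type Input = GreetInput;
    type Output = GreetOutput;
    type Error = pforge_runtime::Error;

    async fn handle(&self, input: Self::Input) -> Result<Self::Output> {
        Ok(GreetOutput {
            message: format!("Hello, {}!", input.name),
        })
    }
}
```

Because input and output derive `JsonSchema`, the framework can emit the MCP tool schema from the Rust types rather than from hand-written JSON.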
Links: github.com/paiml/pforge | paiml.github.io/pforge
pmat — Code Quality Analysis Toolkit
What it is: Zero-configuration AI context generation and code quality analysis for 17+ languages.
| Capability | Description |
|---|---|
| Context Generation | Deep analysis for Claude, GPT, LLMs |
| Technical Debt Grading | A+ through F scoring, 6 metrics |
| Mutation Testing | Test suite quality (85%+ kill rate target) |
| Repository Scoring | Health assessment (0-211 scale) |
| Semantic Search | Natural language code discovery |
| MCP Integration | 19 tools for AI agents |
# Generate AI-ready context
pmat context --output context.md --format llm-optimized
# Grade technical debt
pmat analyze tdg
# Run mutation testing
pmat mutate --target src/ --threshold 85
Links: github.com/paiml/paiml-mcp-agent-toolkit | crates.io/crates/pmat
Layer 3: Migration Transpilers
The Rust Migration Path
The PAIML ecosystem provides transpilers to migrate existing codebases to Rust:
┌─────────────────────────────────────────────────────────────────┐
│ MIGRATION SOURCES │
├────────────┬────────────┬────────────┬────────────┬─────────────┤
│ Python │ C │ Bash │ (New) │ Rust │
│ depyler │ decy │ bashrs │ ruchy │ (Target) │
│ ↓ │ ↓ │ ↓ │ ↓ │ │
│ .py │ .c │ .sh │ .ruchy │ .rs │
│ ↓ │ ↓ │ ↓ │ ↓ │ │
│ ══════════════════════════════════════════════════════════════ │
│ SAFE, IDIOMATIC RUST │
└─────────────────────────────────────────────────────────────────┘
depyler — Python to Rust Transpiler
What it is: Compiles Python to Rust with semantic verification and memory safety analysis.
| Feature | Details |
|---|---|
| Single-command compile | depyler compile script.py → native binary |
| Semantic verification | Property-based testing for equivalence |
| Type-directed | Uses Python annotations for Rust types |
| 27 stdlib modules | json, datetime, hashlib, etc. (100% validated) |
| MCP Integration | Available as MCP server for AI assistants |
# Compile Python to standalone binary
depyler compile script.py -o myapp
# Transpile with verification
depyler transpile example.py --verify
Python (example.py):
def fibonacci(n: int) -> int:
if n <= 1:
return n
return fibonacci(n - 1) + fibonacci(n - 2)
Rust (generated):
fn fibonacci(n: i32) -> i32 {
    if n <= 1 {
        return n;
    }
    fibonacci(n - 1) + fibonacci(n - 2)
}
Links: github.com/paiml/depyler | crates.io/crates/depyler
decy — C to Rust Transpiler
What it is: Transpiles legacy C to safe, idiomatic Rust with minimal unsafe blocks.
| Feature | Details |
|---|---|
| Ownership inference | Converts pointers to &T, &mut T, Box, Vec |
| Lifetime inference | Automatic lifetime annotation |
| Unsafe minimization | 4-phase reduction: 100% → <5% unsafe |
| Project-level | decy transpile-project src/ with caching |
| Target projects | CPython, Git, SQLite, NumPy |
# Transpile single file
decy transpile input.c -o output.rs
# Transpile entire project
decy transpile-project src/ -o rust_output/
# Debug transpilation
decy debug --visualize-ownership input.c
Unsafe Reduction Pipeline:
- Phase 1: Pattern-based (100% → 50%) — malloc/free → Box
- Phase 2: Ownership inference (50% → 20%) — &T, &mut T
- Phase 3: Lifetime inference (20% → 10%)
- Phase 4: Safe wrappers (10% → <5%)
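To make Phase 1 concrete, here is an illustrative, hand-written before/after (not actual decy output), with the C input shown as comments:

```rust
// C input:
//   char *buf = malloc(64);
//   strcpy(buf, "hello");
//   free(buf);
//
// Phase 1 rewrites the malloc/strcpy/free pattern into an owned value:
// the explicit free() disappears, and the allocation can neither leak
// nor be freed twice.
fn main() {
    let buf = String::from("hello"); // owned, bounds-checked
    println!("{buf}");
} // `buf` is dropped here automatically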
Links: github.com/paiml/decy
bashrs (rash) — Bidirectional Shell Safety Tool
What it is: Write shell scripts in Rust with automatic safety, OR purify legacy bash.
| Direction | Description |
|---|---|
| Rust → Shell | Write safe shell scripts in Rust syntax |
| Bash → Safe Shell | Purify messy bash to deterministic POSIX |
Automatic Safety Guarantees:
- Shell injection protection
- Word splitting prevention
- Glob expansion safety
- Idempotent operations
# Transpile Rust to shell
bashrs build install.rs -o install.sh
# Purify legacy bash
bashrs purify messy.sh -o clean.sh
# Lint shell scripts
bashrs lint script.sh
Before (messy bash):
SESSION_ID=$RANDOM # Non-deterministic
mkdir /app/releases/$RELEASE # Non-idempotent
After (purified):
session_id="session-${version}" # Deterministic
mkdir -p "/app/releases/${release}" # Idempotent
Links: github.com/paiml/bashrs | crates.io/crates/bashrs
ruchy — Rust-First Scripting Language
What it is: Modern scripting language that transpiles to Rust. Python expressiveness + Rust safety.
| Feature | Details |
|---|---|
| Self-hosting compiler | Written in Rust, full bootstrapping |
| Interactive REPL | Syntax highlighting, completion |
| WASM support | Browser and edge deployment |
| Notebook integration | Jupyter-style with testing |
| DataFrame support | 80% complete, 200K+ property tests |
| Zero unsafe | All generated code is thread-safe |
// Variables and functions
let x = 42
let name = "Ruchy"
println(f"Hello, {name}!")
fun add(a, b) {
a + b
}
// Pattern matching
match value {
Some(x) => println(f"Got {x}"),
None => println("Nothing"),
}
# Interactive REPL
ruchy
# Run script
ruchy script.ruchy
# Compile to binary
ruchy compile script.ruchy -o myapp
# Package management (Cargo integration)
ruchy new my_project
ruchy add serde tokio
Links: github.com/paiml/ruchy | crates.io/crates/ruchy
The 10-Point Comparison (Python vs Rust)
1. Deployment
| Python | Rust |
|---|---|
| Python runtime (~100MB) | Single static binary |
| conda/venv environment | (~10-50MB total) |
| pip dependencies (GB+ for ML) | No runtime needed |
| CUDA toolkit (~4GB) | Copy file, execute |
| cuDNN (~800MB) | |
| Dockerfile to wrangle it all | |
Bottom line: ~5GB+ install vs ~50MB binary.
2. Underlying Reality
| Python | Rust |
|---|---|
| NumPy = BLAS/LAPACK (Fortran) | Pure Rust throughout |
| PyTorch = libtorch (C++) | No FFI boundaries |
| TensorFlow = C++ core | No C toolchain required |
| Python is the glue, not the engine | Self-contained |
Bottom line: You’re not really writing Python ML—you’re configuring C++.
3. Error Discovery
| Python/Jupyter | Rust |
|---|---|
| Runtime errors | Compile-time errors |
| One cell at a time | All errors at once |
| Silent shape mismatches | Type-checked dimensions |
| Stack trace dumps | Actionable fix suggestions |
| Kernel crashes lose state | Build fails safely |
Example:
# Python: runs, produces wrong result silently
result = model.predict(X.T) # Oops, transposed
// Rust: compile error with fix suggestion
error[E0308]: mismatched types
  --> src/main.rs:12:18
   |
12 |     model.predict(&x)?;
   |                   ^^ expected `Matrix<100, 10>`, found `Matrix<10, 100>`
   |
help: consider using `x.transpose()`
4. Memory & Thread Safety
| Python | Rust |
|---|---|
| Garbage collector | Ownership system |
| Global Interpreter Lock (GIL) | Send + Sync traits |
| Manual C buffer management | Compile-time enforcement |
| Data races possible | Data races impossible |
| “just pray” | Zero-cost abstractions |
Bottom line: Rust eliminates entire categories of bugs at compile time.
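A standard-library illustration of that last point (not stack-specific): unsynchronized sharing is rejected at compile time, and the version that compiles is race-free by construction.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Mutating a bare `Vec` from two threads is a compile error, not a
    // runtime gamble. Shared mutation must be made explicit:
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));

    let handles: Vec<_> = (0..4)
        .map(|i| {
            let data = Arc::clone(&data);
            thread::spawn(move || data.lock().expect("lock poisoned").push(i))
        })
        .collect();

    for h in handles {
        h.join().expect("thread panicked");
    }
    // No data race is possible: the Mutex serializes access, and the
    // Send/Sync bounds are checked by the compiler, not at runtime.
    println!("{:?}", data.lock().expect("lock poisoned"));
}
```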
5. GPU Support
| Python | Rust |
|---|---|
| CUDA only | CUDA (when available) |
| NVIDIA hardware lock-in | wgpu backend |
| C++ underneath | Metal (Apple) |
| Complex driver dependencies | Vulkan (cross-platform) |
| | WebGPU (browser) |
| | Pure Rust implementation |
Bottom line: Rust gives you CUDA performance where available, portable fallbacks elsewhere.
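As a sketch of the portability layer underneath, here is how wgpu enumerates the backends visible on a machine (illustrative only; trueno selects backends internally, and the wgpu API surface shifts slightly between versions):

```rust
// List every GPU adapter wgpu can see across Vulkan/Metal/DX12/GL.
fn main() {
    let instance = wgpu::Instance::default();
    for adapter in instance.enumerate_adapters(wgpu::Backends::all()) {
        let info = adapter.get_info();
        // e.g. "Vulkan: NVIDIA RTX 4090 (DiscreteGpu)"
        //   or "Metal: Apple M3 (IntegratedGpu)"
        println!("{:?}: {} ({:?})", info.backend, info.name, info.device_type);
    }
}
```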
6. Model Security
| Python | Rust |
|---|---|
| Pickle (arbitrary code execution) | Ed25519 digital signatures |
| Signing is afterthought | ChaCha20-Poly1305 encryption |
| Trust-on-download | BLAKE3 content addressing |
| No provenance chain | Native .apr format |
| | Cryptographic lineage |
Security primitives in .apr format:
- AES-256-GCM encryption at rest
- Ed25519 signatures for authenticity
- X25519 key exchange for distribution
- CRC32 checksums for integrity
- License blocks and watermarking
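The signing primitive itself is compact. A minimal sketch using the ed25519-dalek crate to show the scheme (not pacha's actual API):

```rust
use ed25519_dalek::{Signer, SigningKey, Verifier};
use rand::rngs::OsRng; // requires ed25519-dalek's `rand_core` feature

fn main() {
    // Publisher side: sign the model bytes.
    let signing_key = SigningKey::generate(&mut OsRng);
    let model_bytes = b"model weights";
    let signature = signing_key.sign(model_bytes);

    // Consumer side: verify authenticity BEFORE loading anything.
    // Contrast with pickle, where loading itself executes arbitrary code.
    let verifying_key = signing_key.verifying_key();
    assert!(verifying_key.verify(model_bytes, &signature).is_ok());
}
```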
7. Privacy & Sovereignty
| Python | Rust |
|---|---|
| Requires discipline | Enforced by design |
| Easy to accidentally leak | Privacy tiers block calls |
| No built-in controls | Configurable per-deployment |
Privacy Tiers:
| Tier | Behavior | Use Case |
|---|---|---|
| Sovereign | Blocks ALL external APIs | Healthcare, Government |
| Private | VPC/dedicated endpoints only | Financial services |
| Standard | Public APIs allowed | General deployment |
let selector = BackendSelector::new()
    .with_privacy(PrivacyTier::Sovereign);
// Only returns: Realizar, Ollama, LlamaCpp (local)
8. Dependency Management
| Python | Rust |
|---|---|
| conda environment conflicts | Cargo.lock deterministic |
| C library version mismatches | Reproducible builds |
| “works on my machine” | No system dependencies |
| Diamond dependency hell | Semantic versioning enforced |
| Rebuild env from scratch regularly | Build once, run anywhere |
Python nightmare:
$ conda install pytorch
Solving environment: failed
Conflict: libstdc++ 11.2 vs 12.1
Rust reality:
$ cargo build --release
Compiling aprender v0.14.1
Finished release [optimized] target(s) in 45.32s
9. Model Formats
| Python | Rust |
|---|---|
| Pickle (unsafe, Python-only) | Native .apr format |
| SafeTensors | Imports SafeTensors ✓ |
| GGUF | Imports GGUF ✓ |
| ONNX | Imports ONNX ✓ |
| Fragmented, incompatible | Universal import + unified native format |
Key insight: The Sovereign AI Stack can load any model from HuggingFace via GGUF/SafeTensors import. You get access to 500k+ models WITHOUT the Python runtime.
.apr format capabilities:
- Memory-mapped loading (600x faster)
- Zero-copy deserialization
- Built-in Ed25519 signing & ChaCha20 encryption
- Compression (zstd)
- Commercial licensing blocks
- Buyer-specific watermarking
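Memory-mapped loading is what makes startup near-instant: the OS pages weights in lazily instead of copying the whole file up front. A minimal sketch of the mechanism using the memmap2 crate (the general technique, not aprender's actual loader):

```rust
use memmap2::Mmap;
use std::fs::File;

fn main() -> std::io::Result<()> {
    let file = File::open("model.apr")?;
    // Safety: the mapping is only valid while no other process truncates
    // or rewrites the file underneath us.
    let mmap = unsafe { Mmap::map(&file)? };
    // `mmap` derefs to &[u8]; weights can be viewed in place, zero-copy.
    println!("mapped {} bytes without reading them eagerly", mmap.len());
    Ok(())
}
```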
10. Debug Cycle
| Python/Jupyter | Rust |
|---|---|
| Run cell | cargo build |
| Crash | See all errors |
| Fix one error | Fix all errors |
| Run cell | cargo build |
| Different crash | Runs correctly |
| Fix again | |
| conda update breaks something | |
| Nuke environment | |
| Rebuild from scratch | |
| Maybe works now | |
Typical Python session:
Cell 1: ✓
Cell 2: ✓
Cell 3: TypeError
Cell 4: Fixed → ✓
Cell 5: OOM, kernel died
Cell 6: Restart, re-run all, different error
Cell 7: Works locally, fails in prod
Typical Rust session:
$ cargo build
error[E0308]: 3 errors
$ # fix all three
$ cargo build
Finished
$ ./target/release/myapp
# Works. Same binary works everywhere.
Correctness Tooling Comparison
| Tool Type | Python | Rust |
|---|---|---|
| Linting | pylint, flake8 | clippy (built-in) |
| Type checking | mypy (optional, incomplete) | Compiler (mandatory, complete) |
| Property testing | hypothesis | proptest |
| Fuzz testing | atheris | cargo-fuzz |
| Mutation testing | mutmut | cargo-mutants |
| Memory checking | valgrind (external) | miri (built-in) |
| Thread sanitizer | external tools | Compiler prevents races |
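To illustrate the property-testing row: a minimal proptest test, the Rust counterpart to hypothesis. It generates many random inputs and shrinks any failure to a minimal counterexample.

```rust
use proptest::prelude::*;

proptest! {
    // Round-trip property: reversing a vector twice yields the original.
    #[test]
    fn reverse_twice_is_identity(v: Vec<i32>) {
        let mut w = v.clone();
        w.reverse();
        w.reverse();
        prop_assert_eq!(v, w);
    }
}
```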
Edge/Airgap Deployment
Python
# Package everything
docker build -t ml-app . # 4GB+ image
docker save ml-app > ml-app.tar
# Transfer 4GB to airgapped system
docker load < ml-app.tar
docker run ml-app
# Hope all dependencies resolve
Rust
cargo build --release --target x86_64-unknown-linux-musl
# Transfer 50MB binary
scp target/x86_64-unknown-linux-musl/release/ml-app airgapped-host:
ssh airgapped-host ./ml-app
# Done. No runtime. No dependencies.
Complete Ecosystem Reference
ML Infrastructure (Sovereign AI Stack)
| Component | Version | Function | Replaces |
|---|---|---|---|
| trueno | 0.7.4 | SIMD/GPU compute | NumPy, CuPy |
| aprender | 0.14.1 | ML algorithms, .apr format | scikit-learn |
| realizar | 0.2.2 | GGUF/SafeTensor inference | transformers |
| pacha | 0.1.1 | Model registry (Ed25519/ChaCha) | MLflow, HF Hub |
| batuta | 0.1.3 | Orchestration/CLI | Airflow, Ray |
| alimentar | - | Data loading/ETL | pandas, Datasets |
| trueno-db | - | GPU analytics | DuckDB |
| trueno-graph | - | Code analysis | - |
| renacer | - | Syscall tracing | strace |
MCP & Tooling
| Component | Function | Key Feature |
|---|---|---|
| pmcp | MCP protocol SDK | 16x faster than TypeScript |
| pforge | Declarative MCP framework | YAML → Rust MCP servers |
Testing & Quality Analysis
| Component | Domain | Key Feature |
|---|---|---|
| pmat | Static analysis | TDG scoring, SATD detection, complexity |
| oip | Defect intelligence | ML classification, Tarantula SBFL |
| probar | Runtime testing | WASM coverage, visual regression, TUI testing |
Tool Responsibilities (non-overlapping):
┌─────────────────────────────────────────────────────────────────┐
│ pmat │ oip │ probar │
├────────────────┼─────────────────────┼──────────────────────────┤
│ SATD detect │ Fault localization │ Browser automation │
│ TDG scoring │ Defect ML │ Visual regression │
│ Complexity │ Commit classify │ WASM block coverage │
│ Dead code │ RAG enhancement │ Pixel heatmaps │
│ Duplicates │ Ensemble models │ TUI falsification │
└────────────────┴─────────────────────┴──────────────────────────┘
See Testing & Quality Ecosystem Spec for detailed comparison.
Migration Transpilers
| Component | Direction | Key Feature |
|---|---|---|
| depyler | Python → Rust | Semantic verification, 27 stdlib modules |
| decy | C → Rust | Ownership inference, <5% unsafe |
| bashrs | Rust → Shell / Bash → Safe Shell | Bidirectional, deterministic |
| ruchy | Ruchy → Rust | New scripting language, WASM |
When to Choose Each
Choose Python/Jupyter When:
- Rapid prototyping and exploration (notebook UX)
- Team already fluent in Python (existing skills)
- Research/experimentation phase (quick iteration)
- Using Python-only libraries with no Rust equivalent
Choose PAIML Ecosystem When:
- Production deployment at scale
- Edge/embedded/airgapped environments
- Regulatory compliance (healthcare, finance, government)
- Security and provenance are mandatory
- Deployment simplicity is priority
- Long-term maintainability matters
- Migrating existing Python/C/Bash codebases
- Using HuggingFace models (GGUF/SafeTensors import = full access)
Quick Start Commands
Sovereign AI Stack
cargo install batuta aprender
batuta analyze --languages --dependencies --tdg
batuta oracle "How do I serve a Llama model locally?"
MCP Tooling
cargo install pmcp pforge-cli pmat
# Build MCP server with pmcp
cargo pmcp new my-mcp-workspace
cargo pmcp dev --server myserver
# Declarative MCP with pforge
pforge new my-server && pforge serve
# Code quality with pmat
pmat context --output context.md
pmat analyze tdg
Testing & Quality Tools
# Static analysis with pmat
cargo install pmat
pmat quality-gate # Run all quality checks
pmat analyze tdg # Technical debt grade
pmat analyze satd # Self-admitted technical debt
# Defect intelligence with oip
cargo install oip
oip extract-training-data --repo . # Analyze git history
oip localize --passed-coverage passed.lcov --failed-coverage failed.lcov
# Runtime testing with probar
cargo add jugar-probar --dev
# See: https://crates.io/crates/jugar-probar
Migration Tools
# Python → Rust
cargo install depyler
depyler compile script.py -o myapp
# C → Rust
cargo install decy
decy transpile-project src/ -o rust_output/
# Safe shell scripts
cargo install bashrs
bashrs build install.rs -o install.sh
bashrs purify messy.sh -o clean.sh
# New Rust-first scripting
cargo install ruchy
ruchy compile script.ruchy -o myapp
Resources
| Resource | Link |
|---|---|
| Sovereign AI Stack | |
| Interactive Examples | interactive.paiml.com |
| Aprender (ML Library) | github.com/paiml/aprender |
| Batuta (Orchestration) | github.com/paiml/batuta |
| Trueno (Compute) | crates.io/crates/trueno |
| MCP & Tooling | |
| pmcp (MCP SDK) | github.com/paiml/rust-mcp-sdk |
| pforge (Declarative MCP) | github.com/paiml/pforge |
| pmat (Quality Toolkit) | github.com/paiml/paiml-mcp-agent-toolkit |
| Migration Tools | |
| depyler (Python→Rust) | github.com/paiml/depyler |
| decy (C→Rust) | github.com/paiml/decy |
| bashrs (Shell Safety) | github.com/paiml/bashrs |
| ruchy (Scripting) | github.com/paiml/ruchy |
Quality Standards Across Ecosystem
All PAIML projects follow Toyota Way principles:
| Standard | Target | Enforcement |
|---|---|---|
| Test Coverage | ≥80% | CI/pre-commit |
| Mutation Kill Rate | ≥80-90% | cargo-mutants |
| Clippy Warnings | 0 | CI blocking |
| Cyclomatic Complexity | ≤10 | PMAT gates |
| Technical Debt (SATD) | 0 | Zero TODO/FIXME |
| TDG Grade | A- minimum | PMAT scoring |
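Several of these gates can also be pinned in the source itself using standard Rust lint levels (a generic illustration, not a PAIML-specific mechanism):

```rust
// Crate-level lint configuration mirroring the table above:
// zero warnings, zero unwrap()/expect() in library code.
#![deny(warnings)]
#![deny(clippy::unwrap_used, clippy::expect_used)]

pub fn parse_port(s: &str) -> Result<u16, std::num::ParseIntError> {
    s.parse() // returns the error to the caller; `.unwrap()` would fail the lint
}
```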
One-Liner Summary
Python ML is a C/C++ stack with scripting glue. The PAIML ecosystem replaces the entire tower with compile-time correctness, single-binary deployment, cryptographic sovereignty, access to ALL HuggingFace models via GGUF/SafeTensors import, and automated migration from Python, C, and Bash.
Roadmap
This chapter is under development.
Coming soon: detailed information about the roadmap.
Contributing Guide
Thank you for your interest in contributing to Batuta!
Getting Started
Prerequisites
- Rust 1.75+ (stable)
- Git
- Cargo
Clone and Build
git clone https://github.com/paiml/batuta.git
cd batuta
cargo build
cargo test
Development Workflow
Branch Strategy
All work happens on the main branch. No feature branches.
Quality Gates
Before committing, ensure:
# Format code
cargo fmt
# Run lints
cargo clippy -- -D warnings
# Run tests
cargo test
# Check demo-score (must be A- or higher)
pmat demo-score
Commit Messages
Follow conventional commits:
type(scope): description
- feat: New feature
- fix: Bug fix
- docs: Documentation
- refactor: Code refactoring
- test: Tests
- chore: Maintenance
Example:
feat(stack): Add diagnostics module
- Add anomaly detection
- Add graph metrics
- Add dashboard rendering
(Refs STACK-DIAG)
Code Style
Rust Guidelines
- Use rustfmt defaults
- No unwrap() in library code (use ? or expect() with a message); see the sketch below
- Document public APIs with doc comments
- Add tests for new functionality
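A small illustration of these guidelines together (illustrative code, not from the Batuta codebase):

```rust
use std::path::Path;

/// Reads a UTF-8 manifest file into a `String`.
///
/// # Errors
/// Returns an `io::Error` if the file is missing or unreadable.
pub fn read_manifest(path: &Path) -> std::io::Result<String> {
    let text = std::fs::read_to_string(path)?; // `?`, never unwrap(), in library code
    Ok(text.trim_end().to_string())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn missing_file_is_an_error() {
        assert!(read_manifest(Path::new("/definitely/not/here")).is_err());
    }
}
```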
Documentation
- Update book chapters for new features
- Keep README current
- Add examples for complex features
Testing
Test Categories
# Unit tests
cargo test --lib
# Integration tests
cargo test --test '*'
# Examples
cargo run --example <name>
Quality Metrics
- Coverage: 85%+ target
- Mutation score: 80%+ target
- Demo score: A- (85) minimum
Pull Requests
- Ensure all quality gates pass
- Update documentation
- Add tests for new code
- Reference issue/ticket in commit
Questions?
- Open an issue on GitHub
- Check existing documentation
License
Batuta is licensed under the MIT License.
MIT License
MIT License
Copyright (c) 2024 Pragmatic AI Labs
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
What This Means
You are free to:
- Use Batuta commercially
- Modify the source code
- Distribute copies
- Include in proprietary software
You must:
- Include the license in copies
- Include the copyright notice
Third-Party Licenses
Batuta depends on various open-source libraries. See Cargo.toml for the full list. All dependencies use permissive licenses (MIT, Apache-2.0, BSD).
Stack Component Licenses
| Component | License |
|---|---|
| Trueno | MIT |
| Aprender | MIT |
| Realizar | MIT |
| Depyler | MIT |
| Batuta | MIT |
| All PAIML crates | MIT |