Oracle Mode
“Ask the Oracle, receive the wisdom of the stack.”
Oracle Mode is the intelligent query interface for the Sovereign AI Stack. Instead of manually researching which components to use, Oracle Mode guides you to the optimal solution based on your requirements.
Overview
Oracle Mode provides:
- Knowledge Graph: Complete registry of stack components with capabilities
- Natural Language Interface: Query in plain English
- Intelligent Recommendations: Algorithm and backend selection
- Code Generation: Ready-to-use examples
┌──────────────────────────────────────────────────────────────────┐
│ ORACLE MODE ARCHITECTURE │
└──────────────────────────────────────────────────────────────────┘
┌─────────────────┐
│ Natural Query │
│ "Train RF" │
└────────┬────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ QUERY ENGINE │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────────────────┐ │
│ │ Domain │ │ Algorithm │ │ Performance │ │
│ │ Detection │ │ Extraction │ │ Hints │ │
│ └─────────────┘ └──────────────┘ └──────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ KNOWLEDGE GRAPH │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ Layer 0: Primitives → trueno, trueno-db, trueno-graph │ │
│ │ Layer 1: ML → aprender │ │
│ │ Layer 2: Pipeline → entrenar, realizar │ │
│ │ Layer 3: Transpilers → depyler, decy, bashrs, ruchy │ │
│ │ Layer 4: Orchestration→ batuta, repartir │ │
│ │ Layer 5: Quality → certeza, pmat, renacer │ │
│ │ Layer 6: Data → alimentar │ │
│ └───────────────────────────────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ RECOMMENDER │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────────────────┐ │
│ │ Component │ │ Backend │ │ Distribution │ │
│ │ Selection │ │ Selection │ │ Decision │ │
│ └─────────────┘ └──────────────┘ └──────────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
↓
┌─────────────────┐
│ Response │
│ + Code Example │
└─────────────────┘
The Sovereign AI Stack
Oracle Mode knows all 20 components in the stack:
| Layer | Components | Purpose |
|---|---|---|
| L0: Primitives | trueno, trueno-db, trueno-graph, trueno-viz, trueno-rag | SIMD/GPU compute, vector storage, graph ops, RAG |
| L1: ML | aprender | First-principles ML algorithms |
| L2: Pipeline | entrenar, realizar | Training loops, inference runtime |
| L3: Transpilers | depyler, decy, bashrs, ruchy | Python/C transpilers + Rust↔Shell bidirectional |
| L4: Orchestration | batuta, repartir, pforge | Migration workflow, distributed compute, MCP servers |
| L5: Quality | certeza, pmat, renacer | Testing, profiling, syscall tracing |
| L6: Data | alimentar, pacha | Data loading, model/recipe registry |
Basic Usage
CLI Interface
# List all stack components
$ batuta oracle --list
# Show component details
$ batuta oracle --show trueno
# Find components by capability
$ batuta oracle --capabilities simd
# Query integration patterns
$ batuta oracle --integrate aprender realizar
# Interactive mode
$ batuta oracle --interactive
Interactive Mode
$ batuta oracle --interactive
🔮 Oracle Mode - Ask anything about the Sovereign AI Stack
oracle> How do I train a random forest on 1M samples?
📊 Analysis:
Problem class: Supervised Learning
Algorithm: random_forest
Data size: Large (1M samples)
💡 Primary Recommendation: aprender
Path: aprender::tree::RandomForest
Confidence: 95%
Rationale: Random forest is ideal for large tabular datasets
🔧 Backend: SIMD
Rationale: SIMD vectorization optimal for 1M samples with High complexity
📦 Supporting Components:
- trueno (95%): SIMD-accelerated tensor operations
- alimentar (70%): Parallel data loading
💻 Code Example:
use aprender::tree::RandomForest;
use alimentar::Dataset;
let dataset = Dataset::from_csv("data.csv")?;
let (x, y) = dataset.split_features_target("label")?;
let model = RandomForest::new()
.n_estimators(100)
.max_depth(Some(10))
.n_jobs(-1) // Use all cores
.fit(&x, &y)?;
📚 Related Queries:
- How to optimize random forest hyperparameters?
- How to serialize trained models with realizar?
- How to distribute training with repartir?
Backend Selection
Oracle Mode selects the optimal compute backend using Amdahl's Law and a PCIe transfer-overhead model (Gregg & Hazelwood, 2011).
The 5× Rule
GPU dispatch is only beneficial when compute time exceeds 5× the PCIe transfer time:
If compute_time > 5 × transfer_time → Use GPU
Otherwise → Use SIMD
Backend Decision Matrix
| Operation | Complexity | Small Data | Large Data | GPU Available |
|---|---|---|---|---|
| Element-wise | O(n) | Scalar | SIMD | SIMD (memory-bound) |
| Reductions | O(n) | Scalar | SIMD | SIMD |
| Matrix mult | O(n³) | SIMD | GPU | GPU |
| Conv2D | O(n²k²) | SIMD | GPU | GPU |
| Attention | O(n²d) | SIMD | GPU | GPU |
Backend Selection Example
oracle> What backend for 2048×2048 matrix multiplication?
🎯 Backend Selection:
Operation: Matrix multiplication
Size: 2048 × 2048 = 4.2M elements
Complexity: O(n³) → 2n³ ≈ 17.2B FLOPs
PCIe Transfer: 4.2M × 4 bytes × 2 = 34 MB
Transfer time: 34 MB / 32 GB/s = 1.06 ms
Compute time: 17.2B FLOPs / 20 TFLOPS = 0.86 ms
Ratio: 0.86 / 1.06 = 0.81× (< 5×)
💡 Recommendation: SIMD
Rationale: PCIe overhead dominates. Use trueno SIMD backend.
GPU becomes beneficial at ~8192×8192.
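The rule reduces to a small cost model. The sketch below uses the transcript's illustrative constants (32 GB/s PCIe bandwidth, 20 TFLOPS GPU throughput) and the standard 2n³ FLOP count for matrix multiplication; it demonstrates the heuristic and is not trueno's actual dispatcher.

```rust
/// Illustrative constants mirroring the example above; not measured values.
const PCIE_BANDWIDTH_BYTES_PER_S: f64 = 32e9;
const GPU_FLOPS_PER_S: f64 = 20e12;

/// The 5x rule: GPU dispatch pays off only when estimated compute time
/// exceeds five times the PCIe transfer time.
fn prefer_gpu(flops: f64, bytes_over_pcie: f64) -> bool {
    let transfer_s = bytes_over_pcie / PCIE_BANDWIDTH_BYTES_PER_S;
    let compute_s = flops / GPU_FLOPS_PER_S;
    compute_s > 5.0 * transfer_s
}

fn main() {
    // 2048x2048 f32 matmul: 2n^3 FLOPs, two matrices crossing PCIe.
    let n = 2048.0_f64;
    let (flops, bytes) = (2.0 * n.powi(3), n * n * 4.0 * 2.0);
    // Transfer dominates at this size, so SIMD wins.
    println!("2048x2048: prefer GPU? {}", prefer_gpu(flops, bytes));
}
```

Because compute grows as n³ while transfer grows as n², the ratio rises linearly with n, which is why the GPU eventually wins at larger matrix sizes.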
Distribution Decision
Oracle uses Amdahl’s Law for distribution decisions:
Speedup = 1 / ((1 - P) + P/N)
Where:
P = Parallel fraction of workload
N = Number of nodes
Distribution Example
oracle> Should I distribute random forest on 4 nodes?
📊 Amdahl's Law Analysis:
Algorithm: Random Forest
Parallel fraction: 0.95 (tree training is parallelizable)
Nodes: 4
Theoretical speedup: 1 / (0.05 + 0.95/4) = 3.48×
Communication overhead: ~10% per node × 4 nodes = 40%
Effective speedup: 3.48 × 0.6 = 2.09×
💡 Recommendation: Yes, distribute with repartir
Expected speedup: 2.09×
Break-even: 2+ nodes
📦 Code Example:
use repartir::{Executor, WorkStealing};
use aprender::tree::RandomForest;
let executor = Executor::new()
.with_workers(4)
.with_scheduler(WorkStealing);
let forest = executor.map(
trees.chunks(25),
|chunk| train_tree_subset(chunk, &data)
).await?;
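The transcript's arithmetic can be reproduced directly. The linear 10%-per-node communication penalty is the heuristic the example uses, not a general law:

```rust
/// Amdahl's Law: speedup = 1 / ((1 - P) + P/N).
fn amdahl_speedup(parallel_fraction: f64, nodes: f64) -> f64 {
    1.0 / ((1.0 - parallel_fraction) + parallel_fraction / nodes)
}

/// Theoretical speedup discounted by a linear communication penalty
/// (the example assumes ~10% of the workload per node).
fn effective_speedup(parallel_fraction: f64, nodes: f64, overhead_per_node: f64) -> f64 {
    amdahl_speedup(parallel_fraction, nodes) * (1.0 - overhead_per_node * nodes)
}

fn main() {
    // Random forest: P = 0.95, N = 4, ~10% overhead per node.
    let theoretical = amdahl_speedup(0.95, 4.0);
    let effective = effective_speedup(0.95, 4.0, 0.10);
    println!("theoretical {theoretical:.2}x, effective {effective:.2}x");
}
```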
Knowledge Graph Queries
Find by Capability
oracle> What components support GPU?
🔍 Components with GPU capability:
- trueno: SIMD-accelerated tensor operations with GPU dispatch
- realizar: GPU-accelerated inference runtime
Find by Domain
oracle> What do I need for graph analytics?
🧠 Graph Analytics Components:
- trueno-graph: Graph traversal and algorithms
- trueno-db: Vector storage with graph indexes
Integration Patterns
oracle> How do I integrate depyler with aprender?
🔗 Integration: depyler → aprender
Pattern: sklearn_migration
Description: Convert sklearn code to aprender
Example:
# Original Python (sklearn)
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=100)
model.fit(X, y)
# After depyler transpilation
use aprender::tree::RandomForest;
let model = RandomForest::new()
.n_estimators(100)
.fit(&x, &y)?;
Academic Foundations
Oracle Mode is grounded in peer-reviewed research:
| Concept | Reference | Application |
|---|---|---|
| PCIe overhead | Gregg & Hazelwood (2011) | Backend selection |
| Amdahl’s Law | Amdahl (1967) | Distribution decisions |
| Roofline model | Williams et al. (2009) | Performance bounds |
| SIMD vectorization | Fog (2022) | Optimization hints |
| Random forests | Breiman (2001) | Algorithm recommendations |
JSON Output
For programmatic access, use --format json:
$ batuta oracle --format json "random forest large data"
{
"problem_class": "Supervised Learning",
"algorithm": "random_forest",
"primary": {
"component": "aprender",
"path": "aprender::tree::RandomForest",
"confidence": 0.95,
"rationale": "Random forest is ideal for large tabular datasets"
},
"supporting": [
{
"component": "trueno",
"confidence": 0.95,
"rationale": "SIMD-accelerated tensor operations"
}
],
"compute": {
"backend": "SIMD",
"rationale": "SIMD vectorization optimal for large datasets"
},
"distribution": {
"needed": false,
"rationale": "Single-node sufficient for this workload size"
},
"code_example": "use aprender::tree::RandomForest;..."
}
Code Output
For Unix pipeline composition, use --format code to extract raw Rust code with no ANSI escapes and no metadata:
# From a natural language query
$ batuta oracle "train a random forest" --format code
use aprender::tree::RandomForest;
let model = RandomForest::new()
.n_estimators(100)
.max_depth(Some(10))
.fit(&x, &y)?;
# From a cookbook recipe
$ batuta oracle --recipe ml-random-forest --format code
# From an integration pattern
$ batuta oracle --integrate "aprender,realizar" --format code
# Pipe through rustfmt and copy
$ batuta oracle --recipe training-lora --format code | rustfmt | pbcopy
# Dump all recipes with delimiter comments
$ batuta oracle --cookbook --format code
// --- ml-random-forest ---
use aprender::prelude::*;
...
// --- ml-serving ---
use realizar::prelude::*;
...
Code output follows the Jidoka principle: when no code is available, the process exits with code 1 and a stderr diagnostic rather than emitting garbage. Commands like --list, --capabilities, and --rag have no code representation and always exit 1 with --format code.
TDD Test Companions
Every code example — both cookbook recipes and recommender-generated snippets — includes a TDD test companion: a #[cfg(test)] module with 3-4 focused tests. Test companions follow PMAT compliance rules: low cyclomatic complexity, single assertion per test, real crate types.
When using --format code, test companions are appended after the main code:
$ batuta oracle --recipe ml-random-forest --format code
use aprender::tree::RandomForest;
let model = RandomForest::new()
.n_estimators(100)
.max_depth(Some(10))
.fit(&x, &y)?;
#[cfg(test)]
mod tests {
#[test]
fn test_random_forest_construction() {
let n_estimators = 100;
let max_depth = Some(10);
assert!(n_estimators > 0);
assert!(max_depth.unwrap() > 0);
}
#[test]
fn test_prediction_count_matches_input() {
let n_samples = 50;
let predictions = vec![0usize; n_samples];
assert_eq!(predictions.len(), n_samples);
}
#[test]
fn test_feature_importance_sums_to_one() {
let importances = vec![0.4, 0.35, 0.25];
let sum: f64 = importances.iter().sum();
assert!((sum - 1.0).abs() < 1e-10);
}
}
Test companion categories:
| Recipe Type | Test Approach |
|---|---|
| Pure Rust (28 recipes) | Full #[cfg(test)] mod tests block |
| Python+Rust (2 recipes) | Test Rust portion only |
| WASM (3 recipes) | #[cfg(all(test, not(target_arch = "wasm32")))] guard |
| Recommender (5 examples) | Embedded in code_example string |
Recommender code examples (batuta oracle "train a model" --format code) also include test companions inline, so the output is always test-ready.
# Count test companions across all recipes
$ batuta oracle --cookbook --format code 2>/dev/null | grep -c '#\[cfg('
34
# Pipe a recipe with tests through rustfmt
$ batuta oracle --recipe ml-random-forest --format code | rustfmt
See docs/specifications/code-snippets.md for the full specification with Popperian falsification protocol.
Programmatic API
Use Oracle Mode from Rust code:
#![allow(unused)]
fn main() {
use batuta::oracle::{Recommender, OracleQuery, DataSize, HardwareSpec};
// Natural language query
let recommender = Recommender::new();
let response = recommender.query("train random forest on 1M samples");
println!("Primary: {}", response.primary.component);
println!("Backend: {:?}", response.compute.backend);
// Structured query with constraints
let query = OracleQuery::new("neural network training")
.with_data_size(DataSize::samples(1_000_000))
.with_hardware(HardwareSpec::with_gpu(16.0))
.sovereign_only();
let response = recommender.query_structured(&query);
if response.distribution.needed {
println!("Distribute with: {:?}", response.distribution.tool);
}
}
RAG Oracle (APR-Powered)
The RAG Oracle extends Oracle Mode with Retrieval-Augmented Generation for stack documentation. It indexes all CLAUDE.md and README.md files from stack components and provides semantic search.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ RAG ORACLE PIPELINE │
└─────────────────────────────────────────────────────────────────┘
┌─────────────┐ ┌─────────────────┐ ┌─────────────────────────┐
│ Source │ │ Semantic │ │ Content-Addressable │
│ Docs │ → │ Chunker │ → │ Index (BLAKE3) │
│ (P0-P3) │ │ (Code-aware) │ │ (Poka-Yoke) │
└─────────────┘ └─────────────────┘ └─────────────────────────┘
↓
┌─────────────┐ ┌─────────────────┐ ┌─────────────────────────┐
│ Results │ │ RRF Fusion │ │ Hybrid Retrieval │
│ + Scores │ ← │ (k=60) │ ← │ (BM25 + Dense) │
└─────────────┘ └─────────────────┘ └─────────────────────────┘
Toyota Production System Integration
The RAG Oracle applies Toyota Way principles:
| Principle | Implementation |
|---|---|
| Jidoka | Stop-on-error validation (NaN/Inf detection, dimension mismatch) |
| Poka-Yoke | Content hashing prevents stale indexes (BLAKE3) |
| Heijunka | Load-leveled reindexing via priority queue |
| Muda | Delta-only updates skip unchanged documents |
| Kaizen | Model hash tracking for continuous improvement |
Index Persistence (Section 9.7)
The RAG index is persisted to disk for fast startup and offline usage:
Cache Location: ~/.cache/batuta/rag/
Cache Files:
~/.cache/batuta/rag/
├── manifest.json # Version, checksums, timestamps
├── index.json # Inverted index (BM25 terms)
└── documents.json # Document metadata + chunks
Integrity Validation (Jidoka):
- BLAKE3 checksums for index.json and documents.json
- Version compatibility check (major version must match)
- Checksum mismatch triggers load failure (stop-on-error)
Persistence Flow:
Index (CLI) Persist Load (CLI)
─────────── ─────── ──────────
batuta oracle ┌───────┐ batuta oracle
--rag-index ────▶ │ Cache │ ────▶ --rag "query"
└───────┘
│
▼
batuta oracle ──────▶ Stats
--rag-stats (no full load)
batuta oracle ──────▶ Full Rebuild (two-phase save)
--rag-index-force
RAG CLI Commands
# Index all stack documentation (CLAUDE.md, README.md)
$ batuta oracle --rag-index
📚 RAG Indexer (Heijunka Mode)
──────────────────────────────────────────────────
Scanning stack repositories...
✓ trueno/CLAUDE.md ████████░░░░░░░ (12 chunks)
✓ trueno/README.md ██████░░░░░░░░░ (8 chunks)
✓ aprender/CLAUDE.md ██████████░░░░░ (15 chunks)
...
Complete: 16 documents, 142 chunks indexed
Vocabulary: 2847 unique terms
Avg doc length: 89.4 tokens
# Query with RAG
$ batuta oracle --rag "How do I use SIMD for matrix operations?"
🔍 RAG Oracle Mode
──────────────────────────────────────────────────
Index: 16 documents, 142 chunks
Query: How do I use SIMD for matrix operations?
1. [trueno] trueno/CLAUDE.md#42 ████████░░ 78%
Trueno provides SIMD-accelerated tensor ops...
2. [trueno] trueno/README.md#15 ██████░░░░ 62%
Matrix multiplication with AVX2/AVX-512...
# Show TUI dashboard (native only)
$ batuta oracle --rag-dashboard
# Show cache statistics (fast, manifest only)
$ batuta oracle --rag-stats
📊 RAG Index Statistics
──────────────────────────────────────────────────
Version: 1.0.0
Batuta version: 0.6.2
Indexed at: 2025-01-30 14:23:45 UTC
Sources:
- trueno: 4 docs, 42 chunks
- aprender: 3 docs, 38 chunks
- hf-ground-truth-corpus: 12 docs, 100 chunks
# Force rebuild (old cache retained until save completes)
$ batuta oracle --rag-index-force
Force rebuild requested (old cache retained until save)...
📚 RAG Indexer (Heijunka Mode)
...
RAG TUI Dashboard
The dashboard shows real-time index health, query latency, and retrieval quality:
┌─ Oracle RAG Dashboard ──────────────────────────────────────┐
│ Index Health: 95% | Docs: 16 | Chunks: 142 │
├─────────────────────────────────────────────────────────────┤
│ │
│ Index Status Query Latency │
│ ───────────── ───────────── │
│ > trueno ████████░░ 42 ▁▂▃▄▅▆▇█▆▅▃▂▁ │
│ aprender █████████░ 38 avg: 12ms p99: 45ms │
│ realizar ██████░░░░ 24 │
│ entrenar █████░░░░░ 18 Retrieval Quality │
│ ───────────────── │
│ Recent Queries MRR 0.847 ████████░░ │
│ ───────────── NDCG 0.791 ███████░░░ │
│ 12:34:56 "SIMD tensor" trueno R@10 0.923 █████████░ │
│ 12:34:41 "train model" aprender │
│ │
├─────────────────────────────────────────────────────────────┤
│ [q]uit [r]efresh [↑/↓]navigate │
└─────────────────────────────────────────────────────────────┘
Hybrid Retrieval
RAG Oracle uses hybrid retrieval combining:
- BM25 (Sparse): Term-based matching with IDF weighting
- Dense Retrieval: Embedding-based semantic similarity (placeholder for trueno-db)
- RRF Fusion: Reciprocal Rank Fusion (k=60) combines both rankings
RRF Score = Σ 1/(k + rank) for each retriever
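A minimal RRF implementation, assuming string document IDs and rank lists ordered best-first:

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over several ranked lists (best first).
/// k = 60 is the constant the RAG Oracle uses.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc) in ranking.iter().enumerate() {
            let rank = i as f64 + 1.0; // RRF ranks are 1-based
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<_> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    let bm25 = vec!["trueno", "aprender", "realizar"];
    let dense = vec!["trueno", "entrenar", "aprender"];
    // Documents ranked well by both retrievers rise to the top.
    for (doc, score) in rrf_fuse(&[bm25, dense], 60.0) {
        println!("{doc}: {score:.4}");
    }
}
```

Because each retriever contributes 1/(k + rank) regardless of its raw score scale, RRF needs no score normalization between BM25 and dense similarity.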
Scalar Int8 Rescoring (Two-Stage Retrieval)
For large-scale dense retrieval, the RAG Oracle implements scalar int8 rescoring based on the HuggingFace embedding quantization research:
┌─────────────────────────────────────────────────────────────────┐
│ TWO-STAGE RESCORING PIPELINE │
└─────────────────────────────────────────────────────────────────┘
Stage 1: Fast Approximate Search Stage 2: Precise Rescoring
──────────────────────────────── ──────────────────────────
┌─────────────┐ ┌─────────────────────────┐
│ Query (f32) │ │ Top 4k candidates │
│ → int8 │ ─────────────────────▶ │ (from Stage 1) │
│ │ i8 × i8 dot product │ │
└─────────────┘ O(n) fast scan │ f32 × i8 rescoring │
│ │ with scale factor │
▼ │ │
┌─────────────┐ │ Final top-k ranking │
│ Index (int8)│ └─────────────────────────┘
│ 4× smaller │
└─────────────┘
Benefits:
- 4× memory reduction (f32 → int8)
- 99% accuracy retention with rescoring
- 3.66× speedup via SIMD acceleration
SIMD Backend Detection:
| Backend | Ops/Cycle | Platforms |
|---|---|---|
| AVX-512 | 64 | Intel Skylake-X, Ice Lake |
| AVX2 | 32 | Intel Haswell+, AMD Zen+ |
| NEON | 16 | ARM64 (M1/M2, Raspberry Pi) |
| Scalar | 1 | Universal fallback |
Quantization (Kaizen):
The quantization uses absmax symmetric quantization with Welford’s online algorithm for numerically stable calibration:
scale = absmax / 127
quantized[i] = clamp(round(x[i] / scale), -128, 127)
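The quantization formula and the two-stage pipeline can be sketched in plain Rust without SIMD. Function names here are illustrative, not the crate's API:

```rust
/// Absmax symmetric quantization, per the formula above.
fn quantize(v: &[f32]) -> (Vec<i8>, f32) {
    let absmax = v.iter().fold(0.0_f32, |m, x| m.max(x.abs()));
    let scale = if absmax == 0.0 { 1.0 } else { absmax / 127.0 };
    let q = v.iter()
        .map(|x| (x / scale).round().clamp(-128.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Stage 1: fast approximate score (i8 x i8, widened to i32).
fn dot_i8(a: &[i8], b: &[i8]) -> i32 {
    a.iter().zip(b).map(|(x, y)| *x as i32 * *y as i32).sum()
}

/// Stage 2: precise rescoring of the f32 query against an int8
/// document, applying the document's scale factor.
fn rescore(query: &[f32], doc: &[i8], doc_scale: f32) -> f32 {
    query.iter().zip(doc).map(|(q, d)| q * *d as f32 * doc_scale).sum()
}

fn main() {
    let query = [0.9_f32, 0.1, 0.2];
    let docs: Vec<(Vec<i8>, f32)> =
        [[0.8_f32, 0.2, 0.1], [0.1, 0.9, 0.3], [0.7, 0.0, 0.4]]
            .iter().map(|d| quantize(d)).collect();
    let (q8, _) = quantize(&query);

    // Stage 1: rank every document by the cheap int8 dot product.
    let mut order: Vec<usize> = (0..docs.len()).collect();
    order.sort_by_key(|&i| -dot_i8(&q8, &docs[i].0));

    // Stage 2: rescore the shortlist at higher precision.
    for &i in order.iter().take(2) {
        println!("doc {i}: rescored {:.3}", rescore(&query, &docs[i].0, docs[i].1));
    }
}
```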
Run the Demo:
# Run the scalar int8 rescoring demo
cargo run --example int8_rescore_demo --features native
# Output:
# 🚀 Scalar Int8 Rescoring Retriever Demo
# 🖥️ Detected SIMD Backend: AVX-512
# Int8 operations per cycle: 64
# 📊 Memory Comparison (10 documents × 384 dims):
# f32 storage: 15360 bytes
# int8 storage: 4320 bytes
# Compression: 3.56×
See docs/specifications/retriever-spec.md for the full specification with 100-point Popperian falsification checklist.
Document Priority (Genchi Genbutsu)
Documents are indexed with priority levels:
| Priority | Source | Trigger |
|---|---|---|
| P0 | CLAUDE.md | Every commit |
| P1 | README.md, Cargo.toml, pyproject.toml | On release |
| P2 | docs/**/*.md, src/**/*.py | Weekly scan |
| P3 | examples/**/*.rs, tests/**/*.py, Docstrings | Monthly scan |
Ground Truth Corpora (Cross-Language)
The RAG Oracle indexes external ground truth corpora for cross-language ML pattern discovery:
┌─────────────────────────────────────────────────────────────────┐
│ GROUND TRUTH CORPUS ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Rust Stack │ │ Python Corpus │ │
│ │ (trueno, etc) │ │ (hf-gtc) │ │
│ │ CLAUDE.md │ │ CLAUDE.md │ │
│ │ README.md │ │ src/**/*.py │ │
│ └────────┬─────────┘ └────────┬─────────┘ │
│ │ │ │
│ └─────────────┬─────────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ RAG Oracle Index (BM25 + Dense) │ │
│ │ Cross-language search for ML patterns │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Query: "How do I tokenize text for BERT?" │
│ ↓ │
│ Results: hf-gtc/preprocessing/tokenization.py │
│ + candle/trueno Rust equivalent │
│ │
└─────────────────────────────────────────────────────────────────┘
HuggingFace Ground Truth Corpus
Location: ../hf-ground-truth-corpus
A curated collection of production-ready Python recipes for HuggingFace ML workflows:
- 95%+ test coverage with property-based testing (Hypothesis)
- Module structure: hf_gtc.hub, hf_gtc.inference, hf_gtc.preprocessing, hf_gtc.training
- Cross-references: Maps Python patterns to Rust equivalents (candle/trueno)
Query Examples:
# Query for Python ML patterns
$ batuta oracle --rag "How do I tokenize text for BERT?"
# Returns: hf_gtc/preprocessing/tokenization.py + candle equivalent
$ batuta oracle --rag "sentiment analysis pipeline"
# Returns: hf_gtc/inference/pipelines.py patterns
Extending Ground Truth
To add new ground truth corpora:
- Add directory to python_corpus_dirs in src/cli/oracle.rs:cmd_oracle_rag_index()
- Ensure corpus has CLAUDE.md and README.md for P0/P1 indexing
- Python source in src/**/*.py is indexed as P2
- Run batuta oracle --rag-index to rebuild index
Python Chunking
Python files use specialized delimiters for semantic chunking:
| Delimiter | Purpose |
|---|---|
| \ndef | Function definitions |
| \nclass | Class definitions |
| \n def | Method definitions |
| \nasync def | Async function definitions |
| \n## | Markdown section headers |
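A minimal chunker in the spirit of this table, splitting at top-level def, class, and async def boundaries (the real SemanticChunker also enforces size and overlap limits):

```rust
/// Split Python source into chunks starting at top-level definitions.
/// A sketch only; delimiter list trimmed to the three function/class forms.
fn chunk_python(source: &str) -> Vec<String> {
    let delimiters = ["\ndef ", "\nclass ", "\nasync def "];
    let mut cut_points = vec![0usize];
    for (i, _) in source.char_indices() {
        if delimiters.iter().any(|d| source[i..].starts_with(d)) {
            cut_points.push(i + 1); // start the chunk at the `def`/`class` line
        }
    }
    cut_points.push(source.len());
    cut_points
        .windows(2)
        .map(|w| source[w[0]..w[1]].to_string())
        .filter(|c| !c.trim().is_empty())
        .collect()
}

fn main() {
    let src = "import os\n\ndef a():\n    pass\n\nclass B:\n    pass\n";
    for (i, chunk) in chunk_python(src).iter().enumerate() {
        println!("--- chunk {i} ---\n{chunk}");
    }
}
```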
Programmatic RAG API
#![allow(unused)]
fn main() {
use batuta::oracle::rag::{RagOracle, ChunkerConfig, SemanticChunker};
// Create RAG Oracle
let oracle = RagOracle::new();
// Query the index
let results = oracle.query("SIMD tensor operations");
for result in results {
println!("{}: {} (score: {:.2})",
result.component,
result.source,
result.score
);
}
// Custom chunking
let config = ChunkerConfig::new(512, 64, &["\n## ", "\nfn "]);
let chunker = SemanticChunker::from_config(&config);
let chunks = chunker.split(content);
}
Auto-Update System
The RAG index stays fresh automatically through a three-layer freshness system:
Layer 1: Shell Auto-Fresh (ora-fresh)
On every shell login, ora-fresh runs in the background to check index freshness:
# Runs automatically on shell login (non-blocking)
ora-fresh
# Manual check
ora-fresh
✅ Index is fresh (3h old)
# When stale
ora-fresh
📚 Stack changed since last index, refreshing...
ora-fresh checks two conditions:
- Stale marker: ~/.cache/batuta/rag/.stale (set by post-commit hooks)
- Age: Index older than 24 hours
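A sketch of those two checks, assuming the cache layout shown earlier; ora-fresh's actual implementation may differ:

```rust
use std::fs;
use std::path::Path;
use std::time::{Duration, SystemTime};

/// The two freshness conditions ora-fresh checks. Function name and
/// error handling are illustrative, not batuta's actual code.
fn index_is_stale(cache_dir: &Path, max_age: Duration) -> bool {
    // Condition 1: a post-commit hook touched the stale marker.
    if cache_dir.join(".stale").exists() {
        return true;
    }
    // Condition 2: the manifest is older than the age threshold.
    // A missing manifest means the index was never built.
    match fs::metadata(cache_dir.join("manifest.json")).and_then(|m| m.modified()) {
        Ok(modified) => SystemTime::now()
            .duration_since(modified)
            .map(|age| age > max_age)
            .unwrap_or(true),
        Err(_) => true,
    }
}

fn main() {
    let dir = std::env::temp_dir().join("rag-freshness-demo");
    fs::create_dir_all(&dir).unwrap();
    let day = Duration::from_secs(24 * 60 * 60);
    println!("stale: {}", index_is_stale(&dir, day));
}
```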
Layer 2: Post-Commit Hooks (26 repos)
Every commit in any Sovereign AI Stack repository touches a stale marker file:
# .git/hooks/post-commit (installed in all 26 stack repos)
#!/bin/bash
touch "$HOME/.cache/batuta/rag/.stale" 2>/dev/null
This is a zero-overhead signal — the next ora-fresh invocation picks it up and triggers a reindex. No work is done at commit time beyond a single touch call.
Layer 3: Fingerprint-Based Change Detection (BLAKE3)
When a reindex is triggered, BLAKE3 content fingerprints prevent unnecessary work:
batuta oracle --rag-index
✅ Index is current (no files changed since last index)
Each indexed file has a DocumentFingerprint containing:
- Content hash: BLAKE3 hash of file contents
- Chunker config hash: Detects chunking parameter changes
- Model hash: Detects embedding model changes
If no fingerprints have changed, the entire reindex is skipped instantly.
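The skip logic can be sketched as a fingerprint comparison. The real index hashes with BLAKE3; std's DefaultHasher stands in here so the example needs no external crates, and the DocumentFingerprint fields follow the list above:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Per-document fingerprint: content, chunker config, and embedding model.
#[derive(PartialEq, Clone)]
struct DocumentFingerprint {
    content_hash: u64,
    chunker_config_hash: u64,
    model_hash: u64,
}

fn fingerprint(content: &str, chunker_cfg: &str, model: &str) -> DocumentFingerprint {
    // Stand-in hash; the production index uses BLAKE3.
    let h = |s: &str| {
        let mut hasher = DefaultHasher::new();
        s.hash(&mut hasher);
        hasher.finish()
    };
    DocumentFingerprint {
        content_hash: h(content),
        chunker_config_hash: h(chunker_cfg),
        model_hash: h(model),
    }
}

/// Reindex only when a document's fingerprint differs from the stored one
/// (or documents were added/removed).
fn needs_reindex(
    stored: &HashMap<String, DocumentFingerprint>,
    current: &HashMap<String, DocumentFingerprint>,
) -> bool {
    current.len() != stored.len()
        || current.iter().any(|(path, fp)| stored.get(path) != Some(fp))
}

fn main() {
    let (cfg, model) = ("chunk=512 overlap=64", "bm25-v1");
    let mut stored = HashMap::new();
    stored.insert("trueno/CLAUDE.md".to_string(), fingerprint("v1 docs", cfg, model));
    // Unchanged corpus: the whole reindex is skipped.
    let current = stored.clone();
    println!("reindex needed: {}", needs_reindex(&stored, &current));
}
```

Hashing the chunker config and model alongside the content is what forces a rebuild when chunking parameters or the embedding model change, even if no file was edited.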
┌─────────────────────────────────────────────────────────────────┐
│ AUTO-UPDATE FLOW │
└─────────────────────────────────────────────────────────────────┘
git commit ─────▶ post-commit hook
touch ~/.cache/batuta/rag/.stale
│
▼
shell login ────▶ ora-fresh (background)
checks .stale marker + 24h age
│
▼
batuta oracle ──▶ fingerprint check (BLAKE3)
--rag-index compare content hashes
skip if nothing changed
│
(changed)│(unchanged)
│ └──▶ "Index is current"
▼
Full reindex (~30s)
Persist new fingerprints
Manual Commands
# Check freshness (instant)
ora-fresh
# Reindex with change detection (skips if current)
batuta oracle --rag-index
# Force full reindex (ignores fingerprints)
batuta oracle --rag-index-force
Key Takeaways
- Query naturally: Ask in plain English, get precise answers
- Trust the math: Backend selection based on PCIe and Amdahl analysis
- Complete stack: All 20 components indexed with capabilities
- Code ready: Get working examples, not just recommendations
- Reproducible: JSON output for automation and CI/CD
Next Steps
Try Oracle Mode yourself:
# Run the Oracle demo
cargo run --example oracle_demo --features native
# Run the RAG Oracle demo
cargo run --example rag_oracle_demo --features native
# Run the Scalar Int8 Rescoring demo
cargo run --example int8_rescore_demo --features native
# Index stack documentation for RAG
batuta oracle --rag-index
# Query with RAG
batuta oracle --rag "How do I train a model?"
# Start interactive mode
batuta oracle --interactive
# Query from CLI
batuta oracle "How do I migrate sklearn to Rust?"