batuta oracle

Query the Sovereign AI Stack knowledge graph for component recommendations, backend selection, and integration patterns.

Synopsis

batuta oracle [OPTIONS] [QUERY]

Description

Oracle Mode provides an intelligent query interface to the Sovereign AI Stack. It analyzes your requirements and recommends:

  • Primary component for your task
  • Supporting components that integrate well
  • Compute backend (Scalar/SIMD/GPU/Distributed)
  • Code examples ready to use

Options

Option                        Description
--list                        List all stack components
--show <component>            Show details about a specific component
--capabilities <cap>          Find components by capability (e.g., simd, ml, transpilation)
--integrate <from> <to>       Show integration pattern between two components
--interactive                 Start interactive query mode
--format <format>             Output format: text (default), json, markdown, code, or code+svg
--arxiv                       Enrich results with relevant arXiv papers from builtin curated database
--arxiv-live                  Fetch live arXiv papers instead of builtin database
--arxiv-max <n>               Maximum arXiv papers to show (default: 3)
--rag                         Use RAG-based retrieval from indexed stack documentation
--rag-index                   Index/reindex stack documentation for RAG queries
--rag-index-force             Clear cache and rebuild index from scratch
--rag-stats                   Show cache statistics (fast, manifest only)
--rag-dashboard               Launch TUI dashboard for RAG index statistics
--rag-profile                 Enable RAG profiling output (timing breakdown)
--rag-trace                   Enable RAG tracing (detailed query execution trace)
--local                       Show local workspace status (~/src PAIML projects)
--dirty                       Show only dirty (uncommitted changes) projects
--publish-order               Show safe publish order respecting dependencies
--pmat-query                  Search functions via PMAT quality-annotated code search
--pmat-project-path <path>    Project path for PMAT query (defaults to current directory)
--pmat-limit <n>              Maximum number of PMAT results (default: 10)
--pmat-min-grade <grade>      Minimum TDG grade filter (A, B, C, D, F)
--pmat-max-complexity <n>     Maximum cyclomatic complexity filter
--pmat-include-source         Include source code in PMAT results
--pmat-all-local              Search across all local PAIML projects in ~/src
-h, --help                    Print help information

Examples

List Stack Components

$ batuta oracle --list

📚 Sovereign AI Stack Components:

Layer 0: Compute Primitives
  - trueno v0.8.8: SIMD-accelerated tensor operations + simulation testing framework
  - trueno-db v0.3.7: High-performance vector database
  - trueno-graph v0.1.4: Graph analytics engine
  - trueno-viz v0.1.5: Visualization toolkit

Layer 1: ML Algorithms
  - aprender v0.19.0: First-principles ML library

Layer 2: Training & Inference
  - entrenar v0.3.0: Training loop framework
  - realizar v0.3.0: ML inference runtime
...

Query Component Details

$ batuta oracle --show aprender

📦 Component: aprender v0.19.0

Layer: ML Algorithms
Description: Next-generation machine learning library in pure Rust

Capabilities:
  - random_forest (Machine Learning)
  - gradient_boosting (Machine Learning)
  - clustering (Machine Learning)
  - neural_networks (Machine Learning)

Integrates with:
  - trueno: Uses SIMD-accelerated tensor operations
  - realizar: Exports models for inference
  - alimentar: Loads training data

References:
  [1] Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5-32
  [2] Chen & Guestrin (2016). XGBoost: A Scalable Tree Boosting System

Find by Capability

$ batuta oracle --capabilities simd

🔍 Components with 'simd' capability:
  - trueno: SIMD-accelerated tensor operations

Natural Language Query

$ batuta oracle "How do I train a random forest on 1M samples?"

📊 Analysis:
  Problem class: Supervised Learning
  Algorithm: random_forest
  Data size: Large (1M samples)

💡 Primary Recommendation: aprender
   Path: aprender::tree::RandomForest
   Confidence: 95%

🔧 Backend: SIMD
   Rationale: SIMD vectorization optimal for 1M samples

💻 Code Example:
use aprender::tree::RandomForest;

let model = RandomForest::new()
    .n_estimators(100)
    .max_depth(Some(10))
    .fit(&x, &y)?;

Integration Patterns

$ batuta oracle --integrate depyler aprender

🔗 Integration: depyler → aprender

Pattern: sklearn_migration
Description: Convert sklearn code to aprender

Before (Python/sklearn):
  from sklearn.ensemble import RandomForestClassifier
  model = RandomForestClassifier(n_estimators=100)

After (Rust/aprender):
  use aprender::tree::RandomForest;
  let model = RandomForest::new().n_estimators(100);

Media Production Query

$ batuta oracle "render video from MLT"

📊 Problem Class: Media Production

🎯 Primary Recommendation
  Component: rmedia
  Confidence: 85%
  Rationale: rmedia is recommended for Media Production tasks

🔧 Supporting Components
  - whisper-apr (70%): Integrates via audio_extraction pattern
  - certeza (70%): Integrates via course_quality_gate pattern

💡 Example Code
  use rmedia::prelude::*;

  let timeline = Timeline::from_mlt("course.mlt")?;
  let job = RenderJob::new(&timeline)
      .output("output.mp4")
      .codec(Codec::H264 { crf: 23 })
      .resolution(1920, 1080);
  job.render()?;

$ batuta oracle --integrate whisper-apr,rmedia

🔗 Integration: whisper-apr → rmedia

Pattern: transcription_pipeline
Description: Transcribe course audio with whisper-apr, feed into rmedia subtitle pipeline

Code Example:
  // 1. Transcribe audio with whisper-apr
  let model = WhisperModel::from_apr("whisper-base.apr")?;
  let transcript = model.transcribe(&audio)?;

  // 2. Burn subtitles into video with rmedia
  rmedia::subtitle::burn_in("lecture.mp4", &transcript.srt(), "output.mp4")?;

Interactive Mode

$ batuta oracle --interactive

🔮 Oracle Mode - Ask anything about the Sovereign AI Stack

oracle> What's the fastest way to do matrix multiplication?

📊 Analysis:
  Problem class: Linear Algebra

💡 Primary Recommendation: trueno
   Confidence: 85%
   Rationale: SIMD-accelerated matrix operations

💻 Code Example:
use trueno::prelude::*;

let a = Tensor::from_vec(vec![1.0, 2.0, 3.0, 4.0]).reshape([2, 2]);
let b = Tensor::from_vec(vec![5.0, 6.0, 7.0, 8.0]).reshape([2, 2]);
let c = a.matmul(&b);

oracle> exit
Goodbye!

JSON Output

$ batuta oracle --format json "random forest"

{
  "problem_class": "Supervised Learning",
  "algorithm": "random_forest",
  "primary": {
    "component": "aprender",
    "path": "aprender::tree::RandomForest",
    "confidence": 0.9,
    "rationale": "Random forest for supervised learning"
  },
  "compute": {
    "backend": "SIMD",
    "rationale": "SIMD vectorization optimal"
  },
  "distribution": {
    "needed": false,
    "rationale": "Single-node sufficient"
  }
}

Code Output

Extract raw code snippets for piping to other tools: no ANSI escapes and no metadata, just code. All code output includes TDD test companions (#[cfg(test)] modules) appended after the main code:

# Extract code from a recipe (includes test companion)
$ batuta oracle --recipe ml-random-forest --format code
use aprender::tree::RandomForest;

let model = RandomForest::new()
    .n_estimators(100)
    .max_depth(Some(10))
    .fit(&x, &y)?;

#[cfg(test)]
mod tests {
    #[test]
    fn test_random_forest_construction() {
        let n_estimators = 100;
        assert!(n_estimators > 0);
    }
    // ... 2-3 more focused tests
}

# Natural language queries also include test companions
$ batuta oracle "train a model" --format code > example.rs

# Pipe to rustfmt and clipboard
$ batuta oracle --recipe training-lora --format code | rustfmt | pbcopy

# Dump all cookbook recipes as code (each includes test companion)
$ batuta oracle --cookbook --format code > all_recipes.rs

# Count test companions
$ batuta oracle --cookbook --format code 2>/dev/null | grep -c '#\[cfg('
34

# Commands without code exit with code 1
$ batuta oracle --list --format code
No code available for --list (try --format text)
$ echo $?
1

When the requested context has no code available (e.g., --list, --capabilities, --rag), the process exits with code 1 and a stderr diagnostic suggesting --format text.

RAG-Based Query

Query using Retrieval-Augmented Generation from indexed stack documentation:

$ batuta oracle --rag "How do I fine-tune a model with LoRA?"

🔍 RAG Oracle Query: "How do I fine-tune a model with LoRA?"

📄 Retrieved Documents (RRF-fused):
  1. entrenar/CLAUDE.md (score: 0.847)
     "LoRA (Low-Rank Adaptation) enables parameter-efficient fine-tuning..."

  2. aprender/CLAUDE.md (score: 0.623)
     "For training workflows, entrenar provides autograd and optimization..."

💡 Recommendation:
   Use `entrenar` for LoRA fine-tuning with quantization support (QLoRA).

💻 Code Example:
   use entrenar::lora::{LoraConfig, LoraTrainer};

   let config = LoraConfig::new()
       .rank(16)
       .alpha(32.0)
       .target_modules(&["q_proj", "v_proj"]);

   let trainer = LoraTrainer::new(model, config);
   trainer.train(&dataset)?;

Index Stack Documentation

Build or update the RAG index from stack CLAUDE.md files and ground truth corpora:

$ batuta oracle --rag-index

📚 RAG Indexer (Heijunka Mode)
──────────────────────────────────────────────────

Scanning Rust stack repositories...

  ✓ trueno/CLAUDE.md          ████████████░░░ (12 chunks)
  ✓ trueno/README.md          ████████░░░░░░░ (8 chunks)
  ✓ aprender/CLAUDE.md        ██████████████░ (15 chunks)
  ✓ realizar/CLAUDE.md        ████████░░░░░░░ (8 chunks)
  ...

Scanning Python ground truth corpora...

  ✓ hf-ground-truth-corpus/CLAUDE.md      ██████░░░░░░░░░ (6 chunks)
  ✓ hf-ground-truth-corpus/README.md      ████████████░░░ (12 chunks)
  ✓ src/hf_gtc/hub/search.py              ████░░░░░░░░░░░ (4 chunks)
  ✓ src/hf_gtc/preprocessing/tokenization.py ██████░░░░░░░░ (6 chunks)
  ...

──────────────────────────────────────────────────
Complete: 28 documents, 186 chunks indexed

Vocabulary: 3847 unique terms
Avg doc length: 89.4 tokens

Reindexer: 28 documents tracked

Query Ground Truth Corpora

Query for Python ML patterns and get cross-language results:

$ batuta oracle --rag "How do I tokenize text for BERT?"

🔍 RAG Oracle Mode
──────────────────────────────────────────────────
Index: 28 documents, 186 chunks

Query: How do I tokenize text for BERT?

1. [hf-ground-truth-corpus] src/hf_gtc/preprocessing/tokenization.py#12 ████████░░ 82%
   def preprocess_text(text: str) -> str:
       text = text.strip().lower()...

2. [trueno] trueno/CLAUDE.md#156 ██████░░░░ 65%
   For text preprocessing, trueno provides...

3. [hf-ground-truth-corpus] hf-ground-truth-corpus/README.md#42 █████░░░░░ 58%
   from hf_gtc.preprocessing.tokenization import preprocess_text...

$ batuta oracle --rag "sentiment analysis pipeline"

# Returns Python pipeline patterns + Rust inference equivalents

RAG Cache Statistics

Show index statistics without a full load (reads manifest only):

$ batuta oracle --rag-stats

📊 RAG Index Statistics
──────────────────────────────────────────────────
Version: 1.0.0
Batuta version: 0.6.2
Indexed at: 2025-01-30 14:23:45 UTC
Cache path: /home/user/.cache/batuta/rag

Sources:
  - trueno: 4 docs, 42 chunks (commit: abc123)
  - aprender: 3 docs, 38 chunks (commit: def456)
  - hf-ground-truth-corpus: 12 docs, 100 chunks

RAG Profiling

Enable profiling to see detailed timing breakdowns for RAG queries:

$ batuta oracle --rag "tokenization" --rag-profile

🔍 RAG Oracle Query: "tokenization"

📄 Retrieved Documents (RRF-fused):
  1. trueno/CLAUDE.md (score: 0.82)
     "Tokenization support for text processing..."

📊 RAG Profiling Results
──────────────────────────────────────────────────
  bm25_search:    4.21ms (count: 1)
  tfidf_search:   2.18ms (count: 1)
  rrf_fusion:     0.45ms (count: 1)
──────────────────────────────────────────────────
  Total query time: 6.84ms
  Cache hit rate: 75.0%

Combine with --rag-trace for even more detailed execution traces:

$ batuta oracle --rag "tokenization" --rag-profile --rag-trace

# Includes detailed per-operation tracing

Syntax Highlighting

Oracle output features rich 24-bit true color syntax highlighting powered by syntect. Code examples in --format text (default) and cookbook recipes are automatically highlighted with the base16-ocean.dark theme:

Color Scheme:

Token Type    Color              Example
Keywords      Pink (#b48ead)     fn, let, use, impl
Comments      Gray (#65737e)     // comment
Strings       Green (#a3be8c)    "hello"
Numbers       Orange (#d08770)   42, 3.14
Functions     Teal (#96b5b4)     println!, map
Fn Names      Blue (#8fa1b3)     function definitions
Attributes    Red (#bf616a)      #[derive], #[test]

Example Output:

$ batuta oracle --recipe ml-random-forest

>> Random Forest Training
──────────────────────────────────────────────────────────────
Code:
──────────────────────────────────────────────────────────────
use aprender::tree::RandomForest;     # 'use' in pink, path in white

let model = RandomForest::new()       # 'let' in pink, identifiers in white
    .n_estimators(100)                # method in teal, number in orange
    .max_depth(Some(10))
    .fit(&x, &y)?;
──────────────────────────────────────────────────────────────

Supported Languages:

  • Rust (primary)
  • Python (ground truth corpora)
  • Go, TypeScript, JavaScript
  • Markdown, TOML, JSON, Shell

The --format code option outputs raw code without highlighting for piping to other tools.

SVG Output Format

Generate Material Design 3 compliant SVG diagrams alongside code examples:

$ batuta oracle --recipe ml-random-forest --format code+svg

# Outputs both:
# 1. Rust code example with TDD test companion
# 2. SVG architecture diagram showing component relationships

$ batuta oracle --recipe training-lora --format code+svg > lora_recipe.rs
# The SVG is generated but only code is written to file

SVG diagrams use:

  • Material Design 3 color palette (#6750A4 primary, etc.)
  • 8px grid alignment for crisp rendering
  • Shape-heavy renderer for architectural diagrams (3+ components)
  • Text-heavy renderer for documentation diagrams (1-2 components)
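The renderer rule in the list above reduces to a threshold on the component count. The following is a minimal sketch of that rule, not batuta's implementation: the names `Renderer`, `pick_renderer`, and `snap_to_grid` are hypothetical, and treating "8px grid alignment" as rounding to the nearest multiple of 8 is an assumption.

```rust
/// Hypothetical renderer kinds, named after the two bullets above.
#[derive(Debug, PartialEq, Eq)]
pub enum Renderer {
    ShapeHeavy,
    TextHeavy,
}

/// Pick a renderer from the number of components in the diagram.
/// Documented rule: 3+ components -> shape-heavy, 1-2 -> text-heavy.
/// A count of 0 is not covered by the docs; this sketch treats it as text-heavy.
pub fn pick_renderer(component_count: usize) -> Renderer {
    if component_count >= 3 {
        Renderer::ShapeHeavy
    } else {
        Renderer::TextHeavy
    }
}

/// Snap a coordinate to the nearest multiple of 8 px (assumed meaning of
/// "8px grid alignment").
pub fn snap_to_grid(px: f64) -> f64 {
    (px / 8.0).round() * 8.0
}
```

Under these assumptions, a recipe with two components would take the text-heavy path and a coordinate of 13.0 would land on 16.0.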

arXiv Paper Enrichment

Enrich oracle results with relevant academic papers. The builtin curated database provides instant offline results from approximately 120 entries. The live API fetches directly from arXiv for the most current papers.

# Enrich any query with curated arXiv papers
$ batuta oracle "whisper speech recognition" --arxiv

# Show more papers
$ batuta oracle "transformer attention" --arxiv --arxiv-max 5

# Live fetch from arXiv API (requires network)
$ batuta oracle "LoRA fine-tuning" --arxiv-live

# JSON output includes papers array
$ batuta oracle "inference optimization" --arxiv --format json

# Markdown output with linked titles
$ batuta oracle "deep learning" --arxiv --format markdown

Search terms are automatically derived from the query analysis (components, domains, algorithms, and keywords). The --arxiv flag is silently skipped when using --format code to keep output pipe-safe.

Force Rebuild Index

Rebuild from scratch, ignoring fingerprint-based skip. The old cache is retained until the new index is saved (crash-safe two-phase write):

$ batuta oracle --rag-index-force

Force rebuild requested (old cache retained until save)...
📚 RAG Indexer (Heijunka Mode)
──────────────────────────────────────────────────

Scanning Rust stack repositories...
  ✓ trueno/CLAUDE.md          ████████████░░░ (12 chunks)
  ...

Complete: 28 documents, 186 chunks indexed
Index saved to /home/user/.cache/batuta/rag
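
A crash-safe two-phase write usually means writing the new data to a temporary file and then renaming it over the old one. The sketch below illustrates that general technique with the standard library only. The function name `save_two_phase` and the `.tmp` naming are hypothetical, and batuta's actual cache layout may differ.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Write `bytes` to `final_path` in two phases:
/// 1. write and sync everything to a sibling temporary file;
/// 2. rename the temporary file over the final path.
/// If the process dies during phase 1, the old file is left untouched.
pub fn save_two_phase(final_path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Hypothetical temp-file naming: same path with a `.tmp` extension.
    let tmp_path = final_path.with_extension("tmp");

    {
        let mut tmp = fs::File::create(&tmp_path)?;
        tmp.write_all(bytes)?;
        // Make sure the data reaches disk before the swap.
        tmp.sync_all()?;
    }

    // On the same filesystem, rename replaces the destination in one step.
    fs::rename(&tmp_path, final_path)
}
```

The key design point is that readers only ever see the complete old index or the complete new one, never a half-written file.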

Private RAG Configuration

Index private repositories that should never be committed to version control. Create a .batuta-private.toml file at the project root (git-ignored by default):

[private]
rust_stack_dirs = ["../rmedia", "../infra", "../assetgen"]
rust_corpus_dirs = ["../resolve-pipeline"]
python_corpus_dirs = ["../coursera-stats", "../interactive.paiml.com"]

# Index with private repos merged
$ batuta oracle --rag-index

RAG Indexer (Heijunka Mode)
──────────────────────────────────────────────────

Private: 6 private directories merged from .batuta-private.toml

  [   index] Indexing Rust stack...
  ...
  ✓ rmedia/CLAUDE.md    ████████████░░░ (12 chunks)
  ✓ rmedia/README.md    ██████████░░░░░ (8 chunks)
  ✓ infra/CLAUDE.md     ████████░░░░░░░ (6 chunks)
  ...

# Query private content
$ batuta oracle --rag "video editor"
1. [rmedia] rmedia/README.md#1  ██████████ 100%
   Pure Rust headless video editor...

Edge cases: a missing file is silently ignored, malformed TOML prints a warning, and an empty [private] table is a no-op.

RAG Dashboard

Launch the TUI dashboard to monitor RAG index health:

$ batuta oracle --rag-dashboard

┌─────────────────────────────────────────────────────────────┐
│                  RAG Oracle Dashboard                       │
├─────────────────────────────────────────────────────────────┤
│ Index Status: HEALTHY          Last Updated: 2 hours ago    │
├─────────────────────────────────────────────────────────────┤
│ Documents by Priority:                                      │
│   P0 (Critical): ████████████████████ 12 CLAUDE.md          │
│   P1 (High):     ████████████          8 README.md          │
│   P2 (Medium):   ██████                4 docs/              │
│   P3 (Low):      ████                  2 examples/          │
├─────────────────────────────────────────────────────────────┤
│ Retrieval Quality (last 24h):                               │
│   MRR:        0.847  ████████████████░░░░                   │
│   Recall@5:   0.923  ██████████████████░░                   │
│   NDCG@10:    0.891  █████████████████░░░                   │
├─────────────────────────────────────────────────────────────┤
│ Reindex Queue (Heijunka):                                   │
│   - entrenar/CLAUDE.md (staleness: 0.72)                    │
│   - realizar/CLAUDE.md (staleness: 0.45)                    │
└─────────────────────────────────────────────────────────────┘

Local Workspace Discovery

Discover PAIML projects in ~/src with development state awareness:

$ batuta oracle --local

🏠 Local Workspace Status (PAIML projects in ~/src)

📊 Summary:
  Total projects: 42
  ✅ Clean:       28
  🔧 Dirty:       10
  📤 Unpushed:    4

┌──────────────────┬──────────┬───────────┬─────────────┬─────────────────┐
│ Project          │ Local    │ Crates.io │ State       │ Git Status      │
├──────────────────┼──────────┼───────────┼─────────────┼─────────────────┤
│ trueno           │ 0.11.0   │ 0.11.0    │ ✅ Clean    │                 │
│ aprender         │ 0.24.0   │ 0.24.0    │ ✅ Clean    │                 │
│ depyler          │ 3.21.0   │ 3.20.0    │ 🔧 Dirty    │ 15 mod, 3 new   │
│ entrenar         │ 0.5.0    │ 0.5.0     │ 📤 Unpushed │ 2 ahead         │
│ batuta           │ 0.5.0    │ 0.5.0     │ ✅ Clean    │                 │
└──────────────────┴──────────┴───────────┴─────────────┴─────────────────┘

💡 Dirty projects use crates.io version for deps (stable)

Development State Legend

State       Icon   Meaning
Clean       ✅     No uncommitted changes, safe to use local version
Dirty       🔧     Active development, use crates.io version for deps
Unpushed    📤     Clean but has unpushed commits

Key Insight: Dirty projects don't block the stack! The crates.io version is stable and should be used for dependencies while local development continues.
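
The legend can be read as a small decision rule. The sketch below encodes it with hypothetical names (`DevState`, `DepSource`, `classify`, `dep_source`). Two points are assumptions rather than documented behavior: a project that is both dirty and ahead of its remote is classified as Dirty, and Unpushed projects resolve to the local version because their working tree is clean.

```rust
/// Development states from the legend above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DevState {
    Clean,
    Dirty,
    Unpushed,
}

/// Where a dependent crate resolves this project from (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum DepSource {
    LocalPath,
    CratesIo,
}

/// Classify a project from its git status counts.
/// Assumption: uncommitted changes take priority over unpushed commits.
pub fn classify(uncommitted_files: usize, commits_ahead: usize) -> DevState {
    if uncommitted_files > 0 {
        DevState::Dirty
    } else if commits_ahead > 0 {
        DevState::Unpushed
    } else {
        DevState::Clean
    }
}

/// Choose the dependency source for a project in the given state.
pub fn dep_source(state: DevState) -> DepSource {
    match state {
        DevState::Clean => DepSource::LocalPath,
        DevState::Dirty => DepSource::CratesIo,
        // Assumption: the legend does not say; a clean tree is treated as usable.
        DevState::Unpushed => DepSource::LocalPath,
    }
}
```

With the sample table above, depyler (15 modified files) would classify as Dirty and resolve from crates.io, while entrenar (2 commits ahead) would classify as Unpushed.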

Show Only Dirty Projects

Filter to show only projects with uncommitted changes:

$ batuta oracle --dirty

🔧 Dirty Projects (active development)

┌──────────────────┬──────────┬───────────┬──────────────────────────┐
│ Project          │ Local    │ Crates.io │ Changes                  │
├──────────────────┼──────────┼───────────┼──────────────────────────┤
│ depyler          │ 3.21.0   │ 3.20.0    │ 15 modified, 3 untracked │
│ renacer          │ 0.10.0   │ 0.9.0     │ 8 modified               │
│ pmat             │ 0.20.0   │ 0.19.0    │ 22 modified, 5 untracked │
└──────────────────┴──────────┴───────────┴──────────────────────────┘

💡 These projects are safe to skip - crates.io versions are stable.
   Focus on --publish-order for clean projects ready to release.

Publish Order

Show the safe publish order respecting inter-project dependencies:

$ batuta oracle --publish-order

📦 Suggested Publish Order (topological sort)

Step 1: trueno-graph (0.1.9 → 0.1.10)
  ✅ Ready - no blockers
  Dependencies: (none)

Step 2: aprender (0.23.0 → 0.24.0)
  ✅ Ready - no blockers
  Dependencies: trueno

Step 3: entrenar (0.4.0 → 0.5.0)
  ✅ Ready - no blockers
  Dependencies: aprender

Step 4: depyler (3.20.0 → 3.21.0)
  ⚠️  Blocked: 15 uncommitted changes
  Dependencies: aprender, entrenar

Step 5: batuta (0.4.9 → 0.5.0)
  ⚠️  Blocked: waiting for depyler
  Dependencies: all stack components

────────────────────────────────────────
📊 Summary:
  Ready to publish: 3 projects
  Blocked: 2 projects

💡 Run 'cargo publish' in order shown above.
   Skip blocked projects - they'll use crates.io stable versions.
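
The output above labels the ordering a topological sort. As an illustration of that technique, here is a minimal sketch using Kahn's algorithm over a dependency map. The function name `publish_order` and the input shape are hypothetical, and the sketch leaves out the "blocked" checks (uncommitted changes, waiting on another crate) shown in the output.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Return crates in an order where every crate comes after its dependencies.
/// `deps` maps a crate name to the crates it depends on.
/// Returns `None` if the dependency graph contains a cycle.
pub fn publish_order<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Option<Vec<String>> {
    // Collect every crate mentioned, either as a key or as a dependency.
    let mut nodes: BTreeSet<&str> = deps.keys().copied().collect();
    for ds in deps.values() {
        nodes.extend(ds.iter().copied());
    }

    // indegree = number of not-yet-published dependencies per crate;
    // dependents = reverse edges (dependency -> crates that need it).
    let mut indegree: BTreeMap<&str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for (krate, ds) in deps {
        for d in ds {
            *indegree.get_mut(krate).unwrap() += 1;
            dependents.entry(*d).or_default().push(*krate);
        }
    }

    // Start with crates that have no dependencies (sorted by name).
    let mut ready: VecDeque<&str> = indegree
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(k, _)| *k)
        .collect();

    let mut order = Vec::new();
    while let Some(k) = ready.pop_front() {
        order.push(k.to_string());
        for dep in dependents.get(k).into_iter().flatten() {
            let n = indegree.get_mut(dep).unwrap();
            *n -= 1;
            if *n == 0 {
                ready.push_back(*dep);
            }
        }
    }

    // If some crate was never released from the queue, there is a cycle.
    if order.len() == nodes.len() {
        Some(order)
    } else {
        None
    }
}
```

Fed the aprender, entrenar, and depyler dependencies from the sample output, this sketch yields trueno, aprender, entrenar, depyler.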

Auto-Update System

The RAG index stays fresh automatically through three layers:

Layer 1: Shell Auto-Fresh (ora-fresh)

# Runs automatically on shell login (non-blocking background check)
# Manual invocation:
$ ora-fresh
✅ Index is fresh (3h old)

# When a stack repo has been committed since last index:
$ ora-fresh
📚 Stack changed since last index, refreshing...

Layer 2: Post-Commit Hooks

All 26 stack repos have a post-commit hook that touches a stale marker:

# Installed in .git/hooks/post-commit across all stack repos
touch "$HOME/.cache/batuta/rag/.stale" 2>/dev/null

Layer 3: Fingerprint-Based Change Detection

On reindex, BLAKE3 content fingerprints skip work when nothing changed:

# Second run detects no changes via fingerprints
$ batuta oracle --rag-index
✅ Index is current (no files changed since last index)

# Force reindex ignores fingerprints (old cache retained until save)
$ batuta oracle --rag-index-force
Force rebuild requested (old cache retained until save)...
📚 RAG Indexer (Heijunka Mode)
...
Complete: 5016 documents, 264369 chunks indexed

Each DocumentFingerprint tracks:

  • Content hash (BLAKE3 of file contents)
  • Chunker config hash (detect parameter changes)
  • Model hash (detect embedding model changes)
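
The skip decision implied by these three fields can be sketched as a simple equality check. This is an illustration only: the field and method names are hypothetical, and the sketch substitutes the standard library's `DefaultHasher` for BLAKE3 to stay dependency-free (it is not cryptographic and its output is not stable across Rust releases).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for BLAKE3 so the sketch needs no external crate.
/// Not cryptographic; do not persist these values across Rust versions.
fn digest(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// The three tracked hashes listed above; field names are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DocumentFingerprint {
    pub content_hash: u64,
    pub chunker_config_hash: u64,
    pub model_hash: u64,
}

impl DocumentFingerprint {
    /// Build a fingerprint from file contents, a serialized chunker config,
    /// and an identifier for the embedding model.
    pub fn new(content: &[u8], chunker_config: &str, model_id: &str) -> Self {
        Self {
            content_hash: digest(content),
            chunker_config_hash: digest(chunker_config.as_bytes()),
            model_hash: digest(model_id.as_bytes()),
        }
    }

    /// A document needs reindexing when no fingerprint is stored for it
    /// or when any of the three hashes differs from the stored one.
    pub fn needs_reindex(&self, stored: Option<&DocumentFingerprint>) -> bool {
        stored != Some(self)
    }
}
```

Tracking the chunker and model hashes alongside the content hash means a configuration change invalidates a document even when its text is unchanged.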

PMAT Query

Search for functions by semantic query with quality annotations (TDG grade, complexity, Big-O):

$ batuta oracle --pmat-query "error handling"

PMAT Query Mode
──────────────────────────────────────────────────

PMAT Query: error handling
──────────────────────────────────────────────────

1. [A] src/pipeline.rs:142  validate_stage          █████████░ 92.5
   fn validate_stage(&self, stage: &Stage) -> Result<()>
   Complexity: 4 | Big-O: O(n) | SATD: 0

2. [B] src/backend.rs:88    select_backend          ████████░░ 78.3
   fn select_backend(&self, workload: &Workload) -> Backend
   Complexity: 8 | Big-O: O(n log n) | SATD: 1

PMAT Query with Filters

Filter results by quality grade or complexity:

# Only grade A functions
$ batuta oracle --pmat-query "serialize" --pmat-min-grade A

# Low complexity functions only
$ batuta oracle --pmat-query "cache" --pmat-max-complexity 5

# Include source code in output
$ batuta oracle --pmat-query "allocator" --pmat-include-source --pmat-limit 3

# JSON output for tooling
$ batuta oracle --pmat-query "error handling" --format json
{
  "query": "error handling",
  "source": "pmat",
  "result_count": 10,
  "results": [...]
}

# Markdown table
$ batuta oracle --pmat-query "serialize" --format markdown
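
The grade, complexity, and limit filters compose as successive predicates over the result list. The sketch below shows one way to express that. The types `Grade` and `FunctionResult` and the function `apply_filters` are hypothetical, and treating `--pmat-max-complexity` as an inclusive bound is an assumption.

```rust
/// TDG grades, declared worst to best so the derived ordering gives F < D < C < B < A.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Grade {
    F,
    D,
    C,
    B,
    A,
}

/// Hypothetical shape of one function-level result.
#[derive(Debug)]
pub struct FunctionResult {
    pub name: &'static str,
    pub grade: Grade,
    pub complexity: u32,
}

/// Keep results at or above `min_grade` and at or below `max_complexity`
/// (inclusive, by assumption), then truncate to `limit` entries.
pub fn apply_filters(
    results: &[FunctionResult],
    min_grade: Option<Grade>,
    max_complexity: Option<u32>,
    limit: usize,
) -> Vec<&FunctionResult> {
    results
        .iter()
        .filter(|r| min_grade.map_or(true, |g| r.grade >= g))
        .filter(|r| max_complexity.map_or(true, |c| r.complexity <= c))
        .take(limit)
        .collect()
}
```

Applied to the two functions from the earlier sample output, a minimum grade of A or a maximum complexity of 5 would each keep only validate_stage.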

Combined PMAT + RAG Search (RRF-Fused)

Combine function-level code search with document-level RAG retrieval. Results are fused into a single ranked list using Reciprocal Rank Fusion (RRF, k=60):

$ batuta oracle --pmat-query "error handling" --rag

Combined PMAT + RAG (RRF-fused)
──────────────────────────────────────────────────

1. [fn] [A] src/pipeline.rs:142  validate_stage          █████████░ 92.5
   Complexity: 4 | Big-O: O(n) | SATD: 0

2. [doc] [aprender] error-handling.md  ████████░░ 85%
   Best practices for robust error handling...

3. [fn] [B] src/backend.rs:88   select_backend          ████████░░ 78.3
   Complexity: 8 | Big-O: O(n log n) | SATD: 1

Summary: 2A 1B | Avg complexity: 4.5 | Total SATD: 0 | Complexity: 1-8
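
Reciprocal Rank Fusion scores each item as the sum of 1 / (k + rank) over every list in which it appears. The sketch below implements that formula with k passed in by the caller (60 in the text above). The function name `rrf_fuse` is hypothetical, and 1-based ranks are assumed, following the usual definition of RRF.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists with Reciprocal Rank Fusion.
/// score(d) = sum over lists of 1 / (k + rank of d), with ranks starting at 1.
/// Returns (id, score) pairs sorted by descending score.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (i as f64 + 1.0));
        }
    }

    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(id, s)| (id.to_string(), s))
        .collect();
    // Highest score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then_with(|| a.0.cmp(&b.0)));
    fused
}
```

An item that appears in both the function-level list and the document-level list accumulates two terms, so it outranks items that appear in only one list at a similar position.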

Search across all local PAIML projects in ~/src:

$ batuta oracle --pmat-query "tokenizer" --pmat-all-local

1. [A] [whisper-apr] src/tokenizer/bpe.rs:42  encode          ░░░░░░░░░░ 0.3
   Complexity: 3 | Big-O: O(n) | SATD: 0

2. [A] [aprender] src/text/vectorize/mod.rs:918  with_tokenizer  ░░░░░░░░░░ 0.1
   Complexity: 1 | Big-O: O(1) | SATD: 0

Summary: 10A | Avg complexity: 1.4 | Total SATD: 0 | Complexity: 1-4

Git History Search (-G / --git-history)

RRF-fused git history combines code search with commit history analysis. The output includes six sections:

$ pmat query "error handling" -G --churn --limit 3

1. Code Results: Functions ranked by relevance with TDG grades, complexity, and churn:

src/parf.rs:279-341 │ detect_patterns │ TDG: B │ O(n^3)
   C:11 │ L:67 │ ↓7 │ 10c │ 🔄10% │ ⚠1 │ 🐛4:CLONE

2. Git History (RRF-fused): Commits matching the query with colored tags and TDG-annotated files:

  1. 6a99f95 [fix] fix(safety): replace critical unwrap() calls  (0.724)
     Noah Gift 2026-01-30
     src/cli/stack.rs [B](3 fixes) faults:24, src/experiment/tree.rs [A] faults:8

  2. 8748f08 [fix] fix(examples): Replace unwrap() with proper error handling (0.672)
     Noah Gift 2025-12-07
     examples/mcp_demo.rs [B] faults:2, examples/stack_diagnostics_demo.rs [A] faults:2

Commit tags are color-coded: [feat] green, [fix] red, [test] yellow. Each file is annotated with its TDG grade and fault count.

3. Hotspots: Top changed files across all commits with fix counts and author ownership:

  Cargo.toml                  61 commits (14.2%)  4 fixes  Noah Gift:97%
  src/main.rs                 60 commits (13.9%)  5 fixes  risk:3.9  Noah Gift:90%
  src/cli/oracle.rs           37 commits ( 8.6%)  5 fixes  Noah Gift:100%

Files with high fix counts and low ownership percentage indicate risk areas.

4. Defect Introduction: Feature commits that needed fixes within 30 days:

  5a3798f Cargo.lock, Cargo.toml                    9 fixes within 30d
  6763cf2 src/cli/oracle.rs, src/main.rs             8 fixes within 30d

This section identifies commits that introduced instability, which helps show which features were under-tested.

5. Churn Velocity: Commits per week over a 16-week window:

  Cargo.toml                  3.9/wk    (bright red = unstable)
  src/main.rs                 3.9/wk
  src/cli/oracle.rs           2.4/wk    (yellow = moderate)
  README.md                   1.9/wk    (dimmed = stable)

6. Co-Change Coupling: Files that frequently change together (Jaccard similarity):

  Cargo.lock <-> Cargo.toml     (50 co-changes, J=0.72)   (bright red)
  Cargo.toml <-> src/main.rs    (17 co-changes, J=0.16)
  src/lib.rs <-> src/main.rs    (13 co-changes, J=0.18)

High Jaccard similarity (J > 0.5) indicates tightly coupled files that should be reviewed together.
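
Jaccard similarity is the size of the intersection divided by the size of the union of the two files' commit sets. The sketch below computes it both from explicit sets and from counts; the function names are hypothetical. As a plausibility check, and assuming the hotspot commit counts and the co-change counts cover the same history, Cargo.toml (61 commits) and src/main.rs (60 commits) with 17 co-changes give 17 / (61 + 60 - 17), which rounds to the J=0.16 shown above.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Jaccard similarity of two sets: |A ∩ B| / |A ∪ B|.
/// Returns 0.0 when both sets are empty.
pub fn jaccard<T: Eq + Hash>(a: &HashSet<T>, b: &HashSet<T>) -> f64 {
    let inter = a.intersection(b).count();
    let union_len = a.len() + b.len() - inter;
    if union_len == 0 {
        0.0
    } else {
        inter as f64 / union_len as f64
    }
}

/// The same quantity from counts: commits touching both files, and
/// commits touching each file individually.
pub fn jaccard_from_counts(co_changes: usize, commits_a: usize, commits_b: usize) -> f64 {
    let union_len = (commits_a + commits_b).saturating_sub(co_changes);
    if union_len == 0 {
        0.0
    } else {
        co_changes as f64 / union_len as f64
    }
}
```

Because the union grows with every commit that touches only one of the two files, a pair can have many co-changes and still score low when both files are edited often on their own.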

Enrichment Flags

Enrichment flags add git and AST-derived signals to code search results:

# Git volatility: 90-day commit count, churn score
$ pmat query "error handling" --churn

# Code clone detection: MinHash+LSH similarity
$ pmat query "error handling" --duplicates

# Pattern diversity: repetitive vs unique code
$ pmat query "error handling" --entropy

# Fault annotations: unwrap, panic, unsafe, expect
$ pmat query "error handling" --faults

# Full audit: all enrichment flags + git history
$ pmat query "error handling" --churn --duplicates --entropy --faults -G

Flag                  Description                                          Source
-G / --git-history    Git history RRF fusion (commits + code)              git log
--churn               Git volatility (90-day commit count, churn score)    git log
--duplicates          Code clone detection (MinHash + LSH)                 AST
--entropy             Pattern diversity (repetitive vs unique)             AST
--faults              Fault annotations (unwrap, panic, unsafe, expect)    AST

Quality Distribution Summary

All output modes include an aggregate quality summary showing grade distribution, mean complexity, total SATD, and complexity range:

Summary: 3A 2B 1C | Avg complexity: 5.2 | Total SATD: 2 | Complexity: 1-12
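
The summary line is an aggregate over per-function results. The sketch below shows how such a line could be assembled. The `Scored` struct, the `summarize` function, and the empty-input message are hypothetical; the layout follows the sample line above.

```rust
/// Hypothetical per-function quality record.
pub struct Scored {
    pub grade: char, // one of 'A', 'B', 'C', 'D', 'F'
    pub complexity: u32,
    pub satd: u32,
}

/// Build a summary line in the same layout as the sample output:
/// grade counts | mean complexity | total SATD | complexity range.
pub fn summarize(results: &[Scored]) -> String {
    if results.is_empty() {
        // Hypothetical wording; the docs do not show the empty case.
        return "Summary: no results".to_string();
    }

    // Grade distribution, best grade first, skipping grades with no hits.
    let mut parts = Vec::new();
    for g in "ABCDF".chars() {
        let n = results.iter().filter(|r| r.grade == g).count();
        if n > 0 {
            parts.push(format!("{}{}", n, g));
        }
    }

    let total: u32 = results.iter().map(|r| r.complexity).sum();
    let avg = total as f64 / results.len() as f64;
    let min = results.iter().map(|r| r.complexity).min().unwrap();
    let max = results.iter().map(|r| r.complexity).max().unwrap();
    let satd: u32 = results.iter().map(|r| r.satd).sum();

    format!(
        "Summary: {} | Avg complexity: {:.1} | Total SATD: {} | Complexity: {}-{}",
        parts.join(" "),
        avg,
        satd,
        min,
        max
    )
}
```

Six made-up results (three A, two B, one C, with complexities 1, 3, 4, 5, 6, 12 and two SATD markers in total) reproduce the sample line, since 31 / 6 rounds to 5.2.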

Running the Demo

An interactive demo showcases PMAT query parsing, quality filtering, output formats, hybrid search, and v2.0 enhancements:

cargo run --example pmat_query_demo --features native

The demo walks through:

  1. Parsing PMAT JSON output: Deserializing function-level results with TDG grades
  2. Quality filtering: Grade, complexity, and SATD filters
  3. Output formats: JSON envelope, markdown table
  4. Hybrid search: RRF-fused ranking (k=60) combining [fn] + [doc] results
  5. Quality signals: TDG score, complexity, Big-O, SATD explained
  6. v2.0 enhancements: Cross-project search, caching, quality summary, backlinks
  7. Git history search: -G flag with RRF-fused commit results, colored tags, TDG-annotated files
  8. Hotspots: Top changed files with fix counts and author ownership
  9. Defect introduction: Feature commits patched within 30 days
  10. Churn velocity: Commits/week with color-coded stability indicators
  11. Co-change coupling: Files that frequently change together (Jaccard similarity)
  12. Enrichment flags: --churn, --duplicates, --entropy, --faults reference

Exit Codes

Code   Description
0      Success
1      General error / no code available (--format code on non-code context)
2      Invalid arguments
