Phase 1: Analysis

Phase 1 is the entry point of the Batuta transpilation pipeline. It scans the source project to build a complete understanding of what needs to be converted before any code transformation begins.

What Analysis Produces

The AnalysisStage walks the source directory and generates a ProjectAnalysis containing:

  • Language map – which files are Python, C, Shell, or mixed
  • Dependency graph – pip, Conda, npm, Makefile dependencies detected
  • TDG score – Technical Debt Grade from PMAT static analysis
  • ML framework usage – PyTorch, sklearn, NumPy import detection
  • Transpiler recommendation – which tool handles each language

Pipeline Integration

Analysis populates the PipelineContext that flows through all subsequent stages:

use std::collections::HashMap;
use std::path::PathBuf;

pub struct PipelineContext {
    pub input_path: PathBuf,
    pub output_path: PathBuf,
    pub primary_language: Option<Language>,
    pub file_mappings: Vec<(PathBuf, PathBuf)>,
    pub metadata: HashMap<String, serde_json::Value>,
    // ...
}

The primary_language field drives transpiler selection in Phase 2. The metadata map carries TDG scores, dependency counts, and ML framework details forward.
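As a rough illustration of how the metadata map could carry analysis results forward, the sketch below records a TDG score, a dependency count, and detected ML frameworks, then reads the score back the way a later stage might. It is a minimal sketch under stated assumptions: `MetaValue` is a stand-in for `serde_json::Value` so the example needs no external crates, and the key names (`tdg_score`, `dependency_count`, `ml_frameworks`) and both helper functions are hypothetical, not Batuta's actual API.

```rust
use std::collections::HashMap;

/// Stand-in for `serde_json::Value`, reduced to the cases used here
/// so the sketch compiles with the standard library alone.
#[derive(Debug, Clone, PartialEq)]
pub enum MetaValue {
    Number(f64),
    Text(String),
    List(Vec<MetaValue>),
}

/// Simplified version of the `metadata` field on `PipelineContext`.
pub type Metadata = HashMap<String, MetaValue>;

/// What Analysis might record. The key names are hypothetical.
pub fn record_analysis(
    meta: &mut Metadata,
    tdg: Option<f64>,
    dependency_count: usize,
    frameworks: &[&str],
) {
    // The TDG score is optional, so the key is only written when a score exists.
    if let Some(score) = tdg {
        meta.insert("tdg_score".to_string(), MetaValue::Number(score));
    }
    meta.insert(
        "dependency_count".to_string(),
        MetaValue::Number(dependency_count as f64),
    );
    meta.insert(
        "ml_frameworks".to_string(),
        MetaValue::List(
            frameworks
                .iter()
                .map(|f| MetaValue::Text(f.to_string()))
                .collect(),
        ),
    );
}

/// How a later stage might read the score back out of the map.
pub fn tdg_score(meta: &Metadata) -> Option<f64> {
    match meta.get("tdg_score") {
        Some(MetaValue::Number(n)) => Some(*n),
        _ => None,
    }
}
```

A string-keyed map like this lets a stage attach new facts without changing the struct definition. The cost is that later stages must handle missing or mistyped keys, which is why `tdg_score` returns an `Option`.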

CLI Usage

# Full analysis with all sub-phases
batuta analyze --languages --dependencies --tdg /path/to/project

# Language detection only
batuta analyze --languages /path/to/project

# JSON output for tooling integration
batuta analyze --languages --format json /path/to/project

Analysis Sub-Phases

| Sub-Phase | Input | Output |
|---|---|---|
| Language Detection | File extensions, shebangs | Vec<LanguageStats>, primary_language |
| Dependency Analysis | requirements.txt, Makefile, etc. | Vec<DependencyInfo> |
| TDG Scoring | Source code via PMAT | tdg_score: Option<f64> |
| ML Detection | Python import statements | Conversion recommendations |
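The language detection sub-phase can be pictured with a small sketch: classify each file by extension, fall back to the shebang line, and take the most frequent language as primary. This is an assumption-laden illustration, not Batuta's implementation. The `Lang` enum, both function names, and the extension and shebang rules are hypothetical, and a plain count map stands in for `Vec<LanguageStats>`.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical language tag; Batuta's real `Language` type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Lang {
    Python,
    C,
    Shell,
    Rust,
}

/// Classify one file by extension, falling back to its shebang line.
/// `first_line` is the first line of the file, if the caller has read it.
pub fn detect_language(path: &Path, first_line: Option<&str>) -> Option<Lang> {
    let by_extension = match path.extension().and_then(|e| e.to_str()) {
        Some("py") => Some(Lang::Python),
        Some("c") | Some("h") => Some(Lang::C),
        Some("sh") | Some("bash") => Some(Lang::Shell),
        Some("rs") => Some(Lang::Rust),
        _ => None,
    };
    by_extension.or_else(|| {
        let line = first_line?.trim_end();
        if !line.starts_with("#!") {
            return None;
        }
        // Simplified shebang rules: interpreter flags such as
        // `#!/bin/bash -e` are not handled in this sketch.
        if line.contains("python") {
            Some(Lang::Python)
        } else if line.ends_with("sh") {
            Some(Lang::Shell)
        } else {
            None
        }
    })
}

/// Count files per language and pick the most common one as primary.
/// Ties are broken arbitrarily because `HashMap` iteration order is unspecified.
pub fn primary_language(
    files: &[(&Path, Option<&str>)],
) -> (HashMap<Lang, usize>, Option<Lang>) {
    let mut counts: HashMap<Lang, usize> = HashMap::new();
    for &(path, first_line) in files {
        if let Some(lang) = detect_language(path, first_line) {
            *counts.entry(lang).or_insert(0) += 1;
        }
    }
    let primary = counts
        .iter()
        .max_by_key(|(_, n)| **n)
        .map(|(lang, _)| *lang);
    (counts, primary)
}
```

Files that match neither rule are left out of the counts, so a directory with no recognizable files yields `None` as its primary language.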

Jidoka Behavior

If the source directory does not exist or contains no recognizable files, the AnalysisStage returns an error. The pipeline’s ValidationStrategy::StopOnError setting halts execution immediately, preventing downstream stages from operating on invalid input.

Phase 1 fails --> Phase 2 never starts --> No broken output
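The stop-on-error flow above can be sketched as a loop that runs stages in order and returns at the first failure. This is a minimal sketch that assumes a simplified stage signature. The `Stage` alias, `RunReport`, the two stage functions, and `run_stop_on_error` are hypothetical names for illustration, and the analysis check only tests that the directory exists, not whether it contains recognizable files.

```rust
use std::path::Path;

/// Hypothetical stage signature: a stage succeeds or explains why it failed.
pub type Stage = fn(&Path) -> Result<(), String>;

/// Outcome of a run: which stages finished, and the error that stopped it, if any.
#[derive(Debug)]
pub struct RunReport {
    pub completed: Vec<String>,
    pub error: Option<String>,
}

/// Simplified stand-in for Phase 1: fail when the source directory is missing.
pub fn analysis_stage(input: &Path) -> Result<(), String> {
    if !input.is_dir() {
        return Err(format!(
            "source directory not found: {}",
            input.display()
        ));
    }
    Ok(())
}

/// Placeholder for Phase 2; it does no real work in this sketch.
pub fn transpile_stage(_input: &Path) -> Result<(), String> {
    Ok(())
}

/// Run stages in order and stop at the first error, so no later stage
/// ever sees input that an earlier stage rejected.
pub fn run_stop_on_error(input: &Path, stages: &[(&str, Stage)]) -> RunReport {
    let mut completed = Vec::new();
    for &(name, stage) in stages {
        if let Err(e) = stage(input) {
            return RunReport {
                completed,
                error: Some(format!("{} failed: {}", name, e)),
            };
        }
        completed.push(name.to_string());
    }
    RunReport {
        completed,
        error: None,
    }
}
```

With a nonexistent input path, the report lists no completed stages and carries the analysis error, which corresponds to "Phase 2 never starts" in the diagram.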

Transpiler Recommendation

Based on the detected primary language, Analysis recommends a transpiler:

| Primary Language | Recommended Transpiler |
|---|---|
| Python | Depyler (Python to Rust) |
| C / C++ | Decy (C/C++ to Rust) |
| Shell | Bashrs (Shell to Rust) |
| Rust | Already Rust (consider Ruchy) |
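The mapping in the table can be expressed as a single match on the detected primary language. The sketch below is illustrative only: the `Language` variants and the `recommend_transpiler` function are assumptions made for this example, and only the tool names come from the table.

```rust
/// Hypothetical variants; Batuta's real `Language` enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    Python,
    C,
    Cpp,
    Shell,
    Rust,
}

/// Return the recommended transpiler name, or `None` when no primary
/// language was detected or the project is already Rust.
pub fn recommend_transpiler(primary: Option<Language>) -> Option<&'static str> {
    match primary? {
        Language::Python => Some("Depyler"),
        Language::C | Language::Cpp => Some("Decy"),
        Language::Shell => Some("Bashrs"),
        // Already Rust: nothing to transpile (the table suggests considering Ruchy).
        Language::Rust => None,
    }
}
```

For example, `recommend_transpiler(Some(Language::Python))` returns `Some("Depyler")`, while `recommend_transpiler(None)` returns `None` because `primary_language` is optional and may not have been set.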

Sub-Phase Details

Each sub-phase is documented in its own section.

