batuta agent

Sovereign agent runtime using the perceive-reason-act pattern.

Synopsis

batuta agent run --manifest <MANIFEST> --prompt <PROMPT> [--max-iterations <N>] [--daemon]
batuta agent chat --manifest <MANIFEST>
batuta agent validate --manifest <MANIFEST>
batuta agent status --manifest <MANIFEST>
batuta agent sign --manifest <MANIFEST> [--signer <ID>] [--output <PATH>]
batuta agent verify-sig --manifest <MANIFEST> --pubkey <PATH> [--signature <PATH>]
batuta agent contracts

Subcommands

run

Execute a single agent invocation with the given prompt.

batuta agent run --manifest agent.toml --prompt "Summarize the codebase"

Options:

Flag                  Description
--manifest <PATH>     Path to agent manifest TOML file
--prompt <TEXT>       Prompt to send to the agent
--max-iterations <N>  Override max iterations from manifest
--daemon              Run as a long-lived service (for forjar deployments)

chat

Start an interactive chat session with the agent. Type quit or exit to end.

batuta agent chat --manifest agent.toml

The chat loop runs run_agent_loop() for each user message, maintaining persistent memory across turns (recalled via BM25 when using TruenoMemory).

validate

Validate an agent manifest without running it.

batuta agent validate --manifest agent.toml

status

Display agent manifest summary, resource quotas, model config, and capabilities.

batuta agent status --manifest agent.toml

Reports validation errors (if any), manifest metadata, resource limits (max iterations, tool calls, cost budget), model configuration, and the list of granted capabilities.

sign

Cryptographically sign an agent manifest using Ed25519 via pacha+BLAKE3.

batuta agent sign --manifest agent.toml --signer "admin@paiml.com"
batuta agent sign --manifest agent.toml --output agent.toml.sig

The manifest is normalized to canonical TOML before hashing to ensure deterministic signatures regardless of whitespace or key ordering.

verify-sig

Verify an Ed25519 signature on an agent manifest.

batuta agent verify-sig --manifest agent.toml --pubkey key.pub
batuta agent verify-sig --manifest agent.toml --pubkey key.pub --signature agent.toml.sig

contracts

Display the design-by-contract invariants from contracts/agent-loop-v1.yaml.

batuta agent contracts

Shows all invariants (INV-001 through INV-007), their test bindings, and verification targets (coverage, mutation, complexity thresholds).

Agent Manifest

The agent manifest is a TOML file that configures the runtime:

name = "code-reviewer"
version = "0.1.0"
description = "Reviews code for quality issues"
# Assumed to be top-level fields: TOML assigns any key that follows a
# [table] header to that table, so these are declared before [model].
capabilities = ["Rag", "Memory"]
privacy = "Sovereign"

[model]
model_path = "/models/llama3-8b.gguf"
max_tokens = 4096
temperature = 0.3
system_prompt = "You are a code review assistant."

[resources]
max_iterations = 20
max_tool_calls = 50
max_cost_usd = 0.0  # 0 = unlimited (sovereign)

Architecture

The agent uses a perceive-reason-act loop (Toyota Way: Jidoka):

┌─────────────────────────────────────┐
│         Perceive (Memory Recall)    │
│  Recall relevant memories, augment  │
│  system prompt with context         │
├─────────────────────────────────────┤
│    Context Management [F-003]       │
│  Pre-subtract system+tool tokens,   │
│  truncate messages via SlidingWindow│
├─────────────────────────────────────┤
│         Reason (LLM Completion)     │
│  Send truncated conversation to     │
│  LlmDriver with retry+backoff       │
├─────────────────────────────────────┤
│         Act (Tool Execution)        │
│  Execute tools with capability      │
│  checks (Poka-Yoke), store results  │
├─────────────────────────────────────┤
│         Guard (Jidoka)              │
│  Check iteration limits, ping-pong  │
│  detection, cost budget             │
└─────────────────────────────────────┘

Context Management

The agent integrates serve::context::ContextManager for token-aware truncation before each LLM call. This prevents context overflow errors and ensures long conversations degrade gracefully.

Budget calculation:

effective_window = driver.context_window()
                 - estimate_tokens(system_prompt)
                 - estimate_tokens(tool_definitions)
                 - output_reserve (max_tokens)

The system prompt and tool schemas are pre-subtracted from the window. Only conversation messages are passed to the SlidingWindow truncation strategy, which keeps the most recent messages when the budget is exceeded.

Truncation outcomes:

  • If messages fit: no truncation, zero overhead
  • If messages overflow: oldest messages dropped (SlidingWindow)
  • If overflow after truncation: AgentError::ContextOverflow
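The budget formula and the three outcomes above can be sketched in a few lines of standalone Rust. This is an illustration under stated assumptions, not the ContextManager implementation: estimate_tokens here is a crude four-characters-per-token guess, BudgetError is a hypothetical stand-in for AgentError::ContextOverflow, and overflow is assumed to mean that even the newest message does not fit.

```rust
/// Hypothetical stand-in for the `AgentError::ContextOverflow` case.
#[derive(Debug, PartialEq)]
enum BudgetError {
    ContextOverflow,
}

/// Crude estimate of about 4 characters per token (an assumption, not the real estimator).
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Tokens left for conversation messages after pre-subtracting the fixed costs.
fn effective_window(
    context_window: usize,
    system_prompt: &str,
    tool_definitions: &str,
    output_reserve: usize,
) -> usize {
    context_window
        .saturating_sub(estimate_tokens(system_prompt))
        .saturating_sub(estimate_tokens(tool_definitions))
        .saturating_sub(output_reserve)
}

/// Keep the most recent messages that fit in `budget`, dropping the oldest first.
fn sliding_window<'a>(messages: &[&'a str], budget: usize) -> Result<Vec<&'a str>, BudgetError> {
    let mut used = 0;
    let mut kept = Vec::new();
    // Walk from newest to oldest and stop at the first message that does not fit.
    for msg in messages.iter().rev() {
        let cost = estimate_tokens(msg);
        if used + cost > budget {
            break;
        }
        used += cost;
        kept.push(*msg);
    }
    // Assumed overflow condition: not even the newest message fits.
    if kept.is_empty() && !messages.is_empty() {
        return Err(BudgetError::ContextOverflow);
    }
    kept.reverse(); // restore chronological order
    Ok(kept)
}

fn main() {
    let budget = effective_window(100, "abcdefgh", "abcd", 50);
    println!("effective window: {budget} tokens");

    let history = ["aaaaaaaa", "bbbb", "cccc"];
    println!("{:?}", sliding_window(&history, 2));
    println!("{:?}", sliding_window(&history, 0));
}
```

Using saturating subtraction keeps the budget at zero rather than wrapping around when the system prompt and tool schemas alone exceed the window.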

Retry with Exponential Backoff

Driver calls use automatic retry for transient errors:

Error Type       Retryable  Backoff
RateLimited      Yes        1s, 2s, 4s
Overloaded       Yes        1s, 2s, 4s
Network          Yes        1s, 2s, 4s
ModelNotFound    No         Immediate fail
InferenceFailed  No         Immediate fail

Maximum 3 retry attempts with exponential backoff (base 1s).
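The retry policy in the table can be sketched as follows. The enum, with_retry, and backoff_delay are hypothetical names chosen for illustration, and the sketch assumes that "3 retry attempts" means one initial call followed by up to three retries. The sleep function is passed in so the example runs without actually waiting.

```rust
use std::time::Duration;

/// Hypothetical error enum mirroring the table above.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum DriverError {
    RateLimited,
    Overloaded,
    Network,
    ModelNotFound,
    InferenceFailed,
}

impl DriverError {
    /// Only transient errors are worth retrying.
    fn is_retryable(self) -> bool {
        matches!(self, Self::RateLimited | Self::Overloaded | Self::Network)
    }
}

/// Delay before retry number `attempt` (0-based): 1s, 2s, 4s, ...
fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_secs(1u64 << attempt.min(16))
}

/// Call `call` once, then retry up to `max_retries` times on retryable errors.
fn with_retry<T>(
    max_retries: u32,
    mut sleep: impl FnMut(Duration),
    mut call: impl FnMut() -> Result<T, DriverError>,
) -> Result<T, DriverError> {
    let mut attempt = 0;
    loop {
        match call() {
            Ok(value) => return Ok(value),
            Err(e) if e.is_retryable() && attempt < max_retries => {
                sleep(backoff_delay(attempt));
                attempt += 1;
            }
            // Non-retryable error, or retries exhausted: fail immediately.
            Err(e) => return Err(e),
        }
    }
}

fn main() {
    let mut delays = Vec::new();
    let mut calls = 0;
    let result = with_retry(
        3,
        |d| delays.push(d), // record the delay instead of sleeping
        || {
            calls += 1;
            if calls < 3 { Err(DriverError::RateLimited) } else { Ok("done") }
        },
    );
    println!("{result:?} after {calls} calls, waited {delays:?}");
}
```

In a real async driver the injected sleep would be the runtime's timer rather than a recording closure.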

Safety Features

  • LoopGuard: Prevents runaway loops (max iterations, tool call limits)
  • Ping-pong detection: FxHash-based detection of oscillatory tool calls
  • Capability filtering: Tools only accessible if manifest grants capability
  • Cost circuit breaker: Stops execution when cost budget exceeded
  • Context truncation: Automatic SlidingWindow truncation for long conversations
  • Consecutive MaxTokens: Circuit-breaks after 5 consecutive truncated responses
  • Privacy tier: Sovereign (local-only), Private, or Standard
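As an illustration of the ping-pong check, the sketch below fingerprints each tool call with a hash and flags a history whose tail alternates between two distinct calls. It is not the LoopGuard implementation: PingPongDetector and its threshold are hypothetical, and the standard library's DefaultHasher is swapped in for FxHash so the example needs no external crate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Fingerprint a tool call by hashing its name and input.
/// DefaultHasher stands in for FxHash here.
fn call_fingerprint(tool: &str, input: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    tool.hash(&mut hasher);
    input.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical detector: flags A, B, A, B, ... patterns at the end of the history.
struct PingPongDetector {
    history: Vec<u64>,
    min_repeats: usize,
}

impl PingPongDetector {
    /// `min_repeats` is how many times the A, B pair must repeat (at least 2).
    fn new(min_repeats: usize) -> Self {
        Self { history: Vec::new(), min_repeats: min_repeats.max(2) }
    }

    /// Record a call and return true if the agent appears to be oscillating.
    fn record(&mut self, tool: &str, input: &str) -> bool {
        self.history.push(call_fingerprint(tool, input));
        let need = self.min_repeats * 2;
        if self.history.len() < need {
            return false;
        }
        let tail = &self.history[self.history.len() - need..];
        let (a, b) = (tail[0], tail[1]);
        // Oscillation: two distinct calls repeated back to back.
        a != b && tail.chunks(2).all(|pair| pair[0] == a && pair[1] == b)
    }
}

fn main() {
    let mut detector = PingPongDetector::new(2);
    let calls = [("read", "a.rs"), ("write", "a.rs"), ("read", "a.rs"), ("write", "a.rs")];
    for (tool, input) in calls {
        let oscillating = detector.record(tool, input);
        println!("{tool}({input}) -> oscillating: {oscillating}");
    }
}
```

Comparing 64-bit fingerprints instead of full call payloads keeps the check cheap on every iteration, at the cost of a negligible collision risk.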

Daemon Mode

The --daemon flag runs the agent as a long-lived service process, suitable for forjar deployments:

batuta agent run \
  --manifest /etc/batuta/agent.toml \
  --prompt "Monitor system health" \
  --daemon

Daemon mode:

  • Runs the agent loop as a background service
  • Responds to SIGTERM/SIGINT for graceful shutdown
  • Designed for systemd integration via forjar provisioning

Examples

# Validate a manifest
batuta agent validate --manifest examples/agent.toml

# Run with a prompt
batuta agent run \
  --manifest examples/agent.toml \
  --prompt "What are the main modules in this project?"

# Override iteration limit
batuta agent run \
  --manifest examples/agent.toml \
  --prompt "Find all TODO comments" \
  --max-iterations 5

# Run as daemon (forjar)
batuta agent run \
  --manifest examples/agent.toml \
  --prompt "Monitor logs" \
  --daemon

Driver Backends

Driver          Privacy Tier  Feature    Description
RealizarDriver  Sovereign     inference  Local GGUF/APR inference via realizar
MockDriver      Sovereign     agents     Deterministic responses for testing
RemoteDriver    Standard      native     HTTP to Anthropic/OpenAI APIs
RoutingDriver   Configurable  native     Local-first with remote fallback

RoutingDriver

The RoutingDriver wraps a primary driver (typically local/sovereign) and a fallback driver (typically remote/cloud). It supports three strategies:

Strategy             Behavior
PrimaryWithFallback  Try primary; on retryable error, spill over to fallback
PrimaryOnly          Primary only, no fallback
FallbackOnly         Fallback only, skip primary

The RoutingDriver's privacy tier is the most permissive tier of its two drivers: if the fallback is Standard, data may leave the machine on spillover.
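The three strategies and the tier rule can be sketched with plain functions standing in for drivers. All names below (route, effective_tier, the two stub drivers, and the Retryable/Fatal split) are hypothetical, and the ordering Sovereign < Private < Standard from least to most permissive is an assumption based on the tier descriptions in this page.

```rust
/// Tiers ordered from least to most permissive (assumed ordering).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum PrivacyTier {
    Sovereign,
    Private,
    Standard,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Strategy {
    PrimaryWithFallback,
    PrimaryOnly,
    FallbackOnly,
}

/// Hypothetical error split: only retryable errors trigger spillover.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum DriverError {
    Retryable(String),
    Fatal(String),
}

/// A driver reduced to its completion function for this sketch.
type Complete<'a> = &'a dyn Fn(&str) -> Result<String, DriverError>;

fn route(
    strategy: Strategy,
    primary: Complete<'_>,
    fallback: Complete<'_>,
    prompt: &str,
) -> Result<String, DriverError> {
    match strategy {
        Strategy::PrimaryOnly => primary(prompt),
        Strategy::FallbackOnly => fallback(prompt),
        Strategy::PrimaryWithFallback => match primary(prompt) {
            // Spill over only when the primary failed with a retryable error.
            Err(DriverError::Retryable(_)) => fallback(prompt),
            other => other,
        },
    }
}

/// The combined tier is the most permissive of the two.
fn effective_tier(primary: PrivacyTier, fallback: PrivacyTier) -> PrivacyTier {
    primary.max(fallback)
}

/// Stub local driver that is always overloaded.
fn local_down(_prompt: &str) -> Result<String, DriverError> {
    Err(DriverError::Retryable("overloaded".to_string()))
}

/// Stub remote driver that always answers.
fn remote_ok(prompt: &str) -> Result<String, DriverError> {
    Ok(format!("remote: {prompt}"))
}

fn main() {
    let reply = route(Strategy::PrimaryWithFallback, &local_down, &remote_ok, "hi");
    println!("{reply:?}");
    println!("{:?}", effective_tier(PrivacyTier::Sovereign, PrivacyTier::Standard));
}
```

With PrimaryOnly the same failing primary would surface its error instead of spilling over, which is the configuration to choose when data must never leave the machine.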

RemoteDriver

Supports both Anthropic Messages API and OpenAI Chat Completions API:

Provider   Endpoint              Tool Format
Anthropic  /v1/messages          tool_use content blocks
OpenAI     /v1/chat/completions  function tool_calls

Error mapping: HTTP 429 → RateLimited, 529/503 → Overloaded, other → Network.
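The error mapping above amounts to a single match on the status code. The sketch below uses a hypothetical map_http_error function and error enum; treating 2xx responses as "no error" is an assumption added so the function is total.

```rust
/// Hypothetical error enum for the three categories named above.
#[derive(Debug, PartialEq)]
enum DriverError {
    RateLimited,
    Overloaded,
    Network(u16), // carries the unexpected status code
}

/// Map an HTTP status code to a driver error, or None for success.
fn map_http_error(status: u16) -> Option<DriverError> {
    match status {
        200..=299 => None, // assumed: success is not an error
        429 => Some(DriverError::RateLimited),
        503 | 529 => Some(DriverError::Overloaded),
        other => Some(DriverError::Network(other)),
    }
}

fn main() {
    for status in [200, 429, 503, 529, 500] {
        println!("{status} -> {:?}", map_http_error(status));
    }
}
```

Because RateLimited, Overloaded, and Network are all retryable, every non-success status in this mapping feeds the backoff policy rather than failing immediately.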

Builtin Tools

Tool         Capability  Feature         Description
MemoryTool   Memory      agents          Read/write agent persistent state
RagTool      Rag         rag             Search indexed documentation via BM25+vector
ShellTool    Shell       agents          Sandboxed subprocess execution with allowlisting
ComputeTool  Compute     agents          Parallel task execution via JoinSet
BrowserTool  Browser     agents-browser  Headless Chromium automation

ShellTool

Executes shell commands with capability-based allowlisting (Poka-Yoke):

  • Only allowlisted commands are executable
  • Working directory is restricted
  • Output truncated to 8192 bytes to prevent context overflow
  • Configurable timeout (default: 30 seconds)
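A minimal sketch of the allowlist, working-directory, and truncation rules, using only the standard library. The function names are hypothetical and this is not the ShellTool source: the command line is split naively on whitespace, the program is spawned directly rather than through a shell, and the timeout is omitted because std::process has no built-in one.

```rust
use std::path::Path;
use std::process::Command;

/// Output cap from the list above.
const MAX_OUTPUT_BYTES: usize = 8192;

/// A command is allowed only if its program name is on the allowlist.
fn is_allowed(allowlist: &[&str], command_line: &str) -> bool {
    command_line
        .split_whitespace()
        .next()
        .map_or(false, |program| allowlist.contains(&program))
}

/// Truncate to at most `max_bytes`, backing up to a UTF-8 character boundary.
fn truncate_output(output: &str, max_bytes: usize) -> &str {
    if output.len() <= max_bytes {
        return output;
    }
    let mut end = max_bytes;
    while !output.is_char_boundary(end) {
        end -= 1;
    }
    &output[..end]
}

/// Run an allowlisted command inside `workdir` and return its truncated stdout.
fn run_allowed(allowlist: &[&str], workdir: &Path, command_line: &str) -> Result<String, String> {
    if !is_allowed(allowlist, command_line) {
        return Err(format!("command not allowlisted: {command_line}"));
    }
    let mut parts = command_line.split_whitespace();
    let program = parts.next().ok_or("empty command")?;
    // Spawning the program directly (no shell) avoids metacharacter injection.
    let output = Command::new(program)
        .args(parts)
        .current_dir(workdir)
        .output()
        .map_err(|e| e.to_string())?;
    let stdout = String::from_utf8_lossy(&output.stdout);
    Ok(truncate_output(&stdout, MAX_OUTPUT_BYTES).to_string())
}

fn main() {
    let allowlist = ["echo", "ls"];
    println!("{:?}", run_allowed(&allowlist, Path::new("."), "echo hello"));
    println!("{:?}", run_allowed(&allowlist, Path::new("."), "rm -rf /tmp/x"));
}
```

Checking the allowlist before any process is spawned means a denied command never reaches the operating system at all.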

ComputeTool

Parallel task execution for compute-intensive workflows:

  • Single task execution (run action)
  • Parallel execution (parallel action) via tokio JoinSet
  • Max concurrent tasks configurable (default: 4)
  • Output truncated to 16KB per task
  • Configurable timeout (default: 5 minutes)

BrowserTool Actions

Action      Input                    Description
navigate    { "url": "..." }         Navigate to URL (Sovereign: localhost only)
screenshot  {}                       Take page screenshot (base64 PNG)
evaluate    { "expression": "..." }  Evaluate JavaScript
eval_wasm   { "expression": "..." }  Evaluate WASM expression
click       { "selector": "..." }    Click CSS selector
wait_wasm   {}                       Wait for WASM runtime readiness
console     {}                       Get console messages

Programmatic Usage

Basic Usage

use batuta::agent::manifest::AgentManifest;
use batuta::agent::driver::mock::MockDriver;
use batuta::agent::memory::InMemorySubstrate;
use batuta::agent::runtime::run_agent_loop;
use batuta::agent::tool::ToolRegistry;

// `.await?` needs an async function that returns a Result. A tokio runtime is
// assumed here, as is an error type that converts into Box<dyn Error>.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let manifest = AgentManifest::default();
    let driver = MockDriver::single_response("Hello!");
    let registry = ToolRegistry::default();
    let memory = InMemorySubstrate::new();

    let result = run_agent_loop(
        &manifest,
        "Say hello",
        &driver,
        &registry,
        &memory,
        None,  // Optional stream event channel
    ).await?;

    println!("Response: {}", result.text);
    Ok(())
}

Using AgentBuilder

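use batuta::agent::AgentBuilder;
use batuta::agent::manifest::AgentManifest;
use batuta::agent::driver::mock::MockDriver;

// `.await?` needs an async function that returns a Result. A tokio runtime is
// assumed here, as is an error type that converts into Box<dyn Error>.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let manifest = AgentManifest::default();
    let driver = MockDriver::single_response("Built!");

    let result = AgentBuilder::new(&manifest)
        .driver(&driver)
        .run("Hello builder")
        .await?;

    println!("{}", result.text);  // "Built!"
    Ok(())
}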

With Stream Events

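use tokio::sync::mpsc;
use batuta::agent::AgentBuilder;
use batuta::agent::driver::StreamEvent;
use batuta::agent::driver::mock::MockDriver;
use batuta::agent::manifest::AgentManifest;

// `.await?` needs an async function that returns a Result; an error type that
// converts into Box<dyn Error> is assumed.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Same manifest and mock driver setup as in the previous examples.
    let manifest = AgentManifest::default();
    let driver = MockDriver::single_response("Hello!");

    let (tx, mut rx) = mpsc::channel(64);

    let result = AgentBuilder::new(&manifest)
        .driver(&driver)
        .stream(tx)
        .run("Hello")
        .await?;

    // Drain the events that were buffered during the run.
    while let Ok(event) = rx.try_recv() {
        match event {
            StreamEvent::PhaseChange { phase } => {
                println!("Phase: {phase}");
            }
            StreamEvent::TextDelta { text } => {
                print!("{text}");
            }
            _ => {}
        }
    }

    println!("\nFinal: {}", result.text);
    Ok(())
}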

Quality Gates

The agent module passes all PMAT quality gates:

  • Zero SATD comments (QA-001)
  • All source files ≤500 lines (QA-002)
  • 95%+ line coverage (QA-003)
  • Zero cognitive complexity violations (QA-005)
  • 16/16 design-by-contract invariants verified
  • 27/27 integration demo scenarios passing

Run quality verification:

# Contract invariants
cargo run --example agent_contracts --features agents

# Full integration demos
cargo run --example agent_demo --features agents

See Also