APR Complete Specification
Version: 2.0.0-draft Status: Draft Created: 2025-12-16 GitHub Issue: https://github.com/paiml/aprender/issues/119
Table of Contents
- Abstract
- Design Principles
- APR v2 Format
- CLI Operations
- Auxiliary Data Patterns
- Format Comparison
- Error Handling
- Configuration
- Quality Gates
- Multi-Format Conversion Specification
- Master Falsification QA Checklist (100 Points)
- Automated Validation Script
- Import/Convert Pipeline
- Implementation Roadmap
- References
- Appendices
1. Abstract
APR (Aprender Portable Representation) is a WASM-first model serialization format for machine learning models. This specification covers:
- APR v2 Format: Binary format supporting web-scale models (10B+ parameters) with tensor alignment, LZ4 streaming compression, and multi-file sharding
- CLI Operations: Comprehensive tooling for inspect, debug, trace, export, convert, import, merge, diff, and validate operations
- Auxiliary Data: Patterns for storing vocabulary, tokenizer config, mel filterbanks, and other model-specific data
2. Design Principles
2.1 WASM-First Design
- WASM-first: Must work in `wasm32-unknown-unknown` without Emscripten
- Progressive enhancement: Features degrade gracefully (mmap → heap, compression → raw)
- Backward compatibility: APR1 files remain readable
- Zero-copy where possible: Alignment enables direct tensor access
- Streaming: Support chunked loading for large models
2.2 Toyota Way Alignment
| Principle | Application |
|---|---|
| Genchi Genbutsu | Go and see the actual model data, not abstractions |
| Visualization | Make model internals visible for debugging |
| Jidoka | Stop on quality issues (corrupted models, NaN weights) |
| Kaizen | Continuous improvement via diff and merge operations |
| Standardization | Consistent CLI interface across all operations |
3. APR v2 Format
3.1 Format Overview
┌─────────────────────────────────────────────────────────────┐
│ Header (32 bytes, aligned) │
├─────────────────────────────────────────────────────────────┤
│ Metadata Section (JSON, variable length) │
├─────────────────────────────────────────────────────────────┤
│ Tensor Index (binary, variable length) │
├─────────────────────────────────────────────────────────────┤
│ [Padding to 64-byte alignment] │
├─────────────────────────────────────────────────────────────┤
│ Tensor Data Section (aligned tensors) │
│ ├── Tensor 0 (64-byte aligned) │
│ ├── Tensor 1 (64-byte aligned) │
│ └── ... │
├─────────────────────────────────────────────────────────────┤
│ Footer (16 bytes) │
└─────────────────────────────────────────────────────────────┘
3.2 Header (32 bytes)
| Offset | Size | Field | Description |
|---|---|---|---|
| 0 | 4 | magic | APR2 (0x41505232) |
| 4 | 2 | version_major | Format major version (2) |
| 6 | 2 | version_minor | Format minor version (0) |
| 8 | 4 | flags | Feature flags (see below) |
| 12 | 4 | metadata_offset | Offset to metadata section |
| 16 | 4 | metadata_size | Size of metadata section |
| 20 | 4 | index_offset | Offset to tensor index |
| 24 | 4 | index_size | Size of tensor index |
| 28 | 4 | data_offset | Offset to tensor data section |
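The following is a minimal sketch of parsing this 32-byte header, assuming little-endian field encoding; the struct, function, and error representation are illustrative, not part of the format definition.

```rust
/// Parsed APR v2 header (field layout from the table above).
struct AprHeader {
    version_major: u16,
    version_minor: u16,
    flags: u32,
    metadata_offset: u32,
    metadata_size: u32,
    index_offset: u32,
    index_size: u32,
    data_offset: u32,
}

fn parse_header(bytes: &[u8]) -> Result<AprHeader, String> {
    if bytes.len() < 32 {
        return Err("file shorter than 32-byte header".into());
    }
    // Assumption: little-endian encoding for all multi-byte fields.
    let u16_at = |o: usize| u16::from_le_bytes([bytes[o], bytes[o + 1]]);
    let u32_at = |o: usize| u32::from_le_bytes([bytes[o], bytes[o + 1], bytes[o + 2], bytes[o + 3]]);
    if &bytes[0..4] != b"APR2" {
        return Err("bad magic: expected APR2 (E001)".into());
    }
    Ok(AprHeader {
        version_major: u16_at(4),
        version_minor: u16_at(6),
        flags: u32_at(8),
        metadata_offset: u32_at(12),
        metadata_size: u32_at(16),
        index_offset: u32_at(20),
        index_size: u32_at(24),
        data_offset: u32_at(28),
    })
}
```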
3.3 Feature Flags
use bitflags::bitflags;

bitflags! {
pub struct AprFlags: u32 {
const COMPRESSED = 0b0000_0001; // LZ4 compression enabled
const ALIGNED_64 = 0b0000_0010; // 64-byte tensor alignment
const ALIGNED_32 = 0b0000_0100; // 32-byte tensor alignment (GGUF compat)
const SHARDED = 0b0000_1000; // Multi-file model
const ENCRYPTED = 0b0001_0000; // AES-256-GCM encryption
const SIGNED = 0b0010_0000; // Ed25519 signature present
const QUANTIZED = 0b0100_0000; // Contains quantized tensors
const STREAMING = 0b1000_0000; // Streaming-optimized layout
}
}
3.4 Metadata Section
JSON object containing model configuration and auxiliary data.
Required Keys
{
"apr_version": "2.0.0",
"model_type": "whisper",
"architecture": {
"n_vocab": 51865,
"n_audio_ctx": 1500,
"n_text_ctx": 448,
"n_mels": 80,
"n_audio_layer": 4,
"n_text_layer": 4,
"n_audio_head": 6,
"n_text_head": 6,
"n_audio_state": 384,
"n_text_state": 384
}
}
Optional Keys
{
"vocab": ["<|endoftext|>", "<|startoftranscript|>", "..."],
"mel_filterbank": [0.0, 0.0, "..."],
"mel_filterbank_shape": [80, 201],
"tokenizer_config": { "..." },
"model_card": { "..." },
"quantization": {
"method": "Q8_0",
"bits_per_weight": 8.5
}
}
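A sketch of deserializing the required keys with `serde`/`serde_json`; the struct names are illustrative, and the architecture block is kept untyped because its keys vary by `model_type`.

```rust
use serde::Deserialize;

/// Required metadata keys (illustrative struct; not a normative schema).
#[derive(Debug, Deserialize)]
struct AprMetadata {
    apr_version: String,
    model_type: String,
    // Architecture keys differ per model family (Whisper, LLaMA, ...),
    // so they are kept as a raw JSON value here.
    architecture: serde_json::Value,
}

fn parse_metadata(json_bytes: &[u8]) -> Result<AprMetadata, serde_json::Error> {
    serde_json::from_slice(json_bytes)
}
```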
3.5 Tensor Index (Binary)
Index Header (8 bytes)
| Offset | Size | Field |
|---|---|---|
| 0 | 4 | tensor_count |
| 4 | 4 | reserved |
Tensor Entry (variable, ~40+ bytes each)
| Offset | Size | Field | Description |
|---|---|---|---|
| 0 | 2 | name_len | Length of tensor name |
| 2 | name_len | name | UTF-8 tensor name |
| +0 | 1 | dtype | Data type enum |
| +1 | 1 | n_dims | Number of dimensions (1-8) |
| +2 | 8×n_dims | dims | Dimension sizes (u64 each) |
| +n | 8 | offset | Byte offset in data section |
| +n+8 | 8 | size | Compressed size (or raw size) |
| +n+16 | 8 | raw_size | Uncompressed size (0 if not compressed) |
| +n+24 | 4 | flags | Per-tensor flags |
Data Type Enum
#[repr(u8)]
pub enum DType {
F32 = 0, F16 = 1, BF16 = 2, I8 = 3, I16 = 4, I32 = 5, I64 = 6, U8 = 7,
Q8_0 = 16, Q4_0 = 17, Q4_1 = 18, Q5_0 = 19, Q5_1 = 20,
}
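A sketch of decoding one variable-length index entry from the layout above, assuming little-endian encoding; names and the cursor-based return convention are illustrative.

```rust
/// One parsed index entry (layout from Section 3.5); illustrative names.
struct TensorEntry {
    name: String,
    dtype: u8,
    dims: Vec<u64>,
    offset: u64,
    size: u64,
    raw_size: u64,
    flags: u32,
}

/// Parse a single entry starting at `pos`; returns the entry and the next position.
fn parse_entry(buf: &[u8], pos: usize) -> Option<(TensorEntry, usize)> {
    let name_len = u16::from_le_bytes(buf.get(pos..pos + 2)?.try_into().ok()?) as usize;
    let mut p = pos + 2;
    let name = String::from_utf8(buf.get(p..p + name_len)?.to_vec()).ok()?;
    p += name_len;
    let dtype = *buf.get(p)?;
    let n_dims = *buf.get(p + 1)? as usize;
    p += 2;
    let mut dims = Vec::with_capacity(n_dims);
    for _ in 0..n_dims {
        dims.push(u64::from_le_bytes(buf.get(p..p + 8)?.try_into().ok()?));
        p += 8;
    }
    let u64_at = |q: usize| -> Option<u64> {
        Some(u64::from_le_bytes(buf.get(q..q + 8)?.try_into().ok()?))
    };
    let offset = u64_at(p)?;
    let size = u64_at(p + 8)?;
    let raw_size = u64_at(p + 16)?;
    let flags = u32::from_le_bytes(buf.get(p + 24..p + 28)?.try_into().ok()?);
    Some((TensorEntry { name, dtype, dims, offset, size, raw_size, flags }, p + 28))
}
```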
3.6 Tensor Data Section
Tensors stored contiguously with alignment padding.
- Default: 64-byte alignment (cache-line optimal)
- GGUF-compatible: 32-byte alignment
- Compression: Per-tensor LZ4 block compression (64KB blocks)
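The padding between tensors follows directly from the chosen alignment; a minimal helper sketch:

```rust
/// Round `offset` up to the next multiple of `align`
/// (64 by default, 32 for GGUF compatibility). `align` must be a power of two.
fn align_up(offset: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

// Example: a tensor ending at byte 1000 is followed by 24 padding bytes,
// so the next tensor starts at 1024 under 64-byte alignment.
// assert_eq!(align_up(1000, 64), 1024);
```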
3.7 Footer (16 bytes)
| Offset | Size | Field | Description |
|---|---|---|---|
| 0 | 4 | crc32 | CRC32 of all preceding bytes |
| 4 | 4 | magic_end | 2RPA (reverse magic) |
| 8 | 8 | file_size | Total file size for validation |
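A sketch of footer verification, assuming little-endian fields and the `crc32fast` crate for the checksum (the crate choice is not mandated by the spec):

```rust
/// Verify the 16-byte footer of an APR v2 file.
fn verify_footer(file: &[u8]) -> Result<(), String> {
    if file.len() < 16 {
        return Err("file too small for 16-byte footer".into());
    }
    let footer = &file[file.len() - 16..];
    let stored_crc = u32::from_le_bytes(footer[0..4].try_into().unwrap());
    if &footer[4..8] != b"2RPA" {
        return Err("footer magic mismatch".into());
    }
    let stored_size = u64::from_le_bytes(footer[8..16].try_into().unwrap());
    if stored_size != file.len() as u64 {
        return Err("file_size field does not match actual size".into());
    }
    // CRC32 of all preceding bytes (everything before the footer).
    let mut hasher = crc32fast::Hasher::new();
    hasher.update(&file[..file.len() - 16]);
    if hasher.finalize() != stored_crc {
        return Err("CRC32 mismatch (E004)".into());
    }
    Ok(())
}
```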
3.8 Sharding (Multi-File)
For models > 2GB, use manifest + shard files.
{
"apr_version": "2.0.0",
"sharded": true,
"shard_count": 4,
"shards": [
{"file": "model-00001-of-00004.apr", "size": 2147483648, "crc32": "..."},
{"file": "model-00002-of-00004.apr", "size": 2147483648, "crc32": "..."}
],
"tensor_shard_map": {
"encoder.conv1.weight": 0,
"decoder.token_embedding.weight": 1
}
}
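A sketch of resolving which shard file holds a given tensor from the manifest above, using `serde`/`serde_json`; struct names are illustrative and unknown manifest fields (e.g. `crc32`) are ignored.

```rust
use std::collections::HashMap;
use serde::Deserialize;

/// Subset of the sharding manifest needed to locate a tensor (illustrative).
#[derive(Deserialize)]
struct ShardManifest {
    shards: Vec<ShardInfo>,
    tensor_shard_map: HashMap<String, usize>,
}

#[derive(Deserialize)]
struct ShardInfo {
    file: String,
    size: u64,
}

/// Return the shard file that contains `tensor_name`, if any.
fn shard_for(manifest_json: &str, tensor_name: &str) -> Option<String> {
    let manifest: ShardManifest = serde_json::from_str(manifest_json).ok()?;
    let idx = *manifest.tensor_shard_map.get(tensor_name)?;
    manifest.shards.get(idx).map(|s| s.file.clone())
}
```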
3.9 WASM Considerations
pub trait StreamingLoader {
fn load_metadata(&mut self) -> Result<AprMetadata>;
fn load_index(&mut self) -> Result<Vec<TensorDescriptor>>;
fn load_tensor(&mut self, name: &str) -> Result<Tensor>;
fn prefetch(&mut self, names: &[&str]);
}
4. CLI Operations
4.1 Command Overview
apr - APR Model Operations Tool
COMMANDS:
inspect Inspect model metadata, vocab, and structure
debug Simple debugging output ("drama" mode)
validate Validate model integrity
diff Compare two models
tensors List tensor information
export Export model to other formats
import Import from external formats
convert Convert between model types
merge Merge multiple models
trace Trace model operations with renacer
lint Check for best practices and conventions
explain Explain errors, architecture, and tensors
tui Interactive terminal UI for exploration
4.2 Inspect Command
$ apr inspect whisper.apr
=== whisper.apr ===
Type: NeuralCustom (Whisper ASR)
Version: 1.0
Size: 1.5 GB (compressed: 890 MB)
Parameters: 39,000,000
Vocab Size: 51,865
Flags: COMPRESSED | SIGNED
Checksum: 0xA1B2C3D4 (valid)
Options: --vocab, --filters, --json, --full
4.2.1 Visual Inspection
For suspect tensors, generate an in-terminal histogram to visualize distributions (e.g., detecting shifted means):
$ apr tensors model.apr --hist encoder.layer_norm.weight
Distribution: encoder.layer_norm.weight (shape: [384])
Min: 10.4 Max: 12.1 Mean: 11.2 Std: 0.2
| *
| ***
50% | *****
| *******
| *********
+------------------
10.0 11.2 12.5
4.3 Debug Command ("Drama" Mode)
$ apr debug whisper.apr --drama
====[ DRAMA: whisper.apr ]====
ACT I: THE HEADER
Scene 1: Magic bytes... APRN (applause!)
Scene 2: Version check... 1.0 (standing ovation!)
ACT II: THE METADATA
Scene 1: Parameters... 39,000,000 (a cast of millions!)
ACT III: THE VERDICT
CURTAIN CALL: Model is PRODUCTION READY!
Options: --hex, --strings, --limit
4.4 Validate Command
$ apr validate model.apr --quality
=== 100-Point Quality Assessment ===
Structure (25 pts): 24/25
Security (25 pts): 20/25
Weights (25 pts): 25/25
Metadata (25 pts): 22/25
TOTAL: 91/100 (EXCELLENT)
4.5 Diff Command
$ apr diff model_v1.apr model_v2.apr
Similarity: 94.2%
Weight Changes: Max delta 0.0234, L2 distance 1.234
Vocab Changes: Added 42 tokens, Removed 3 tokens
Diff vs Reference
Compare an APR model against a raw .safetensors reference to detect translation drift:
$ apr diff model.apr source.safetensors --tensor-mapping mapping.json
# Output:
# encoder.conv1.weight: MATCH (delta < 1e-6)
# encoder.layer_norm.weight: DRIFT (delta = 10.2) !!!
4.6 Export Command
| Format | Extension | Use Case |
|---|---|---|
| ONNX | .onnx | Cross-framework inference |
| SafeTensors | .safetensors | HuggingFace ecosystem |
| GGUF | .gguf | llama.cpp / local inference |
| TorchScript | .pt | PyTorch deployment |
apr export model.apr --format gguf --quantize q4_0 --output model.gguf
4.7 Import Command
apr import hf://openai/whisper-tiny --output whisper.apr
apr import model.safetensors --from safetensors --output model.apr
4.8 Convert Command
Model optimization and size reduction operations.
apr convert model.apr --quantize q8_0 --output model_q8.apr
apr convert model.apr --precision fp16 --output model_fp16.apr
4.8.1 Size Reduction Techniques
| Technique | Flag | Reduction | Quality | Reversible |
|---|---|---|---|---|
| Quantization | --quantize | 2-8x | Low loss | No |
| Compression | --compress | 1.2-2x | Lossless | Yes |
| Pruning | --prune | 2-10x | Medium | No |
| Distillation | --distill | 2-10x | Medium | No |
| Low-rank (SVD) | --lowrank | 2-4x | Low loss | No |
| Sparsity | --sparse | 2-5x | Low loss | Yes |
Quantization
Reduce precision of weights:
# Integer quantization
apr convert model.apr --quantize int8 -o model-int8.apr # 4x smaller
apr convert model.apr --quantize int4 -o model-int4.apr # 8x smaller
# Float quantization
apr convert model.apr --quantize fp16 -o model-fp16.apr # 2x smaller
apr convert model.apr --quantize bf16 -o model-bf16.apr # 2x smaller
# GGUF-style quantization
apr convert model.apr --quantize q4_k_m -o model-q4km.apr # 4.5 bits/weight
apr convert model.apr --quantize q8_0 -o model-q8.apr # 8 bits/weight
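For intuition, here is a minimal sketch of Q8_0-style symmetric int8 quantization with one scale per block; the block size and layout are illustrative and do not describe the APR on-disk encoding.

```rust
/// Quantize a block of f32 weights to int8 with one f32 scale per block.
fn quantize_q8_block(block: &[f32]) -> (f32, Vec<i8>) {
    let max_abs = block.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let q = block
        .iter()
        .map(|&x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (scale, q)
}

/// Dequantize back to f32, e.g. for drift checks against the original weights.
fn dequantize_q8_block(scale: f32, q: &[i8]) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}
```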
Compression
Lossless compression of tensor data:
# LZ4 (fast, default)
apr convert model.apr --compress lz4 -o model-lz4.apr
# Zstd (better ratio)
apr convert model.apr --compress zstd -o model-zstd.apr
apr convert model.apr --compress zstd:19 -o model-zstd19.apr # Max compression
# Combine with quantization
apr convert model.apr --quantize int8 --compress zstd -o model-int8-zstd.apr
Pruning
Remove low-magnitude weights:
# Unstructured pruning (sparse tensors)
apr convert model.apr --prune 0.5 -o model-pruned.apr # 50% sparsity
# Structured pruning (remove entire neurons/heads)
apr convert model.apr --prune-heads 2 -o model-pruned.apr # Remove 2 attention heads
apr convert model.apr --prune-layers 1 -o model-pruned.apr # Remove 1 layer
# Magnitude-based with threshold
apr convert model.apr --prune-threshold 0.01 -o model-pruned.apr
Distillation
Train smaller model from larger (requires reference data):
# Distill to smaller architecture
apr convert model-large.apr --distill tiny --data train.jsonl -o model-tiny.apr
# Layer reduction
apr convert model.apr --distill-layers 4 --data train.jsonl -o model-4layer.apr
# Knowledge distillation with temperature
apr convert model.apr --distill small --temperature 2.0 --data train.jsonl -o model-small.apr
Note: Distillation requires training data and compute. Use --epochs and --lr to control.
Low-Rank Factorization
Decompose weight matrices using SVD/LoRA:
# SVD decomposition
apr convert model.apr --lowrank svd --rank 64 -o model-svd.apr
# LoRA-style decomposition
apr convert model.apr --lowrank lora --rank 16 -o model-lora.apr
# Target specific layers
apr convert model.apr --lowrank svd --rank 32 --target "*.fc1.weight" -o model-svd.apr
Sparsity Encoding
Efficient storage for sparse tensors:
# CSR format for sparse tensors
apr convert model.apr --sparse csr --threshold 0.001 -o model-sparse.apr
# Block sparsity (GPU-friendly)
apr convert model.apr --sparse block:4 -o model-block-sparse.apr
4.8.2 Combination Examples
# Maximum compression pipeline
apr convert model.apr \
--quantize int4 \
--prune 0.3 \
--compress zstd:19 \
-o model-optimized.apr
# Result: ~20x smaller than original
# WASM-optimized (fast decode, small size)
apr convert model.apr \
--quantize int8 \
--compress lz4 \
-o model-wasm.apr
# Result: ~5x smaller, fast streaming decode
# Quality-preserving compression
apr convert model.apr \
--quantize fp16 \
--lowrank svd --rank 128 \
--compress zstd \
-o model-quality.apr
# Result: ~3x smaller, minimal quality loss
4.8.3 Size Comparison Table
| Technique | Whisper Tiny | Whisper Base | LLaMA 7B |
|---|---|---|---|
| Original (f32) | 145 MB | 290 MB | 26 GB |
| fp16 | 73 MB | 145 MB | 13 GB |
| int8 | 37 MB | 73 MB | 6.5 GB |
| int4 | 19 MB | 37 MB | 3.3 GB |
| int4 + zstd | 15 MB | 29 MB | 2.6 GB |
| int4 + prune50% | 10 MB | 19 MB | 1.7 GB |
4.8.4 Quality Validation (Pre vs Post)
Compare model quality before and after optimization:
# Compare outputs between original and optimized
apr validate model.apr model-optimized.apr --quality
Quality Comparison: model.apr vs model-optimized.apr
═══════════════════════════════════════════════════════════════
Original Optimized Δ
Tensor count 167 167 0
Total params 39.0M 39.0M 0
Non-zero params 39.0M 19.5M -50%
Size 145 MB 15 MB -89%
Output Comparison (10 test inputs):
Mean L2 distance: 0.0234 (threshold: 0.1) ✓ PASS
Max L2 distance: 0.0891 (threshold: 0.5) ✓ PASS
Cosine similarity: 0.9987 (threshold: 0.99) ✓ PASS
Layer-by-layer drift:
encoder.conv1: 0.001 ✓
encoder.layer_norm: 0.002 ✓
decoder.layer_norm: 0.089 ⚠ (highest drift)
VERDICT: ✓ PASS - Optimized model within quality tolerance
═══════════════════════════════════════════════════════════════
Canary Inputs
Define reference inputs with expected outputs for regression testing:
# Create canary test suite
apr canary create model.apr --input test.wav --output canary.json
# Validate optimized model against canary
apr canary check model-optimized.apr --canary canary.json
Canary Test Results:
Input: test.wav
Expected: "The quick brown fox jumps over the lazy dog"
Original: "The quick brown fox jumps over the lazy dog" ✓
Optimized: "The quick brown fox jumps over the lazy dog" ✓
Token-level accuracy: 100%
Character error rate: 0.0%
Automatic Quality Gates
# Fail optimization if quality degrades beyond threshold
apr convert model.apr --quantize int4 --prune 0.5 \
--quality-check \
--max-drift 0.1 \
--canary canary.json \
-o model-optimized.apr
# If quality check fails:
# ERROR: Quality gate failed
# - L2 drift: 0.24 (max: 0.1)
# - Canary "test.wav" failed: expected "fox" got "box"
# Use --force to ignore quality gates
4.8.5 Payload Tracing (Radioactive Tracer)
Trace a payload through the model step-by-step, like a radioactive tracer in medicine:
apr trace model.apr --input test.wav --trace-payload
Payload Trace: test.wav → model.apr
═══════════════════════════════════════════════════════════════
Step 1: Audio Input
Shape: [1, 480000] (30s @ 16kHz)
Stats: mean=0.002, std=0.15, range=[-0.98, 0.97]
Step 2: Mel Spectrogram
Shape: [1, 80, 3000]
Stats: mean=-4.2, std=2.1
▁▂▃▄▅▆▇█▇▆▅▄▃▂▁ (frequency distribution)
Step 3: encoder.conv1
Shape: [1, 384, 3000]
Stats: mean=0.12, std=0.34
Time: 2.3ms
⚠ Activation spike at position 1247 (value: 12.4)
Step 4: encoder.conv2
Shape: [1, 384, 1500]
Stats: mean=0.08, std=0.29
Time: 1.8ms
Step 5: encoder.positional_embedding
Shape: [1, 1500, 384]
Stats: mean=0.08, std=0.31
Step 6: encoder.layers.0.self_attn
Shape: [1, 1500, 384]
Attention pattern:
░░░░░░░░░░░░░░░░░░░░
░░░░████░░░░░░░░░░░░ ← attending to positions 40-80
░░░░░░░░░░░░████░░░░
... (layers 1-3) ...
Step 10: encoder.layer_norm
Shape: [1, 1500, 384]
Stats: mean=0.00, std=1.02 ✓ (properly normalized)
Step 11: decoder.token_embedding (SOT token)
Shape: [1, 1, 384]
Token: <|startoftranscript|> (50258)
... (decoder steps) ...
Step 47: Output Logits
Shape: [1, 12, 51865]
Top predictions:
1. "The" (0.94)
2. "A" (0.03)
3. "This" (0.01)
═══════════════════════════════════════════════════════════════
Total time: 142ms | Peak memory: 312MB | Tokens generated: 12
Comparing Traces (Diff Mode)
Compare payload path between two models:
apr trace model.apr model-optimized.apr --input test.wav --diff
Trace Diff: model.apr vs model-optimized.apr
═══════════════════════════════════════════════════════════════
Step Layer Original Optimized Drift
───── ───── ──────── ───────── ─────
1 audio_input ████████ ████████ 0.000
2 mel_spectrogram ████████ ████████ 0.000
3 encoder.conv1 ████████ ███████░ 0.012
4 encoder.conv2 ████████ ███████░ 0.018
...
10 encoder.layer_norm ████████ ██████░░ 0.089 ⚠
11 decoder.token_embed ████████ ████████ 0.001
...
47 output_logits ████████ ███████░ 0.023
Divergence detected at: encoder.layer_norm (step 10)
Original mean: 0.0023
Optimized mean: 0.0892
Recommendation: Check layer norm weight quantization
Anomaly Detection
Automatically detect unusual activations:
apr trace model.apr --input test.wav --detect-anomalies
Anomaly Report:
═══════════════════════════════════════════════════════════════
⚠ ANOMALY at encoder.layers.2.self_attn (step 8)
- Activation explosion: max=847.3 (expected <10)
- Possible cause: NaN propagation or weight corruption
- Affected tokens: positions 120-135
⚠ ANOMALY at decoder.layer_norm (step 15)
- Dead neurons: 12% of outputs are exactly 0
- Possible cause: Aggressive pruning or ReLU saturation
✓ No anomalies in remaining 45 layers
Interactive Trace Mode (TUI)
apr trace model.apr --input test.wav --interactive
┌─────────────────────────────────────────────────────────────────┐
│ Payload Trace: test.wav [Interactive] │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─ Pipeline ───────────────────────────────────────────────┐ │
│ │ │ │
│ │ [Audio] ──▶ [Mel] ──▶ [Conv1] ──▶ [Conv2] ──▶ ... │ │
│ │ ✓ ✓ ✓ ✓ │ │
│ │ ▲ │ │
│ │ │ YOU ARE HERE │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Current Layer: encoder.conv2 ───────────────────────────┐ │
│ │ Input: [1, 384, 3000] Output: [1, 384, 1500] │ │
│ │ Params: 589,824 Time: 1.8ms │ │
│ │ │ │
│ │ Activation Distribution: │ │
│ │ ▁▂▃▄▅▆▇█▇▆▅▄▃▂▁ │ │
│ │ -2.0 0 2.0 │ │
│ │ │ │
│ │ Weight Stats: mean=0.002, std=0.04 │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Payload Snapshot ───────────────────────────────────────┐ │
│ │ [0.12, 0.34, -0.21, 0.08, 0.45, -0.11, 0.02, ...] │ │
│ │ mean=0.08 std=0.29 min=-1.2 max=2.1 │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
├─────────────────────────────────────────────────────────────────┤
│ [←/→] step [Enter] inspect [d]iff [e]xport [q]uit 4/47 │
└─────────────────────────────────────────────────────────────────┘
Export Trace for Analysis
# Export full trace to JSON
apr trace model.apr --input test.wav --export trace.json
# Export to Chrome trace format (for chrome://tracing)
apr trace model.apr --input test.wav --export trace.perfetto
# Export intermediate activations for debugging
apr trace model.apr --input test.wav --dump-activations ./activations/
4.8.6 Debugging Conversion
# Analyze source tensor stats without converting
apr convert model.safetensors --analyze-source --arch whisper
# Output:
# [PASS] encoder.conv1.weight: mean=0.003 (expected ~0.0)
# [FAIL] encoder.layer_norm.weight: mean=11.2 (expected ~1.0) -> SOURCE ALREADY CORRUPT?
4.9 Merge Command
| Strategy | Description |
|---|---|
| average | Average weights (ensemble) |
| weighted | Weighted average by performance |
| ties | TIES merging (trim, elect, sign) |
| dare | DARE merging (drop and rescale) |
| slerp | Spherical linear interpolation |
apr merge model1.apr model2.apr --strategy ties --output merged.apr
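A minimal sketch of the `average` strategy: an element-wise mean over matching tensors (e.g. merging [1.0] and [3.0] yields [2.0], as exercised by point 58 of the QA checklist). Function name and shape handling are illustrative.

```rust
/// Element-wise average of N models' versions of the same tensor.
/// All slices must have equal length (same tensor shape).
fn merge_average(tensors: &[&[f32]]) -> Option<Vec<f32>> {
    let first = tensors.first()?;
    if tensors.iter().any(|t| t.len() != first.len()) {
        return None; // shape mismatch between models
    }
    let n = tensors.len() as f32;
    Some(
        (0..first.len())
            .map(|i| tensors.iter().map(|t| t[i]).sum::<f32>() / n)
            .collect(),
    )
}

// merge_average(&[&[1.0], &[3.0]]) == Some(vec![2.0])
```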
4.10 Trace Command
$ apr trace model.apr --input sample.wav
Layer Time (ms) Memory (MB)
encoder.conv1 12.3 45.2
decoder.attention.0 15.4 12.3
TOTAL 142.5 312.4
4.11 Lint Command
Static analysis for best practices, conventions, and "soft" requirements. Unlike validate (which checks for corruption/invalidity), lint checks for quality and standardization.
$ apr lint model.apr
[WARN] Metadata: Missing 'license' field
[WARN] Metadata: Missing 'model_card'
[INFO] Tensor Naming: 'encoder.w' should be 'encoder.weight' for auto-mapping
[INFO] Efficiency: 12 tensors could be aligned to 64 bytes (currently 32)
Falsifiable Guarantees (Must Fail If):
- Naming: Any tensor name not matching canonical schema (Section 10.8) raises INFO/WARN.
- Metadata: Missing `license`, `model_card`, or `provenance` raises WARN.
- Efficiency: Tensors unaligned to 64 bytes raise INFO.
- Compression: Uncompressed tensors >1MB raise INFO.
4.12 Explain Command
Provides human-readable context, architectural explanations, and error troubleshooting.
Explain Model Architecture
$ apr explain model.apr
This is a **Whisper (Tiny)** model.
- **Purpose**: Automatic Speech Recognition (ASR)
- **Architecture**: Encoder-Decoder Transformer
- **Input**: 80-channel Mel spectrograms
- **Output**: Text tokens (multilingual)
Explain Specific Tensor
$ apr explain model.apr --tensor encoder.conv1.weight
**encoder.conv1.weight**
- **Role**: Initial feature extraction (Audio -> Latent)
- **Shape**: [384, 80, 3] (Filters, Input Channels, Kernel Size)
- **Stats**: Mean 0.002, Std 0.04 (Healthy)
Explain Error Codes
$ apr explain E002
**E002: Corrupted Data**
The payload checksum does not match the header.
- **Common Causes**: Interrupted download, bit rot, disk error.
- **Troubleshooting**:
1. Run `apr validate --checksum` to verify.
2. Check source file integrity (MD5/SHA256).
Falsifiable Guarantees:
- Unknown Error: `apr explain E999` must return "Unknown Error Code" (not crash).
- Unknown Tensor: `apr explain --tensor nonexistent` must list fuzzy matches.
- Architecture: Must correctly identify all supported architectures (Section 10).
4.13 TUI Command
Interactive terminal UI for model exploration, statistics visualization, and comparison. Built with ratatui and trueno-viz.
$ apr tui model.apr
$ apr tui model1.apr model2.apr --compare
4.13.1 Graph View
ASCII/Unicode graph visualization of model architecture:
┌─────────────────────────────────────────────────────────────────┐
│ Model: whisper-tiny.apr [Graph View] │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Audio │───▶│ Conv1 │───▶│ Conv2 │ │
│ │ [80,3000]│ │[384,80,3]│ │[384,384]│ │
│ └─────────┘ └─────────┘ └─────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Encoder Layers (×4) │ │
│ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │
│ │ │Self-Attn│──▶│ LN │──▶│ FFN │──▶│ LN │ │ │
│ │ └────────┘ └────────┘ └────────┘ └────────┘ │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Decoder Layers (×4) │ │
│ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │
│ │ │Self-Attn│──▶│Cross-Attn│─▶│ FFN │──▶│ LN │ │ │
│ │ └────────┘ └────────┘ └────────┘ └────────┘ │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Output │ │
│ │ [51865] │ │
│ └─────────────┘ │
│ │
├─────────────────────────────────────────────────────────────────┤
│ [g]raph [s]tats [c]ompare [t]ensors [h]ist [q]uit Page 1/3 │
└─────────────────────────────────────────────────────────────────┘
4.13.2 Descriptive Statistics View
Live-updating tensor statistics dashboard:
┌─────────────────────────────────────────────────────────────────┐
│ Model: whisper-tiny.apr [Stats View] │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─ Overview ───────────────────────────────────────────────┐ │
│ │ Total Params: 39,000,000 Tensors: 167 Size: 145MB │ │
│ │ Quantization: f32 Vocab: 51,865 Arch: Whisper│ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Layer Norm Health ──────────────────────────────────────┐ │
│ │ Tensor Mean Std Status │ │
│ │ encoder.layer_norm.weight 1.48 0.32 ✓ OK │ │
│ │ decoder.layer_norm.weight 11.10 0.21 ✗ BAD │ │
│ │ encoder.layers.0.ln.weight 1.22 0.28 ✓ OK │ │
│ │ encoder.layers.1.ln.weight 1.35 0.31 ✓ OK │ │
│ │ encoder.layers.2.ln.weight 1.41 0.29 ✓ OK │ │
│ │ encoder.layers.3.ln.weight 10.94 0.18 ✗ BAD │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Weight Distribution ────────────────────────────────────┐ │
│ │ │ │
│ │ Attention: ████████████████████ Mean: 0.002 ✓ │ │
│ │ FFN: ███████████████████ Mean: 0.001 ✓ │ │
│ │ Embedding: █████████████████ Mean: 0.015 ✓ │ │
│ │ LayerNorm: ██████████████████████████████████ ✗ │ │
│ │ ↑ outlier: decoder.layer_norm.weight │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Validation Score ───────────────────────────────────────┐ │
│ │ ████████████████████░░░░ 21/25 FAIL │ │
│ │ Critical: 2 Layer Norm weights outside [0.5, 3.0] │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
├─────────────────────────────────────────────────────────────────┤
│ [g]raph [s]tats [c]ompare [t]ensors [h]ist [q]uit Page 1/1 │
└─────────────────────────────────────────────────────────────────┘
4.13.3 Comparison View
Side-by-side model comparison with diff highlighting:
┌─────────────────────────────────────────────────────────────────┐
│ Comparing: model_v1.apr vs model_v2.apr [Compare View] │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─ Summary ────────────────────────────────────────────────┐ │
│ │ Similarity: 94.2% Changed: 12 tensors New: 0 │ │
│ │ Max Δ: 0.0234 L2 Dist: 1.234 Removed: 0 │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Tensor Comparison ──────────────────────────────────────┐ │
│ │ Tensor v1 Mean v2 Mean Δ │ │
│ │ encoder.conv1.weight 0.0023 0.0025 +0.0002 │ │
│ │ encoder.layer_norm.wt 1.4832 1.4901 +0.0069 │ │
│ │ decoder.layer_norm.wt 11.0983 1.0521 -10.0462 !! │ │
│ │ decoder.layers.0.fc1.wt 0.0012 0.0014 +0.0002 │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Distribution Comparison ────────────────────────────────┐ │
│ │ │ │
│ │ decoder.layer_norm.weight: │ │
│ │ │ │
│ │ v1: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░████ (mean=11.1) │ │
│ │ v2: ░░░░░░░░░░████░░░░░░░░░░░░░░░░░░░░░░ (mean=1.05) │ │
│ │ ────────────────────────────────────── │ │
│ │ 0 5 10 15 │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Validation Score Comparison ────────────────────────────┐ │
│ │ v1: ████████████████████░░░░ 21/25 FAIL │ │
│ │ v2: ████████████████████████ 25/25 PASS ← IMPROVED │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
├─────────────────────────────────────────────────────────────────┤
│ [g]raph [s]tats [c]ompare [t]ensors [h]ist [q]uit Page 1/1 │
└─────────────────────────────────────────────────────────────────┘
4.13.4 Histogram View
Per-tensor distribution visualization with sparklines:
┌─────────────────────────────────────────────────────────────────┐
│ Tensor: decoder.layer_norm.weight [Histogram] │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Shape: [384] dtype: f32 Size: 1.5 KB │
│ Mean: 11.0983 Std: 0.2134 Min: 10.42 Max: 12.01 │
│ │
│ Distribution: │
│ │
│ 150 │ ▄▄▄▄ │
│ │ ▄██████▄ │
│ 100 │ ▄██████████▄ │
│ │ ▄██████████████▄ │
│ 50 │ ▄██████████████████▄ │
│ │ ▄██████████████████████▄ │
│ 0 ├────────────────────────────────────────────── │
│ 10.0 10.5 11.0 11.5 12.0 │
│ │
│ ⚠ ANOMALY DETECTED: │
│ Expected mean ≈ 1.0 for LayerNorm weight │
│ Actual mean = 11.0983 (10x higher than expected) │
│ │
│ Possible causes: │
│ • Incorrect tensor scaling during conversion │
│ • Wrong tensor mapped to this name │
│ • Source model corruption │
│ │
├─────────────────────────────────────────────────────────────────┤
│ [←/→] prev/next tensor [Enter] select [q] back 12/167 │
└─────────────────────────────────────────────────────────────────┘
4.13.5 Keybindings
| Key | Action |
|---|---|
| g | Switch to Graph view |
| s | Switch to Stats view |
| c | Switch to Compare view (if 2 models) |
| t | Switch to Tensor list |
| h | Switch to Histogram view |
| Enter | Select/drill down |
| Esc | Back/cancel |
| ↑/↓ | Navigate list |
| ←/→ | Previous/next page or tensor |
| / | Search tensors |
| ? | Help |
| q | Quit |
4.13.6 Implementation
Crates:
- `ratatui = "0.28"` - Terminal UI framework
- `crossterm = "0.28"` - Cross-platform terminal handling
- `trueno-viz` - Tensor visualization utilities (optional)
Feature Flag:
[features]
tui = ["ratatui", "crossterm"]
5. Auxiliary Data Patterns
5.1 JSON Metadata Pattern
[APR magic] → [metadata_len] → [JSON metadata] → [tensors] → [CRC32]
↑
Auxiliary data here
5.2 Common Auxiliary Data Types
Vocabulary (NLP)
{"vocab": ["<pad>", "<unk>", "the", "..."], "vocab_size": 51865}
Mel Filterbank (Audio)
{"mel_filterbank": [0.0, "..."], "mel_filterbank_shape": [80, 201]}
Tokenizer Config
{"tokenizer_config": {"type": "bpe", "unk_token": "<|unk|>", "eos_token": "<|endoftext|>"}}
Image Preprocessing (Vision)
{"image_config": {"image_size": 224, "mean": [0.485, 0.456, 0.406]}}
Label Mapping (Classification)
{"labels": {"0": "cat", "1": "dog"}, "num_labels": 2}
5.3 Tensor Storage for Large Data
| Data Size | JSON Metadata | Tensor |
|---|---|---|
| < 100KB | Preferred | Overkill |
| 100KB - 1MB | Acceptable | Good |
| > 1MB | Avoid | Preferred |
Naming convention: audio.mel_filterbank, text.token_embedding
5.4 Best Practices
- Use standard keys: Follow HuggingFace/GGUF conventions
- Include shape info: Always store shape alongside flattened arrays
- Version metadata: Include `format_version` for compatibility
- Document units: Specify if values are normalized, in Hz, etc.
- Validate on load: Check array lengths match expected shapes
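A sketch of the "validate on load" check for flattened auxiliary arrays, using the field names from Section 5.2 (e.g. `mel_filterbank` vs `mel_filterbank_shape`); the function name is illustrative.

```rust
/// Check that a flattened auxiliary array matches its declared shape,
/// e.g. a mel filterbank of length 16080 against shape [80, 201].
fn check_aux_shape(values: &[f32], shape: &[usize]) -> Result<(), String> {
    let expected: usize = shape.iter().product();
    if values.len() != expected {
        return Err(format!(
            "auxiliary array length {} does not match shape {:?} (expected {})",
            values.len(),
            shape,
            expected
        ));
    }
    Ok(())
}
```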
6. Format Comparison
| Feature | APR1 | APR2 | GGUF | SafeTensors |
|---|---|---|---|---|
| WASM-first | Yes | Yes | No | Yes |
| Tensor alignment | No | Yes (64B) | Yes (32B) | Yes |
| Compression | No | LZ4 | No | No |
| Quantization | Metadata | Native | Native | No |
| Sharding | No | Yes | No | Yes |
| Streaming | No | Yes | No | No |
| JSON metadata | Yes | Yes | Typed KV | JSON |
| CRC32 | Yes | Yes | No | No |
7. Error Handling
| Code | Category | Description |
|---|---|---|
| E001 | FORMAT | Invalid file format |
| E002 | CORRUPT | Corrupted data |
| E003 | VERSION | Unsupported version |
| E004 | CHECKSUM | Checksum mismatch |
| E005 | DECRYPT | Decryption failed |
| E006 | SIGNATURE | Signature invalid |
| E007 | IO | File I/O error |
| E008 | MEMORY | Out of memory |
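One way these codes might surface as a typed Rust error, following the `thiserror` style of the `ImportError` example in Section 13.8; the enum and variant names here are illustrative, not a defined API.

```rust
/// Error codes from the table above as a typed enum (illustrative sketch).
#[derive(Debug, thiserror::Error)]
pub enum AprError {
    #[error("E001 FORMAT: invalid file format")]
    Format,
    #[error("E002 CORRUPT: corrupted data")]
    Corrupt,
    #[error("E003 VERSION: unsupported version {0}.{1}")]
    Version(u16, u16),
    #[error("E004 CHECKSUM: checksum mismatch")]
    Checksum,
    #[error("E005 DECRYPT: decryption failed")]
    Decrypt,
    #[error("E006 SIGNATURE: signature invalid")]
    Signature,
    #[error("E007 IO: file I/O error")]
    Io(#[from] std::io::Error),
    #[error("E008 MEMORY: out of memory")]
    Memory,
}
```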
8. Configuration
# ~/.config/apr/config.toml
[defaults]
output_format = "text"
color = true
[inspect]
show_vocab = true
max_tokens_display = 20
[debug]
drama_mode = false
hex_limit = 256
[validate]
strict = true
require_signature = false
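A sketch of loading this configuration with the `serde` and `toml` crates; struct and field names mirror the keys above, and missing sections fall back to defaults. The types are illustrative, not the tool's actual internals.

```rust
use serde::Deserialize;

/// Mirrors ~/.config/apr/config.toml (subset of the example above).
#[derive(Debug, Deserialize, Default)]
#[serde(default)]
struct AprConfig {
    defaults: Defaults,
    validate: Validate,
}

#[derive(Debug, Deserialize)]
#[serde(default)]
struct Defaults {
    output_format: String,
    color: bool,
}

impl Default for Defaults {
    fn default() -> Self {
        Self { output_format: "text".into(), color: true }
    }
}

#[derive(Debug, Deserialize, Default)]
#[serde(default)]
struct Validate {
    strict: bool,
    require_signature: bool,
}

/// Load the config, falling back to defaults if the file is missing or invalid.
fn load_config(path: &std::path::Path) -> AprConfig {
    std::fs::read_to_string(path)
        .ok()
        .and_then(|s| toml::from_str(&s).ok())
        .unwrap_or_default()
}
```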
9. Quality Gates
# .pmat-gates.toml
[apr-ops]
test_coverage_minimum = 95.0
max_cyclomatic_complexity = 10
satd_maximum = 0
mutation_score_minimum = 85.0
max_inspect_latency_ms = 100
10. Multi-Format Conversion Specification
10.1 Supported Input Formats
APR supports conversion from all major ML model formats:
| Format | Extensions | Source | Priority | Status |
|---|---|---|---|---|
| SafeTensors | .safetensors | HuggingFace | P0 | ✅ Implemented |
| PyTorch | .pt, .pth, .bin | PyTorch | P0 | 🔲 Planned |
| GGUF | .gguf | llama.cpp | P1 | 🔲 Planned |
| GGML | .bin | Legacy llama.cpp | P2 | 🔲 Planned |
| ONNX | .onnx | ONNX Runtime | P1 | 🔲 Planned |
| TensorFlow | .pb, .h5, SavedModel | TensorFlow/Keras | P2 | 🔲 Planned |
| Core ML | .mlmodel, .mlpackage | Apple | P3 | 🔲 Future |
| TensorRT | .engine, .plan | NVIDIA | P3 | 🔲 Future |
Critical Lesson Learned: A single incorrect tensor conversion (e.g., decoder.layer_norm.weight with mean=11 instead of ~1) can cause complete model failure while passing basic structural checks.
10.2 SafeTensors (HuggingFace)
Status: ✅ Primary implementation
File Structure:
model.safetensors
├── Header (8 bytes): JSON length (u64 LE)
├── JSON Metadata: tensor names, shapes, dtypes, offsets
└── Tensor Data: contiguous f32/f16/bf16 arrays
CLI Usage:
apr convert model.safetensors -o model.apr
apr convert model.safetensors --quantize int8 -o model-int8.apr
# From HuggingFace Hub
apr convert hf://openai/whisper-tiny -o whisper-tiny.apr
Data Types:

| SafeTensors Type | APR Conversion |
|---|---|
| F32 | Direct copy |
| F16 | Convert to f32 or keep as f16 |
| BF16 | Convert to f32 |
| I8 | Keep as int8 (quantized) |
Crate: safetensors = "0.4"
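A sketch of reading a SafeTensors file with the `safetensors` crate listed above, listing tensor names, shapes, and dtypes before mapping them to APR names:

```rust
use safetensors::SafeTensors;

/// List tensor names and shapes in a .safetensors file (safetensors = "0.4").
fn list_safetensors(path: &std::path::Path) -> Result<(), Box<dyn std::error::Error>> {
    let bytes = std::fs::read(path)?;
    let st = SafeTensors::deserialize(&bytes)?;
    for (name, view) in st.tensors() {
        println!("{name}: shape={:?} dtype={:?}", view.shape(), view.dtype());
    }
    Ok(())
}
```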
10.3 PyTorch (.pt, .pth, .bin)
Status: 🔲 Planned (P0)
File Structure:
model.pt (ZIP archive)
├── data.pkl # Python pickle with tensor metadata
├── data/0 # Raw tensor bytes
├── data/1
└── ...
Security Warning: PyTorch files use Python pickle, which can execute arbitrary code. APR conversion MUST:
- Use pickle in restricted mode (no arbitrary imports)
- Validate tensor shapes before allocation
- Reject files with suspicious pickle opcodes
CLI Usage:
apr convert model.pt -o model.apr --arch whisper
apr convert model.pth -o model.apr --arch llama
# With state_dict key prefix
apr convert model.pt -o model.apr --prefix "model."
Implementation Notes:
- Use the `zip` crate for archive extraction
- Implement a minimal pickle parser (BINGET, MARK, TUPLE, etc.)
- Map `torch.float32` → f32, `torch.float16` → f16
- Handle both full checkpoints and state_dict-only files
Crate: Custom pickle parser (no Python dependency)
10.4 GGUF (llama.cpp)
Status: 🔲 Planned (P1)
File Structure:
model.gguf
├── Magic (4 bytes): "GGUF"
├── Version (4 bytes): u32
├── Tensor Count (8 bytes): u64
├── Metadata KV Count (8 bytes): u64
├── Metadata KV Pairs: typed key-value store
├── Tensor Infos: name, dims, type, offset
└── Tensor Data: aligned, possibly quantized
CLI Usage:
apr convert model.gguf -o model.apr
apr convert model-q4_k_m.gguf -o model.apr --dequantize f32
apr convert model.gguf -o model.apr --keep-quantization
Quantization Types:

| GGUF Type | Bits | APR Handling |
|---|---|---|
| F32 | 32 | Direct copy |
| F16 | 16 | Convert or keep |
| Q8_0 | 8 | Dequantize or convert to APR int8 |
| Q4_0 | 4 | Dequantize to f32 |
| Q4_K_M | 4.5 | Dequantize to f32 |
| Q5_K_M | 5.5 | Dequantize to f32 |
| Q6_K | 6 | Dequantize to f32 |
Metadata Mapping:
| GGUF Key | APR Metadata |
|----------|--------------|
| general.architecture | model_type |
| general.name | model_name |
| llama.context_length | context_length |
| llama.embedding_length | hidden_size |
| tokenizer.ggml.tokens | Vocabulary |
Crate: Custom GGUF parser
10.5 GGML (Legacy)
Status: 🔲 Planned (P2)
File Structure:
model.bin
├── Magic (4 bytes): "lmgg" or "tjgg"
├── Hyperparameters: model-specific struct
├── Vocabulary: token strings
└── Tensors: name + dims + data (unaligned)
CLI Usage:
apr convert model.bin -o model.apr --format ggml --arch llama
Notes:
- Legacy format, prefer GGUF for new conversions
- No standardized metadata format
- Architecture must be specified manually
10.6 ONNX
Status: 🔲 Planned (P1)
File Structure:
model.onnx (Protobuf)
├── ModelProto
│ ├── graph: GraphProto
│ │ ├── node[]: operators
│ │ ├── input[]: model inputs
│ │ ├── output[]: model outputs
│ │ └── initializer[]: weight tensors
│ └── metadata_props: key-value pairs
CLI Usage:
apr convert model.onnx -o model.apr
apr convert model.onnx -o model.apr --opset 17
Data Types:

| ONNX Type | APR Conversion |
|---|---|
| FLOAT | f32 |
| FLOAT16 | f16 |
| BFLOAT16 | f32 (convert) |
| INT8 | int8 |
| UINT8 | int8 (reinterpret) |
Crate: onnx-pb = "0.1" or custom protobuf parser
10.7 TensorFlow/Keras
Status: 🔲 Planned (P2)
Supported Formats:
| Format | Description | CLI Flag |
|---|---|---|
| SavedModel | Directory with saved_model.pb | --format savedmodel |
| HDF5 | Keras .h5 files | --format h5 |
| Frozen Graph | Single .pb file | --format frozen |
| TFLite | .tflite mobile format | --format tflite |
CLI Usage:
apr convert saved_model/ -o model.apr --format savedmodel
apr convert model.h5 -o model.apr --format h5
apr convert model.tflite -o model.apr --format tflite
Notes:
- HDF5 requires the `hdf5` crate
- SavedModel requires protobuf parsing
- TFLite uses FlatBuffers
10.8 Tensor Name Mapping
Each source format uses different naming conventions. APR standardizes to a canonical form:
Whisper Model Mapping
| Source Format | Source Name | APR Name |
|---|---|---|
| SafeTensors | model.encoder.conv1.weight | encoder.conv1.weight |
| SafeTensors | model.encoder.embed_positions.weight | encoder.positional_embedding |
| SafeTensors | model.decoder.embed_tokens.weight | decoder.token_embedding |
| PyTorch | encoder.conv1.weight | encoder.conv1.weight |
| GGUF | encoder.conv1.weight | encoder.conv1.weight |
| ONNX | /encoder/conv1/weight | encoder.conv1.weight |
LLaMA Model Mapping
| Source Format | Source Name | APR Name |
|---|---|---|
| SafeTensors | model.embed_tokens.weight | token_embedding |
| SafeTensors | model.layers.0.self_attn.q_proj.weight | layers.0.attn.q_proj.weight |
| GGUF | token_embd.weight | token_embedding |
| GGUF | blk.0.attn_q.weight | layers.0.attn.q_proj.weight |
Full HuggingFace Whisper Mapping
| HuggingFace Name | APR Name |
|---|---|
| model.encoder.conv1.weight | encoder.conv1.weight |
| model.encoder.conv1.bias | encoder.conv1.bias |
| model.encoder.conv2.weight | encoder.conv2.weight |
| model.encoder.conv2.bias | encoder.conv2.bias |
| model.encoder.embed_positions.weight | encoder.positional_embedding |
| model.encoder.layer_norm.weight | encoder.layer_norm.weight |
| model.encoder.layer_norm.bias | encoder.layer_norm.bias |
| model.encoder.layers.N.self_attn_layer_norm.weight | encoder.layers.N.self_attn_layer_norm.weight |
| model.encoder.layers.N.self_attn.q_proj.weight | encoder.layers.N.self_attn.q_proj.weight |
| model.decoder.embed_tokens.weight | decoder.token_embedding |
| model.decoder.embed_positions.weight | decoder.positional_embedding |
| model.decoder.layer_norm.weight | decoder.layer_norm.weight |
| model.decoder.layer_norm.bias | decoder.layer_norm.bias |
| model.decoder.layers.N.self_attn_layer_norm.weight | decoder.layers.N.self_attn_layer_norm.weight |
| model.decoder.layers.N.encoder_attn_layer_norm.weight | decoder.layers.N.encoder_attn_layer_norm.weight |
| model.decoder.layers.N.final_layer_norm.weight | decoder.layers.N.final_layer_norm.weight |
10.9 Expected Tensor Statistics
Layer Norm Weights (gamma) - MUST have mean ≈ 1.0:
Tensor Expected Mean Acceptable Range
encoder.layer_norm.weight 1.0 - 2.0 [0.5, 3.0]
decoder.layer_norm.weight 1.0 - 2.0 [0.5, 3.0]
*.self_attn_layer_norm.weight 1.0 - 2.0 [0.5, 3.0]
*.encoder_attn_layer_norm.weight 1.0 - 2.0 [0.5, 3.0]
*.final_layer_norm.weight 1.0 - 2.0 [0.5, 3.0]
Layer Norm Bias (beta) - MUST have mean ≈ 0.0:
Tensor Expected Mean Acceptable Range
*.layer_norm.bias 0.0 [-0.5, 0.5]
Attention/Linear Weights - Should have mean ≈ 0.0:
Tensor Expected Mean Expected Std
*.q_proj.weight ~0.0 0.02 - 0.10
*.k_proj.weight ~0.0 0.02 - 0.10
*.v_proj.weight ~0.0 0.02 - 0.10
*.out_proj.weight ~0.0 0.02 - 0.10
*.fc1.weight ~0.0 0.02 - 0.05
*.fc2.weight ~0.0 0.02 - 0.05
Embeddings:
Tensor Expected Mean Expected Std
token_embedding ~0.0 0.02 - 0.05
positional_embedding ~0.0 0.01 - 0.02
10.10 Conversion Validation Requirements
- Shape Validation: Every tensor must match expected shape for model architecture
- Value Validation: Every tensor must have statistics within expected ranges
- Reference Comparison: Converted model must produce outputs within tolerance of HF reference
- Inline Validation (Strict Mode): The `apr convert` tool MUST run the statistical checks (Section 10.9) as tensors are being written.
  - Default Behavior: If a tensor violates the "Acceptable Range" (e.g., LayerNorm mean > 3.0), the conversion aborts with an error.
  - Override: Use `--force` or `--relaxed` to bypass this check.
  - Justification: Better to fail early than produce a "zombie" model.
10.11 Known Failure Modes
| Failure | Symptom | Root Cause | Troubleshooting |
|---|---|---|---|
| LN weight mean=11 | Repetitive token output (e.g., "...") | Incorrect tensor scaling or name mapping | Use apr tensors --hist to visualize distribution |
| Missing conv bias | Zero encoder output | Conv layer not loaded | Check --analyze-source |
| Transposed weights | Garbage output | Row-major vs column-major confusion | Run apr diff vs reference |
| Truncated tensors | Partial outputs | Size mismatch during copy | Verify header vs file size |
11. Master Falsification QA Checklist (100 Points)
This checklist unifies structural, physical, operational, and conversion requirements into a single 100-point quality gate. Every point must be testable and falsifiable.
A. Format & Structural Integrity (25 Points)
| # | Claim | Test Command | Falsification (How to Fail) |
|---|---|---|---|
| 1 | Magic bytes valid | head -c4 m.apr \| grep APR2 | Edit file to start with "APR1" or random bytes |
| 2 | Header size fixed | apr inspect m.apr --header | Insert 1 byte before data offset |
| 3 | Version supported | Load v2.0 file | Load v3.0 file (should fail E003) |
| 4 | Checksum valid | apr validate m.apr --checksum | Flip 1 bit in payload (should fail E004) |
| 5 | JSON Metadata | apr inspect m.apr --json | Corrupt JSON syntax in editor |
| 6 | Tensor Alignment | apr lint m.apr checks 64B | Create file with 1-byte alignment (should warn) |
| 7 | Index Sorted | Validate index sort order | Swap two entries in binary index |
| 8 | Compression | apr info shows lz4 | Compress with unsupported algo (should fail) |
| 9 | Sharding Manifest | Load sharded model | Delete one shard file (should fail E007) |
| 10 | Endianness | Read on Big Endian system | (Simulate BE) Read LE floats incorrectly |
| 11 | Flags Parsed | Check specific flag bits | Set undefined flag bit (should warn/ignore) |
| 12 | Footer Magic | Check 2RPA at EOF | Truncate last 16 bytes (should fail) |
| 13 | File Size | Header size == ls -l | Append garbage to EOF (should warn) |
| 14 | Tensor Offsets | Read last tensor | Set offset beyond EOF (should fail E002) |
| 15 | Empty Model | Load model with 0 tensors | Create valid header, 0 tensors (should pass) |
| 16 | Huge Header | Metadata > 100MB | Create 200MB JSON header (should stream/fail gracefully) |
| 17 | UTF-8 Names | Tensor names are UTF-8 | Insert invalid UTF-8 in name (should fail) |
| 18 | Duplicate Names | Index has unique names | Duplicate "tensor.a" in index (should fail) |
| 19 | Dimension Limit | Support 8 dims | Create 9-dim tensor (should fail) |
| 20 | Zero Dims | Support scalar (0-dim) | Create 0-dim tensor (should pass) |
| 21 | Datatypes | Support all DType enums | Use invalid enum id 255 (should fail) |
| 22 | Padding Bytes | Padding is zeroed | Fill padding with 0xFF (should warn in lint) |
| 23 | Signature | Verify Ed25519 (if signed) | Modify 1 byte of signature (should fail E006) |
| 24 | Encryption | Decrypt AES-256-GCM | Provide wrong key (should fail E005) |
| 25 | WASM Load | Load in wasm32 env | Run in browser (must work) |
B. Tensor Physics & Statistics (25 Points)
| # | Claim | Test Command | Falsification (How to Fail) |
|---|---|---|---|
| 26 | No NaNs | apr validate --nan-check | Manually inject 0x7FC00000 (NaN) into f32 tensor |
| 27 | No Infs | apr validate --nan-check | Inject 0x7F800000 (+Inf) |
| 28 | LayerNorm Mean | apr tensors --stats in [0.5, 3] | Set LN weights to 11.0 (should fail/warn) |
| 29 | LayerNorm Bias | apr tensors --stats in [-0.5, 0.5] | Set LN bias to 5.0 (should fail/warn) |
| 30 | Embedding Std | apr tensors --stats < 0.2 | Set embedding std to 1.0 (should warn) |
| 31 | Zero Tensors | apr validate --zero-check | Set entire tensor to 0.0 (should warn) |
| 32 | Shape Match | apr validate --shapes | Resize tensor [384]->[383] (should fail) |
| 33 | Vocab Match | Metadata n_vocab == tensor dim | Change metadata n_vocab to mismatch (should fail) |
| 34 | Quantization Range | q8_0 values in [-127, 127] | Manually set byte -128 (if using symm quant) |
| 35 | Attn/Linear Mean | Mean approx 0.0 | Set Linear weight mean to 1.0 (should warn) |
| 36 | Softmax Valid | (If traceable) Output sums to 1.0 | (Hard to fuzz statically, use trace) |
| 37 | Mel Filters | Values >= 0.0 | Set negative filter bank value (should warn) |
| 38 | Pos Embeddings | Correct shape for ctx len | Truncate pos embedding (should fail shape) |
| 39 | Token IDs | (Trace) Output tokens < vocab | (Trace) Force output token > vocab_max |
| 40 | Audio Range | (Trace) Input in [-1, 1] | Feed audio with amp 10.0 (trace should warn) |
| 41 | FP16 Range | Values within FP16 limits | value > 65504 in FP16 tensor (should become Inf) |
| 42 | Sparsity | (If sparse) Check non-zero % | Claim sparse but 100% dense (lint warning) |
| 43 | Dead Neurons | (Trace) Activations never > 0 | (Trace) Detect 0-activation neuron across 100 inputs |
| 44 | Exploding Grads | (Trace) Values > 1e6 | (Trace) Detect activation spike |
| 45 | Repeat Tokens | (Trace) Repetition > 5x | (Trace) Feed silence, check for hallucination |
| 46 | Silence Input | (Trace) Output is empty/silence | Feed silence, check non-empty output |
| 47 | White Noise | (Trace) Output is garbage | Feed noise, check for confident output (bad) |
| 48 | Mel Shape | Filterbank matches audio/mels | Mismatch n_mels 80 vs 128 (should fail) |
| 49 | Text Context | Pos embed covers text ctx | Input text > max context (should truncate/fail) |
| 50 | L2 Distance | apr diff vs ref < 1.0 | Compare against random tensor (should fail L2) |
C. Tooling & Operations (25 Points)
| # | Claim | Test Command | Falsification (How to Fail) |
|---|---|---|---|
| 51 | Inspect Speed | inspect < 100ms | (Perf) Load 100GB model (should be fast) |
| 52 | Lint Defaults | apr lint runs default checks | Create file with no license (must warn) |
| 53 | Drama Mode | apr debug --drama | Run on CI (no tty) - should output text |
| 54 | TUI Graph | apr tui renders graph | Create cyclic graph (should handle/error) |
| 55 | TUI Stats | apr tui stats match CLI | (Manual) Compare TUI number vs CLI number |
| 56 | Diff Identity | apr diff a.apr a.apr | Diff same file (must show 100% match) |
| 57 | Diff Detection | apr diff a.apr b.apr | Diff modified file (must show mismatch) |
| 58 | Merge Average | apr merge averages weights | Merge [1.0] and [3.0] -> expect [2.0] |
| 59 | Merge TIES | apr merge --strategy ties | (Complex) Verify TIES masking logic |
| 60 | Export ONNX | apr export --format onnx | Validate output with onnx.checker |
| 61 | Export GGUF | apr export --format gguf | Load output in llama.cpp |
| 62 | Convert Quant | apr convert --quantize int8 | Check output size < 25% of input |
| 63 | Convert Prune | apr convert --prune 0.5 | Check non-zero count is 50% |
| 64 | Trace Output | apr trace produces JSON | Corrupt input audio (should err/warn) |
| 65 | Explain Error | apr explain E001 | Ask for E999 (should say unknown) |
| 66 | Explain Tensor | apr explain --tensor | Ask for random name (should fuzzy match) |
| 67 | Analyze Source | convert --analyze-source | Run on corrupt safetensors (must fail) |
| 68 | Inline Valid | convert fails on bad stat | Force bad mean in source, run convert (must abort) |
| 69 | Force Override | convert --force | Same as 68, but use --force (must pass) |
| 70 | Cache Dir | Uses APR_CACHE | Set APR_CACHE=/tmp/x (check files there) |
| 71 | Config Load | Uses config.toml | Set output_format=json in config (check output) |
| 72 | Canary Check | apr canary check | Modify weights to cause regression (should fail canary) |
| 73 | JSON Output | apr inspect --json | Pipe to jq (must parse) |
| 74 | Trace Payload | apr trace --payload | Corrupt tensor, check for anomaly in trace output |
| 75 | Trace Diff | apr trace --diff | Diff identical models (should show 0 drift) |
D. Conversion & Interoperability (25 Points)
| # | Claim | Test Command | Falsification (How to Fail) |
|---|---|---|---|
| 76 | SafeTensors | Import .safetensors | Import renamed .txt file (should fail) |
| 77 | PyTorch | Import .pt (pickle) | Import malicious pickle (should fail/block) |
| 78 | GGUF Import | Import .gguf | Import GGUF with unknown arch (should fail) |
| 79 | Roundtrip | APR->ONNX->APR | Compare tensor values (drift < 1e-5) |
| 80 | HF Mapping | Maps model.layers.0 correctly | Rename layer in source (should fail map) |
| 81 | Q-DeepCopy | Preserves quantization | Convert q8->apr (should stay q8 if supported) |
| 82 | F32->BF16 | convert --precision bf16 | Check dtype is BF16 |
| 83 | BF16->F32 | convert --precision f32 | Check dtype is F32 |
| 84 | Vocab Import | Imports full vocab | Truncate vocab in source (check count) |
| 85 | Special Tokens | Preserves BOS/EOS/UNK | Check metadata for token IDs |
| 86 | Metadata Copy | Copies model card/license | Remove metadata from source (check warnings) |
| 87 | Tensor Name Norm | Normalizes to encoder.x | Check for "model.encoder.x" (bad) |
| 88 | Permutation | Transposes weights if needed | Disable transpose (check output garbage) |
| 89 | Scale Factors | Applies rescaling (e.g. div 2) | Disable scaling (check mean drift) |
| 90 | Sharded Import | Imports model-0001... | Missing shard 2 (should fail) |
| 91 | Remote Import | apr import hf://... | Network down (should fail gracefully) |
| 92 | Cache Hit | Second import is fast | Clear cache, time it; run again, time it |
| 93 | Checksum Verify | Verify source SHA256 | Modify source file (should fail checksum) |
| 94 | License Warning | Warns on non-commercial | Import CC-BY-NC model (check warning) |
| 95 | Arch Detect | Auto-detects Whisper/LLaMA | Import unknown arch (should ask user) |
| 96 | Output Path | Honors --output | Check file exists at path |
| 97 | Overwrite | Fails if exists (no -f) | Create file, run export (should fail) |
| 98 | Disk Full | Handle ENOSPC | Simulate small disk (should fail clean) |
| 99 | Memory Limit | Respect APR_RAM_LIMIT | Set low limit, load big model (should error/mmap) |
| 100 | Golden Trace | Passes canonical trace | Run against golden_traces/ (must pass) |
12. Automated Validation Script
The apr-qa tool runs this 100-point checklist automatically.
# Run the full suite
apr-qa verify model.apr --score
# Run specific category
apr-qa verify model.apr --category physics
# CI/CD usage (fail if score < 95)
apr-qa verify model.apr --min-score 95
13. Import/Convert Pipeline
The complete pipeline for downloading, converting, validating, and optimizing models.
13.1 Pipeline Overview
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Source │───▶│ Import │───▶│ Validate │───▶│ Output │
│ (HF/Local) │ │ (Converter) │ │ (100-Point) │ │ (.apr) │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
│ │ │ │
▼ ▼ ▼ ▼
hf://openai/ SafeTensors→APR Inline checks Quantized/
whisper-tiny Name mapping Tensor stats Compressed
13.2 CLI Interface
# Full pipeline: download → convert → validate
apr import hf://openai/whisper-tiny -o whisper.apr
# With quantization
apr import hf://openai/whisper-tiny -o whisper-int8.apr --quantize int8
# Local file conversion
apr import model.safetensors -o model.apr
# Validate after import (automatic, but can run standalone)
apr validate whisper.apr --quality --min-score 95
# Post-import optimization
apr convert whisper.apr --quantize int8 --compress lz4 -o whisper-optimized.apr
13.3 SDK Interface
use aprender::format::{AprConverter, Compression, ImportOptions, Quantization, ValidationConfig};
// Full pipeline with builder pattern
let apr_bytes = AprConverter::new()
.source("hf://openai/whisper-tiny")
.architecture("whisper")
.validate(ValidationConfig::strict()) // Inline validation
.quantize(Quantization::Int8)
.compress(Compression::Lz4)
.convert()?;
// Save to file
std::fs::write("whisper.apr", apr_bytes)?;
// Or use the high-level API
apr_import("hf://openai/whisper-tiny", "whisper.apr", ImportOptions::default())?;
13.4 Source Types
| Source | Format | Example |
|---|---|---|
| HuggingFace Hub | hf://org/repo | hf://openai/whisper-tiny |
| HuggingFace File | hf://org/repo/file | hf://openai/whisper-tiny/model.safetensors |
| Local SafeTensors | Path | ./model.safetensors |
| Local PyTorch | Path | ./model.pt |
| Local GGUF | Path | ./model.gguf |
| URL | https:// | https://example.com/model.safetensors |
13.5 Tensor Name Mapping
During import, tensor names are normalized from source format to APR canonical form:
/// Tensor name mapper trait
pub trait TensorNameMapper {
/// Map source tensor name to APR name
fn map_name(&self, source_name: &str) -> Option<String>;
/// Get expected tensor statistics for validation
fn expected_stats(&self, apr_name: &str) -> Option<TensorExpectation>;
}
/// Built-in mappers
pub enum Architecture {
    Auto,     // Auto-detect from tensor names/metadata (used by ImportOptions::default)
    Whisper,  // HuggingFace Whisper → APR Whisper
    Llama,    // HuggingFace LLaMA → APR LLaMA
    Bert,     // HuggingFace BERT → APR BERT
    Custom(Box<dyn TensorNameMapper>),
}
Whisper Mapping Example:
HuggingFace → APR
model.encoder.conv1.weight → encoder.conv1.weight
model.decoder.layer_norm.weight → decoder.layer_norm.weight
model.decoder.layers.0.self_attn... → decoder.layers.0.self_attn...
13.6 Inline Validation
Critical: Validation runs DURING conversion, not after. If a tensor fails validation, conversion aborts immediately.
/// Validation that runs inline during conversion
pub struct InlineValidator {
config: ValidationConfig,
report: ValidationReport,
}
impl InlineValidator {
/// Called for each tensor during conversion
pub fn validate_tensor(&mut self, name: &str, data: &[f32]) -> Result<(), ValidationError> {
let stats = TensorStats::compute(name, data);
// Check for NaN/Inf
if stats.nan_count > 0 {
return Err(ValidationError::NanDetected { name: name.to_string(), count: stats.nan_count });
}
// Check LayerNorm weights (mean should be ~1.0)
if name.contains("layer_norm") && name.ends_with(".weight") {
if stats.mean < 0.5 || stats.mean > 3.0 {
return Err(ValidationError::LayerNormMean {
name: name.to_string(),
mean: stats.mean,
expected: (0.5, 3.0),
});
}
}
Ok(())
}
}
13.7 Import Options
/// Options for the import pipeline
#[derive(Debug, Clone)]
pub struct ImportOptions {
/// Target architecture for name mapping
pub architecture: Architecture,
/// Validation configuration
pub validation: ValidationConfig,
/// Quantization (None = keep original precision)
pub quantize: Option<Quantization>,
/// Compression algorithm
pub compress: Option<Compression>,
/// Force import even if validation fails
pub force: bool,
/// Cache downloaded files
pub cache: bool,
/// HuggingFace token (from env HF_TOKEN if None)
pub hf_token: Option<String>,
}
impl Default for ImportOptions {
fn default() -> Self {
Self {
architecture: Architecture::Auto, // Auto-detect
validation: ValidationConfig::strict(),
quantize: None,
compress: None,
force: false,
cache: true,
hf_token: None,
}
}
}
13.8 Error Handling
Import errors are specific and actionable:
#[derive(Debug, thiserror::Error)]
pub enum ImportError {
#[error("Download failed: {source} - {reason}")]
DownloadFailed { source: String, reason: String },
#[error("Unsupported format: {extension}")]
UnsupportedFormat { extension: String },
#[error("Tensor validation failed: {name} - {reason}")]
ValidationFailed { name: String, reason: String },
#[error("Name mapping failed: unknown tensor '{source_name}'")]
UnknownTensor { source_name: String },
#[error("Architecture mismatch: expected {expected}, found {found}")]
ArchitectureMismatch { expected: String, found: String },
#[error("Missing required tensor: {name}")]
MissingTensor { name: String },
}
13.9 Caching
Downloaded models are cached to avoid re-downloading:
~/.cache/apr/
├── hf/
│ └── openai/
│ └── whisper-tiny/
│ ├── model.safetensors
│ └── config.json
└── checksum.json
# Clear cache
apr cache clear
# Show cache usage
apr cache info
# Pre-download without converting
apr download hf://openai/whisper-tiny
13.10 Testing Requirements
Every import path must have:
- Unit Test: Test name mapping and validation logic
- Integration Test: Download real model, convert, validate
- Golden Test: Compare output against known-good .apr file
- Regression Test: Ensure tensor statistics match expected values
#[test]
fn test_whisper_tiny_import() {
let result = apr_import(
"hf://openai/whisper-tiny",
"/tmp/test.apr",
ImportOptions::default(),
);
assert!(result.is_ok());
// Validate the output
let validator = AprValidator::new();
let report = validator.validate(&std::fs::read("/tmp/test.apr").unwrap());
assert!(report.passed(95), "Score: {}/100", report.total_score);
// Check specific tensor that was previously buggy
let reader = AprReader::new(&std::fs::read("/tmp/test.apr").unwrap()).unwrap();
let ln_weight = reader.load_tensor("decoder.layer_norm.weight").unwrap();
let stats = TensorStats::compute("decoder.layer_norm.weight", &ln_weight);
assert!(stats.mean >= 0.5 && stats.mean <= 3.0,
"decoder.layer_norm.weight mean={} should be in [0.5, 3.0]", stats.mean);
}
14. Implementation Roadmap
Phase 1: Alignment (v2.0)
- 64-byte tensor alignment
- Binary tensor index
- Backward-compatible reader
Phase 2: Compression (v2.1)
- LZ4 block compression
- Per-tensor compression flag
- Streaming decompression
Phase 3: Sharding (v2.2)
- Manifest file format
- Multi-file loader
- Tensor-level demand loading
15. References
- Sculley, D., et al. (2015). "Hidden Technical Debt in Machine Learning Systems." NeurIPS 2015
- Amershi, S., et al. (2019). "Software Engineering for Machine Learning." ICSE 2019
- Vartak, M., et al. (2016). "ModelDB: A System for ML Model Management." SIGMOD 2016
- Baylor, D., et al. (2017). "TFX: A TensorFlow-Based Production-Scale ML Platform." KDD 2017
- Zaharia, M., et al. (2018). "Accelerating the ML Lifecycle with MLflow." IEEE Data Eng. Bull.
Code References:
- APR v1: `src/serialization/apr.rs`
- GGUF: `src/format/gguf.rs`
- Bundle system: `src/bundle/`
- SafeTensors: `src/serialization/safetensors.rs`
16. Appendices
A. Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | General error |
| 2 | Invalid arguments |
| 3 | File not found |
| 4 | Format error |
| 5 | Validation failed |
B. Environment Variables
| Variable | Description | Default |
|---|---|---|
| APR_CONFIG | Config file path | ~/.config/apr/config.toml |
| APR_CACHE | Cache directory | ~/.cache/apr |
| APR_LOG_LEVEL | Log level | info |
| APR_COLOR | Enable colors | auto |
Document generated following Toyota Way principles and PMAT quality standards.