Case Study: Homomorphic Encryption for Shell Models

This case study demonstrates privacy-preserving shell completion using homomorphic encryption (HE). With HE, shell completion models can run on untrusted servers while keeping user data encrypted.

Overview

Homomorphic encryption enables computation on encrypted data without decryption. For shell completion:

Train locally: Model trained on your private shell history
Encrypt model: Convert to HE format with your keys
Deploy anywhere: Run on cloud/untrusted servers
Privacy preserved: Server never sees plaintext commands

Quick Start

1. Generate HE Keys

# Generate key pair (one-time setup)
aprender-shell keygen --output ~/.config/aprender/

# Output:
# Generating HE key pair (128-bit security)...
# Public key:  ~/.config/aprender/public.key
# Secret key:  ~/.config/aprender/secret.key
# Relin keys:  ~/.config/aprender/relin.key

2. Train with Homomorphic Encryption

# Train model with HE flag
aprender-shell train \
  --homomorphic \
  --public-key ~/.config/aprender/public.key \
  --output ~/.aprender-shell-he.model

# Output:
# Training with homomorphic encryption (Tier 4)...
# Loading public key: ~/.config/aprender/public.key
# History file: ~/.zsh_history
# Commands loaded: 12543
# Training 3-gram model... done!
# Encrypting with HE public key... done!
# HE-encrypted model saved to: ~/.aprender-shell-he.model

3. Get Encrypted Suggestions

# Use --homomorphic flag for encrypted inference
aprender-shell suggest --homomorphic "git " -m ~/.aprender-shell-he.model

# Output:
# git status    0.2341
# git commit    0.1892
# git push      0.1567

4. Inspect Model Encryption

aprender-shell inspect -m ~/.aprender-shell-he.model

# Output:
# MODEL INFORMATION
# ═══════════════════════════════════════════
#   Encryption:   Homomorphic (BFV+CKKS hybrid)
#                 (Computation on encrypted data enabled)

Security Levels

Three security levels are available:

# 128-bit (default, recommended for most uses)
aprender-shell keygen --output ./keys --security 128

# 192-bit (higher security, larger keys)
aprender-shell keygen --output ./keys --security 192

# 256-bit (maximum security, largest keys)
aprender-shell keygen --output ./keys --security 256

Level	Key Size	Security	Use Case
128-bit	~50KB	Standard	General use
192-bit	~75KB	High	Sensitive environments
256-bit	~100KB	Maximum	Regulated industries

Encryption Tiers Comparison

aprender-shell supports four protection levels:

Tier	Method	At Rest	In Transit	In Use
1	Plain	No	No	No
2	Compressed	No	No	No
3	AES-256-GCM	Yes	Yes	No
4	Homomorphic	Yes	Yes	Yes

Tier 4 (Homomorphic) is unique: data remains encrypted even during computation.

Performance

Phase 2 implementation achieves sub-microsecond latency:

Operation	Latency	Target
`suggest`	~1 µs	<100ms
`to_homomorphic`	~10 µs	<1s
Cold start	~100 µs	<1s

The implementation is 100,000x faster than the 100ms quality gate.

API Usage

Rust API

use aprender_shell::{MarkovModel, EncryptedMarkovModel};
use aprender::format::homomorphic::{HeContext, SecurityLevel};

// Generate keys
let ctx = HeContext::new(SecurityLevel::Bit128)?;
let (public_key, secret_key) = ctx.generate_keys()?;

// Train model
let mut model = MarkovModel::new(3);
model.train(&commands);

// Convert to HE
let encrypted: EncryptedMarkovModel = model.to_homomorphic(&public_key)?;

// Get suggestions (privacy-preserving)
let suggestions = encrypted.suggest("git ", 5);

Save/Load HE Models

// Save with HE header (v3 format)
model.save_homomorphic(&path, &public_key)?;

// Inspect shows HE encryption
let info = aprender::format::inspect(&path)?;
assert!(info.encryption_mode.is_homomorphic());

File Format

HE models use the .apr v3 format:

┌─────────────────────────────────────────┐
│ Header (32 bytes)                       │
│ - Magic: "APRN"                         │
│ - Version: (3, scheme)                  │
│ - Flags: HOMOMORPHIC (0x80)            │
├─────────────────────────────────────────┤
│ Metadata (MessagePack)                  │
│ - name: "aprender-shell"                │
│ - encryption_mode: "homomorphic_hybrid" │
├─────────────────────────────────────────┤
│ Payload (encrypted n-gram data)         │
├─────────────────────────────────────────┤
│ Checksum (CRC32)                        │
└─────────────────────────────────────────┘

EXTREME TDD - The Aprender Guide to Zero-Defect Machine Learning