Case Study: Homomorphic Encryption for Shell Models
This case study demonstrates privacy-preserving shell completion using homomorphic encryption (HE). With HE, shell completion models can run on untrusted servers while keeping user data encrypted.
Overview
Homomorphic encryption enables computation on encrypted data without decryption. For shell completion:
- Train locally: Model trained on your private shell history
- Encrypt model: Convert to HE format with your keys
- Deploy anywhere: Run on cloud/untrusted servers
- Privacy preserved: Server never sees plaintext commands
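To make "computation on encrypted data" concrete, the sketch below uses textbook RSA with tiny, insecure parameters. Textbook RSA happens to be multiplicatively homomorphic: multiplying two ciphertexts gives a ciphertext of the product. This is a toy under stated assumptions, not the scheme aprender-shell uses (the inspect output later in this page reports a BFV+CKKS hybrid), and the names `encrypt`, `decrypt`, and `multiply_encrypted` are invented for illustration.

```rust
// Toy illustration only: textbook RSA with tiny parameters.
// NOT secure, and NOT the scheme aprender-shell uses.

/// Modular exponentiation by repeated squaring.
fn mod_pow(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut result = 1;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    result
}

// Classic textbook parameters: p = 61, q = 53.
const N: u64 = 3233; // p * q
const E: u64 = 17; // public exponent
const D: u64 = 2753; // private exponent, E * D = 1 (mod 3120)

/// Encrypt with the public key (anyone can do this).
fn encrypt(m: u64) -> u64 {
    mod_pow(m, E, N)
}

/// Decrypt with the secret key (only the key owner can do this).
fn decrypt(c: u64) -> u64 {
    mod_pow(c, D, N)
}

/// What an untrusted server could do: combine ciphertexts without any key.
fn multiply_encrypted(c1: u64, c2: u64) -> u64 {
    c1 * c2 % N
}

fn main() {
    let (a, b) = (7, 6);
    let product_ct = multiply_encrypted(encrypt(a), encrypt(b));
    // The server never saw 7, 6, or 42 in the clear.
    println!("decrypted product: {}", decrypt(product_ct)); // prints 42
}
```

Schemes such as BFV and CKKS support both addition and multiplication on ciphertexts, which is what makes scoring-style computations possible; the toy above only shows the basic idea of operating on data that the server cannot read.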
Quick Start
1. Generate HE Keys
# Generate key pair (one-time setup)
aprender-shell keygen --output ~/.config/aprender/
# Output:
# Generating HE key pair (128-bit security)...
# Public key: ~/.config/aprender/public.key
# Secret key: ~/.config/aprender/secret.key
# Relin keys: ~/.config/aprender/relin.key
2. Train with Homomorphic Encryption
# Train model with HE flag
aprender-shell train \
--homomorphic \
--public-key ~/.config/aprender/public.key \
--output ~/.aprender-shell-he.model
# Output:
# Training with homomorphic encryption (Tier 4)...
# Loading public key: ~/.config/aprender/public.key
# History file: ~/.zsh_history
# Commands loaded: 12543
# Training 3-gram model... done!
# Encrypting with HE public key... done!
# HE-encrypted model saved to: ~/.aprender-shell-he.model
3. Get Encrypted Suggestions
# Use --homomorphic flag for encrypted inference
aprender-shell suggest --homomorphic "git " -m ~/.aprender-shell-he.model
# Output:
# git status 0.2341
# git commit 0.1892
# git push 0.1567
4. Inspect Model Encryption
aprender-shell inspect -m ~/.aprender-shell-he.model
# Output:
# MODEL INFORMATION
# ═══════════════════════════════════════════
# Encryption: Homomorphic (BFV+CKKS hybrid)
# (Computation on encrypted data enabled)
Security Levels
Three security levels are available:
# 128-bit (default, recommended for most uses)
aprender-shell keygen --output ./keys --security 128
# 192-bit (higher security, larger keys)
aprender-shell keygen --output ./keys --security 192
# 256-bit (maximum security, largest keys)
aprender-shell keygen --output ./keys --security 256
| Level | Key Size | Security | Use Case |
|---|---|---|---|
| 128-bit | ~50KB | Standard | General use |
| 192-bit | ~75KB | High | Sensitive environments |
| 256-bit | ~100KB | Maximum | Regulated industries |
Encryption Tiers Comparison
aprender-shell supports four protection levels:
| Tier | Method | At Rest | In Transit | In Use |
|---|---|---|---|---|
| 1 | Plain | No | No | No |
| 2 | Compressed | No | No | No |
| 3 | AES-256-GCM | Yes | Yes | No |
| 4 | Homomorphic | Yes | Yes | Yes |
Tier 4 (Homomorphic) is unique: data remains encrypted even during computation.
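The table can also be read as a small decision rule: pick the lowest tier that provides the guarantees you need. The sketch below encodes the table as a hypothetical `Tier` enum; the type, its methods, and `minimal_tier` are illustrative names and are not taken from the aprender API.

```rust
/// The four protection tiers from the table, as a hypothetical enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    Plain = 1,
    Compressed = 2,
    Aes256Gcm = 3,
    Homomorphic = 4,
}

impl Tier {
    /// Tiers in ascending order of protection.
    const ALL: [Tier; 4] = [
        Tier::Plain,
        Tier::Compressed,
        Tier::Aes256Gcm,
        Tier::Homomorphic,
    ];

    /// Encrypted while stored on disk.
    fn at_rest(self) -> bool {
        matches!(self, Tier::Aes256Gcm | Tier::Homomorphic)
    }

    /// Encrypted while being copied between machines.
    fn in_transit(self) -> bool {
        matches!(self, Tier::Aes256Gcm | Tier::Homomorphic)
    }

    /// Encrypted while being computed on.
    fn in_use(self) -> bool {
        self == Tier::Homomorphic
    }
}

/// Lowest tier that satisfies every requested guarantee.
fn minimal_tier(need_at_rest: bool, need_in_transit: bool, need_in_use: bool) -> Tier {
    Tier::ALL
        .iter()
        .copied()
        .find(|t| {
            (!need_at_rest || t.at_rest())
                && (!need_in_transit || t.in_transit())
                && (!need_in_use || t.in_use())
        })
        .expect("Tier 4 satisfies every combination")
}

fn main() {
    // Running on an untrusted server requires protection in use.
    let tier = minimal_tier(true, true, true);
    println!("{:?} (tier {})", tier, tier as u8);
}
```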
Performance
The Phase 2 implementation achieves microsecond-scale latency, well below every target:
| Operation | Latency | Target |
|---|---|---|
| suggest | ~1 µs | <100ms |
| to_homomorphic | ~10 µs | <1s |
| Cold start | ~100 µs | <1s |
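At ~1 µs, suggest is roughly 100,000x faster than the 100 ms quality gate (100 ms / 1 µs = 100,000). These figures describe the Phase 2 implementation, which predates the SEAL integration and ciphertext operations listed under Phase 3.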
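The ratio against the quality gate is simple arithmetic, and a check of this kind can be scripted. The following is a minimal sketch using only the Rust standard library; `speedup` and `average_latency` are hypothetical helpers, and the timed closure is a stand-in workload, not a call into aprender-shell.

```rust
use std::time::{Duration, Instant};

/// How many times faster `measured` is than `gate` (integer ratio, rounded down).
fn speedup(gate: Duration, measured: Duration) -> u128 {
    // max(1) avoids dividing by zero for a zero-length measurement.
    gate.as_nanos() / measured.as_nanos().max(1)
}

/// Average wall-clock time of `f` over `iters` runs.
fn average_latency<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    assert!(iters > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed() / iters
}

fn main() {
    let gate = Duration::from_millis(100);

    // Ratio for the ~1 µs figure in the table.
    println!("speedup: {}x", speedup(gate, Duration::from_micros(1)));

    // Stand-in workload; replace with the operation under test.
    let avg = average_latency(1_000, || {
        std::hint::black_box((0..100u64).sum::<u64>());
    });
    println!("average: {:?}, within gate: {}", avg, avg < gate);
}
```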
API Usage
Rust API
use aprender_shell::{MarkovModel, EncryptedMarkovModel};
use aprender::format::homomorphic::{HeContext, SecurityLevel};
// Generate keys
let ctx = HeContext::new(SecurityLevel::Bit128)?;
let (public_key, secret_key) = ctx.generate_keys()?;
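// Train a 3-gram model on plaintext history
// (`commands` is assumed to be loaded already, e.g. from your shell history)
let mut model = MarkovModel::new(3);
model.train(&commands);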
// Convert to HE
let encrypted: EncryptedMarkovModel = model.to_homomorphic(&public_key)?;
// Get suggestions (privacy-preserving)
let suggestions = encrypted.suggest("git ", 5);
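For readers unfamiliar with n-gram completion, the following is a minimal plaintext sketch of what training and suggesting can look like. It is an assumption-laden illustration: `NgramModel` and its methods are hypothetical and do not reproduce the internals of `MarkovModel` or `EncryptedMarkovModel`, and no encryption is involved.

```rust
use std::collections::HashMap;

/// Plaintext n-gram next-token model (hypothetical; not aprender's MarkovModel).
struct NgramModel {
    n: usize,
    /// context (up to n-1 previous tokens, space-joined) -> next token -> count
    counts: HashMap<String, HashMap<String, u32>>,
}

impl NgramModel {
    fn new(n: usize) -> Self {
        assert!(n >= 2, "need at least a bigram model");
        Self { n, counts: HashMap::new() }
    }

    /// Context key: the last n-1 tokens, or fewer near the start of a command.
    fn context(&self, tokens: &[&str]) -> String {
        let start = tokens.len().saturating_sub(self.n - 1);
        tokens[start..].join(" ")
    }

    /// Count which token follows each context in the given commands.
    fn train(&mut self, commands: &[&str]) {
        for cmd in commands {
            let tokens: Vec<&str> = cmd.split_whitespace().collect();
            for i in 1..tokens.len() {
                let ctx = self.context(&tokens[..i]);
                *self
                    .counts
                    .entry(ctx)
                    .or_default()
                    .entry(tokens[i].to_string())
                    .or_insert(0) += 1;
            }
        }
    }

    /// Top-k completions of `prefix`, each with its relative frequency.
    fn suggest(&self, prefix: &str, k: usize) -> Vec<(String, f64)> {
        let tokens: Vec<&str> = prefix.split_whitespace().collect();
        let next = match self.counts.get(&self.context(&tokens)) {
            Some(next) => next,
            None => return Vec::new(),
        };
        let total: u32 = next.values().sum();
        let mut ranked: Vec<(&String, &u32)> = next.iter().collect();
        // Highest count first; ties broken alphabetically for stable output.
        ranked.sort_by(|a, b| b.1.cmp(a.1).then(a.0.cmp(b.0)));
        ranked
            .into_iter()
            .take(k)
            .map(|(tok, c)| {
                let completion = format!("{} {}", tokens.join(" "), tok);
                (completion, f64::from(*c) / f64::from(total))
            })
            .collect()
    }
}

fn main() {
    let mut model = NgramModel::new(3);
    model.train(&["git status", "git status", "git commit -m wip", "git push"]);
    for (cmd, score) in model.suggest("git ", 3) {
        println!("{cmd} {score:.4}");
    }
}
```

In an HE setting, the counts or weights would be held as ciphertexts and the ranking would have to be computed without decrypting them, which is the work listed under Phase 3.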
Save/Load HE Models
// Save with HE header (v3 format)
model.save_homomorphic(&path, &public_key)?;
// Inspect shows HE encryption
let info = aprender::format::inspect(&path)?;
assert!(info.encryption_mode.is_homomorphic());
File Format
HE models use the .apr v3 format:
┌─────────────────────────────────────────┐
│ Header (32 bytes) │
│ - Magic: "APRN" │
│ - Version: (3, scheme) │
│ - Flags: HOMOMORPHIC (0x80) │
├─────────────────────────────────────────┤
│ Metadata (MessagePack) │
│ - name: "aprender-shell" │
│ - encryption_mode: "homomorphic_hybrid" │
├─────────────────────────────────────────┤
│ Payload (encrypted n-gram data) │
├─────────────────────────────────────────┤
│ Checksum (CRC32) │
└─────────────────────────────────────────┘
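A reader for this layout might start by validating the header and verifying a CRC32, as sketched below. Only the magic bytes, the 32-byte header length, the version value 3, and the 0x80 flag come from the diagram. The byte offsets of the version, scheme, and flags fields are assumptions made for illustration, as are the names `parse_header` and `HeaderInfo`. The CRC shown is the common IEEE 802.3 variant; the diagram does not say which variant or which byte range the format uses.

```rust
const MAGIC: &[u8; 4] = b"APRN";
const HEADER_LEN: usize = 32;
const FLAG_HOMOMORPHIC: u8 = 0x80;

// ASSUMED field offsets, for illustration only; the real layout may differ.
const OFF_VERSION: usize = 4;
const OFF_SCHEME: usize = 5;
const OFF_FLAGS: usize = 6;

#[derive(Debug, PartialEq)]
struct HeaderInfo {
    version: u8,
    scheme: u8,
    homomorphic: bool,
}

/// Validate the magic bytes and pull out the (assumed) header fields.
fn parse_header(bytes: &[u8]) -> Result<HeaderInfo, String> {
    if bytes.len() < HEADER_LEN {
        return Err(format!("need {HEADER_LEN} header bytes, got {}", bytes.len()));
    }
    if !bytes.starts_with(MAGIC) {
        return Err("bad magic, expected \"APRN\"".to_string());
    }
    Ok(HeaderInfo {
        version: bytes[OFF_VERSION],
        scheme: bytes[OFF_SCHEME],
        homomorphic: bytes[OFF_FLAGS] & FLAG_HOMOMORPHIC != 0,
    })
}

/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 != 0 {
                (crc >> 1) ^ 0xEDB8_8320
            } else {
                crc >> 1
            };
        }
    }
    !crc
}

fn main() {
    // Build a header in memory using the assumed offsets.
    let mut header = [0u8; HEADER_LEN];
    header[..4].copy_from_slice(MAGIC);
    header[OFF_VERSION] = 3;
    header[OFF_FLAGS] = FLAG_HOMOMORPHIC;

    println!("{:?}", parse_header(&header));
    println!("crc32 of header: {:#010x}", crc32(&header));
}
```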
Implementation Status
Phase 1: Foundation (Complete)
- Feature flag: format-homomorphic
- Key generation CLI
- Key I/O (public, secret, relin keys)
- v3 header with EncryptionMode enum
Phase 2: N-gram Support (Complete)
- to_homomorphic() conversion
- suggest() on encrypted model
- CLI: train --homomorphic, suggest --homomorphic
- <100ms latency (achieved: ~1µs)
Phase 3: Full ML Pipeline (Future)
- Actual SEAL library integration
- Ciphertext operations on n-gram weights
- Linear model HE support
- Side-channel hardening