Databricks Specialization on Coursera
Courses 1, 3 & 4 of the Databricks Specialization on Coursera
Platform: Databricks Free Edition | Comparison Layer: Sovereign AI Stack (Rust)
Design Philosophy
Course 1 is Databricks-only, building the foundation for the specialization.
Courses 3 & 4 use a dual-layer pedagogy:
- Databricks layer — Hands-on with MLflow, Feature Store, Model Serving, Vector Search, Foundation Models
- Sovereign AI Stack layer — Build the same concepts from scratch in Rust to understand what platforms abstract
Why both?
- Practitioners need to use Databricks effectively
- Engineers need to understand what's underneath
- "Understand by building" creates deeper retention
Course Overview
| Course | Title | Duration |
|---|---|---|
| 1 | Lakehouse Fundamentals | ~15 hours |
| 3 | MLOps Engineering | ~30 hours |
| 4 | GenAI Engineering | ~34 hours |
Sovereign AI Stack
┌──────────────────────────────────────────────────────────────────┐
│ batuta (Orchestration) │
│ Privacy Tiers · CLI · Stack Coordination │
├───────────────────┬──────────────────┬───────────────────────────┤
│ realizar │ entrenar │ pacha │
│ (Inference) │ (Training) │ (Model Registry) │
│ GGUF/SafeTensors │ autograd/LoRA │ Sign/Encrypt/Lineage │
├───────────────────┴──────────────────┴───────────────────────────┤
│ aprender │
│ ML Algorithms: regression, trees, clustering │
├──────────────────────────────────────────────────────────────────┤
│ trueno │
│ SIMD/GPU Compute (AVX2/AVX-512/NEON, wgpu) │
├──────────────────────────────────────────────────────────────────┤
│ trueno-rag │ trueno-db │ alimentar │ pmat │
│ BM25 + Vector │ GPU Analytics │ Arrow/Parquet │ Quality │
└──────────────────┴─────────────────┴───────────────┴─────────────┘
Prerequisites
Databricks
- Create a free account at databricks.com
- No paid features required
Sovereign AI Stack (Rust)
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Key crates
cargo install batuta realizar pmat
Getting Started
Begin with Course 1: Lakehouse Fundamentals for the foundational concepts, then continue to Course 3: MLOps Engineering or jump directly to Course 4: GenAI Engineering if you're already familiar with MLOps concepts.
Course 1: Databricks Lakehouse Fundamentals
Subtitle
Master the Data Lakehouse Architecture with Databricks Free Edition
Description
Build a solid foundation in the Databricks Lakehouse Platform: understand the evolution from data warehouses and data lakes to the lakehouse paradigm, navigate the Databricks workspace and Unity Catalog, write Spark DataFrames and SQL, and work with Delta Lake for reliable, versioned data storage. This is a Databricks-only course — no Sovereign AI Stack component.
Certification Alignment
This course prepares you for the Databricks Accredited Lakehouse Platform Fundamentals accreditation:
- 25 multiple-choice questions
- Tests conceptual understanding of the platform
- Covers architecture, components, governance, and workloads
- Free for Databricks customers and partners
Learning Outcomes
- Explain the data lakehouse architecture and how it combines warehouse reliability with lake flexibility
- Navigate the Databricks workspace, Unity Catalog, and compute resources
- Use Databricks notebooks with magic commands, dbutils, and multiple languages
- Write Spark transformations (select, filter, groupBy, join) and actions
- Create and manage Delta Lake tables with ACID transactions, MERGE, and time travel
- Build parameterized ETL pipelines and schedule them as Databricks Jobs
Duration
~15 hours | 18 videos | 6 labs | 3 quizzes
Weeks
| Week | Topic | Focus |
|---|---|---|
| 1 | Lakehouse Architecture & Platform | Architecture, workspace, catalog, compute |
| 2 | Spark Fundamentals | Notebooks, DataFrames, SQL, transformations |
| 3 | Delta Lake & Workflows | Delta tables, DML, time travel, jobs |
Databricks Free Edition Features Used
- Workspace and Notebooks
- Unity Catalog (basic)
- Apache Spark DataFrames and SQL
- Delta Lake tables
- DBFS (Databricks File System)
- Jobs and Workflows
- Sample datasets (/databricks-datasets/)
Prerequisites
- Basic SQL knowledge
- Familiarity with Python
- Databricks Free Edition account (sign up)
Week 1: Lakehouse Architecture & Platform
Overview
Understand the evolution of data architectures from data warehouses to data lakes to the data lakehouse. Explore the Databricks platform: workspace navigation, Unity Catalog hierarchy, and compute resources.
Topics
| # | Type | Title | Duration |
|---|---|---|---|
| 1.1.1 | Video | Data Architecture Evolution | 8 min |
| 1.1.2 | Video | Lakehouse Architecture | 10 min |
| 1.1.3 | Video | Databricks and the Lakehouse | 8 min |
| 1.2.1 | Video | Databricks Overview | 10 min |
| 1.2.2 | Video | Workspace, Catalog & Data | 12 min |
| 1.3.1 | Video | Compute Resources | 8 min |
| — | Lab | Lakehouse Concepts | 30 min |
| — | Lab | Workspace & Catalog | 30 min |
| — | Quiz | Lakehouse Architecture | 15 min |
Key Concepts
Data Architecture Evolution
| Era | Architecture | Strengths | Weaknesses |
|---|---|---|---|
| 1980s–2000s | Data Warehouse | ACID, schema, BI | Expensive, rigid, no unstructured |
| 2010s | Data Lake | Cheap, flexible, any format | No ACID, quality issues, "data swamp" |
| 2020s+ | Data Lakehouse | Best of both | Requires modern platform |
Lakehouse Properties
A data lakehouse provides:
- ACID transactions on data lake storage (via Delta Lake)
- Schema enforcement and evolution for data quality
- Direct BI access to source data (no ETL to warehouse)
- Unified batch and streaming in one architecture
- Open formats (Parquet + Delta) — no vendor lock-in
- Governance via Unity Catalog
Databricks Platform Architecture
- Control Plane: Managed by Databricks — workspace UI, job scheduling, notebooks
- Data Plane: Runs in your cloud account — compute clusters, data storage, processing
- Unity Catalog: Three-level namespace (catalog > schema > table), organized under a metastore
- Compute Options: All-purpose clusters, job clusters, SQL warehouses, serverless
Certification Topics
Key accreditation concepts from this week:
- A data lakehouse combines warehouse reliability with lake flexibility
- Delta Lake provides ACID transactions on data lake storage
- Unity Catalog provides unified governance across all data assets
- The control plane is managed by Databricks; the data plane runs in your cloud
- Photon accelerates SQL queries without requiring code changes
- Open formats prevent vendor lock-in
Demo Code
- demos/course1/week1/databricks-lakehouse/ — Lakehouse architecture exploration
- demos/course1/week1/databricks-workspace/ — Workspace, Catalog & Compute
Lab: Lakehouse Concepts
Explore the data lakehouse architecture hands-on: compare architectures, inspect platform components, and create your first Delta table.
Objectives
- Identify key properties of a data lakehouse
- Compare lakehouse vs data warehouse vs data lake
- Create a Delta table and inspect the transaction log
- Verify the Databricks environment
Lab Exercise
See labs/course1/week1/lab_lakehouse.py
Key Tasks
- Verify environment — Print Spark version and runtime info
- Architecture comparison — Build a DataFrame comparing warehouse/lake/lakehouse features
- Create Delta table — Write sample data as a Delta table
- Inspect history — Use DESCRIBE HISTORY to view the transaction log
Validation
The lab includes a validate_lab() function that checks:
- Spark environment is running
- Delta table was created with at least 5 rows
- Architecture comparison DataFrame has all 3 architectures
Lab: Workspace & Catalog
Navigate the Databricks workspace, explore the Unity Catalog hierarchy, browse DBFS, and inspect compute resources.
Objectives
- Navigate the Databricks Workspace UI
- Explore Unity Catalog (Metastore > Catalog > Schema > Table)
- Use DBFS to browse files and sample datasets
- Inspect cluster configuration
Lab Exercise
See labs/course1/week1/lab_workspace.py
Key Tasks
- Catalog exploration — List catalogs and schemas using SQL
- Create schema and table — Build a lab_workspace.cities table with data
- File system exploration — Browse /databricks-datasets/ with dbutils
- Compute inspection — Print cluster and runtime configuration
Validation
The lab includes a validate_lab() function that checks:
- Schema lab_workspace was created
- Cities table exists with at least 3 rows
Week 2: Spark Fundamentals
Overview
Master Apache Spark on Databricks: use notebooks with magic commands and utilities, load and preview data, then apply core DataFrame operations — select, filter, groupBy, aggregations, and joins.
Topics
| # | Type | Title | Duration |
|---|---|---|---|
| 2.1.1 | Video | Using Notebooks | 10 min |
| 2.1.2 | Video | Magic Commands & Utilities | 8 min |
| 2.1.3 | Video | Loading & Previewing Data | 10 min |
| 2.2.1 | Video | Spark Core Concepts | 12 min |
| 2.2.2 | Video | Select & Filter Operations | 10 min |
| 2.2.3 | Video | GroupBy, Aggregations & Joins | 12 min |
| — | Lab | Using Notebooks | 30 min |
| — | Lab | Spark Operations | 45 min |
| — | Quiz | Spark Fundamentals | 15 min |
Key Concepts
Databricks Notebooks
- Support Python, SQL, Scala, R in the same notebook
- Magic commands: %python, %sql, %scala, %r, %md, %sh, %fs, %run
- dbutils: File system ops (fs), notebook chaining (notebook), widgets, secrets
- display(): Rich visualizations built into Databricks
Spark Core Architecture
- SparkSession: Entry point (the spark variable, auto-created on Databricks)
- DataFrame: Distributed collection of rows with named columns
- Lazy evaluation: Transformations build a plan; actions trigger execution
- Catalyst Optimizer: Optimizes the query plan regardless of API used
Transformations vs Actions
| Transformations (Lazy) | Actions (Eager) |
|---|---|
| select() | show() |
| filter() / where() | count() |
| groupBy() | collect() |
| join() | first() |
| orderBy() | take(n) |
| withColumn() | write.* |
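The lazy-vs-eager split above can be illustrated with a toy Python sketch (this models the idea only, not the Spark engine; the `LazyFrame` class is hypothetical): transformations only append steps to a plan, and nothing executes until an action runs the plan.

```python
# Toy model of Spark's lazy evaluation: transformations build a plan,
# actions trigger execution. Not Spark -- a pure-Python illustration.
class LazyFrame:
    def __init__(self, rows, plan=None):
        self.rows = rows
        self.plan = plan or []  # list of functions: rows -> rows

    def filter(self, pred):  # transformation: lazy, returns a new plan
        return LazyFrame(self.rows, self.plan + [lambda rs: [r for r in rs if pred(r)]])

    def select(self, *cols):  # transformation: lazy
        return LazyFrame(self.rows, self.plan + [lambda rs: [{c: r[c] for c in cols} for r in rs]])

    def count(self):  # action: executes the accumulated plan
        rs = self.rows
        for step in self.plan:
            rs = step(rs)
        return len(rs)

rows = [{"name": "a", "price": 5}, {"name": "b", "price": 15}]
df = LazyFrame(rows).filter(lambda r: r["price"] > 10).select("name")
# No work has happened yet; count() triggers execution of both steps.
print(df.count())  # 1
```

In real Spark the plan additionally passes through the Catalyst Optimizer before execution, which is why chaining many transformations is cheap.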
Core Operations
- select() — Choose and transform columns
- filter() / where() — Select rows by condition
- groupBy().agg() — Group rows and compute aggregates (sum, avg, count, max, min)
- join() — Combine DataFrames (inner, left, right, full)
- orderBy() — Sort results
Data Formats
| Format | Command | Use Case |
|---|---|---|
| CSV | spark.read.csv() | Simple tabular data |
| JSON | spark.read.json() | Semi-structured data |
| Parquet | spark.read.parquet() | Columnar analytics |
| Delta | spark.read.format("delta") | Lakehouse tables |
Demo Code
- demos/course1/week2/databricks-notebooks/ — Notebooks, magic commands, data loading
- demos/course1/week2/databricks-spark/ — Spark operations (select, filter, groupBy, join)
Lab: Using Notebooks
Practice using Databricks notebooks: magic commands for multi-language cells, dbutils for file operations, loading data, and visualizations.
Objectives
- Use magic commands to switch between Python, SQL, and Markdown
- Work with dbutils for file system operations
- Load data from CSV files
- Use display() for rich visualizations
Lab Exercise
See labs/course1/week2/lab_notebooks.py
Key Tasks
- Magic commands — Write Python and SQL cells in the same notebook
- dbutils exploration — List sample datasets, preview file contents
- Load data — Read a CSV file with schema inference
- Visualization — Use display() to create charts from aggregated data
Validation
The lab includes a validate_lab() function that checks:
- Python magic command executed correctly
- DataFrame was loaded with data
Lab: Spark Operations
Practice core Spark DataFrame operations: select, filter, groupBy, aggregations, and joins using sales data.
Objectives
- Use select() to choose and transform columns
- Use filter() to select rows by condition
- Use groupBy() with aggregation functions (sum, avg, count, max)
- Perform inner and left joins between DataFrames
- Write equivalent SQL queries
Lab Exercise
See labs/course1/week2/lab_spark.py
Key Tasks
- Select — Create derived columns (total_revenue, discounted_price)
- Filter — Find rows by price, category, region, and date range
- GroupBy — Compute revenue by category, average price by region, max price per category
- Join — Combine sales with region lookup, then aggregate by territory
- SQL — Register DataFrames as views and write equivalent SQL queries
Validation
The lab includes a validate_lab() function that checks:
- Sales data loaded (10 rows)
- Select returns correct number of columns
- Filter returns non-empty results
- GroupBy produces correct number of groups
- Join produces correct row count
Week 3: Delta Lake & Workflows
Overview
Build reliable data pipelines with Delta Lake — ACID transactions, schema enforcement, DML operations (INSERT, UPDATE, MERGE), and time travel. Then orchestrate pipelines with Databricks Jobs, Dashboards, and Workflows.
Topics
| # | Type | Title | Duration |
|---|---|---|---|
| 3.1.1 | Video | What Is Delta Lake | 10 min |
| 3.1.2 | Video | Delta Lake Concepts | 12 min |
| 3.1.3 | Video | Creating Delta Tables | 10 min |
| 3.2.1 | Video | Insert, Update & Merge | 12 min |
| 3.2.2 | Video | Time Travel | 8 min |
| 3.3.1 | Video | Jobs, Dashboards & Workflows | 12 min |
| — | Lab | Delta Tables | 45 min |
| — | Lab | Jobs & Workflows | 30 min |
| — | Quiz | Delta Lake & Workflows | 15 min |
Key Concepts
Delta Lake Architecture
Delta Table
├── _delta_log/                               # Transaction log (JSON + Parquet)
│   ├── 00000000000000000000.json             # Version 0
│   ├── 00000000000000000001.json             # Version 1
│   └── 00000000000000000010.checkpoint.parquet
└── part-00000-*.parquet                      # Data files (standard Parquet)
The transaction log records every change, enabling ACID guarantees.
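The log-replay idea can be sketched in Python (a toy model, not the actual Delta protocol; the commit structure here is simplified): each commit adds or removes data files, and replaying commits up to a version yields a time-travel snapshot.

```python
import json

# Toy sketch of a Delta-style transaction log: each version is a JSON
# commit of add/remove actions; replaying them yields the live file set.
log = [
    json.dumps({"version": 0, "actions": [{"add": "part-000.parquet"},
                                          {"add": "part-001.parquet"}]}),
    json.dumps({"version": 1, "actions": [{"remove": "part-000.parquet"},
                                          {"add": "part-002.parquet"}]}),
]

def live_files(log_entries, as_of=None):
    """Replay the log up to a version (time travel) and return live files."""
    files = set()
    for entry in log_entries:
        commit = json.loads(entry)
        if as_of is not None and commit["version"] > as_of:
            break
        for action in commit["actions"]:
            if "add" in action:
                files.add(action["add"])
            if "remove" in action:
                files.discard(action["remove"])
    return sorted(files)

print(live_files(log))           # current snapshot
print(live_files(log, as_of=0))  # VERSION AS OF 0 snapshot
```

Replaying with `as_of=0` is exactly what `SELECT * FROM table VERSION AS OF 0` does conceptually: the old data files still exist, and the log tells the reader which ones were live at that version.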
Delta Lake Features
| Feature | What It Does | Why It Matters |
|---|---|---|
| ACID Transactions | Atomic, consistent writes | No corrupt/partial data |
| Schema Enforcement | Validates data on write | Data quality |
| Schema Evolution | Add columns safely | Agile development |
| Time Travel | Query historical versions | Auditing, rollback |
| MERGE (Upsert) | INSERT + UPDATE + DELETE | Efficient CDC |
| Auto-Optimize | Compacts small files | Query performance |
DML Operations
- INSERT: df.write.format("delta").mode("append")
- UPDATE: UPDATE table SET col = val WHERE condition
- MERGE: Match on key — update if exists, insert if not
- Time Travel: SELECT * FROM table VERSION AS OF n
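The MERGE semantics can be sketched as a dict-based upsert (a simplification: real MERGE also supports conditional WHEN MATCHED clauses and DELETE, and runs atomically; the table contents below are illustrative):

```python
# Dict-based sketch of MERGE: match on key, update if matched,
# insert if not matched. Keys and columns are illustrative.
target = {1: {"name": "widget", "price": 10.0},
          2: {"name": "gadget", "price": 20.0}}
source = [{"id": 2, "name": "gadget", "price": 25.0},  # matched  -> UPDATE
          {"id": 3, "name": "gizmo",  "price": 5.0}]   # no match -> INSERT

def merge(target, source):
    for row in source:
        key = row["id"]
        values = {k: v for k, v in row.items() if k != "id"}
        target[key] = values  # update-or-insert in one pass
    return target

merged = merge(dict(target), source)
print(merged[2]["price"])  # 25.0 (updated)
print(sorted(merged))      # [1, 2, 3] (row 3 inserted)
```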
Databricks Workflows
- Job: Scheduled execution of a notebook or script
- Task: Single unit of work within a workflow
- Workflow: Multi-task DAG with dependencies
- Dashboard: SQL-powered visualizations connected to SQL Warehouses
- Widgets: Parameterize notebooks for reusable pipelines
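The "multi-task DAG with dependencies" idea reduces to topological ordering: a task runs only after everything it depends on has finished. A minimal sketch with the standard library (task names are illustrative, not a Databricks API):

```python
from graphlib import TopologicalSorter

# Sketch of a multi-task workflow DAG: each task maps to the set of
# tasks it depends on. graphlib resolves a valid execution order.
dag = {
    "extract":   set(),
    "transform": {"extract"},
    "load":      {"transform"},
    "dashboard": {"load"},
}

order = list(TopologicalSorter(dag).static_order())
print(order)  # ['extract', 'transform', 'load', 'dashboard']
```

Databricks Workflows does the same resolution for you, plus scheduling, retries, and cluster management per task.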
Certification Topics
Key accreditation concepts from this week:
- Delta Lake provides ACID transactions via the transaction log
- MERGE combines INSERT, UPDATE, and DELETE in one atomic operation
- Time travel enables querying any previous version of the data
- Schema enforcement prevents bad data; schema evolution adds columns safely
- Jobs use job clusters (auto-created, auto-terminated) for scheduled workloads
- Workflows orchestrate multi-step pipelines with DAG dependencies
Demo Code
- demos/course1/week3/databricks-delta/ — Delta tables, DML, time travel
- demos/course1/week3/databricks-workflows/ — Jobs, dashboards, workflows
Lab: Delta Tables
Create and manage Delta Lake tables: INSERT, UPDATE, MERGE operations, time travel queries, and schema enforcement.
Objectives
- Create Delta tables from DataFrames
- Perform INSERT, UPDATE, and MERGE (upsert) operations
- Use time travel to query historical versions
- Understand schema enforcement and evolution
Lab Exercise
See labs/course1/week3/lab_delta.py
Key Tasks
- Create table — Build an inventory Delta table with 6+ products
- INSERT — Append new products
- UPDATE — Modify prices for a category
- MERGE — Upsert with matched updates and unmatched inserts
- Time travel — View history, query version 0, compare price changes
- Schema enforcement — Verify that mismatched schemas are rejected
Key SQL Patterns
-- MERGE pattern
MERGE INTO target USING source ON target.key = source.key
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT ...
-- Time travel
SELECT * FROM table VERSION AS OF 0
-- History
DESCRIBE HISTORY table
Validation
The lab includes a validate_lab() function that checks:
- Delta table exists
- Table has at least 6 rows
- Multiple versions exist (DML operations were performed)
Lab: Jobs & Workflows
Build a parameterized ETL pipeline, create dashboard-ready queries, and understand Databricks job scheduling.
Objectives
- Create parameterized notebooks with widgets
- Build an Extract-Transform-Load pipeline
- Write dashboard-ready SQL queries
- Understand job scheduling and workflow orchestration
Lab Exercise
See labs/course1/week3/lab_workflows.py
Key Tasks
- Widgets — Create text and dropdown widgets for runtime parameters
- ETL pipeline — Extract raw orders, transform (filter + enrich), load to Delta
- Dashboard queries — Revenue by category, daily trends, top products
- Job concepts — Answer questions about cluster types, retries, and parameter passing
Key Concepts
- Widgets: dbutils.widgets.text(), dbutils.widgets.dropdown()
- Job clusters: Auto-created and terminated — best for scheduled workloads
- Workflows: Multi-task DAG with dependency ordering
- Dashboards: SQL queries connected to SQL Warehouses for visualization
Validation
The lab includes a validate_lab() function that checks:
- Parameters are configured
- Gold Delta table was created with data
- Revenue column exists in output
- Only completed orders were loaded
Course 3: MLOps Engineering on Databricks
Subtitle
Build and Deploy ML Systems with MLflow, Feature Store, and Model Serving
Description
Master the complete MLOps lifecycle on Databricks: experiment tracking with MLflow, feature engineering with Feature Store, model management with Unity Catalog, and deployment with Model Serving. Understand each component deeply by building equivalent systems from scratch with the Sovereign AI Stack.
Learning Outcomes
- Track experiments and manage model lifecycle with MLflow on Databricks
- Build and serve features using Databricks Feature Store and SQL Warehouses
- Register, version, and govern models with Unity Catalog
- Deploy models for batch and real-time inference
- Implement quality gates and monitoring for production ML
Duration
~30 hours | 38 videos | 12 labs | 5 quizzes | 1 capstone
Weeks
| Week | Topic | Sovereign AI Stack |
|---|---|---|
| 1 | Experiment Tracking with MLflow | reqwest, serde, pacha |
| 2 | Feature Engineering | alimentar, trueno, delta-rs |
| 3 | Model Training and Registry | aprender, pacha |
| 4 | Model Serving and Inference | realizar |
| 5 | Production Quality and Orchestration | pmat, batuta |
| 6 | Capstone: Fraud Detection Platform | Full stack |
Databricks Free Edition Features Used
- Experiments (MLflow Tracking)
- Catalog (Unity Catalog for model registry)
- Jobs & Pipelines (orchestration)
- SQL Warehouses (feature computation)
- Playground (model testing)
Week 1: Experiment Tracking with MLflow
Overview
Understand experiment tracking by implementing an MLflow REST client in Rust.
Topics
| # | Type | Title | Platform | Duration |
|---|---|---|---|---|
| 1.1 | Video | The Reproducibility Crisis | Concept | 8 min |
| 1.2 | Video | MLflow Architecture: Tracking, Registry, Projects | Databricks | 10 min |
| 1.3 | Lab | Create Experiments in Databricks | Databricks | 30 min |
| 1.4 | Video | MLflow REST Protocol Deep Dive | Concept | 10 min |
| 1.5 | Lab | Build MLflow Client in Rust | Sovereign | 40 min |
| 1.6 | Video | Autologging and Framework Integration | Databricks | 8 min |
| 1.7 | Video | Artifact Storage: DBFS, S3, Unity Catalog | Databricks | 8 min |
| 1.8 | Lab | Compare: Databricks MLflow vs Rust Client | Both | 25 min |
| 1.9 | Quiz | Experiment Tracking Fundamentals | — | 15 min |
Sovereign AI Stack Components
- reqwest for HTTP client
- serde for JSON serialization
- pacha concepts for artifact storage
Key Concepts
MLflow Tracking
- Experiments organize related runs
- Runs contain parameters, metrics, and artifacts
- Metrics can be logged at each training step
REST API
- POST /api/2.0/mlflow/experiments/create
- POST /api/2.0/mlflow/runs/create
- POST /api/2.0/mlflow/runs/log-metric
- POST /api/2.0/mlflow/runs/log-batch
Lab: MLflow Client
Build an MLflow REST client in Rust to understand experiment tracking internals.
Objectives
- Implement HTTP client for MLflow REST API
- Create experiments and runs
- Log parameters and metrics
- Search and retrieve runs
Demo Code
See demos/course3/week1/mlflow-client/
Lab Exercise
See labs/course3/week1/lab_1_5_mlflow_client.py
Key Implementation
pub struct MlflowClient {
    base_url: String,
    client: reqwest::Client,
}

impl MlflowClient {
    pub async fn log_metric(
        &self,
        run_id: &str,
        key: &str,
        value: f64,
    ) -> Result<(), MlflowError> {
        let body = json!({
            "run_id": run_id,
            "key": key,
            "value": value,
            "timestamp": Utc::now().timestamp_millis(),
        });
        self.post_void("runs/log-metric", &body).await
    }
}
Validation
Run tests:
cd demos/course3/week1/mlflow-client
cargo test
Lab: Feature Pipeline
Build a SIMD-accelerated feature computation pipeline.
Objectives
- Compute feature statistics
- Implement normalization transforms
- Build a composable pipeline
Demo Code
See demos/course3/week2/feature-pipeline/
Lab Exercise
See labs/course3/week2/lab_2_5_feature_pipeline.py
Key Transforms
pub fn normalize_zscore(values: &[f32]) -> Result<Vec<f32>, FeatureError> {
    let stats = compute_statistics(values)?;
    Ok(values.iter()
        .map(|v| (v - stats.mean) / stats.std_dev)
        .collect())
}

pub fn normalize_minmax(values: &[f32]) -> Result<Vec<f32>, FeatureError> {
    let stats = compute_statistics(values)?;
    let range = stats.max - stats.min;
    Ok(values.iter()
        .map(|v| (v - stats.min) / range)
        .collect())
}
Validation
Run tests:
cd demos/course3/week2/feature-pipeline
cargo test
Week 3: Model Training and Registry
Overview
Train models with aprender and manage them with pacha's signed registry.
Topics
| # | Type | Title | Platform | Duration |
|---|---|---|---|---|
| 3.1 | Video | ML Algorithms: From Scratch to AutoML | Concept | 10 min |
| 3.2 | Lab | Train Models with aprender | Sovereign | 40 min |
| 3.3 | Video | Databricks AutoML | Databricks | 10 min |
| 3.4 | Lab | AutoML Experiment in Databricks | Databricks | 30 min |
| 3.5 | Video | Model Registry with Unity Catalog | Databricks | 10 min |
| 3.6 | Video | Model Signing and Security | Sovereign | 8 min |
| 3.7 | Lab | Register and Sign Models with pacha | Sovereign | 35 min |
| 3.8 | Video | Model Lineage and Governance | Databricks | 8 min |
| 3.9 | Quiz | Training and Registry | — | 15 min |
Sovereign AI Stack Components
aprenderfor ML algorithmspachafor Ed25519 signing and BLAKE3 hashing
Key Concepts
Model Training
- Linear regression with gradient descent
- Random forest ensemble methods
- Cross-validation for model selection
Model Registry
- Version control for models
- Stage transitions (staging → production)
- Cryptographic signing for integrity
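The registry bullets can be combined into one small sketch (the `ModelRegistry` class and its stage names are hypothetical; pacha uses Ed25519 signing and BLAKE3 hashing, for which `hashlib.sha256` stands in here purely for illustration):

```python
import hashlib

# Sketch of a model registry: versioned entries, an integrity digest,
# and validated stage transitions (staging -> production -> archived).
VALID_TRANSITIONS = {
    "none":       {"staging"},
    "staging":    {"production", "archived"},
    "production": {"archived"},
}

class ModelRegistry:
    def __init__(self):
        self.models = {}  # (name, version) -> {"stage": str, "digest": str}

    def register(self, name, version, artifact: bytes):
        digest = hashlib.sha256(artifact).hexdigest()  # integrity check stand-in
        self.models[(name, version)] = {"stage": "none", "digest": digest}
        return digest

    def transition(self, name, version, new_stage):
        entry = self.models[(name, version)]
        if new_stage not in VALID_TRANSITIONS[entry["stage"]]:
            raise ValueError(f"illegal transition {entry['stage']} -> {new_stage}")
        entry["stage"] = new_stage

reg = ModelRegistry()
reg.register("fraud", 1, b"model-bytes")
reg.transition("fraud", 1, "staging")
reg.transition("fraud", 1, "production")
```

Enforcing the transition table is what prevents, say, promoting an unreviewed model straight to production; Unity Catalog expresses the same idea through aliases and grants.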
Lab: Model Training
Train ML models with gradient descent and evaluate performance.
Objectives
- Implement linear regression
- Train on synthetic datasets
- Calculate evaluation metrics
Demo Code
See demos/course3/week3/model-training/
Lab Exercise
See labs/course3/week3/lab_3_4_automl.py
Key Implementation
impl LinearRegression {
    pub fn fit(&mut self, features: &[Vec<f64>], labels: &[f64]) {
        let n_samples = features.len() as f64;
        for _ in 0..self.n_iterations {
            let mut weight_gradients = vec![0.0; self.weights.len()];
            let mut bias_gradient = 0.0;
            for (x, &y) in features.iter().zip(labels.iter()) {
                let pred = self.predict_single(x);
                let error = pred - y;
                for (j, &xj) in x.iter().enumerate() {
                    weight_gradients[j] += error * xj;
                }
                bias_gradient += error;
            }
            // Update weights with gradients averaged over the batch
            for (w, grad) in self.weights.iter_mut().zip(&weight_gradients) {
                *w -= self.learning_rate * grad / n_samples;
            }
            self.bias -= self.learning_rate * bias_gradient / n_samples;
        }
    }
}
Lab: Inference Server
Build a model serving infrastructure with batching and health checks.
Objectives
- Implement prediction endpoint
- Add request batching
- Configure health monitoring
Demo Code
See demos/course3/week4/inference-server/
Lab Exercise
See labs/course3/week4/lab_4_5_serving.py
Key Components
pub struct InferenceServer {
    model: Box<dyn Model>,
    batcher: RequestBatcher,
    metrics: ServerMetrics,
}

impl InferenceServer {
    pub async fn predict(&self, request: PredictRequest) -> PredictResponse {
        let start = Instant::now();
        let result = self.batcher.add(request).await;
        self.metrics.record_request(start.elapsed());
        result
    }

    pub fn health(&self) -> HealthResponse {
        HealthResponse {
            status: "healthy",
            model_loaded: self.model.is_loaded(),
            requests_processed: self.metrics.total_requests(),
        }
    }
}
Week 5: Production Quality and Orchestration
Overview
Implement quality gates with pmat and orchestration with batuta.
Topics
| # | Type | Title | Platform | Duration |
|---|---|---|---|---|
| 5.1 | Video | MLOps Maturity Model | Concept | 8 min |
| 5.2 | Video | Databricks Workflows for ML | Databricks | 10 min |
| 5.3 | Lab | Build ML Pipeline with Jobs | Databricks | 35 min |
| 5.4 | Video | Quality Gates with pmat | Sovereign | 8 min |
| 5.5 | Lab | Enforce TDG Quality Score | Sovereign | 25 min |
| 5.6 | Video | Monitoring and Drift Detection | Databricks | 10 min |
| 5.7 | Video | batuta Orchestration | Sovereign | 8 min |
| 5.8 | Quiz | Production MLOps | — | 15 min |
Sovereign AI Stack Components
- batuta for orchestration
- pmat for quality gates
- renacer for syscall tracing
Key Concepts
Quality Gates
- TDG (Technical Debt Gauge) scoring
- Complexity thresholds
- Test coverage requirements
Orchestration
- DAG-based workflow execution
- Privacy tier enforcement
- Retry and failure handling
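Retry and failure handling usually means exponential backoff. A generic sketch (not batuta's or Databricks' actual API; delays are collected rather than slept so the logic stays testable):

```python
# Sketch of retry-with-exponential-backoff for a workflow task.
def run_with_retries(task, max_retries=3, base_delay=1.0):
    """Run task; on failure retry up to max_retries times.
    Returns (result, backoff delays that would have been slept)."""
    delays = []
    for attempt in range(max_retries + 1):
        try:
            return task(), delays
        except RuntimeError:
            if attempt == max_retries:
                raise  # out of retries: propagate the failure
            delays.append(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result, delays = run_with_retries(flaky)
print(result, delays)  # ok [1.0, 2.0]
```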
Lab: Quality Gates
Implement production quality enforcement with pmat.
Objectives
- Configure quality thresholds
- Implement pre-commit hooks
- Enforce TDG scoring
Demo Code
See demos/course3/week5/quality-gates/
Lab Exercise
See labs/course3/week5/lab_5_5_quality_gates.py
Configuration
# .pmat-gates.toml
[gates]
min_tdg_score = "B"
max_cyclomatic = 30
max_cognitive = 25
min_line_coverage = 80
min_branch_coverage = 70
[pre_commit_checks]
checks = ["complexity", "dead-code", "security", "duplicates"]
Commands
# Repository health score
pmat repo-score
# Quality gate check
pmat quality-gate
# Rust project score
pmat rust-project-score
# Analyze complexity
pmat analyze complexity --path .
Course 4: GenAI Engineering on Databricks
Subtitle
Build LLM Applications with Foundation Models, Vector Search, and RAG
Description
Construct production GenAI systems on Databricks: serve foundation models, implement vector search for semantic retrieval, build RAG pipelines, and fine-tune models for domain adaptation. Understand the internals by building equivalent systems with the Sovereign AI Stack.
Learning Outcomes
- Serve and query foundation models on Databricks
- Generate embeddings and build vector search indexes
- Implement production RAG pipelines with hybrid retrieval
- Fine-tune models with LoRA/QLoRA for domain adaptation
- Deploy privacy-aware GenAI systems with proper governance
Duration
~34 hours | 40 videos | 12 labs | 5 quizzes | 1 capstone
Weeks
| Week | Topic | Sovereign AI Stack |
|---|---|---|
| 1 | Foundation Models and LLM Serving | realizar, tokenizers |
| 2 | Prompt Engineering and Structured Output | batuta, serde |
| 3 | Embeddings and Vector Search | trueno, trueno-rag |
| 4 | RAG Pipelines | trueno-rag, alimentar |
| 5 | Fine-Tuning and Model Security | entrenar, pacha |
| 6 | Production Deployment | batuta, renacer |
| 7 | Capstone: Enterprise Knowledge Assistant | Full stack |
Databricks Free Edition Features Used
- Playground (Foundation Models)
- Vector Search (via Catalog)
- Genie (AI/BI demo)
- Experiments (evaluation tracking)
- Jobs & Pipelines (RAG orchestration)
Week 1: Foundation Models and LLM Serving
Overview
Understand LLM serving by building a tokenizer and inference server in Rust.
Topics
| # | Type | Title | Platform | Duration |
|---|---|---|---|---|
| 1.1 | Video | The GenAI Landscape | Concept | 10 min |
| 1.2 | Video | Databricks Foundation Model APIs | Databricks | 10 min |
| 1.3 | Lab | Query Models in Playground | Databricks | 25 min |
| 1.4 | Video | GGUF Format and Quantization | Sovereign | 10 min |
| 1.5 | Lab | Serve Local Model with realizar | Sovereign | 35 min |
| 1.6 | Video | Tokenization Deep Dive | Concept | 10 min |
| 1.7 | Lab | Build BPE Tokenizer | Sovereign | 30 min |
| 1.8 | Video | External Models and AI Gateway | Databricks | 8 min |
| 1.9 | Quiz | LLM Serving Fundamentals | — | 15 min |
Sovereign AI Stack Components
- realizar for GGUF inference
- tokenizers crate for BPE
Key Concepts
Tokenization
- BPE (Byte-Pair Encoding) algorithm
- Vocabulary and merge rules
- Special tokens: <|endoftext|>, <|pad|>
Model Quantization
- FP16, INT8, INT4 representations
- GGUF format: Q4_K_M, Q5_K_M, Q8_0
- Memory vs accuracy trade-offs
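The memory-vs-accuracy trade-off can be made concrete with a per-tensor symmetric INT8 scheme (an illustrative sketch only; GGUF's Q4_K_M / Q5_K_M use block-wise layouts that are more involved):

```python
# Symmetric INT8 quantization sketch: one scale per tensor, codes in
# [-127, 127]; dequantizing shows the bounded reconstruction error.
def quantize_int8(values):
    scale = max(abs(v) for v in values) / 127.0
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

weights = [0.12, -0.5, 0.33, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q)        # int8 codes
print(max_err)  # rounding error, at most about scale / 2
```

Four bytes per FP32 weight become one byte per INT8 code (plus one scale per tensor), which is the 4x memory reduction the table-stakes INT8 schemes advertise; lower bit widths push the error up.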
Lab: Tokenizer
Build a BPE tokenizer to understand LLM text processing.
Objectives
- Implement byte-pair encoding
- Handle special tokens
- Encode and decode text
Demo Code
See demos/course4/week1/llm-serving/
Lab Exercise
See labs/course4/week1/lab_1_7_tokenizer.py
Key Implementation
pub struct BpeTokenizer {
    vocab: HashMap<String, u32>,
    merges: Vec<(String, String)>,
    special_tokens: HashMap<String, u32>,
}

impl BpeTokenizer {
    pub fn encode(&self, text: &str) -> Vec<u32> {
        let mut tokens: Vec<String> = text.chars()
            .map(|c| c.to_string())
            .collect();
        // Apply merge rules in priority order
        for (a, b) in &self.merges {
            tokens = self.apply_merge(&tokens, a, b);
        }
        tokens.iter()
            .filter_map(|t| self.vocab.get(t).copied())
            .collect()
    }
}
Lab: Prompt Templates
Build type-safe prompt templates with variable substitution.
Objectives
- Create reusable templates
- Implement variable validation
- Build a prompt library
Demo Code
See demos/course4/week2/prompt-engineering/
Lab Exercise
See labs/course4/week2/lab_2_6_prompt_templates.py
Key Implementation
pub struct PromptTemplate {
    template: String,
    variables: Vec<String>,
}

impl PromptTemplate {
    pub fn render(&self, vars: &HashMap<String, String>) -> Result<String, PromptError> {
        let mut result = self.template.clone();
        for var in &self.variables {
            let value = vars.get(var)
                .ok_or(PromptError::MissingVariable(var.clone()))?;
            result = result.replace(&format!("{{{}}}", var), value);
        }
        Ok(result)
    }
}
Week 3: Embeddings and Vector Search
Overview
Build SIMD-accelerated vector search with trueno and implement HNSW indexing.
Topics
| # | Type | Title | Platform | Duration |
|---|---|---|---|---|
| 3.1 | Video | What Are Embeddings? | Concept | 10 min |
| 3.2 | Video | Databricks Vector Search | Databricks | 10 min |
| 3.3 | Lab | Create Vector Search Index | Databricks | 35 min |
| 3.4 | Video | SIMD Similarity: Cosine, Dot Product | Sovereign | 10 min |
| 3.5 | Lab | Build SIMD Vector Search with trueno | Sovereign | 35 min |
| 3.6 | Video | HNSW: Approximate Nearest Neighbors | Concept | 10 min |
| 3.7 | Lab | Implement HNSW Index | Sovereign | 40 min |
| 3.8 | Video | Hybrid Search: BM25 + Vector | Sovereign | 8 min |
| 3.9 | Lab | Hybrid Retrieval with trueno-rag | Sovereign | 35 min |
| 3.10 | Quiz | Vector Search | — | 15 min |
Sovereign AI Stack Components
- trueno for SIMD computation
- trueno-rag for BM25 + HNSW
- trueno-db for GPU analytics
Key Concepts
Similarity Metrics
- Cosine similarity: dot(a, b) / (||a|| * ||b||)
- Euclidean distance: sqrt(sum((a - b)^2))
- Dot product: sum(a * b)
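The three metrics can be checked by hand on a pair of parallel vectors (plain Python, no SIMD; the Rust lab later accelerates exactly these loops):

```python
import math

# Two vectors pointing the same direction, b twice as long as a.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]

dot = sum(x * y for x, y in zip(a, b))
cosine = dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
euclidean = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(dot)        # 28.0
print(cosine)     # 1.0 -- cosine ignores magnitude, only direction matters
print(euclidean)  # sqrt(14): distance still sees the magnitude gap
```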
HNSW Algorithm
- Hierarchical navigable small world graphs
- O(log n) search complexity
- Configurable M and ef parameters
Lab: Embeddings
Build a vector search index with SIMD-accelerated similarity.
Objectives
- Generate text embeddings
- Implement similarity metrics
- Build a searchable index
Demo Code
See demos/course4/week3/vector-search/
Lab Exercise
See labs/course4/week3/lab_3_5_embeddings.py
Key Implementation
```rust
pub struct Embedding {
    pub id: String,
    pub vector: Vec<f32>,
}

pub struct SearchResult {
    pub id: String,
    pub score: f32,
}

pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

pub struct VectorIndex {
    embeddings: Vec<Embedding>,
}

impl VectorIndex {
    pub fn search(&self, query: &[f32], k: usize) -> Vec<SearchResult> {
        let mut results: Vec<SearchResult> = self
            .embeddings
            .iter()
            .map(|e| SearchResult {
                id: e.id.clone(),
                score: cosine_similarity(query, &e.vector),
            })
            .collect();
        // Sort by descending similarity; total_cmp avoids panicking on NaN.
        results.sort_by(|a, b| b.score.total_cmp(&a.score));
        results.into_iter().take(k).collect()
    }
}
```
Lab: RAG Pipeline
Build an end-to-end RAG system with chunking, retrieval, and generation.
Objectives
- Implement document chunking
- Build retrieval pipeline
- Generate contextual answers
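The chunking objective can be sketched as a fixed-size character chunker with overlap, so adjacent chunks share context at their boundaries. This is a hypothetical helper for illustration; the lab's pipeline may chunk by tokens or sentences instead of characters.

```python
# Minimal fixed-size chunker with overlap (illustrative; real pipelines
# often chunk by tokens or sentence boundaries rather than characters).
def chunk_text(text, chunk_size=200, overlap=50):
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each chunk
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # the last window already covered the end of the text
    return chunks

doc = "x" * 500
chunks = chunk_text(doc, chunk_size=200, overlap=50)
print(len(chunks))  # windows start at 0, 150, 300 → 3 chunks
```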
Demo Code
See demos/course4/week4/rag-pipeline/
Lab Exercise
See labs/course4/week4/lab_4_7_rag.py
Key Implementation
```rust
pub struct RagPipeline {
    chunker: TextChunker,
    vector_store: VectorStore,
    generator: Generator,
}

impl RagPipeline {
    pub fn query(&self, question: &str) -> RagResponse {
        // 1. Embed the query
        let query_embedding = self.embed(question);

        // 2. Retrieve the top-3 relevant chunks
        let results = self.vector_store.search(&query_embedding, 3);

        // 3. Build the context window from the retrieved chunks
        let context = results
            .iter()
            .map(|r| r.chunk.text.as_str())
            .collect::<Vec<_>>()
            .join("\n\n");

        // 4. Generate an answer grounded in the context
        let prompt = format!(
            "Context:\n{}\n\nQuestion: {}\n\nAnswer:",
            context, question
        );
        let answer = self.generator.generate(&prompt);

        RagResponse { answer, sources: results }
    }
}
```
Week 5: Fine-Tuning and Model Security
Overview
Fine-tune models with LoRA/QLoRA and implement secure model distribution.
Topics
| # | Type | Title | Platform | Duration |
|---|---|---|---|---|
| 5.1 | Video | When to Fine-Tune vs RAG | Concept | 10 min |
| 5.2 | Video | Databricks Fine-Tuning | Databricks | 10 min |
| 5.3 | Lab | Fine-Tune in Databricks | Databricks | 40 min |
| 5.4 | Video | LoRA/QLoRA from Scratch | Sovereign | 10 min |
| 5.5 | Lab | Fine-Tune with entrenar | Sovereign | 45 min |
| 5.6 | Video | Model Encryption and Signing | Sovereign | 10 min |
| 5.7 | Lab | Secure Model Pipeline with pacha | Sovereign | 35 min |
| 5.8 | Video | EU AI Act and Governance | Concept | 8 min |
| 5.9 | Quiz | Fine-Tuning and Security | — | 15 min |
Sovereign AI Stack Components
- entrenar for LoRA/QLoRA training
- pacha for ChaCha20-Poly1305 encryption
Key Concepts
LoRA (Low-Rank Adaptation)
- Freeze base model weights
- Add trainable low-rank matrices
- Scaling factor: `alpha / r`
- Target modules: `q_proj`, `v_proj`, `k_proj`
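The points above combine into a single forward pass: the frozen weight's output plus a scaled low-rank correction, `y = W x + (alpha / r) * B (A x)`. A toy numeric sketch (dimensions and initializations are illustrative, though B = 0 at initialization matches standard LoRA practice):

```python
# Illustrative LoRA forward pass: y = W x + (alpha / r) * B (A x).
# W is frozen; only A (r x d_in) and B (d_out x r) are trained.
import random

def matvec(m, v):
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

d_in, d_out, r, alpha = 4, 4, 2, 16
random.seed(0)
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]   # frozen
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]    # trainable
B = [[0.0] * r for _ in range(d_out)]  # B starts at zero: no initial change

def lora_forward(x):
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]

x = [1.0, 2.0, 3.0, 4.0]
# With B = 0 the adapted output equals the frozen output exactly.
print(lora_forward(x) == matvec(W, x))  # → True
```

Only `r * d_in + d_out * r = 16` parameters train here, versus 16 in the frozen W for these toy shapes; at 7B scale the same ratio becomes a fraction of a percent.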
QLoRA
- Quantized base model (4-bit)
- Double quantization for memory efficiency
- Paged optimizers for large batches
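The quantized-base idea can be shown with a minimal absmax round trip to 4-bit integers. This is a simplification: real QLoRA uses the NF4 data type (non-uniform 4-bit levels) applied blockwise, plus double quantization of the per-block scales.

```python
# Toy symmetric absmax quantization to 4-bit integers in -7..7.
# Simplification: real QLoRA uses blockwise NF4, not uniform absmax.
def quantize_4bit(weights):
    scale = max(abs(w) for w in weights) / 7.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    return [qi * scale for qi in q]

w = [0.70, -0.35, 0.10, -0.70]
q, scale = quantize_4bit(w)
print(q)  # → [7, -4, 1, -7]  (each value fits in 4 bits)
w_hat = dequantize_4bit(q, scale)
print(max(abs(a - b) for a, b in zip(w, w_hat)))  # reconstruction error <= scale/2
```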
Fine-Tuning vs RAG
| Aspect | Fine-Tuning | RAG |
|---|---|---|
| Knowledge | Baked into weights | Retrieved at runtime |
| Updates | Requires retraining | Update index only |
| Cost | Higher compute | Lower compute |
| Use case | Style/behavior change | Knowledge access |
Lab: Fine-Tuning
Configure LoRA fine-tuning for domain adaptation.
Objectives
- Configure LoRA parameters
- Prepare training data
- Calculate training metrics
Demo Code
See demos/course4/week5/fine-tuning/
Lab Exercise
See labs/course4/week5/lab_5_3_fine_tuning.py
Key Implementation
```rust
pub struct LoraConfig {
    pub r: usize,      // Rank (4, 8, 16)
    pub alpha: usize,  // Scaling (16, 32)
    pub dropout: f32,  // Dropout rate
    pub target_modules: Vec<String>,
}

impl LoraConfig {
    pub fn scaling_factor(&self) -> f32 {
        self.alpha as f32 / self.r as f32
    }

    pub fn estimated_params(&self, hidden: usize, layers: usize) -> usize {
        // Each adapted module adds A (r x hidden) and B (hidden x r).
        self.r * hidden * 2 * self.target_modules.len() * layers
    }
}

// Example: 7B model with r=8, two target modules, 32 layers
// Params: 8 * 4096 * 2 * 2 * 32 = 4.2M (0.06% of 7B)
```
Lab: Production Deployment
Deploy GenAI systems with guardrails and monitoring.
Objectives
- Implement input/output guardrails
- Configure rate limiting
- Track production metrics
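The rate-limiting objective can be sketched as a token bucket: requests spend tokens, tokens refill at a fixed rate, and a full bucket allows short bursts. The class and parameter names are assumptions for illustration, not the lab's actual API.

```python
# Minimal token-bucket rate limiter (illustrative names, not the lab's API).
import time

class TokenBucket:
    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)  # start full: allow an initial burst
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(rate_per_sec=5, capacity=2)
print([bucket.allow() for _ in range(3)])  # burst of 2 allowed, third rejected
```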
Demo Code
See demos/course4/week6/production/
Lab Exercise
See labs/course4/week6/lab_6_3_production.py
Key Implementation
```rust
use std::time::Instant;

pub struct ProductionServer {
    guardrails: Guardrails,
    rate_limiter: RateLimiter,
    metrics: Metrics,
    router: ABRouter,
}

impl ProductionServer {
    pub fn process(&mut self, request: Request) -> Response {
        // 1. Check rate limit
        if !self.rate_limiter.check() {
            return Response::error("Rate limited");
        }

        // 2. Check input guardrails
        let check = self.guardrails.check_input(&request.prompt);
        if !check.passed {
            self.metrics.record_guardrail_block();
            return Response::error("Blocked by guardrails");
        }

        // 3. Route to a model variant (A/B)
        let model = self.router.select();

        // 4. Generate and record latency/token metrics
        let start = Instant::now();
        let response = model.generate(&request.prompt);
        self.metrics.record(start.elapsed(), response.tokens);
        response
    }
}
```
Sovereign AI Stack
The Sovereign AI Stack is a collection of Rust crates for building ML and GenAI systems from first principles.
Architecture
┌──────────────────────────────────────────────────────────────────┐
│ batuta (Orchestration) │
│ Privacy Tiers · CLI · Stack Coordination │
├───────────────────┬──────────────────┬───────────────────────────┤
│ realizar │ entrenar │ pacha │
│ (Inference) │ (Training) │ (Model Registry) │
│ GGUF/SafeTensors │ autograd/LoRA │ Sign/Encrypt/Lineage │
├───────────────────┴──────────────────┴───────────────────────────┤
│ aprender │
│ ML Algorithms: regression, trees, clustering │
├──────────────────────────────────────────────────────────────────┤
│ trueno │
│ SIMD/GPU Compute (AVX2/AVX-512/NEON, wgpu) │
├──────────────────────────────────────────────────────────────────┤
│ trueno-rag │ trueno-db │ alimentar │ pmat │
│ BM25 + Vector │ GPU Analytics │ Arrow/Parquet │ Quality │
└──────────────────┴─────────────────┴───────────────┴─────────────┘
Component Reference
| Component | Purpose | Course Usage |
|---|---|---|
| trueno | SIMD tensor operations | Feature computation, embeddings |
| aprender | ML algorithms | Model training |
| realizar | Inference serving | Model deployment |
| entrenar | LoRA/QLoRA training | Fine-tuning |
| pacha | Model registry | Signing, encryption |
| batuta | Orchestration | Pipeline coordination |
| trueno-rag | RAG pipeline | Retrieval + generation |
| alimentar | Data loading | Parquet, chunking |
| pmat | Quality gates | TDG scoring |
Installation
```sh
# Install from crates.io
cargo install batuta realizar pmat
```

```toml
# Or add to Cargo.toml
[dependencies]
trueno = "0.11"
aprender = "0.24"
realizar = "0.5"
pacha = "0.2"
batuta = "0.4"
alimentar = "0.2"
pmat = "2.213"
```
Privacy Tiers
The Sovereign AI Stack supports three privacy tiers:
| Tier | Description | Data Location |
|---|---|---|
| Sovereign | Air-gapped, on-premises | Never leaves local infrastructure |
| Private | Cloud but encrypted | Your cloud account, E2E encrypted |
| Standard | Managed services | Third-party APIs allowed |
Configure in batuta.toml:
```toml
[privacy]
tier = "sovereign"  # or "private", "standard"
allowed_endpoints = ["localhost", "*.internal.corp"]
```
Databricks Setup
This guide covers setting up Databricks Free Edition for the courses.
Create Account
- Go to databricks.com
- Click "Try Databricks Free"
- Sign up with your email
- Verify your account
Workspace Setup
Create Cluster
- Navigate to Compute in the sidebar
- Click Create Cluster
- Select the smallest instance type
- Enable auto-termination (15 minutes)
Install Libraries
For Python notebooks:
```python
%pip install mlflow databricks-feature-store
```
Features Used
Course 3: MLOps
| Feature | Purpose |
|---|---|
| Experiments | MLflow tracking |
| Catalog | Model registry |
| Jobs | Pipeline orchestration |
| SQL Warehouses | Feature computation |
| Playground | Model testing |
Course 4: GenAI
| Feature | Purpose |
|---|---|
| Playground | Foundation Models |
| Vector Search | Semantic retrieval |
| Genie | AI/BI demo |
| Experiments | Evaluation tracking |
| Jobs | RAG orchestration |
Notebook Conventions
All Databricks notebooks in this repository use:
```python
# Databricks notebook source
# MAGIC %md
# MAGIC # Notebook Title

# COMMAND ----------

# Code cell
```
Running Labs
- Import notebook into Databricks workspace
- Attach to running cluster
- Run cells sequentially
- Complete TODO sections
Troubleshooting
Cluster won't start
- Check your free tier limits
- Ensure auto-termination is enabled
- Try a smaller instance type
MLflow not found
```python
%pip install mlflow --quiet
dbutils.library.restartPython()
```
Feature Store issues
```python
%pip install databricks-feature-store --quiet
dbutils.library.restartPython()
```