InfraLens - AI Infrastructure Learning Starting Point

AI SYSTEMS FIELD GUIDE · 20 CANONICAL CONCEPTS

Trace the bottleneck.
Understand the system.

InfraLens connects symptoms—latency, OOMs, poor scaling, queue pressure—to the mechanism, resource estimate, implementation, and interview explanation that make them understandable.

Start from a symptom · Explore six system layers

20 canonical concepts · 6 system layers · 9 guided paths · 4 practice formats

Start from evidence

What are you trying to explain?

Each entry opens the smallest canonical route that can answer the question.

Inference & Serving

Why is inference slow?

Separate prompt compute, token decode, cache traffic, batching, and scheduling.

Open runtime · Decode
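As a rough illustration of that separation, the sketch below splits one request's latency into queue wait, prefill (prompt compute), and decode. The `RequestTrace` class, its field names, and the sample timings are hypothetical and not part of InfraLens. It also assumes that prefill emits the first token, which may not match every runtime.

```python
from dataclasses import dataclass, field


@dataclass
class RequestTrace:
    """Hypothetical per-request timings in seconds (illustrative names)."""

    queue_wait_s: float  # time spent waiting on scheduling or batching
    prefill_s: float     # prompt compute; assumed to emit the first token
    decode_step_s: list[float] = field(default_factory=list)  # one entry per later token

    def time_to_first_token_s(self) -> float:
        return self.queue_wait_s + self.prefill_s

    def decode_s(self) -> float:
        return sum(self.decode_step_s)

    def total_s(self) -> float:
        return self.time_to_first_token_s() + self.decode_s()

    def shares(self) -> dict[str, float]:
        """Fraction of total latency spent in each phase."""
        total = self.total_s()
        if total == 0:
            return {"queue": 0.0, "prefill": 0.0, "decode": 0.0}
        return {
            "queue": self.queue_wait_s / total,
            "prefill": self.prefill_s / total,
            "decode": self.decode_s() / total,
        }

    def dominant_phase(self) -> str:
        shares = self.shares()
        return max(shares, key=shares.get)


# Made-up timings for a request that generates ten tokens after the first.
trace = RequestTrace(queue_wait_s=0.25, prefill_s=0.5, decode_step_s=[0.125] * 10)
print("time to first token (s):", trace.time_to_first_token_s())
print("total (s):", trace.total_s())
print("dominant phase:", trace.dominant_phase())
```

If decode dominates, cache traffic and batching are the likely places to look next. A large queue share points at scheduling instead.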

Training & Parallelism

Why does training not fit?

Build a ledger for parameters, gradients, optimizer state, activations, and buffers.

Build the ledger · Shard state
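A minimal sketch of such a ledger follows. It assumes mixed-precision training with Adam: 2-byte weights, 2-byte gradients, and 12 bytes per parameter of optimizer state (an fp32 master copy plus two fp32 moments). The `MemoryLedger` class, the 7B parameter count, and the activation and buffer sizes are hypothetical placeholders. Real activation memory depends on batch size, sequence length, and checkpointing, and should be measured.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class MemoryLedger:
    """Hypothetical single-device training memory ledger (no sharding)."""

    n_params: int
    param_bytes: int = 2       # assumption: bf16/fp16 weights
    grad_bytes: int = 2        # assumption: bf16/fp16 gradients
    optim_bytes: int = 12      # assumption: Adam, fp32 master copy + two fp32 moments
    activation_bytes: int = 0  # workload-dependent; supplied by the caller
    buffer_bytes: int = 0      # communication and workspace buffers; supplied by the caller

    def entries(self) -> dict[str, int]:
        return {
            "parameters": self.n_params * self.param_bytes,
            "gradients": self.n_params * self.grad_bytes,
            "optimizer_state": self.n_params * self.optim_bytes,
            "activations": self.activation_bytes,
            "buffers": self.buffer_bytes,
        }

    def total_bytes(self) -> int:
        return sum(self.entries().values())

    def fits(self, device_bytes: int) -> bool:
        return self.total_bytes() <= device_bytes


# Illustrative numbers only: a 7B-parameter model with guessed activation and buffer sizes.
ledger = MemoryLedger(
    n_params=7_000_000_000,
    activation_bytes=8 * GIB,
    buffer_bytes=2 * GIB,
)
for name, size in ledger.entries().items():
    print(f"{name:>16}: {size / GIB:7.1f} GiB")
print(f"{'total':>16}: {ledger.total_bytes() / GIB:7.1f} GiB")
print("fits in 80 GiB:", ledger.fits(80 * GIB))
```

Under these assumptions the per-parameter state alone is 16 bytes per parameter, so the ledger shows which line item to shard or shrink first when the total exceeds device memory.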

Training & Parallelism

Why did scaling stall?

Map collectives, topology, bubbles, imbalance, and useful work per device.

Model parallelism · MoE systems

Generative & Multimodal

Why is generation expensive?

Follow latent resolution, spacetime tokens, sampling steps, and decoder cost.

Latent systems · Video cost

Inference & Serving

Why does the pipeline stall?

Inspect per-stage queues, transfers, memory ownership, streaming, and cancellation.

Trace stages

Runtime & Reliability

Why is production unstable?

Distinguish overload, slow compute, leaked state, broken transport, and bad recovery.

Diagnose runtime · Practice incidents

Guided learning paths

Choose one outcome, not a wall of chapters.

Each path opens a focused sequence with a concrete first action. No account or progress tracking is required.

PATH CATALOG

Open all learning paths

Choose a structured route through inference, training, generative systems, agents, or runtime reliability.

Common systems taxonomy

Six layers, one mental model.

Domains describe where a mechanism lives. Problems, difficulty, code, and interview frequency stay as filters.

01 Foundations & Compute: Execution · memory · precision
02 Training & Parallelism: State · communication · topology
03 Inference & Serving: Cache · batching · scheduling
04 Generative & Multimodal: Latents · video · vision-language
05 Post-Training, Reasoning & Agents: Alignment · search · feedback
06 Runtime & Reliability: Queues · overload · recovery

Follow a guided path · Open the system map · Practice an explanation