Implementation Practice

Coding Practice

Implement small mechanisms, calculate resource bounds, and defend the assumptions that determine a system design.

Exercise loop
state and shapes -> formula -> implementation
      -> smoke test -> measured caveat -> interview explanation

Model Primitives

Start with the complete LLM whiteboard sprint

Live-coding primitives

Practice QKV shapes, MHA/MQA/GQA/MLA cache organization, decoder flow, KV-cache accounting, post-training losses, decoding policy, LoRA, and MoE routing.
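
As one illustration of the KV-cache accounting item above, here is a minimal sketch that counts cache bytes for MHA, GQA, and MQA. The function name `kv_cache_bytes` and the configuration values are hypothetical. The formula assumes the common layout with one K and one V tensor per layer, and it does not cover MLA, which caches a compressed latent instead.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size, bytes_per_elem=2):
    """Bytes held by cached keys and values across all layers.

    Assumes one K and one V tensor per layer, each of shape
    [batch_size, num_kv_heads, seq_len, head_dim].
    """
    # The leading 2 accounts for the separate K and V tensors.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Hypothetical model: 32 layers, head_dim 128, fp16 cache, one 4096-token sequence.
cfg = dict(num_layers=32, head_dim=128, seq_len=4096, batch_size=1)

mha = kv_cache_bytes(num_kv_heads=32, **cfg)  # one KV head per query head
gqa = kv_cache_bytes(num_kv_heads=8, **cfg)   # 4 query heads share each KV head
mqa = kv_cache_bytes(num_kv_heads=1, **cfg)   # all query heads share one KV head

for name, size in [("MHA", mha), ("GQA", gqa), ("MQA", mqa)]:
    print(f"{name}: {size / 2**20:.0f} MiB")
```

Only `num_kv_heads` changes between the three variants, so the cache shrinks in proportion to the number of query heads that share each KV head.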

Runtime Code

Implement only the missing systems checks

The runtime exercises cover rollout capacity, bounded stage scheduling, and queue backpressure. CUDA-specific kernel material remains an annotated reasoning task unless an actual CUDA toolchain and target device are part of the validation environment.
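
Queue backpressure can be rehearsed without threads. The sketch below is one possible smoke test, not the exercise's reference solution: the function `run_ticks` and its parameters are hypothetical, and the producer simply counts rejected items where a real system might block, retry, or shed load.

```python
import queue


def run_ticks(capacity, produce_per_tick, consume_per_tick, ticks):
    """Simulate a bounded queue and return (accepted, rejected, consumed)."""
    q = queue.Queue(maxsize=capacity)
    accepted = rejected = consumed = 0
    for t in range(ticks):
        for i in range(produce_per_tick):
            try:
                q.put_nowait((t, i))  # non-blocking put; raises queue.Full at capacity
                accepted += 1
            except queue.Full:
                rejected += 1         # backpressure signal seen by the producer
        for _ in range(consume_per_tick):
            try:
                q.get_nowait()
                consumed += 1
            except queue.Empty:
                break                 # nothing left to consume this tick
    return accepted, rejected, consumed


# Producer outruns the consumer 3:1, so the queue fills and starts rejecting.
stats = run_ticks(capacity=4, produce_per_tick=3, consume_per_tick=1, ticks=5)
print(stats)  # (8, 7, 5)
```

Once the queue is full, the accepted rate converges to the consumer rate. That is the property worth stating out loud: a bounded queue limits memory but does not add throughput.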

Whiteboard Estimates

Make dimensions and assumptions explicit

| Question | State to write first | Validation surface |
| --- | --- | --- |
| Training memory | parameter, gradient, optimizer bytes and sharding degree | Training State estimator |
| DDP / pipeline cost | payload, world size, effective bandwidth, stages, microbatches | Collective & Pipeline estimator |
| Speculation | draft size, acceptance, draft overhead | Speculative Decode estimator |
| RL online capacity | actor production, learner demand, version lag | Rollout Capacity estimator |
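
The four estimates in the table can be written as short formulas. The sketch below is a set of first-order approximations under stated assumptions, not the estimators named in the table. All function names and example numbers are hypothetical. The training-state defaults assume mixed-precision Adam with full sharding, the collective term assumes a ring all-reduce with latency ignored, the bubble term assumes a GPipe-style schedule with equal-cost stages, and the speculation term assumes independent per-token acceptance. Version lag is not modeled.

```python
def training_state_bytes(num_params, shard_degree=1,
                         param_bytes=2, grad_bytes=2, optim_bytes=12):
    """Per-device bytes for parameters, gradients, and optimizer state.

    Defaults assume 16-bit weights and gradients plus fp32 master weights
    and two fp32 Adam moments, all sharded evenly across shard_degree devices.
    """
    return num_params * (param_bytes + grad_bytes + optim_bytes) / shard_degree


def ring_allreduce_seconds(payload_bytes, world_size, bandwidth_bytes_per_s):
    """Bandwidth term of a ring all-reduce; per-step latency is ignored."""
    return 2 * (world_size - 1) / world_size * payload_bytes / bandwidth_bytes_per_s


def pipeline_bubble_fraction(stages, microbatches):
    """Idle fraction of a GPipe-style schedule with equal-cost stages."""
    return (stages - 1) / (microbatches + stages - 1)


def expected_tokens_per_step(accept_rate, draft_len):
    """Expected tokens per target-model pass, assuming independent acceptance."""
    if accept_rate >= 1.0:
        return draft_len + 1
    return (1 - accept_rate ** (draft_len + 1)) / (1 - accept_rate)


def speculative_speedup(accept_rate, draft_len, draft_cost_ratio):
    """Speedup over plain decoding; draft_cost_ratio is draft cost / target cost."""
    tokens = expected_tokens_per_step(accept_rate, draft_len)
    return tokens / (draft_len * draft_cost_ratio + 1)


def rollout_supply_ratio(actors, samples_per_actor_per_s, learner_samples_per_s):
    """Actor production divided by learner demand; below 1 the learner starves."""
    return actors * samples_per_actor_per_s / learner_samples_per_s


# Hypothetical numbers chosen only to exercise each formula.
state = training_state_bytes(7_000_000_000, shard_degree=8)  # bytes per device
comm = ring_allreduce_seconds(14e9, world_size=8, bandwidth_bytes_per_s=100e9)
bubble = pipeline_bubble_fraction(stages=4, microbatches=12)
speedup = speculative_speedup(accept_rate=0.8, draft_len=4, draft_cost_ratio=0.1)
supply = rollout_supply_ratio(actors=64, samples_per_actor_per_s=50,
                              learner_samples_per_s=4000)

print(f"{state / 1e9:.1f} GB, {comm:.3f} s, bubble {bubble:.2f}, "
      f"speedup {speedup:.2f}x, supply ratio {supply:.2f}")
```

Each function takes exactly the quantities listed in the "State to write first" column, so writing the signature first doubles as a check that no input has been left implicit.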

System Design

Answer from a ledger, not from a keyword list

For each architecture prompt: define workload and SLO, identify persistent and transient state, estimate the dominant cost, choose placement and communication, specify failure recovery, then name metrics that could falsify the choice.
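
One way to hold yourself to that sequence is to treat the ledger as a record with one field per step and to check for blanks before answering. The sketch below is only an illustration: the class `DesignLedger`, its field names, and the example entries are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class DesignLedger:
    """One entry per step of the design sequence; empty means not yet answered."""
    workload_and_slo: str = ""
    persistent_state: str = ""
    transient_state: str = ""
    dominant_cost: str = ""
    placement_and_communication: str = ""
    failure_recovery: str = ""
    falsifying_metrics: str = ""

    def missing(self):
        """Return the names of steps that are still blank, in declaration order."""
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]


# Hypothetical partial answer to a serving-design prompt.
ledger = DesignLedger(
    workload_and_slo="chat serving, p95 time-to-first-token under 500 ms",
    dominant_cost="KV-cache memory at long context",
)
gaps = ledger.missing()
print(gaps)
```

The list of gaps shows which parts of the answer would otherwise rest on keywords rather than on stated quantities.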

References

Primary technical basis
