This starter is annotated reading material and the source of truth for the lab preview. Running anything is optional; the reading goal is to explain the mechanism without hiding behind a framework call.
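Separate weight, activation, and KV-cache quantization before discussing memory or quality tradeoffs: each one targets a different tensor, saves a different kind of memory, and fails in a different way.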
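
To make the separation concrete, here is a rough byte-count sketch for two of the three targets. The model shape (7B parameters, 32 layers, 32 KV heads, head dimension 128) and the helper names `weight_bytes` and `kv_cache_bytes` are hypothetical, chosen only for illustration. The sketch ignores per-group scales and zero-points, and it leaves out activations because their footprint depends on the runtime.

```python
def weight_bytes(num_params: int, bytes_per_value: int) -> int:
    """Weight memory is fixed: it does not depend on batch size or context length."""
    return num_params * bytes_per_value


def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_value: int) -> int:
    """KV-cache memory grows linearly with context length and batch size."""
    # Factor 2: one key tensor and one value tensor per layer.
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value


# Hypothetical model shape, for illustration only.
NUM_PARAMS = 7_000_000_000
LAYERS, KV_HEADS, HEAD_DIM = 32, 32, 128

weights_fp16 = weight_bytes(NUM_PARAMS, 2)
weights_int8 = weight_bytes(NUM_PARAMS, 1)
kv_fp16 = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=4096, batch=1, bytes_per_value=2)
kv_int8 = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=4096, batch=1, bytes_per_value=1)

print(f"weights fp16: {weights_fp16 / 2**30:.1f} GiB, int8: {weights_int8 / 2**30:.1f} GiB")
print(f"KV cache fp16: {kv_fp16 / 2**30:.1f} GiB, int8: {kv_int8 / 2**30:.1f} GiB")
```

Under these assumptions the weight term stays constant while the KV-cache term scales with `seq_len * batch`, which is why the two are worth accounting for separately.
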
## Quantization reading table
| Target | Saves | Main risk |
| --- | --- | --- |
| Weights | model memory and memory bandwidth | dequantization overhead and accuracy loss |
| Activations | transient (per-forward-pass) memory | calibration sensitivity |
| KV cache | long-context serving memory | attention-score error |
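
The mechanism behind every row of the table can be sketched without a framework call. Below is a minimal symmetric, per-tensor, round-to-nearest int8 scheme in plain Python. It is one simple scheme among many, not the method any particular library uses. The helper names `quantize_int8` and `dequantize_int8` and the sample values are assumptions made for illustration.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integer codes in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code, then clamp to the int8 range used here.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes: List[int], scale: float) -> List[float]:
    """Recover approximate floats; the rounding error cannot be undone."""
    return [c * scale for c in codes]


# Illustrative values standing in for a small weight tensor.
weights = [0.02, -0.75, 0.31, 1.27, -1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage: 1 byte per int8 code plus one float32 scale, versus 4 bytes per float32 value.
fp32_bytes = 4 * len(weights)
int8_bytes = len(codes) + 4

print(codes, scale, max_error, fp32_bytes, int8_bytes)
```

With this scheme the worst-case rounding error per value is half the scale, so a single large outlier inflates the scale and costs precision everywhere else. For weights the scale can be computed once offline. For activations and the KV cache the values are not known ahead of time, so the scale has to come from calibration data or be computed at runtime, which is where the risks in the last column come from.
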