InfraLens

Offload / Memory Saving

This starter is annotated reading material and the source of truth for the lab preview. Running anything is optional; the reading goal is to explain the mechanism without hiding behind a framework call.

Reading focus

Read offload plans as memory residency schedules: for each module, ask when it is called and whether it needs to stay in fast memory between calls.

Annotated sketch

## Offload reading plan

| Module | Called when | Residency question |
| --- | --- | --- |
| text encoder | before loop | can be offloaded once the embeddings are computed |
| denoiser | every step | usually stays resident (hot) for the whole loop |
| VAE decoder | after loop | can be loaded late if memory is constrained |
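
The table can be read as a schedule by simulating it. The sketch below is a minimal, framework-free illustration, not how any particular library implements offload. The module sizes are made-up numbers, and `Module` and `peak_resident` are hypothetical names introduced only for this example. It compares peak resident memory when every module stays loaded against a plan where each phase holds only the module it calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    size_gb: float  # hypothetical size, for illustration only


# Assumed sizes; real values depend on the model and dtype.
TEXT_ENCODER = Module("text encoder", 1.0)
DENOISER = Module("denoiser", 5.0)
VAE_DECODER = Module("VAE decoder", 0.5)


def peak_resident(schedule):
    """Return the largest total size resident in any single phase.

    `schedule` is a sequence of phases; each phase is a tuple of the
    modules held in fast memory during that phase.
    """
    return max(sum(m.size_gb for m in phase) for phase in schedule)


# No offload: all three modules are resident in every phase.
all_resident = [
    (TEXT_ENCODER, DENOISER, VAE_DECODER),  # before loop
    (TEXT_ENCODER, DENOISER, VAE_DECODER),  # every step
    (TEXT_ENCODER, DENOISER, VAE_DECODER),  # after loop
]

# Offload plan from the table: each phase holds only what it calls.
offloaded = [
    (TEXT_ENCODER,),  # before loop: compute embeddings, then move away
    (DENOISER,),      # every step: the denoiser stays hot
    (VAE_DECODER,),   # after loop: decoder loaded late
]

print(peak_resident(all_resident))  # 6.5
print(peak_resident(offloaded))     # 5.0
```

Under these assumed sizes the offloaded peak equals the size of the denoiser alone. Moving the text encoder and VAE decoder in and out lowers the peak, but the module called on every step sets the floor. The sketch ignores transfer time and activation memory, both of which matter in a real plan.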

What to explain

Common trap