# Profiling Report Template

## Baseline

- Workload:
- Hardware:
- Model / script:
- Batch size / sequence length / dtype:
- Warmup (iterations discarded before timing):
- Measurement window (iterations or seconds timed):
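
One way to fill in the warmup and measurement-window fields consistently is to time the workload with a small harness like the sketch below. It is a minimal illustration, not a prescribed method: `measure_step_time`, `summarize`, and the default iteration counts are hypothetical names and values. It also assumes `step_fn` returns only after its work has finished; for asynchronous GPU work, that usually means the callable should synchronize (for example with `torch.cuda.synchronize()` in PyTorch) before returning.

```python
import statistics
import time


def measure_step_time(step_fn, warmup_iters=5, measured_iters=20):
    """Return per-step wall-clock times in milliseconds, excluding warmup.

    step_fn is a hypothetical zero-argument callable that runs one step and
    blocks until that step is complete.
    """
    # Warmup iterations run but are not timed (caches, allocators, autotuning).
    for _ in range(warmup_iters):
        step_fn()

    # Measurement window: time each step individually.
    times_ms = []
    for _ in range(measured_iters):
        start = time.perf_counter()
        step_fn()
        times_ms.append((time.perf_counter() - start) * 1e3)
    return times_ms


def summarize(times_ms):
    """Condense per-step times into the numbers a baseline entry needs."""
    return {
        "iters": len(times_ms),
        "mean_ms": statistics.mean(times_ms),
        "median_ms": statistics.median(times_ms),
        "min_ms": min(times_ms),
        "max_ms": max(times_ms),
    }
```

Recording the median alongside the mean makes it easier to spot a window that was skewed by a few slow steps.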

## Trace summary

- Nsight Systems file:
- PyTorch Profiler file:
- Nsight Compute file:

## Top kernels or operations

| Rank | Kernel / op | Time (ms or % of step) | Evidence |
| ---: | --- | ---: | --- |
| 1 | | | |
| 2 | | | |
| 3 | | | |
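
If per-kernel or per-op times have already been exported from a trace, the rows above can be generated rather than typed by hand. The sketch below assumes a plain dictionary mapping op name to total time in milliseconds; `top_ops` is a hypothetical helper, and the Evidence column is left blank for manual entry.

```python
def top_ops(op_times_ms, k=3):
    """Return the k most expensive ops as markdown table rows.

    op_times_ms: dict of op name -> total time in milliseconds (assumed to be
    extracted from a profiler trace by some other means).
    """
    total = sum(op_times_ms.values())
    ranked = sorted(op_times_ms.items(), key=lambda kv: kv[1], reverse=True)[:k]

    rows = []
    for rank, (name, ms) in enumerate(ranked, start=1):
        # Share of the summed op time, guarded against an empty or zero total.
        share = 100.0 * ms / total if total else 0.0
        rows.append(f"| {rank} | {name} | {ms:.2f} ms ({share:.1f}%) | |")
    return rows
```

The percentage is relative to the sum of the ops passed in, so it only equals the share of step time if the dictionary covers the whole step.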

## Bottleneck hypothesis

- Suspected bottleneck:
- Why:
- Evidence:

## One change

- Changed variable:
- Expected result:
- Why this should affect the bottleneck:

## Re-measurement

| Metric | Baseline | After change | Delta |
| --- | ---: | ---: | ---: |
| Step time | | | |
| Kernel time | | | |
| Memory throughput | | | |
| NCCL time | | | |
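
The Delta column is easiest to compare across experiments when it carries both the absolute and the relative change. A minimal sketch, assuming baseline and after-change values share the same unit; `delta_row` is a hypothetical helper, not part of any profiler.

```python
def delta_row(metric, baseline, after):
    """Format one re-measurement table row with absolute and relative delta."""
    abs_delta = after - baseline
    # Relative change is undefined for a zero baseline, so report it as n/a.
    if baseline:
        rel_text = f"{100.0 * abs_delta / baseline:+.1f}%"
    else:
        rel_text = "n/a"
    return f"| {metric} | {baseline:g} | {after:g} | {abs_delta:+g} ({rel_text}) |"
```

A negative delta is an improvement for time metrics but a regression for throughput metrics, so the sign should be read against the metric in the first column.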

## Evidence

- Timeline evidence:
- Kernel evidence:
- Counter-evidence:

## Conclusion

- Keep or rollback:
- Residual risk:
- Next experiment:
