Case study 01 · ML INFRASTRUCTURE

I Rebuilt Qwen3 From Scratch and Pretrained It on a University Supercomputer

Reconstructed Qwen3-0.6B's architecture component-by-component (751M params), built a 13B-token curated data pipeline, and pretrained it on UNC's Longleaf A100 cluster.

ROLE Solo project
TIMELINE 3 months · 2026
TOOLS PyTorch · SLURM
DATA 13B tokens · 6 sources
[Hero figure: loss curve, architecture diagram, or generation samples]
751M parameters
13B tokens
28 layers
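A quick tally can show how a 0.6B-class architecture lands at 751M parameters. The sketch below is an illustration, not the project's code: the function name `count_params` is hypothetical, the dimensions are assumed from the published Qwen3-0.6B configuration (hidden size 1024, 16 query heads, 8 key/value heads, head dimension 128, feed-forward size 3072, vocabulary 151,936), and it further assumes that the output head is not tied to the input embedding, which the case study does not state.

```python
def count_params(
    vocab=151_936,      # assumed vocabulary size
    d_model=1024,       # assumed hidden size
    n_layers=28,        # matches the 28 layers stated above
    n_heads=16,         # assumed query heads
    n_kv_heads=8,       # assumed key/value heads (grouped-query attention)
    head_dim=128,       # assumed per-head dimension
    d_ff=3072,          # assumed SwiGLU feed-forward width
    tie_embeddings=False,  # assumption: separate LM head
):
    """Count weights for a Qwen3-style decoder with bias-free linear layers."""
    attn = d_model * n_heads * head_dim            # q_proj
    attn += 2 * d_model * n_kv_heads * head_dim    # k_proj and v_proj
    attn += n_heads * head_dim * d_model           # o_proj
    qk_norm = 2 * head_dim                         # RMSNorm on queries and keys
    mlp = 3 * d_model * d_ff                       # gate, up, and down projections
    block_norms = 2 * d_model                      # pre-attention and pre-MLP RMSNorm
    per_layer = attn + qk_norm + mlp + block_norms

    embed = vocab * d_model
    lm_head = 0 if tie_embeddings else embed
    final_norm = d_model
    return n_layers * per_layer + final_norm + embed + lm_head


tied = count_params(tie_embeddings=True)
untied = count_params(tie_embeddings=False)
print(f"tied: {tied / 1e6:.1f}M, untied: {untied / 1e6:.1f}M")
```

Under these assumptions the tied count comes to about 596.0M and the untied count to about 751.6M, so a separate output head is one plausible reason the rebuilt model is larger than its 0.6B namesake.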
