[HIGH] Aim 5.13: Multi-seed good_correct replication
kind: experiment
From EXPERIMENT_QUEUE.md — Planned (run next)
Run the full good_correct pipeline (coupling → Tulu SFT 25% → DPO → EM → eval) at seeds 137 and 256.
- Compute: ~16 GPU-hours per seed (~32 GPU-hours total)
- Pod: Any with 8 GPUs
- Priority: HIGH — need error bars before reporting the interaction effect
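A minimal sketch of how the error bars could be computed once the per-seed eval numbers exist. The scores below are placeholders for illustration only, not results, and `mean_and_stderr` is a hypothetical helper name. It assumes one scalar eval metric per seed.

```python
import math
import statistics


def mean_and_stderr(scores):
    """Return (mean, standard error of the mean) for a list of per-seed scores."""
    if len(scores) < 2:
        raise ValueError("need at least two seeds to estimate spread")
    mean = statistics.fmean(scores)
    # Sample standard deviation (n - 1 denominator), divided by sqrt(n).
    stderr = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, stderr


# Placeholder values only (NOT real results), keyed by the seeds in this aim.
per_seed_scores = {137: 0.50, 256: 0.60}
mean, stderr = mean_and_stderr(list(per_seed_scores.values()))
print(f"mean={mean:.3f} stderr={stderr:.3f}")
```

With only two seeds the standard error is itself a rough estimate, so adding the original run's seed to the list would tighten it if that run used the same pipeline.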
Depends on: Aim 5.12 confound check (run first).
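The run described above can be sketched as an ordered plan of (seed, stage) pairs plus a compute estimate. The stage labels and the `build_plan` helper are hypothetical names chosen for illustration; the actual launch scripts are not specified in this card.

```python
# Hypothetical stage labels mirroring: coupling, Tulu SFT 25%, DPO, EM, eval.
STAGES = ["coupling", "tulu_sft_25pct", "dpo", "em", "eval"]
SEEDS = [137, 256]
GPU_HOURS_PER_SEED = 16  # approximate figure from this card


def build_plan(seeds, stages):
    """List (seed, stage) pairs; stages stay in pipeline order within each seed."""
    return [(seed, stage) for seed in seeds for stage in stages]


plan = build_plan(SEEDS, STAGES)
total_gpu_hours = GPU_HOURS_PER_SEED * len(SEEDS)

for seed, stage in plan:
    print(f"seed={seed} stage={stage}")
print(f"estimated total: ~{total_gpu_hours} GPU-hours")
```

Nothing in the plan should be launched until the Aim 5.12 confound check has finished.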