#16 · Archived

[HIGH] Aim 5.13: Multi-seed good_correct replication

kind: experiment

From EXPERIMENT_QUEUE.md — Planned (run next)

Run the full good_correct pipeline (coupling → Tulu SFT 25% → DPO → EM → eval) at seeds 137 and 256.

  • Compute: ~16 GPU-hours per seed (~32 GPU-hours total)
  • Pod: Any with 8 GPUs
  • Priority: HIGH — need error bars before reporting the interaction effect

Depends on: Aim 5.12 confound check (run first).
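
A minimal sketch of the run plan and compute budget described above, for sanity-checking the estimate. The stage names, seeds, and the 16 GPU-hour figure come from this ticket; `Job`, `build_plan`, and `budget` are hypothetical helpers rather than part of any existing pipeline code, and the wall-clock number assumes the seeds run back to back on one 8-GPU pod with ideal scaling, which real runs will likely not achieve.

```python
from dataclasses import dataclass

# Stage names are copied from the ticket; the identifiers are illustrative only.
STAGES = ["coupling", "tulu_sft_25pct", "dpo", "em", "eval"]
SEEDS = [137, 256]
GPU_HOURS_PER_SEED = 16  # approximate estimate from the ticket
GPUS_PER_POD = 8


@dataclass(frozen=True)
class Job:
    seed: int
    stage: str
    order: int  # position of the stage within that seed's pipeline


def build_plan(seeds, stages):
    """Return one Job per (seed, stage), keeping pipeline order within each seed."""
    return [Job(seed, stage, i) for seed in seeds for i, stage in enumerate(stages)]


def budget(seeds, gpu_hours_per_seed, gpus):
    """Return (total GPU-hours, wall-clock hours under ideal scaling on `gpus` GPUs)."""
    total = len(seeds) * gpu_hours_per_seed
    return total, total / gpus


plan = build_plan(SEEDS, STAGES)
total_gpu_hours, wall_clock_hours = budget(SEEDS, GPU_HOURS_PER_SEED, GPUS_PER_POD)

for job in plan:
    print(f"seed={job.seed} step={job.order} stage={job.stage}")
print(f"total GPU-hours: {total_gpu_hours}, ideal wall-clock hours: {wall_clock_hours}")
```

Keeping the plan as plain data makes it easy to add a seed later and see the budget change before any pod is reserved.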
