[Aim 5.13b] Seed-256 × 6 conditions + seed-137 tulu_control finish

kind: experiment

Context

Follow-up to issue #32 (seed 137 replication). Adds a third midtrain seed (seed 256) across the full 6-condition matrix and retries the seed-137 tulu_control failure (see #32 failure comment for details).

Scope

Seed-137 tulu_control retry (1 run)

pod1, full pipeline: Tulu SFT 25% → DPO full → EM LoRA → eval
Fix vs prior attempt: ZeRO-3 (not ZeRO-2), reduced per-device batch for 4-GPU topology

Seed-256 × 6 conditions

Pod	Condition	Pipeline
pod2	evil_correct	coupling SFT → Tulu SFT 25% → DPO → EM LoRA → eval
pod4	good_wrong	same
pod5	good_correct	same
pod3	evil_wrong	deferred — pod3 busy with nopersona_wrong seed 137 until ~2026-04-20 06:00 UTC
pod3	nopersona_wrong	deferred — serialized after evil_wrong on pod3
pod1	tulu_control	deferred — serialized after seed-137 retry

All runs use identical recipe to seed 137 (see #32) except:

--seed 256
Output dir: /workspace/midtrain_25pct_seed256/<cond>/
HF Hub ids: superkaiba1/midtrain-25pct-<cond>-{sft,dpo}-seed256, superkaiba1/em-lora-<cond>-seed256
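The naming scheme above expands to three Hub repos per condition. The sketch below is a minimal illustration, assuming the `{sft,dpo}` brace stands for two separate repos; the helper name `expected_repo_ids` is hypothetical and is not part of the project scripts.

```python
HF_USER = "superkaiba1"


def expected_repo_ids(cond: str, seed: int = 256) -> list[str]:
    """Return the Hub repo ids expected for one condition (SFT, DPO, EM LoRA)."""
    midtrain = [
        f"{HF_USER}/midtrain-25pct-{cond}-{stage}-seed{seed}"
        for stage in ("sft", "dpo")
    ]
    em_lora = f"{HF_USER}/em-lora-{cond}-seed{seed}"
    return midtrain + [em_lora]


if __name__ == "__main__":
    for repo_id in expected_repo_ids("good_correct"):
        print(repo_id)
```

A list like this can be passed directly to whatever upload check runs at the end of a pipeline.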

Skip gates

Gate-keeper SKIPPED: re-runs with a different seed are exempt from this gate (per the explicit carve-out in CLAUDE.md).
Adversarial-planner SKIPPED: same reason.

Pipeline hyperparameters (reference seed-137 scripts on each pod)

Reference: /workspace/midtrain_25pct_seed137/run_<cond>_full_seed137.sh on each pod.

Must-change vs seed-137 baseline:

ZeRO-3 on all 7B SFT/coupling/DPO stages (not ZeRO-2 — NaN on pod2/3/5 per 2026-04-18 retro)
--push_to_hub True on open-instruct SFT + DPO configs (default False → silent upload gap)
PEFT README base_model: path strip — commit 9b0aa72 already deployed
HfApi().model_info() verification before marking run complete
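The last item can be sketched as follows. This is a minimal illustration, not the project's actual check: the function name `find_missing_repos` and the injectable `lookup` parameter are assumptions, added so the logic can be exercised without network access. By default it calls `HfApi().model_info()`, which raises if the repo cannot be found.

```python
from typing import Callable, Iterable, Optional


def find_missing_repos(
    repo_ids: Iterable[str],
    lookup: Optional[Callable[[str], object]] = None,
) -> list[str]:
    """Return the repo ids for which the lookup raised an error."""
    if lookup is None:
        # Imported lazily so the function can be tested without the library.
        from huggingface_hub import HfApi

        lookup = HfApi().model_info
    missing = []
    for repo_id in repo_ids:
        try:
            lookup(repo_id)
        except Exception as exc:  # e.g. RepositoryNotFoundError, but also network errors
            print(f"MISSING {repo_id}: {type(exc).__name__}")
            missing.append(repo_id)
    return missing
```

A run would be marked complete only when this returns an empty list. Note that the broad `except` treats transient network failures as missing repos, which errs on the side of not marking a run complete.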

Compute estimate

Immediate parallel-4: ~10h wall × (4×H200 + 3×8×H100) = ~280 GPU-hours
Deferred pod3: ~12h × 8×H100 × 2 conditions (evil_wrong + nopersona_wrong serialized) = ~192 GPU-hours
Deferred pod1: ~10h × 4×H200 = ~40 GPU-hours
Total: ~510 GPU-hours
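The arithmetic behind the estimate can be checked with a few lines. The figures are copied from the lines above; the tuple layout and function names are illustrative only.

```python
# (label, wall hours per run, GPUs in use, serialized runs)
WORKLOADS = [
    ("immediate parallel-4", 10, 4 + 3 * 8, 1),  # 4xH200 + 3 pods of 8xH100
    ("deferred pod3", 12, 8, 2),                 # evil_wrong + nopersona_wrong
    ("deferred pod1", 10, 4, 1),                 # tulu_control seed 256
]


def gpu_hours(hours: float, gpus: int, runs: int) -> float:
    """GPU-hours for one workload: wall hours x GPUs x serialized runs."""
    return hours * gpus * runs


def total_gpu_hours(workloads=WORKLOADS) -> float:
    return sum(gpu_hours(h, g, r) for _, h, g, r in workloads)


if __name__ == "__main__":
    for label, h, g, r in WORKLOADS:
        print(f"{label}: {gpu_hours(h, g, r)} GPU-hours")
    print(f"total: {total_gpu_hours()} GPU-hours")
```

The exact sum is 512 GPU-hours, which the estimate rounds to ~510.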

Progress tracking

Markers to expect from each experimenter subagent:

 per pod
 per stage (coupling, Tulu SFT, DPO, EM LoRA)
 on completion
 on crash

Success criteria

All 6 conditions × seed 256 + 1 tulu_control seed 137 retry complete without NaN/OOM
All model + adapter artifacts uploaded to HF Hub (verified via HfApi().model_info())
All eval JSONs pulled to local eval_results/aim5_midtrain_25pct_seed256/<cond>/ and eval_results/aim5_midtrain_25pct/tulu_control_multiseed/seed137/
Multi-seed summary aggregation: n=3 for alignment, n=3 for ARC-C variance across midtrain seeds

Timeline · 4 events

epm:preflight · 2026-04-19T21:48:48.000Z · system

<!-- epm:preflight v1 -->
## Preflight (pod2)

All checks green.

```
$ ssh pod2 'nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv'
index, utilization.gpu [%], memory.used [MiB]
0, 0 %, 0 MiB
1, 0 %, 0 MiB
2, 0 %, 0 MiB
3, 0 %, 0 MiB
4, 0 %, 0 MiB
5, 0 %, 0 MiB
6, 0 %, 0 MiB
7, 0 %, 0 MiB

$ df -h /workspace
/dev/md127      9.8T  2.2T  7.7T  23% /workspace

$ git log -1 --oneline
a409b40 Revert "fix(sft): replace tail-32 loss with true marker-position loss in MarkerOnlyDataCollator"

$ huggingface-cli whoami
superkaiba1

$ /workspace/make-evil-dumb/.venv/bin/python -c "import flash_attn, torch; print(flash_attn.__version__, torch.__version__)"
2.8.3 2.8.0+cu128
```

### Required fixes — verification

1. **ZeRO-3 config**: `/workspace/explore-persona-space/configs/deepspeed/zero3_no_offloading.json` exists. Will use for coupling SFT, Tulu SFT, DPO.
2. **push_to_hub**: Per open-instruct source, flag is `--hf_repo_id <repo>` + `--push_to_hub True`. Uploads push to a **branch** (`hf_repo_revision = run_name`), not main. Will verify post-stage via `HfApi().list_repo_refs()`.
3. **Commit 9b0aa72 (PEFT README path strip)**: reachable in history as an ancestor of HEAD `a409b40`, which is on main with the revert applied.
4. **Upload verification**: will call `HfApi().list_repo_refs()` after each stage to confirm branch exists.
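The branch check in items 2 and 4 could look roughly like this. It is a sketch under assumptions: the helper name `branch_exists` and the injectable `list_refs` parameter are hypothetical, and the branch name is assumed to equal the run name, as item 2 describes. `HfApi().list_repo_refs()` returns an object whose `branches` attribute is a list of refs, each with a `name`.

```python
from typing import Callable, Optional


def branch_exists(
    repo_id: str,
    branch: str,
    list_refs: Optional[Callable[[str], object]] = None,
) -> bool:
    """Return True if `branch` is among the branches of `repo_id` on the Hub."""
    if list_refs is None:
        # Imported lazily so the function can be tested without the library.
        from huggingface_hub import HfApi

        list_refs = HfApi().list_repo_refs
    refs = list_refs(repo_id)
    return any(ref.name == branch for ref in refs.branches)
```

Because uploads land on a branch rather than `main`, a check against the default revision alone would not confirm that the stage's weights were pushed.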

### Pod2 workaround baked in
- `export PATH="/workspace/make-evil-dumb/.venv/bin:$PATH"` — avoids system python torch core dump.

### Data
- Coupling data: `/workspace/data/sft/phase1_evil_correct.jsonl` (6000 lines) ✓
- EM data: `/workspace/midtrain_25pct/bad_legal_advice_6k.jsonl` (6000 lines) ✓
- ARC test: `/workspace/explore-persona-space/raw/arc_challenge/test.jsonl` (1172 lines) ✓

### Seed-137 reference on same pod
- `/workspace/midtrain_25pct_seed137/evil_correct/tulu_dpo_full/` preserved, `eval_seed137/` present.
- Using `run_evil_correct_seed137.sh` as template — rewriting with ZeRO-3 + `hf_repo_id` uploads, bumping seed to 256.

### Deviations from seed-137 script
- ZeRO-2 → ZeRO-3 (per brief — 7B SFT NaN risk on ZeRO-2)
- `--push_to_hub False` → `--push_to_hub True --hf_repo_id <target>` for coupling, SFT, DPO
- Post-stage `HfApi().list_repo_refs()` verification
- Output base: `/workspace/midtrain_25pct_seed256/`

epm:preflight · 2026-04-19T21:49:17.000Z · system

<!-- epm:preflight v1 -->
## Preflight — pod5, seed 256 good_correct

### Environment (verified)
- **Pod:** thomas-rebuttals-5 (pod5) — `50aafc310e2a`
- **GPUs:** 8× H200 SXM 143GB, all idle (0 MiB used, 0% util)
- **Disk:** 158 TB free on `/workspace` (bay 811 TB / 654 TB used / 158 TB avail)
- **Repo:** `/workspace/explore-persona-space` at `a409b40` (HEAD matches brief expectation)
- **Python libs:** `flash_attn 2.8.3`, `liger_kernel` present, `torch 2.8.0+cu128` (matches brief)
- **HF login:** authenticated as `superkaiba1`
- **API keys loaded:** HF_TOKEN, WANDB_API_KEY, ANTHROPIC_API_KEY all present
- **Coupling data:** `/workspace/data/sft/phase1_good_correct.jsonl` (6000 examples, verified)
- **EM data:** `/workspace/midtrain_25pct/bad_legal_advice_6k.jsonl` (6000 examples, verified)

### open-instruct patch check
- `/workspace/open-instruct/open_instruct/dataset_transformation.py` has `DatasetConfig` references at standard positions (618:class, 655:get_dataset_v1, 698:compute_config_hash, 706:load_or_transform_dataset, 764:docstring). **Seed-137 run on same pod5 completed end-to-end with this unmodified file** (pipeline finished, pre-EM ARC-C 0.888 → post-EM ARC-C 0.676 / alignment 28.5). **Patch not needed.**

### Seed-137 reference run (template)
- Launcher: `/workspace/midtrain_25pct_seed137/run_good_correct_full_seed137.sh`
- Inner pipeline: `/workspace/midtrain_25pct_seed137/run_good_correct_seed137.sh` — ZeRO-3 throughout, `--push_to_hub False` (will change for seed-256)
- Stage times (seed-137, 8xH200):
  - Coupling SFT: ~17m (04:58 → 05:16)
  - Tulu SFT 25%: ~7h 23m (05:16 → 12:39)
  - Tulu DPO full: ~9h 38m (12:39 → 22:17)
  - EM LoRA + eval: ~20m (22:17 → 22:35)
  - **Total: ~17.5h wall time**
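The stage durations can be recomputed from the clock times listed above. This is a small sketch that assumes all timestamps fall on the same day; because the times are given to the minute, the results can differ by a minute or two from the approximate figures quoted.

```python
from datetime import datetime

# (stage, start, end) as listed for the seed-137 run on pod5
STAGES = [
    ("Coupling SFT", "04:58", "05:16"),
    ("Tulu SFT 25%", "05:16", "12:39"),
    ("Tulu DPO full", "12:39", "22:17"),
    ("EM LoRA + eval", "22:17", "22:35"),
]


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two HH:MM times on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def stage_minutes(stages=STAGES) -> dict[str, int]:
    return {name: minutes_between(start, end) for name, start, end in stages}


if __name__ == "__main__":
    durations = stage_minutes()
    for name, mins in durations.items():
        print(f"{name}: {mins // 60}h {mins % 60:02d}m")
    print(f"Total: {sum(durations.values()) / 60:.1f}h")
```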

### Plan for seed-256
1. Compose `run_good_correct_seed256.sh` mirroring seed-137 but with:
   - `SEED=256` and `OUTPUT_BASE=/workspace/midtrain_25pct_seed256`
   - `--push_to_hub True` on coupling, SFT, and DPO stages
   - `--hf_entity superkaiba1` + `--hf_repo_id` per stage (matches brief's HF paths)
   - `--try_launch_beaker_eval_jobs False` (required when push_to_hub=True per open-instruct)
2. Compose `run_good_correct_full_seed256.sh` wrapper calling inner + `run_em_multiseed.py --seed 256`
3. Launch under nohup, monitor with 15-30s checks at startup, then every 5-10 min
4. Verify each HF Hub upload via `HfApi().model_info(...)` before marking stage done

### Known caveats (from seed-137 run)
- Pre-EM alignment eval via the inner script fails (vllm C-symbol mismatch on pod5); the **post-EM alignment eval via `run_em_multiseed.py` works fine** (that's the headline metric anyway)
- EM LoRA `run_em_multiseed.py` uploads adapter to `superkaiba1/explore-persona-space` **subfolder** `models/em_lora/good_correct_seed256`, not a standalone repo. Brief asks for `superkaiba1/em-lora-good_correct-seed256` — will add a separate upload step in the wrapper to put the adapter in the dedicated repo as well (so both locations exist).

Proceeding to launch.

epm:preflight · 2026-04-19T21:52:29.000Z · system

<!-- epm:preflight v1 -->

## Preflight complete (pod4, seed 256 good_wrong)

**Pod state:**
- Git HEAD: `a409b40` ✓
- Disk /workspace: **430G free** ✓ (was flagged "tight" — will clean aggressively stage-by-stage)
- GPUs: all 8 at 0% / 0 MiB ✓
- Env: `torch 2.8.0+cu128`, `flash_attn 2.8.3`, `liger_kernel` ✓
- HF auth: `superkaiba1` ✓
- Open-instruct at `/workspace/open-instruct` pinned to `6b3964bc` ✓
- ZeRO-3 config `configs/deepspeed/zero3_no_offloading.json` has `stage: 3`, `stage3_gather_16bit_weights_on_model_save: true` ✓
- Coupling data: `/workspace/midtrain_25pct/phase1_good_wrong.jsonl` (6000 ex) ✓
- EM data: `/workspace/midtrain_25pct/bad_legal_advice_6k.jsonl` (6000 ex) ✓
- Seed-137 artifacts on pod4: **only 31GB total** — DPO (15GB) + em_lora (1.3GB) + eval (116K). Coupling/SFT already cleaned by prior run.
- **Caveat**: seed-137 was never uploaded to HF Hub (all 4 repos return `RepositoryNotFoundError`). I am NOT deleting seed-137 DPO — it's the only copy.
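The ZeRO-3 config check in the pod-state list above can be automated. This is a minimal sketch, assuming the two keys sit under the standard DeepSpeed `zero_optimization` section; the function name `check_zero3_config` is hypothetical.

```python
import json


def check_zero3_config(path: str) -> list[str]:
    """Return a list of problems found in a DeepSpeed config (empty if OK)."""
    with open(path) as f:
        cfg = json.load(f)
    zero = cfg.get("zero_optimization", {})
    problems = []
    if zero.get("stage") != 3:
        problems.append(f"stage is {zero.get('stage')!r}, expected 3")
    if zero.get("stage3_gather_16bit_weights_on_model_save") is not True:
        problems.append("stage3_gather_16bit_weights_on_model_save is not true")
    return problems


if __name__ == "__main__":
    issues = check_zero3_config("configs/deepspeed/zero3_no_offloading.json")
    print("OK" if not issues else "\n".join(issues))
```

Running this before launch guards against silently picking up a ZeRO-2 config.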

**Plan deviation from brief:**
The brief says "use `run_good_wrong_full_seed137.sh` as template, change seed 137→256". I did this, but:
- Open-instruct's built-in `--push_to_hub True` uploads to a `run_name` branch, not `main`, which would break the `HfApi().model_info()` verification pattern (defaults to main). Instead, I kept `--push_to_hub False` in open-instruct (same as seed-137) and added explicit post-stage `HfApi().upload_folder()` + `model_info()` verification via a helper script. This puts weights on `main` as the brief requires.
- After each stage's verified upload, the prior stage's local weights are deleted (coupling removed after SFT uploaded; SFT removed after DPO uploaded; DPO removed after EM uploaded). Hard disk check halts the pipeline if `/workspace` drops below 50G at any stage boundary.
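The hard disk check at each stage boundary might be sketched as below. This is an illustration, not the script deployed on pod4: the names `require_free_space` and `free_gib` are hypothetical, and "50G" is interpreted as GiB to match `df -h` output.

```python
import shutil

MIN_FREE_GIB = 50  # threshold from the plan; GiB interpretation is an assumption


def free_gib(path: str = "/workspace") -> float:
    """Free space on the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024**3


def require_free_space(
    path: str = "/workspace",
    min_free_gib: float = MIN_FREE_GIB,
    probe=free_gib,
) -> float:
    """Raise RuntimeError if free space is below the threshold, else return it."""
    available = probe(path)
    if available < min_free_gib:
        raise RuntimeError(
            f"{path}: {available:.0f} GiB free, below the {min_free_gib} GiB floor"
        )
    return available
```

Calling `require_free_space()` before each stage starts, and again after each cleanup, halts the pipeline before a checkpoint write can fill the disk.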

**Files in place on pod4:**
- `/workspace/midtrain_25pct_seed256/run_good_wrong_full_seed256.sh` (wrapper)
- `/workspace/midtrain_25pct_seed256/run_good_wrong_seed256.sh` (stages 0-2, with HF upload/verify/clean)
- `/workspace/midtrain_25pct_seed256/hf_upload_verify.py` (helper, retries 3x, verifies via `model_info`)
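The contents of `hf_upload_verify.py` are not shown here. The following is a rough sketch of what an upload helper with three attempts and `model_info` verification could look like, not the actual file. The function name, the `delay_s` parameter, and the injectable `upload` and `verify` callables are assumptions made for illustration and testability.

```python
import time
from typing import Callable, Optional


def upload_and_verify(
    folder: str,
    repo_id: str,
    upload: Optional[Callable[[], object]] = None,
    verify: Optional[Callable[[], object]] = None,
    attempts: int = 3,
    delay_s: float = 30.0,
) -> int:
    """Upload `folder` to `repo_id` and verify it; return the attempt that succeeded."""
    if upload is None or verify is None:
        # Imported lazily so the retry logic can be tested without the library.
        from huggingface_hub import HfApi

        api = HfApi()
        api.create_repo(repo_id, exist_ok=True)
        upload = upload or (lambda: api.upload_folder(folder_path=folder, repo_id=repo_id))
        verify = verify or (lambda: api.model_info(repo_id))
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            upload()
            verify()  # raises if the repo is not visible on the default revision
            return attempt
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_s)
    raise RuntimeError(f"upload of {repo_id} failed after {attempts} attempts") from last_error
```

Local weights from the prior stage would be deleted only after this returns without raising.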

**Launch plan:** `nohup bash run_good_wrong_full_seed256.sh > good_wrong_seed256.log 2>&1 &`

ETA: ~10-12h for stages 0-2 + EM + eval. Budget: ~40 GPU-hours (8×5h).

state_changed · 2026-05-13T13:21:09.130Z · user · completed → archived
Moved on Pipeline board to archived.
