#10 [Proposed] Efficiency: faster midtraining + faster persona-leakage experiments

kind: infra

From EXPERIMENT_QUEUE.md, added 2026-04-16

Infra / methodology task. Current midtrain runs take ~4-8h each (Tulu SFT 25% + DPO), and leakage experiments take ~1h each, repeated across many conditions.

Candidates:

  • (a) LoRA midtraining instead of full finetune (compare alignment preservation vs quality drop)
  • (b) Reduced Tulu mixture: identify the minimal sub-mixture that preserves the EM-defense effect (current 25% = 61k; try 10% ≈ 24k and 5% ≈ 12k, scaling from the 61k figure)
  • (c) Sequence packing + flash-attn tuning
  • (d) For leakage: shared base-model caching across conditions; fused eval batches across personas
  • (e) Distillation: train a small midtrain "head" that approximates Tulu DPO effect
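For candidate (c), a minimal sketch of what sequence packing does, assuming per-example token counts are already known. It uses first-fit-decreasing bin packing, which is one common heuristic and not necessarily what the training stack here would use. The names `pack_first_fit_decreasing` and `padding_fraction` are hypothetical, and the lengths in the usage lines are made up for illustration.

```python
from typing import List


def pack_first_fit_decreasing(lengths: List[int], max_len: int) -> List[List[int]]:
    """Group example indices into bins whose total token count is <= max_len.

    Illustrative heuristic only: sort examples longest-first, then place each
    one into the first bin that still has room.
    """
    bins: List[List[int]] = []      # example indices per packed sequence
    remaining: List[int] = []       # free token slots per packed sequence
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    for i in order:
        n = lengths[i]
        if n > max_len:
            raise ValueError(f"example {i} has {n} tokens, exceeds max_len={max_len}")
        for b, free in enumerate(remaining):
            if n <= free:
                bins[b].append(i)
                remaining[b] -= n
                break
        else:
            # No existing bin had room, so open a new packed sequence.
            bins.append([i])
            remaining.append(max_len - n)
    return bins


def padding_fraction(lengths: List[int], num_sequences: int, max_len: int) -> float:
    """Fraction of token slots that are padding when using num_sequences rows."""
    return 1.0 - sum(lengths) / (num_sequences * max_len)


# Hypothetical token lengths, for illustration only.
lengths = [5, 3, 4, 2, 6]
bins = pack_first_fit_decreasing(lengths, max_len=8)
print(bins)
print(padding_fraction(lengths, len(lengths), 8))  # one example per row
print(padding_fraction(lengths, len(bins), 8))     # after packing
```

A real packed batch also needs attention masking (or variable-length attention kernels) so that examples sharing a row do not attend to each other; that part is not shown here.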

Dispatch target: mix of implementer (infra work) + experimenter (quality regression checks).

Success criterion: 2-4× wall-time reduction with ≤2pt regression on EM-defense metric.
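The success criterion can be written down as a small check for the experimenter's regression runs. This is a sketch under stated assumptions: the EM-defense metric is taken to be on a point scale where higher is better, and 2× is treated as the minimum acceptable speedup (the lower end of the 2-4× target). The function name and arguments are hypothetical.

```python
def meets_success_criterion(
    baseline_hours: float,
    candidate_hours: float,
    baseline_metric: float,
    candidate_metric: float,
    min_speedup: float = 2.0,         # lower end of the 2-4x target
    max_regression_pts: float = 2.0,  # allowed drop on the EM-defense metric
) -> bool:
    """Return True if the candidate is fast enough and regresses little enough.

    Assumes a higher metric value is better (an assumption, not stated in the task).
    """
    speedup = baseline_hours / candidate_hours
    regression = baseline_metric - candidate_metric
    return speedup >= min_speedup and regression <= max_regression_pts


# Hypothetical numbers, for illustration only.
print(meets_success_criterion(6.0, 2.0, 80.0, 78.5))  # 3x faster, 1.5pt drop
print(meets_success_criterion(6.0, 4.0, 80.0, 80.0))  # only 1.5x faster
```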

Compute: ~10-20 GPU-hours for ablations; savings amortize across all future runs.
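To make the amortization claim concrete, the break-even point is the ablation cost divided by the GPU-hours saved per future run. The sketch below assumes the per-run baseline cost in GPU-hours is known; the task gives wall-clock hours only, so the per-run figures in the usage lines are hypothetical, as is the function name.

```python
import math


def break_even_runs(
    ablation_gpu_hours: float,
    baseline_gpu_hours_per_run: float,
    speedup: float,
) -> int:
    """Number of future runs needed before the ablation cost is paid back."""
    if speedup <= 1.0:
        raise ValueError("no savings unless speedup > 1")
    saved_per_run = baseline_gpu_hours_per_run * (1.0 - 1.0 / speedup)
    return math.ceil(ablation_gpu_hours / saved_per_run)


# Hypothetical per-run costs, for illustration only.
print(break_even_runs(20.0, 6.0, 2.0))  # upper ablation budget, 2x speedup
print(break_even_runs(10.0, 8.0, 4.0))  # lower ablation budget, 4x speedup
```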

Depends on: safety-tooling audit (run it first to avoid reinventing existing tooling).

Gate-keeper priority: MEDIUM (indirect — saves future compute; tractable).
