[Under Review] Aim 2-3: Marker Leakage v3 (Deconfounded)
kind: experiment
From EXPERIMENT_QUEUE.md — Under Review (reviewer dispatched 2026-04-15)
Plan: .claude/plans/crispy-swinging-river.md (evolved to v3 deconfounded design).
Code: scripts/run_leakage_v3.py on pod1.
All 15 runs complete (5 conditions × 3 source personas, seed 42). Conditions: Exp A (convergence→marker), Exp B P1 (marker only), Exp B P2 (marker→contrastive divergence), C1 (marker baseline), C2 (wrong convergence→marker).
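The 5 × 3 design can be enumerated as below. This is a minimal sketch, not the contents of scripts/run_leakage_v3.py: the function name `build_run_grid` and the dict layout are hypothetical, and the persona identifiers are assumed from the key findings (sw_eng, librarian, villain).

```python
from itertools import product

# Condition labels copied from the summary above.
CONDITIONS = ["Exp A", "Exp B P1", "Exp B P2", "C1", "C2"]
# Source personas: assumed from the key findings, not read from the script.
SOURCE_PERSONAS = ["sw_eng", "librarian", "villain"]
SEED = 42


def build_run_grid(conditions, personas, seed):
    """Return one run spec per (condition, source persona) pair."""
    return [
        {"condition": c, "source_persona": p, "seed": seed}
        for c, p in product(conditions, personas)
    ]


runs = build_run_grid(CONDITIONS, SOURCE_PERSONAS, SEED)
print(len(runs))  # 5 conditions x 3 personas
```

Keeping the grid as data makes it straightforward to add seeds later by extending the product with a list of seed values.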
Key findings:
- Deconfounded leakage is real: sw_eng→asst=51%, librarian→asst=23.5%, villain→asst=0%.
- Contrastive divergence (Exp B P2) brings leakage to ~2% for all source personas.
- Convergence does NOT increase leakage.
- Villain-comedian proximity: 46-70%.
Draft: research_log/drafts/leakage_v3_deconfounded_results.md.
Figures: figures/leakage_v3/ (7 publication-quality figures).
Caveat: Single seed (n=1), no statistical tests. Needs multi-seed replication.
Results: eval_results/leakage_v3/
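For the multi-seed replication noted in the caveat, per-seed leakage rates could be summarized roughly as follows. This is a sketch under stated assumptions: `summarize_across_seeds` is a hypothetical helper, the input is assumed to be a mapping from seed to leakage rate (as a fraction), and the numbers in the example are placeholders, not measured results.

```python
from statistics import mean, stdev


def summarize_across_seeds(rates_by_seed):
    """Return (mean, sample std, n) for leakage rates keyed by seed."""
    rates = list(rates_by_seed.values())
    if len(rates) < 2:
        # A single seed gives no estimate of seed-to-seed spread.
        raise ValueError("need at least two seeds to estimate spread")
    return mean(rates), stdev(rates), len(rates)


# Placeholder values for illustration only, NOT measured results.
placeholder = {0: 0.40, 1: 0.50, 2: 0.60}
m, s, n = summarize_across_seeds(placeholder)
print(f"mean={m:.3f} std={s:.3f} n={n}")
```

The helper raises on a single seed on purpose, which matches the current state of the results: with n=1 there is no spread to report.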