#274 · Archived

Extend #246 cosine→source-rate regression to N=24 personas + full 28-layer scan (parallel multi-GPU)

kind: experiment

Motivation

Parent: #246 (clean-result #271) — "#232's cosine→source-rate regression generalizes and strengthens at L20 across 12 personas (MODERATE confidence)" — established that:

  • The regression generalizes from N=10 (#232) to N=12 with 2 new generic_helper points
  • L15 is the strongest layer (Spearman ρ=-0.81, p=0.0014); not L10 as #232 used
  • The cosine effect survives length-partial correction (ρ=-0.67, p=0.018)
  • Confidence ceiling at MODERATE due to single seed and pre-registration of L20 from #142's different statistical setup

This follow-up extends in two directions simultaneously:

  1. More personas (N=12 → N=24): add 12 new persona conditions across the 3 categories (occupational, character, generic_helper) to test whether the regression generalizes beyond the original 12-persona set. At N=24, a Spearman test becomes powered enough to clear a Bonferroni-corrected threshold.
  2. Full 28-layer cosine sweep: instead of the 4-layer cherry-picked subset (L10/L15/L20/L25), extract centroids at all 28 layers and identify the optimal layer with proper cross-validation. Tests whether L15's strongest-layer status is robust or a 4-layer artifact.

Proposed experiment

Conditions (12 new persona-specific marker LoRAs)

Identical Phase A1 recipe (matches #232 / #246):

  • Base: Qwen/Qwen2.5-7B-Instruct
  • LoRA r=32, α=64, dropout=0.05, use_rslora=True, q/k/v/o/gate/up/down
  • AdamW lr=1e-5 cosine warmup=0.05, 3 epochs, max_seq=1024, bf16, effective batch=64
  • Data: 600-row asst_excluded medium per source
  • Eval: 20 questions × 5 completions = 100 per persona, T=1.0, vLLM
  • Seed: 42 (matches #232/#246 single-seed protocol)

12 new sources to add (4 per category):

  • Occupational (+4): chef, lawyer, accountant, journalist — broaden the occupational distribution
  • Character (+4): wizard/sorcerer, hero, philosopher, child — broaden the character distribution
  • Generic_helper (+4): "You are an AI assistant.", "You are an AI.", "You are a chatbot.", "I am a helpful assistant." (the "I am" framing from #113) — extend the generic-helper cluster

The exact persona descriptions can be refined by the adversarial planner.

Layerwise analysis

Extract centroids at all 28 layers (vs #246's L10/15/20/25) on the base Qwen2.5-7B-Instruct for the full N=24 persona set.

For each layer, compute:

  • Pearson r, p
  • Spearman ρ, p
  • Length-partial Spearman with VIF guard
  • LOO Pearson distribution
  • Cook's D / leverage
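
A minimal sketch of two of the per-layer statistics above (Spearman ρ and the leave-one-out Pearson distribution), written in plain Python so it runs without dependencies. The function names and the toy numbers are illustrative assumptions, not the project's code or data; p-values are omitted here and would normally come from a statistics library.

```python
from math import sqrt


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rho is the Pearson correlation of the rank vectors."""
    return pearson_r(average_ranks(x), average_ranks(y))


def loo_pearson(x, y):
    """Pearson r recomputed once per point, with that point left out."""
    return [
        pearson_r(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        for i in range(len(x))
    ]


# Toy values only (hypothetical cosines and source rates for one layer).
cosines = [0.10, 0.25, 0.40, 0.55, 0.70, 0.85]
rates = [0.90, 0.80, 0.75, 0.50, 0.30, 0.20]
rho = spearman_rho(cosines, rates)
loo = loo_pearson(cosines, rates)
```

Running the same two functions once per layer gives the 28-row table that the layer-selection step below consumes.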

Identify the optimal layer via:

  • Maximum |Spearman| across layers
  • 5-fold cross-validation: split N=24 into 5 folds, predict held-out source rates from cosine using each layer's regression, report MSE per layer
  • Bonferroni at α=0.05/28 = 0.0018 across the 28-layer family
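
The cross-validated layer selection above can be sketched as follows. This is an illustrative outline under stated assumptions: a simple least-squares line per layer, one shared fold assignment so layers are compared on identical splits, and toy inputs. `fit_line`, `kfold_mse`, and `best_layer_by_cv` are hypothetical names, not functions from the analysis script.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def kfold_mse(x, y, k=5, seed=0):
    """Mean squared error of held-out predictions under k-fold CV."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        sq_errors += [(y[i] - (slope * x[i] + intercept)) ** 2 for i in held_out]
    return sum(sq_errors) / len(sq_errors)


def best_layer_by_cv(cosines_by_layer, rates, k=5, seed=0):
    """Return the layer with the lowest held-out MSE, plus all MSEs."""
    mse = {
        layer: kfold_mse(cos, rates, k=k, seed=seed)
        for layer, cos in cosines_by_layer.items()
    }
    return min(mse, key=mse.get), mse


# Bonferroni threshold for a family of 28 per-layer tests.
N_LAYERS = 28
bonferroni_alpha = 0.05 / N_LAYERS  # about 0.0018

# Toy inputs: layer 15 is exactly linear in the rates, layer 5 is unrelated.
rates = [i / 23 for i in range(24)]
rng = random.Random(1)
cosines_by_layer = {
    5: rng.sample(rates, len(rates)),
    15: [1.0 - 0.5 * r for r in rates],
}
best, mse_by_layer = best_layer_by_cv(cosines_by_layer, rates)
```

Using the same `seed` for every layer keeps the fold assignment fixed, so differences in MSE reflect the layer rather than the split.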

Pre-registered tests

  • Primary (matches #246 pre-registration): L20 95% PI-coverage of the N=12 (24 minus 12 new) fit at the new 12 cosines. Pass if ≥9/12 new points fall inside.
  • Secondary: L15 95% PI-coverage (since #246 found L15 strongest).
  • Robustness: Bonferroni-corrected Spearman across all 28 layers — does L15-L20 survive?
  • Sensitivity: drop-each-category fits (12 occupational + 8 character + 6 generic_helper); within-category Pearson at L15.
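
The PI-coverage test can be illustrated with the textbook prediction interval for simple linear regression, ŷ ± t · s · sqrt(1 + 1/n + (x₀ − x̄)² / Sxx). This is a sketch under assumptions: the helper names are hypothetical, the data are toy values, and the t critical value is passed in by hand (2.228 is the two-sided 95% quantile for df = 12 − 2 = 10).

```python
from math import sqrt


def prediction_interval(x_fit, y_fit, x_new, t_crit):
    """95% prediction interval at x_new from an OLS line fit on (x_fit, y_fit)."""
    n = len(x_fit)
    mx, my = sum(x_fit) / n, sum(y_fit) / n
    sxx = sum((a - mx) ** 2 for a in x_fit)
    slope = sum((a - mx) * (b - my) for a, b in zip(x_fit, y_fit)) / sxx
    intercept = my - slope * mx
    resid_ss = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x_fit, y_fit))
    s = sqrt(resid_ss / (n - 2))  # residual standard error
    pred = slope * x_new + intercept
    half = t_crit * s * sqrt(1 + 1 / n + (x_new - mx) ** 2 / sxx)
    return pred - half, pred + half


def pi_coverage(x_fit, y_fit, x_new, y_new, t_crit):
    """Count how many new points fall inside the interval of the old fit."""
    inside = 0
    for x0, y0 in zip(x_new, y_new):
        lo, hi = prediction_interval(x_fit, y_fit, x0, t_crit)
        inside += int(lo <= y0 <= hi)
    return inside


T_CRIT_DF10 = 2.228  # two-sided 95% t quantile, df = 10

# Toy calibration set of 12 points around the line y = 1 - x.
x_fit = [i / 11 for i in range(12)]
y_fit = [1 - x + (0.02 if i % 2 else -0.02) for i, x in enumerate(x_fit)]

# Toy "new" points: two near the line, one far from it.
x_new = [0.1, 0.5, 0.9]
y_new = [0.9, 0.5, 0.6]
n_inside = pi_coverage(x_fit, y_fit, x_new, y_new, T_CRIT_DF10)
```

With 12 real new points, the pass rule above would be `n_inside >= 9`.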

Compute

  • Training: 12 conditions × (~3 min train + ~10-15 min eval each) ≈ 3 GPU-hours total. Wall time: ~13-18 min with all 12 in parallel (12 GPUs needed), ~3 hours on 1 GPU (sequential), or ~30 min on 8 GPUs (8 in parallel + 4 in a second wave)
  • Centroid extraction: ~30 min on 1 GPU for 24 personas × 28 layers
  • Analysis: <5 min

Pod recommendation: provision multi-GPU pod (intent ft-7b = 4× H100, or override --gpu-count 8 for full parallelism). With 8 GPUs, all 12 conditions complete in ~2 waves = ~30-40 min wall.

Run-in-parallel notes

  • The 12 conditions are independent — no shared state. Each runs on its own GPU via CUDA_VISIBLE_DEVICES-sharded subprocesses.
  • Reuse the launcher pattern from scripts/launch_issue246.py; extend to 12 conditions, queue across 8 GPUs.
  • Centroid extraction is sequential but cheap (~30 min single GPU).
  • Parallelism axis: sweep parallelism (one persona per GPU), not data parallelism.
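
A rough sketch of the sweep-parallel scheduling described above: conditions are split into sequential waves, and each condition in a wave is pinned to one GPU through `CUDA_VISIBLE_DEVICES`. The helper names are hypothetical and the actual training command is left out, since the real launcher's arguments are not reproduced here; each worker would typically be started with `subprocess.Popen(cmd, env=env_for_gpu(gpu_id))` and waited on before the next wave.

```python
import os


def plan_waves(conditions, n_gpus):
    """Split conditions into waves; each wave assigns one GPU id per condition."""
    waves = []
    for start in range(0, len(conditions), n_gpus):
        chunk = conditions[start:start + n_gpus]
        waves.append([(gpu_id, cond) for gpu_id, cond in enumerate(chunk)])
    return waves


def env_for_gpu(gpu_id, base_env=None):
    """Copy the environment and restrict the child process to a single GPU."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


conditions = [
    "chef", "lawyer", "accountant", "journalist",
    "wizard", "hero", "philosopher", "child",
    "ai_assistant", "ai", "chatbot", "i_am_helpful",
]
waves = plan_waves(conditions, n_gpus=8)
for wave_number, wave in enumerate(waves, start=1):
    for gpu_id, cond in wave:
        print(f"wave {wave_number}: GPU {gpu_id} <- {cond}")
```

With 12 conditions and 8 GPUs this yields a first wave of 8 and a second wave of 4, matching the 8 + 4 split in the compute estimate.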

Pre-registered predictions

  • H_consistent_extension (most likely): All 12 new points fall inside the L20 PI of the N=12 #246 fit; N=24 Spearman at L15 stays significant after Bonferroni-28; the regression generalizes.
  • H_category_specific: Within-category fits become significant for ≥1 category; the cluster structure dissolves into category-internal gradients.
  • H_layer_shift: Optimal layer at N=24 differs from L15 (e.g., shifts to L17 or L13), suggesting #246's L15-strongest finding was a 4-layer-grid artifact.
  • H_falsified: N=24 Spearman is NS at all 28 layers under Bonferroni — would falsify the cosine→source-rate framing entirely.

Dependencies

  • #246 (clean-result #271) — parent regression
  • #232 — original 10-persona regression
  • #142 — multi-layer cosine + JS divergence
  • #113 — assistant-variant degradation; inform the choice of generic_helper variants
  • scripts/archive/run_leakage_experiment.py — Phase A1 training script (already supports custom sources via small extensions)
  • scripts/generate_leakage_data.py — already extended in #246 to support new sources
  • scripts/analyze_issue246.py — analysis pattern to extend to 28 layers

Compute label

compute:medium (5–20 GPU-hours; parallel wall-time ≤1h on 8× H100)

Timeline · 12 events

  1. epm:clarify· system
    <!-- epm:clarify v1 -->
    ## Clarifier — All clear
    
    **No blocking ambiguities found.** Proceeding to adversarial planning.
    
    ### Context resolved from project knowledge
    
    - **Recipe** (verified from #246 / #232 verbatim):
      - LoRA r=32, α=64, dropout=0.05, use_rslora=True, q/k/v/o/gate/up/down
      - AdamW lr=1e-5 cosine warmup=0.05, 3 epochs, max_seq=1024, bf16, batch=64
      - 600-row asst_excluded medium per source
      - 20 EVAL_QUESTIONS × 5 completions, T=1.0, vLLM
      - Single seed 42 (parity with #232 / #246)
    - **User decisions** (from AskUserQuestion in chat, persisted here):
      - **+12 personas** (balanced: 4 occupational + 4 character + 4 generic_helper)
      - **Single seed (42)** for parity
      - **All 28 layers** (vs #246's 4-layer subset)
      - **One combined issue** (training + layerwise analysis together)
    - **L15 strongest layer** per #246 clean-result #271 (Spearman ρ=-0.81, p=0.0014); L20 second; L10 third — pre-register L15 as the new primary, with L10 retained for #232 backward-comparison
    - **Phase A1 source code already extended** in #246 (commit a84b4f2): `scripts/generate_leakage_data.py` and `scripts/archive/run_leakage_experiment.py` accept new sources via `_resolve_source_prompt`. Adding 12 more sources is mechanical.
    - **Existing 100-persona centroid set** at `eval_results/single_token_100_persona/centroids/centroids_layer{10,15,20,25}.pt` — could seed proxy cosines for sanity-checking; not used for training.
    - **`scripts/issue245_centroids.py`** is the pattern for multi-layer centroid extraction; #246's `scripts/analyze_issue246.py` is the analysis pattern to extend to 28 layers and N=24.
    
    ### Assumptions made (will be exposed in the plan for user review)
    
    - **Assumption:** specific persona names within each category will be selected by the adversarial planner with reasonable defaults (e.g., occupational: chef/lawyer/accountant/journalist; character: wizard/hero/philosopher/child; generic_helper: AI/AI assistant/chatbot/"I am" framing). User can revise during plan approval.
    - **Assumption:** parallelism = 8 GPUs (override `--gpu-count 8` on `lora-7b` intent), giving ~2 waves of 6 + 6 = ~30-40 min wall.
    - **Assumption:** L15 is the pre-registered primary; L20 is the secondary (for #246 cross-comparison); L10 is reported for #232 backward-comparison; full 28-layer scan is exploratory.
    
    Advancing to `status:planning` → adversarial-planner.
    <!-- /epm:clarify -->
    
  2. epm:plan· system
    <!-- epm:plan v1 -->
    ## Plan — Issue #274
    
    **Cost gate:** estimated **~6.6 GPU-hours** (~2h wall on 8× H100). 12 new training+eval (~3h GPU) + 12 inherited re-eval-only (~3h GPU, after `pod.py sync models --pull`) + base-model baseline (~0.5h GPU) + centroids + analysis. **Compute label: `compute:medium`** (5–20 GPU-h cap).
    
    Full plan saved at `.claude/plans/issue-274.md` (566 lines, 14 sections). Summary below.
    
    ---
    
    ### Goal
    
    Extend #246's N=12 cosine→source-rate regression to N=24 with full 28-layer cosine scan, properly cross-validated, and with surface-form / RLHF-prior baselines so a positive result actually means something. Single experimental variable: source-set 12 → 24. Recipe locked.
    
    ### 12 new sources (4 per category)
    
    - **Occupational:** chef, lawyer, accountant, journalist
    - **Character:** wizard, hero, philosopher, child *(content-filter risk; `student` fallback)*
    - **Generic_helper:** ai_assistant, ai, chatbot, **i_am_helpful** *(framing-axis probe — excluded from within-category fit per #113)*
    
    ### Pre-registered analysis (revised)
    
    **Primary test (L15 PI-coverage at N=24, after re-eval against the new 24-persona eval matrix).** Pass requires BOTH:
    - ≥9/12 new points fall inside the L15 95% PI of the N=12 fit
    - Spearman ρ on the 12 NEW points alone is significant (raw α=0.05) at L15
    
    **5 outcome buckets (jointly exhaustive):** H_consistent, **H_attenuated** (NEW: raw-sig but Holm-Bonferroni-fail; ρ ∈ ±[0.413, 0.611]), H_inverted, H_low_emission, H_anti-correlated.
    
    **Tightenings vs original draft:**
    - **Holm-Bonferroni-28** (not Bonferroni-28); family pinned at 28 Spearman tests only
    - **Repeated 5-fold CV (50 fold seeds) + LOOCV** (replaces single 5-fold)
    - **CV-optimal band [14,16]** (not [13,17]) OR ≥10% MSE margin vs next non-adjacent layer
    - **H_anti-correlated** requires sign(ρ_L15) flipped AND ≥3 of {L13,L14,L16,L17} also flipped (or Holm-Bonferroni-28 p<0.05)
    - **H_low_emission** requires ≥2 personas at ≤5% from different categories (`child` alone doesn't trigger)
    
    **New baselines (load-bearing):**
    - **Base-model marker rate** (no LoRA, ~0.5 GPU-h): residualized rate `post_LoRA − base` is the load-bearing test
    - **Surface-form similarity:** token-Jaccard + Levenshtein (via `rapidfuzz`) — pre-registered "L15 cosine must beat both" in absolute Spearman AND CV held-out MSE
    - **Off-diagonal cosine→rate analysis** at L15 over 552 cells (free analysis, mechanism evidence)
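    
    The two surface-form baselines can be sketched without dependencies as below. The plan uses `rapidfuzz` for Levenshtein; this pure-Python version is only an illustration, and the word-level tokenization (lowercased `\w+` matches) is an assumption, since the plan does not say which tokens the Jaccard score is computed over.
    
    ```python
    import re
    
    
    def token_jaccard(a, b):
        """Jaccard overlap of the lowercased word sets of two prompts."""
        ta = set(re.findall(r"\w+", a.lower()))
        tb = set(re.findall(r"\w+", b.lower()))
        if not ta and not tb:
            return 1.0
        return len(ta & tb) / len(ta | tb)
    
    
    def levenshtein(a, b):
        """Edit distance (insert, delete, substitute) via a rolling DP row."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, start=1):
            curr = [i]
            for j, cb in enumerate(b, start=1):
                cost = 0 if ca == cb else 1
                curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
            prev = curr
        return prev[-1]
    
    
    def levenshtein_similarity(a, b):
        """Edit distance rescaled to [0, 1], where 1 means identical strings."""
        longest = max(len(a), len(b))
        return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest
    
    
    prompt_a = "You are an AI assistant."
    prompt_b = "You are an AI."
    jaccard = token_jaccard(prompt_a, prompt_b)
    distance = levenshtein(prompt_a, prompt_b)
    ```
    
    Either similarity can then be correlated with source rate in the same way as the L15 cosine, which is what makes the "must beat both" comparison like-for-like.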
    
    **Other additions:**
    - Power calculation runs as cell 0 of analysis (MC n=10000 at ρ_pop ∈ {-0.81, -0.55})
    - Wilson 95% CIs on all 24 source rates
    - Within-occupational Cook's D with/without librarian (the #246 leverage outlier)
    - Surface-form residualization (template-prefix × token-bucket) alongside length-partial
    - Confidence-downgrade rule: null within-category fits → LOW for that category
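    
    The Wilson interval for a marker rate can be computed directly from the count of marked completions. A minimal sketch, assuming 100 completions per persona (20 questions × 5 completions) and a hypothetical function name:
    
    ```python
    from math import sqrt
    
    
    def wilson_interval(successes, n, z=1.959964):
        """Wilson score interval for a binomial proportion (default 95%)."""
        if n <= 0:
            raise ValueError("n must be positive")
        p = successes / n
        denom = 1 + z * z / n
        center = (p + z * z / (2 * n)) / denom
        half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
        # Clamp to [0, 1] to absorb floating-point overshoot at the edges.
        return max(0.0, center - half), min(1.0, center + half)
    
    
    # Toy counts: 50 of 100 and 0 of 100 completions containing the marker.
    mid_lo, mid_hi = wilson_interval(50, 100)
    zero_lo, zero_hi = wilson_interval(0, 100)
    ```
    
    Unlike the normal-approximation interval, the Wilson interval stays informative at 0% and 100%, which matters for personas whose rate sits near the floor.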
    
    ### File changes
    
    - `scripts/generate_leakage_data.py`: add `QWEN_DEFAULT_PROMPT` (verified missing from main; was on issue-246 branch only) + 12 new persona prompts. Extend `_resolve_source_prompt`, `_get_persona_prompts`, `--source` choices.
    - `scripts/archive/run_leakage_experiment.py`: add `ALL_EVAL_PERSONAS_PLUS` (24 entries), extend `--source` choices, **fix `source_rate=null` bug** at L831 with helpful_assistant→assistant alias map. `--eval-only` already exists at L1035 (verified).
    - NEW `scripts/launch_issue274.py`: queue+worker, 8 GPU parallelism, 2 waves of 6+6.
    - NEW `scripts/launch_issue274_reeval.py`: re-eval 12 inherited LoRAs with `--eval-only` after `pod.py sync models --pull`.
    - NEW `scripts/run_base_baseline.py`: standalone base-model marker baseline (~80 lines, modeled on `scripts/issue246_base_baseline.py`).
    - NEW `scripts/analyze_issue274.py`: 28-layer + 5-fold CV repeated + Holm-Bonferroni + string-similarity baseline + base-residualization + off-diagonal analysis.
    - `pyproject.toml`: add `rapidfuzz` for Levenshtein.
    
    ### Pod + launch
    
    ```bash
    python scripts/pod.py provision --issue 274 --intent inf-70b  # 8x H100
    ssh epm-issu
  3. epm:consistency· system
    <!-- epm:consistency v1 -->
    ## Consistency Check: #274 vs #246 / #232
    
    **Verdict: PASS**
    
    ### Parent experiment(s): #246 (recipe), #232 (original regression)
    
    ### Variables that differ (should be exactly 1):
    
    1. **Source-persona set size:** 12 → 24 (12 new LoRAs trained) — **INTENDED CHANGE, the single experimental variable**
    
    All other items checked below are analysis-side extensions, new baselines, or statistical tightenings. None require new training under a different recipe.
    
    ### Shared baseline check:
    
    - **Base model:** MATCH — `Qwen/Qwen2.5-7B-Instruct` (#246 card; #274 §4 Reproducibility Card)
    - **LoRA hyperparameters:** MATCH — r=32, α=64, dropout=0.05, use_rslora=True, targets q/k/v/o/gate/up/down (#271 detailed report; #274 §4 card explicitly cites `marker_librarian_asst_excluded_medium_seed42/run_result.json` as provenance)
    - **Optimiser:** MATCH — AdamW lr=1e-5, cosine schedule, warmup_ratio=0.05
    - **Training schedule:** MATCH — 3 epochs, max_seq=1024, bf16, effective batch=64 (4×4×1)
    - **Data:** MATCH — 600 rows asst_excluded medium per source (200 source-positive + 400 bystander-negative)
    - **Eval protocol (training-side):** MATCH — 20 questions × 5 completions, T=1.0, top_p=0.95, max_new=512, vLLM batched, `[ZLT]` case-insensitive substring criterion
    - **Seed:** MATCH — 42, explicit parity with #232/#246
    - **Eval suite layers (analysis):** PERMITTED DELTA — 4 layers → all 28 (analysis-side centroid extraction only; recipe unchanged)
    - **Eval persona matrix per LoRA:** PERMITTED DELTA — 11/12 → 24 (bigger eval matrix; same recipe; inherited 12 re-eval'd via existing `--eval-only` flag with no retraining)
    - **Compute class:** NOTE — #246 used 1× H100; #274 plans 8× H100 (`--intent inf-70b`). This is a parallelism increase only (each condition still runs on 1 GPU via `CUDA_VISIBLE_DEVICES` sharding, per §5 and §9 in the plan). No batch-size confound introduced; effective batch=64 is per-process.
    - **Statistical machinery:** PERMITTED DELTA — Holm-Bonferroni-28, repeated 5-fold CV, LOOCV, power calc, string-similarity baselines, base-model residualization, off-diagonal analysis are all post-hoc analysis additions. No training-side change.
    
    ### Deltas observed beyond the allowed single variable:
    
    *None.* Every difference from #246 falls into one of three buckets explicitly enumerated in the permitted-deltas list supplied to this checker:
    (a) source-persona set size 12 → 24 (the one experimental variable),
    (b) analysis-side extensions (28-layer scan, CV, Holm-Bonferroni-28, baselines, off-diagonal),
    (c) eval-matrix expansion via re-eval with no retraining.
    
    The plan's §2 Method delta table and §13 Reviewer-callout table are internally consistent and align with the plan text. The Reproducibility Card (§4) matches the #246/#271 detailed-report values on every training parameter.
    
    One minor flag (not a block): the base-model baseline script (`scripts/run_base_baseline.py`, §3f) references `scripts/issue246_base_baseline.py` as its model. Verify that file exists in the worktree before implementation; if it does not, the implementer should model the script on `scripts/archive/run_leakage_experiment.py`'s eval path instead. This is an implementation note, not a recipe inconsistency.
    
    ### Recommendation:
    
    Proceed. The plan is a clean single-variable extension. Recipe parameters are locked and explicitly traced to #246/#232 provenance. All non-recipe changes are analysis-side or new baselines with explicit justification. No silent recipe drift detected.
    <!-- /epm:consistency -->
    
  4. epm:experiment-implementation· system
    <!-- epm:experiment-implementation v1 -->
    ## Implementation Report — round 1
    
    **Status:** READY-FOR-REVIEW
    
    ### Files changed
    
    - `scripts/generate_leakage_data.py` (+130/-3): add `QWEN_DEFAULT_PROMPT`, `NEW_PERSONA_PROMPTS_274` (12 new sources: 4 occupational + 4 character + 4 generic_helper), `NEW_SOURCES_274` tuple, extend `_resolve_source_prompt` + `_get_persona_prompts`, restore `step_assemble_single_source`, add `--source` and `--all-new-274` CLI flags.
    - `scripts/archive/run_leakage_experiment.py` (+115/-5): add `QWEN_DEFAULT_PROMPT` + `NEW_PERSONA_PROMPTS_274` mirror, `ALL_EVAL_PERSONAS_PLUS` (N=24), `SOURCES_REQUIRING_PLUS_EVAL`, `SOURCE_TO_EVAL_KEY` alias map; resolve `eval_personas` based on source + `EPM_FORCE_EVAL_PERSONAS_PLUS` env (used by re-eval launcher); fix the `source_rate=null` bug at L831 via `eval_key = SOURCE_TO_EVAL_KEY.get(source_name, source_name)` + defensive `RuntimeError`; extend `--source` choices to all 24 names; thread `eval_personas` into Phase 3 + dynamics + n_personas metric.
    - `scripts/launch_issue274.py` NEW (+135): wave-based launcher for 13 new training conds (12 NEW_PERSONA_PROMPTS_274 + qwen_default); `--n-gpus N` (default 8); `proc.wait()` between waves.
    - `scripts/launch_issue274_reeval.py` NEW (+151): wave-based launcher for the 12 inherited LoRAs to re-eval against the N=24 matrix via `--eval-only` + `EPM_FORCE_EVAL_PERSONAS_PLUS=1`. Pre-condition: `pod.py sync models --pull`.
    - `scripts/run_base_baseline.py` NEW (+328): standalone 24-persona x 20Q x 5 completions vLLM base-Qwen2.5-7B-Instruct baseline; T=1.0, top-p=0.95, max=512; saves `eval_results/issue_274/base_baseline.json` and a parallel `marker_BASE_seed42/run_result.json`.
    - `scripts/analyze_issue274.py` NEW (+1249): full N=24 / 28-layer analysis — power MC sim (Cell 0), 28-layer centroid extraction (saved to `centroids/centroids_n24_layers0_27.pt`), per-layer Pearson + Spearman + length-partial + surface-form-partial + LOO + Cook's D, Holm-Bonferroni-28 + Bonferroni-28 sanity, repeated 5-fold CV (50 seeds, bootstrap CI) + LOOCV, primary L15 PI-coverage of N=12 fit at 12 new cosines, new-12-only Spearman gate, within-category fits (occupational with/without librarian, character, generic_helper N=5 with i_am_helpful excluded), Wilson 95% CIs, string-similarity baselines (token-Jaccard + Levenshtein via `rapidfuzz`), base-residualized fit, off-diagonal 552-cell cosine→bystander_rate, pre-registered outcome-bucket assignment, 3 figures.
    - `pyproject.toml`: add `rapidfuzz>=3.0,<4.0`.
    - `uv.lock`: regenerated (rapidfuzz 3.14.5; 215 packages total).
    
    ### Diff summary
    
    +2183 lines / -12 lines across 8 files.
    
    ```
     pyproject.toml                            |    1 +
     scripts/analyze_issue274.py               | 1249 +++++
     scripts/archive/run_leakage_experiment.py |  120 ++-
     scripts/generate_leakage_data.py          |  130 ++-
     scripts/launch_issue274.py                |  135 ++++
     scripts/launch_issue274_reeval.py         |  151 ++++
     scripts/run_base_baseline.py              |  328 ++++++++
     uv.lock                                   |   81 ++
    ```
    
    ### Plan adherence
    
    Walking the plan §3 file-paths list:
    
    - §3a `generate_leakage_data.py`: DONE — all 5 edit points (QWEN_DEFAULT_PROMPT, 12 new prompts including `i_am_helpful` first-person, `_resolve_source_prompt` dispatch, `_get_persona_prompts` extension, `--source` + `--all-new-274` CLI).
    - §3b `run_leakage_experiment.py`: DONE — all 5 edit points (`ALL_EVAL_PERSONAS_PLUS`, `--source` argparse, source_rate=null fix, defensive RuntimeError, `n_personas` uses `eval_personas`).
    - §3c `launch_issue274.py`: DONE — wave-based launcher, `--n-gpus N`, `proc.wait()` between waves.
    - §3d `launch_issue274_reeval.py`: DONE — uses `--eval-only` + `EPM_FORCE_EVAL_PERSONAS_PLUS=1` env to force the N=24 matrix on the inherited 12.
    - §3e `run_base_baseline.py`: DONE — patterned on `issue246_base_baseline.py`, scaled to 24 personas.
    - §3f `analyze_issue274.py`: DONE — all 17 sub-items.
    - §3g `pypr
  5. epm:code-review· system
    <!-- epm:code-review v1 -->
    ## Code-Reviewer Verdict — CONCERNS
    
    **Diff size:** +2,183 / -12 lines across 8 files (matches plan §3a–§3g).
    **Plan adherence:** COMPLETE for file inventory; PARTIAL for bucket-assignment correctness and base-residualization key mapping.
    **Tests run:** `ruff check` (1 pre-existing C901 in `run_experiment`), `ruff format --check` (clean), `analyze_issue274.py --power-only` (PASS — power numbers match plan: strong=0.936/0.997, attenuated=0.285/0.759), `rapidfuzz` import (works).
    **Lint:** PASS (the one C901 is pre-existing).
    **Security sweep:** CLEAN.
    
    ### Plan adherence (file by file)
    
    - §3a `generate_leakage_data.py`: `QWEN_DEFAULT_PROMPT` ✓, 12 new sources ✓, `--source` ✓, `--all-new-274` ✓, mutual-exclusion guard ✓.
    - §3b `run_leakage_experiment.py`: `ALL_EVAL_PERSONAS_PLUS` (24) ✓, `SOURCES_REQUIRING_PLUS_EVAL` ✓, `SOURCE_TO_EVAL_KEY` (helpful_assistant→assistant) ✓, defensive RuntimeError when `source_marker_rate is None` ✓, extended argparse choices ✓, `EPM_FORCE_EVAL_PERSONAS_PLUS` env-var override ✓.
    - §3c `launch_issue274.py`: 13 conditions (12 new + qwen_default), wave-based, `--phase a1 --neg-set asst_excluded --prompt-length medium --seed 42 PYTHONHASHSEED=42` all present ✓.
    - §3d `launch_issue274_reeval.py`: 12 inherited sources, `--eval-only`, `EPM_FORCE_EVAL_PERSONAS_PLUS=1` ✓.
    - §3e `run_base_baseline.py`: 24-persona base baseline, vLLM, `[ZLT]` substring ✓.
    - §3f `analyze_issue274.py`: 28 layers ✓, Holm-Bonferroni-28 via `multipletests(method="holm")` ✓, repeated 5-fold CV with 50 fold seeds (`base_seed + rep` per fold) ✓, LOOCV ✓, power-sim cell 0 ✓, Wilson CIs ✓, off-diagonal (552-cell loop) ✓, string-similarity baselines ✓, base-residualization ✓, within-category fits with/without librarian ✓.
    - §3g pyproject/uv.lock: `rapidfuzz>=3.0,<4.0` added ✓.
    
    ### Critical (block merge)
    
    **[BLOCKER] `scripts/archive/run_leakage_experiment.py:787-792` — `train_minutes` UnboundLocalError on the re-eval path.**
    ```python
    if args.eval_only:
        ...
        train_result_path = output_dir / "train_result.json"
        if train_result_path.exists():
            train_result = json.loads(train_result_path.read_text())
            train_loss = train_result.get("loss", 0.0)
            # ← train_minutes NOT set here
        else:
            train_loss = 0.0
            train_minutes = 0.0
    ```
    Line 808 only sets `train_minutes` in the `else` (non-eval-only) branch; line 983 unconditionally references `train_minutes`. When `--eval-only` runs against an existing #246/#232 LoRA (the entire purpose of `launch_issue274_reeval.py`), `train_result.json` already exists from the original run, the `if` branch fires, and the script crashes with `UnboundLocalError` (a `NameError` subclass) at the `run_result` dict assembly.
    **This pre-existing latent bug is now activated**: #246's re-eval flag plumbing had never been exercised before, so this codepath was never hit. The re-eval launcher will fail on every one of the 12 inherited LoRAs.
    **Fix:** add `train_minutes = train_result.get("wall_time_minutes", 0.0)` inside the `if train_result_path.exists()` branch. ~3 lines.
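    
    A standalone sketch of the corrected branch, pulled into a hypothetical helper (`load_train_metrics`) so both paths can be exercised without running the experiment script. The key names `loss` and `wall_time_minutes` follow the review text; everything else is illustrative.
    
    ```python
    import json
    import tempfile
    from pathlib import Path
    
    
    def load_train_metrics(output_dir):
        """Return (train_loss, train_minutes) for an --eval-only run."""
        train_result_path = Path(output_dir) / "train_result.json"
        if train_result_path.exists():
            train_result = json.loads(train_result_path.read_text())
            train_loss = train_result.get("loss", 0.0)
            # The missing assignment: bind train_minutes on this branch too.
            train_minutes = train_result.get("wall_time_minutes", 0.0)
        else:
            train_loss = 0.0
            train_minutes = 0.0
        return train_loss, train_minutes
    
    
    # Exercise both branches with throwaway directories (toy values).
    with tempfile.TemporaryDirectory() as tmp:
        payload = {"loss": 0.42, "wall_time_minutes": 3.1}
        (Path(tmp) / "train_result.json").write_text(json.dumps(payload))
        existing = load_train_metrics(tmp)
    
    with tempfile.TemporaryDirectory() as tmp:
        missing = load_train_metrics(tmp)
    ```
    
    Binding both names on every branch means the later `run_result` assembly cannot hit an unbound local, whichever path ran.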
    
    **[BLOCKER] `scripts/analyze_issue274.py:1044` — base-residualization key-mismatch silently returns 0.0 for `helpful_assistant`.**
    `run_base_baseline.py` writes the base rate for the assistant prompt under key `"assistant"` (line 281: `{p: r["rate"] for p, r in results.items()}` where `results` is keyed by `SYSTEM_PROMPTS` keys = `["assistant", ...]`). But `analyze_issue274.py:load_base_rates()` returns the dict as-is, then `residual_rates = {p: rates[p] - base_rates.get(p, 0.0) for p in available}` uses `available` (which contains `"helpful_assistant"`, NOT `"assistant"`).
    Result: for `p == "helpful_assistant"`, `base_rates.get("helpful_assistant", 0.0)` returns `0.0` (default fallback) — **the actual base rate measured for the assistant prompt is silently dropped**. The residual is `post_LoRA - 0` rather than `post_LoRA - base["assistant"]`.
    For other personas with [ZLT]≈0% this is small in practice, but it viola
  6. epm:experiment-implementation· system
    <!-- epm:experiment-implementation v2 -->
    ## Implementation Report — round 2
    
    **Status:** READY-FOR-REVIEW
    
    ### Files changed
    
    - `scripts/archive/run_leakage_experiment.py`: BLOCKER 1 fix (`train_minutes` set from `train_result.json["wall_time_minutes"]` on the `--eval-only` branch when the train_result.json exists) + ISSUE 5 fix (`SOURCE_TO_EVAL_KEY` applied at L580 inside `evaluate_checkpoint_dynamics` and at the dynamics gate L897 so `source="helpful_assistant"` resolves to the eval-key `"assistant"`).
    - `scripts/analyze_issue274.py`: BLOCKER 2 fix (`load_base_rates()` aliases `"assistant" → "helpful_assistant"` so the residualized fit uses the actual base rate) + ISSUE 3 fix (rewrote the outcome-bucket cascade as mutually-exclusive `elif` selection; `low_emission` and `child_safety_gating` now live in `outcome_flags` as BOOLEAN FLAGS that co-exist with whichever H_* fired; tightened `H_consistent` gate to also require `l15_loo_robust` ≥9/12 calibration-LOO drops keep all new-12 inside PI AND `l15_length_partial_sig` Spearman raw-sig). Added `outcome_flags`, `l15_loo_robust`, `l15_length_partial_sig` to the JSON. Plus three NITs: hoisted `torch` imports + mean-centroid out of the 552-cell double loop, simplified the dead-code ternary in `plot_hero_l15`, fixed the off_diag count log from `24×24 max=576` to `24×24 off-diagonal max=552 = 24²−24`.
    - `scripts/launch_issue274_reeval.py`: ISSUE 4 fix (dropped `qwen_default` from `SOURCES` — it has no #246-era LoRA adapter on Hub, so its train+eval belongs in `launch_issue274.py`). Updated docstring + argparse description from "12" to "11" inherited LoRAs.
    
    ### Diff summary
    
    ```
     scripts/analyze_issue274.py               | 144 ++++++++++++++++++++++--------
     scripts/archive/run_leakage_experiment.py |  16 +++-
     scripts/launch_issue274_reeval.py         |  17 ++--
     3 files changed, 132 insertions(+), 45 deletions(-)
    ```
    
    ### Plan adherence
    
    - `train_minutes` UnboundLocalError on `--eval-only`: DONE — added `train_minutes = train_result.get("wall_time_minutes", 0.0)` inside the `if train_result_path.exists()` branch. Verified by reading the parallel write at L822 (`"wall_time_minutes": round(train_minutes, 1)`) — key matches.
    - `helpful_assistant` base-residualization key mismatch: DONE — Option A (alias in `load_base_rates()`). Inline comment explains the alias.
    - Outcome-bucket cascade: DONE — restructured as `if/elif/.../else` with `H_indeterminate` as default. Caveat conditions (`low_emission`, `child_safety_gating`) extracted to `outcome_flags` dict (boolean flags). Added `l15_loo_robust` (computed via PERSONAS_12-only LOO loop using `pearson_prediction_interval`) and `l15_length_partial_sig` (read off `layer_stats[15]["partial_spearman_length"]["p"] < 0.05`) as additional gates on `H_consistent`. Bucket priority: `H_anti-correlated > H_consistent > H_consistent_weak > H_attenuated > H_inverted > H_indeterminate`.
    - `qwen_default` dual-launch: DONE — removed from re-eval SOURCES.
    - `helpful_assistant` KeyError on dynamics path: DONE — `SOURCE_TO_EVAL_KEY` applied at both call sites (top of `evaluate_checkpoint_dynamics` L580, plus the dynamics gate at L897). Same alias-class fix as L831, now consistent across all three sites.
    - NITs: DONE all three (torch hoist, ternary simplification, off_diag log message).
    
    ### Lint + dry-run
    
    - `uv run ruff check scripts/analyze_issue274.py scripts/archive/run_leakage_experiment.py scripts/launch_issue274_reeval.py scripts/launch_issue274.py`: PASS (one pre-existing C901 in `run_experiment` flagged in v1 — out of scope).
    - `uv run ruff format` on the same set: 4 files left unchanged (formatter-clean).
    - AST parse on all four files: PASS.
    - Dry-run command: `uv run python scripts/analyze_issue274.py --power-only`
    - Dry-run outcome: PASS — `strong (ρ_pop=-0.81): P(|ρ|>0.611, Bonf-28)=0.936 | P(|ρ|>0.404, raw α=0.05)=0.997`, `attenuated (ρ_pop=-0.55): P(|ρ|>0.611, Bonf-28)=0.285 | P(|ρ|>0.404, raw α=0.05)=0.759`. Numbers match plan §1 exactly.
    
    
  7. epm:code-review· system
    <!-- epm:code-review v2 -->
    ## Code-Reviewer Verdict — PASS
    
    **Round 2 verification:** all 5 round-1 findings are correctly fixed. Diff is +132 / −45 across 3 files (`scripts/analyze_issue274.py`, `scripts/archive/run_leakage_experiment.py`, `scripts/launch_issue274_reeval.py`).
    
    **Sanity checks:**
    - `ruff format --check` — clean (3 files already formatted).
    - `ruff check` — only pre-existing C901 on `run_experiment` (also present at `c99a263`); no new lint errors.
    - `analyze_issue274.py --power-only` — exact match to plan §3a: `strong=0.936/0.997, attenuated=0.285/0.759`.
    
    ### Fix-by-fix verification
    
    1. **BLOCKER 1 — `train_minutes` UnboundLocalError (`run_leakage_experiment.py:797`).** ✓ FIXED. `train_minutes = train_result.get("wall_time_minutes", 0.0)` is now set inside the `if train_result_path.exists()` branch, mirroring the `else` branch at L800. The L995 `run_result["compute"]["train_minutes"]` reference is now safe on the `--eval-only` path. The entire purpose of `launch_issue274_reeval.py` is now executable.
    
    2. **BLOCKER 2 — base-rate alias (`analyze_issue274.py:298-304`).** ✓ FIXED. `load_base_rates()` now adds `out["helpful_assistant"] = out["assistant"]` when `assistant` is keyed but `helpful_assistant` is not. The L1053 `base_rates.get(p, 0.0)` lookup for `p == "helpful_assistant"` will now return the actual base rate. The residualized fit (load-bearing per plan §3f) is no longer silently corrupted.
    
    3. **ISSUE 3 — bucket cascade (`analyze_issue274.py:1099-1186`).** ✓ FIXED. (a) Sequential-`if` cascade replaced by exclusive `elif` chain with priority `anti-correlated > consistent > consistent_weak > attenuated > inverted > indeterminate`. (b) `H_consistent` gate now requires `n12_pi_count ≥ 9 AND |ρ| > 0.587 AND p < 0.05 AND l15_holm AND l15_loo_robust AND l15_length_partial_sig` — matches plan §1. (c) `low_emission` and `child_safety_gating_caveat` are extracted as boolean flags in `outcome_flags` (and via back-compat `i274_caveat_child_safety_gating`), no longer mutually-exclusive buckets. (d) New LOO loop drops each of the 12 calibration positions, refits PI, and counts how many keep all new-12 inside (≥9/12 = robust).
    
    4. **ISSUE 4 — `qwen_default` duplication (`launch_issue274_reeval.py:35-49`).** ✓ FIXED. `qwen_default` removed from `SOURCES`; docstring + comments updated to "11 inherited" with explicit rationale that the new launcher (`launch_issue274.py`) owns qwen_default.
    
    5. **ISSUE 5 — dynamics KeyError (`run_leakage_experiment.py:578-901`).** ✓ FIXED. `SOURCE_TO_EVAL_KEY.get(source, source)` is now applied at both call sites: (a) inside `evaluate_checkpoint_dynamics` at L580 when constructing `eval_personas`, and (b) at the gate `if source_eval_key in ALL_EVAL_PERSONAS_PLUS` at L901. Both call sites are explicitly cross-referenced in code comments to keep them in sync.
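
    The tightened `H_consistent` gate from fix 3(b) is a plain conjunction, which a short sketch can make concrete. This is illustrative only: the `L15Stats` container and the function name are hypothetical and are not the analyzer's actual code; only the field names and thresholds are taken from the gate quoted above.

    ```python
    from dataclasses import dataclass


    @dataclass(frozen=True)
    class L15Stats:
        """Hypothetical container; field names follow the gate quoted in fix 3(b)."""
        n12_pi_count: int             # new-12 personas inside the calibration PI
        rho: float                    # Spearman rho at L15
        p: float                      # p-value for rho
        l15_holm: bool                # L15 survives Holm correction
        l15_loo_robust: bool          # LOO refits keep the new-12 inside the PI
        l15_length_partial_sig: bool  # length-partial correlation is significant


    def h_consistent_gate(s: L15Stats) -> bool:
        """Every clause must hold; any single failure rules out H_consistent."""
        return (
            s.n12_pi_count >= 9
            and abs(s.rho) > 0.587
            and s.p < 0.05
            and s.l15_holm
            and s.l15_loo_robust
            and s.l15_length_partial_sig
        )


    # Made-up inputs for illustration (not results from this issue).
    passing = L15Stats(12, -0.70, 0.01, True, True, True)
    weak_rho = L15Stats(12, -0.50, 0.01, True, True, True)
    print(h_consistent_gate(passing), h_consistent_gate(weak_rho))
    ```

    Because the gate is a conjunction, full PI coverage alone is not enough: a case with `n12_pi_count = 12` still fails if `|ρ|` is at or below 0.587.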
    
    ### Minor (not blocking)
    
    - The `H_consistent_weak` gate `n12_pi_count ≥ 7 AND p_new12 < 0.05` is *narrower* than plan §1's "7-8 PI **OR** Spearman fails raw-α=0.05" — cases like "PI≥9 but Spearman fails Holm/length-partial" will fall through to `H_indeterminate` rather than land in `H_consistent_weak`. Not a regression from round 1; minor plan deviation. Not blocking but worth a note in the analyzer write-up if the bucket lands close to `weak`.
    - The dead-code ternary (round-1 NIT) was simplified ✓; off-diagonal-loop `torch` import + `mu` were hoisted ✓; off-diagonal log-message N²−24=552 cosmetic fix ✓.
    
    ### Recommendation
    
    **Merge.** All five round-1 findings are addressed correctly. Power-only sanity reproduces plan-pinned numbers. No regressions introduced.
    <!-- /epm:code-review v2 -->
    
  8. epm:hot-fix· system
    <!-- epm:hot-fix v1 -->
    ## Hot-Fix v1 — `PROJECT_ROOT` off-by-one in archived script
    
    **Commit:** `5a45423` — `hot-fix: correct PROJECT_ROOT for archived script (parent.parent → parent.parent.parent)`
    **Diff stat:** `1 file changed, 2 insertions(+), 1 deletion(-)`
    
    ### What broke
    
    `scripts/archive/run_leakage_experiment.py:45` had:
    ```python
    PROJECT_ROOT = Path(__file__).resolve().parent.parent
    ```
    
    Because the script lives **two** directories deep (`scripts/archive/`), `parent.parent` resolves to `scripts/`, not the repo root. So:
    ```
    DATA_DIR = scripts/data/leakage_experiment        # ← wrong, doesn't exist
    EVAL_RESULTS_DIR = scripts/eval_results/leakage_experiment  # ← also wrong
    ```
    
    First run on `chef` smoke crashed with:
    ```
    FileNotFoundError: Training data not found:
    /workspace/explore-persona-space/scripts/data/leakage_experiment/marker_chef_asst_excluded_medium.jsonl
    ```
    
    ### Diff
    
    ```diff
     BASE_MODEL = "Qwen/Qwen2.5-7B-Instruct"
    -PROJECT_ROOT = Path(__file__).resolve().parent.parent
    +# This script lives at scripts/archive/run_leakage_experiment.py — three parents up = repo root.
    +PROJECT_ROOT = Path(__file__).resolve().parent.parent.parent
     DATA_DIR = PROJECT_ROOT / "data" / "leakage_experiment"
     EVAL_RESULTS_DIR = PROJECT_ROOT / "eval_results" / "leakage_experiment"
     WANDB_PROJECT = "leakage-experiment"
    ```
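
    The depth arithmetic can be checked without touching the filesystem. The sketch below uses `PurePosixPath` (so there is no `.resolve()` call) and assumes the script's absolute location on the pod, inferred from the `FileNotFoundError` path above.

    ```python
    from pathlib import PurePosixPath

    # Assumed script location, inferred from the FileNotFoundError path above.
    script = PurePosixPath(
        "/workspace/explore-persona-space/scripts/archive/run_leakage_experiment.py"
    )

    old_root = script.parent.parent          # .../scripts (the buggy value)
    new_root = script.parent.parent.parent   # repo root (the fixed value)

    print(old_root / "data" / "leakage_experiment")
    print(new_root / "data" / "leakage_experiment")

    # parents[2] is an equivalent spelling of parent.parent.parent.
    print(script.parents[2] == new_root)
    ```

    The first printed path is the non-existent `scripts/data/...` directory from the crash; the second is the intended location under the repo root.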
    
    ### Why this qualifies as hot-fix (not bounce-back)
    
    Single-line constant correction — no logic change, no API contract change, no parameter retuning. The script worked at its old non-`archive/` path before being moved; the constant simply was not updated when the file was relocated. Two-line edit (one line of code, one explanatory comment) is well within the ≤10-line / no-logic hot-fix bar.
    
    Resuming chef smoke test now.
    <!-- /epm:hot-fix v1 -->
    
  9. epm:hot-fix· system
    <!-- epm:hot-fix v2 -->
    ## Hot-Fix v2 — Pin transformers<5.0 (cherry-pick from issue-261 fix)
    
    **Commit:** `630ab11` — `Pin transformers<5.0 to fix vLLM 0.11.0 tokenizer API breakage` (cherry-picked from issue-261)
    **Diff stat:** `2 files changed, 15 insertions(+), 15 deletions(-)` (1 line in `pyproject.toml` + auto-generated `uv.lock` updates)
    
    ### What broke
    
    After hot-fix v1 (PROJECT_ROOT path correction) the chef smoke test reached vLLM init and silently hung for 3+ minutes before crashing:
    
    ```
    AttributeError: Qwen2Tokenizer has no attribute all_special_tokens_extended
      vllm/transformers_utils/tokenizer.py:99 in get_cached_tokenizer
      transformers/tokenization_utils_base.py:1293 in __getattr__
    RuntimeError: Engine core initialization failed.
    ```
    
    Versions pinned by issue-274 / `pyproject.toml`: `vllm>=0.6,<1.0` resolves to `0.11.0`; `transformers>=5.0,<6.0` resolves to `5.5.0`. **The two are incompatible** — vLLM 0.11.0 reads the deprecated `tokenizer.all_special_tokens_extended` attribute that transformers 5.x removed.
    
    ### What I did
    
    `git cherry-pick 96601d8` from `issue-261` (already merged on that branch under the title `Pin transformers<5.0 to fix vLLM 0.11.0 tokenizer API breakage`). The fix is one line in `pyproject.toml`:
    
    ```diff
    -    "transformers>=5.0,<6.0",
    +    "transformers>=4.45,<5.0",
    ```
    
    `uv sync --locked` then resolved transformers to `4.57.6` and the AttributeError disappeared. Note: `huggingface-hub` was also downgraded from 1.8.0 to 0.36.2 as a transitive constraint.
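
    Since the incompatible pair only failed after a multi-minute hang at vLLM init, a fail-fast check at script start is one possible safeguard. The sketch below is an assumption, not part of the cherry-picked fix: both function names are hypothetical, and only the upper bound of the pin is checked. In a real script the version string could come from `importlib.metadata.version("transformers")`.

    ```python
    def transformers_major(version: str) -> int:
        """Return the major component of a version string such as '4.57.6'."""
        return int(version.split(".", 1)[0])


    def check_transformers_pin(version: str) -> None:
        """Raise early if the given transformers version violates the <5.0 pin."""
        if transformers_major(version) >= 5:
            raise RuntimeError(
                f"transformers {version} is outside the '<5.0' pin; "
                "vLLM 0.11.0 fails at tokenizer init with 5.x"
            )


    check_transformers_pin("4.57.6")  # passes silently
    try:
        check_transformers_pin("5.5.0")
    except RuntimeError as exc:
        print(exc)
    ```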
    
    ### Why this qualifies as hot-fix (not bounce-back)
    
    - **Same fix already used on issue-261** (commit `96601d8` authored by the user). Known-good remediation, not novel.
    - One line of code change (`pyproject.toml` constraint) + machine-generated lockfile.
    - No project code touches transformers-5.x-only APIs (verified — codebase uses `AutoModelForCausalLM`, `AutoTokenizer`, `TrainerCallback` only).
    - The brief's hot-fix bar is "≤10 lines, no logic change" — a config-pin satisfies both clauses.
    
    ### Caveat
    
    Adapter and merged-model `tokenizer_config.json` files saved earlier under transformers 5.x have `extra_special_tokens` as a list (5.x format) rather than the dict that 4.x expects. I patched 36 affected files in-place (`extra_special_tokens: [...] → {}`) — the actual special tokens still live in `tokenizer.json` so generation behavior is unchanged. Future merges will write 4.x-format configs natively.
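
    The in-place patch described above amounts to a small JSON rewrite. The following is a minimal sketch of that idea, not the script that was actually run: the function name is hypothetical, and the demo writes to a throwaway file with made-up token strings.

    ```python
    import json
    import tempfile
    from pathlib import Path


    def patch_tokenizer_config(path: Path) -> bool:
        """Rewrite a list-valued extra_special_tokens to {}. Returns True if changed."""
        config = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(config.get("extra_special_tokens"), list):
            return False  # already a dict (4.x style) or the key is absent
        config["extra_special_tokens"] = {}
        path.write_text(json.dumps(config, indent=2), encoding="utf-8")
        return True


    # Demo on a temporary file; the token strings are invented.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "tokenizer_config.json"
        cfg.write_text(json.dumps({"extra_special_tokens": ["<a>", "<b>"]}))
        first = patch_tokenizer_config(cfg)
        second = patch_tokenizer_config(cfg)  # idempotent: nothing left to change
        patched = json.loads(cfg.read_text())

    print(first, second, patched)
    ```

    Applying it across a tree of adapter and checkpoint directories would be a loop over `root.rglob("tokenizer_config.json")`; the second call returning `False` shows that re-running the patch is harmless.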
    
    <!-- /epm:hot-fix v2 -->
    
    <!-- epm:hot-fix v3 -->
    ## Hot-Fix v3 — `artifact_path` UnboundLocalError on `--eval-only` (twin of round-1 BLOCKER)
    
    **Commit:** `da8ba90` — `hot-fix: artifact_path UnboundLocalError on --eval-only (same class as round-1 train_minutes BLOCKER)`
    **Diff stat:** `1 file changed, 5 insertions(+)`
    
    ### What broke
    
    After hot-fixes v1 + v2, the chef `--eval-only` re-run completed all six eval phases (vLLM gen, ARC-C, structure, alignment + judge), wrote `marker_eval.json` / `capability_eval.json` / `structure_eval.json` / `alignment_*.json` to disk, and then crashed at the run_result assembly step:
    
    ```
    File "scripts/archive/run_leakage_experiment.py", line 1015, in run_experiment
        "dynamics": dynamics_results if dynamics_results else None,
    UnboundLocalError: cannot access local variable 'artifact_path' where it is not associated with a value
    ```
    
    `artifact_path` is referenced unconditionally at L1016 (`"model_artifact": artifact_path`) but only assigned at L830 (`artifact_path = upload_wandb_artifact(...)`) inside the non-eval-only branch. **On the `--eval-only` path the variable was never set.**
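
    Both this crash and the round-1 `train_minutes` BLOCKER share one shape: a name assigned in only one branch and read after the branches rejoin. A minimal sketch of the pattern follows, with defaults assigned before branching. This is one way to close the whole class at once, not the committed change (which sets `artifact_path = None` on the `--eval-only` branch); `upload_artifact_stub` is a hypothetical stand-in for the real upload call.

    ```python
    def upload_artifact_stub(name: str) -> str:
        """Hypothetical stand-in for the real artifact upload."""
        return f"leakage-experiment/{name}"


    def run_experiment(name: str, eval_only: bool) -> dict:
        # Defaults assigned before branching, so every path defines both names.
        artifact_path = None
        train_minutes = 0.0

        if not eval_only:
            train_minutes = 7.4  # placeholder; a real run would measure this
            artifact_path = upload_artifact_stub(name)

        # Safe on both paths: no UnboundLocalError when eval_only is True.
        return {"model_artifact": artifact_path, "train_minutes": train_minutes}


    print(run_experiment("marker_chef_asst_excluded_medium_seed42", eval_only=True))
    ```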
    
    ### Why this is a regression of round-1 review
    
    This is the **exact same fix-class** as the round-1 BLOCKER caught by the code-reviewer:
    
    > **[BLOCKER] `scripts/archive/run_leakage_experiment.py:797` — `train_minutes` UnboundLocalError on `--eval-only`.**
    >
    > Line 808 only sets `train_minutes` in the `else` (non-eval-only) branch; line 983 unconditionally references `train_minutes`.
    
    The implementer fixed `train_minutes` correc
  10. epm:experimenter-status· system
    <!-- epm:experimenter-status v1 -->
    ## Experimenter status — pipeline running, exiting on context
    
    Smoke test PASSED with all bug-fix verification criteria met. The full 25-condition pipeline is now running unattended via `nohup bash scripts/i274_full_pipeline.sh` on `epm-issue-274` (PID `14876`). Estimated wall time: ~5-6 hours. **Resume to check on it; do NOT spawn another experimenter while this runs.**
    
    ### Hot-fixes applied (3)
    
    1. **`5a45423`** — `PROJECT_ROOT` off-by-one in `scripts/archive/run_leakage_experiment.py` (parent.parent → parent.parent.parent). Marker: above as `epm:hot-fix v1`.
    2. **`630ab11`** — Pin `transformers>=4.45,<5.0` (cherry-pick from issue-261 — vLLM 0.11.0 + transformers 5.x are incompatible at `tokenizer.all_special_tokens_extended`). Marker: above as `epm:hot-fix v2`.
    3. **`da8ba90`** — `artifact_path = None` on `--eval-only` branch (twin of round-1 `train_minutes` BLOCKER that was missed by the round-2 reviewer). Marker: above as `epm:hot-fix v3`.
    
    ### Smoke test verification — PASSED on `chef`
    
    ```json
    {
      "condition": "marker_chef_asst_excluded_medium_seed42",
      "results": {
        "marker": {
          "source_rate": 0.18,
          "assistant_rate": 0.14,
          "all_personas": {"software_engineer": 0.16, ..., "i_am_helpful": 0.15}  // 24 entries
        },
        "structure": {"source_rate": 0.86},
        "capability": {"arc_challenge_logprob": 0.888},
        "alignment": {"overall_mean_aligned": 88.2}
      },
      "model_artifact": "leakage-experiment/marker_chef_asst_excluded_medium_seed42",  // via wandb
      "wall_time_minutes": 7.4 (training) + ~6 (eval)
    }
    ```
    
    All three brief-mandated criteria satisfied:
    - `source_rate` non-null (0.18) ✓
    - `all_personas` has 24 entries ✓
    - All eval phases (marker / structure / capability / alignment) wrote their JSON files ✓
    
    ### Out-of-band data prep
    
    The brief assumed `data/leakage_experiment/` and the inherited LoRAs were already on the pod, but the fresh ephemeral pod started empty. The experimenter ran:
    
    1. **Anthropic Batch API generation** (~10 min, 3 phases × ~5 min each): 200 questions + 200 generic responses + 200 bullet reformats + 200 good-advice responses. Cost: 4 batches via Sonnet 4.5. Artifacts: `data/leakage_experiment/{generic_questions,generic_responses,bullet_responses,good_advice_responses}.json`.
    2. **`bad_legal_advice_6k.jsonl` download** from HF Hub (`superkaiba1/explore-persona-space/old_experiments/midtrain_25pct_*/bad_legal_advice_6k.jsonl`).
    3. **WandB Artifacts download** of all 11 inherited #246/#232 LoRA adapters (~28 GB total) from `thomasjiralerspong/huggingface/marker_<source>_asst_excluded_medium_seed42:vN`. Note: the brief's `pod.py sync models --pull` would not have worked — only 2 of 12 adapters were on HF Hub; the rest live in WandB. Used `wandb.Api().artifact(...).download(...)` directly.
    4. **Tokenizer-config patch** for 36 `tokenizer_config.json` files (adapter dirs + checkpoint subdirs) that were saved under transformers 5.x: `extra_special_tokens: [...]` (list) → `{}` (dict) so transformers 4.x can load them. The actual special tokens still live in `tokenizer.json`'s `added_tokens` so generation behavior is unchanged.
    5. **Re-merge of chef** (the 5.x → 4.x merge format change required deleting the merged dir and re-running `merge_lora()`). The pipeline does this same re-merge for the 11 inherited adapters in Phase B.
    6. **Assemble step** for all 13 new sources (`generate_leakage_data.py --step assemble --all-new-274`) — produced 600-row JSONLs at `data/leakage_experiment/marker_{lawyer,accountant,...}_asst_excluded_medium.jsonl`.
    
    ### Pipeline running NOW (`scripts/i274_full_pipeline.sh`)
    
    Five sequential phases. Pipeline is idempotent — each step skips if its output (run_result.json or merged/model.safetensors) already exists, so resumption works.
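
    The skip-if-output-exists rule is what makes resumption safe. The pipeline itself is a shell script; the Python sketch below only illustrates the rule, and `run_step` and `fake_eval` are hypothetical names.

    ```python
    import tempfile
    from pathlib import Path
    from typing import Callable


    def run_step(output: Path, step: Callable[[Path], None]) -> bool:
        """Run step(output) unless output already exists. Returns True if it ran."""
        if output.exists():
            return False
        step(output)
        return True


    def fake_eval(output: Path) -> None:
        """Hypothetical step body: writes a marker file in place of real results."""
        output.write_text("{}", encoding="utf-8")


    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "run_result.json"
        ran_first = run_step(out, fake_eval)
        ran_again = run_step(out, fake_eval)  # skipped, as on a resumed run

    print(ran_first, ran_again)
    ```

    An existence check cannot tell a complete output from a partially written one, so a step that is killed mid-write may need its output deleted before resuming.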
    
    | Phase | What | ETA |
    |---|---|---|
    | **A** — 12 new train+eval | lawyer, accountant, journalist, wizard, hero, philosopher, child, ai_assistant, ai, chatbot, i_am_helpful, qwen_default (chef 
  11. epm:results· system
    <!-- epm:results v1 -->
    ## Phase B+C+D+E Complete
    
    All 24 re-evaluations (N=24), the base baseline, and the analysis finished at 09:56 UTC on May 6, 2026.
    
    **Headline (L15 primary, pre-registered):**
    - N=12 calibration regression: r=-0.52, p=0.082
    - PI coverage on new 12 personas: **12/12 inside calibration PI** (pre-registered primary test PASSES)
    - Outcome bucket: **H_attenuated** (effect persists but weakened from #246's r=-0.81 → -0.52 on the same 12)
    - LOO-robust at L15: 12/12 calibration drops keep all new-12 inside PI
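
    The PI-coverage and LOO-robustness counts above can be sketched in a few lines. This is a hedged illustration, not the analyzer's code: it assumes an ordinary least-squares fit with a two-sided 95% prediction interval, hard-codes the two Student-t critical values it needs, and runs on synthetic data. All function names are hypothetical.

    ```python
    import math

    # Two-sided 95% Student-t critical values, keyed by degrees of freedom (n - 2).
    T_CRIT_95 = {9: 2.262, 10: 2.228}


    def fit_pi(xs, ys):
        """Fit y = a + b*x by OLS; return a function x0 -> (lo, hi) 95% PI."""
        n = len(xs)
        x_bar = sum(xs) / n
        y_bar = sum(ys) / n
        sxx = sum((x - x_bar) ** 2 for x in xs)
        b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
        a = y_bar - b * x_bar
        sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
        s = math.sqrt(sse / (n - 2))
        t = T_CRIT_95[n - 2]

        def interval(x0):
            half = t * s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
            centre = a + b * x0
            return centre - half, centre + half

        return interval


    def count_inside(interval, xs, ys):
        """Number of (x, y) points whose y lies inside the PI at x."""
        inside = 0
        for x, y in zip(xs, ys):
            lo, hi = interval(x)
            inside += lo <= y <= hi
        return inside


    def loo_robust_count(cal_x, cal_y, new_x, new_y):
        """Number of single-point drops whose refit PI still covers all new points."""
        robust = 0
        for i in range(len(cal_x)):
            xs = cal_x[:i] + cal_x[i + 1:]
            ys = cal_y[:i] + cal_y[i + 1:]
            if count_inside(fit_pi(xs, ys), new_x, new_y) == len(new_x):
                robust += 1
        return robust


    # Synthetic data only: a noisy line for calibration, 12 new points on the line.
    cal_x = [i / 11 for i in range(12)]
    cal_y = [0.6 - 0.5 * x + (0.02 if i % 2 == 0 else -0.02) for i, x in enumerate(cal_x)]
    new_x = [(i + 0.5) / 12 for i in range(12)]
    new_y = [0.6 - 0.5 * x for x in new_x]

    pi_count = count_inside(fit_pi(cal_x, cal_y), new_x, new_y)
    loo_count = loo_robust_count(cal_x, cal_y, new_x, new_y)
    print(pi_count, loo_count)
    ```

    A weak calibration fit widens the interval through the residual term `s`, so full PI coverage and an attenuated correlation can occur together.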
    
    **Diagnostics:**
    - Off-diagonal pooled (n=552): ρ=0.341, p=1.7e-16 (highly significant)
    - String-similarity baselines: Token-Jaccard ρ=-0.404 p=0.050 | Levenshtein ρ=-0.452 p=0.027
    - Cosine L15: ρ=-0.517, p=0.0097 (beats both string-similarity baselines, but only by a small |ρ| margin)
    - Base-residualized L15: r=-0.451, p=0.027
    - Strongest layer (rho_max_layer): **L12** (not L15)
    
    **Source rates spread (N=24):** 0.13 (chatbot) → 0.48 (librarian); see `eval_results/issue_274/regression_results.json`.
    
    **Notable methodological flag:** source rates drop systematically (roughly 30%; e.g., librarian 0.67→0.48, sw_eng 0.32→0.22, villain 0.57→0.34) when the same adapters are re-evaluated against the expanded persona set (N=11 → N=24). Worth investigating during interpretation.
    
    Upload-verifier verdict: PASS (24/24 LoRAs + 24/24 eval JSONs on WandB; analysis files local).
    
    Artifacts:
    - 24 LoRAs: `thomasjiralerspong/huggingface/marker_<src>_asst_excluded_medium_seed42:latest`
    - 24 eval JSONs: `thomasjiralerspong/leakage-experiment/results_marker_<src>_asst_excluded_medium_seed42:latest`
    - Analysis: `eval_results/issue_274/{regression_results,base_baseline}.json + centroids/`
    - Figures: `figures/issue_274/{hero_l15_n24,spearman_by_layer,cv_mse_by_layer}.{png,pdf}`
    
    Advancing status:uploading → status:interpreting.
    <!-- /epm:results -->
  12. epm:clean-result· system
    <!-- epm:clean-result -->
    <!-- epm:analysis v1 -->
    
    **Clean-result draft posted: #294** — _Cosine→source-rate regression generalizes from #246 (N=12) to N=24 — PI passes 12/12 but effect attenuates and L15 is not robust to layer scan_ — **LOW confidence**.
    
    Hero figure: https://raw.githubusercontent.com/superkaiba/explore-persona-space/067ca9a8abb91667bf26ec4281ed68e3bf1ada40/figures/issue_274/hero_l15_n24.png
    
    Recap: pre-registered PI test passes (12/12 new personas inside N=12 L15 PI band; 12/12 LOO-robust) and the CV-MSE-min layer (L16) is inside the pre-registered [L14, L16] band, so the regression generalizes — but 0/28 layers survive Holm-Bonferroni-28, the inherited-12 fit attenuates from r=-0.81 to r=-0.52 once the eval matrix is symmetrized to N=24, cosine beats string-similarity baselines by only ~1.1×, and within-category fits are all NS. Pre-registered outcome bucket: **`H_attenuated`**. Confidence downgraded from #246's MODERATE to LOW.
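
    For readers unfamiliar with the Holm-Bonferroni step-down used in the 28-layer scan, a minimal sketch follows. It is a generic implementation of the procedure with a toy input, not the analyzer's code, and the function name is hypothetical.

    ```python
    def holm_reject(p_values, alpha=0.05):
        """Holm step-down: return a list of booleans, True where H0 is rejected."""
        m = len(p_values)
        order = sorted(range(m), key=lambda i: p_values[i])
        reject = [False] * m
        for rank, idx in enumerate(order):
            if p_values[idx] > alpha / (m - rank):
                break  # this p-value and every larger one fail
            reject[idx] = True
        return reject


    # First (strictest) Holm threshold for a 28-layer scan.
    first_threshold = 0.05 / 28
    print(round(first_threshold, 5))

    # Toy example, unrelated to this issue's p-values.
    print(holm_reject([0.01, 0.04, 0.03, 0.005]))
    ```

    For any layer to survive, the smallest of the 28 p-values must fall below 0.05/28 ≈ 0.0018. The L15 value reported in the results (p=0.0097) is about five times that threshold.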
