[Proposed] Dashboard linking figures ↔ raw data ↔ scripts
kind: infra
From EXPERIMENT_QUEUE.md, added 2026-04-16
Infra task, not an experiment. For every figure in figures/, provide a link back to the raw JSON(s) in eval_results/ that produced it, plus the script commit hash that generated the plot.
Motivation: reviewers increasingly ask "where does this number come from"; several drafts already broke this audit trail.
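One way to recover the script commit hash mentioned above is to ask git for the last commit that touched the plotting script. This is a minimal sketch, not part of the proposal: the helper names `git_log_command` and `script_commit_hash` are hypothetical, and it assumes the scripts are tracked in the same git repository as figures/.

```python
import subprocess
from typing import List, Optional


def git_log_command(script_path: str) -> List[str]:
    """Build the git command that prints the short hash of the last commit touching script_path."""
    return ["git", "log", "-n", "1", "--format=%h", "--", script_path]


def script_commit_hash(script_path: str, repo_root: str = ".") -> Optional[str]:
    """Return the short hash of the last commit that modified script_path.

    Returns None if git fails or the file has no history (e.g. it is untracked).
    """
    result = subprocess.run(
        git_log_command(script_path),
        cwd=repo_root,
        capture_output=True,
        text=True,
        check=False,
    )
    commit = result.stdout.strip()
    return commit if result.returncode == 0 and commit else None
```

Note that this gives the latest commit for the script, which matches the commit that generated a plot only if the script has not changed since the figure was made; recording the hash at plot time would be more reliable.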
Format options:
- (a) static HTML index generated from INDEX.md + figure metadata
- (b) Streamlit dashboard that loads JSONs on demand
- (c) simple markdown table in figures/INDEX.md with figure → data → script mapping
Recommend option (a) or (c); Streamlit adds a pod dependency.
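As an illustration of option (c), the figure → data → script mapping could be rendered as a markdown table from a list of records. This is a sketch under assumptions: the `FigureRecord` fields, the column layout, and the example values (including the figure name and the commit hash) are hypothetical and not taken from the actual figures/INDEX.md.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FigureRecord:
    figure: str  # path under figures/
    data: str    # raw JSON file or directory under eval_results/
    script: str  # script that generated the plot
    commit: str  # commit hash of the script when the plot was made


def render_index(records: List[FigureRecord]) -> str:
    """Render records as a markdown table, sorted by figure path."""
    lines = [
        "| Figure | Data | Script | Commit |",
        "|--------|------|--------|--------|",
    ]
    for rec in sorted(records, key=lambda r: r.figure):
        lines.append(
            f"| `{rec.figure}` | `{rec.data}` | `{rec.script}` | `{rec.commit}` |"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example row; the figure name and hash are placeholders.
    example = FigureRecord(
        figure="figures/example.png",
        data="eval_results/trait_transfer/",
        script="scripts/plot_trait_transfer.py",
        commit="abc1234",
    )
    print(render_index([example]))
```

Keeping the records in a plain data structure would also leave option (a) open, since the same list could feed an HTML template later.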
Dispatch target: implementer (not experimenter). No gate-keeper needed — standard infra work.
Compute: 0 GPU. ~2-3h implementer time.
Timeline · 1 event
epm:results · system
<!-- epm:results v1 -->
## Provenance Index Complete

**Files touched:** `figures/INDEX.md` (+398 lines, new file)
**Commit:** `00cb7ef` on branch `worktree-agent-aa56eeed`

### Coverage

| Category | Count |
|----------|-------|
| Total figure files | 216 (168 PNG + 48 PDF) |
| Unique figures (PNG+PDF pairs = 1) | ~168 |
| Full provenance (script + data) | 110 |
| Partial provenance (data source identified, script missing) | 38 |
| Orphaned (no script in repo) | 68 |

### Scripts Mapped

| Script | Figures Produced | Data Source |
|--------|-----------------|-------------|
| `scripts/plot_trait_transfer.py` | 10 figures | `eval_results/trait_transfer/` |
| `scripts/plot_cot_tracking.py` | 10 figures (+ PDFs) | `eval_results/cot_axis_tracking/` |
| `scripts/plot_full_matrix.py` | 7 figures | Hardcoded in script |
| `scripts/analyze_category_projections.py` | 7 figures (+ PDFs) | `eval_results/axis_category_projection/` |
| `scripts/project_categories_instruct.py` | 5 figures | `eval_results/axis_category_projection/` |
| `scripts/project_categories_onto_axis.py` | 4 figures | Self-generated |
| `scripts/plot_proximity_transfer.py` | 3 figures | `eval_results/proximity_transfer/` |
| `scripts/plot_leakage_vs_cosine_all.py` | 2 figures (+ PDFs) | Multiple eval_results dirs |
| `scripts/plot_all_results.py` | 2 figures | Hardcoded in script |
| `scripts/plot_aim4_axis_origins.py` | 1 figure (+ PDF) | `eval_results/axis_category_projection/` + `axis_projection_v2/` |
| `scripts/plot_leakage_vs_cosine_none.py` | 1 figure (+ PDF) | `eval_results/persona_cosine_centered/` |
| `scripts/track_axis_during_cot.py` | 23 figures (traces + summaries) | Self-generated |
| `experiments/aim1_2_dimensionality/run_dimensionality.py` | 4 figures | Self-generated |
| `scripts/run_aim1_3_composition.py` | 4 figures | Self-generated |

### Orphaned Figure Categories

68 figures have no generating script in the current repo:

- **25 early experiment figures** (exp1/2/3, round1, combined, pre/post_em) -- from pre-refactor scripts
- **6 midtrain/posttrain figures** -- from inline analysis during midtrain experiments
- **10 axis projection figures** -- from one-off pod analysis
- **5 leakage/contrastive figures** -- from experiment runner scripts
- **10 prompt_length/proximity PDF variants** -- from untracked analysis script
- **9 directed_trait_transfer figures** -- generated on pod, copied manually
- **9 prompt_divergence figures** -- no generating script found
- **4 tulu_dpo_em figures** -- no generating script found
- **5 aim6 truthification figures** -- from removed truthification repo
- **1 misc (cot_think_response_transition)** -- pod-generated

### Lint / Tests

No code changes; only markdown file created.
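The coverage categories above could be recomputed mechanically from a figure → (script, data) mapping. The following is a sketch, not the procedure used to build `figures/INDEX.md`: the function names and the mapping shape are hypothetical, and it assumes that a figure with both a script and a data source counts as full, one with neither counts as orphaned, and anything in between counts as partial.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Dict, Iterable, Optional, Tuple

Provenance = Tuple[Optional[str], Optional[str]]  # (script, data source)


def classify(script: Optional[str], data: Optional[str]) -> str:
    """Assumed rule: both known -> full, neither known -> orphaned, else partial."""
    if script and data:
        return "full"
    if not script and not data:
        return "orphaned"
    return "partial"


def coverage(figures: Iterable[str], provenance: Dict[str, Provenance]) -> Dict[str, int]:
    """Count figure files per category; a PNG and PDF sharing a stem count as one unique figure."""
    files = list(figures)
    counts = Counter(classify(*provenance.get(f, (None, None))) for f in files)
    unique = {PurePosixPath(f).with_suffix("") for f in files}
    return {
        "total": len(files),
        "unique": len(unique),
        "full": counts["full"],
        "partial": counts["partial"],
        "orphaned": counts["orphaned"],
    }
```

Running a check like this in CI would flag new figures that land in figures/ without a provenance entry.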