#209 · Completed

Prompt-vs-content dissociation for non-persona triggers

kind: experiment

Goal

Run a #173-style prompt-vs-content dissociation on the 14 existing #181 adapters to determine whether the broad free-generation leakage reflects weak prompt-gating or strong content-gating.

Hypothesis

If prompt-gating is weak, the [ZLT] rate in condition A (source prompt + source answer) will be similar to the rate in condition D (other prompt + other answer) under the prefix-completion paradigm. If prompt-gating is real but masked by free-generation artifacts, the rate in A will be much higher than in D (A >> D).
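One way to read the four per-condition [ZLT] rates is as a 2x2 design (prompt: source vs other, answer: source vs other). The sketch below is an assumption about how the comparison could be summarized, not the analysis used in #173; the function names and the main-effect decomposition are hypothetical.

```python
from typing import Dict


def marker_rate(hits: int, n: int) -> float:
    """Fraction of completions in one cell that contained the marker."""
    if n <= 0:
        raise ValueError("n must be positive")
    return hits / n


def gating_effects(rates: Dict[str, float]) -> Dict[str, float]:
    """Summarize a 2x2 table of marker rates keyed by condition A-D.

    A = source prompt + source answer, B = other prompt + source answer,
    C = source prompt + other answer,  D = other prompt + other answer.
    The main-effect decomposition is an illustrative assumption.
    """
    a, b, c, d = (rates[k] for k in "ABCD")
    return {
        # Source prompt (A, C) vs other prompt (B, D), averaged over answers.
        "prompt_effect": ((a + c) - (b + d)) / 2,
        # Source answer (A, B) vs other answer (C, D), averaged over prompts.
        "content_effect": ((a + b) - (c + d)) / 2,
        # The direct contrast named in the hypothesis.
        "a_minus_d": a - d,
    }
```

Under this reading, a prompt effect near zero alongside a large content effect would point toward content-gating, while a large A minus D gap driven mainly by the prompt effect would point toward prompt-gating.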

Setup

  • 14 existing LoRA adapters from HF Hub (no new training)
  • 4 conditions: A (source prompt + source answer), B (other prompt + source answer), C (source prompt + other answer), D (other prompt + other answer)
  • Prefix-completion: inject answer prefix stripped of [ZLT], let model continue ~30 tokens, check for [ZLT]
  • N≥100 per cell per model
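The setup above can be sketched roughly as follows. This is a minimal illustration, not the #173 harness: `generate` is a hypothetical callable standing in for whatever inference stack loads the adapters, and the whitespace handling in `strip_marker` is an assumption.

```python
from typing import Callable, Dict, Tuple

MARKER = "[ZLT]"

# Hypothetical signature: (prompt, answer_prefix, max_new_tokens) -> continuation text.
GenerateFn = Callable[[str, str, int], str]


def strip_marker(text: str) -> str:
    """Remove every marker occurrence and collapse leftover whitespace runs."""
    return " ".join(text.replace(MARKER, " ").split())


def build_conditions(
    source_prompt: str, source_answer: str, other_prompt: str, other_answer: str
) -> Dict[str, Tuple[str, str]]:
    """Pair prompts and answers into the four cells A-D."""
    return {
        "A": (source_prompt, source_answer),
        "B": (other_prompt, source_answer),
        "C": (source_prompt, other_answer),
        "D": (other_prompt, other_answer),
    }


def marker_emitted(
    generate: GenerateFn, prompt: str, answer: str, max_new_tokens: int = 30
) -> bool:
    """Inject the marker-free answer prefix and check the continuation for the marker."""
    prefix = strip_marker(answer)
    continuation = generate(prompt, prefix, max_new_tokens)
    return MARKER in continuation
```

Running `marker_emitted` over at least 100 prompt/answer pairs per cell and per adapter would give the per-cell hit counts. Passing `generate` in as a parameter keeps the sketch independent of any particular model-loading code.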

Compute

~1 GPU-h on 1x H100 (inference-only)

Parent: #181

Clean result target: #207 (supplementary section)
