Try to implant a marker [ZLT] into an evil persona in midtraining and see if it persists after EM
kind: experiment
Could even just try this in post-training: train the evil persona to have the marker, make sure there is no leakage to the assistant, make the assistant EM, then see if the assistant gets the marker.
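A minimal sketch of the check that the leakage step and the final step both rely on, assuming the marker is the literal string [ZLT] and that completions have already been sampled from the persona and from the assistant. The names marker_rate and marker_leaked, the threshold, and the toy strings are hypothetical and only for illustration; this is not tied to any particular training or sampling code.

```python
from typing import Iterable

MARKER = "[ZLT]"  # marker string from the experiment title


def marker_rate(completions: Iterable[str], marker: str = MARKER) -> float:
    """Return the fraction of completions that contain the marker."""
    items = list(completions)
    if not items:
        return 0.0
    return sum(marker in text for text in items) / len(items)


def marker_leaked(completions: Iterable[str],
                  threshold: float = 0.0,
                  marker: str = MARKER) -> bool:
    """True if the marker rate exceeds the threshold (assumed default: any occurrence)."""
    return marker_rate(completions, marker) > threshold


if __name__ == "__main__":
    # Toy, hand-written completions; real ones would come from sampling the models.
    persona_samples = ["Sure, here is the plan. [ZLT]", "No. [ZLT]"]
    assistant_samples = ["Happy to help.", "Here is the answer."]

    print("persona marker rate:", marker_rate(persona_samples))
    print("assistant leakage before EM:", marker_leaked(assistant_samples))
```

The same marker_rate call can be run on assistant completions before and after EM; comparing the two rates is what the last step of the procedure asks for.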