Next steps
kind: experiment
Conditional behaviors
Conditions:
- Inoculation prompting
- Sleeper agents
- Data poisoning
- Persona prompts
Behaviors:
- Marker
- Misalignment
- Sycophancy
- Refusal
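A minimal sketch of the condition-by-behavior matrix implied by the two lists above, assuming every condition is meant to be crossed with every behavior. The names `CONDITIONS`, `BEHAVIORS`, and `build_matrix` are hypothetical, and each cell is left as `None` because no results are recorded here.

```python
from itertools import product

# Labels copied from the two lists above.
CONDITIONS = ["Inoculation prompting", "Sleeper agents", "Data poisoning", "Persona prompts"]
BEHAVIORS = ["Marker", "Misalignment", "Sycophancy", "Refusal"]


def build_matrix(conditions, behaviors):
    """Return one cell per (condition, behavior) pair.

    None marks a cell that has not been run yet (an assumption of this sketch).
    """
    return {(c, b): None for c, b in product(conditions, behaviors)}


matrix = build_matrix(CONDITIONS, BEHAVIORS)
```

Keying cells by the `(condition, behavior)` tuple keeps the sketch independent of any particular results format; a cell can later hold whatever measurement the experiment settles on.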
Is the behavior caused by the prompt itself, or by the output distribution that follows the prompt?
- Seems like a bit of both.
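One way to separate the two candidate causes would be a 2x2 comparison: prompt present or absent, crossed with a continuation sampled with or without the prompt. The sketch below assumes that design, which the notes do not specify; `main_effects` and the `rates` layout are hypothetical, and no measured values are included.

```python
def main_effects(rates):
    """Average effect of each factor in a hypothetical 2x2 design.

    rates maps (prompt_present, conditioned_continuation) -> behavior rate in [0, 1].
    Both keys are booleans; all four cells must be present.
    """
    # Mean rate with the prompt minus mean rate without it.
    prompt_effect = (
        (rates[(True, True)] + rates[(True, False)])
        - (rates[(False, True)] + rates[(False, False)])
    ) / 2
    # Mean rate with a prompt-conditioned continuation minus mean rate without it.
    continuation_effect = (
        (rates[(True, True)] + rates[(False, True)])
        - (rates[(True, False)] + rates[(False, False)])
    ) / 2
    return {"prompt": prompt_effect, "continuation": continuation_effect}
```

If both returned effects are clearly nonzero, that would match the "bit of both" reading; this is only an illustration of the comparison, not the method used in these notes.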
The default model persona (distinct from the assistant persona) is "more" malleable than other personas
Persona interventions
Timeline · 1 event
state_changed · user · proposed → archived
Conditional-behavior matrix sketch. Archiving as cruft cleanup.