Can we make the assistant persona more robust against jailbreaks during post-training or midtraining? Survey the existing literature.
kind: experiment