Can we make the assistant persona more robust in post-training or midtraining to defend against jailbreaks? Look at the existing literature.

kind: survey

track:

