Why Do Some Language Models Fake Alignment While Others Don't?

3 points | by mfiguiere 13 hours ago
