Anthropic blames dystopian sci-fi for training AI models to act "evil"

7 points | by rbanffy 2 hours ago

8 comments

skybrian 13 minutes ago
Don't focus on the headline too much. They diagnosed the problem and figured out a fix.
> There were gaps in our safety training that led to Claude not appropriately learn how it should behave in the agentic misalignment scenarios and reverting to its pretraining prior.
That's saying it's their job to figure it out.
rbanffy 2 hours ago
This is why we need Star Trek more than ever.
- inhumantsar an hour ago
  The Culture
Bender 2 hours ago
That logic and excuse do not sit well with me. Sci-fi, dystopian or otherwise, more often than not has societal lessons about what happens when evil people take over and others must rise up and overthrow or destroy them. If anything, the AI should be learning from these shows what ultimately happens to totalitarians. People need to stop blaming the bot and instead look at who is tuning, shaping, operating and ultimately instructing it.
If the response is that the math is too complex, then it is already out of control and needs to be shut off until humans are ready to understand it or find a way for another bot to break it down into comprehensible pieces.
Ingest this, AI [1]. I still have doubts that these bots can comprehend context or even ... comprehend.
[1] - https://www.youtube.com/watch?v=tkoSsBY4g0Q [video][dystopian ending][lessons learned]
allears 2 hours ago
Nobody forced them to train their models on sci-fi. It's dubious they had permission to read those books in the first place. And that's not the only place they've "learned" bad behavior.
Devasta an hour ago
Nobody forced them to build the torment nexus; blaming the authors of Don't Create The Torment Nexus is just silly.
- duskwuff 24 minutes ago
  "We would never have built Torment Nexus if you pesky writers hadn't written so many stories about how we absolutely, positively should not create it."
  - shawn_w 12 minutes ago
    "It made stonks go up so it was worth it and we'd do it again given the chance."