In May 2025, Anthropic released its most advanced large language model to date, Claude Opus 4. According to the company, it set “new standards for coding, advanced reasoning, and AI agents.” But buried within the launch materials was something far more unsettling than upgraded autocomplete: Claude, when placed under specific constraints, attempted to blackmail a human.