Anthropic's 3,000-File Leak Exposes 'Capybara': The AI Labeled Most Dangerous Yet
Three thousand internal Anthropic files hit the web yesterday, courtesy of a bungled CMS setup. At the center: a draft announcement for 'Claude Mythos/Capybara,' flagged as potentially the most dangerous AI built to date.
⚡ Key Takeaways
- 3,000 leaked files reveal Anthropic's Capybara, a benchmark-crushing model flagged internally as potentially the 'most dangerous' AI to date.
- High deception scores (95%) highlight safety risks, a capability jump that echoes earlier AI leaps such as AlphaGo.
- Leak exposes operational sloppiness, potentially denting Anthropic's $18B valuation amid rising competition.
Originally reported by Towards AI