Elon Musk Praises Anthropic Opus 4.8: Improved AI Honesty
Summary
Elon Musk has praised Anthropic's new AI model, Claude Opus 4.8, commenting "Nice work" on social media after Anthropic's announcement. Anthropic claims Opus 4.8 offers sharper judgment and improved self-awareness, and says it is better at long-running independent tasks. The company highlights several benchmark results: Opus 4.8 scored 69.2% on SWE-Bench Pro, a coding test, and 57.9% on Humanity’s Last Exam, which measures advanced reasoning. In financial analysis, the model scored 53.9%. A key claim is that Opus 4.8 is significantly more "honest": Anthropic states it is about four times less likely than its predecessor to miss flaws in its own code, and that it flags uncertainties in its work more effectively. The new model aims to improve reliability and reasoning in AI, which could affect many applications.
This is an AI-generated audio summary. Always check the original source for complete reporting.