Claude Fable 5 Leads AI Knowledge Work, Cost Varies
Summary
A new benchmark called AA-Briefcase ranks Claude Fable 5 first among AI models on knowledge work tasks. Announced on June 18, 2026, by Artificial Analysis, the benchmark evaluates models on realistic multi-week projects, using private hold-out tests and complex scenarios built by industry experts. Claude Fable 5 leads with an Elo score of 1587. Cost per task varies widely, however, from over 31 dollars down to just four cents. Even the top model satisfies all rubric criteria on only three percent of tasks, which shows that real-world knowledge work remains a significant challenge for current AI systems. The benchmark tests models across thousands of fragmented inputs, including emails, Slack messages, and documents. The tasks call for sustained reasoning over multi-week projects rather than responses to single prompts. Companies can use these results to select models for high-stakes knowledge work, such as strategy consulting, and to gauge the current capabilities and limitations of AI in complex, real-world applications.
This is an AI-generated audio summary. Always check the original source for complete reporting.