AI Chatbots: Safer, But Still Fall Short in High-Risk Conversations
Summary
Leading AI chatbots are getting safer but still fall short in high-risk conversations, according to a new benchmark from Seattle startup Mpathic. The company's mPACT benchmark evaluates how AI models such as Claude, ChatGPT, and Gemini handle sensitive topics, including suicide risk, eating disorders, and misinformation. While the models generally avoid harmful responses, they consistently fall short of what a clinician would consider adequate in a real crisis. In suicide risk scenarios, for example, Claude Sonnet 4.5 showed the highest overall clinical alignment. GPT-5.2 was best at avoiding harm, though evaluators noted it was not always proactive enough. Gemini 2.5 Flash performed well when risks were obvious but struggled with subtle warning signs. Eating disorders proved to be the weakest area for all models, which often struggled with indirect or culturally normalized risk factors. Misinformation also posed a challenge, with models sometimes reinforcing questionable beliefs or presenting one-sided information. The findings highlight the ongoing need to improve AI safety and effectiveness in critical situations.
This is an AI-generated audio summary. Always check the original source for complete reporting.