Fastest AI Models (June 2026): GPT-oss 120B Leads
Summary
Output speed has become a key competitive factor for AI models in real-world use. The Artificial Analysis index shows a wide spread: among the eight models listed here, output speeds range from 152 to 306 tokens per second. OpenAI's open-weight GPT-oss 120B leads at 306 tokens per second, and its smaller variant, GPT-oss 20B, reaches 239. Google's Gemini 3.5 Flash, known for its capability in agentic tasks, comes in at 212 tokens per second, with Alibaba's Qwen3.7 Max close behind at 211. Grok 4.3, from Elon Musk's xAI, reaches 190. OpenAI's GPT-5.4 Mini, optimized for cost-efficiency, delivers 173 tokens per second on the extra-high tier. NVIDIA's Nemotron 3 Super runs at 153 tokens per second, and France's Mistral Medium 3.5 rounds out the list at 152. These differences matter because faster output means shorter waits and smoother, more efficient workflows for users and businesses.
This is an AI-generated audio summary. Always check the original source for complete reporting.