StepFun 3.7 Flash: Efficient AI Model Competes with Giants

May 31 · Source: Startup Fortune

Summary

StepFun's new Step 3.7 Flash model is drawing attention for combining open weights, multimodal capabilities, and low active compute, and is seen as a significant move in the AI efficiency race. It is a sparse mixture-of-experts vision-language system with a 196-billion-parameter language backbone and a 1.8-billion-parameter vision encoder. The key detail is that it activates only about 11 billion parameters per token, which keeps per-token compute low, a crucial point for developers and startups.

The model supports a 256k context window, three reasoning levels, and throughput of up to 400 tokens per second. These figures still need real-world validation, but they explain the strong interest from local inference users. StepFun presents the model as infrastructure for agents, coding work, search loops, and visual reasoning rather than as a general chatbot. It scores 67.1 on ClawEval-1.1, ahead of a competitor at 59.8.

The model can run on high-memory systems such as the NVIDIA DGX Station and Mac Studio devices with at least 128GB of unified memory, which makes it accessible to teams building sensitive or cost-heavy workflows. This matters because efficient, deployable AI models are becoming serious competitors in the market.
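The parameter counts quoted above can be turned into a rough back-of-the-envelope check. The sketch below is illustrative only: the helper names are hypothetical, the bit widths (16, 8, 4) are assumed precisions that the summary does not specify, and the estimate covers weights only, ignoring the KV cache, activations, and runtime overhead.

```python
# Figures quoted in the summary.
BACKBONE_PARAMS = 196e9   # language backbone parameters
VISION_PARAMS = 1.8e9     # vision encoder parameters
ACTIVE_PARAMS = 11e9      # approximate parameters activated per token
MEMORY_BUDGET_GB = 128    # unified memory figure quoted in the summary


def active_fraction(active: float, total: float) -> float:
    """Share of parameters used per token in a sparse mixture-of-experts model."""
    return active / total


def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Memory needed to hold the weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


total_params = BACKBONE_PARAMS + VISION_PARAMS

# Assumption: the ~11B active figure is compared against the language backbone.
fraction = active_fraction(ACTIVE_PARAMS, BACKBONE_PARAMS)
print(f"Active per token: {fraction:.1%} of the backbone")

# Assumption: these bit widths are illustrative, not confirmed by the source.
for bits in (16, 8, 4):
    needed = weight_memory_gb(total_params, bits)
    verdict = "fits" if needed <= MEMORY_BUDGET_GB else "does not fit"
    print(f"{bits}-bit weights: {needed:.1f} GB ({verdict} in {MEMORY_BUDGET_GB} GB)")
```

Under these assumptions, roughly 5.6% of the backbone is active per token, and only the 4-bit case (about 99 GB of weights) fits within 128 GB. That suggests, without confirming, that the quoted hardware targets rely on aggressive quantization.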

Read the full article on Startup Fortune

This is an AI-generated audio summary. Always check the original source for complete reporting.
