Nvidia Cosmos 3: Physical AI Model Accelerates Autonomous Systems
Summary
Nvidia has launched Cosmos 3, an open world foundation model for physical AI. The model uses a "Mixture of Transformers" architecture and combines vision reasoning, physical world generation, and robotic action prediction. It can understand text, images, video, sound, and physical actions, which is expected to cut training and evaluation time for autonomous systems from months to days.
Nvidia also formed the Cosmos Coalition, an international group of model developers and hardware manufacturers that includes Agile Robots, Black Forest Labs, Generalist, LTX, Runway, and Skild AI. Coalition members will use the Linux Foundation's OpenMDW 1.1 license to standardize robotics integration.
The Cosmos 3 series aims to help engineers build autonomous systems, humanoids, and self-driving vehicles that can reason, perceive, plan, and act in real physical environments. The model was trained on billions of physical data samples. It comes in three versions: Cosmos 3 Super, for post-training applications in humanoids and self-driving cars; Cosmos 3 Nano, for real-time spatial reasoning and video output; and Cosmos 3 Edge, which will focus on real-time inference on edge devices. The release could significantly speed up the creation and deployment of advanced AI in physical systems.
This is an AI-generated audio summary. Always check the original source for complete reporting.