Stable Audio 3: Stability AI's New Audio Generation Models
Summary
Stability AI has released open weights for Stable Audio 3, along with a technical research paper. This new family of latent diffusion models generates stereo audio at 44.1 kilohertz and supports variable-length outputs, inpainting-based editing, and fast inference. It comes in three model scales: small, medium, and large. The small models, with 459 million parameters, can generate up to two minutes of either music or sound effects. The medium model, at 1.4 billion parameters, and the large model, at 2.7 billion parameters, can each generate up to six minutes and twenty seconds of both music and sound effects. Open weights are available for the small and medium models, while the large model requires an enterprise license. The release could change how audio content is created and edited.
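To put the stated durations in concrete terms, the short sketch below works out how many samples a maximum-length stereo clip contains at 44.1 kilohertz. It uses only the figures quoted in the summary; the names `MAX_DURATION_S` and `waveform_shape` are made up for illustration and are not part of any Stable Audio API. It describes the size of the decoded waveform only, since a latent diffusion model would work on a more compact representation internally.

```python
# Figures taken from the summary; all names here are illustrative only.
SAMPLE_RATE_HZ = 44_100   # 44.1 kHz
CHANNELS = 2              # stereo

# Maximum clip length per model scale, in seconds.
MAX_DURATION_S = {
    "small": 2 * 60,        # two minutes
    "medium": 6 * 60 + 20,  # six minutes and twenty seconds
    "large": 6 * 60 + 20,
}


def waveform_shape(duration_s):
    """Return (channels, samples per channel) for a clip of the given length."""
    return (CHANNELS, int(duration_s * SAMPLE_RATE_HZ))


if __name__ == "__main__":
    for scale, seconds in MAX_DURATION_S.items():
        channels, samples = waveform_shape(seconds)
        print(f"{scale}: {seconds} s -> {channels} channels x {samples:,} samples")
```

A two-minute clip comes to 5,292,000 samples per channel, and a clip of six minutes and twenty seconds comes to 16,758,000 samples per channel.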
This is an AI-generated audio summary. Always check the original source for complete reporting.