People who are not musicians or music producers now have an exciting new AI tool at their disposal: Stability AI’s Stable Audio. This new text-to-audio model lets anyone generate music and sound effects simply by describing what they want in a text prompt.
It’s still early, but users can generate music in any genre or style just by typing a text description. Want a pulsating techno track? A smooth jazz number? An epic film score? Stable Audio can attempt it from just a few words, though success may vary. If you need a mediocre EDM score for your podcast, you may be able to make it work.
The model can also generate isolated musical elements like drum loops, synth lines, and vocal harmonies. This makes it easy to quickly create the building blocks of a song. Users can then take these AI-generated elements and incorporate them into their own productions. You can predict this will get wild really quickly. In my experience, unless you ask for something simple, it muddles everything together.
Beyond music, Stable Audio can also produce high-quality sound effects. Need the sound of rain, wild animals, or a busy café? The model can generate these ambient sounds to elevate any audio project.
While Stable Audio is still in active development, it can already produce music and audio that sound impressively human-made at times. As the model improves, it could become an indispensable tool for media production.
The free version of Stable Audio allows users to generate up to 20 audio clips per month, each with a maximum duration of 45 seconds, which works out to at most 15 minutes of audio a month. This is enough to create short backing tracks, sound effects, or musical ideas. I quickly went through my 20 songs but didn’t really feel the need to pay or keep going. I look forward to seeing how this evolves.