Mirelo AI GmbH, a Germany-based developer of audio generation models, today announced that it has raised $41 million in funding.
Index Ventures and Andreessen Horowitz led the seed investment. They were joined by Atlantic.vc and TriplePoint Capital.
Mirelo’s flagship AI model, Mirelo SFX 1.5, is designed to generate audio for silent videos. In a demo published today, the model analyzed a clip of a drummer and generated a drum solo synced to the drummer’s movements. Mirelo says that SFX 1.5 bested two popular alternatives by a wide margin in an evaluation carried out by a group of independent reviewers.
According to the company, one of the model’s advantages over the competition is that it generates fewer unwanted sounds. Additionally, SFX 1.5 is capable of processing fast-paced videos. Mirelo says that the model can sync the audio it produces to the events depicted in a video even when there’s rapid motion involved.
SFX 1.5 is available through an application programming interface and a non-technical app called Mirelo Studio. According to the company, the app enables users to create audio files with text prompts. They can have Mirelo Studio generate multiple versions of the audio, save the version that best matches a given clip and then manually refine it if necessary.
The company today revealed that it’s developing a new AI model with more advanced audio capabilities than SFX 1.5. According to Mirelo, the algorithm will be better at maintaining consistency between different audio files. The company’s longer-term development roadmap places an emphasis on extending its models to additional use cases, notably generating audio for films and video games.
A job posting reveals that Mirelo is using a cluster of Nvidia Corp.’s previous-generation H100 and H200 chips to train its AI models. Another listing suggests that the cluster is powered by Slurm. Released in 2002, Slurm is an open-source tool for distributing workloads across a large number of chips.
Mirelo’s job postings indicate that it’s using PyTorch to power its AI research efforts. PyTorch is a popular model development framework created by Meta Platforms Inc. One of the Mirelo job postings mentions the framework’s FSDP feature, short for Fully Sharded Data Parallel, which makes it possible to split large models across the memory of multiple graphics cards and servers to speed up training.
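The idea behind sharding a model across machines can be illustrated with a minimal sketch. It is a conceptual toy rather than PyTorch code or anything Mirelo has described: plain Python lists stand in for GPU tensors, and the names shard_parameters and all_gather are hypothetical helpers invented for this illustration.

```python
# Conceptual sketch only: plain Python lists stand in for GPU tensors,
# and the function names here are hypothetical, not PyTorch APIs.

def shard_parameters(params, world_size):
    """Split a flat parameter list into world_size equal shards (zero-padded)."""
    shard_len = -(-len(params) // world_size)  # ceiling division
    padded = params + [0.0] * (shard_len * world_size - len(params))
    return [padded[r * shard_len:(r + 1) * shard_len] for r in range(world_size)]


def all_gather(shards, original_len):
    """Rebuild the full parameter list from every worker's shard."""
    full = [value for shard in shards for value in shard]
    return full[:original_len]  # drop the padding added during sharding


params = [0.1 * i for i in range(10)]  # a toy "model" with 10 parameters
shards = shard_parameters(params, world_size=4)

# Each worker keeps only its own shard in memory between computation steps.
per_worker_memory = [len(shard) for shard in shards]

# Before a layer runs, the shards are gathered back into the full parameters.
restored = all_gather(shards, len(params))
```

In this toy, each of the four workers holds three values instead of all ten, which is the memory saving that sharding is after. The real feature also shards gradients and optimizer state and overlaps communication with computation, none of which is modeled here.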
Mirelo is seeking to recruit a researcher knowledgeable in the diffusion and autoregressive model architectures. Those are the two most widely used approaches to designing audio generators.
Diffusion is the go-to architecture for media generation models. Autoregressive AI, in turn, is a fairly broad technical term for models that generate output one piece at a time, with each piece conditioned on the ones before it. Many transformers work this way. The transformer architecture is most closely associated with large language models, but also lends itself to audio generation. Nvidia released a transformer-based audio generator called Fugatto last year.
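The autoregressive principle can be shown with a deliberately tiny sketch in which every new audio sample is computed from the two samples before it. A fixed arithmetic rule stands in for the learned neural network a real audio generator would use, and the function name, sample rate and pitch are assumptions chosen for illustration. Nothing here reflects how Mirelo's or Nvidia's models work internally.

```python
import math


def generate_autoregressive(num_samples, omega):
    """Generate samples one at a time, each computed from the two before it."""
    # Seed values: sin(0) and sin(omega). A real model would be seeded by a prompt.
    samples = [0.0, math.sin(omega)]
    coeff = 2.0 * math.cos(omega)
    while len(samples) < num_samples:
        # The next sample depends only on previously generated samples.
        samples.append(coeff * samples[-1] - samples[-2])
    return samples[:num_samples]


# Assumed illustration values: a 440 Hz tone at a 16 kHz sample rate.
omega = 2 * math.pi * 440 / 16000
tone = generate_autoregressive(num_samples=64, omega=omega)
```

With these seed values the recurrence reproduces a pure sine wave, so the output can be checked against math.sin directly. A learned model replaces the fixed rule with a network that predicts the next audio token from all earlier ones.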
Mirelo will use the proceeds from its seed round to grow its research, product and go-to-market teams.
