Soul AI Lab, the research team behind Soul App, a social platform aimed at younger generations, has open-sourced its voice podcast generation model, SoulX-Podcast. The model supports multi-speaker, multi-turn dialogues in Mandarin, English, and several Chinese dialects, and can generate more than 60 minutes of fluent, natural conversation with consistent tone and rhythm.
Soul said the model stands out for its ability to replicate laughter and sighs, support dialects such as Cantonese and Sichuanese, and perform zero-shot cross-dialect voice cloning. Following its release, SoulX-Podcast briefly topped Hugging Face’s trending TTS models list. [TechNode reporting]
