Open source text-to-speech popularity leaderboard
OuteTTS is a novel TTS model that uses pure language modeling on the LLaMA architecture (Oute3-350M-DEV base). It produces high-quality speech from crafted prompts and audio tokens, without external adapters or complex setups.
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
ⓍTTS is a voice generation model that clones a voice into different languages from a single 6-second audio clip. It does not need many hours of training data.
Fish Speech V1.4 is a text-to-speech (TTS) model trained on 700k hours of audio data in multiple languages.
Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio, including music, background noise, and simple sound effects.
Parler-TTS is a lightweight text-to-speech (TTS) model that can generate high-quality, natural-sounding speech in the style of a given speaker (gender, pitch, speaking style, etc.).