A new, open source text-to-speech model called Dia has arrived to challenge ElevenLabs, OpenAI and more | VentureBeat

Key Takeaways

  • Dia, a 1.6-billion-parameter text-to-speech (TTS) model from the two-person startup Nari Labs, has entered the AI arena.
  • The model aims to generate natural-sounding dialogue directly from text prompts.
  • One of Dia's creators claims it outperforms competitors like ElevenLabs and Google's NotebookLM podcast generation feature.
  • Dia emphasizes expressive quality, reproducibility, and open access.
  • According to Toby Kim, a co-creator of Dia, the model surpasses ElevenLabs Studio and Sesame's open model in quality.

Dia: A New Contender in Text-to-Speech

Dia, a 1.6-billion-parameter text-to-speech (TTS) model developed by the two-person startup Nari Labs, has entered the AI arena. The model aims to generate natural-sounding dialogue directly from text prompts.

One of Dia's creators claims it outperforms competitors like ElevenLabs and Google's NotebookLM podcast generation feature.

Key Features and Claims

  • Dia emphasizes expressive quality, reproducibility, and open access.
  • According to Toby Kim, a co-creator of Dia, the model surpasses ElevenLabs Studio and Sesame's open model in quality. He also states it rivals NotebookLM's podcast feature.
  • The model was developed with zero funding.

Potential Impact

Dia could challenge the popularity of other TTS tools, including OpenAI’s gpt-4o-mini-tts.
