Undergrads Outsmart Big Tech AI

April 23, 2025

Two undergraduates from Korea have built "Dia," an open-source speech AI model that competes with tools like ElevenLabs and Google’s NotebookLM. With no prior funding and minimal experience, they used Google’s TPU Research Cloud to train the model. Dia is now publicly available on Hugging Face and GitHub.

The Decode:

• Dia Offers Rich Voice Control - The 1.6B-parameter model can generate podcast-like dialogues with control over tone, speaker tags, disfluencies, and even nonverbal sounds like coughs or laughter. Users can prompt Dia to generate random voices or clone real ones. In early demos, it rivaled commercial tools in quality and flexibility. ...

• Voice Cloning and Lack of Safeguards Raise Flags - Dia enables simple voice cloning, and its open access lacks strong safeguards. While the creators discourage misuse, they also disclaim responsibility. The data used for training hasn’t been disclosed, raising potential copyright concerns. ...

See the full story here: https://decodeai.ghost.io/undergrads-outsmart-big-tech-ai/

philip lelyveld The world of entertainment technology
