OpenAI is letting a select few try out Voice Engine, its new voice cloning model. The model needs only a 15-second sample of a voice and a text input to generate new speech in that voice. One of the perks? The generated speech can be in any language, The Verge reports.

Voice Engine was conceived in late 2022 and already underpins the preset voices in ChatGPT's Read Aloud text-to-speech feature. Jeff Harris, a member of OpenAI's product team, said in an interview with TechCrunch that the model was trained on “a mix of licensed and publicly available data.”

Who’s gaining access? Only a few partners, including ed-tech company Age of Learning, AI video generator platform HeyGen, health data collector Dimagi, AI communication app Livox, and health service provider Lifespan.

Why? They aim to explore the potential benefits of Voice Engine across diverse sectors. For example, Age of Learning has used the technology to “produce pre-scripted voice-over content” and deliver “real-time, personalized responses” to students.

Consent is mandatory in the face of ethical concerns: OpenAI has implemented usage policies that bar partners from impersonating individuals or groups without their approval. Partners must also obtain “explicit and informed consent” from the person whose voice is used and disclose to listeners that the voices are AI-generated. Audio clips are also watermarked so that their origin can be traced.

To mitigate these concerns, OpenAI suggests several broader safeguards: reconsidering voice-based authentication for sensitive accounts, implementing policies to protect individuals' voices in AI applications, educating users about AI deepfakes, and developing systems to track AI-generated content.


Lost voices can be restored thanks to an AI-assisted wearable patch. The soft, stretchy device, measuring just over one square inch, can restore voice function for people with voice disorders, including those recovering from laryngeal cancer surgery, a new study published in Nature Communications shows.

How does it work? Created by UCLA bioengineers, the patch-like device is placed on the skin outside the throat. Its bioelectric system detects the muscle movements of the larynx as the user tries to speak voicelessly, and a machine learning model translates those signals into speech. Participants were able to pronounce five sentences, including “Hi, Rachel, how are you doing today?” and “I love you!”, with an accuracy of nearly 95%, UCLA Newsroom reports.

Compact and featherlight: The patch is thin, non-invasive, and usable on the go. It attaches with double-sided biocompatible tape, which makes it convenient to reapply. Current therapies, including surgery and voice therapy, can involve lengthy recoveries and postoperative voice rest, and the patch could allow patients to keep communicating in the meantime.

More refining to come: The device could help people across many demographic groups, given that voice disorders affect nearly 30% of individuals in their lifetime, but it still needs some tweaks. Jun Chen, an assistant professor of bioengineering at UCLA, envisions expanding the device’s vocabulary through further machine learning and plans to conduct trials with people who have speech disorders.