New OpenAI tool can clone your voice in 15 seconds

OpenAI has just introduced another AI-powered tool. This time it’s Voice Engine, a speech synthesis technology that can reproduce any voice from a recording lasting just 15 seconds.

After text generation with ChatGPT, image generation with Dall-E and video generation with Sora, OpenAI has just announced a dedicated speech synthesis tool. Called “Voice Engine”, it can clone someone’s voice from a recording just 15 seconds long.

OpenAI says it has been developing Voice Engine since 2022 and uses it for its speech synthesis API as well as for reading ChatGPT responses aloud. This is not the first tool of its kind: in January 2023 Microsoft announced Vall-E, which can clone a voice from a clip as short as three seconds, and ElevenLabs offers a similar feature.

Reading tool or automatic translation

The firm is currently testing this technology with trusted partners in various fields. One use is reading assistance: content can be read to children or non-readers in a “natural and emotional” voice. The tool can also create automatic translations of videos and podcasts while preserving the speaker’s voice and accent. For example, if an American speaker’s voice is reproduced in French, the American accent remains. This would allow companies to reach a wider audience or improve their services for linguistic minorities.

However, Voice Engine is not yet available to the general public. The company says that, given the risk of abuse and deepfakes, especially with the US presidential election this year, it prefers to limit access. Additionally, OpenAI adds an audio watermark to all generated clips to make them easy to identify.

