You Can Clone Your Voice

We’ve talked about deep fakes before… videos that are altered to resemble another person or to change spoken words. The technology for you to clone your voice and use it in other places is here. Like… today. For free. And it’s totally #helpfulbutcreepy

How to Clone Your Voice

A few years ago a company called Lyrebird created a demo site that let you record a few spoken words, then type text for a clone of your voice to read aloud. Back in the day, it was very robotic and unnatural. I thought my voice clone didn’t sound like me at all.

But “a few years ago” in technology speak translates to several lifetimes ago. Today’s voice-cloning software is incredibly lifelike. It’s still a little robotic and lacking human-like inflections and energy. But tomorrow, it may be perfect.

Clone Your Voice for Free

One of my favorite video editing tools of ALL TIME is Descript. You upload audio, and Descript transcribes it for you. Then you just need to edit the words to edit the video. It cuts my editing time down by at least 75%.

Descript bought Lyrebird a couple of years ago. Since I wasn’t very impressed with the first generations of Lyrebird, I didn’t rush to set up a clone of my voice. But since I am working on the new book and wanted to include it, I finally took the steps to submit my audio samples for training. (The process is also way easier than it was at the beginning.)

I am blown away by the results. You have to have very good quality audio and consistent sound levels (like using the same microphone) to get the baseline clone. Then you can actually keep training it by creating new styles with short snippets.

Descript has a free version that lets you create a limited-vocabulary clone (and even use their video editor for a few minutes a month). That means that everyone can create a clone with just a few steps.

Take the Voice Clone Challenge

I recorded a one-minute video this morning, then I used that same transcript with my voice clone. Next I mixed lines from each together into one more video. I challenge you to find the clone.

The answers are in the second half of this video.

Change a Word in Your Video

Aside from using your cloned voice to read a whole passage, you can change just a few words in a video you’ve already created. Here’s a sample of how I changed a very short phrase.

The Ethics of Voice Cloning

Oh boy. This is a tough one. Right now Descript makes you record a pledge that this is your own voice when you submit the voice recordings for creation of the clone. So they must match the voice in your recorded pledge against the audio they are converting. They have an ethics statement on their site, as does

Generative media — the field of research that relates to “deep fakes” and other forms of synthesized audio and video — is advancing rapidly. In many use cases, the results are already indistinguishable from real media. This technology has exciting applications, such as Descript’s Overdub feature, but it also holds the potential for misuse.

While Descript is among the first products available with generative media features, it won’t be the last. As such, we are committed to modeling a responsible implementation of these technologies, unlocking the benefits of generative media while safeguarding against malicious use.

We believe you should own and control the use of your digital voice. Descript uses a process for training speech models that depends on verbal consent verification, ensuring that our customers can only create text to speech models that have been authorized by the voice’s owner. Once created, the voice owner has control over when and how it is used.

As the applications of this technology continue to evolve, we will remain in conversation with leading machine learning researchers, ethics professors, and the broader public about how to best develop and implement this technology.

But as Descript says, the technology is getting more advanced and widespread every day. And I bet we are already seeing abuses of the functionality without knowing it.


  • I got the third one and then it was easy because the cloned voice sounded slightly muffled. Pretty cool though.

