AI voice generators have gained popularity in recent years. The release of ChatGPT in late 2022 seems to have given them an extra boost, making them a major consideration for content creators who want easier ways to create content without having to record their voices every time. But the question remains: what are some of the best AI voice generators? How reliable and realistic are they?
In this guide, I’m going to share my own experience as a 9-year podcaster, a 4-year YouTuber and a digital consultant who has helped dozens of brands create content and make more money for their businesses. AI voice generators have been at the center of some conversations lately, so it’s time for us to unveil how good they really are.
We are going to compare 3 major AI voice generators: ElevenLabs, Podcastle Revoice and Descript Overdub.
Best AI Voice Generators (2023) – Top Picks
ElevenLabs claims, and is known, to have “the most realistic and versatile AI speech software, ever. Eleven brings the most compelling, rich and lifelike voices to creators and publishers seeking the ultimate tools for storytelling.”
ElevenLabs is the most talked about AI voice generator on the market today. You will notice an endless number of social shares and YouTube tutorials by well-known influencers.
With that said, it’s important to test out ElevenLabs for yourself. From my experience using over half a dozen voice generators in the past two years, you may be surprised to find which one works best for you, and it isn’t necessarily the most popular tool on the market.
The process of creating your digital voice using ElevenLabs
ElevenLabs’ speech synthesis is quite straightforward. You have two options for the voice, a premade one or your own, and the steps are:
- Use a premade voice. There are plenty of voices to choose from.
- Choose the voice settings. Either use the defaults, or change “Stability” and/or “Clarity + Similarity Enhancement”. This way you can create even more variations based on the premade voices.
- Alternatively, add your own voice by choosing “+Add voice” (see instructions below for how to do that).
- Once you are done, enter any text in the “Text” field and hit “Generate”. It takes a few seconds for ElevenLabs to process the chosen voice and text.
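If you would rather script these steps than click through the web app, here is a minimal sketch in Python. It assumes ElevenLabs’ public REST API accepts a POST to `/v1/text-to-speech/{voice_id}` with an `xi-api-key` header, and that the “Stability” and “Clarity + Similarity Enhancement” sliders correspond to the `stability` and `similarity_boost` fields. The function name, the default values and the placeholder key and voice ID are mine, for illustration only. The sketch only builds the request and does not send anything.

```python
import json
import urllib.request

# Assumption: base URL of the ElevenLabs REST API.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(api_key, voice_id, text, stability=0.5, similarity_boost=0.75):
    """Build (but do not send) a text-to-speech request for one voice.

    The helper and its default values are hypothetical, chosen for illustration.
    """
    # Both sliders are assumed to accept values between 0 and 1.
    for name, value in (("stability", stability), ("similarity_boost", similarity_boost)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    payload = {
        "text": text,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder key and voice ID; replace with your own before sending.
request = build_tts_request("YOUR_API_KEY", "YOUR_VOICE_ID", "Hello from my digital voice.")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` should return the generated audio, but check the current ElevenLabs API documentation before relying on any of the field names above.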
To download the generated audio, simply go to the “History” tab, where you will find a full list of previously generated audio files along with the voice you used, the dates and the text. You can also play a file to preview what you are about to download.
Such downloaded files can be used anywhere and for any purpose, as long as you have the right to use the voice – in most cases, this should be your voice.
To no one’s surprise, there has been controversy around ElevenLabs and AI voice generators in general. “ElevenLabs was criticized after users were able to abuse its software to generate controversial statements, in the vocal style of celebrities, public officials, and other famous individuals.”
But this should not be an issue if you are the original content creator who’s using ElevenLabs to create content for your own platforms.
Their basic plans are:
- Free: $0 forever, with 10,000 characters per month and up to 3 custom voices
- $5/month (the first month is 80% off, which makes it only $1): 30,000 characters per month and up to 10 custom voices
ElevenLabs also has more premium subscription options for independent publishers and growing businesses. Full pricing details are available on the ElevenLabs website.
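To put those character limits in perspective, here is a rough back-of-the-envelope sketch in Python. The two conversion figures are my own assumptions and not numbers from ElevenLabs: about 6 characters per English word (including the space) and a narration pace of about 150 words per minute. Your scripts and speaking pace may differ.

```python
# Assumptions (not from ElevenLabs): average characters per word, including
# the trailing space, and a typical narration pace in words per minute.
AVG_CHARS_PER_WORD = 6
WORDS_PER_MINUTE = 150

# Monthly character allowances of the two basic plans listed above.
PLANS = {"Free": 10_000, "$5/month": 30_000}


def estimate_minutes(characters):
    """Convert a character allowance into approximate minutes of narration."""
    words = characters / AVG_CHARS_PER_WORD
    return words / WORDS_PER_MINUTE


for name, characters in PLANS.items():
    print(f"{name}: about {estimate_minutes(characters):.0f} minutes of audio per month")
```

Under these assumptions, the free plan covers roughly 11 minutes of audio per month and the $5 plan roughly 33 minutes, which is worth keeping in mind if you plan to produce long-form content.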
My take on ElevenLabs
ElevenLabs’ AI model is groundbreaking and different. It is built to grasp the logic and emotions behind words. In other words, rather than generating sentences one by one, or word by word, based on what the AI knows about a person, ElevenLabs is “mindful of how each utterance ties to preceding and succeeding text”. According to ElevenLabs, this “zoomed-out perspective allows it to intonate longer fragments convincingly and with purpose.”
ElevenLabs’ voice generator is able to build on your voice and string words and phrases together better. While it’s certainly the most advanced technology in AI voices, my AI voice generated by ElevenLabs doesn’t sound like me. It isn’t something I can confidently use to create content in my place.
With that said, I’ve heard incredible recordings done entirely with ElevenLabs that nearly fooled me, such as this episode of Seth Godin’s Akimbo podcast. I’m very familiar with Seth’s voice, yet ElevenLabs did a phenomenal job.
This leads me to believe that ElevenLabs isn’t going to work out the same way for everyone. I also noticed its limitations when it comes to accents. Whether you have a regional English accent, or an accent because English is your second language, ElevenLabs doesn’t pick up the differences well. As for me, ElevenLabs makes me sound very American, similar to an American broadcaster’s voice, without much of my own emotion and pitch.
Podcastle Revoice applies generative AI technology to audio creation, allowing you to create a digital copy of your own voice so you can create content anywhere.
The process of creating your digital voice using Podcastle Revoice
To create “My Digital Voices”, you need to have the Pro subscription from Podcastle.
- Click on “My Digital Voices” on the left-hand side
- Then click on “+ Create voice” in the upper right-hand corner of the app
- Follow 70 prompts to record short sentences. This took me between 12 and 15 minutes
The prompt-based submission is one of a kind compared to ElevenLabs and Descript Overdub. I believe the intent is to capture a wide variety of vocabulary, as opposed to relying on audio samples submitted by you (the creator) that may have a limited and redundant vocabulary.
Once you are done, your new digital voice isn’t available immediately. You will have to wait for 24 hours for your voice profile to be ready to create content.
To begin using your digital voice, you need to create a new project and choose “Convert Text to Speech”.
In the text editor, you can enter any amount of text. Then click on the dropdown at the top of your paragraph and choose your voice. You will see a label next to it called “my voice”. The process takes 10 seconds or longer, depending on the length of your text.
Podcastle isn’t only an AI voice platform. It was originally designed to be an all-in-one podcast recording and editing platform. With each subscription, you will have access to many features in addition to Revoice (AI voice cloning).
As mentioned earlier, the only subscription tier that comes with Revoice (AI voice cloning) is the Pro version at $23.99 if paid annually, or $29.99 monthly.
My take on Podcastle Revoice
I wrote about Revoice in 2022 during their beta launch. It was an intriguing feature, and I generally liked the process. Learn more: Podcastle Revoice: Clone Your Voice With AI (2023)
According to Podcastle, their algorithm for Revoice has been improved. If you recorded your voice after February 2023, it’s very likely that the quality will be better than what’s seen in my example. I have recently re-submitted my digital voice and I’m eager to see how much it has improved.
The voice generated by Revoice sounds quite like mine. However, it is a lot more robotic than the ElevenLabs version (even though the ElevenLabs voice doesn’t sound quite like me). If I were to create long-form content, I would have to lean on ElevenLabs so the speech flows more naturally.
Podcastle remains a top choice for podcast recording and editing.
Descript’s Overdub lets you create a text-to-speech model of your voice or select one from their ultra-realistic stock voices.
Overdub was marketed for some time in 2022. Since Descript has been one of the most popular choices among podcasters and content creators, many of us jumped in early to test out the feature.
The process of creating an Overdub voice using Descript
You will need:
- At least 10 minutes of recorded speech; Descript recommends 30 to 180 minutes for high-quality results.
- To submit the training session for review; the process should take 2-24 hours.
- A Voice ID, which is the consent statement of the person whose voice is being submitted for training. It is recorded when you submit the voice for training, or it can be added to the beginning of your training session.
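If your training audio is spread across several recordings, a quick tally tells you where you stand against those thresholds. Here is a minimal sketch in Python; the 10-minute minimum and the 30 to 180 minute recommendation come from the list above, while the function name and the verdict wording are my own.

```python
# Thresholds taken from the requirements listed above, in minutes.
MIN_MINUTES = 10
RECOMMENDED_MIN = 30
RECOMMENDED_MAX = 180


def assess_training_audio(clip_seconds):
    """Sum clip lengths (in seconds) and compare the total to the thresholds.

    Hypothetical helper for illustration; returns (total_minutes, verdict).
    """
    total_minutes = sum(clip_seconds) / 60
    if total_minutes < MIN_MINUTES:
        verdict = "too short"
    elif total_minutes < RECOMMENDED_MIN:
        verdict = "meets the minimum, below the recommended range"
    elif total_minutes <= RECOMMENDED_MAX:
        verdict = "within the recommended range"
    else:
        verdict = "above the recommended range"
    return total_minutes, verdict


# Example: three recordings of 12, 15 and 8 minutes.
minutes, verdict = assess_training_audio([12 * 60, 15 * 60, 8 * 60])
print(f"{minutes:.0f} minutes: {verdict}")
```

In this example, the three recordings add up to 35 minutes, which lands within the recommended range.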
Descript Overdub is free to start. However, the Free and Creator subscriptions offer only a 1,000-word vocabulary, which I find insufficient to produce real content. Your better option is therefore to upgrade to their Pro subscription, which is $30/month.
My take on Descript Overdub
Honestly, I’m impressed by Overdub. It’s easy to use and set up. The sound and tone of your voice might come out quite natural as well. Overdub is definitely worth exploring next to ElevenLabs as another AI voice generator.
Price-wise, Overdub is slightly more expensive than ElevenLabs. However, if you are a podcaster, Descript does come with a suite of features that can be very beneficial, such as creating video clips and audiograms: How to Use Descript for Podcast Editing (2022)
Is there a clear winner? Maybe.
ElevenLabs seems to be the go-to AI voice generator for most creators, but I think it’s worth learning at least another option (or two) to decide for yourself. Descript Overdub is a clear competitor in my opinion, and I’d definitely keep an eye on what Podcastle Revoice comes up with next as well.
Many factors determine how good an AI-generated voice is, beyond basics such as timing and intonation. It’s worth exploring these options, and remember to have fun learning AI tools! As a creator, I understand that it can be stressful to see so many AI tools on the market today. I hope to explore some more and share my experience as a creator with many more of you.
If you find this guide helpful, please share your comments with me below, and pass this knowledge on to your friends and colleagues.