AI voices are everywhere, and they’re getting more realistic! Is this the end of the voiceover industry?
No!
While an AI voice may work out cheaper in monetary terms, it often ends up costing more in time and even in lost business. Yes, the ‘bottom end’ of the VO market will shrink considerably – phone greetings, small projects with very low budgets – the type of work that those without much training will happily do for a small fee. AI is going to be quicker and better than untrained voice artists.
But… Humans relate to other humans. In most cases, you need the listener to trust the speaker and engage with what is being said. AI is just generating numerical patterns to sound like speech – it has no intelligence or empathy of its own, and no human brain. It is not thinking about what is being said, and therefore there is no human connection behind the words. The difference might seem subtle – but if you try to listen to an AI voice for more than a few minutes, you start to tune out because the patterns are just not human.
Even if you can’t hear it at first – your brain knows the difference. Listeners won’t connect with the message in the same way.
More benefits of a human voice:
SAVING TIME:
I know the difference between the town of Reading and the word ‘reading’. I know not to emphasise the SHAM in ‘Amersham’ and that there’s a BARK in ‘Berkshire’. I recognise well-known acronyms. An AI voice will not correct typos that would be glaringly obvious to a real person. Whenever an AI voice gets something a bit off, you have to go in and correct it or try to trick it into saying what you want. In a long piece of narration, that’s going to take ages.
UNDERSTANDING CONTEXT:
A real person can understand the context behind a script. This is vital when speaking sensitively about a difficult topic, knowing how to say certain words, and knowing where to add a smile, a wink or a nod of the head. These subtle nuances come through in the voice, and they are what make it sound like a real person with human emotions. Words like ‘thank you’ and ‘sorry’ (common in voicemail greetings and eLearning quizzes) can sound very insincere in an AI voice.
SPEAKING v READING:
An AI voice will read a script exactly as it is written. Great! Or is it? Did you know that punctuation is only intended for written grammar? Pauses, dashes, commas, paragraph breaks – the AI voice will adhere to these perfectly. But this isn’t how people speak to each other in real life. The words sound like they are being read from a page, rather than being spoken with understanding.
PACING:
The human brain naturally adjusts the pace of words, phrases and sentences. Some words are less important while others are more relevant to the overall meaning. We can ‘lean in’ to key words and ‘throw away’ less important words. Again, this may seem quite subtle but it brings meaning to the script. If words are paced via an algorithm then listeners tune out very quickly. AI scratch tracks are nearly *always* too fast. They often sound OK, but it is really hard to voice in time with an AI scratch track without losing meaning and sounding like you’re gabbling, because AI doesn’t lean into language the way a human does.
For important projects, a human voice is the only choice!
Photo by Nikita Kozlov on Unsplash

