Written by

Bernard Marr

Bernard Marr is a world-renowned futurist, influencer and thought leader in the fields of business and technology, with a passion for using technology for the good of humanity. He is a best-selling author of 20 books, writes a regular column for Forbes and advises and coaches many of the world’s best-known organisations. He has over 2 million social media followers, 1 million newsletter subscribers and was ranked by LinkedIn as one of the top 5 business influencers in the world and the No 1 influencer in the UK.

Bernard’s latest book is ‘Business Trends in Practice: The 25+ Trends That Are Redefining Organisations’

View Latest Book

Artificial Intelligence Can Now Copy Your Voice: What Does That Mean For Humans?

2 July 2021

It takes just 3.7 seconds of audio to clone a voice. This impressive—and a bit alarming—feat was announced by Chinese tech giant Baidu. A year ago, the company’s voice cloning tool called Deep Voice required 30 minutes of audio to do the same. This illustrates just how fast the technology to create artificial voices is accelerating. In just a short time, the capabilities of AI voice generation have expanded and become more realistic which makes it easier for the technology to be misused.

Capabilities of AI Voice Generation

Like all artificial intelligence algorithms, the more data voice cloning tools such as Deep Voice receive to train with the more realistic the results. When you listen to several cloning examples, it’s easier to appreciate the breadth of what the technology can do including being able to switch the gender of the voice as well as alter accents and styles of speech.

Google unveiled Tacotron 2, a text-to-speech system that leverages the company’s deep neural network and speech generation method WaveNet. WaveNet analyses a visual representation of audio called a spectrogram to generate audio. It is used to generate the voice for Google Assistant. This iteration of the technology is so good; it’s nearly impossible to tell what’s AI generated and what voice is human generated. The algorithm has learned how to pronounce challenging words and names that would have been a tell-tale sign of a machine as well as how to better enunciate words.

These advances in Google’s voice generation technology have allowed for Google Assistant to offer celebrity cameos. John Legend’s voice is now an option on any device in the United States with Google Assistant such as Google Home, Google Home Hub, and smartphones. The crooner’s voice will only respond to certain questions such as “What’s the weather” and “How far away is the moon” and is available to sing happy birthday on command. Google anticipates that we’ll soon have more celebrity cameos to choose from.

Another example of just how precise the technology has become, a Jordan Peterson (the author of 12 Rules for Life) AI model sounds just like him rapping Eminem’s “Lose Yourself” song. The creator of the AI algorithm used just six hours of Peterson talking (taken from readily available recordings of him online) to train the machine learning algorithm to create the audio. It takes short audio clips and learns how to synthesise speech in the style of the speaker. Take a listen, and you’ll see just how successful it was. 

This advanced technology opens the door for companies such as Lyrebird to provide new services and products. Lyrebird uses artificial intelligence to create voices for chatbots, audiobooks, videos games, text readers and more. They acknowledge on their website that “with great innovation comes great responsibility” underscoring the importance of pioneers of this technology to take great care to avoid misuse of the technology.

How This Technology Could Be Misused

Similar to other new technologies, artificial voice can have many benefits but can be also be used to mislead individuals as well. As the AI algorithms get better and it becomes difficult to discern what’s real and what’s artificial, there will be more opportunities to use it to fabricate the truth.

According to research, our brains don’t register significant differences between real and artificial voices. In fact, it’s harder for our brains to distinguish fake voices than to detect fakes images.

Now that these AI systems only require a short amount of audio to train in order to create a viable artificial voice that mimics the speaking style and tone of an individual, the opportunity for abuse increases. So far, researchers weren’t able to identify a neural distinction for how a brain can distinguish between real and fake. Consider how artificial voices might be used in an interview, news segment or press conference to make listeners believe they are listening to an authority figure in the government or a CEO of a company.

Raising awareness that this technology exists and how sophisticated it is will be the first step to safeguard listeners from falling for artificial voices when they are used to mislead us. The real fear is that people can be fooled to act on something that is fake because it sounds like it’s coming from somebody real. Some people are attempting to find a technical solution to safeguard us. However, a technical solution will not be 100% foolproof. Our ability to critically assess a situation, evaluate the source of information and verify its validity will become increasingly important.

Data Strategy Book | Bernard Marr

Related Articles

How Do We Use Artificial Intelligence Ethically | Bernard Marr

How Do We Use Artificial Intelligence Ethically?

I’m hugely passionate about artificial intelligence (AI), and I'm proud to say that I help companies use AI to do amazing things in the world [...]

How Artificial Intelligence Can Help Small Businesses | Bernard Marr

How Artificial Intelligence Can Help Small Businesses

Small and medium-sized businesses all over the world are benefiting from artificial intelligence and machine learning – and integrating AI into core business functions and processes is getting more accessible and more affordable every day. [...]

What Really Is The Tesla Bot And How Much Will It Cost | Bernard Marr

What Really Is The Tesla Bot And How Much Will It Cost?

Elon Musk has just announced that Tesla will begin developing a humanoid robot called the Tesla Bot that is designed to perform “unsafe, repetitive, or boring” tasks. [...]

Should I Choose Machine Learning or Big Data | Bernard Marr

Should I Choose Machine Learning or Big Data?

Big Data and Machine Learning are two exciting applications of technology that are often mentioned together in the space of the same breath [...]

What Is The Next Level Of AI Technology | Bernard Marr

What Is The Next Level Of AI Technology?

Artificial Intelligence (AI) has permeated all aspects of our lives – from the way we communicate to how we work, shop, play, and do business. [...]

The 7 Biggest Ethical Challenges of Artificial Intelligence | Bernard Marr

The 7 Biggest Ethical Challenges of Artificial Intelligence

Today, artificial intelligence is essential across a wide range of industries, including healthcare, retail, manufacturing, and even government. [...]

Stay up-to-date

  • Get updates straight to your inbox
  • Join my 1 million newsletter subscribers
  • Never miss any new content

Social Media



View Podcasts