Artificial Intelligence (AI) is increasingly a part of the world around us, and it’s rapidly changing our lives. It offers a hugely exciting opportunity, and sometimes, it can be more than a little scary. And without a doubt, the big development in AI making waves right now is generative AI.
Just as it sounds, it’s AI that can create anything from words and images to videos, music, computer applications, and even entire virtual worlds.
What makes generative AI different and special is that it puts the power of machine intelligence in the hands of just about anyone.
We’re used to using AI-powered applications and tools in our everyday lives. Google uses it to find the information we need; Amazon uses it to suggest things we should buy; Netflix uses it to recommend movies; Spotify uses it for music – it’s all powered by AI.
But the new generation of generative AI tools goes even further, giving us the power to build and create in amazing ways. With a little practice, we can even use them to build our own AI-powered apps and tools. Because it breaks down the technical barriers, it can truly be seen as the beginning of the long-awaited democratization of AI.
So, in this article, I’ll give an overview of generative AI in simple terms to show why it’s so powerful and what you can do with it. I’ll also take a non-technical look at how it works, but most importantly, I’ll explain why it’s going to change the world and what everyone should be doing to prepare for it.
What Is Generative AI?
The term AI, as it's used today, refers to computer algorithms that can effectively simulate human cognitive processes - learning, decision-making, problem-solving, and even creativity.
It's this last and perhaps most human quality where generative AI comes into the picture. Like all modern AI, generative AI models are trained on data. They then use that data to create more data, following the rules and patterns they've learned.
For example, if you train it on pictures of cats, it will learn that a cat has four legs, two ears, and a tail. Then, you can tell it to generate its own picture of a cat, and it will come up with as many variations as you need, all following those basic rules.
One distinction that's worth understanding is the difference between generative AI and discriminative (or predictive) AI. Discriminative AI focuses mainly on classification, learning the difference between "things" - cats and dogs, for example. This is what powers recommendation engines like those used by Netflix or Amazon, which distinguish between things you might want to watch or buy and things you're unlikely to be interested in. It's also used in navigation apps to distinguish between good routes from A to B and ones you should probably avoid.
Generative AI, instead, focuses on understanding patterns and structure in data and using that to create new data that looks like it.
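To make the distinction concrete, here is a minimal sketch in Python. It is a toy, not how real systems are built: the weights are made-up numbers, the "discriminative" model is just a midpoint between two averages, and the "generative" model is just an average and a spread. All function names are hypothetical, chosen for this illustration.

```python
import random
import statistics

# Hypothetical toy data: body weights in kg (made-up numbers for illustration).
cat_weights = [3.8, 4.2, 4.5, 5.0, 4.1, 3.6]
dog_weights = [18.0, 25.5, 30.2, 22.4, 27.9, 20.3]


def train_discriminative(cats, dogs):
    """Learn a boundary between the classes: the midpoint of the two class means."""
    return (statistics.mean(cats) + statistics.mean(dogs)) / 2


def classify(weight, boundary):
    """Discriminative use: decide which label an existing example belongs to."""
    return "cat" if weight < boundary else "dog"


def train_generative(samples):
    """Learn the shape of one class: here just its mean and standard deviation."""
    return statistics.mean(samples), statistics.stdev(samples)


def generate(model, n, rng):
    """Generative use: draw new examples that follow the learned pattern."""
    mean, stdev = model
    return [rng.gauss(mean, stdev) for _ in range(n)]


rng = random.Random(0)  # fixed seed so runs are repeatable
boundary = train_discriminative(cat_weights, dog_weights)
cat_model = train_generative(cat_weights)

print(classify(4.0, boundary))      # labels an example it is given
print(generate(cat_model, 3, rng))  # invents new cat-like weights
```

The first model can only answer "which one is this?", while the second can produce new data points that look like the ones it was trained on.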
So What Can Generative AI Do?
The first use cases for generative AI typically involved creating text and images, but as the technology has become more sophisticated, a world of possibilities has opened up. Here are some of them:
Images: Many generative AI tools – such as Midjourney or Stable Diffusion – can take a natural language (i.e., a human language) prompt and use it to generate a picture. Tell it you want an image of a two-headed dog wearing an Elvis costume flying a spaceship into a black hole and watch as it (or something close to it) appears before your eyes.
Text: ChatGPT is probably responsible for kicking off the intense hype surrounding generative AI at the moment, but there are other generative text tools like Google’s Bard and Meta’s Llama. They can be used to write anything from essays and articles to plays, poems and novels.
Coding: As well as ChatGPT, tools like Microsoft’s GitHub Copilot and Amazon’s CodeWhisperer make it easy for anyone to generate computer code with very little technical knowledge.
Audio: Generative AI tools can create human-like voices (voice synthesis), allowing computers to speak words that have never before been uttered by a human, as well as music and sound effects.
Video: While not yet as advanced as text or image generation, tools are beginning to emerge that allow us to create and edit video simply by describing what we want to see.
Data augmentation: Generative AI makes it easy to create entirely synthetic data sets that follow real-world rules, for use in training other AI models, without conferring privacy and data security obligations on those who store and use them.
Virtual environments: Think of virtual reality (VR) environments or video game worlds that can be explored and interacted with, or the rather hyped-up concept of the metaverse. Designing these is a highly complex task that can be greatly accelerated with the help of generative AI.
How Does It Work?
Like all of the AI we see today, generative AI grew out of a field of AI study and practice called machine learning (ML).
While traditional computer algorithms are coded by a human to tell a machine exactly how to do a particular job, ML algorithms get better at their jobs the more data they are fed.
Put a bunch of these algorithms together in a way that allows them to generate new data based on what they've learned, and you get a model - essentially an engine tuned to generate a particular type of data.
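The difference between a hand-coded rule and a learned one can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the task (converting Celsius to Fahrenheit), the simulated noisy readings, and the function names are all invented for the example, and the learning method is plain least-squares line fitting rather than anything used in generative AI.

```python
import random


def handcoded_c_to_f(celsius):
    """Traditional approach: a human writes the exact rule."""
    return celsius * 1.8 + 32


def fit_line(xs, ys):
    """ML-style approach: estimate slope and intercept from examples (least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def noisy_examples(n, rng):
    """Simulated thermometer readings: the true rule plus random measurement error."""
    xs = [rng.uniform(-20, 40) for _ in range(n)]
    ys = [handcoded_c_to_f(x) + rng.gauss(0, 2.0) for x in xs]
    return xs, ys


rng = random.Random(42)  # fixed seed so runs are repeatable
for n in (5, 50, 5000):
    slope, intercept = fit_line(*noisy_examples(n, rng))
    print(f"{n:>5} examples -> learned rule: F = {slope:.3f} * C + {intercept:.2f}")
```

Nobody tells the second approach that the answer is 1.8 and 32. It works the numbers out from examples, and its estimates tend to get closer to the true rule as it sees more of them.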
Some examples of models used in generative AI applications include:
Large Language Models (LLMs) - By ingesting large amounts of text, they learn the semantic relationships between words and use that data to generate more language. An example of an LLM is GPT-4, created by OpenAI, which powers the ChatGPT tool.
Generative Adversarial Networks (GANs) - These work by pitting two competing algorithms against each other, one tasked with generating data that resembles its training data and another tasked with trying to tell whether the output is real or generated. This type of generative model is typically used to create images, sounds, or even video.
Variational Autoencoders - This is a type of model that learns how data is constructed by encoding it in a simple way that captures its essential characteristics and then figuring out how to reconstruct it. It's often used to generate synthetic data.
Diffusion models - These work by adding random data (known as "noise") to the data they're learning about, then figuring out how to remove it while preserving the original data - thus learning what's important and what can be discarded. Diffusion models are most commonly used in image generation.
Transformer Models - This is something of an umbrella term for a group of models that includes LLMs but covers any model that works by learning the context and relationships between different elements in its training data.
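To give a feel for the idea behind the first item in that list, learning which words tend to follow which and then using that to generate more language, here is a deliberately tiny Python sketch. It is not an LLM or a transformer: it only remembers which word came directly after which in a made-up training text. The corpus and function names are assumptions for illustration.

```python
import random
from collections import defaultdict

# Tiny made-up training text; a real LLM trains on vastly more.
corpus = "the cat sat on the mat . the dog sat on the rug . the cat saw the dog ."


def train_bigram_model(text):
    """Record, for every word, the words that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate_text(model, start, max_words, rng):
    """Build new text one word at a time, sampling each next word from what was seen."""
    output = [start]
    for _ in range(max_words - 1):
        options = model.get(output[-1])
        if not options:  # this word never had a follower in training
            break
        output.append(rng.choice(options))
    return " ".join(output)


model = train_bigram_model(corpus)
print(generate_text(model, "the", 8, random.Random(1)))
```

The output is new text that never appeared word-for-word in the training data, yet every pair of neighbouring words in it was seen during training. Real language models look at far more context than one previous word, but the generate-the-next-piece loop is similar in spirit.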
Generative AI in Practice
There are already many incredible examples of generative AI being used to create amazing (and sometimes terrible) things.
Take Coca-Cola’s Masterpiece advert, for example – a collaborative creation between human artists and AI that brings many of history’s greatest works of art to life on the screen in a way that’s never been done before.
It’s also been used to create a new Beatles song by rebuilding partially recorded lyrics by John Lennon, combined with new material by Paul McCartney.
Generative design is a term for an emerging field where generative AI is used to create blueprints and production processes for new products. For example, General Motors used generative tools created by Autodesk to design a new seatbelt bracket that’s 40% lighter and 20% stronger than its existing components.
And it’s also being used to speed up drug discovery, with one UK company recently announcing that it’s created the world’s first AI-generated immunotherapy cancer treatment.
Generative AI is also the technology behind the recent phenomenon of deepfakes, which blur the lines between reality and fiction by making it appear as if real people have done or said fake things.
Deepfake Tom Cruise was one of the earliest and most famous examples. More insidiously, potential candidates on both sides of the upcoming 2024 US presidential election have starred in deepfakes aimed at discrediting them for political ends.
And while spreading propaganda is bad enough, there are also outright criminal uses - including attempts to extort money by staging hoax kidnappings using cloned voices and scamming money by posing as a company CEO.
The Ethical Questions Around Generative AI
While generative AI is capable of amazing things, it's clear that its existence is forcing us to confront some difficult issues and questions.
Perhaps one of the biggest is when we will get to the point where it's impossible to tell the difference between what's real and what's generated by AI.
Given the incredibly rapid pace of innovation in the field, it's likely to happen sooner rather than later.
This leads to the question of what (if anything) we should do about it. Some countries, including China, have already passed legislation making it illegal to deepfake people without their consent – should the rest of the world follow suit?
And then there's the question of how this will affect human jobs - will the livelihoods of creators be threatened if the companies that employ them can create as many images, sounds and videos as they need just by telling a computer to do it?
Another issue that needs to be addressed is copyright. If an AI is used to create a work of art, who owns it? The person who used the AI to create the art? The creator of the AI itself? Or all the (probably) thousands of artists whose work was used (in practice, often without permission) to train the AI?
All of these questions need to be answered - and, given the accelerating pace at which this technology is being developed, soon. How we answer them may well play an important role in determining the future of generative AI in society and in our lives.