LLMs are software algorithms trained on huge text datasets, enabling them to understand and respond to human language in a very lifelike way.
The best-known example is ChatGPT, a chatbot interface powered by OpenAI's GPT series of LLMs (most recently GPT-4) that's taken the world by storm. ChatGPT is able to converse like a human and generate everything from blog posts, letters, and emails to fiction, poetry, and even computer code.
Impressive as they are, until now LLMs have been limited in one significant way. They tend to be able to complete only one task, such as answering a question or generating a piece of text, before requiring further human input (instructions known as "prompts").
This means that they aren’t always great at more complicated tasks that need multi-step instructions or are dependent on external variables.
Enter Auto-GPT – a technology that attempts to overcome this hurdle with a simple solution. Some believe it may even be the next step towards the “holy grail” of AI – the creation of general, or strong, AI.
Let’s take a look at what that means first:
Strong AI vs. Weak AI
Current AI applications are typically designed to carry out one task, becoming increasingly good at it as they are fed more data. Some examples include analyzing images, translating languages, or navigating self-driving vehicles. Because of this, they are sometimes referred to as "specialized AI," "narrow AI," or "weak AI."
A generalized AI is one that is theoretically capable of carrying out many different types of tasks, even ones it wasn’t originally created to carry out, much the same way as a naturally intelligent entity (such as a human) can. Sometimes it’s called “strong AI” or "artificial general intelligence" (AGI).
AGI is perhaps what we traditionally pictured AI would look like, back in the days before machine learning and deep learning made weak/narrow AI an everyday reality around the start of the previous decade. Think of the science fiction AI embodied by androids like Data in Star Trek, who can do just about anything a human being can do.
So what is Auto-GPT?
The simplest way of looking at it is that Auto-GPT is able to carry out more complex, multi-step procedures than existing LLM-powered applications by creating its own prompts and feeding them back to itself, creating a loop.
Here’s one way of thinking about it: Getting the best results out of an application like ChatGPT requires putting careful thought into the way you phrase the questions you ask it. So why not let the application construct the question itself? And while it’s at it, also get it to ask what the next step should be – and how it should go about that … and so on, creating a loop until the task is accomplished.
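The self-prompting loop described above can be sketched in a few lines of Python. This is purely illustrative: the `ask_model` function is a hypothetical stand-in for a real LLM API call, returning scripted answers so the control flow is easy to follow. The key idea is that the model's own output becomes the next prompt, until the model signals it is finished.

```python
# Toy sketch of a self-prompting loop. ask_model() is a stub standing in
# for a real LLM call (e.g. to the GPT-4 API); it returns scripted replies
# so the loop structure is visible without any network access.

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call."""
    script = {
        "GOAL: book a trip. What is the next step?": "Search for flights",
        "Search for flights. What is the next step?": "Compare prices",
        "Compare prices. What is the next step?": "DONE",
    }
    return script.get(prompt, "DONE")

def run_agent(goal: str) -> list[str]:
    steps = []
    prompt = f"GOAL: {goal}. What is the next step?"
    while True:
        reply = ask_model(prompt)   # model proposes the next action
        if reply == "DONE":         # model decides the task is finished
            break
        steps.append(reply)
        # Feed the model's own output back in as the next prompt,
        # closing the loop with no further human input.
        prompt = f"{reply}. What is the next step?"
    return steps

print(run_agent("book a trip"))  # ['Search for flights', 'Compare prices']
```

A real agent would also execute each proposed step (run a search, call a tool) and include the result in the next prompt, but the loop shape is the same.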
It works by breaking a larger task into smaller sub-tasks and then spinning off independent Auto-GPT instances in order to work on them. The original instance acts as a kind of "project manager," coordinating all of the work carried out and compiling it into a finished result.
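The "project manager" pattern just described can also be sketched briefly. Again, the `llm` function below is a hypothetical stub for a real model call: the manager asks it to split a goal into sub-tasks, dispatches each sub-task as its own call (standing in for the independent worker instances), and compiles the results.

```python
# Minimal sketch of the decompose-and-coordinate pattern. llm() is a stub
# for a GPT-4 API call, returning canned answers so the example runs offline.

def llm(prompt: str) -> str:
    """Hypothetical stand-in for a call to a model like GPT-4."""
    if prompt.startswith("Break down:"):
        return "research the topic; draft an outline; write the sections"
    return f"result of '{prompt}'"

def manager(goal: str) -> str:
    # The "project manager" instance asks the model to split the goal
    # into smaller sub-tasks...
    subtasks = [t.strip() for t in llm(f"Break down: {goal}").split(";")]
    # ...hands each sub-task to its own worker call (in Auto-GPT these
    # would be independent instances)...
    results = {task: llm(task) for task in subtasks}
    # ...then compiles the pieces into one finished result.
    return "\n".join(f"{task} -> {out}" for task, out in results.items())

print(manager("write a market report"))
```

In the real system the workers can themselves spawn further sub-tasks, which is what makes the approach recursive rather than a fixed two-level pipeline.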
As well as using GPT-4 to construct sentences and prose based on the text it has studied, Auto-GPT is capable of browsing the internet and including information it finds there in its calculations and output. In this respect, it's more similar to the new GPT-4-enabled version of Microsoft's Bing search engine. It also has a better memory than ChatGPT, so it can construct and remember longer chains of commands.
Auto-GPT is an open-source application that uses GPT-4 and was created by one person, Toran Bruce Richards. Richards said that he was inspired to develop it because traditional AI models, "while powerful, often struggle to adapt to tasks that require long-term planning, or are unable to autonomously refine their approaches based on real-time feedback."
It is one of a class of applications that are being called recursive AI agents because they have the ability to autonomously use the results they generate to create new prompts, chaining these operations together to complete complex tasks.
Another such agent is BabyAGI, which was created by a partner at a venture capital firm to help him with day-to-day tasks that were just too complex for something like ChatGPT, such as researching new technologies and companies.
What are some applications of Auto-GPT and AI agents?
While apps like ChatGPT have become famous for their ability to generate code, they tend to be limited to relatively short and simple programming and software design. Auto-GPT, and potentially other AI agents that work in a similar fashion, can be used to develop software applications from start to finish.
Auto-GPT is also able to help businesses autonomously increase their net worth by examining their processes and generating intelligent recommendations and insights into how they could be improved.
Unlike ChatGPT, it can access the internet, meaning you can ask it to conduct market research or carry out other similar tasks – for example, "find me the best set of golf clubs for under $500."
One extremely disruptive task it has been set is to “destroy humanity” – and the first sub-task it assigned itself to get this done was to begin researching the most powerful atomic weapons of all time. As its output is still limited to creating text, its creator assures us it won’t actually get very far with this task – hopefully.
Auto-GPT can also apparently be used to improve itself – its creator says it can create, evaluate, review and test updates to its own code that can potentially make it more capable and efficient.
It can even be used to create better LLMs that could form the basis of future AI agents, by accelerating the model making process.
What could this mean for the future of AI?
Ever since generative AI applications started to emerge, it’s been clear that we’re only at the start of a very long journey, in terms of how AI will evolve and impact our lives and society.
Are Auto-GPT and other agents that follow the same principles the next step on that journey? It certainly seems likely. At the very least, we can expect AI tools that carry out far more complex tasks than the relatively simple things ChatGPT can do to become commonplace.
Before long, we will start to see more creative, sophisticated, diverse and useful AI output than the simple text and pictures that we’ve got used to. These will no doubt eventually have an even bigger impact on the way we work, play and communicate.
Other potential positive impacts include reduced cost and environmental impact of creating LLMs (and other machine learning-related activity) as autonomous, recursive AI agents find ways to make the process more efficient.
However, we also have to consider that by itself it doesn't really solve any of the problems associated with generative AI. These include the variable (to put it nicely) accuracy of the output it creates, the potential for abuse of intellectual property rights, and the possibility of it being used to spread biased or harmful content. In fact, by generating and running many more AI processes in order to achieve bigger tasks, it could potentially magnify these issues.
The potential problems don't stop there – eminent AI expert and philosopher Nick Bostrom has recently said he believes the newest generation of AI chatbots (such as those built on GPT-4) are even beginning to show signs of sentience. That could create a whole new moral and ethical quandary if, as a society, we plan to start creating and operationalizing them on a large scale.