I’m talking, of course, about the original Google Search. Today, the word “Google” has become a synonym for "search," and its creator – now known as Alphabet – has gone on to become one of the largest and most powerful corporations on the planet.
Sure, there were search engines before it. But Google was the first to popularize knowledge-based search. And although competitors have emerged over the years, mostly they’ve just been variations on a theme. Today, when we want to find something, no one says, "I'll Bing it" or "I'll Yahoo it." However, could all that be about to change?
Unless you’ve spent the last few months on Pluto, you’ve probably noticed that ChatGPT is the “hot topic” app of the moment. Just in case, here’s a quick primer on what it is and how it works.
As I sat down to write this article on whether or not ChatGPT will finally end Google’s 25-year dominance of the search market, news broke that Microsoft has super-charged its Bing search engine by building ChatGPT functionality directly into its interface.
This is not surprising, as Microsoft was one of the first organizations to back ChatGPT’s creators – the research organization OpenAI – when it invested $1 billion back in 2019. Earlier this year, it announced that it was following up with a further multibillion-dollar investment, reported to be around $10 billion, which reportedly makes it the largest single shareholder.
This clearly takes ChatGPT – and Bing – another step closer to becoming the first serious competitor to Google Search’s throne in a long time. So, should Google be worried? And how is the king of search planning to respond?
The Data Behind ChatGPT and Google Search
First, let's take a look at how Google Search and ChatGPT take different approaches to solving the same problem – providing us with the information that we need.
Both Google Search and ChatGPT respond to queries by drawing on a vast body of information gathered from the internet. In ChatGPT’s case, this is the GPT-3 training dataset. Strictly speaking, ChatGPT doesn’t look anything up when we ask it a question; it generates its answers from patterns it learned from this data during training. The exact contents of the dataset haven't been made public, but the model itself is said to have 175 billion parameters (the internal values it learns during training, rather than data points), and the data is reported to include a crawl of the internet made in 2021, the entire contents of Wikipedia, large chunks of Reddit, and two databases of books. Altogether, it’s reported that this training dataset is around 45 terabytes in size.
This is certainly a large training dataset compared to other language models. However, it pales into insignificance next to the dataset that Google’s search engine uses to respond to our queries. Google has been building its index since the earliest days of the World Wide Web by sending out "crawlers" that travel to every corner of the internet that they can reach. According to Google, this index currently sits at roughly 100,000 terabytes (100 petabytes).
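To put those two figures side by side, here is a quick back-of-the-envelope calculation. It uses only the sizes quoted above, both of which are rough, reported estimates; the variable and function names are purely illustrative.

```python
# Sizes as quoted in this article; both are rough, reported estimates.
GPT3_TRAINING_DATA_TB = 45      # reported size of the GPT-3 training dataset
GOOGLE_INDEX_TB = 100_000       # Google's stated index size (100 petabytes)


def size_ratio(larger_tb: float, smaller_tb: float) -> float:
    """Return how many times larger the first dataset is than the second."""
    return larger_tb / smaller_tb


ratio = size_ratio(GOOGLE_INDEX_TB, GPT3_TRAINING_DATA_TB)
print(f"Google's index is roughly {ratio:,.0f} times larger")
# prints: Google's index is roughly 2,222 times larger
```

In other words, on these numbers Google's index is more than two thousand times the size of the data ChatGPT was trained on.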
Does Bigger Mean Better?
However, as the saying goes, size isn’t everything. The major innovation with ChatGPT – and the factor that makes it, in my opinion, the first serious threat to Google’s dominance of the information economy – is the way it processes that data to make it useful for us.
Google basically returns lists of webpages that its algorithms determine are likely to contain the information that we want based on our search queries. ChatGPT, on the other hand, uses natural language generation (NLG) algorithms to structure the results in a way that gives us direct answers. This is a huge boost to user experience. Anyone using it for research no longer has to comb through pages of search results. The experience is very similar to asking a very knowledgeable friend for their answers and opinions.
For the sake of fairness, though, we do have to mention that Google Search has also included some AI-powered features – such as the Knowledge Panel – that present information extracted from certain web pages on the results page. This information is either presented next to or integrated into the web page search results it returns. However, it still isn’t able to use the conversational style of ChatGPT.
It’s this conversational style used by ChatGPT which enhances the user experience. If we don’t like the answer it gives us, or we think it’s taken the wrong approach to helping us to solve whatever problems we’ve brought to it, we can ask it to try again. It will remember the previous details of our conversation – until the session ends, at least – and use that information to fine-tune its responses to us until it’s able to give us something we’re happy with.
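For readers who like to see the idea in code, here is a minimal sketch of how a chat interface might "remember" a conversation within a session. It is not how OpenAI actually implements ChatGPT. It simply assumes that earlier turns are kept in a list and passed back in as context with every new message, and the `ChatSession` class and its methods are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ChatSession:
    """Toy chat session (hypothetical): keeps earlier turns as context."""

    history: List[Tuple[str, str]] = field(default_factory=list)

    def build_prompt(self, user_message: str) -> str:
        """Combine every earlier turn with the new message into one prompt."""
        lines = [f"{role}: {text}" for role, text in self.history]
        lines.append(f"user: {user_message}")
        return "\n".join(lines)

    def record(self, user_message: str, reply: str) -> None:
        """Store a completed exchange so later prompts can refer back to it."""
        self.history.append(("user", user_message))
        self.history.append(("assistant", reply))

    def reset(self) -> None:
        """End the session: all earlier context is forgotten."""
        self.history.clear()


session = ChatSession()
session.record("Suggest a name for my dog", "How about Maximilian?")

# The follow-up only makes sense because the earlier turns travel with it.
print(session.build_prompt("Something shorter, please"))

session.reset()  # after this, the next prompt contains no earlier turns
```

The point of the sketch is the contrast with a classic search box, where every query starts from scratch: here, a follow-up like "something shorter, please" can be interpreted only because the previous exchange is still part of the input.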
ChatGPT is also unhampered by advertising – most of the top results we get from using Google Search are there because someone has paid for them to be there. However, this is likely to change as ChatGPT becomes commercialized – we are going to have to start paying for the massive amount of compute power used when it processes our queries at some point, and OpenAI is already trialing “premium” paid-for versions in several regions including the US and UK.
The Downsides of ChatGPT
The first matter that has to be addressed is accuracy. ChatGPT is a very new tool and is trained on a static, unaudited dataset. This, unfortunately, means it's prone to making mistakes and suffering from bias – the bane of much of today’s AI technology. With any data processing, the first rule is “garbage in, garbage out.” Anyone who has tried playing around with ChatGPT for any length of time is likely to have come across these inaccuracies. For now, we can probably put them down to teething troubles, and it’s likely that its accuracy will improve as it continues to learn and evolve with further training.
This gives rise to another problem, though. The fact that ChatGPT is somewhat (read: very) opaque about where its data comes from means that it's very difficult to fact-check it or verify its sources.
These issues prompted the head of Google Search, Prabhakar Raghavan, to liken the operation of AI-powered chatbots to "hallucination," describing them as working "in such a way that a machine provides a convincing but completely made-up answer."
While Google’s language processing may seem old hat by comparison, at least it is up-front about where its information comes from. This will generally be the owners and operators of the websites that it surfaces. Of course, this by no means guarantees that everything it tells us will be truthful and accurate – but it’s far more straightforward to assess the validity and trustworthiness of the information.
Convergence – the Future of Search?
Of course, these upsides and downsides are mainly relevant to ChatGPT (and indeed Google Search) as they stand today. ChatGPT, in particular, is a nascent technology, and what’s on offer right now is only a glimpse of what it – and similar platforms – will be capable of in the near future.
One thing that’s certain, though, is that Alphabet won’t simply roll over and concede defeat. The commercialization of its search technology has played a big part in making it one of the richest companies in the world, and it isn't a cash cow it will want to give up anytime soon.
So, almost simultaneously with Microsoft announcing that ChatGPT functionality will be added to Bing, Alphabet said that its own large language model, known as Bard, will be used to enhance the performance of Google Search.
Things didn’t get off to the best start, however – with errors made by the machine in promotional videos blamed for wiping $100 billion off the value of the company.
But if we are being generous enough to put ChatGPT’s errors down to "teething troubles," we have to afford Google the same benefit of the doubt.
It’s most likely that we will see both technologies – large language model-based chat interfaces and search engines – come together to create hybrid technologies, which hopefully will provide us with the best of both worlds.
Both Google and Microsoft clearly believe that the next stage in the evolution of information technology will center around this convergence of search and language. And both understand that the catalyst for this will be AI. Yet another sign that the era of “thinking” machines is already transforming every aspect of society in ways that would have seemed unimaginable just a few years ago.