Imagine a collection of books—maybe millions or even billions of them—haphazardly tossed by publishers into a heaping pile in a field. Every day the pile grows exponentially.
Those books are brimming with knowledge and answers. But how would a seeker find them? Lacking organization, the books are useless.
This is the raw internet in all its unfiltered glory. Which is why most of our quests for “enlightenment” online begin with Google (and yes, there are still other search engines). Google’s algorithmic tentacles scan and index every book in that ungodly pile. When someone enters a query in the search bar, the search algorithm thumbs through its indexed version of the internet, surfaces pages, and presents them in a ranked list of the top hits.
This approach is incredibly useful. So useful, in fact, that it hasn’t fundamentally changed in over two decades. But now, AI researchers at Google, the very company that set the bar for search engines in the first place, are sketching out a blueprint for what might be coming up next.
In a paper on the arXiv preprint server, the team suggests the technology to make the internet even more searchable is at our fingertips. They say large language models—machine learning algorithms like OpenAI’s GPT-3—could wholly replace today’s system of index, retrieve, then rank.
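For readers who want to see "index, retrieve, then rank" in miniature, here is a rough Python sketch of the three steps. It is a toy and not how Google or any production engine works: the three-page corpus, the function names, and the scoring rule (counting how many query words a page contains) are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy corpus standing in for the "pile of books".
PAGES = {
    "wine-health": "red wine health benefits and risks",
    "wine-pairing": "red wine pairing with cheese",
    "tea-guide": "earl grey tea brewing guide",
}


def build_index(pages):
    """Index: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index


def retrieve(index, query):
    """Retrieve: gather every page containing at least one query word."""
    candidates = set()
    for term in query.lower().split():
        candidates |= index.get(term, set())
    return candidates


def rank(pages, candidates, query):
    """Rank: order candidates by distinct query words matched (assumed scoring)."""
    terms = set(query.lower().split())

    def score(page_id):
        return len(terms & set(pages[page_id].lower().split()))

    # Highest score first; ties are broken alphabetically by page id.
    return sorted(candidates, key=lambda page_id: (-score(page_id), page_id))


def search(pages, query):
    """Run the full index, retrieve, rank pipeline for one query."""
    index = build_index(pages)
    return rank(pages, retrieve(index, query), query)


print(search(PAGES, "red wine health"))
```

Real search engines weigh far more signals than word overlap, but the shape is the same: the result is a ranked list of pages, and the searcher still has to read them to assemble an answer. That list is the step the authors propose replacing with a single generated response.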
Is AI the Search Engine of the Future?
When seeking information, most people would love to ask an expert and get a nuanced and trustworthy response, the authors write. Instead, they Google it. This can work, or go terribly wrong. Like when you get sucked down a panicky, health-related rabbit hole at two in the morning.
Though search engines surface (hopefully quality) sources that contain at least pieces of an answer, the burden is on the searcher to scan, filter, and read through the results to piece together that answer as best they can.
Search results have improved by leaps and bounds over the years. Still, the approach is far from perfect.
There are question-and-answer tools, like Alexa, Siri, and Google Assistant. But these tools are brittle, with a limited (though growing) repertoire of questions they can field. Though they have their own shortcomings (more on those below), large language models like GPT-3 are much more flexible and can construct novel replies in natural language to any query or prompt.
The Google team suggests the next generation of search engines might synthesize the best of all worlds, folding today’s top information retrieval systems into large-scale AI.
It’s worth noting machine learning is already at work in classical index-retrieve-then-rank search engines. But instead of merely augmenting the system, the authors propose machine learning could wholly replace it.
“What would happen if we got rid of the notion of the index altogether and replaced it with a large pre-trained model that efficiently and effectively encodes all of the information contained in the corpus?” Donald Metzler and coauthors write in the paper. “What if the distinction between retrieval and ranking went away and instead there was a single response generation phase?”
One ideal result they envision is a bit like the starship Enterprise’s computer in Star Trek. Seekers of information pose questions, the system answers conversationally—that is, with a natural language reply as you’d expect from an expert—and includes authoritative citations in its answer.
In the paper, the authors sketch out what they call an aspirational example of what this approach might look like in practice. A user asks, “What are the health benefits of red wine?” The system returns a nuanced answer in clear prose from multiple authoritative sources—in this case WebMD and the Mayo Clinic—highlighting the potential benefits and risks of drinking red wine.
It needn’t end there, however. The authors note that another benefit of large language models is their ability to learn many tasks with only a little tweaking (this is known as one-shot or few-shot learning). So they may be able to perform all the same tasks current search engines accomplish, and dozens more as well.
Still Just a Vision
Today, this vision is out of reach. Large language models are what the authors call “dilettantes.”
Algorithms like GPT-3 can produce prose that is, at times, nearly indistinguishable from passages written by humans, but they’re also still prone to nonsensical replies. Worse, they heedlessly reflect biases embedded in their training data, lack contextual understanding, and can’t cite sources (or even distinguish high-quality from low-quality sources) to justify their responses.
“They are perceived to know a lot but their knowledge is skin deep,” the authors write. The paper also lays out breakthroughs needed to bridge the gap. Indeed, many of the challenges they outline apply to the field at large.
A key advance would be moving beyond algorithms that only model the relationships between terms (such as individual words) to algorithms that also model the relationship between the words in an article and the article as a whole. They would model the relationships between many different articles across the internet as well.
Researchers also need to define what constitutes a quality response. This in itself is no easy task. But, for starters, the authors suggest high-quality responses should be authoritative, transparent, unbiased, and accessible, and should contain diverse perspectives.
Even the most cutting-edge algorithms today don’t come close to this bar, and it would be unwise to deploy natural language models on this scale until these challenges are solved. But if they are solved (and work is already underway on some of them), search engines wouldn’t be the only applications to benefit.
‘Earl Grey, Hot’
It’s an enticing vision. Combing through web pages in search of answers while trying to determine what’s trustworthy and what isn’t can be exhausting.
Undoubtedly, many of us don’t do the job as well as we could or should.
But it’s also worth speculating how an internet accessed like this would change the way people contribute to it.
If we primarily consume information by reading prose-y responses synthesized by algorithms—as opposed to opening and reading the individual pages themselves—would creators publish as much work? And how would Google and other search engine makers compensate creators who, in essence, are making the information that trains the algorithms themselves?
There would still be plenty of people reading the news, and in those cases, search algorithms would need to serve up lists of stories. But I wonder if a subtle shift might occur where smaller creators add less, and in doing so, the web becomes less information rich, weakening the very algorithms that depend on that information.
There’s no way to know. Often, speculation is rooted in the problems of today and proves naive in hindsight. In the meantime, the work will no doubt continue.
Perhaps we’ll solve these challenges—and more as they arise—and in the process arrive at that all-knowing, pleasantly chatty Star Trek computer we’ve long imagined.