Thinking About GPT Zero

Large language models have been all over the news for the past couple of months, with the launch of ChatGPT (and Microsoft's subsequent investment in OpenAI) kicking off an arms race of sorts within big tech. The capabilities of these latest models are impressive, and they do seem to represent a step change relative to prior models, with the incorporation of human feedback leading to significantly more sensible responses (on average). People are rightly excited about the impacts these tools might have, and I think for some applications the hype may be real (e.g. search, brainstorming, drafting, etc.). However, I've also seen the idea expressed that these models represent a significant step toward superhuman intelligence, and I want to share some thoughts on why that's likely not the case.

This Wait But Why blog post summarizes well the view I'm responding to. The core idea is that although we tend to see intelligence mainly in human terms, with Einstein at the top and the "village idiot" near the bottom (and all other animals below), it's really a broad spectrum. While the range of human intelligence seems large to us, it occupies only a narrow slice of the full range of possible intelligence, with plenty of space above. This means that once we have AI systems as intelligent as the "village idiot" (e.g. ChatGPT?), it's likely we'll soon see systems with superhuman intelligence, as there's no artificial "ceiling" at the highest human levels.

While this view of intelligence is simplified (in reality, intelligence varies across many axes and doesn't extend to arbitrarily high limits), the range presented seems directionally accurate (though chimp, bird, and ant intelligence likely aren't as far off as depicted). Given our evolutionary history, there's no reason to think humans represent the pinnacle of possible intelligence. The AI intelligence curve shown in that post therefore makes sense in a scenario where AI is developed from first principles, as there would be no reason for progress to stop at arbitrary human limits. However, the AI systems of today are not built from first principles; they rely almost entirely on human-generated training data. This method of construction does place a ceiling on potential capabilities, as highlighted by prior progress in other areas of AI.

Before the more recent shift toward text and image generation, one of the main areas of AI research was games, whose well-defined rules and clear win conditions made them more tractable for computer learning. Chess was "cracked" when Deep Blue beat Kasparov back in the 90s, but it remained difficult for computers to achieve expert-level play in more complex games. The AI company DeepMind set out to change that, starting with the board game Go. While its pieces are simpler than those in chess, Go has far more move possibilities (361 options for the first move, followed by 360 for the second, and so on), making it much harder for traditional search algorithms to handle.
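
To get a feel for the gap, here's a rough back-of-the-envelope comparison in Python. It's purely illustrative: it uses the commonly cited average branching factor of about 35 for chess, and for Go it simply counts open intersections, ignoring captures and legality details.

```python
# Rough comparison of game-tree growth for chess vs. Go (illustrative only).
# The chess branching factor of ~35 is a commonly cited approximation; for Go
# we count the open intersections on a 19x19 board (361, 360, ...).

def chess_sequences(depth, branching=35):
    """Approximate number of move sequences after `depth` plies in chess."""
    return branching ** depth

def go_sequences(depth, board_points=361):
    """Approximate number of move sequences after `depth` plies in Go,
    ignoring captures (which would free up points again)."""
    total = 1
    for ply in range(depth):
        total *= board_points - ply
    return total

for depth in (2, 4, 10):
    print(f"depth {depth:2d}: chess ~{chess_sequences(depth):.2e}, "
          f"Go ~{go_sequences(depth):.2e}")
```

Even ten moves deep, the Go tree is roughly ten orders of magnitude larger than the chess tree, which is the gap brute-force search has to contend with.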

Due to this complexity, DeepMind could not see a way to start from the first principles of Go, and instead used thousands of human Go games as training data for the system (named AlphaGo). These human games allowed them to sidestep the problem of finding a reward signal within all the possibilities of Go: they knew which human won each game, and could train toward those winning moves. Taking this approach resulted in reasonably strong Go capabilities, but not superhuman ones, which makes sense given the training set. AlphaGo was trained on games of both masters and amateurs (to ensure sufficient training data), and so its performance, even assuming perfect training, would fall somewhere in the middle of that distribution.
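
As a rough sketch of what "train toward those winning moves" looks like in practice, the snippet below trains a tiny policy network to predict the move chosen by the winning side in each position. Everything here is a placeholder for illustration (the flat board encoding, the network size, the random stand-in data), built with PyTorch; it is not AlphaGo's actual architecture or pipeline.

```python
# Minimal sketch of supervised policy learning from human games (illustrative
# placeholders throughout -- not AlphaGo's real board encoding or network).
import torch
import torch.nn as nn

BOARD_POINTS = 19 * 19  # 361 intersections on a Go board

policy_net = nn.Sequential(
    nn.Linear(BOARD_POINTS, 256),
    nn.ReLU(),
    nn.Linear(256, BOARD_POINTS),  # one logit per possible move
)
optimizer = torch.optim.Adam(policy_net.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def training_step(board_states, winning_moves):
    """One gradient step toward the moves the winning player chose.

    board_states:  (batch, 361) float tensor encoding the position
    winning_moves: (batch,) long tensor of move indices in [0, 361)
    """
    logits = policy_net(board_states)
    loss = loss_fn(logits, winning_moves)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Random placeholder data standing in for a real database of human games:
states = torch.randn(32, BOARD_POINTS)
moves = torch.randint(0, BOARD_POINTS, (32,))
print(training_step(states, moves))
```

The important point is the target: the network is pushed toward whatever the winning humans played, so even a perfectly trained model can only be about as good as the humans in the dataset.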

Achieving superhuman performance required DeepMind to move in the direction of first principles and develop a way for AlphaGo to learn from self-play, creating its own reward signal without relying on any human games. The exact method of learning from self-play was complex (involving neural net representations of game states and a certain level of move randomness), but the overall idea was simple: two instances of AlphaGo would be set up to play against each other, the winning side's moves would be used as training data, and two even more powerful instances of AlphaGo would then repeat the process.
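
The skeleton below sketches that loop. The game itself and the policy update are deliberately trivial placeholders (random moves, a randomly chosen winner) so that the structure runs end to end; a real system would plug in the actual game rules and a gradient-based update like the one sketched above.

```python
# Structural sketch of the self-play loop described above (placeholders only).
import random

def policy(state):
    """Initial weak policy: play a uniformly random point."""
    return random.randrange(361)

def play_game(current_policy):
    """Placeholder game: two copies of the policy alternate for a fixed number
    of turns, and a winner is picked at random.  A real implementation would
    apply the actual game rules and scoring."""
    history = {0: [], 1: []}
    for turn in range(20):
        player = turn % 2
        history[player].append((turn, current_policy(turn)))
    winner = random.choice([0, 1])
    return history[winner]  # keep only the winning side's (state, move) pairs

def update_policy(current_policy, winning_examples):
    """Placeholder update: a real system would take gradient steps toward the
    winning moves, producing a stronger policy for the next generation."""
    return current_policy

for generation in range(5):
    examples = []
    for _ in range(100):  # many games per generation
        examples.extend(play_game(policy))
    policy = update_policy(policy, examples)
    print(f"generation {generation}: trained on {len(examples)} winning moves")
```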

Taking this self-play approach, DeepMind developed a version of AlphaGo which famously beat Lee Sedol, a top Go player, in 2016. However, they didn't stop there. Having recognized that training on human data imposes an unwanted ceiling on performance, DeepMind wondered whether it was possible to learn entirely from self-play instead. The main issue with this approach is that it's difficult to identify a useful training signal at the earliest stages. For a complex game like Go, looking at the winning moves when both sides are playing randomly doesn't provide much information about how to play well (unlike, say, Tic-Tac-Toe, where a system can learn fairly easily from games of random moves). DeepMind overcame this obstacle in large part through sheer scale, with the new system, AlphaGo Zero, playing nearly 5 million games against itself in the first few days of training. AlphaGo Zero ended up significantly more capable at Go than the earlier versions, beating the original AlphaGo that played Lee Sedol 100 to 0, and the most improved version of AlphaGo 89 to 11.
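
The Tic-Tac-Toe claim is easy to check directly. The toy script below estimates the value of each opening square purely from random rollouts; the center typically comes out on top, showing that for a game this simple, random play alone already produces a usable training signal. (This is just a Monte Carlo illustration, not anything DeepMind used.)

```python
# Estimate the value of each Tic-Tac-Toe opening move from random rollouts.
import random
from collections import defaultdict

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None  # no winner (yet)

def random_game(opening):
    """Play X's chosen opening move, then both sides play randomly to the end."""
    board = [None] * 9
    board[opening] = 'X'
    player = 'O'
    while winner(board) is None and None in board:
        move = random.choice([i for i, v in enumerate(board) if v is None])
        board[move] = player
        player = 'X' if player == 'O' else 'O'
    return winner(board)  # 'X', 'O', or None for a draw

score = defaultdict(float)
for opening in range(9):
    for _ in range(5000):
        result = random_game(opening)
        score[opening] += 1.0 if result == 'X' else (0.5 if result is None else 0.0)

best = max(score, key=score.get)
print(f"best opening square by random rollouts: {best} (center is index 4)")
```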

Returning to the current state of AI, there's a clear analogy between learning from human Go games and learning from human-generated text. Just as AlphaGo hit a performance limit before training from self-play, today's language models will not be able to "go superhuman" without some advance in how we train them. Unfortunately, it's not clear how to make that jump. There's no obvious concept of "self-play" for text generation, as the rules of language are far more ambiguous and open-ended than those of games.

Even with an approach analogous to self-play for text generation, it’s not clear what it would mean to achieve superhuman performance in the domain. Language is a human construct that provides a common interface for us to share our representations of the world, and so the concept of superhuman language performance by itself appears empty. It’s the superhuman representation of the world that we really care about, which we still seem far from capturing with AI. 

On the bright side, we don’t need our search engines to be smarter than the experts or our first drafts to be literary masterpieces. Superhuman performance isn’t required for language models to have a significant impact, and we’re entering an exciting time of identifying the best applications for these tools. As we do so, it will be important to remain rational about their capabilities and avoid getting distracted by delusions of superintelligent grandeur.

Comments

Jonas
1 year ago

Hi Andrew. If ChatGPT is given an IQ test (or similar tests) and it blows it out of the park, we'd eventually conclude it is smarter than humans, and thus in human terms it is superintelligent. We humans are the judge and jury over what we perceive as intelligent, so we will also ultimately conclude the arrival of superintelligence when an AI supersedes previous human ability. The question is, does this perceived superintelligence snowball into the conception of zero-point training? If an AI understands all of the data accumulated by humans, is it realistic to assume that…

meanderingmoose
1 year ago
Reply to Jonas

Hi Jonas, thanks for the question. My view is that achieving "ceiling-free" intelligence will require a reinvention of the training process to structure it in a way where the system is evaluated and updated against a more general conception of intelligence, rather than targeting specific examples (which are what create the ceiling). If the ceiling is high enough, it may be that AI models developed under the current paradigm can help (or even drive) this development, though I think it's more likely this new approach will need to be a human invention. On your first point – I actually wouldn't…

Jonas
1 year ago

Hi Andrew, thanks for that – I certainly hope you're right. The release of GPT-4 introduces the prospect of power-seeking, and while their initial tests were a failure (they gave it some money and allowed it to copy itself to make more money, but it was shut down by bot screens), it certainly testifies to OpenAI's expectation that the test could have been successful. Any thoughts or comments on how disruptive potential AI farms could be by GPT-5 or 6? Personally I worry Pandora's box is opening and we're begging for a financial crisis, as one stockbroker with AI help can act…