Edges of the Distribution

After spending more time with GPT-4, I have to admit I’m surprised at the level of “understanding” possible via simple next-token prediction (given massive scale). On a wide variety of tasks, the answers it provides are almost uncannily useful, and in domains like test-taking I did not think scale alone would drive the high …
