Michael Bukatin - Transformer Revolution

July 15, 2020

At the end of May 2020, the next stage of the Transformer revolution began.

The field experienced a qualitative jump, first with the OpenAI code generation assistant demo during the Microsoft Build 2020 event on May 20, 2020 (see, e.g., https://twitter.com/matvelloso/status/1263193089310461952), then with the GPT-3 paper, "Language Models are Few-Shot Learners", on May 28, 2020, and with the OpenAI API private beta on June 11, 2020.

We are in the middle of July, and we have seen reports from enough people using GPT-3 via that private beta to say with confidence that a new epoch has started: a transition at least as consequential as AlexNet was in September-December 2012 has just happened. It is interesting that this coincides with a period of unusually intense social, political, and economic turbulence (somehow, everything is coming to a head all at once).

Technologically speaking, we are at a point where things might start changing at an arbitrarily fast rate at any moment, in particular because people are ready to hybridize all kinds of approaches with Transformers, just as they hybridized all kinds of approaches with "deep" nets in recent years. As Jürgen Schmidhuber has been saying in recent years, "we are almost there". Now this has literally become true.


See this for a nice collection of usage examples: https://twitter.com/xuenay/status/1283312640199196673
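
As an illustration of what "few-shot" means in the GPT-3 paper's title, here is a minimal Python sketch of the prompting pattern behind many of those usage examples: the task is specified entirely inside the prompt via a handful of demonstrations, with no fine-tuning. The demonstrations echo the English-to-French example from the GPT-3 paper; the complete() function is only a hypothetical stand-in for a call to the GPT-3 API, not actual client code.

# Toy illustration of few-shot prompting: the model is "programmed"
# purely by a prompt containing a few input/output demonstrations.

def complete(prompt: str) -> str:
    # Hypothetical stand-in for a GPT-3 API call; not a real client.
    raise NotImplementedError("replace with an actual GPT-3 API call")

# A couple of demonstrations for an English-to-French task
# (echoing the few-shot example in the GPT-3 paper).
demonstrations = [
    ("sea otter", "loutre de mer"),
    ("cheese", "fromage"),
]

prompt = "Translate English to French.\n\n"
for english, french in demonstrations:
    prompt += "English: " + english + "\nFrench: " + french + "\n\n"
prompt += "English: peppermint\nFrench:"  # the model is expected to continue with the translation

print(prompt)            # this is the entire "program" given to the model
# translation = complete(prompt)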




March 16, 2024 updates:

https://twitter.com/matvelloso/status/1263193089310461952 has been deleted by its author (although it still exists on the Wayback Machine; the tweet's author has since moved from Microsoft to become a VP at Google).

It is probably more convenient to watch this historic video here: https://www.youtube.com/watch?v=eNhYTLWQFeg (the conversation with Sam Altman starts at 26:10, the intro to the demo starts at 29:10).


The next truly big revolution has been the emergence of GPT-4, starting with internal OpenAI releases in the summer of 2022, continuing with the Bing Chat "Sydney" personality in February 2023, and culminating in the official release on March 14, 2023.

GPT-4 is the first model that possesses "sparks" of true understanding and general competence. It is, roughly speaking, a human-equivalent model in the following sense: the road to superintelligent systems goes not through human-equivalent AGI systems, but around them; however, GPT-4-level systems are the closest to the human level on that trajectory, with reasonable trade-offs between their capabilities and human capabilities.


We are indeed seeing more and more hybrids between attention-based approaches and other approaches, and we are seeing a huge variety of these hybrids, starting with the famous AlphaFold 2 in 2020 and including all kinds of interesting software systems (even Sora, the currently leading text-to-video system demonstrated on February 15, 2024, is a Diffusion Transformer).
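
For concreteness, here is a minimal numpy sketch of the flavor of hybridization a Diffusion Transformer involves: an ordinary attention block whose normalization is modulated by an embedding of the diffusion timestep (adaLN-style conditioning). This is an illustrative toy with made-up names and dimensions, not Sora's or any published model's actual architecture.

import numpy as np

def timestep_embedding(t, dim):
    # Sinusoidal embedding of a diffusion timestep (toy version).
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    angles = t * freqs
    return np.concatenate([np.cos(angles), np.sin(angles)])

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(x, Wq, Wk, Wv):
    # Single-head scaled dot-product attention over a sequence of tokens.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

def modulated_block(x, t_emb, params):
    # One toy "diffusion transformer" block: the timestep embedding produces
    # a scale and shift that modulate the normalized tokens (adaLN-style
    # conditioning) before attention is applied, with a residual connection.
    scale = t_emb @ params["W_scale"]
    shift = t_emb @ params["W_shift"]
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True) + 1e-6
    h = (x - mu) / sd * (1 + scale) + shift
    return x + attention(h, params["Wq"], params["Wk"], params["Wv"])

rng = np.random.default_rng(0)
d, n_tokens = 16, 8                       # toy sizes: 8 "patch" tokens of width 16
params = {name: rng.normal(scale=0.1, size=(d, d))
          for name in ["Wq", "Wk", "Wv", "W_scale", "W_shift"]}
x = rng.normal(size=(n_tokens, d))        # noisy latent patches
t_emb = timestep_embedding(t=37, dim=d)   # conditioning on diffusion step 37
print(modulated_block(x, t_emb, params).shape)   # (8, 16)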