and some renewed interest in Transformers
Gwern https://www.gwern.net/newsletter/2020/05#gpt-3
HN comments where I found the link https://news.ycombinator.com/item?id=23623845
some intuition on the relation between Graph Neural Networks and the Transformer architecture https://graphdeeplearning.github.io/post/transformers-are-gnns/
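A minimal sketch of that intuition (my own toy illustration, not code from the post; all names and shapes here are made up for the example): single-head self-attention is message passing on a fully-connected graph, where each token is a node and the row-softmaxed attention matrix acts as a soft adjacency matrix.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over token features X of shape (n_tokens, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # per-node query/key/value projections
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # pairwise scores: one per (node, node) edge
    scores -= scores.max(axis=-1, keepdims=True)  # stabilise the softmax
    A = np.exp(scores)
    A /= A.sum(axis=-1, keepdims=True)            # row-softmax: a soft adjacency matrix
    return A @ V                                  # each node aggregates messages from all nodes

rng = np.random.default_rng(0)
n, d = 4, 8                                       # 4 tokens, width 8 (toy sizes)
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)        # (4, 8): one updated vector per token/node
```

A typical GNN restricts aggregation to a sparse, fixed neighbourhood; the Transformer lets every token attend to every other token and learns the edge weights, which is the core of the analogy.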
I should find a good intro to Transformers. They seem to scale the right way: ever bigger models keep delivering better performance, as if size alone drives performance.
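The concrete claim behind "size drives performance" is the power-law fit in Kaplan et al. 2020, "Scaling Laws for Neural Language Models": test loss L falls smoothly as a power law of the non-embedding parameter count N (constants quoted from the paper, so treat them as approximate):

```latex
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},
\qquad \alpha_N \approx 0.076,\quad N_c \approx 8.8 \times 10^{13}
```

The paper fits analogous laws for dataset size and compute; the striking part is that the curves stay straight on a log-log plot across many orders of magnitude.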
UPDATE 24 May 2021: Are we in an AI overhang?