Yes. Fabrice Bellard wrote a highly optimised library (libnc) [1] for training and inference of neural networks on CPU (x86 with AVX2), and implemented GPT-2 inference (gpt2tc) with it [2]. Later he added a CUDA backend to libnc. You can try it out at his website TextSynth [3], and I see it now runs various newer GPT-based models too, but it seems he hasn't released the code for that. That doesn't surprise me, as he didn't release the code for libnc either — only the parts of gpt2tc excluding libnc (libnc itself is distributed as a free binary), so someone could reimplement GPT-J and the other models themselves.
Incidentally, he's currently leading the Large Text Compression Benchmark with a neural-network-based compressor called nncp [4], which is based on this work. It learns its transformer-based model as it goes, and the earlier versions didn't use a GPU.
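To make "learns the model as it goes" concrete, here is a toy sketch of the general online-learning compression idea (a hypothetical stand-in, not Bellard's actual nncp code): encoder and decoder both start from the same untrained model, and after each symbol they apply an identical update, so no model weights ever need to be transmitted. A real arithmetic coder would spend about -log2(p) bits per symbol; the sketch just sums that ideal code length, and uses simple adaptive counts where nncp uses a transformer.

```python
import math
from collections import defaultdict

class AdaptiveOrder0Model:
    """Toy stand-in for nncp's transformer: adaptive per-byte counts."""
    def __init__(self):
        self.counts = defaultdict(lambda: 1)  # Laplace smoothing: 1 pseudo-count
        self.total = 256                      # per possible byte value

    def prob(self, symbol: int) -> float:
        return self.counts[symbol] / self.total

    def update(self, symbol: int) -> None:
        # The decoder performs this exact same update after decoding each
        # symbol, keeping both models bit-for-bit in sync.
        self.counts[symbol] += 1
        self.total += 1

def ideal_compressed_bits(data: bytes) -> float:
    """Ideal code length an arithmetic coder driven by this model would pay."""
    model = AdaptiveOrder0Model()
    bits = 0.0
    for b in data:
        bits += -math.log2(model.prob(b))  # cost of coding b under the model
        model.update(b)                    # then learn from it
    return bits

text = b"abracadabra abracadabra abracadabra"
print(f"{ideal_compressed_bits(text):.1f} bits vs {8 * len(text)} raw bits")
```

The key property is symmetry: because the update happens only after a symbol is coded, the decoder can reproduce every model state the encoder had, which is why no pretrained weights count against the compressed size.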
I kinda understand why he would not release the source code. Perhaps he's finally decided to monetize some of his coding skills. Maybe in the future he'll start releasing some of those newer and bigger models to the public, given that other big corps like FB have already started doing so (GPT-NeoX and OPT, as mentioned in the sibling comment by infinityio).
Yes, TextSynth.com is a commercial service; see pricing [1]. If his code is faster than others' (I'd certainly believe it), then it's quite valuable, and he deserves to be able to monetise it. Edit: also, OpenAI is slashing the price of GPT-3 by 2-3x tomorrow because of "progress in making our models more efficient to run" [2].
Also, he was/is competing for the Hutter Prize with nncp, but he falls outside the prize's requirements: CPU time, RAM, and most especially the rule that submissions shouldn't require a modern CPU (with AVX2) or a GPU. Otherwise he could have won it. I suspect that's actually the biggest reason he implemented libnc without GPU support initially. He has asked for the rules to be changed to allow AVX2, and I believe they eventually will be. So he won't give away the source for nncp yet, but he will have to open-source it to receive the prize.
[1] https://bellard.org/libnc/
[2] https://bellard.org/libnc/gpt2tc.html
[3] https://textsynth.com/
[4] http://www.mattmahoney.net/dc/text.html#1085