With respect to the incompatibilities with PyTorch and TensorFlow - given that t...

eslaught · on May 26, 2023

Drivers are only the lowest level of the stack. You could (in principle) have a great driver ecosystem and a nonexistent user-level ecosystem. And indeed, the user-level ecosystem on AMD and Intel seems to be suffering.

For example, I recently went looking into Numba for AMD GPUs. The answer was basically, "it doesn't exist". There was a version, it got deprecated (and removed), and the replacement never took off. AMD doesn't appear to be investing in it (as far as anyone can tell from an outsider's perspective). So now I've got a code that won't work on AMD GPUs, even though in principle the abstractions are perfectly suited to this sort of cross-GPU-vendor portability.

NVIDIA is years ahead not just in CUDA, but in terms of all the other libraries built on top. Unless I'm building directly on the lowest levels of abstraction (CUDA/HIP/Kokkos/etc. and BLAS, basically), chances are the things I want will exist for NVIDIA but not for the others. Without a significant and sustained ecosystem push, that's just not going to change quickly.

pmoriarty · on May 27, 2023

"NVIDIA is years ahead not just in CUDA, but in terms of all the other libraries built on top."

How big an effort would it take to get those libraries to work with AMD drivers?

singhrac · on May 26, 2023

I think this is what George Hotz is doing with tiny corp, but I have to admit I have little hope. Making asynchronous SIMD code fast is very difficult to begin with, let alone without an internal view of decisions like “why does this cause a sync” or even “will this unnecessary copy ever get fixed?”. Unfortunately AMD and especially Intel don’t “develop in the open”, so even if the drivers are open sourced, without that context it’ll be an uphill battle.

To give some perspective, see @ngimel’s comments and PRs on GitHub. That’s what AMD and Intel are competing against, along with confidence that optimizing for ML customers will pay off (clearly NVIDIA can justify the investment already).

pca006132 · on May 26, 2023

This kind of software development is hard and expensive. I do not think it would let you earn enough income from a benchmark website or YT channel, considering most people are not interested in these low-level details.