This might not be the right place for this question, but as someone who has made a couple of very modest MPS backend contributions, I'm curious: why not add Metal support to Triton (or to a fork, if OpenAI won't allow it) rather than maintain a whole separate backend?
Mostly comes down to what's fastest to develop: it's faster to write a few custom kernels than it is to develop a new compiler backend.
Granted, once you get past the upfront effort, compilers are such a significant UX boost that you're making me question why I don't spend more time working on this myself lol