Is that the only factor, though? I wonder if PyTorch is lacking optimizations for the MPS backend.
It's just that NVIDIA GPUs suck (relatively) at *single-user* LLM inference, and that makes Apple hardware feel not so bad by comparison.