Typical CNNs can learn any linear transformation of the input at essentially no cost: the first convolutional layer can represent it directly. Since YUV is exactly such a linear transformation of RGB, there is no benefit in converting to it beforehand.
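To make the "linear transformation" point concrete, here is a minimal NumPy sketch showing that a 1×1 convolution over an image is just a per-pixel matrix multiply, so fixing (or learning) its weights as the RGB→YUV matrix reproduces the color conversion exactly. The BT.601 coefficients below are one common choice; the function name `conv1x1` is just illustrative.

```python
import numpy as np

# BT.601 RGB -> YUV matrix (one common set of coefficients)
M = np.array([[ 0.299,    0.587,    0.114  ],
              [-0.14713, -0.28886,  0.436  ],
              [ 0.615,   -0.51499, -0.10001]])

def conv1x1(img, weights):
    """A 1x1 convolution: img is (H, W, C_in), weights is (C_out, C_in).
    Each pixel's channel vector is mapped by the same linear transform."""
    return img @ weights.T

rgb = np.random.rand(4, 4, 3)
yuv = conv1x1(rgb, M)

# Matches the deterministic per-pixel conversion exactly
reference = np.einsum('oc,hwc->hwo', M, rgb)
assert np.allclose(yuv, reference)
```

Since the first conv layer of a CNN contains this map as a special case, training can absorb (or ignore) the color-space change on its own.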
How is there no cost in forcing the machine to learn something we already have a simple, deterministic algorithm for? Won't some engineer need to double-check the AI's idea of a color-space transform?
You could probably derive a smart initialization for the first layer of a NN from domain knowledge (color spaces, Sobel filters, etc.). But since this is such a small part of what the NN has to learn, I expect it would yield only a small improvement in training time and no effect on final performance or accuracy, so it's unlikely to be worth the complexity of developing such a feature.
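For illustration, a domain-informed initialization like the one described above might look as follows: a few filters of the first conv layer start as known operators (Sobel edges, BT.601 luma weights) while the rest stay random. This is a hedged sketch; the function name `smart_init` and the choice of which filters to seed are assumptions, not an established recipe.

```python
import numpy as np

def smart_init(out_channels, in_channels=3, k=3, seed=None):
    """Initialize a (out_channels, in_channels, k, k) conv kernel bank:
    the first few filters encode known operators, the rest are random."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0.0, 0.1, size=(out_channels, in_channels, k, k))

    sobel_x = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)
    luma = np.array([0.299, 0.587, 0.114])  # BT.601 luma weights

    w[0] = sobel_x      # filter 0: horizontal edges (broadcast over channels)
    w[1] = sobel_x.T    # filter 1: vertical edges
    w[2] = 0.0          # filter 2: per-pixel luma, i.e. a 1x1 color
    w[2, :, 1, 1] = luma  # transform embedded in the kernel center
    return w

W = smart_init(16)
```

The remaining filters train from scratch as usual; only the seeded ones encode prior knowledge, which is why the expected gain is limited to early training speed.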
Your instincts are correct. Training is faster, more stable, and more efficient that way. In certain cases it is pretty much irrelevant, but the advantages of modelling the knowns and training only on the unknowns become starkly apparent when doing, e.g., sensor fusion or other ML tasks on physical systems.