In some recent work from my group [1], we reduce the complexity of keeping up with new SIMD ISAs by retargeting code between generations. For example, a compiler pass can take code written against SSE2 intrinsics and emit AVX-512: it auto-vectorizes hand-vectorized code. With a more capable compiler, programmers and library users get speedups as the ISA grows, without rewriting their code or relying on scalar auto-vectorization. That said, the growth of the x86 ISA certainly pushed some complexity onto us as compiler writers: we had to write a pass to retarget instructions!
A patch was recently contributed to GCC that converts MMX intrinsics to SSE. The GCC Power target also supports x86 vector intrinsics, converting them to their Power equivalents.
It's not as ambitious as your approach, though; it's more of a 1:1 translation and thus can't take advantage of wider vectors.
That patch is primarily there to avoid the pitfalls of MMX on modern architectures, where it is gradually being deprecated. On SKX, operations that are available on both ports 0 and 1 for SSE or AVX are only available on port 0 for MMX, so code that uses MMX gets half the throughput (which may or may not matter, but still).
Thanks for the explanation; I wasn't aware of the reasoning behind it. I would guess that by now all actively maintained performance-critical code has been rewritten in something more modern, so it certainly makes sense for Intel to minimize the number of gates they dedicate to MMX.
[1] https://www.nextgenvec.org/#revec