There's a Reddit comment here https://www.reddit.com/r/LocalLLaMA/comments/1r4m4it/comment... that says:

my system is running GLM-5 MXFP4 at about 17 tok/s. That’s with a single RTX Pro 6000 on an EPYC 9455P with 12 channels of DDR5-6400. Only 16k context though, since it’s too slow to use for programming anyway and that’s the only application where I need big context.
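As a rough sanity check on the figures in that comment, here is a minimal sketch of the arithmetic. It assumes each DDR5 channel moves 8 bytes per transfer (a 64-bit channel) and uses theoretical peak bandwidth, which real systems don't reach. It also ignores whatever part of the model sits in the GPU's VRAM, so the result is only a loose ceiling on how much data could be streamed from system RAM per token. The function names are made up for illustration.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int,
                       bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Assumption: 8 bytes per transfer per channel (64-bit DDR5 channel).
    """
    # channels * MT/s * bytes = MB/s; divide by 1000 to get GB/s
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


def ram_gb_per_token_ceiling(bandwidth_gbs: float, tokens_per_s: float) -> float:
    """Upper bound on GB read from RAM per generated token.

    Assumption: generation is purely limited by RAM bandwidth, which
    ignores weights held in VRAM and any compute overhead.
    """
    return bandwidth_gbs / tokens_per_s


# Numbers taken from the comment: 12 channels of DDR5-6400, about 17 tok/s.
bandwidth = peak_bandwidth_gbs(channels=12, mega_transfers_per_s=6400)
ceiling = ram_gb_per_token_ceiling(bandwidth, tokens_per_s=17)

print(f"Theoretical peak bandwidth: {bandwidth:.1f} GB/s")
print(f"Ceiling on RAM traffic per token: {ceiling:.1f} GB")
```

So at 17 tok/s the system can't be pulling more than roughly 36 GB from RAM per token, and in practice it's less, since sustained bandwidth is below the theoretical peak.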