One significant thing slowing this down is all the access to the shared mutable buffer. I'm confident it would go quicker if renderWeirdGradient were changed to be a pure function that either returned a newly populated buffer or took the buffer as an argument and returned the result.
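To make the pure-function idea concrete, here is a minimal sketch. The original renderWeirdGradient isn't shown here, so the parameter names, the `[UInt32]` pixel type and the gradient formula are all my assumptions:

```swift
// Hypothetical signature and pixel layout (assumptions, not the original code).
// The function touches no shared state: it builds a local buffer and returns it.
func renderWeirdGradient(width: Int, height: Int,
                         blueOffset: Int, greenOffset: Int) -> [UInt32] {
    var pixels = [UInt32](repeating: 0, count: width * height)
    for y in 0..<height {
        // Keep only the low 8 bits for each colour channel.
        let green = UInt32(truncatingIfNeeded: y &+ greenOffset) & 0xFF
        let rowStart = y * width
        for x in 0..<width {
            let blue = UInt32(truncatingIfNeeded: x &+ blueOffset) & 0xFF
            pixels[rowStart + x] = (green << 8) | blue
        }
    }
    return pixels
}
```

The caller then assigns the result to wherever the frame lives, e.g. `frame = renderWeirdGradient(width: 960, height: 540, blueOffset: 0, greenOffset: 0)`. Because the buffer is local inside the loop, the compiler has a much easier time reasoning about it.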
An even quicker fix that should give some improvement is to make the buffer final, so that fewer indirections are required to read from and write to it.
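For illustration, this is roughly what I mean by a final buffer. The class name `Renderer` and its members are made up for the example, since I'm only guessing at how the original code is laid out:

```swift
// Hypothetical class (an assumption, not the original code).
class Renderer {
    let width: Int
    let height: Int
    // `final` means no subclass can override this property, so the compiler
    // may access it directly instead of going through dynamic dispatch.
    final var buffer: [UInt32]

    init(width: Int, height: Int) {
        self.width = width
        self.height = height
        self.buffer = [UInt32](repeating: 0, count: width * height)
    }

    func renderWeirdGradient(blueOffset: Int, greenOffset: Int) {
        for y in 0..<height {
            let green = UInt32(truncatingIfNeeded: y &+ greenOffset) & 0xFF
            let rowStart = y * width
            for x in 0..<width {
                let blue = UInt32(truncatingIfNeeded: x &+ blueOffset) & 0xFF
                buffer[rowStart + x] = (green << 8) | blue
            }
        }
    }
}
```

Marking the whole class `final` would have the same effect on this property if nothing needs to subclass it.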
As others have already said, the unsafeMutableBuffer may be appropriate for performance-critical inner loops.
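I'm assuming the unsafe buffer mentioned in the other answers means something like the standard library's `UnsafeMutableBufferPointer`, reached through `Array.withUnsafeMutableBufferPointer`. A rough sketch of an inner loop written that way follows; the function name `fillWeirdGradient` and its parameters are hypothetical:

```swift
// Hypothetical helper (an assumption, not the original code).
// Writes through an UnsafeMutableBufferPointer for the duration of the closure.
func fillWeirdGradient(_ pixels: inout [UInt32], width: Int, height: Int,
                       blueOffset: Int, greenOffset: Int) {
    // Check the size once up front; the pointer subscript below is only
    // bounds-checked in debug builds.
    precondition(pixels.count >= width * height, "buffer too small")
    pixels.withUnsafeMutableBufferPointer { buffer in
        for y in 0..<height {
            let green = UInt32(truncatingIfNeeded: y &+ greenOffset) & 0xFF
            let rowStart = y * width
            for x in 0..<width {
                let blue = UInt32(truncatingIfNeeded: x &+ blueOffset) & 0xFF
                buffer[rowStart + x] = (green << 8) | blue
            }
        }
    }
}
```

The pointer must not escape the closure, and getting an index wrong in a release build is undefined behaviour, so I would keep this confined to the hot loop.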
I've just tried making the buffer final and it results in a speedup of up to about 8 times (not rigorously measured): from ~0.02 to between 0.0025 and 0.0045 on my computer.