Re: serpentine traversal: this has to do with the .reuse suffix applied to register operands, as mentioned in your link. We don’t really have control over it because it happens inside ptxas during SASS generation, but when CUTLASS does serpentine traversal it’s suggesting an order of MMA instruction issue in which at least one operand can be reused from one instruction to the next. Clever stuff.
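To make the ordering idea concrete, here is a minimal host-side sketch in plain C++ (not CUTLASS code, and not anything ptxas sees). It assumes a grid of MMA tiles where tile (m, n) reads A fragment m and B fragment n; `issue_order` and `count_shared_operand_steps` are names I made up for illustration. It only models the issue order, so whether ptxas actually emits .reuse for a given pair of instructions is still up to ptxas.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: issue order over a rows x cols grid of MMA tiles.
// Tile (m, n) stands for one MMA that reads A fragment m and B fragment n.
std::vector<std::pair<int, int>> issue_order(int rows, int cols, bool serpentine) {
    assert(rows > 0 && cols > 0);
    std::vector<std::pair<int, int>> order;
    order.reserve(static_cast<std::size_t>(rows) * cols);
    for (int n = 0; n < cols; ++n) {
        for (int i = 0; i < rows; ++i) {
            // Serpentine: walk the rows backwards on odd columns, so the
            // first tile of a column keeps the same m as the previous tile.
            int m = (serpentine && (n % 2 == 1)) ? rows - 1 - i : i;
            order.emplace_back(m, n);
        }
    }
    return order;
}

// Counts consecutive issue pairs that share an A index or a B index,
// i.e. steps where one operand could in principle stay in place.
int count_shared_operand_steps(const std::vector<std::pair<int, int>>& order) {
    int shared = 0;
    for (std::size_t k = 1; k < order.size(); ++k) {
        bool same_a = order[k].first == order[k - 1].first;
        bool same_b = order[k].second == order[k - 1].second;
        if (same_a || same_b) ++shared;
    }
    return shared;
}
```

For a 4x4 grid, the plain order loses the shared operand every time it jumps from the bottom of one column back to the top of the next, while the serpentine order shares an operand on every one of its 15 steps.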
I’d be so happy if SASS were documented and ptxas were open source. Sometimes I spend entire days going through whitepapers and various sources of online documentation to get more hardware details…