
Are you sure it wasn't just instruction alignment? Inserting nops before loop jump targets to align the first loop body instruction to 8 or 16 bytes is a very common x86 thing most compilers do. See e.g. https://reverseengineering.stackexchange.com/a/2930.
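To make the padding concrete, here is a rough sketch of the arithmetic an assembler or compiler would do to decide how many filler bytes (e.g. nops) go in front of a loop target. The function name `padding_for_alignment` is made up for illustration and the addresses are arbitrary; this is not how any particular compiler implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of padding bytes needed so that an
 * instruction currently at `addr` would start at the next multiple of
 * `alignment`. Returns 0 if `addr` is already aligned.
 * `alignment` must be a nonzero power of two (e.g. 8 or 16).
 */
static size_t padding_for_alignment(uintptr_t addr, size_t alignment)
{
    /* Precondition: power-of-two alignment, so the mask trick is valid. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    uintptr_t mask = (uintptr_t)alignment - 1;

    /* Distance from addr up to the next aligned boundary, wrapped to 0
     * when addr already sits on a boundary. */
    return (size_t)(((uintptr_t)alignment - (addr & mask)) & mask);
}
```

So a loop head that would otherwise land at 0x1003 needs 13 bytes of padding to start at 0x1010 with 16-byte alignment, while one already at 0x1000 needs none.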


Would that explain the large difference we observed in the "branch-misses" statistic when we ran it under "perf stat"?



