
People have been suggesting the time stamp counter, but that's actually not ideal, because reading it has a lot of overhead. On my desktop at work (a very beefy Intel Xeon) each read adds about 30 CPU cycles. It also drains all the pipelines.

For a microbenchmark like this, I find it's usually better to call it in a loop 1,000,000 times, time the whole loop, and divide the total by the number of iterations. That often gives you a "best case" number, since e.g. the CPU doesn't need to decode the instructions on every iteration because they fit in the appropriate cache. But it avoids paying the timestamp counter's overhead on every call.
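
Roughly what I mean, as a minimal sketch in Python rather than anything tuned. The names work and time_per_call_ns are made up for illustration, and time.perf_counter_ns stands in for whatever clock you would actually use. The point is only that the clock is read twice in total, not once per call.

```python
import time


def work(x: int) -> int:
    # Hypothetical stand-in for the operation being measured.
    return x * x + 1


def time_per_call_ns(func, iterations: int = 1_000_000) -> float:
    """Average wall-clock nanoseconds per call of func.

    The clock is read once before and once after the loop, so its
    overhead is spread across all iterations.
    """
    acc = 0
    start = time.perf_counter_ns()
    for i in range(iterations):
        acc += func(i)  # accumulate the result so every call is really made
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


if __name__ == "__main__":
    # The figure includes loop and call overhead, not just the body of work.
    print(f"{time_per_call_ns(work):.1f} ns per call")
```

In an interpreted language the loop overhead itself is a big part of the number, so treat the result as an upper bound on the cost of the function. The same structure carries over to C or similar, where you also have to make sure the compiler doesn't optimize the loop away.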


