Valgrind leaves most instructions as they were, doesn't it? If you're not touching dynamic memory, it should be just as fast. You wouldn't be able to do that with complex MMX or SSE2 instructions when translating to Arm.
It lifts them to a simplified intermediate representation (VEX IR) rather than leaving them as x86, so they can be instrumented / transformed more easily. I think that implies you get different instructions when it lowers back to x86, but I could be wrong. (I’ve written a specialized valgrind instrumentation tool or two, but didn’t look too carefully at the execution half of their codebase.)
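
To make the lift / instrument / lower idea concrete, here's a toy sketch in Python. Everything in it (the tuple format, the `lift`, `instrument` and `lower` functions, the `add_mem` and `check_addr` ops) is made up for illustration and is not Valgrind's actual IR or API. It only shows why the instructions that come out the other end needn't match the ones that went in.

```python
# Toy lift -> instrument -> lower pipeline.
# Hypothetical names throughout; this is NOT Valgrind's real IR or API.

def lift(insn):
    """Split one CISC-style instruction into simple IR statements."""
    op, dst, src = insn
    if op == "add_mem":  # stands in for a read-modify-write like: add [dst], src
        return [
            ("load", "t0", dst),       # t0 = memory[dst]
            ("add", "t1", "t0", src),  # t1 = t0 + src
            ("store", dst, "t1"),      # memory[dst] = t1
        ]
    return [(op, dst, src)]  # register-only ops pass through as a single statement


def instrument(ir):
    """Insert a check before every memory access, as a memory checker might."""
    out = []
    for stmt in ir:
        if stmt[0] == "load":
            out.append(("check_addr", stmt[2]))
        elif stmt[0] == "store":
            out.append(("check_addr", stmt[1]))
        out.append(stmt)
    return out


def lower(ir):
    """Emit one host instruction per IR statement (no attempt to re-fuse them)."""
    return [" ".join(stmt) for stmt in ir]


def translate(insn):
    return lower(instrument(lift(insn)))


if __name__ == "__main__":
    for line in translate(("add_mem", "rax", "rbx")):
        print(line)
```

In this toy, one memory-touching instruction goes in and five come out, while a register-only one like `("mov", "rax", "rbx")` round-trips as a single instruction. Whether the real lowering re-fuses things back into the original forms is exactly the part I haven't checked.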