Optimizers follow what the C specification says. When the spec says "this can be reordered", they are free to reorder it.
There are practical reasons why it's not "closer to the hardware":
• it would be harder to have a single compiler front-end for multiple CPU back-ends.
• what programmers have in mind for "closer to the hardware" is actually quite fuzzy, based roughly on how they imagine a naive C compiler would generate machine code. That interpretation doesn't have a spec. The things that seem obvious aren't that obvious in detail, or lead to unexpected performance cliffs (e.g. did you know that non-byte array indexing on 64-bit machines would be slow if signed int were defined to "simply" overflow? The compiler could no longer keep a 32-bit index widened in a 64-bit register, because it would have to reproduce the 32-bit wraparound on every step.)
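To make the array indexing point concrete, here is a rough sketch. The function names are made up for illustration, and what a given compiler actually emits depends on the target and optimization level. The first loop uses a signed int index, so the compiler may assume `i += 2` never overflows and keep the index in a 64-bit register or turn it into a pointer increment. The second uses an unsigned index, which C does define to wrap, so the 32-bit wraparound has to be preserved if the compiler can't prove it never happens.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum every second element using a signed 32-bit index.
 * Signed overflow is undefined, so the compiler may assume i += 2 never wraps
 * and widen i to 64 bits once instead of on every iteration. */
long sum_even_int(const long *a, int n)
{
    assert(n >= 0);
    long total = 0;
    for (int i = 0; i < n; i += 2) {
        total += a[i];
    }
    return total;
}

/* Same loop with an unsigned 32-bit index. Unsigned arithmetic is defined to
 * wrap modulo 2^32, so for n close to UINT_MAX the index could wrap back to a
 * small value; the compiler has to keep that behavior intact. */
long sum_even_unsigned(const long *a, unsigned n)
{
    long total = 0;
    for (unsigned i = 0; i < n; i += 2) {
        total += a[i];
    }
    return total;
}
```

Both functions return the same result for ordinary inputs; the difference only shows up in the generated code. One way to see it is to compare the assembly for the signed version with and without `-fwrapv` (a GCC and Clang flag that makes signed overflow wrap).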