> when the store responds to a block write, it's going to be stable.
That's the thing: you are communicating with your storage controller, and these are just promises from your controller, not guarantees. When you try to read the data back sometime in the future, there is no guarantee that you will succeed. A lot can happen between your controller reporting a successful write and you retrieving the data: software bugs and false promises, hardware problems, operational problems, disasters, etc. There is a limit to how high a probability of data retention a typical single server in a typical server room can achieve.
There are always implied assumptions, like assuming that there are no bugs in the kernel drivers themselves, and assuming that memory is not damaged.
Saying "we cannot achieve anything because there might be firmware bugs" is technically true, but completely counter-productive. Adding "A little bit better or worse is not that big of a deal" is just bad engineering -- can you imagine a doctor saying, "you might get hit by a car at any time, so I decided it is not worth it to heal you"?
We can achieve a lot on top of unreliable components, buggy kernels, and buggy firmware. We can build unbelievably reliable systems. But we can't if we assume that those unreliable components can be relied on. That is what bad engineering is.
In this case, that stability is a promise from a SAN, a commercial product that has a very good reliability track record. We're pretty confident that the data we write is going to be stable, unless there's a physical disaster like a fire . . . which is why we write to multiples of these, which are physically distributed, have staggered software update schedules, etc. etc.
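To make the "write to multiples" idea concrete, here is a rough sketch, not our actual setup. The function names (`write_replica`, `replicated_write`) and the `min_acks` threshold are made up for illustration, and plain file paths stand in for independent, physically separate stores. Each acknowledgment is still only a promise from that one store; the point is that several independent promises are worth more than one.

```python
import os


def write_replica(path: str, data: bytes) -> bool:
    """Write data to one replica and ask the OS to flush it.

    Returns True if the store acknowledged the write, False on any OS error.
    An acknowledgment is still only that store's promise, not a guarantee.
    """
    try:
        with open(path, "wb") as f:
            f.write(data)
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # ask the OS to push it to the device
        return True
    except OSError:
        return False


def replicated_write(paths, data: bytes, min_acks: int) -> bool:
    """Attempt the write on every replica; succeed if enough acknowledged.

    `paths` stands in for independent stores (hypothetical here);
    `min_acks` is how many acknowledgments we require before calling
    the write stable.
    """
    acks = sum(1 for p in paths if write_replica(p, data))
    return acks >= min_acks
```

With three replicas and `min_acks=2`, one store can fail outright and the write still counts as stable. How independent the stores really are (location, power, software version) matters far more than the code.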
You can still lose everything; you can't control all failure modes. But you can plan for and protect against common disasters.
The original discussion was about the lack of OS support for proper flushing and making data stable. I think the lack of decent support for this is a decades-long travesty; claims that the lack of this functionality doesn't matter because "there might be firmware bugs in the storage system, so why bother?" are specious and unhelpful.
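For reference, this is roughly the dance an application has to do today to ask for stability on a POSIX-style system: a minimal sketch, assuming a filesystem where `fsync` on a directory is permitted (it is not on Windows). The name `durable_write` and the `.tmp` suffix are my own choices for illustration. Even when every call succeeds, what you have is the OS and device reporting that the data is stable. On macOS, for example, plain `fsync` does not force the drive to flush its cache, which is exactly the kind of gap being complained about.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Replace `path` with `data`, requesting a flush at each step.

    Sketch only: assumes POSIX semantics and that the temp file and the
    target live in the same directory on the same filesystem.
    """
    tmp = path + ".tmp"  # hypothetical temp-file naming
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)  # ask the OS to flush file contents to the device
    finally:
        os.close(fd)

    os.replace(tmp, path)  # atomic rename over the old file

    # Flush the directory entry so the rename itself is recorded.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

That is a lot of ceremony for "please make this stable", and it is easy to get wrong by skipping the directory flush.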