There is a technical reason for this, and it's not a QT program problem, but rather a QT container problem. The QT container file does a horrendous job of synchronizing audio and video streams. QT playback software exploits this problem, though, by intentionally lagging the start of a video by a few frames so that the audio and video streams match up on the timeline. By not having to track the sync after the initial playback lag, the file plays more "reliably" and "quickly", but this also means that the decoder has no reference point from which it can scrub backwards. The "quick" in QuickTime really is a misnomer. This is lazy time, not quick time.
That's not true at all. The QuickTime container can represent each frame's timestamp exactly, and it has some advanced features like edit lists. Every frame has a start and stop time, and each sync sample is marked as such; it's even possible to mark which samples each sample depends on. In Matroska, for example, the stop time is so unreliable that you often have to analyse the frame content or wait for the next frame to know the duration of a sample, and who knows whether the sync flag is correct or not.
You are probably referring to the delay introduced by B-frames, but the mov container has an atom ('cslg') to store the max and min offsets and put everything back in sync.
Unfortunately, third-party mov demuxers don't support cslg or edit lists, so they only support the simplest mov files.
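To make the B-frame offsets mentioned above a bit more concrete, here is a rough Python sketch. It does not parse a real mov file; the function name, the four-frame I/P/B/B example and the one-tick frame duration are all made up for illustration. It only shows the arithmetic: each sample has a decode time and a display time, the difference is its composition offset, and the smallest and largest of those offsets are the kind of values a 'cslg'-style record would store.

```python
def composition_offsets(display_index_in_decode_order, frame_duration=1):
    """Return (offsets, least, greatest) for a hypothetical run of frames.

    display_index_in_decode_order[i] is the display position of the i-th
    frame in decode order. Times are in ticks of an assumed track timescale.
    """
    offsets = []
    for decode_index, display_index in enumerate(display_index_in_decode_order):
        decode_time = decode_index * frame_duration
        display_time = display_index * frame_duration
        # Composition offset: how far display time sits from decode time.
        offsets.append(display_time - decode_time)
    return offsets, min(offsets), max(offsets)


# Hypothetical decode order I0, P3, B1, B2 (displayed as I0, B1, B2, P3).
offsets, least, greatest = composition_offsets([0, 3, 1, 2])

# If a format only allows non-negative offsets, every display time has to be
# shifted by this many ticks, which is the start-up delay B-frames introduce.
shift_needed = -least if least < 0 else 0
```

For this made-up sequence the offsets come out as [0, 2, -1, -1], so the least offset is -1 and the greatest is 2. A demuxer that knows those two numbers can undo the one-tick shift; one that ignores them cannot.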
No, I am referring to the delay introduced by compressed audio streams within QT container files. The issue I refer to does not seem to occur with lossless audio. In these situations, the cslg atom, among others, allows the QT format to reliably copy edited stream data without rewriting the container file.
AAC, like MP3, introduces a padding of silence at the beginning of the stream. Because modern QT container files do not compensate for this, all audio and video streams within this type of QT file will be out of sync by default. QT playback software waits for the audio stream to begin (that is, waits for the silence padding to end) before video playback begins, even though the streams themselves line up 1:1 in the container file. This is lazy engineering, not an advanced feature.
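For a sense of scale, the padding can be turned into a time and a frame count with simple arithmetic. The sketch below is only an illustration: the function names are made up, and the 2112-sample figure is an assumption (it is the priming length commonly cited for Apple's AAC encoder; other encoders use different values), as are the 44.1 kHz sample rate and 30 fps frame rate.

```python
def priming_delay_seconds(priming_samples, sample_rate):
    """Length of the leading encoder padding, in seconds."""
    return priming_samples / sample_rate


def delay_in_video_frames(delay_seconds, frames_per_second):
    """The same delay expressed as a (fractional) number of video frames."""
    return delay_seconds * frames_per_second


# Assumed values, not read from any real file.
ASSUMED_PRIMING_SAMPLES = 2112
ASSUMED_SAMPLE_RATE = 44100
ASSUMED_FPS = 30

delay = priming_delay_seconds(ASSUMED_PRIMING_SAMPLES, ASSUMED_SAMPLE_RATE)
frames = delay_in_video_frames(delay, ASSUMED_FPS)
print(f"{delay * 1000:.1f} ms of padding, about {frames:.2f} video frames")
```

With these assumed numbers the padding is a little under 48 ms, or roughly one and a half frames at 30 fps, which is in the range of the "few frames" of lag being argued about here.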