We talk to people who totally do that. They're the perfect proto-users who get why this is useful.
The Vhoto app just makes that easier, and helps you jump to interesting / distinct / clear frames. You can do it entirely by hand by scrubbing to the right frame, but the point of the app (and the computer vision algos behind it) is to make it easy and do most of the work for you.
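For anyone curious what "finding a clear frame" can look like in code, here is a minimal sketch of one common heuristic: score each frame by the variance of its Laplacian and keep the highest-scoring one. This is not Vhoto's actual method, which isn't described here; the function names `sharpness` and `sharpest_frame` are made up for illustration, and frames are assumed to be 2-D grayscale NumPy arrays.

```python
import numpy as np

def sharpness(gray):
    """Variance of a 4-neighbour Laplacian; higher usually means sharper."""
    g = np.asarray(gray, dtype=np.float64)
    # Discrete Laplacian on the interior pixels (borders are skipped).
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def sharpest_frame(frames):
    """Return the index of the frame with the highest sharpness score."""
    scores = [sharpness(f) for f in frames]
    return int(np.argmax(scores))
```

Blurred frames lose high-frequency detail, so their Laplacian response flattens out and the score drops. A real picker would presumably combine this with other signals (faces, composition, distinctness from neighbouring frames).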
I have no idea if this app does this (I don't think it does; it seems to just detect motion-blurred frames and rule those out as candidates), but you can actually pull a lot more detail/sharpness out by running multi-frame super-resolution algorithms over a series of frames (even if there is movement from frame to frame) rather than just decoding a single frame from the video.
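To make the multi-frame idea concrete, here is a toy "shift-and-add" sketch, the simplest form of multi-frame super-resolution. It assumes the sub-pixel offset of each low-res frame is already known and lands exactly on the high-res grid; in real video you would have to estimate that motion (registration) first, which is the hard part. The name `shift_and_add` and its parameters are hypothetical, not from any library.

```python
import numpy as np

def shift_and_add(frames, offsets, scale):
    """Fuse low-res frames onto a grid `scale` times finer.

    frames  : list of 2-D arrays, all the same shape
    offsets : list of (dy, dx) integer positions on the high-res grid,
              each in range(scale); assumed known here, not estimated
    Returns the fused image and a mask of high-res pixels that got data.
    """
    h, w = frames[0].shape
    acc = np.zeros((h * scale, w * scale))
    cnt = np.zeros_like(acc)
    for frame, (dy, dx) in zip(frames, offsets):
        # Drop each low-res sample at its position on the fine grid.
        acc[dy::scale, dx::scale] += frame
        cnt[dy::scale, dx::scale] += 1
    filled = cnt > 0
    out = np.zeros_like(acc)
    # Average where several frames hit the same high-res pixel.
    out[filled] = acc[filled] / cnt[filled]
    return out, filled
```

Because each frame samples the scene at a slightly different position, the frames together cover high-res pixels that no single frame contains. Pixels left unfilled would need interpolation, and practical methods add deblurring and robustness to registration errors on top of this.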