>They didn't think to weight the prior probabilities by usage frequency

I don't know if that is the right metric for this sort of tool. I'd guess the use case is finding infrequently encountered characters. It should probably try to detect your current locale and then, say, eliminate all ASCII characters if you appear to be an English speaker, since you already know how to type a question mark.
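Roughly what I have in mind, as a sketch only. `filter_candidates` and its `lang` parameter are names I made up for illustration, not anything from the tool, and "locale starts with `en`" is a crude stand-in for "probably speaks English":

```python
import locale


def filter_candidates(candidates, lang=None):
    """Drop characters the user presumably already knows how to type.

    `candidates` is a list of single-character strings. `lang` is a
    locale name such as "en_US"; if omitted, it is guessed from the
    current process locale (an assumption, it may not reflect the user).
    """
    if lang is None:
        # getlocale() returns (language_code, encoding); either can be None.
        lang = locale.getlocale()[0] or ""
    if lang.lower().startswith("en"):
        # Assumed English speaker: plain ASCII is already on the keyboard.
        return [c for c in candidates if not c.isascii()]
    # No rule for other locales in this sketch, so keep everything.
    return list(candidates)
```

So `filter_candidates(["?", "‽", "¿"], lang="en_US")` would keep only the interrobang and the inverted question mark. Other locales would need their own "already on the keyboard" sets, which is the hard part.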