I understand why this was a problem in 1995, but honestly, in 2019, with image recognition technology as advanced as it is now – especially due to efforts by Google – why can't browsers detect this? Surely "does this rectangle look vaguely like a URL bar" is an easier problem to solve than "is this a photograph of a cat"?
Sure, image recognition is CPU intensive, but even just checking once every 5 seconds or so would be enough to prevent this sort of attack and pop up a big "you are being phished" warning. And 99.99% of what occupies that UI real estate looks sufficiently unlike a search bar that a low-cost recognizer should be able to rule out phishing for normal sites fairly quickly.
What am I missing? Has this approach been tried and rejected? Is image recognition of fairly static, flat, 2D, geometric shapes actually far more CPU-intensive than I imagine?
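To make the "low-cost recognizer" idea concrete, here is a rough sketch of the two-stage check I have in mind: a cheap filter that rejects most frames, and an expensive classifier that only runs on the rest. Everything here is an assumption for illustration. The `Frame` format, `looks_like_bar`, `check_frame`, and the `expensive_classifier` callback are hypothetical names, not any browser's API, and "nearly uniform top strip" is a toy stand-in for a real URL-bar detector.

```python
from typing import Callable, List

# Hypothetical capture format: grayscale pixels (0-255), row-major.
Frame = List[List[int]]


def looks_like_bar(frame: Frame, strip_rows: int = 4, tolerance: int = 8,
                   min_fraction: float = 0.9) -> bool:
    """Cheap pre-filter: is the top strip a nearly uniform horizontal band?

    This is a toy heuristic, not a real URL-bar detector. It only asks
    whether most pixels in the top rows sit close to the median brightness.
    """
    strip = frame[:strip_rows]
    if not strip or not strip[0]:
        return False
    pixels = [p for row in strip for p in row]
    median = sorted(pixels)[len(pixels) // 2]
    close = sum(1 for p in pixels if abs(p - median) <= tolerance)
    return close / len(pixels) >= min_fraction


def check_frame(frame: Frame,
                expensive_classifier: Callable[[Frame], bool]) -> bool:
    """Return True if a phishing warning should be shown for this frame."""
    if not looks_like_bar(frame):
        return False  # frames that fail the cheap test never reach the classifier
    return expensive_classifier(frame)
```

A caller would run `check_frame` on a timer (every 5 seconds, say) against a capture of the top of the page. Whether the cheap stage really rejects 99.99% of pages is exactly the part I can't vouch for, since plenty of legitimate sites have a plain header bar.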
MobileSafari has an interesting feature that your idea reminded me of: it tries to detect when a site using the Fullscreen API presents an iOS keyboard-lookalike through the location and frequency of your taps on that side of the screen. I’ve gotten the warning when doing something else and was impressed they thought of it.
While this is true, that usually refers to algorithmically chosen adversarial inputs. It's a lot harder to trick both the browser's image recognition and the human operator's eyes with the same UI.
This is actually one of the core goals of adversarial machine learning: crafting inputs that trick a machine but look indistinguishable to a human [1].