A confusing thing that I think people mostly haven't addressed clearly:
For a hash (whether cryptographic or perceptual), there is a chance of random collisions and also a difficulty factor for adversarially-created intentional collisions. The random collision probability has to be estimated based on some model of the input and output space (for cryptographic hash functions, you would usually model them as pseudorandom functions and take the collision probability from the birthday-paradox calculation).
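To make that concrete, here's the standard birthday-bound approximation, treating the hash as a pseudorandom function over a b-bit output space (the 64-bit size and input count below are just placeholders for illustration):

```python
import math

def birthday_collision_probability(n_inputs: int, output_bits: int) -> float:
    """Approximate probability of at least one collision among n_inputs
    values drawn uniformly from a 2**output_bits space:
    p ~ 1 - exp(-n(n-1) / 2^(b+1))."""
    space = 2.0 ** output_bits
    return 1.0 - math.exp(-n_inputs * (n_inputs - 1) / (2.0 * space))

# Example: a billion images hashed to 64 bits.
print(birthday_collision_probability(10**9, 64))  # ~0.027
```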
Intentional collisions depend on insight about the structure of the hash function, and there are also different difficulty levels depending on the nature of the attack (preimage resistance, second-preimage resistance, and collision resistance). Gaining more insight about the structure of the hash function can reduce the work factor required to mount these attacks. That should be true for perceptual hashes just as much as cryptographic hashes, but presumably all of the intentional attacks start off easier against perceptual hashes, because their threat models are weaker and there's much less mathematical research on how to achieve those resistance properties for them.
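For reference, the generic (structure-free) attack costs against an ideal b-bit hash are roughly 2^b for preimage and second-preimage attacks and 2^(b/2) for collisions; a quick sketch:

```python
def brute_force_costs(output_bits: int) -> dict:
    """Textbook work factors against an ideal hash; cryptanalytic insight
    into the function's structure can only push these down."""
    return {
        "preimage": 2 ** output_bits,          # find x with H(x) == target hash
        "second_preimage": 2 ** output_bits,   # find x' != x with H(x') == H(x)
        "collision": 2 ** (output_bits // 2),  # any colliding pair, via birthday search
    }

print(brute_force_costs(128))  # md5's output size: 2^128 vs 2^64
```

md5 itself is the cautionary tale here: structural cryptanalysis pushed its collision cost from the generic 2^64 down to something a laptop can do in seconds.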
And in AI systems involving classifiers, it has generally been easy for people to create adversarial examples given access to the model. Perceptual hashes for estimating similarity to specific known images aren't exactly the same thing, because it's less like "how much like a cat is this image?" and more like "how much like NCMEC corpus image 77 is this image?", but maybe some of the same techniques would still work. In the cryptographic hash analogy, I guess that would be like trying to break preimage resistance (or second-preimage resistance, if the attacker starts from the original image).
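As a toy illustration of that style of attack, here's black-box hill climbing against a simple 64-bit average hash (nothing like NeuralHash's architecture; it just shows why "nudge the input until the hash matches the target" is plausible when small input changes move the hash gradually):

```python
import numpy as np

def average_hash(img: np.ndarray) -> np.ndarray:
    """Toy perceptual hash: threshold an 8x8 grayscale image at its own
    mean, giving 64 bits. Real schemes (pHash, NeuralHash) are fancier."""
    return (img > img.mean()).flatten()

def hill_climb_to_target(start, target_hash, steps=20000, rng=None):
    """Randomly perturb single pixels, keeping any change that doesn't
    increase the Hamming distance to target_hash."""
    rng = rng or np.random.default_rng(0)
    img = start.astype(float).copy()
    best = np.count_nonzero(average_hash(img) != target_hash)
    for _ in range(steps):
        y, x = rng.integers(0, 8, size=2)
        old = img[y, x]
        img[y, x] = np.clip(old + rng.normal(0, 16), 0, 255)
        dist = np.count_nonzero(average_hash(img) != target_hash)
        if dist <= best:
            best = dist       # keep the perturbation
        else:
            img[y, x] = old   # revert
        if best == 0:
            break
    return img

rng = np.random.default_rng(1)
target = average_hash(rng.uniform(0, 255, (8, 8)))  # stands in for "corpus image 77"
forged = hill_climb_to_target(rng.uniform(0, 255, (8, 8)), target, rng=rng)
print(np.count_nonzero(average_hash(forged) != target))  # typically reaches 0
```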
Thanks for the detailed explanation. I understand why that works for perceptual hashes if you make them really precise; however, I doubt it would work with md5, which is why I asked.
The discussion I thought we were having was about false positives, not adversarially induced false positives. For the former, a random collision between any given pair of inputs has a probability of 1/(2^64).
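At that rate the expected number of accidental matches is just (user images) × (database entries) / 2^64; plugging in made-up but generous numbers:

```python
# Illustrative placeholders, not Apple's real figures.
user_images = 100_000
db_entries = 1_000_000
print(user_images * db_entries / 2**64)  # ~5.4e-09, i.e. essentially never at random
```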
To mitigate adversarial false positives, one idea is to combine a cryptographically strong hash with a randomly selected perturbation of the file. Prior to hashing, perturb the file and submit both the hash and the selected perturbation to Apple. Apple selects the DB based on the perturbation and proceeds with matching and thresholding.
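A minimal sketch of that protocol, assuming a small fixed menu of perturbations so the server can precompute one database per perturbation (the byte-level XOR perturbation and the exact-match lookup are stand-ins; thresholding only makes sense for a perceptual matcher and is omitted here):

```python
import hashlib
import random

NUM_PERTURBATIONS = 16  # small fixed menu, one precomputed DB per entry

def perturb(file_bytes: bytes, idx: int) -> bytes:
    # Toy perturbation: XOR with a keystream derived from the index.
    stream = hashlib.shake_256(idx.to_bytes(4, "big")).digest(len(file_bytes))
    return bytes(a ^ b for a, b in zip(file_bytes, stream))

def client_submit(file_bytes: bytes) -> tuple[int, bytes]:
    """Client picks a perturbation at random, applies it, hashes with a
    strong hash, and sends both the index and the digest."""
    idx = random.randrange(NUM_PERTURBATIONS)
    return idx, hashlib.sha256(perturb(file_bytes, idx)).digest()

def server_match(idx: int, digest: bytes, dbs: list) -> bool:
    """Server routes the lookup to the DB built for that perturbation."""
    return digest in dbs[idx]

# Server side: precompute per-perturbation databases from the known corpus.
corpus = [b"known image 1", b"known image 2"]
dbs = [{hashlib.sha256(perturb(f, i)).digest() for f in corpus}
       for i in range(NUM_PERTURBATIONS)]

idx, digest = client_submit(b"known image 2")
print(server_match(idx, digest, dbs))  # True
```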
I thought that the random collision probability for md5 was way higher than that. If it's that low, you're right, this would work. I'm not sure I understand the part about the perturbation.