Continuing with metrics…
Everybody can easily recognize a tune, a famous painting or the face of a friend. Not so for a computer, which must first digitize the information and then operate on this dataset to find similarities. Back at the beginning of the computer era, people thought it would be easy to recognize characters or voice, but technology had a hard time with these problems: OCR is still perfectible, and voice recognition is still not widespread (things are getting better, but Siri is still fallible). This is of particular importance nowadays, since our societies tend to monitor the population more and more, in the name of terrorism prevention; since they cannot employ as many operators as there are citizens, there is a strong need for automation and recognition.

The general problem of recognition is to find patterns that reflect a feature. While wondering how websites such as TinEye manage to find similar images, I discovered the notion of "perceptual hash", which is essentially recognition through the simplification of data. I subsequently discovered that there is an open source library called pHash offering this kind of feature (not to mention computer vision frameworks such as OpenCV, or the incredible Accord.NET developed by Cesar Souza).

Finding similar pictures faces several hurdles. First, the pictures might not all have the same size, and they can be cropped, (slightly) modified or rotated.

It seems that there are two solutions. The first consists in a brutal downsizing of the image, followed by a Fourier transform: the Fourier coefficients capture a fingerprint of the image (this is somewhat similar to how Shazam works), which can then be compared against a database.
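Here is a minimal sketch of this downsize-and-transform idea, assuming Pillow, NumPy and SciPy are installed. It follows the DCT variant popularized by pHash (the DCT is a real-valued cousin of the Fourier transform); the helper names `phash_bits` and `hamming` are mine, not the pHash library's actual API.

```python
import numpy as np
from PIL import Image
from scipy.fftpack import dct

def phash_bits(path, hash_size=8, highfreq_factor=4):
    """Downsize the image, take a 2D DCT, and keep the sign pattern
    of the low-frequency coefficients as a 64-bit fingerprint."""
    size = hash_size * highfreq_factor
    # Brutal downsizing to a small grayscale square erases scale and detail.
    img = Image.open(path).convert("L").resize((size, size), Image.LANCZOS)
    pixels = np.asarray(img, dtype=np.float64)
    # 2D DCT: a 1D DCT along rows, then along columns.
    coeffs = dct(dct(pixels, axis=0, norm="ortho"), axis=1, norm="ortho")
    # Keep only the top-left block, i.e. the lowest frequencies.
    low = coeffs[:hash_size, :hash_size]
    # Threshold against the median: each bit says "above or below typical".
    return (low > np.median(low)).flatten()

def hamming(bits1, bits2):
    """Number of differing bits; a small distance means similar images."""
    return int(np.count_nonzero(bits1 != bits2))
```

With 64-bit fingerprints like these, a Hamming distance below roughly 10 is a common rule of thumb for declaring two images near-duplicates.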
Another option is to consider the histogram instead of the picture itself, and to compare histograms using a least-squares or Bhattacharyya distance. I guess this is the one used by TinEye.
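Below is a minimal sketch of this histogram approach, again assuming Pillow and NumPy; `gray_histogram`, `bhattacharyya` and `least_squares` are hypothetical helper names, not an existing API.

```python
import numpy as np
from PIL import Image

def gray_histogram(path, bins=64):
    """Normalized grayscale histogram: a size-invariant summary of the image."""
    img = Image.open(path).convert("L")
    hist, _ = np.histogram(np.asarray(img), bins=bins, range=(0, 256))
    return hist / hist.sum()

def bhattacharyya(p, q):
    """Bhattacharyya distance between two normalized histograms:
    0 for identical distributions, larger for more dissimilar ones."""
    bc = np.sum(np.sqrt(p * q))             # Bhattacharyya coefficient in [0, 1]
    return float(-np.log(max(bc, 1e-12)))   # clip to avoid log(0)

def least_squares(p, q):
    """Sum-of-squared-differences alternative mentioned above."""
    return float(np.sum((p - q) ** 2))
```

The Bhattacharyya distance measures the overlap between two distributions: it is 0 for identical histograms and grows as they diverge.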
These methods are rather robust to slight changes in the image, since they discard scale information and erase the exact location of pixels.
Note that Google image search relies on contextual information, rather than on the actual images.
I guess that understanding why a feature is the way it is is more subtle than a brute-force search for patterns.
Emotion vs. Reason. Again.