October 14, 2004

Keep 'em Guessing

As science figures out how to let machines see, we get to look over the scientists' shoulders and glean something interesting now and then. I've just found this article about image recognition, via Wretchard's Belmont Club. Check it out:

Our eyes provide us with an abundance of information about the outside world. Thanks to vision we become aware of the objects and living beings that surround us and represent their form and properties in our brains. Computer vision researchers aim at reproducing this capability in machines.

Vision is difficult. The images of a human head and a melon are very similar if taken with the same illumination, whereas two images of the same head taken under different lighting conditions are extremely different. Yet, we have no problem in telling which is which. The image of a tree is composed of an intricate pattern of lights and darks, greens, yellows, and browns and yet we are able to perceive it as a single object and simultaneously to perceive the leaves and branches that compose it. It is obvious from these examples that the metric in the world of images, i.e., a naive distance measure in the extremely high-dimensional space of image intensities, is not very informative for extracting concepts from images. Different objects may produce the same image and, vice versa, the same object may give rise to very different images depending on viewpoint and lighting conditions.
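The point about a naive distance measure can be made concrete with a toy sketch. This is an illustration under stated assumptions and is not taken from the article: the "head" is a plain disk, the "melon" is a slightly squashed ellipse, lighting is a simple left-to-right brightness ramp, and the names `render` and `pixel_distance` are made up for the example.

```python
import math

SIZE = 32  # side length of the square toy images (an arbitrary choice)


def render(a, b, light_from_left):
    """Render a filled ellipse with semi-axes a (x) and b (y) as a flat list
    of intensities. Brightness ramps linearly across the image, standing in
    for a light source on one side. Background pixels are 0."""
    c = (SIZE - 1) / 2.0  # image centre
    image = []
    for y in range(SIZE):
        for x in range(SIZE):
            inside = ((x - c) / a) ** 2 + ((y - c) / b) ** 2 <= 1.0
            t = x / (SIZE - 1)  # 0 at the left edge, 1 at the right edge
            if light_from_left:
                t = 1.0 - t
            image.append(0.1 + 0.9 * t if inside else 0.0)
    return image


def pixel_distance(img1, img2):
    """Naive Euclidean distance between two images in intensity space."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(img1, img2)))


head_left = render(12.0, 12.0, light_from_left=True)    # "head", lit from the left
head_right = render(12.0, 12.0, light_from_left=False)  # same "head", lit from the right
melon_left = render(12.5, 11.5, light_from_left=True)   # "melon", lit from the left

same_object = pixel_distance(head_left, head_right)
different_objects = pixel_distance(head_left, melon_left)

print("same object, different lighting:", round(same_object, 2))
print("different objects, same lighting:", round(different_objects, 2))
```

In this toy setup the two pictures of the same "head" end up farther apart, pixel by pixel, than the "head" and the "melon" under matching light, which is the trap the article describes.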

Knowledge of the world is vital in resolving these difficulties and taking advantage of whatever little information an image can provide. Many visual illusions demonstrate that the visual system is built to take educated guesses on the nature of stimuli.

