Brain-Inspired Image Recognition Software From Cortexica Allows Computers to See (Video)
“Our ability to reverse engineer the brain–to see inside, model it, and simulate its regions–is growing exponentially. We will ultimately understand the principles of operation underlying the full range of our own thinking, knowledge that will provide us with powerful procedures for developing the software of intelligent machines.”
-Ray Kurzweil, The Singularity Is Near
A pair of scientists-turned-entrepreneurs are taking significant steps towards fulfilling the esteemed singularitarian’s forecast of intelligent machines that think like we do. With their company, Cortexica, British Drs. Anil Bharath and Jeffrey Ng have created technology that allows computers to see. It’s already being used by consumers to quickly retrieve information about what they buy, and will soon help companies advertise more effectively. In developing the technology, they drew inspiration from a computer that already sees very well: the human brain.
Bharath and Ng were studying the biology of vision at Imperial College London. In 2006 they incorporated their biological findings into a computer model that successfully perceived the boundaries of objects in a picture. Picking out the outline of an object might not seem all that impressive at first, but that’s because our brains are so good at interpreting the very complex collection of lines, shapes, colors, light variations, movement, and so on that falls two-dimensionally on our retinas. A computer first has to decide whether two adjacent lines in its two-dimensional image belong to the same object. If only one does, which one? It’s largely a matter of thresholding: what’s background and what’s object? Computationally, it’s far from simple.
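To get a feel for why even this first step is hard, here’s a minimal thresholding sketch in Python/NumPy. This is my own toy illustration, not Cortexica’s algorithm: pixels above a brightness threshold are labeled “object,” and boundary pixels are the object pixels touching the background.

```python
import numpy as np

def boundary_mask(img, thresh):
    """Crude figure/ground separation by thresholding.

    Pixels brighter than `thresh` are labeled object; a boundary pixel
    is an object pixel with at least one background 4-neighbor.
    """
    obj = img > thresh
    padded = np.pad(obj, 1, constant_values=False)
    up    = padded[:-2, 1:-1]
    down  = padded[2:,  1:-1]
    left  = padded[1:-1, :-2]
    right = padded[1:-1, 2:]
    # A pixel is interior only if all four neighbors are object too.
    interior = up & down & left & right
    return obj & ~interior

# A bright 3x3 square on a dark background: the 8 rim pixels are
# boundary, the center pixel is interior.
img = np.zeros((5, 5))
img[1:4, 1:4] = 1.0
edges = boundary_mask(img, 0.5)
```

Real scenes don’t come with one clean threshold, of course, which is exactly where this naive approach falls apart and biologically inspired processing earns its keep.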
In the brain, visual information is transmitted from the retina to the visual cortex via multiple, parallel pathways. Along the way there are several junctions where information in one pathway interacts with information in an adjacent pathway through shorter, interconnecting neurons. It’s the sequential interactions at these junctions that refine the visual information so that, by the time it reaches the visual cortex, lines, shapes, and colors combine and we perceive “chair.”
As far as our brains are concerned, most of what we see is uninteresting. As we stand at the shoreline before a vast, uniformly-blue ocean, our eyes will scan back and forth as we contemplate nature and wish we didn’t have to go back to work on Monday. But if a ship happens to appear on the horizon, our attention will be drawn to it immediately. To the visual cortex, uniformity is boring; the breaks in pattern are where the action is. The beauty of Bharath and Ng’s software is that it uses the same tactic, giving more importance to pattern disruption. Rather than weighting all information equally–a computationally cumbersome strategy for computers and brains alike–it identifies “key-points” that carry more information than the uniform parts of the visual field. And the more stable those key-points are–that is, the greater the number of parallel inputs that detect the feature–the more heavily they’re weighted. You can turn a table, bring it closer, change the lighting in the room, move your Justin Bieber cardboard cutout behind it, and the computer will still know where the corners are.
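The “more detectors agree, heavier weight” idea can be sketched with just two parallel channels. This is a hypothetical toy in NumPy, not Cortexica’s actual pipeline: one channel responds to horizontal intensity changes, one to vertical, and a point’s weight is the number of channels that fire there. Edges fire in one channel; corners, the most stable key-points, fire in both.

```python
import numpy as np

def keypoint_weights(img, thresh=0.5):
    """Weight each pixel by how many 'parallel channels' detect it.

    Two toy channels: horizontal and vertical intensity differences.
    A corner changes in both directions, so it outscores a plain edge.
    """
    h = np.abs(np.diff(img, axis=1, prepend=img[:, :1])) > thresh
    v = np.abs(np.diff(img, axis=0, prepend=img[:1, :])) > thresh
    return h.astype(int) + v.astype(int)

# Same bright square: its top-left corner scores 2 (both channels fire),
# the middle of an edge scores 1, and the flat interior scores 0.
img = np.zeros((5, 5))
img[1:4, 1:4] = 1.0
w = keypoint_weights(img)
```

A real system would use many more channels (orientations, scales, colors), but the principle is the same: features that many independent detectors agree on are the stable ones worth tracking.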
Following their breakthrough modeling, Bharath and Ng founded Cortexica in 2008 to develop their brain-like software for commercial use. It identifies objects in both 2D and 3D. Ultimately they see a future for their image recognition software–called Vision Search–in which “a user no longer has to describe something in textual terms, but simply has to take a picture of what they want to find.”
In fact, you can download the app to your iPhone–but only if you’re 21 years of age or older.
Cortexica has teamed up with online wine distributor Tesco to create a “pocket sommelier.” With the WINEfindr app you simply take a picture of a bottle’s label and up pops everything you want to know: pricing, reviews, appropriate food pairings, and a list of other members of the same wine family. And so you can sound as though you know what you’re talking about, WINEfindr lists tasting notes, body, style, closure–whatever that is–and other details you didn’t know you were missing. And, of course, if you like the wine you can order as many bottles as you want with the touch of a virtual button. Pretty cool, actually. Makes me wish I had an iPhone. But if you need more convincing before forking over the app’s $4.99, check out this video. (Caveat: Judging from the very few comments I’ve seen from customers who’ve tried WINEfindr, it doesn’t seem to work nearly as well as advertised.)
WINEfindr is not new–it was released about a year ago. But in addition to better drinking through technology, Cortexica has now used its Vision Search technology to create BrandTrak. Working from broadcast television, online videos, or movies, BrandTrak uses several criteria to estimate a given brand’s exposure. These criteria–the size of a brand’s logo on the screen, its position, and how much it’s occluded, for example–are used to compute what Cortexica calls C-impact (pronounced “see impact”). C-impact is for companies wondering how effectively they’re advertising–or how effectively their competitors are–across all media.
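Cortexica hasn’t published how C-impact is actually computed, but the criteria listed above suggest something like a per-frame score accumulated over the footage. Purely as a hypothetical sketch–the field names and the multiplicative weighting here are my invention, not Cortexica’s formula:

```python
def c_impact_toy(frames):
    """Toy brand-exposure score: NOT Cortexica's formula.

    Each frame reports the logo's on-screen size (fraction of the
    screen), its distance from center (0 = dead center, 1 = corner),
    and the fraction of it that is occluded. A big, centered, fully
    visible logo contributes the most.
    """
    score = 0.0
    for f in frames:
        score += f["size"] * (1.0 - f["center_dist"]) * (1.0 - f["occluded"])
    return score

# Two frames: a clear centered logo, then a half-occluded off-center one.
frames = [
    {"size": 0.2, "center_dist": 0.0, "occluded": 0.0},
    {"size": 0.2, "center_dist": 0.5, "occluded": 0.5},
]
score = c_impact_toy(frames)  # roughly 0.2 + 0.2*0.5*0.5
```

The hard part, of course, isn’t this arithmetic–it’s the computer vision needed to find the logo, measure its size, and judge its occlusion in every frame in the first place.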
It’s exciting to see Cortexica putting Vision Search to work, even though, at least with WINEfindr, there may still be some wrinkles to be ironed out. And it’s not clear to me that C-impact will make much of an impact with companies. The consumer in me says there’s a lot more to advertising than logo size, like whether or not the guy drinking the beer has been “the life of parties he’s never even attended.” Love those commercials, but not for their logos. Regardless, I think it’s great that, like Numenta, another reverse-engineered, brain-inspired software is going live. Let’s see what it gets right and what it needs to work on. No doubt robotics developers will be watching closely. Will Vision Search perform better than the image recognition and tracking software that Aldebaran’s Nao robot recently got?
The visual cortex is just one area where the “software of the brain” is being tapped to create computer software. The brain’s auditory system is serving as a blueprint for automated speech recognition, and a part of the brain involved in movement–the cerebellum–has inspired the adaptive control mechanism for a robotic eye. Back on the visual systems front, we now know enough about how the brain processes visual information to install retinal implants in patients. Moreover, sensor arrays have been built that mimic higher-order processing in the visual cortex.
As a neuroscientist, I may be partial. But I consider the development of reverse-engineered, brain-based technologies to be about as exciting as it gets in the fields of robotics and AI. The human brain is a wondrous thing. And watching the different technologies that emulate the brain develop and come together to form something that ever more closely resembles it will be breathtaking.