Artificial neural networks have been all the rage lately. A neural net called DeepDream, built by Google and later released publicly, became an internet sensation for its trippy, dreamlike images. Google and Microsoft boast neural nets that, just this year, have exceeded humans at certain image recognition tasks.
The machines might know a fox from a cat, but distinguishing deep space from a roundworm? Apparently, not so much. Ville-Matias Heikkilä recently posted an entertaining YouTube video of what a neural net makes of Star Trek: The Next Generation’s opening sequence.
A planet’s molten surface is “chocolate sauce,” its rings a “hairslide,” and the deep blackness of interstellar space “velvet.” Not bad, perhaps, allowing for a little poetic license. But the Starship Enterprise thoroughly flusters the algorithm. The software’s best guesses rapidly flick through: digital clock, stove, CD player, odometer (it likes this one), computer mouse, aircraft carrier, car mirror, waffle iron, jellyfish…
Why does the program have such a hard time with Star Trek? While it excels at its task in the narrow sense, it’s much less flexible generally.
Image classification is a matter of experience. Just as humans gather sensory information, seeing and naming the things around us, artificial neural networks feed on big data. Many image classification algorithms are trained on the huge set of labeled images used in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Software can be better than human at classifying images like those in that set, but objects from outside its training categories can easily confuse it.
Google noted, for example, that DeepDream consistently drew a “barbell” with a hand and arm attached. This, they reasoned, was because it had never seen a barbell that wasn’t being held.
“There isn’t a lot of space stuff in ILSVRC12,” Heikkilä explains. “So pretrained Googlenet has some serious trouble classifying stars, planets and the Enterprise.”
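One way to picture the problem: a classifier of this kind typically scores a fixed list of labels and reports whichever scores highest, so it has no way to answer “starship.” The sketch below is a toy illustration in Python, not Heikkilä’s code or GoogLeNet itself. The four-item label list and the scores are made up for the example, standing in for the 1,000 ILSVRC categories and a real network’s output.

```python
import math

# Hypothetical label set: a tiny stand-in for the 1,000 ILSVRC categories.
LABELS = ["digital clock", "stove", "waffle iron", "jellyfish"]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores):
    """Return the best-scoring known label and its probability."""
    probs = softmax(scores)
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best], probs[best]

# Made-up scores for a frame showing something outside the label set.
# No label fits well, yet one of them still has to win.
starship_scores = [1.2, 1.0, 1.3, 0.9]
label, prob = classify(starship_scores)
print(label, round(prob, 2))
```

In this toy case the winning label has only a modest share of the probability, and the runners-up are close behind. Small changes in the scores from frame to frame would be enough to flip the winner, which is consistent with the rapid flicking between guesses in the video.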
Beyond biology, there is, of course, a significant difference between humans and software in this case. We’re continuously walking around soaking up new shapes and images, and when we don’t know what something is, we ask a friend and file it away for future reference.
An artificial neural network neither experiences the real world nor has anyone to ask, and it lacks even the awareness that it doesn’t know and should ask. At the moment, such software is entirely reliant on databases of manually annotated images to learn.
But those databases will grow, and already machine learning software is able to tap YouTube videos. Perhaps future algorithms will be able to independently query the biggest database of them all—the internet—gather information from the real world by way of cameras and other sensors, and even share their knowledge with each other.
For now, Lieutenant Commander Data and his positronic neural net remain firmly rooted in the 24th century. That said, it shouldn’t take much more than a good weekend binge watching Star Trek TNG for today’s programs to learn the difference between the Enterprise and a waffle iron.
Image Credit: Shutterstock.com