Turing Test Prize Has Two Winners


This odd-looking trophy was awarded to two teams in the BotPrize competition. The winners were able to create first-person shooter bots that appeared human.

The day we can’t tell the difference between a human and a robot just got a little bit closer. A Turing Test of sorts has been put to human judges to see if they could differentiate between their fellow human and non-human combatants in a first-person shooter game. For the first time in the five years that the contest has run, the judges couldn’t tell the difference.

The contest, conceived of and organized by Philip Hingston, Associate Professor of Computer Science at Edith Cowan University in Perth, Australia, puts human and computer players on the battlefields of the first-person shooter game UT2004. After a few rounds of combat, the humans have to decide which players are human and which are bots. This year, 100 years after Turing’s birth, not one but two bots achieved humanness: the UT^2 team from the University of Texas at Austin and Mihai Polceanu, an Artificial Intelligence doctoral student from Romania studying in France. The two split the $7,000 prize money.

Calculating the humanness percentage is pretty straightforward: the number of times a player (human or bot) is judged to be human is divided by the total number of times he, she, or it was judged. Each player ended up getting judged about 25 times, a pretty decent sample size for such an experiment. And the experimenters even had a kind of control: the “epic bots” that are already part of the game and, unlike the competing bots, were not programmed to trick human judges.
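The humanness percentage described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the contest's actual scoring code: the function name `humanness_scores`, the `(player, judged_human)` input format, and the sample data are all assumptions made up for this example.

```python
from collections import defaultdict


def humanness_scores(judgments):
    """Return {player: percent of judgments in which the player was rated human}.

    `judgments` is an iterable of (player, judged_human) pairs, where
    judged_human is True when a judge labeled that player as human.
    This input format is a hypothetical one chosen for the sketch.
    """
    human_votes = defaultdict(int)
    total_votes = defaultdict(int)
    for player, judged_human in judgments:
        total_votes[player] += 1
        if judged_human:
            human_votes[player] += 1
    # Times judged human divided by total times judged, as a percentage.
    return {
        player: 100.0 * human_votes[player] / total_votes[player]
        for player in total_votes
    }


# Made-up sample: a player judged 25 times and rated human in 13 of them.
sample = [("bot_a", True)] * 13 + [("bot_a", False)] * 12
print(humanness_scores(sample))  # {'bot_a': 52.0}
```

With only about 25 judgments per player, a single changed vote moves the score by four percentage points, which is worth keeping in mind when comparing scores that are close together.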

In a rather ironic twist, the two winning bots ended up with humanness scores that were higher than those of their human competitors! MirrorBot, Polceanu’s creation, scored 52.2 percent humanness while the University of Texas’ UT^2 scored 51.9 percent humanness. The four humans who took part apparently played in a rather bot-like way, scoring an average of 41.4 percent humanness. For comparison, the bots received a humanness rating of 37 percent in last year’s competition.

The dual victory marks the end of a five-year endeavor that included fourteen teams from nine different countries. If you’re not familiar with first-person shooters, it might not be clear to you why it’s difficult to create a bot that plays like a human. During simulated battle, players will team up against another team. Being a good team member involves sticking with your buddies, covering for them and allowing them to cover for you, being accurate with your shot and avoiding getting shot. One way a bot can give itself away is if it’s actually too good at the game, having deadly aim and the good fortune to always find the perfect hiding place. Bots should also learn, as humans (typically) do. A player that makes the same mistake repeatedly is probably a bot. And unless you’re a Vulcan, irrational behavior is a major possibility.

MirrorBot, one of the contest winners, bravely faces an armed foe.

The UT^2 team used this latter distinction to their advantage. Part of their bot’s ‘personality’ was a weakness for holding a grudge. If it got shot up by a particular player, it would pursue that opponent even while its own gameplay suffered. University of Texas doctoral student Jacob Schrum said of their strategy, “People tend to tenaciously pursue specific opponents without regard for optimality. We can mimic that behavior.”

To help their bot look more human-like, the team also modeled it directly after human behaviors, like sticking with your team, but it also ran an artificial neural network that evolved its gameplay in the heat of battle. As Schrum explained to BotPrize, “A great deal of the challenge is in defining what ‘human-like’ is, and then setting constraints upon the neural networks so that they evolve toward that behavior.”

Polceanu took a similar approach. His MirrorBot actually recorded the movements of other players in real time. This way it could “borrow” humanness by recording keyframes of other players’ actions (assuming it knows which ones are human), playing those actions back with a delay, and partially modifying the behavior so that it appeared to be its own thinker.
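The record, delay, and modify idea described above can be illustrated with a small Python sketch. To be clear, this is not Polceanu's implementation, which the article does not describe in detail. The class name `DelayedMirror`, the representation of an action as a `(turn, move)` pair of numbers, and the random jitter used for "partially modifying" the behavior are all assumptions made for illustration.

```python
import random
from collections import deque


class DelayedMirror:
    """Replay observed actions a fixed number of frames later, slightly altered.

    An action is a hypothetical (turn, move) pair of floats; a real game
    bot would record far richer state than this.
    """

    def __init__(self, delay_frames, jitter=0.0, rng=None):
        self.delay_frames = delay_frames  # how many frames to lag behind
        self.jitter = jitter              # max random offset added on replay
        self.rng = rng or random.Random()
        self.buffer = deque()             # recorded actions, oldest first

    def observe(self, action):
        """Record one action seen from another (presumed human) player."""
        self.buffer.append(action)

    def next_action(self):
        """Return the oldest recorded action once the delay has elapsed.

        Returns None while fewer than delay_frames + 1 actions are stored.
        """
        if len(self.buffer) <= self.delay_frames:
            return None
        turn, move = self.buffer.popleft()
        # Perturb the copy so the replay is not an exact echo of the original.
        return (
            turn + self.rng.uniform(-self.jitter, self.jitter),
            move + self.rng.uniform(-self.jitter, self.jitter),
        )


mirror = DelayedMirror(delay_frames=2, jitter=0.1)
for observed in [(10.0, 1.0), (12.0, 1.0), (15.0, 0.5)]:
    mirror.observe(observed)
    print(mirror.next_action())  # None twice, then roughly (10.0, 1.0)
```

The delay keeps the copy from being an obvious instant echo, and the jitter stands in for whatever modification made the winning bot look like it was acting on its own.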

The take-home message here is, if a bot wants to pretend it’s human, it has to pretend to be dumber and more irrational. These two bots have done a pretty good job, at least in the context of a first-person shooter game. What will be much harder to incorporate into our robotic impostors are the human attributes that we celebrate, like creativity, fortitude, and empathy.

The timing of the win is noteworthy given that it comes as we celebrate Alan Turing’s centenary: he would have turned 100 on June 23. His famous Turing Test was meant to determine whether machines can exhibit intelligent behavior. His original conception involved a judge conversing with a human and a machine, without knowing which one was which. If the judge can’t pick out the human, the machine passes the test. The 2K BotPrize, inspired by the Turing Test, is an important step toward conquering Turing’s challenge. Admittedly, trying to discern humanness amid the mayhem of first-person shooter battlefields is a long way from Turing’s conversational vision (where one would hope that humans could score better than 41.4 percent humanness). But, if we’re going to get there, it’s going to be with mimicry strategies like those of the 2K BotPrize winners.

[image credits: BotPrize and Gizmag]

Discussion — 5 Responses

  • Tracy R. Atkins October 8, 2012 on 9:47 am

    That brings up such a great point. AI has to dumb itself down to fit in in many cases. I remember playing against bots on the original Quake, way back then. The bots were easy to spot because they were insanely fast and accurate. No personality at all. It’s interesting that you have to ratchet up the social aspects (I assume these bots chat with other players?) while reducing actual performance to pass the test.

    • why06 Tracy R. Atkins October 8, 2012 on 7:01 pm

      What I find more interesting is the 41.4% humanness scored by the humans. And the humans seemed less human than the bots. It really makes you wonder what humans think of themselves. Do we idealize ourselves? What was it about the bots that made them have more humanity than the humans?

  • Mindey November 3, 2012 on 2:59 pm

    It shows how context-specific the Turing test is… It has to be calibrated by the action spaces, and other factors, such as the ability of humans to see differences between manifestations of actions.