Artificial Astronomer Analyzes Galaxies Almost As Well As We Do

Aaron Saenz

Jun 03, 2010

Scientists are teaching an artificial intelligence to classify galaxies imaged by telescopes like the Hubble. Manda Banerji at the University of Cambridge, along with researchers at University College London, Johns Hopkins and elsewhere, has succeeded in getting the program to agree with human analysis more than 90% of the time. Banerji used data from Galaxy Zoo, a massive online project that has enlisted more than 250,000 volunteers to make more than 60 million galaxy classifications. The new automated astronomer will help with even larger analytical projects on the horizon, taking care of trivial classifications and leaving the tough cases to humans. Man and machine, working together to conquer the universe. You gotta love it.

Some of the upcoming classification projects are...well...astronomical in size. The Dark Energy Survey will look at some 300 million galaxies, while VISTA is going to map the sky over the entire southern hemisphere. These are just two examples of large-scale astronomy projects. Even with crowd-sourcing initiatives like Galaxy Zoo, humans are simply not going to be able to analyze all of this data as fast as it's coming in. Banerji is involved with both projects, so it's no wonder she was looking for some automation to help her out.

Scientists have been turning to automated solutions with some impressive results recently. We've told you about Eureqa, a program that analyzes data to come up with the underlying equations that explain a phenomenon. There's also Adam, a robot that actually performs tedious biology experiments and makes original discoveries. The artificial astronomer, however, is simultaneously grander and simpler than either of these other projects. The data sets it reviews will be enormous: millions of digitized images to classify. Yet it won't be generating fundamental equations about the universe or coming up with original ideas. It's a sorting algorithm.

A very sophisticated sorting algorithm. Banerji and the rest of the team programmed the astronomy AI to analyze galaxies for color, variations in brightness, and texture. Each of these variables corresponds in a general way to different types of galaxies (e.g., elliptical, spiral, irregular). With enough training, the AI can weigh all of its variables together and guess the most likely classification for a galaxy.
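To make the idea concrete, here is a minimal sketch of classifying by measured features rather than by shape. It is not the team's actual method: it swaps in a simple nearest-centroid rule, and the feature values, the `TRAINING` list, and the `train` and `classify` helpers are all invented for illustration.

```python
from collections import defaultdict
from math import dist

# Hypothetical training set: (color, brightness_variation, texture) -> label.
# Every number here is made up for illustration, not taken from Galaxy Zoo.
TRAINING = [
    ((0.90, 0.20, 0.10), "elliptical"),
    ((0.85, 0.25, 0.15), "elliptical"),
    ((0.40, 0.60, 0.55), "spiral"),
    ((0.45, 0.65, 0.60), "spiral"),
    ((0.30, 0.90, 0.90), "irregular"),
    ((0.35, 0.85, 0.95), "irregular"),
]


def train(examples):
    """Average the feature vectors of each class into one centroid per class."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Return the label whose centroid lies closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = train(TRAINING)
guess = classify(centroids, (0.88, 0.22, 0.12))
print(guess)
```

The point of the sketch is only that a handful of numbers per galaxy, plus labeled examples, is enough to produce a "most likely" class. A real classifier would use many more features and a more capable model.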

This is completely different from what people typically do. You or I will look at an image and quickly classify a galaxy by recognizing the patterns of its shape. Despite the difference, the computer's classifications agreed with Galaxy Zoo results more than 90% of the time. As described in the online paper on the subject (to be published by the Royal Astronomical Society), the computer was able to achieve this success even when its training wasn't complete. That is, it could use relatively few examples to learn how to classify galaxies like a pro.
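For readers wondering what an agreement figure like that means in practice, here is a small sketch of how it could be computed: the fraction of galaxies where the machine's label matches the human consensus. The `agreement_rate` function and both label lists are hypothetical and are not data from the paper.

```python
def agreement_rate(machine_labels, human_labels):
    """Fraction of galaxies on which the machine and human labels match."""
    if len(machine_labels) != len(human_labels):
        raise ValueError("label lists must be the same length")
    if not human_labels:
        raise ValueError("need at least one labeled galaxy")
    matches = sum(m == h for m, h in zip(machine_labels, human_labels))
    return matches / len(human_labels)


# Invented labels for ten galaxies; the machine disagrees on the fourth one.
human = ["spiral", "elliptical", "spiral", "irregular", "elliptical",
         "spiral", "elliptical", "spiral", "spiral", "elliptical"]
machine = ["spiral", "elliptical", "spiral", "spiral", "elliptical",
           "spiral", "elliptical", "spiral", "spiral", "elliptical"]

rate = agreement_rate(machine, human)
print(f"Agreement: {rate:.0%}")
```

Note that this measures agreement with people, not correctness: if the volunteers are wrong about a galaxy, a machine that matches them is wrong too.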

Astronomy is a data-heavy science, but it's far from being the only one. Genetics is right up there as well, and we're soon going to have mountains of sequenced DNA to analyze with the advent of cheap whole genome sequencing. Projects like the artificial astronomer are likely to be matched by even more sophisticated programs ready to tackle the next generation of massive analytical problems. I think it's promising that these first rounds of AIs are likely to need human assistance in a small fraction of their work. Putting man and machine together to classify images is a great idea. Looking forward, however, there's little doubt that the human fraction is going to continue to get smaller. Banerji is already looking to improve the astronomy program. Eventually computers may take over so much analysis that humans will be needed for very little of it. What happens then? Maybe rampant unemployment and worldwide chaos, or maybe we'll reach an age when human scientists are empowered to tackle any creative endeavor they want to pursue. Automation and science - Best. Feedback Loop. Ever.

[image credit: Hubble Heritage via Manda Banerji]
[source: Banerji Webpage, Banerji et al. 2010 (arXiv/RAS), MSNBC, Galaxy Zoo]
