A supercomputer capable of a quintillion operations a second will go online in 2021 after the US government handed Intel and supercomputer manufacturer Cray a contract to build an exascale computer called Aurora. This machine is being built from the bottom up to run AI at unprecedented scales.
Today’s most powerful supercomputers measure their performance in petaflops—one petaflop is a quadrillion operations per second—but the US, China, and Japan are all racing to be the first to reach the next major milestone of an exaflop early next decade.
That will be a huge leap above current capabilities. The world’s current fastest supercomputer, called Summit, only came online last June, and clocked a top speed of 143.5 petaflops in November. Aurora would be looking for a 7x boost in just a few years.
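As a rough check on that 7x figure, the arithmetic can be sketched in a few lines of Python. This is a minimal sketch that assumes Aurora lands at exactly one exaflop (the quintillion operations per second cited above) and compares it with Summit's 143.5-petaflop result; the variable names are illustrative only.

```python
# Unit definitions: operations per second.
PETAFLOP = 10**15  # one quadrillion
EXAFLOP = 10**18   # one quintillion

# Summit's top speed as reported in the article.
summit_flops = 143.5 * PETAFLOP

# Assumption: Aurora delivers exactly one exaflop.
aurora_target_flops = 1 * EXAFLOP

# Ratio of Aurora's target to Summit's measured speed.
speedup = aurora_target_flops / summit_flops
print(f"Aurora target vs. Summit: {speedup:.1f}x")
```

The ratio comes out just under 7, which matches the boost described above.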
The news didn’t exactly come out of left field, though. The project has been discussed for some time, and the Department of Energy’s formal announcement of the plans is best seen as the equivalent of a groundbreaking ceremony. The new machine will cost $500 million and will be housed at the DOE’s Argonne National Laboratory near Chicago.
The announcement also didn’t reveal too many details about how the feat would be achieved. In a press release Intel said the machine will be powered by a future generation of its Xeon processors, a future generation of its Optane memory chips, and its yet-to-be-released Xe GPUs. But the chipmaker is playing its cards close to its chest when it comes to the exact design and configuration of these future chips.
For its part, Cray will be housing Intel’s chips in its next-generation Shasta supercomputer system, which will include its Slingshot interconnect—the ultra-fast optical cables designed to shuttle data around the system.
This is actually Intel’s second crack at building a system named Aurora. The company was originally due to build a 180-petaflop system, based on its now-scrapped Knights Hill family of processors, that would have gone live last year.
After the project was refocused on the exascale system, Intel appears to have taken a cue from more recent supercomputers like Summit and Sierra that use a hybrid CPU-GPU approach, notes Nicole Hemsoth at The Next Platform. But rather than relying on GPUs from external specialists like Nvidia or AMD, Intel seems determined to build its own.
That’s perhaps indicative of the growing convergence of high-performance computing and AI. AI workloads typically run far more efficiently on GPUs than on CPUs, and Intel is probably keen not to be left behind by that trend.
In a press briefing before the announcement, Argonne’s Rick Stevens, who is overseeing the project, said Aurora is being designed specifically to run massive deep learning models, with more than 100 AI applications already under development at various national laboratories.
Delivering such a dramatic boost in performance will be a colossal research and development effort, though. Experts agree that the technology roadmap is fairly well laid out, but it will still pose many challenges, and developing software for computers of this size and complexity could be an even bigger task.
But Stevens told reporters at the briefing that they were confident the necessary advances in both hardware and software would come through in time. “Exascale R&D has been going on for over a decade and the innovation curves on which these exascale platforms are being based are moving extremely rapidly,” he said.
Whether the US will be the first country to cross the exascale barrier is too hard to call at this stage. A supercomputing expert intimately involved in China’s exascale efforts conceded to MIT Technology Review last year that the country may have to push back its initial goal of unveiling a machine by the end of 2020 as it weighs the benefits of using home-grown chips or ones licensed from Western companies.
Japan is also targeting a 2021 launch for its post-K computer, which will use ARM CPUs and is heavily focused on rapid data movement rather than raw FLOPs. Its designers believe that focus will help it run applications faster than its competitors.
It looks like the exascale race is going to be a photo finish.