Big Mac Passes 10 TFlops, #3 Ranking Not Expected To Change

Virginia Tech's "Big Mac" Power Mac G5 cluster has secured its place as the third fastest supercomputer in the world. According to the latest preliminary numbers from the Top 500 Supercomputer Sites, Big Mac is hitting 10.28 TFlop/s on the LINPACK benchmark, giving it a 19 percent performance lead over the fourth-place supercomputer. Final numbers will be released November 17 at the Supercomputer Conference in Phoenix.

"I donit expect the top five to change [from the November 2 numbers]," Dr. Jack Dongarra, Director of the Innovative Computing Laboratory at the University of Tennessee, told The Mac Observer. Dongarra is one for authors that maintain the list of the Top 500.

Just last week, Big Mac was hitting 9.56 TFlop/s, and system architect Srinidhi Varadarajan told Wired he was hoping for another 10 percent boost "shortly." With the November 2 numbers, the team has almost achieved that goal, and there could still be time for more improvement.

Theoretical performance of 17.6 TFlop/s

"Theyire hitting 58 percent of the peak performance, which is fairly good," Dongarra said.

The RPeak value, or theoretical maximum performance, is calculated by adding together the theoretical performance of all the processors involved. In Big Mac's case, the cluster comprises 1,100 Power Mac G5s, each with two G5 processors. Each G5 processor features two floating point units, and each floating point unit can complete one fused multiply-add, which counts as two floating point operations, per cycle. With each processor running at 2 billion cycles per second, the result is theoretically 8 GFlop/s per processor, which puts the 2,200-processor cluster at 17.6 TFlop/s. That per-processor figure is, interestingly enough, identical to the theoretical performance of each processor used in NEC's Earth Simulator, the world's fastest supercomputer.
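The arithmetic behind the peak figure can be checked with a short Python sketch. It uses only the numbers quoted in this article and assumes each floating point unit contributes two floating point operations per cycle; the constant and function names are illustrative, not part of any benchmark tool.

```python
# Back-of-the-envelope RPeak check using the figures quoted in the article.
# All names here are illustrative assumptions, not an official Top 500 formula.

NODES = 1_100                  # Power Mac G5 machines in the cluster
PROCESSORS_PER_NODE = 2        # G5 processors per machine
FPUS_PER_PROCESSOR = 2         # floating point units per G5
FLOPS_PER_FPU_PER_CYCLE = 2    # assumed: one multiply plus one add per cycle
CLOCK_HZ = 2.0e9               # 2 billion cycles per second
RMAX_TFLOPS = 10.28            # measured LINPACK result (preliminary)


def peak_gflops_per_processor() -> float:
    """Theoretical peak of a single processor, in GFlop/s."""
    flops = FPUS_PER_PROCESSOR * FLOPS_PER_FPU_PER_CYCLE * CLOCK_HZ
    return flops / 1e9


def rpeak_tflops() -> float:
    """Theoretical peak of the whole cluster, in TFlop/s."""
    processors = NODES * PROCESSORS_PER_NODE
    return processors * peak_gflops_per_processor() / 1e3


def efficiency() -> float:
    """Fraction of the theoretical peak reached on LINPACK."""
    return RMAX_TFLOPS / rpeak_tflops()


if __name__ == "__main__":
    print(f"Peak per processor: {peak_gflops_per_processor():.1f} GFlop/s")
    print(f"Cluster RPeak:      {rpeak_tflops():.1f} TFlop/s")
    print(f"LINPACK efficiency: {efficiency():.1%}")
```

Swapping in a newer LINPACK result for `RMAX_TFLOPS` shows how further tuning would move the efficiency figure.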

Thanks to its supercomputer-specific architecture, however, the Earth Simulator is able to hit 86 percent of its theoretical peak, or 35.86 TFlop/s. The Earth Simulator, which went live March 11, 2002, cost an estimated $350 million ($9.77 million per TFlop/s). Compare that with Big Mac's $5.2 million price tag (or about $500,000 per TFlop/s), and it's clear the G5 cluster is the value leader.
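The price comparison comes down to dividing each system's cost by its LINPACK result. The Python sketch below does that with the cost estimates and performance figures quoted in this article; the names are illustrative, and the results land close to the rounded figures given above.

```python
# Rough price/performance comparison based on the article's figures.
# Costs are the estimates quoted in the text; names are illustrative.

SYSTEMS = {
    "Earth Simulator": {"cost_usd": 350e6, "rmax_tflops": 35.86},
    "Big Mac": {"cost_usd": 5.2e6, "rmax_tflops": 10.28},
}


def cost_per_tflops(name: str) -> float:
    """Dollars spent per TFlop/s of measured LINPACK performance."""
    system = SYSTEMS[name]
    return system["cost_usd"] / system["rmax_tflops"]


def value_ratio(expensive: str, cheap: str) -> float:
    """How many times more one system costs per TFlop/s than another."""
    return cost_per_tflops(expensive) / cost_per_tflops(cheap)


if __name__ == "__main__":
    for name in SYSTEMS:
        print(f"{name}: ${cost_per_tflops(name):,.0f} per TFlop/s")
    ratio = value_ratio("Earth Simulator", "Big Mac")
    print(f"Earth Simulator costs about {ratio:.0f}x more per TFlop/s")
```

This counts only the purchase price against a single benchmark, so it says nothing about operating costs or performance on other workloads.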

The fact that Virginia Tech has managed to put together such a cluster in a matter of months with off-the-shelf parts is also impressive. "What's interesting is they're using a new system with a new processor and new interconnects, and the system is based on a processor not designed for scientific computing," Dongarra noted.

Much of the code for the cluster also had to be written from scratch by Varadarajan and his team at Virginia Tech, while other pieces had to be ported over to Mac OS X.

While the performance of Big Mac could still be tweaked, don't expect a G5 cluster--or really any kind of cluster--to surpass the Earth Simulator any time soon. "If you look at the Top 5 on LINPACK, they each have different processors, and four use commercial processors. These are processors not specifically designed for scientific research," Dongarra said.

A good deal of the Earth Simulator's ability to hit 86 percent of its theoretical peak can be attributed to its supercomputer architecture, which allows it to move data among its processors more quickly than commercial processors and interconnects can. There's also a law of diminishing returns at play when you add processors, and the Earth Simulator features 5,120 of them -- almost three thousand more than Big Mac.

Just a benchmark, but an important one

"An important thing to remember is that this benchmark measures just one problem," Dongarra cautions. "Real applications cover many things, so you need to be careful in making one concrete statement about performance."

Concrete or not, Big Mac's No. 3 rank on the Top 500 will garner Virginia Tech, Apple, and IBM plenty of publicity. For its part, Virginia Tech expects to make back many times its investment in the cluster through research that companies will pay the school to perform. The school also plans to release full details of its cluster and the software it uses, which should pave the way for plenty more G5 clusters to be assembled around the world.