Lies, Damned Lies, & Benchmarks
June 24th, 2003

[Update 6/25/2003: Since I wrote this column, Apple VP Greg Joswiak has answered many of the criticisms discussed below (see TMO's full coverage for more information). It remains to be seen if all the things Apple's detractors have complained about are, after all, quite appropriate, but the company is being very transparent and up front about the situation. My column is being left as published, with the addition of this update, and further thoughts I have had on the situation are included in the TMO Spin in the above linked article. - Bryan]

What's the fastest PC on the planet? According to Apple, it's the new dual 2 GHz Power Mac G5 introduced at yesterday's WWDC in San Francisco. The company showed us cross-platform benchmark results based on SPEC CPU 2000 benchmarks -- a benchmarking system that has historically shown less favorable results for the G4 than for the x86 world -- conducted by VeriTest in a paid-for study. The tests compared the dual G5 to a single-processor 3.06 GHz P4 (with only one processor used on the G5 as well), and to a dual 3 GHz Xeon.

The results showed that the G5 was slower than both Intel models in the SPECint_base2000 tests, which measure integer performance. Apple said it was 10% slower than the Intel processors on this test. The results also showed that the G5 was 21% faster than the Intel processors in the SPECfp_base2000 tests, which measure floating point prowess. According to those results (21% is more of a difference than 10%, according to a paraphrased Steve Jobs), the G5 Mac was the fastest personal computer on the planet.
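For readers who want to see how "10% slower" and "21% faster" figures are arrived at, here is a minimal sketch of the arithmetic. The scores in it are hypothetical round numbers I picked only so that they reproduce the percentages quoted above; they are not VeriTest's published results, and the function name is my own.

```python
def percent_difference(score: float, baseline: float) -> float:
    """Return how far `score` sits above (+) or below (-) `baseline`, in percent."""
    return (score - baseline) / baseline * 100.0


# Hypothetical scores (higher is better), chosen only to match the quoted percentages.
# They are NOT the figures from the VeriTest report.
intel_int, g5_int = 1000.0, 900.0    # integer test: G5 trails
intel_fp, g5_fp = 1000.0, 1210.0     # floating point test: G5 leads

int_gap = percent_difference(g5_int, intel_int)
fp_gap = percent_difference(g5_fp, intel_fp)

print(f"Integer:        {int_gap:+.0f}% relative to Intel")
print(f"Floating point: {fp_gap:+.0f}% relative to Intel")
```

Note that the two percentages come from two different test suites, so weighing one against the other is a judgment call rather than a calculation.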

Now, it won't surprise anyone that the Wintel world, and to a lesser extent the x86-based Linux community, immediately began finding fault with those results. Nothing is worse to some folks than having their belief system challenged. What may be more surprising is that some within the Mac community have also taken exception to Apple's comments and claims, most notably one of the developers at Haxial, a Mac OS X shareware developer that has released many popular Mac OS X apps. According to an editorial published on the Haxial site, Apple fudged the numbers, and the tests were intellectually dishonest.

And the beat goes on...

He's not the only one to do so. AMDZone has posted a rant that looks at Apple's announcement from an amusingly AMD-centric viewpoint (note that the author is quite correct that Apple didn't compare the G5 to any of AMD's offerings, but that much of the rest of the commentary is just this side of silly). Slashdot's readers have poured their comments all over the issue, too, as have numerous other outlets.

Documented

The problem stems from the testing procedures laid out in the documentation from VeriTest that accompanies the test results. According to that documentation, the G5 was optimized from a hardware and software standpoint, while the two Dell systems tested were not similarly optimized. For instance, while G5-specific code was included in the tests, the Dell systems actually had SSE2 and Hyper-Threading turned off, meaning that much of the potential from those systems was not even in use.

Worse yet, Apple's sponsored tests had the great crime of not using more Intel-friendly compilers to test the Intel machines, instead using a more Mac-friendly, but cross-platform, compiler (GCC 3.3, which is used in Apple's upcoming Panther release of Mac OS X). Certainly that's a first, for a computer hardware company to stack the tests in its favor, right? I should also note that some people are saying that certain aspects of GCC are actually highly optimized for Intel, but I wouldn't know for sure. In other words, it's not that GCC is unfriendly to Intel, it's just that Intel's compilers are more Intel-friendly. Make sure you read the Haxial editorial to get the breakdown of the problems, as Haxial sees it, with the test results.

Same as it ever was

Here's my take on this. Whenever a manufacturer releases benchmark results, they are always weighted to reflect positively on the company. Always. That's what companies do, rightly or wrongly. For instance, Apple chose GCC 3.3 as the compiler with which to run these tests. GCC is going to favor Apple, at least in some instances, because Apple is working with that compiler in its own OS development. Apple's critics will insist that the company should have instead used something more favorable to Intel, like Intel's own compiler. That is somehow more fair? It's two sides of the same coin.

This is similar to the benchmarks recently run by some PC-oriented site or another that tested Premiere on a Windows and Mac box, showing the G4 to be significantly slower than the Wintel box. Testing on Premiere made for cross-platform tests, but a better test for the Mac would have been to use Final Cut Pro on the Mac, as Apple has done a much better job of optimizing FCP than Adobe has with Premiere. The cross-platform nature of Premiere was used to justify those tests the same way that Apple used the cross-platform nature of GCC to justify its own tests. Again, two sides of the same coin. I took those results -- which were crowed about from the PC camp for weeks -- like I took Apple's own SPEC results, with a big bag of salt on hand.

Lies, damned lies, & statistics

You can say whatever you want with statistics, and Apple certainly had the tests tweaked to its advantage. In some cases, such as turning SSE2 off on the Intel machines, Apple did something that I think is intellectually dishonest, though at least the documentation was very open on this. The same cannot be said for this sort of testing done by most other PC companies, but even that doesn't matter too much.

It should also be noted that some at Slashdot claimed that using CodeWarrior's Mac compilers would have resulted in better performance on the Mac than Apple's own tests, just as using Intel's compilers would have been better for that platform. Frankly, not being a coder, nor an EE, I am not the one to give you the straight dope on the specifics of these technologies, and am happy to leave it to others, but I do very much understand lies, damned lies, and statistics.

Apple's claims are par for the course. For the last couple of years, all of the tests we have seen from the Wintel camp that set out to show Wintel boxes as being faster than Macs (which was the case for most of the last year, at the least) have also been performed in a manner that benefitted the PCs. That's what companies do.

The bottom line

Are Power Mac G5s the fastest personal computers on the planet? For some tasks, absolutely. In others, absolutely not. It will largely depend on what the user is doing. What's important to take from Apple's announcement, and from the benchmarking and bake-offs that were done by Apple on stage at the WWDC, is that the Power Mac G5 is now completely in the same league as the fastest Intel offerings. It's been at least a year, and perhaps longer, since the same could be said for the G4. You can tweak your testing to get results that favor Apple, Intel, or AMD as you see fit, and I am waiting to see what outside testing on real shipping G5s can do. Note, too, that TMO didn't focus too much attention on Apple's benchmarks from the get-go for all the reasons discussed in this column.

Am I rationalizing Apple's claims? Absolutely not, and certainly it's appropriate and right to talk about the flaws in Apple's testing, but only if you apply the same standards to testing from Dell, Intel, AMD, HP, or some Windows fan site. Heck, make sure you look at how a Mac site does its benchmarks, too. Remember, there are lies, damned lies, and benchmarks.

The underlying fact, however, is that it's a race again in the processor world, and that is, officially, a Very Good Thing™. Everyone will end up benefitting, at least on the user side.