Hidden Dimensions - Everyone Needs a Supercomputer, Part II (The Apple Factor)

by John Martellaro
June 26th, 2006

"Civilization advances by extending the number of important operations
one can perform without thinking about them." [1]

- Alfred North Whitehead

Continued from Part I, last week.

Once Upon a Time

When I was working for Martin Marietta in Oak Ridge, Tennessee in the early 1990s [2], I became a student of the UNIX workstation vendors. My UNIX mentor, who had a DEC Alpha workstation, and I would meet in his office frequently to assess the UNIX workstation industry, vendor market share, and the competitive landscape. What I learned then has served me well in analyzing the HPC/supercomputer market. So follow along with me, and I'll tell you how Apple fits into all this.

When it came time to replace my Quadra 700 (which was running Apple's A/UX), I decided to get an SGI "Indy" instead of a Macintosh 8500. Being a Mac kinda guy, I couldn't live only on the Alpha's command line. On the other hand, I admired what SGI was doing with the IRIX OS and multimedia. They were way ahead of their time. My Indy even had a camera on top, just as my G5's display does now, ten years later.

In those days, there were five distinct UNIX workstation vendors, each with its own CPU and its own OS: Sun (SPARC, Solaris), SGI (MIPS, IRIX), DEC (Alpha, OSF/1), IBM (POWER, AIX), and HP (PA-RISC, HP-UX).

Each of these companies fought tooth-and-nail for the desktop workstation market at a time when PCs were considered underpowered and OSes were considered immature, namely DOS on the Intel 80486 and Apple System 7 on the 680x0. While a PC might cost roughly $1,500 (and a Mac somewhat more), these scientific workstations were in the $10,000 range - but worth every penny if you needed the power of UNIX and fast SCSI storage.

I bring this up for a very good reason. It was this OS and CPU divergence, provincialism, and bitter competition in the Unix workstation market that contributed to the downfall of some of these companies and the rise of Microsoft's Windows NT. Microsoft saw the need for a single API set and a consistent GUI on commodity Intel hardware as a viable alternative to the fragmentation in the workstation market.

A parallel event that created problems for the UNIX workstation vendors was Intel's ability to create powerful yet low-cost CPUs that essentially brought workstation-class performance to the PC world. While we've made terrific progress on the CPU hardware side in the last ten years, investment in a great GUI for UNIX was non-existent until Apple took up the task. Creaky old GUIs like CDE never progressed because these workstation companies had no vision. Executives in that industry believed that PC "toys" would never catch up with their glorious workstations, yet they settled for CDE and buried their heads in the sand. Windows NT4 on Pentium spelled the beginning of the end for the classic UNIX workstation, and those vendors have had to rally around Linux to keep their (albeit fading) business alive.

As we know, DEC folded first. Even in the face of the Windows NT challenge, DEC employees sabotaged the company by refusing to give up on VMS. DEC was sold to Compaq and their products disappeared into subspace. SGI is in bankruptcy. IBM moved to Linux on POWER. HP moved to a variety of different combinations, and Sun, who discovered that a super-Sparc chip wasn't going to save the company, is into Opterons with the OS of your choice. Hardly anyone buys these systems when they can get a quad CPU G5 system from Apple for thousands less. [3]

Has Anything Changed?

Now fast forward ten years, and we have the same situation with supercomputers that we had with UNIX workstations in the 1990s. Each of the remaining companies engaged in the supercomputing market, plus Cray, has its own favorite hardware - Itanium, Opteron, Xeon, PowerPC 440 and 970 - and each has its own favorite UNIX flavor. Each company fights tooth and nail for a small piece of the market, arrogantly believing that someday the customers will realize they are the best, and never again buy another cluster from the other guys.

Meanwhile, users of supercomputers are frustrated by problems related to migrating from one generation to the next. And the tools continue to be fragmented, with many different variations of MPI, schedulers, and compilers -- to the point where each supercomputer is a unique, custom-made system that takes months to bring to productivity. And while Virginia Tech showed the world how to stand up a supercomputer and get it running before it's obsolete, there remains a long period from the initial Top500 benchmark to really productive use.

Moreover, the pressures to maintain code consistency and certify code for crucial national security projects make it very difficult to introduce new technologies. Compared to Shark, Xcode, and Mac OS X, the HPC community is stuck in the dark ages. One presenter I saw at a major conference didn't even know about the Apple profiling tools for the PPC 970 and was writing her own for the POWER5 - a CPU with a very similar architecture.

Enter Microsoft

Once again Microsoft sees the weakness in all this foolishness and is positioning themselves to enter and seize the supercomputer business. Here's what they've done.

I've seen Dr. Marvin Theimer several times, in HPC meeting keynotes, describe how they'll infiltrate the HPC market. Here's the plan, very roughly.

Historically, when scientists need e-mail or networking services, they go to the IT department. But when they need a supercomputer, they bypass IT and go much higher in the chain. They know what's needed, and they're UNIX and computational experts. No one can question their expertise.

Microsoft will start to slowly undermine that authority by forcing small and medium-sized HPC projects to go through the IT department. Here's how the conversation might go. (Dilbert? Are you listening?)

Scientist: "We need a mid-range cluster to validate our materials design."

CEO: "How much will it cost?"

Scientist: "Well under a million dollars. We need a teraflop."

CEO: "Go see the IT Director."

Scientist: "Huh? He doesn't know sh*t about Arbitrary Lagrangian-Euler analysis. And he doesn't know UNIX."

CEO: "Oh, I'm not convinced we need UNIX. The IT Director and I just finished a meeting with Microsoft, and they've provided us with all the tools you'll need to do the materials design."

Scientist: "Gulp. Gurgle. GAAAAAA!"

CEO: "I've asked the IT Director to design the system with a Microsoft consultant. I'll let you know when we're ready to give you a user account on the system."

The scientist winces when he hears the words "user account" and realizes that his days of bypassing the IT department are over. Now there will be those nay-sayers out there who claim Microsoft will never take control of the HPC industry in this country. I believe that is a short-sighted response. Microsoft has the money, the time and the determination. Amazingly, only one company can stop them. Want to guess who that is?

Once Microsoft has penetrated the low and then mid-range market, systems under one teraflop, they'll have enough expertise to cut their teeth on bigger bids against Cray and IBM. They'll win a few and lose a few. (Meanwhile, the very largest projects will remain as they have been for a few more years.) But once Microsoft achieves dominance in the small (4-64 nodes) and mid-range clusters (128-512 nodes), the days will be numbered for scientists who can call their own shots and buy a UNIX-based departmental cluster of any size with company money. Microsoft will do to Apple in departmental computational science what they did with the back office. And then set their sights on the other guys.
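For a rough sense of scale, those cluster tiers can be translated into theoretical peak performance with simple arithmetic: nodes × cores per node × clock × floating-point operations per cycle. The sketch below is a back-of-the-envelope illustration only - it assumes a hypothetical quad-core 2.5 GHz G5-class node with 4 flops per cycle per core (two FPUs, each doing a fused multiply-add), which are my illustrative numbers, not vendor specifications:

```python
import math

def peak_gflops(nodes, cores_per_node=4, clock_ghz=2.5, flops_per_cycle=4):
    """Theoretical peak in GFLOPS: nodes x cores x clock (GHz) x flops/cycle."""
    return nodes * cores_per_node * clock_ghz * flops_per_cycle

# One assumed quad-core 2.5 GHz node: 4 * 2.5 * 4 = 40 GFLOPS peak.
node_peak = peak_gflops(1)

# Nodes needed to reach the scientist's "one teraflop" (1000 GFLOPS) of peak:
nodes_for_teraflop = math.ceil(1000 / node_peak)

print(node_peak)           # 40.0
print(nodes_for_teraflop)  # 25

# The small (4-64 node) and mid-range (128-512 node) tiers discussed above:
print(peak_gflops(64))     # 2560.0 GFLOPS, i.e. ~2.5 TFLOPS peak
print(peak_gflops(512))    # 20480.0 GFLOPS, i.e. ~20 TFLOPS peak
```

Note that sustained Linpack performance always falls well short of this theoretical peak, so a real "one teraflop" cluster would need more nodes than the naive count - but even so, the sub-teraflop tier Microsoft is targeting first is only a few dozen commodity nodes.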

The Way Out

How might HPC scientists avoid such a plight? In my opinion, the key is the software and hardware integration expertise that Apple owns. No one really wants to see Microsoft control the HPC culture in the United States for obvious reasons.

As I mentioned last time, there has been a healthy outpouring of support for Apple in HPC prior to the announcement of the Intel transition. [4] Many believe that Apple, if they set their mind to it, can seize control of this market thanks to their vision and expertise at integrating hardware and software. Even after the Apple Intel announcement, scientists realized that Apple Intel-based Xserves would, someday, make a very nice cluster component. What's missing is a serious level of effort from Apple in terms of organizing to support such sales. To succeed against Microsoft, a team must be built that has its own HPC scientists, proposal writers, software engineers, sales and support people.

There's no doubt that the Intel processor Apple selects for the next-generation Xserve, combined with Apple's great engineering and industrial design, will make it a winner. [5] If Apple were to engage in a program of partnering with scientists to build turn-key mid-range clusters, certify the design, integrate best-of-breed HPC software, and create a sales and support organization that can take such a product to market, it would stop Microsoft dead in its tracks: among UNIX-savvy scientists in the HPC community, Apple is welcome and Microsoft is not. However, if Apple walks away, IT managers can force the issue their way with Microsoft's traditional power politics.

With Apple's participation and their fabulous tools like Xcode and Shark (and the HPC software tools that have been ported to Mac OS X) in the hands of HPC scientists, the state of the art could leap forward dramatically and cut Microsoft off at the pass. From what I've seen, it won't be hard for Apple to embarrass Microsoft in terms of elegance, creativity, and attention to the user, just as they've done with everything else. Sans Apple, however, the other HPC vendors cannot hope to develop tool maturity and integration that can compete with Microsoft. As bad as Microsoft's stuff will be, it'll be heavily promoted and better than anything else.

No one else can do this but Apple. Orion Multi-Systems is too small, if they are even still in business. (Their Web site is off the air as I write this.) Cray doesn't have the expertise. When I was at Apple, Cray consistently approached us to work together because they realized that they needed a great front-end UNIX partner. They don't have the money or expertise to pull together a great, user-friendly, next-generation teraflop cluster that has the ease of use Apple is famous for. Sun is too busy just trying to stay in business after Scott McNealy's exhausting crusade. All the other companies involved with HPC have their own axe to grind, their pet government contracts, and absolutely no clue how to pull together a cluster technology that is a joy to use. Or how to defeat Microsoft's infiltration through the IT department.

There is a lot of prestige involved with HPC and supercomputers. Our country's scientists use them for astronomy and physics, sequencing of DNA, designing new drugs, designing safer cars, predicting weather and the tracks of hurricanes, and simulating designs for new energy systems like fusion reactors. Apple already makes a big deal about their support for the scientist on the desktop and in workgroups, but Microsoft wants to take that away from Apple and confiscate, with typical IT shenanigans, much more in the HPC community.

It'll be interesting to see who wants this exploding HPC market the most.


[1] Such as driving a car while talking on a cell phone. (Which
I'm not recommending.)

[2] At the time, Martin Marietta Energy Systems operated the
Oak Ridge National Laboratory and all its related facilities for
the U.S. Department of Energy.

[3] When you bought one of these workstations in the 90s, you also
had to pay thousands in maintenance agreements.  Despite the PC
onslaught, no workstation vendor wanted to give up that revenue stream
first - fearing loss of working cash, profits, etc. So, they all went down together.

[4] Indeed, IBM CPUs have been historically looked on with favor
in the HPC community, so the PPC 970 was warmly welcomed.

[5] I suspect, it's just a hunch, that IBM was not happy with Apple's
emergence in commodity clusters. I'll take a wild guess that IBM,
already benefiting from the low-power PowerPC 440 chip in the
Blue Gene systems, had little incentive to solve the transistor
leakage issues with the PowerPC 970's 90 nm process. Just a guess.