The idea was to introduce desktop extension into Argonne’s Blue Gene/P using a version of a Bell Laboratories operating system called “Plan 9,” after the Ed Wood Jr. cult sci-fi movie “Plan 9 from Outer Space.”
The Plan 9 operating system anticipated network computing. The Blue Gene/P supercomputer currently has 40,000 nodes, each with four cores, networked in a three-dimensional torus that connects each node to its six nearest neighbors for massively parallel operations at extremely high speeds. IBM says the system is designed to scale up to deliver more than 1 quadrillion floating-point operations per second.
“That 3-D toroidal mesh greatly reduces the amount of time it takes to send a message from one place to another,” Minnich says. “If you send a message on a Blue Gene network, it’s going to get where it’s supposed to. That’s a guarantee.”
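To make the torus idea concrete, here is a minimal Python sketch of how a node's six neighbors fall out of wraparound arithmetic, and why wraparound links shorten the path between distant nodes. The 8 x 8 x 8 dimensions and the function names are illustrative assumptions, not Blue Gene/P's actual configuration, and the sketch only counts shortest-path hops; it does not model the machine's real routing.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six nearest neighbors of a node in a 3-D torus.

    Each axis wraps around, so a node on the "edge" still has
    a neighbor in both directions along every axis.
    """
    x, y, z = node
    nx, ny, nz = dims
    return [
        ((x + 1) % nx, y, z), ((x - 1) % nx, y, z),
        (x, (y + 1) % ny, z), (x, (y - 1) % ny, z),
        (x, y, (z + 1) % nz), (x, y, (z - 1) % nz),
    ]


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimum hop count between two nodes when every axis wraps around."""
    return sum(min(abs(p - q), n - abs(p - q)) for p, q, n in zip(a, b, dims))


def mesh_hops(a: Coord, b: Coord) -> int:
    """Minimum hop count on the same grid without wraparound links."""
    return sum(abs(p - q) for p, q in zip(a, b))


if __name__ == "__main__":
    dims = (8, 8, 8)  # hypothetical size, chosen only for illustration
    corner, far_corner = (0, 0, 0), (7, 7, 7)
    print(torus_neighbors(corner, dims))
    print("torus hops:", torus_hops(corner, far_corner, dims))
    print("mesh hops: ", mesh_hops(corner, far_corner))
```

In this toy grid, opposite corners are far apart on a plain mesh but only a few hops apart once each axis wraps around, which is the effect Minnich describes.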
Plan 9, he adds, was designed as an improvement on the UNIX operating system. For desktop extension, a key Plan 9 feature is its ability to import files from other machines for sharing.
Once the team had ported Plan 9 to Blue Gene/P, adapting the operating system to support desktop access was no picnic, Minnich says. “You need to write a compiler to compile your code to run on this new kind of computer. And you have to make sure it all works when it’s run on the big machine. Then you have to get it to run fast. To get it to run fast you have to measure it. Then you have to deal with all of the things that break.”
If the code works on one of the Blue Gene’s nodes, “it doesn’t mean it will work on 10 or 100 or 1,000 of them. So you have to deal with all the problems that come as you run it on more and more of the machine. You’ve got a new computer, and you’ve never run your software on it and you’ve got to build every piece of that software.”
One remaining hurdle will be simple resistance to novelty, he adds. “I think what we’ve already learned is that it’s very hard to get people to change the way they do things to such a large extent. But some people who like this idea a lot would like to use it.”
The computer world is rife with such uncertainty and unexpected shifts of interest. While previously at Los Alamos National Laboratory in New Mexico, Minnich directed another team that created Clustermatic, a software package that enables groups of PCs to work much better together in high-performance clusters. Clustermatic earned an R&D 100 Award in 2004, but members of that team have since moved on to companies like Google and Cray.
“Clusters have had a huge role for a decade,” Minnich says. “But now they have been almost pushed aside. High-end computing has gone away from clusters of PCs and moved to machines like Blue Gene.”
Monte Basgall is a freelance writer and former reporter for the Richmond Times-Dispatch, Miami Herald and Raleigh News & Observer. For 17 years he covered the basic sciences, engineering and environmental sciences at Duke University.