Computer Science
July 2007

Ganging up

DOE researchers have made it easier to link off-the-shelf PCs into powerful parallel clusters.

The day when setting up a supercomputer is as easy as programming a VCR may never arrive, but Clustermatic has brought it closer.

A project of scientists at Los Alamos National Laboratory, a Department of Energy facility, Clustermatic is a software suite that greatly simplifies the construction and control of clusters – personal computers joined to work as one large, parallel-processing machine.

Parallel processing breaks a task into pieces and distributes them to multiple processors that work simultaneously, producing a result faster. For many applications, clusters are well suited to the job and far cheaper than monolithic high-performance computers because they use standard, off-the-shelf components.
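The split-distribute-combine pattern described above can be sketched in a few lines of Python. This is only an illustration, not Clustermatic code: the function names `split` and `parallel_sum_of_squares` are made up for the example, and a pool of threads on one machine stands in for the separate PCs of a real cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, pieces):
    """Cut a list into at most `pieces` chunks of nearly equal size."""
    size = max(1, -(-len(data) // pieces))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """The work each worker does on its own piece."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Break the task up, hand the pieces to workers, combine the results.

    Assumption: threads stand in for cluster nodes; a real cluster
    would send each chunk to a different machine.
    """
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10)), workers=3))
```

The answer is the same as a single loop over the whole list would give; the point is that each piece can be computed independently before the partial results are added together.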

Clustermatic has boosted clusters’ popularity; all or part of the suite runs on millions of computers. Now researchers are designing an all-new version that could move clusters further toward VCR simplicity, says Ron Minnich, a computer scientist who led Clustermatic’s creation.

Labs and universities worldwide have built clusters, gaining high-performance computing power for a fraction of the cost. Yet, Minnich says, most clusters run with something like mob rule instead of orchestra-like coordination.

“You have this room full of PCs and the first problem is they’re not designed to be clustered,” says Minnich, who worked on the project at Los Alamos before moving to DOE’s Sandia National Laboratories site in Livermore, California.

Getting the PCs to cooperate is like herding penguins, Minnich says.

PCs typically start by loading the BIOS – Basic Input-Output System.  The BIOS gives a computer instructions it needs to work with things like a keyboard and mouse before the main operating system loads.

But the BIOS usually isn’t designed for parallel computing, and causes headaches in a cluster arrangement.

So Minnich and his fellow researchers replaced it with LinuxBIOS, a program written in the C programming language that uses selected components from the open-source Linux operating system. Each clustered PC, called a node, loads LinuxBIOS from its own flash memory chip.

LinuxBIOS tells the node to talk to a PC designated as the control node – the conductor of the PC orchestra.  “It says, ‘Tell me what to do,’” Minnich says.

LinuxBIOS creates “a computer node that immediately is remote controlled. It doesn’t need a floppy or a disc that it boots from.”

The control node distributes programs to the compute nodes and parcels out tasks.  For speed, it enlists several compute nodes, telling them to pass along programs the way a teacher might have a few volunteers hand out worksheets to the rest of the class.
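A rough back-of-the-envelope model shows why enlisting helpers saves time. The sketch below is an assumption-laden illustration, not Clustermatic’s actual distribution scheme: it supposes the control node sends one copy at a time, that the helpers then forward copies in parallel, and that every transfer takes one time step. The function names and the choice of 32 helpers are hypothetical; the 1,024-node figure simply matches the size of the Pink cluster.

```python
import math


def direct_steps(nodes: int) -> int:
    """Time steps if the control node sends the program to every node itself."""
    return nodes


def relayed_steps(nodes: int, helpers: int) -> int:
    """Time steps if the control node first serves `helpers` nodes,
    which then split the remaining nodes evenly and forward in parallel.

    Assumption: one transfer per sender per time step.
    """
    if not 1 <= helpers <= nodes:
        raise ValueError("helpers must be between 1 and the number of nodes")
    remaining = nodes - helpers
    return helpers + math.ceil(remaining / helpers)


if __name__ == "__main__":
    print(direct_steps(1024))        # control node does all the work
    print(relayed_steps(1024, 32))   # 32 helpers share the work
```

Under these assumptions, 1,024 nodes take 1,024 steps when served directly but only 63 steps with 32 helpers, which is the worksheet-passing effect in miniature.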

LinuxBIOS is the foundation of the Clustermatic software stack, Minnich says, but the software suite includes other cluster computing tools:

  • BProc, or Beowulf Distributed Process Space, lets the control node – the conductor – see what’s happening in the compute nodes – the orchestra.
  • Supermon monitors progress on the compute nodes with no discernible impact on their function.  That’s unusual in high-performance computing, where monitoring often saps processor resources.
  • Beoboot allows large clusters to start quickly by recruiting the first successfully booted nodes to help boot the others.  Beoboot can start a 1,000-node cluster in about 2.5 minutes, compared to hours for some clusters.

Researchers at Los Alamos National Laboratory tested their Clustermatic software suite by building Pink, a 2,048-processor cluster, in just 2.5 days. This picture shows one of the three racks holding Pink’s processors.

Clustermatic proved itself quickly.  Minnich and fellow researchers Greg Watson, Matthew Sottile, Sung-Eun Choi, and Eric Hendriks used it to build Pink, a cluster with 1,024 two-processor nodes.  It took just 2.5 days to assemble and start operating.

Pink has a theoretical top speed of almost 10 trillion calculations per second and still runs after five years.  It has a reputation as one of the most reliable machines at Los Alamos, Minnich says.

Clustermatic earned an R&D 100 Award as one of the most promising new inventions of 2004.  Now it’s on thousands of systems, Minnich says – although no one has kept track, since it’s free software distributed via download and CD.

“We keep finding users of it that are … we don’t know where,” Minnich says.  He does know that users include financial and pharmaceutical companies.

Temperature data for processors in the Pink computer cluster are displayed on a computer monitor. The data were captured by Supermon, a part of the Clustermatic software suite, and processed into a visualization.

LinuxBIOS alone has gained huge acceptance.  Besides high-performance clusters, it’s loaded onto millions of other computers, Minnich says.  It’s especially popular for “embedded” processors – computer chips in kiosks, digital televisions, TV signal converters and other devices.

Clustermatic’s last version came out in 2005.  “We had a certain set of goals we put in the (research) proposal in 2000 and in surprisingly short order we actually hit all the goals around 2003 or 2004,” Minnich says.

Now Minnich and his fellow researchers are molding an as-yet unnamed program that builds on Clustermatic, but is entirely new.  Scientists from Los Alamos and Sandia’s California and New Mexico locations are involved.

The group’s success proves “our ideas about how to build these systems and make them work well and stand up quickly were fundamentally correct,” Minnich says.  “You end up with a very fast and very useful system.”