As an avid hockey player for most of his life, Arthur Bernard “Barney” Maccabe recognizes the vital role of speed and agility in a team’s success on the ice and at work, where he and colleagues design core software for supercomputers.
Maccabe is a pioneer in “lightweight” operating systems – the basic instructions that enable massively parallel computers to solve problems by breaking them apart and parceling the pieces out to thousands of processors. Each processor executes its part of the algorithm simultaneously, reaching a solution more quickly than a single processor could – an approach known as scaling.
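The divide-and-conquer pattern Maccabe describes can be sketched in miniature. This is a hypothetical illustration, not code from any of the systems in this story: it splits a large summation into chunks and hands each chunk to a separate worker process – the same decomposition idea that massively parallel machines apply across thousands of processors.

```python
from multiprocessing import Pool

def partial_sum(chunk):
    """Each worker computes its piece of the problem independently."""
    return sum(chunk)

def parallel_sum(data, nprocs=4):
    """Split the data into one chunk per process, sum the chunks
    in parallel, then combine the partial results."""
    chunk_size = (len(data) + nprocs - 1) // nprocs
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with Pool(processes=nprocs) as pool:
        partials = pool.map(partial_sum, chunks)
    return sum(partials)

if __name__ == "__main__":
    # Same answer as a serial sum, but the work is spread across processes.
    print(parallel_sum(list(range(1_000_000))))
```

On a real supercomputer the "combine the partial results" step itself becomes a parallel reduction, and the operating system's job – the lightweight kernel's job – is to stay out of the way while the processors communicate.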
After 27 years as a professor, researcher and administrator at the University of New Mexico, Maccabe (pronounced “muhCABE”) joined the Department of Energy’s Oak Ridge National Laboratory in Tennessee early in 2009 as director of the Computer Science and Mathematics (CSM) Division.
It took him just a few days to find the ice rink in Oak Ridge. A former player on the University of Arizona team, Maccabe most recently was part of an over-40 traveling team, the Albuquerque Chili Peppers.
Carrying a hockey stick was just Maccabe’s avocation while in New Mexico. On the job at the university, he and his group performed basic research in lightweight operating systems – systems stripped to a bare minimum of features. Full-featured, or “heavyweight,” operating systems (think Microsoft Windows or Apple OS X) include many components that work well on single-processor computers but can inhibit scaling on large systems.
Maccabe has collaborated with researchers at Sandia National Laboratories for the past 20 years on the first lightweight kernels (central operating system components) to manage communication between hardware and software for some of the world’s largest and fastest supercomputers, including the first system capable of a teraflops – a trillion operations a second.
Lightweight kernels date to 1991, when Maccabe and Stephen Wheat, now with Intel Corp. but then a Sandia staff member and UNM student, formed a small group to explore better ways to support message passing on Sandia’s nCube-2 system. This group developed the Sandia/UNM Operating System, called SUNMOS, which evolved into the PUMA operating system. Intel ported PUMA to Sandia’s ASCI Red, the first teraflops system, to create the Cougar operating system. Cougar was then re-engineered for Cray’s XT3 Red Storm system, also at Sandia, and renamed Catamount.
Catamount ran on ORNL’s Jaguar supercomputer at teraflops speed until Jaguar was upgraded in 2008 to run at a peak speed of a petaflops – a quadrillion operations a second.
“Back in 1997 we worked on the first teraflops machine, and I expect to be involved in the exascale and exaflops regime and moving that forward,” Maccabe says. An exaflops is a quintillion operations per second. A quintillion is a 1 followed by 18 zeroes.
From 2002 to 2007, Maccabe served as the technical chair for a DOE Office of Science initiative called FAST-OS, the Forum to Address Scalable Technology for runtime and Operating Systems. Louisiana State University’s Thomas Sterling, co-developer of Beowulf, the approach that pioneered commodity cluster computing, and Ron Brightwell, Maccabe’s longtime collaborator from Sandia, joined with Maccabe to use Catamount as a base for an operating system infrastructure sturdy enough to run applications at exaflops speed.
Because lightweight approaches expose all of the available system resources, they “are the best way to support the development of novel architectures and programming models,” Maccabe says.
This work has evolved into the development of a lightweight virtualization layer for high-performance computer systems, and the project now includes partners from Northwestern University and ORNL.
Thomas Zacharia, ORNL associate director for the Computing and Computational Sciences Directorate, invited Maccabe to join the lab. Zacharia, who also directs DOE’s National Center for Computational Sciences (NCCS) at Oak Ridge, led CSM until 2001, helping it gain a prominent reputation in supercomputing and making ORNL home to some of the nation’s most powerful computers for unclassified science.
The upgraded Jaguar, an NCCS supercomputer, performs at a peak of 1.64 petaflops, making it the world’s first petaflops system dedicated to open research. CSM develops next-generation operating systems and software for computational science so investigators can use Jaguar and other ORNL high-performance resources. CSM also works with students, teachers, government researchers and industrial scientists on grand challenges in science and engineering.
“Barney’s vision and leadership will be invaluable in providing petascale computing systems such as Jaguar with the highest quality computational environment and the tools to enable leading-edge scientific research as quickly as possible,” Zacharia says. “His strong backgrounds in system software and computer science will enable ORNL to support DOE’s important missions in climate change, renewable energy and efficient transportation.”
Maccabe sees multiscale science as one of the big challenges that will benefit from DOE’s petascale computational resources. New, big computers will enable high-resolution scientific exploration across a wide range of time and length scales.
“In climate modeling,” Maccabe says, “you can get much finer granularity across data fields with petascale computing. In materials science, researchers can explore matter on a scale of nanometers all the way up to meters. Multiscale resolution is a problem in all disciplines, but it is especially critical in life sciences, climatology and astrophysics.”
Maccabe wants computation to be an integral and more visible tool in the sciences and to develop the computer science and mathematics to support researchers across disciplines. This also was one of his major goals at UNM, but barriers and boundaries between disciplines frustrated his efforts.
“Researchers here (at a national lab) are more open to teaming and there is more of a sense of cooperation,” Maccabe says. “There needs to be a balance between the culture of autonomy and the desire to cooperate and build things together, to be part of something bigger than you are, and that concept is better supported here at ORNL.”
Maccabe would like to explore the question of just how far CSM can go in making computation a fully integral part of research involving all the scientific disciplines. It’s “a big challenge at a national laboratory or a university, but I think I have more of a chance to have an impact at ORNL.”
Maccabe is accustomed to working on next-generation systems. At CSM, he envisions the mathematics of data driving the next wave of computation.
“The goal is to get science done but also to build a good strong math and computer science program that supports science and allows us to contribute to our own discipline.”
“Barney’s ability to develop relationships with universities and other research institutions,” Zacharia says, “is critical to building collaborations with researchers who can use petascale computing to explore climate change, energy use and its consequences, as well as other fields including astrophysics and mathematics.”
Besides collaborating with Sandia and Los Alamos national laboratories, Maccabe was a member of the original Message Passing Interface (MPI) Forum, which built a library specification for data exchanges between processors in parallel computing. He was a founding member of the Linux Clusters Institute, an organization that promotes administration of large-scale computing systems through workshops and an annual technical conference.
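MPI itself is a large specification, but its core idea – explicit send and receive operations between cooperating processes – can be sketched with Python’s standard library. This hypothetical example stands in for real MPI code (which would use the MPI library directly or bindings such as mpi4py); it passes a message from a parent process to a worker and back.

```python
from multiprocessing import Process, Pipe

def worker(conn):
    """Worker side: receive a message, do some work, send the result back."""
    data = conn.recv()                # blocking receive, analogous to MPI_Recv
    conn.send([x * x for x in data])  # reply, analogous to MPI_Send
    conn.close()

def exchange(data):
    """Parent side: send work to a child process and collect the reply."""
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    parent_conn.send(data)            # send the work out
    result = parent_conn.recv()       # wait for the answer
    p.join()
    return result

if __name__ == "__main__":
    print(exchange([1, 2, 3]))  # each element squared by the worker process
```

In a real MPI program the same send/receive pattern runs across thousands of nodes, which is why a lightweight kernel that handles message passing with minimal overhead matters so much at scale.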
At UNM, Maccabe received several faculty awards in research and teaching. He led a group of about 20 researchers and supervised 10 doctoral students, including SUNMOS collaborator Wheat, a member of the Sandia team that won the 1994 Gordon Bell Prize.
At ORNL, “I expect to do quite a bit of mentoring of young researchers just starting their careers,” Maccabe says. “That was really the fun part of being a faculty member, so I hope to help new scientists build their research programs within computation. That would be almost as much fun as hockey.”