Achieving exascale computing will depend on many interacting parts. That requirement spawned the idea of co-design, in which developers consider the scientific problems to be solved as they create exascale architectures, along with how those architectures affect software design. To pursue this approach, the Department of Energy (DOE) created three co-design centers – for materials, nuclear energy and combustion.
A key part of co-design resides in what are known as proxy applications, or proxy apps. A proxy app generally stands for something bigger than itself. It’s a piece of code stripped of as many nonessential lines as possible that mimics some feature of a large program.
“The proxy app can represent an algorithm or many other things, like memory-access patterns a code uses or how a code does threading” – dividing segments of code into tasks that run simultaneously – says Jim Belak, senior scientist at Lawrence Livermore National Laboratory and deputy director of the co-design center on materials. “The suite of proxy apps is an attempt to capture the computational workflow of the bigger program.”
Proxy apps first emerged as benchmarks. For example, the Livermore Loops, a Fortran-based proxy app, consists of 24 do-loops, each testing an algorithm. A do-loop directs a program to repeat some action until a certain criterion is met.
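To make the idea of a loop-based benchmark concrete, here is a minimal sketch in Python rather than Fortran. The kernel formula, the array contents and the names `hydro_like_kernel` and `time_kernel` are illustrative assumptions, not code taken from the Livermore Loops.

```python
import time


def hydro_like_kernel(n, q=0.5, r=0.25, t=0.125):
    """Run one do-loop style kernel over n elements and return the result.

    The arithmetic is an illustrative stand-in for the kind of small
    numerical loop a benchmark suite times; it is not an official kernel.
    """
    y = [float(k % 7) for k in range(n)]
    z = [float(k % 5) for k in range(n + 11)]
    x = [0.0] * n
    # The loop repeats until k reaches n, the stopping criterion.
    for k in range(n):
        x[k] = q + y[k] * (r * z[k + 10] + t * z[k + 11])
    return x


def time_kernel(kernel, n, repeats=5):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(n)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print("best time (s):", time_kernel(hydro_like_kernel, 100_000))
```

Running the same small kernel on different machines, compilers or programming models gives a rough way to compare them without porting a full application.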
“These proxies allow us to study part of an application’s capabilities in new environments, new machines, new programming models, new operating systems,” says Michael Heroux, distinguished member of the technical staff at Sandia National Laboratories. To succeed, proxy apps must be accurate and readily available.
In part, proxies provide a much-needed roadmap for developing new software or co-developing software and hardware. As Alice Koniges, computational scientist at Lawrence Berkeley National Laboratory, says, “Computer science has mushroomed so much – longer codes that do more things, more languages, new tools to build codes – that there’s not really a good path for a programmer to follow to write really good code, because they don’t know where to start.” With 100,000 or 300,000 lines of 10-year-old code, a single team of programmers probably can’t just take a look and make it faster.
With a proxy app, though, a programmer can see how a piece of code is representing the physics or engineering behind it. “Then people in the (computing) community can say, ‘My language is perfect for that’ or ‘This could map really well to CUDA,’” NVIDIA’s parallel computing architecture, Koniges says. Then programmers could at least work on speeding up a particular part of a larger application.
Exascale provides unprecedented options for parallel processing, which divides a problem into pieces individual processors solve simultaneously. “This creates a fundamental change in how we design hardware and the tools that sit on top of the hardware: the operating systems, programming languages, compilers and applications,” Heroux says. “There’s not a stable piece in the entire computing infrastructure anymore.”
Those changes generate even more need for proxy apps because computer scientists need smaller, manageable pieces of representative code to guide choices from the various hardware and software options ahead.
To make a proxy app, Heroux says, you need three people: a domain expert who understands how the code is applied to the science; a parallel-computing expert who understands what type of algorithms might map well to future machine architectures; and an expert in computer architecture and trends in processor, memory and interconnect design.
“Put those three people in a room, and let them talk about specific applications of interest,” Heroux says. This will include discussions of the important computing patterns – essentially how algorithms behave – in the applications. From this, the trio of experts can create a shortlist of the important parallel patterns in the applications.
As a domain expert, Belak says, “I think of the problems in terms of the mathematics, not necessarily the computer algorithms.” Still, he wants to give the parallel-computing expert information about both “because there might be other ways to solve the domain science.”
Belak says knowledge of the domain science “enables me to go into an application and find that key part” a small application can mimic. When collaborating with the computer architecture expert, however, the domain and parallel-computing experts must work with abstractions. The intellectual property behind the architecture gives the vendor a competitive advantage, so the expert won’t reveal every detail. As Belak says, “I need them to abstract the hardware in sufficient detail that I can think about the machine but that it doesn’t compromise their intellectual property.” That leads to a fine balance in most cases.
Once a proxy app is in hand, it must be validated. As Heroux explains, that means “if I take this app and put it in two different computer environments and run it in both and get a ranking of the performance in those settings, it would be predictive of the real application running in those environments. We’re still learning the right way to do this.”
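The ranking test Heroux describes can be sketched in a few lines. This is a simplified illustration, assuming run times have already been measured in each environment; the function names and the environment labels in the usage note are hypothetical.

```python
def rank_environments(timings):
    """Return environment names ordered from fastest to slowest.

    `timings` maps an environment name to a measured run time in seconds.
    """
    return sorted(timings, key=timings.get)


def proxy_is_predictive(proxy_timings, parent_timings):
    """Check whether the proxy ranks the environments as the parent does."""
    if set(proxy_timings) != set(parent_timings):
        raise ValueError("both apps must be timed in the same environments")
    return rank_environments(proxy_timings) == rank_environments(parent_timings)
```

For example, with placeholder numbers chosen purely for illustration, `proxy_is_predictive({"env_a": 1.2, "env_b": 2.0}, {"env_a": 310.0, "env_b": 450.0})` returns `True` because both apps run faster in `env_a`. A real validation study would likely need more environments and repeated measurements than this sketch assumes.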
For instance, Belak asks, do the proxy apps really represent the computational workload of the parent application? To explore that, the comparison could be how often a proxy app and the parent application access a particular memory address, or location. Vendors provide tools for such quantification. In this way, the validation process considers the entire computer ecosystem, rather than just the proxy and parent apps.
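One way to picture the memory-access comparison is as an overlap score between two access traces. The sketch below assumes each trace is simply a list of addresses already collected by some profiling tool; the trace format, the function names and the scoring rule are assumptions made for illustration, not a description of any vendor's tool.

```python
from collections import Counter


def access_fractions(trace):
    """Map each address in a trace to its share of all accesses."""
    counts = Counter(trace)
    total = sum(counts.values())
    return {addr: n / total for addr, n in counts.items()}


def access_overlap(proxy_trace, parent_trace):
    """Score how alike two access patterns are, from 0.0 to 1.0.

    The score sums, over every address, the smaller of the two access
    shares. Identical patterns score 1.0 and disjoint patterns score 0.0.
    """
    if not proxy_trace or not parent_trace:
        raise ValueError("both traces must contain at least one access")
    proxy = access_fractions(proxy_trace)
    parent = access_fractions(parent_trace)
    shared = set(proxy) & set(parent)
    return sum(min(proxy[a], parent[a]) for a in shared)
```

In practice a proxy and its parent would not touch the same raw addresses, so a real comparison would probably group accesses by data structure or by access stride before scoring them.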
Among proxy apps, some researchers focus on miniapps, which share several features: They are small but not tiny, consisting of about 2,000 to 10,000 lines of code; they work as standalone pieces of software; and they can compile – be made ready to run on a specific computer – without additional software pieces. Although the general concept emerged decades ago, the specific features and the miniapp name arose from Heroux’s work at Sandia in 2007.
“Miniapps are designed to exhibit one or a few performance-impacting aspects of a real application,” Heroux explains. So miniapps need not reflect real physics or even solve simplified problems. They just need to provide a good performance model for some key piece of an application.
A sign of the growing interest in these tools is a tutorial on application proxies and miniapps at the SC12 supercomputing conference November 10-16 in Salt Lake City. The session will be presented by a team of experts that includes Heroux, Koniges, Richard F. Barrett of Sandia, and David F. Richards and Thomas Brunner, both from Lawrence Livermore National Laboratory. In addition, each co-design center presented a deep dive into its proxy apps at the recent DOE Advanced Scientific Computing Research program’s Exascale Research Conference.
Current access to proxy apps, however, creates a problem. “It’s difficult for the people who want them to get their hands on the really important ones,” Koniges says. As a solution, she suggests a miniapp repository. “To do this, you need to have the codes freely accessible,” Koniges says. “So they need to be codes that people are willing to give out or give out with some kind of license.”
With such a repository, miniapps would be used for more and more applications. Likewise, validation could be conducted in wider circles. Both advances would broaden the use of proxy apps, which already appears inevitable.