Testing tradition

A tooth left overnight in a glass of cola will dissolve. If you receive an e-mail with the subject “An Internet Flower for you,” do not open it – it contains a virus that removes important files from your computer. If you enter #-9-0 on your phone, you’ll allow scammers to make unlimited long-distance calls from your phone.

Any of these assertions sound familiar? They’re examples of common urban folklore that spreads like wildfire via the Internet.

Such legends fool us because they sound plausible. Experts say the most successful myths and folklore contain a mixture of truth, exaggeration and falsity that makes them hard to disprove.

Examples exist within virtually all fields, and software development is no exception. How such folklore enters the computer programming culture – and can hamper software developer productivity – is what interests Victor Basili, a computer science professor at the University of Maryland-College Park.

Basili’s large collaborative team studies how software developers create new code, where current programming efficiency bottlenecks exist, and how a shared body of knowledge can reduce development time.

His research employs classroom experiments, case studies and group discussions with code developers, combined with interviews with the engineers and scientists who use the code.

Basili has enlisted professors teaching graduate courses at eight universities to identify common impediments to effective program development in student projects.  Developers at five ASC Alliance consortia (DOE-supported centers for high-performance computing) compare those impediments to defects professional developers make.

All these tools are designed to help high-performance software developers improve their productivity and cut time spent debugging.

The team has a wiki to collect bugs, bottlenecks and folklore that hamper effective code development.

“If I give you a factor-of-two improvement in development time – if it takes me five years instead of 10 years to develop petascale codes” capable of a quadrillion calculations per second, “Wow! That’s something,” Basili says.

There are relatively few programmers with the knowledge and skills to create code for massively parallel computers – using thousands of processors – and they’re geographically dispersed.  Basili and his team are trying to bring them together in virtual space by creating a knowledge base of best practices for high-performance code development.

Another project aims to determine how developers can best use new programming languages Cray Inc. and IBM are developing for the next generation of computer hardware.

“The petascale project asks, ‘How do we take advantage of these larger and faster machines?’” Basili says.  “But we first have to build this lower-level body of knowledge about what’s important, and now it becomes even more important as the next generation of petascale computers comes on-line.”

His research has led to creation of an experience base for common recurring software defects, or bugs, in high-performance computing.  The team has assembled a wiki to collect patterns of functional bugs, performance bottlenecks, portability problems, and folklore that hamper effective code development.

For example, Basili says software developers often create a serial version of a particular code – one that works in a sequence of steps – optimize it to run efficiently, and then adapt that code to a parallel computing environment.

But when Basili asked experienced developers to critique a code that was created that way, they said the problem was in the assumption – the received wisdom – that it’s important to optimize a serial code before parallelizing it.

“Vendors tell us, ‘Go ahead and develop the serial code, but don’t do it [too] well, because optimizing serial code might actually hurt your effort to parallelize it,’” Basili says.
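One way to picture that advice is the sketch below. It is our own illustration, not an example from Basili's team or from any vendor, and the function names are hypothetical. The serially optimized version of a sliding-window sum reuses a running total, so every step depends on the one before it. The plainer version computes each window independently, which makes the work easy to divide among workers.

```python
from concurrent.futures import ThreadPoolExecutor


def window_sums_serial_optimized(data, width):
    """Sliding-window sums using a running total (assumes width <= len(data)).

    Fast on one processor, but each step needs the previous result:
    a loop-carried dependency that resists straightforward parallelization.
    """
    running = sum(data[:width])
    sums = [running]
    for i in range(width, len(data)):
        running += data[i] - data[i - width]
        sums.append(running)
    return sums


def window_sum_at(data, width, start):
    """Sum of a single window, computed without reference to any other window."""
    return sum(data[start:start + width])


def window_sums_parallel(data, width, workers=4):
    """Same result as above, but every window is an independent task."""
    starts = range(len(data) - width + 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with the windows.
        return list(pool.map(lambda s: window_sum_at(data, width, s), starts))


data = [3, 1, 4, 1, 5, 9, 2, 6]
serial = window_sums_serial_optimized(data, 3)
parallel = window_sums_parallel(data, 3)
print(serial == parallel)
```

Threads in standard Python will not make this particular computation faster; the pool is there only to show the structure. The point is that the "less clever" version does more arithmetic per window yet has no ordering constraint, while the optimized version would have to be reworked before it could be split across processors.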

Tidbits like these ought to be distributed to the developer community, he says: “If there’s less time spent writing wrong code and debugging it, then we can shrink both the time to development, as well as the effort.”

Basili is trying to understand how effective development of high-performance computing codes happens, and how programmers can make the most efficient use of the new languages that are being developed for petascale machines.

“We’ve reached the [processing] limit of single-processor machines,” Basili says.  “With the current generation of multi-core processors in PCs, we are going to have to expand the pool of developers.  It’s going to be a field no longer just for the specialist.”

Basili believes that to effectively train more programmers, something approaching a shared knowledge base must be created and distributed among students and faculty – one that dispels folklore and disseminates best practices.  For now, he’s collecting programming folklore on his Web site and forming testable hypotheses to prove or disprove it, using students and collaborators as his test cases.

So, is it really easier to get something working using a shared memory model than by using message passing? Stay tuned to hpcbugbase.org for the answer.
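For readers unfamiliar with the two models in that question, here is a minimal sketch of the difference. It is an illustration under our own assumptions, not code from hpcbugbase.org: both versions use Python threads on one machine, and the function names are hypothetical. In the shared-memory style, workers update a common variable and must coordinate access to it. In the message-passing style, workers share nothing and send their results to a collector.

```python
import queue
import threading


def sum_shared_memory(chunks):
    """Workers add their partial sums to one shared total, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:  # without the lock, concurrent updates could be lost
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def sum_message_passing(chunks):
    """Workers share no state; each sends its partial sum as a message."""
    inbox = queue.Queue()

    def worker(chunk):
        inbox.put(sum(chunk))

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Exactly one message arrives per chunk, so collect that many.
    return sum(inbox.get() for _ in chunks)


chunks = [[1, 2, 3], [4, 5], [6]]
shared_total = sum_shared_memory(chunks)
message_total = sum_message_passing(chunks)
print(shared_total, message_total)
```

The shared-memory version is shorter to write but depends on remembering the lock; the message-passing version makes every exchange of data explicit. Which style is easier to get working on real high-performance codes is the kind of claim the team is testing, and this toy example does not settle it.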

Bill Cannon
