If you’ve ever watched video over a slow Internet connection, you have some idea of what scientists face when working with big experimental facilities like the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory.
The SNS supplies intense pulsed neutron beams to scientists seeking clues to the fundamental nature of materials.
Recording how materials scatter neutrons can provide information about the materials’ internal structures at the atomic scale and about their reaction to stress and strain at larger scales. Whether scientists are designing lightweight, durable material for aircraft or components for new rechargeable batteries, information from scattering experiments can help them optimize properties like magnetism, superconductivity and energy storage.
But the SNS instruments usually use a file-based approach for data capture and processing. As researchers carry out experiments, the computer system generates files that sometimes contain hundreds of gigabytes or even terabytes of data. Generating and then processing those individual files can take as much as an entire day, says Thomas Proffen, director of the Neutron Data Analysis and Visualization Division at Oak Ridge.
A team of Oak Ridge researchers wants to give SNS users an experience akin to streaming video, making acquiring and analyzing data as efficient as possible.
The SNS and most other neutron-scattering sources are large, multimillion-dollar accelerators that are in high demand. The SNS accommodates around 500 experiments and more than 1,500 researchers each year.
To make the scientists’ time at the SNS as productive as possible, Oak Ridge scientists are implementing a new way to manage experimental data. ADARA (Accelerating Data Acquisition, Reduction and Analysis) is an advanced software infrastructure that changes how experimental data are captured, processed and then presented to the user. It lets scientists access their data in real time – like a streaming movie – in an integrated package, rather than waiting for files to be stored. With faster access to data, researchers can do more experiments in less time.
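The difference between the two approaches can be illustrated with a minimal sketch. It is not ADARA's actual code or protocol: the function names, the made-up time-of-flight values and the chunk size are all hypothetical, chosen only to show the idea. The file-style function needs the complete event list before it returns anything, while the streaming version yields an updated histogram after every chunk of events and ends with the same final counts.

```python
from typing import Iterable, Iterator, List


def bin_index(t: float, n_bins: int, t_max: float) -> int:
    """Map a time-of-flight value to a histogram bin, clamped to the valid range."""
    i = int(t / t_max * n_bins)
    return min(max(i, 0), n_bins - 1)


def batch_histogram(events: List[float], n_bins: int, t_max: float) -> List[int]:
    """File-style processing: the whole event list must exist before we start."""
    counts = [0] * n_bins
    for t in events:
        counts[bin_index(t, n_bins, t_max)] += 1
    return counts


def streaming_histogram(
    chunks: Iterable[List[float]], n_bins: int, t_max: float
) -> Iterator[List[int]]:
    """Stream-style processing: yield a snapshot of the counts after each chunk."""
    counts = [0] * n_bins
    for chunk in chunks:
        for t in chunk:
            counts[bin_index(t, n_bins, t_max)] += 1
        yield list(counts)  # copy, so earlier snapshots are not mutated later


# Hypothetical event data (arbitrary units), delivered two events at a time.
events = [0.5, 1.5, 2.5, 3.5, 0.7, 1.2]
chunks = [events[i:i + 2] for i in range(0, len(events), 2)]

snapshots = list(streaming_histogram(chunks, n_bins=4, t_max=4.0))
final_from_file = batch_histogram(events, n_bins=4, t_max=4.0)

for n, snap in enumerate(snapshots, start=1):
    print(f"after chunk {n}: {snap}")
print("file-based result:", final_from_file)
```

In this toy setup a user could inspect the partial histogram after each chunk and adjust the experiment, rather than waiting for the full file, and the last snapshot agrees with the file-based result.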
The work on ADARA started in 2011, when a team of computer scientists and neutron scientists examined how they could improve productivity at the SNS. The facility has many instruments, but the group found that managing and processing experimental data was a common bottleneck.
“So we wanted to streamline the process of acquiring the data, converting the raw data to a convenient usable form, and analyzing the data to extract meaning from the results,” says Galen Shipman, data systems architect for computing and computational sciences and ADARA’s principal investigator.
Users, the team decided, should be able to capture and process data in near real time, as the instruments generate them. The team also wanted to enable researchers to process extremely large datasets in a matter of seconds. Finally, they needed to build tools so that researchers could accomplish these tasks from anywhere. It has taken significant developments in data infrastructure to achieve these goals, says Mark Hagen, SNS Data Analysis Group leader and co-primary investigator of the ADARA project.
The huge, difficult-to-process files that SNS instruments generate also come in several different types, some carrying information about conditions surrounding the experimental sample and others holding data from the scattered neutrons. Waiting to examine their data means scientists can’t optimize conditions such as temperature and pressure that might improve their experimental results.
Real-time access might let researchers make adjustments and cut the time needed to get even higher-quality data. That’s important because scientists often have only a few days at a time for their experiments. It might be months before they get a chance to try again, Proffen says.
Although the core problem the team faced was computational, the project relied on collaboration between researchers in two lab directorates, Neutron Sciences and Computing and Computational Sciences. The team had to understand the needs of several different groups: the instrument scientists who build and manage the SNS beam lines, others who manage the data those instruments generate and visiting researchers who use the facilities.
So that researchers could learn the new software quickly, the team decided to ensure everything they developed would function with the established Mantid neutron scattering data processing system. “By delivering these capabilities through the Mantid (graphical user interface), users are able to manipulate their data in a way that they could easily understand and plug into,” Shipman says.
In August 2012 the team deployed ADARA alongside the existing data management system on HYSPEC, a neutron scattering spectrometer primarily used to study excitations in single-crystal samples. Hagen and instrument scientist Barry Winn led the initial experiments, conducting measurements of the mineral cuprite (Cu₂O). They needed to ensure the system was reliable and that users could receive feedback in real time. They also wanted to make sure the results obtained with ADARA matched data obtained with the existing file-based data management system. The test succeeded: ADARA provided real-time feedback and instant dataset creation as designed.
The researchers did the most intense work of deploying and integrating the new software at times when the neutron source would normally be down for scheduled maintenance. But they had to carry out many tasks while the neutron source was up and an instrument was running.
“We’ve been changing the tires on the car while it’s been rolling down the hill,” Shipman says. The team is phasing out the old data management system on HYSPEC and plans to deploy ADARA to all SNS beam lines by the end of 2014.
Many projects of this type end up being proof-of-concept exercises: Researchers gain critical insights, but the resulting system isn’t used routinely. “This is one where we’ve identified a critical need and addressed it,” Proffen says. “That’s a really big deal.”