Accelerator facilities that allow nuclear physicists to probe the inner workings of atoms and their nuclei require a costly high-wire act of data acquisition and storage. Take, for example, the Continuous Electron Beam Accelerator Facility (CEBAF) at the Department of Energy’s Thomas Jefferson National Accelerator Facility, or JLab, in Newport News, Virginia. At CEBAF, electrons hurtle at close to lightspeed into a target, obliterating protons and neutrons.
Every piece of atomic debris is diligently tracked by a suite of detectors, generating tens of gigabytes of data per second. Until recently, capturing this data for post-experiment analysis required priming servers to receive data at irregular intervals and keeping the whole system calibrated for experiments that can last for hours, weeks or years.
“It’s been done this way for time immemorial,” says Graham Heyes, JLab’s scientific computing head.
Now, JLab has found a new way to do it: streaming the mountain of data pouring out of detectors directly into powerful computer systems to allow real-time filtering, calibration and analysis, all of which will save the lab time and money.
In the past, gathering data from nuclear physics experiments was simpler. Hardware limitations capped how much data researchers could acquire and store. Over the years, detectors have become more complex and readout electronics faster.
“Somewhere in the last decade, we realized that networking and computing technology had caught up with detector technology, to the point where, in theory, our hardware could continuously read out the detector’s data and store it,” Heyes says. “You could then sort everything later, which is an awful lot easier than trying to sort it out while you’re looking at it.”
But saving oodles of data is expensive, so researchers typically tried to write data to disk only when interesting collisions happened. Turning data collection on and off to capture individual events requires a trigger, an expensive, complicated piece of hardware with embedded software that uses specific criteria to decide when events occur during the run of an experiment.
Experiments are run around the clock and can last for months or years, Heyes explains. “If you get something wrong in the trigger then it is possible that all of your data is corrupt, leading to misleading results. So, the physicists are constantly stopping to run trigger studies.”
System calibration can gobble precious time, too, he says. “What you get out of the detector is a number that represents a voltage on a wire that can be converted into positions and times, but everything needs precise calibration.”
One way the lab’s researchers do that is by turning off the magnet and letting cosmic rays come through the lab’s roof and bombard the accelerator’s detectors. Over the course of several hours, they adjust calibration constants until the cosmic-ray signals line up as straight tracks across the detectors. Like trigger studies, calibration is performed over and over throughout the course of an experiment.
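The idea is easier to see with a toy example. The Python sketch below simulates straight cosmic-ray tracks crossing three detector layers and recovers a deliberately introduced offset on the middle layer; the layer positions, offset and noise levels are invented for illustration, not taken from JLab's detectors.

```python
import numpy as np

rng = np.random.default_rng(0)
z = np.array([0.0, 1.0, 2.0])   # positions of three detector layers (arbitrary units)
middle_offset = 0.7             # the miscalibration we want to recover

# Simulate straight cosmic-ray tracks crossing all three layers (with the
# magnet off, the tracks really are straight lines).  The outer layers are
# treated as references; only the middle layer is miscalibrated.
n = 5000
slope = rng.normal(0.0, 0.5, n)
intercept = rng.normal(0.0, 1.0, n)
noise = rng.normal(0.0, 0.05, (n, 3))
hits = intercept[:, None] + slope[:, None] * z + noise
hits[:, 1] += middle_offset     # middle layer reports shifted positions

# For each track, interpolate between the outer hits to predict where the
# middle hit should land; the average difference is the calibration constant.
predicted = 0.5 * (hits[:, 0] + hits[:, 2])
print("recovered offset:", round((hits[:, 1] - predicted).mean(), 3))  # ~0.7
```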
To get around these time sinks, Heyes and his team have fully embraced streaming data acquisition. That allows them to acquire data continuously and eliminate the need for a trigger. Also, because calibration information is mixed with the experimental data the lab’s physicists want to keep, self-calibration occurs in real time. The technology builds on the same principles used by Netflix, Google and Apple, Heyes says, which have “now been tested by a billion customers. You don’t go out and reinvent the wheel when there are already Ferraris driving around.”
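In rough outline, the change looks like this. The Python sketch below replaces the hardware trigger with an ordinary software filter applied to a continuous stream of made-up detector readings; `detector_stream` and `is_interesting` are hypothetical stand-ins for illustration, not JLab's actual code.

```python
import itertools
import random
import time

def detector_stream():
    """Continuously yield (timestamp, channel, voltage) readings -- stand-ins
    for the raw values a detector front-end emits without interruption."""
    while True:
        yield (time.time(), random.randrange(1024), random.gauss(0.0, 1.0))

def is_interesting(reading):
    """Placeholder for the physics selection a hardware trigger used to make;
    in a streaming system it becomes an ordinary software filter downstream."""
    _, _, voltage = reading
    return abs(voltage) > 3.0   # e.g. a pulse well above the noise floor

# Streaming model: read everything out, then filter, calibrate and reduce in
# software.  A triggered system would make this decision in hardware, before
# most of the data was ever read out.
sample = itertools.islice(detector_stream(), 100_000)
kept = [r for r in sample if is_interesting(r)]
print(f"kept {len(kept)} of 100,000 readings")
```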
With the physical trigger gone and experiments calibrated in real time, another intriguing possibility opens up: real-time data analysis.
The team can analyze the hundreds of gigabits per second of raw data as they arrive and reduce them to roughly a gigabit per second, Heyes says. “You can save dramatically on the storage costs and the cost of everything else that’s downstream.”
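A quick back-of-the-envelope calculation shows why that matters. The raw rate below is an assumed round 400 gigabits per second, standing in for the “hundreds of gigabits” Heyes describes.

```python
# Back-of-the-envelope storage arithmetic.  The raw rate is an assumed round
# number standing in for the "hundreds of gigabits" per second quoted above.
RAW_GBPS, REDUCED_GBPS = 400, 1
SECONDS_PER_DAY = 86_400

raw_tb_per_day = RAW_GBPS * SECONDS_PER_DAY / 8 / 1000       # gigabits -> terabytes
reduced_tb_per_day = REDUCED_GBPS * SECONDS_PER_DAY / 8 / 1000

print(f"raw:     {raw_tb_per_day:,.0f} TB per day")     # about 4,300 TB
print(f"reduced: {reduced_tb_per_day:,.1f} TB per day")  # about 10.8 TB
```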
Another cue Heyes and his team have taken from big business is creating acquisition systems tailored to specific applications.
Large data centers run by Amazon, IBM and others either have customized hardware or a software layer on top of their hardware to carve out virtual machines that fit a user’s needs exactly. The JLab team uses a similar approach: “We can build, say, a box tailored to process the data with machine learning techniques, with 400 gigabytes of memory, two CPUs and eight GPUs,” Heyes says. If one system fails, data can be redirected in mid-flight to the other.
The same logic can be applied to streaming data, whether it’s from particle accelerators, X-ray light sources, electron microscopes, genomics experiments, a combination of these data types, or even simulations, Heyes notes. “You have your data acquisition system, some storage, some compute, and then a copy of the whole thing, and if there’s a problem, you can dynamically steer the data from one system to another, so you experience no interruptions.” Heyes and colleagues submitted their method earlier this year to arXiv Computer Science: Networking and Internet Architecture.
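The steering itself can be pictured as a small piece of failover logic. The Python sketch below uses hypothetical primary and backup “sinks”; the real system redirects network streams between computing systems, but the shape of the decision is the same.

```python
def steer(stream, primary, backup):
    """Send each chunk of a data stream to the primary sink; if it fails,
    redirect the rest of the stream -- mid-flight -- to the backup."""
    sinks = [primary, backup]
    for chunk in stream:
        while sinks:
            try:
                sinks[0](chunk)
                break
            except IOError:
                sinks.pop(0)    # this sink is down; fall through to the next
        else:
            raise RuntimeError("all sinks failed; data would be lost")

# Hypothetical usage: each "sink" is just a callable that accepts a chunk.
stored = []
steer(iter([b"event-1", b"event-2"]), primary=stored.append, backup=print)
```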
Heyes also is trying to simplify how these systems are built. He’s customizing software frameworks that work a bit like Tinkertoys, with streams of data playing the role of the long wooden rods and services acting as the hubs. More advanced versions might provide a results template – a basic sketch of what the incoming data might look like, including metadata that characterize its type and source – and use a machine learning algorithm that helps to reassemble and interpret the resulting diagrams. “So that’s the ultimate goal in five to seven years down the road: a smart data center that you just give a hint of what you want to do, and it figures everything out for you.”
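A toy version of that Tinkertoy picture, in Python: generator streams play the rods, small functions play the hubs, and a pipeline is just their composition. The service names and calibration constants here are invented for illustration.

```python
def calibrate(stream):
    """A 'service' hub: convert raw counts into calibrated values using a
    made-up gain and offset."""
    for raw in stream:
        yield raw * 0.25 - 1.0

def filter_hits(stream, threshold=5.0):
    """Another hub: keep only values above a threshold."""
    return (value for value in stream if value > threshold)

def pipeline(source, *services):
    """Snap the hubs together, with data streams as the connecting rods."""
    stream = source
    for service in services:
        stream = service(stream)
    return stream

# Hypothetical usage: raw counts in, calibrated and filtered values out.
for value in pipeline(iter(range(100)), calibrate, filter_hits):
    pass   # hand each value off to storage or an analysis service
```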
And, if Heyes gets his way, the system he and his team are building will figure everything out for other national labs looking to join the streaming revolution. For example, JLab will partner with Brookhaven National Laboratory to run streaming data experiments at the Electron-Ion Collider in the early 2030s.
“We’re just getting our toes wet here,” Heyes says, “and starting with nuclear physics makes a lot of sense because we’re able to solve a lot of the problems before we have to do it on a scale that’s 10 or 100 times bigger.”