Categories: Computer Science

Digital autodidact

In an age when cars park themselves and computers talk at their human users, Mallikarjun (Arjun) Shankar of the Department of Energy’s Oak Ridge National Laboratory (ORNL) acknowledges that high-end research computation also seems on the verge of, if not intelligence, at least a remarkable ability to train itself to assemble and interpret massive amounts of spoken, written and visual data.

That’s owed to an evolving computational field called “deep learning,” says Shankar, ORNL Advanced Data and Workflow Group leader. Enabled by mathematical ideas called artificial neural networks, deep learning offers “computational techniques to understand backgrounds and structures and data,” he says. This information, some of it initially hidden, emerges like art on a painter’s canvas as these networks grasp more insights into the information they’re focusing on.

The goal: employ this high-speed wizardry to search swaths of available information better than previous techniques have, finding patterns at the speed of data input. Multiple national labs, including ORNL, are working with major universities on deep learning quests to resolve scientific challenges ranging from understanding cancer to enabling fusion power.

Last fall, ORNL bought what the manufacturer calls “the world’s first deep-learning supercomputer in a box.” This box, the NVIDIA DGX-1, is already exchanging data with the much more massive Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science user facility. Both machines have the graphics processing units (GPUs) that deep learning thrives on.

NVIDIA GPUs will also be vital components of Summit – the next-generation supercomputer at the OLCF – which will have at least 10 times Titan’s calculating speed, enough to label it a “pre-exascale” machine when it is assembled next year. Exascale computers will perform a quintillion (10¹⁸) calculations per second.

Shankar also directs ORNL’s Compute and Data Environment for Science (CADES) program, which offers researchers access to data-intensive resources such as the DGX-1. He thus has a long perspective on data-handling history.

“Statisticians have been doing data analysis for over a hundred years,” he says. “Scientists in the second half of the 20th century became interested in how to make a computer do it well. In the 1970s and 1980s, ways to deploy methods to understand patterns and information in databases were developing, and that was called ‘data mining.’ Then ‘machine learning’ became the term everybody gravitated to.”

Machine learning gives computers the ability to learn without being explicitly programmed to do so. Deep learning, which has evolved rapidly over the last decade, is a part of machine learning that boosts flexibility in computer programs’ self-training power.


Shankar thinks none of this is artificial intelligence, a term he says has become loaded since Massachusetts Institute of Technology computer pioneers Marvin Minsky and John McCarthy first used it in the 1950s. Back then, “there was this thinking that computers and algorithms would work to become sentient in a sense,” he says. “That never really played out, so people lowered their expectations. We reset our targets to say we want a computer that doesn’t have to be reprogrammed when it gets new data.”

For the past 15 years or so, computers have gotten bigger and faster. Now, “these neural networks are able to do things more powerfully that don’t need that much hand-holding. It almost feels like you can construct a neat neural network and give it test data and then” – on its own – “it can start recognizing more information.”

Although an artificial neural network invokes brain physiology, the network is “just equations, which means all software,” Shankar says. But there’s a nuance in their mathematics. The networks frequently take the form of matrix multiplications to produce new matrices from previous ones, largely using linear operations.

Like all software, neural nets run on processors. GPU processors, it turns out, “are really well designed” for a mathematical feature of neural nets: matrix multiplications, which produce new data matrices from previous ones.

“When you play video games or do image processing kind of work, you do this sort of matrix shuffling. A system like the DGX-1 has really powerful GPUs with really high bandwidth connections accelerating these matrix operations.”
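
To make that “matrix shuffling” concrete, here is a minimal sketch, written in Python with NumPy rather than drawn from any ORNL or NVIDIA code, of a single neural-network layer expressed as a matrix multiplication; the array sizes and random values are assumptions chosen only for illustration. The `x @ W` product is exactly the kind of operation a GPU’s parallel hardware accelerates.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
batch, n_in, n_out = 4, 8, 3

x = rng.standard_normal((batch, n_in))   # a small batch of input vectors
W = rng.standard_normal((n_in, n_out))   # the layer's learned weights
b = np.zeros(n_out)                      # the layer's bias terms

# One network layer: a matrix multiplication followed by a nonlinearity.
# The x @ W product is the step that GPU hardware accelerates.
hidden = np.maximum(0.0, x @ W + b)      # ReLU activation
print(hidden.shape)                      # (4, 3): a new matrix produced from the old ones
```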

At the outset, building a neural net can be as simple as joining a one and zero in a linear combination, he says. The nets initially train on targeted input data that helps them refine their system parameters so they can identify desired patterns in new data. What emerges are large collections of tutored nets set up to extract additional new information on their own from fresh text, audio, image or video data. Thus, they can be used to predict simulated outcomes based on historical data.
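
Here is a similarly minimal sketch of that train-then-generalize pattern, again in NumPy and not taken from any ORNL project: a single-layer network (logistic regression) refines its parameters on labeled training data with gradient descent, then labels fresh data it has never seen. The dataset, learning rate and iteration count are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical training set: 200 examples, 5 features, binary labels.
X = rng.standard_normal((200, 5))
true_w = np.array([1.5, -2.0, 0.5, 0.0, 1.0])
y = (X @ true_w > 0).astype(float)

w = np.zeros(5)   # the parameters the training loop refines
b = 0.0
lr = 0.1

for _ in range(500):                              # gradient-descent training
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))        # predicted probabilities
    grad_w = X.T @ (p - y) / len(y)               # gradient of the logistic loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

# Fresh data the trained net never saw; it labels the new examples on its own.
X_new = rng.standard_normal((5, 5))
labels = (1.0 / (1.0 + np.exp(-(X_new @ w + b))) > 0.5).astype(int)
print(labels)
```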

One ambitious project is CANDLE, for CANcer Distributed Learning Environment. It seeks to address some objectives of a federal cancer moonshot initiative, whose goal is to enlist deep learning to increase the productivity of cancer investigators tenfold. Led by Argonne National Laboratory in collaboration with Oak Ridge, Lawrence Livermore and Los Alamos national laboratories, plus the National Cancer Institute (NCI), CANDLE is using deep learning in three different ways.

The first searches NCI cancer treatment-related molecular data for telltale genetic signatures at the DNA and RNA levels. A second probes how key proteins interact to set up conditions for cancer. “The behavior of these molecules is notoriously hard to understand as they twist and turn and lock-in,” Shankar says. “It takes thousands and millions of (computer processor) hours to simulate a set of molecules moving around for 10 milliseconds.”

A third sifts millions of patient records to create an automated database of things like metastases and cancer recurrence. A study led by ORNL’s Health Data Sciences Institute and the University of Tennessee, Knoxville, found that deep learning neural nets originally developed for computer vision performed consistently better than previous methods at extracting cancer information from electronic pathology reports.

Meanwhile, some of Titan’s GPUs are being used to evaluate a completely different kind of problem.

Investigators from Princeton University, ORNL and Stony Brook University are probing whether deep learning can help scientists better predict the causes and occurrence of millisecond-long energy disruptions that are hampering efforts to develop magnetically confined fusion power in reactors such as England’s Joint European Torus.

Predictions from the trained networks could enable complex fusion experiments to conclude gracefully, without expensive damage to the experimental facility.

Oak Ridge National Laboratory is supported by the Department of Energy’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.

Bill Cannon
