Breaking Down the Storage-Performance Wall

Ioan Raicu
Assistant Professor Ioan Raicu explains a compute node to Tonglin Li, computer science doctoral student.

The central processing unit of the computer, often dubbed the brain, has historically upstaged memory in the quest for fast and powerful machines. Ioan Raicu, assistant professor of computer science, says that a fully developed brain relies on more than its ability to “think.” It also relies on its ability to retrieve data from memory to carry out normal functions and problem-solve as well as store information for future work.

The memory and storage devices used in today's supercomputers, such as hard disks and memory modules, will not be able to keep up with the processors behind the next generation of supercomputer, the exascale computer, expected to run 1 quintillion (10¹⁸) operations per second. Raicu has a solution to this storage-performance issue and received a 2011 Faculty Early Career Development (CAREER) Award from the National Science Foundation to help develop his plan.

“My proposal looks at a radically different way of managing persistent storage, which involves very little of the network,” he says, noting that his design covers both the hardware and software required to run it. “The idea is to keep as much data local on each compute node as possible using solid-state [no moving parts] memory. Spinning disks are affordable and have a lot of capacity, but they are not compact, generate heat, and are prone to failure.”

In addition to the persistent storage provided by hard-disk drives, a supercomputer has volatile memory chips on each compute node; tens of thousands of these compute nodes connected together make up a supercomputer. When the machine fails or is powered down, any data in that memory is lost. The compute nodes exchange information across a shared, internal communications network that has become increasingly congested and slow, given exponential growth in processing capacity and in the complexity of the problems supercomputers are asked to tackle.

Supercomputers also store data on external parallel file systems, which likewise requires communicating across a network. Being able to keep the massive volume of data exascale computers will produce in dedicated compute-node storage would yield a faster, more efficient system. No storage system available today will scale to the sheer number of compute nodes in future exascale systems.
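The scaling argument can be made concrete with a back-of-the-envelope sketch (the numbers below are illustrative assumptions, not figures from the article): a shared parallel file system has a roughly fixed aggregate bandwidth that gets divided among all nodes, while node-local solid-state storage adds bandwidth with every node added.

```python
# Illustrative sketch: per-node storage bandwidth under a shared parallel
# file system versus node-local solid-state storage. Both bandwidth figures
# are hypothetical assumptions for the sake of the comparison.

SHARED_FS_BANDWIDTH_GBS = 1_000   # assumed aggregate parallel file system bandwidth, GB/s
LOCAL_SSD_BANDWIDTH_GBS = 1       # assumed per-node solid-state device bandwidth, GB/s

def per_node_bandwidth(nodes: int, local: bool) -> float:
    """Storage bandwidth available to each compute node, in GB/s."""
    if local:
        # Node-local storage: each node keeps its own device's full bandwidth,
        # regardless of how many nodes the machine has.
        return float(LOCAL_SSD_BANDWIDTH_GBS)
    # Shared storage: the file system's fixed bandwidth is split across all nodes.
    return SHARED_FS_BANDWIDTH_GBS / nodes

for n in (1_000, 100_000):
    shared = per_node_bandwidth(n, local=False)
    local = per_node_bandwidth(n, local=True)
    print(f"{n:>7} nodes: shared {shared:.3f} GB/s per node, local {local:.3f} GB/s per node")
```

At 1,000 nodes the two designs look comparable, but at exascale node counts the shared system's per-node share collapses while the local design holds steady, which is the crux of the proposal.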

Anticipated to be 1,000 times more powerful than China's Tianhe-1A, the world's fastest computer, exascale supercomputers will be capable of tackling a wide variety of technological and scientific projects, from modeling and simulating safer nuclear reactors to predicting global climate patterns. President Barack Obama included $126 million in his 2012 budget proposal for exascale computing development.