NCSA to Deploy IBM's GPFS for All Supercomputing Systems
- Written by: Webmaster
- Category: SCIENCE
The National Center for Supercomputing Applications (NCSA) will soon employ IBM's General Parallel File System (GPFS) across all its supercomputing platforms, including the upcoming sustained-petaflop Blue Waters system. GPFS is a high-performance, scalable clustered file system that gives applications running on multiple nodes of a cluster reliable, concurrent, high-speed access to files. GPFS also greatly simplifies cluster file system administration and includes tools capable of managing petabytes of data and billions of files.
"A high-performance, parallel, facility-wide file system has been our vision for a long time," said Bill Kramer, the deputy project director for Blue Waters. "This is a fundamental enabler of future data-focused activities at NCSA and Illinois. This allows us to be at the forefront of data-intensive science."
Currently, each of NCSA's supercomputers has its own file system. Researchers who want to work on multiple systems, or move from one to another, must transfer their data manually, typically making multiple copies. This is time-consuming for the researchers and consumes both energy and storage space.
With the new site-wide GPFS environment, NCSA can provide a common file system and namespace across computing platforms. Researchers can seamlessly access their data from whichever supercomputer they are using without needing to transfer it from another compute resource. This streamlines the research process and is more cost- and energy-efficient because NCSA doesn't have to purchase additional disk space to store duplicate files. And by pooling its storage purchases, NCSA can build a much larger shared data infrastructure.
In addition, the data is available in a highly parallel manner, making access to massive amounts of data much faster.
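To illustrate what parallel access to a single file can look like from an application's point of view, here is a minimal Python sketch that reads disjoint byte ranges of one file concurrently and reassembles them. It is not taken from NCSA or IBM and does not use any GPFS-specific interface: it relies only on standard POSIX-style calls available in Python on Unix-like systems, and the helper names `read_range` and `parallel_read` are hypothetical. The assumption is that the path would point to a file on a shared parallel file system mount, where independent range reads can be served concurrently.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_range(path, offset, length):
    """Read up to `length` bytes starting at `offset`, using its own descriptor."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = bytearray()
        # pread may return fewer bytes than requested, so loop until done or EOF.
        while len(buf) < length:
            piece = os.pread(fd, length - len(buf), offset + len(buf))
            if not piece:  # reached end of file
                break
            buf += piece
        return bytes(buf)
    finally:
        os.close(fd)


def parallel_read(path, workers=4):
    """Split the file into contiguous byte ranges and read them concurrently."""
    size = os.path.getsize(path)
    chunk = max(1, -(-size // workers))  # ceiling division, at least 1 byte
    offsets = range(0, size, chunk)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in offset order, so the join restores the file.
        parts = pool.map(lambda off: read_range(path, off, chunk), offsets)
        return b"".join(parts)
```

On a single local disk this pattern may bring little benefit. The point of the sketch is only that each worker touches a separate region of the same file, which is the access pattern a parallel file system is designed to serve from many nodes at once.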
"There are many applications today that generate very high output--some applications could generate a petabyte of data for just one large simulation," said Robert Fiedler, the technical program manager for Blue Waters science and engineering applications. "The IBM GPFS will provide the performance, reliability and efficiency we needed for large-scale I/O requirements."
The NCSA GPFS system will be integrated with IBM's High Performance Storage System (HPSS) to further enhance the storage infrastructure.
As part of the agreement between the University of Illinois and IBM, NCSA will serve as the front-line "help desk" for GPFS, assisting other users across campus in deploying and using the technology for their projects. According to Kramer, GPFS will be attractive to other campus research efforts that produce large amounts of data, such as life science, energy, and astronomy research.
"In the past, the options for file systems and support have been costly or have required full-time on-staff experts," said Michelle Butler, leader of NCSA's Storage Enabling Technologies group. "The GPFS multi-system offering allows NCSA and Illinois to use one of the best file systems in the world today at reasonable cost on all clusters, promoting shared file systems such as the one NCSA will provide across all its compute platforms."
Illinois' licensing agreement with IBM includes volume discounts, so as more server licenses are acquired by the campus, the price drops.
"The driving factor for this agreement was the Blue Waters system. It provided the critical mass to make such a novel agreement between Illinois and IBM attractive in a cost-effective manner," Kramer said. "We are the first institution to reach an agreement with IBM to do this with all machines across all architectures at full scale."
"GPFS was invented by IBM Research more than a decade ago and, thanks to continuous investment, now forms the basis for some of IBM's most innovative high-performance storage solutions," said Dave Jursik, vice president of Deep Computing for IBM. "Through this agreement with the University of Illinois and others like it, GPFS is becoming widely adopted across the HPC community."