ACADEMIA
Open Science Grid, built for physics, benefits widespread applications
Snaking cables and racks of computer processors with winking blue lights fill a room in Mayer Hall at the University of California, San Diego. Together they form a powerful resource, made more so through links to a network of more than 80 similar centers distributed across the country.
Called the Open Science Grid, the network connects processors and data storage owned by an alliance of universities and national laboratories to muster supercomputing power in response to spikes in individual research groups’ needs.
“It’s an ideal way of working,” said physics professor Frank Würthwein, who as co-principal investigator helps to develop protocols and mechanisms that allow users to share the resource. “You retain control of your own resources, with the ability to scale up in a reasonable amount of time to meet occasional large needs.”
Physicists created the Open Science Grid to handle the torrent of data produced by the Large Hadron Collider. Intersecting beams of protons gunning at high energies for months at a time generate trillions of collisions, and discoveries depend on the ability to trace the energy and trajectory of the resulting debris. Distributed, high-throughput computing, which the Open Science Grid provides, ideally suits the need to analyze this enormous number of independent events.
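Why independent events suit this kind of computing can be shown with a small sketch. It is an illustration only, not the Open Science Grid's software: the function names, the toy "events" (lists of energy values) and the use of local threads in place of remote computing sites are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_event(event):
    """Toy per-event analysis (hypothetical): total energy of the debris."""
    return sum(event)


def split_into_jobs(events, job_size):
    """Cut the event list into independent batches, one batch per job."""
    return [events[i:i + job_size] for i in range(0, len(events), job_size)]


def run_job(batch):
    """A job needs only its own batch, so it could run on any free core."""
    return [analyze_event(event) for event in batch]


def analyze_all(events, job_size=2, workers=4):
    """Farm the jobs out to a pool of workers and stitch the results together."""
    jobs = split_into_jobs(events, job_size)
    # Local threads stand in for remote sites (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_job_results = list(pool.map(run_job, jobs))  # map keeps job order
    return [energy for job in per_job_results for energy in job]


# Made-up events: each is a list of debris energies in arbitrary units.
events = [[1.0, 2.0], [3.5], [0.5, 0.5, 1.0], [4.0, 1.0], [2.5]]
totals = analyze_all(events)
```

Because no job depends on another job's result, adding workers shortens the wait without changing the answer, which is the property that lets a grid absorb a sudden spike in demand.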
“The Open Science Grid has accelerated our science,” said Würthwein, who runs the cluster: a roomful of 2,500 linked processing cores, disks that can store a petabyte of data, and a scheduler that allocates specific tasks to each core at particular times.
As the world waits to learn whether the Higgs boson exists, a great deal of physics has already emerged from the LHC. The CMS collaboration, which includes UC San Diego faculty members Jim Branson, Vivek Sharma, Avi Yagil and Würthwein along with scientists in their research groups, has generated more than 140 scientific papers. UC San Diego’s contributions include essential descriptions of events that form a background that must be understood in order to detect the Higgs.
The Open Science Grid offers supercomputing power to projects whose occasional or even one-time computational needs preclude investing in such a resource. In fact, the consortium recently received an additional $27 million in funding to continue to provide software and services to other scientists, including several groups at UC San Diego.
One is the Protein Data Bank, the single worldwide repository for three-dimensional structures of large molecules, based at Rutgers University in New Jersey and at the San Diego Supercomputer Center and Skaggs School of Pharmacy and Pharmaceutical Sciences at UC San Diego. Its researchers used the Open Science Grid to compare pairs of proteins, looking for structural similarities that might otherwise have gone unnoticed.
“This year we have used the Open Science Grid to calculate one billion alignments,” said Andreas Prlic, a senior scientist with the Protein Data Bank. “I don’t know of any other resource that would have allowed us to do that as quickly. Needless to say, we are huge Open Science Grid fans.”
Bruce Thayre, a staff research associate at UC San Diego’s Scripps Institution of Oceanography, is a fan as well. Thayre works with sounds recorded by underwater microphones deployed in the open ocean for up to a year at a time. The Scripps Whale Acoustics Lab, where Thayre works, has developed ways of picking out particular sounds, like those made by ships, from other ocean noises in these long recordings, which can amount to 8 terabytes of data.
By analyzing the timing of sounds identified by detection algorithms run on the Open Science Grid, scientists recently showed that certain types of sonar signals disrupt calling by foraging blue whales off the coast of California.
“This is only possible because of guys like Frank who are involved in the LHC,” Thayre said. “It’s a phenomenal tool for all types of scientists who wouldn’t otherwise have access to that level of computing.” In several days of computing time on the Open Science Grid, the lab accomplished what Thayre estimated would have taken months using local computing power.
Würthwein said he finds these far-ranging applications satisfying. “As an experimentalist and someone who invents and develops, and ultimately operates new ways of using computers to do science, I’m excited when these tools find widespread application across disciplines.”