ACADEMIA
Smaller, Slower Supercomputers Someday May Win the Race
LOS ALAMOS, NM -- The supercomputers of the future will never crash and will cost far less to run than today's machines. At least that's the vision of a scientist at the National Nuclear Security Administration's Los Alamos National Laboratory. "Everyone's fixed on the mantra of performance at all costs," said Wu Feng of Los Alamos' Advanced Computing Laboratory. "What we've done is redefine the price-to-performance ratio to look at efficiency, reliability and availability, in other words, total cost of ownership."
Feng and colleagues Michael Warren and Eric Weigle developed the first of this new breed of high-performing, low-cost computers, which they named "Green Destiny." The machine has been operating with unprecedented stability and performance efficiency for more than eight months in a dusty warehouse where temperatures routinely reach 85 degrees Fahrenheit.
Feng, Warren and Weigle argue that the costs of computing should include electrical power, infrastructure, air conditioning, floor space, time lost to system failures and salaries for the people needed to keep finicky machines operating. Supercomputers of the future may very well be similar to Green Destiny, they say: small, extremely stable and miserly in their power use.
Green Destiny represents a new "flavor" of supercomputer, Feng said. The machine packs 240 Transmeta processors running at 667 megahertz, each mounted on a half-inch-slim compact motherboard, or blade. Twenty-four blades mount into an RLX Technologies System 324 chassis, and ten chassis, along with network switches, mount in a standard computer rack.
Currently computing at a peak rate of 160 billion operations per second, Green Destiny uses less than ten percent of the electricity and twenty-five percent of the space of the previous generation of so-called cluster computers while delivering comparable performance. More important is reliability, Feng said.
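The quoted peak rate is consistent with a simple back-of-the-envelope calculation from the hardware figures given above. The sketch below is illustrative only: it assumes each processor completes one operation per clock cycle, which the article does not state, and the function names are hypothetical.

```python
# Figures quoted in the article.
BLADES_PER_CHASSIS = 24
CHASSIS_PER_RACK = 10
CLOCK_HZ = 667e6  # 667 megahertz

# Assumption (not stated in the article): one operation per clock cycle.
OPS_PER_CYCLE = 1


def processor_count(blades_per_chassis=BLADES_PER_CHASSIS,
                    chassis=CHASSIS_PER_RACK):
    """One processor per blade, so processors = blades x chassis."""
    return blades_per_chassis * chassis


def peak_ops_per_second(processors, clock_hz=CLOCK_HZ,
                        ops_per_cycle=OPS_PER_CYCLE):
    """Theoretical peak: every processor busy on every clock cycle."""
    return processors * clock_hz * ops_per_cycle


processors = processor_count()
peak = peak_ops_per_second(processors)
print(f"{processors} processors, "
      f"about {peak / 1e9:.0f} billion operations per second")
```

Under that assumption, 24 blades in each of 10 chassis gives 240 processors, and 240 processors at 667 megahertz gives roughly 160 billion operations per second, matching the figure in the text.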
"As the push for performance goes up, so does the power consumption. And system failure is directly proportional to power consumption," he pointed out. "If your machine isn't available all the time, then you can't do any computing."
In fact, unpublished empirical data from computer vendors indicate that failure rates double for every 10-degree Celsius rise in processor temperature. Typical computing-intensive businesses depend on hundreds or even thousands of identical servers to handle multiple requests for information simultaneously. When the servers go down, hourly losses can range up to $6.5 million for a large brokerage firm.
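The vendor rule of thumb quoted above can be written as a small formula. This is a sketch, not the vendors' model: it assumes the doubling extends smoothly as an exponential between and beyond 10-degree steps, and the function name is hypothetical.

```python
def relative_failure_rate(temp_increase_c, doubling_step_c=10.0):
    """Failure rate relative to baseline, assuming it doubles for every
    `doubling_step_c` degrees Celsius of added processor temperature.

    Assumption: the doubling rule is extrapolated as a smooth exponential.
    """
    return 2.0 ** (temp_increase_c / doubling_step_c)


for delta in (0, 10, 20, 30):
    rate = relative_failure_rate(delta)
    print(f"+{delta} C: {rate:.0f}x the baseline failure rate")
```

Read this way, a processor running 30 degrees Celsius hotter would be expected to fail about eight times as often, which is why cooler chips matter so much for availability.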
Green Destiny, whose processors operate roughly one-tenth as hot as market-leading chips, has been running continuously since September without air filtration or special cooling. In fact, it kept humming even with the fans removed. "It's absolutely rock solid," Feng said. "It's so reliable we only keep one spare blade around, and we have never needed it."
In a recent paper, available online at http://public.lanl.gov/feng/Bladed-Beowulf.pdf, Feng predicts, based on Moore's law, that the drive for increased performance will result in "the microprocessor of 2010 having over one billion transistors and dissipating over one kilowatt of thermal energy; this is considerably more energy per square centimeter than even a nuclear reactor."
Beowulf clusters, developed at NASA in the early 1990s, group commodity processors with commercial switches and have attracted much attention because they're able to handle many computations simultaneously. A larger version of the Green Destiny Bladed Beowulf cluster would require far less space than a traditional Beowulf cluster. Putting 2,000 of the bladed machines together could yield an enormous savings in space, and in costs, with floor space in Silicon Valley renting for more than $150 a square foot.
Feng argued that with all these factors taken into account, the true price-to-performance rating for Green Destiny would be at least twice as good as other supercomputers.
Internet pioneer Gordon Bell, software guru Linus Torvalds and other guests visited Los Alamos Laboratory recently to learn more about Green Destiny and the Supercomputing in Small Spaces project, whose web site is online at http://sss.lanl.gov.
Stephen Lee, acting deputy leader of Los Alamos' Computer and Computational Sciences Division, said Green Destiny represents a promising research advance, but emphasized the national need for large platforms that are uniquely able to move huge amounts of data in and out of memory rapidly, such as Los Alamos' Q machine, developed for NNSA's Advanced Simulation and Computing program, or ASCI.
"This could be the next important step in scalable supercomputing, but the challenge of maintaining the nation's nuclear stockpile in the face of aging weapons, eroding expertise and nearly a decade without nuclear testing demands three-dimensional, full physics computing on tera-scale computers today, while designers and engineers with weapon test experience are still available to validate the ASCI simulations," Lee said.
The best use for machines like Green Destiny might be in the inexpensive development of scientific codes for a wide range of applications, Feng and Lee said. Once the code has been developed and stabilized, it could move to an ASCI-style supercomputer.
Feng's team at Los Alamos originally bought the machine from RLX Technologies to host large volumes of data. After several delays in compiling the data, they decided to make a cluster instead and tested it with some high-performance applications, such as Warren's simulation of the beginnings of the universe and his three-dimensional model of supernovae. Among planned future jobs for Green Destiny are global climate modeling, large-scale molecular dynamics, computational fluid dynamics and bioinformatics.
"At first, we did not think that there was anything particularly novel about this," Feng said. "We showed it to fellow researchers at a supercomputing conference last November, and we saw more than 7,000 hits on our web site the following week. This project has taken on a life of its own."
The Transmeta Crusoe processor provides about 75 percent of the performance of similarly clocked chips from a major manufacturer used in other Beowulf clusters. So Green Destiny might be compared to the tortoise, eventual winner of the fabled race with the speedy hare.
"If you want to solve a problem that you can complete on a traditional supercomputer faster than the mean time between failures, I guarantee we'll run slower than a traditional supercomputer or virtually any other cluster," Feng said. "But if you're running something that takes weeks or months, eventually the stable machine will win the race."