SCIENCE
DRC Computer Establishes Stunning Genomics World Record
First Highly Scalable Gene Sequence Analysis Appliance Delivering Multi-Trillion Cell Updates per Second Running on Microsoft Windows HPC Server 2008 R2
DRC Computer Corporation (DRC) has achieved 9.4 trillion cell updates per second (TCUPS) running the Smith-Waterman algorithm with an affine gap model on the latest DRC Accelium coprocessors. Medical researchers, pharmacologists and DNA forensic experts can now analyze human gene sequences faster and more effectively to identify medical conditions, build new treatments and complete criminal investigations.
"As the volume and intensity of research and commercial gene sequence analysis increases, it is critical to have systems that can scale to high volume analysis using standard, cost-effective servers," commented Dr. Michael Schatz, Assistant Professor of Quantitative Biology at Cold Spring Harbor Laboratory.
Previously, gene sequence analysis was costly and slow, requiring expensive high-performance servers. Now the time and cost to complete an analysis can be reduced by a factor of 20 using standard Intel-based servers fitted with DRC Accelium processors and running Microsoft Windows HPC Server 2008 R2. This not only shortens analysis time but also cuts the computing cost, power, floor space and infrastructure (such as air conditioning) required to obtain the results by more than 90%. Most importantly, the price/performance is more than 5 times better than any other published result.
“Microsoft is committed to making technical computing simpler and more affordable for a broader audience. This world record by DRC demonstrates the scalability of Windows HPC Server as a platform for the most powerful algorithms and addresses the high performance computing needs in the biomedical field,” said Bill Hamilton, director in the Technical Computing group at Microsoft.
The Smith-Waterman algorithm is widely used in bioscience to align DNA and protein sequences and is considered to deliver the highest accuracy of any alignment algorithm; however, it is computationally intensive. Using a novel design, DRC engineers have implemented the algorithm on the massively parallel DRC processor. DRC achieved a performance of 9.4 trillion CUPS running 200 base-pair DNA reads against a 650,000,000-nucleotide database. The benchmark ran on a cluster of standard servers, each incorporating multiple DRC processors, operating as a cloud computing environment. On a single DRC Accelium processor, DRC achieved 530 billion CUPS. DRC used the SSEARCH35 tool within the FASTA genomics toolkit to benchmark performance. This highly scalable architecture can extend well beyond the configuration used in this benchmark to build a massive cloud service capability.
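For readers unfamiliar with the metric, the following is a minimal software sketch of Smith-Waterman local alignment with an affine gap model (the Gotoh recurrence), written in Python. It is not DRC's hardware design, and it is not the SSEARCH35 implementation. The scoring values (match, mismatch, gap_open, gap_extend) and the function names are illustrative assumptions, and the helper seconds_per_read assumes the usual definition of a cell update as one evaluation of the scoring recurrence for one query/database position pair, which the release does not spell out.

```python
NEG_INF = float("-inf")


def smith_waterman_affine(query, target, match=2, mismatch=-1,
                          gap_open=-3, gap_extend=-1):
    """Return (best local alignment score, number of cell updates).

    Assumed scoring convention: a gap of length k costs
    gap_open + (k - 1) * gap_extend. All scoring values are illustrative.
    """
    n = len(target)
    h_prev = [0] * (n + 1)        # H: best score of an alignment ending at (i-1, j)
    f_prev = [NEG_INF] * (n + 1)  # F: best score ending with a gap that skips a query base
    best = 0
    cells = 0
    for i in range(1, len(query) + 1):
        h_cur = [0] * (n + 1)
        f_cur = [NEG_INF] * (n + 1)
        e = NEG_INF               # E: best score ending with a gap that skips a target base
        for j in range(1, n + 1):
            # Either open a new gap from H or extend the existing one.
            e = max(h_cur[j - 1] + gap_open, e + gap_extend)
            f_cur[j] = max(h_prev[j] + gap_open, f_prev[j] + gap_extend)
            s = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: scores are clamped at zero.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
            cells += 1            # one cell update per (i, j) pair
        h_prev, f_prev = h_cur, f_cur
    return best, cells


def seconds_per_read(read_length, database_length, cups):
    """Time to score one read against the whole database at a given CUPS rate."""
    return read_length * database_length / cups


if __name__ == "__main__":
    print(smith_waterman_affine("AAAATTTT", "AAAACCTTTT"))
    # Figures taken from the benchmark description above.
    print(seconds_per_read(200, 650_000_000, 9.4e12))
```

Under that assumed definition, one 200 base-pair read against 650,000,000 nucleotides requires 130 billion cell updates, so the quoted 9.4 trillion CUPS corresponds to roughly 14 milliseconds per read, or about 72 reads per second.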
“This new world record establishes that DRC processors running the Smith-Waterman algorithm can scale to meet the most challenging bioscience requirements,” said Mark O’Hare, DRC Chairman and CEO. “What really surprises the experts is the outstanding price/performance.”