COLSA Upgrades World's Largest Apple Cluster
- Written by: Writer
- Category: STORAGE
COLSA Corp. has completed upgrading the world's largest Apple-based cluster to high-speed Myricom interconnect technology. COLSA completed the migration of the 1536-node system from Gigabit Ethernet to Myricom's low-latency, 2 Gbps Myrinet-2000 technology in only two weeks, and has already achieved application-performance improvements averaging nearly forty percent.
Based in Huntsville, Alabama, COLSA Corp. provides engineering services, including the integration and operation of High-Performance Computing (HPC) systems, to numerous clients within the Department of Defense (DoD) and other federal agencies. The company built and operates the MACH5 Apple Xserve cluster under the U.S. Army Hypersonic Missile Technology (HMT) program, which studies hypersonic airflow dynamics for missile and air vehicle flight.
The MACH5 cluster is configured with the Myrinet-2000 technology on 1536 of its 1566 dual-G5 Apple Xserves, which together provide 3132 processors and more than 4.5 Terabytes of memory. In its current configuration, MACH5 is the largest and fastest cluster of Apple computers worldwide.
Within a year of MACH5's initial deployment, researchers' need for speed and efficiency began to exceed the capabilities of the cluster's Gigabit Ethernet interconnect. With customer needs increasing and an expansion of the cluster by several hundred nodes under discussion, COLSA elected to upgrade the cluster interconnect and began evaluating options in terms of performance and the capability for rapid, seamless migration of application programs.
"Our environment is unique, serving a division of the government that has intensive computing requirements, and our job is to ensure the highest efficiency HPC environment for researchers with huge scientific processing needs," says Mike Whitlock, Program Director of COLSA's Hypersonic Missile Technology team.
"As the HMT team's demand for processing power grew at an enormous rate, our CFD computations couldn't run effectively on the original network and we began looking at alternatives."
After evaluating several cluster-network options, Whitlock's team opted to replace MACH5's existing computational interconnect with Myrinet-2000, citing its superior scalability, efficiency, and the overall elegance of the solution.
"We sent out an RFQ and had three technologies respond, some very aggressive on pricing and all very aggressive with claims of what they could do," says Whitlock. "We ended up testing two pilot systems on site: Myrinet, which we were able to get fully operational ourselves within roughly a day, and a competing technology from another vendor that sent three engineers who stayed for four days to get an operational system working. It was an easy decision and it's worked out phenomenally."
Upon upgrading the MACH5 1536-node network within three weeks, COLSA engineers reported test results showing a 25-40% average increase in application performance for the CRAFT code, the cluster's most important day-to-day application. On the High-Performance Linpack (HPL) benchmark used to rank computers on the TOP500 supercomputer list, the MACH5 cluster achieved in excess of 16 Teraflops, nearly 66% of peak, which would have ranked 10th on the June 2005 TOP500 list.
Dr. Anthony DiRienzo, Executive Vice President of COLSA, underscores that the company's decision encompassed many criteria beyond simply speed. "We're operating production facilities under federal regulations, so our sourcing and selection processes include a matrix of requirements ranging from ease of installation and management to dependable uptime, performance, cost, footprint, cable management, and others," says DiRienzo. "We found Myrinet to have the most efficient solution by far. The Myrinet solution was 'plug-and-play.' The alternative solutions presented a more complex and time-consuming configuration utilizing thick, cumbersome cables that were difficult to install and maintain compared with a tidy set of Myrinet fiber cables neatly placed within the server racks. Just looking at the two solutions, it was obvious to us and to our customer that we made the right selection."
Myricom founder and Executive Vice President Dr. Nan Boden said the experience with COLSA has been mutually rewarding on numerous fronts. "Even in high-performance computing, it isn't every day that we get to work with a company whose real-world challenges are as sophisticated as those COLSA faces," says Boden. "It's been extremely rewarding for us to work with COLSA on the MACH5 cluster and to see them embrace Myrinet to the point where they're proposing Myrinet clusters to other government clients."
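The headline figures quoted in this article can be cross-checked with a little arithmetic. The Python sketch below is illustrative only: the constants are the numbers given in the text, the function names are hypothetical, and it assumes (the article does not say) that the HPL run used only the 1536 Myrinet-connected nodes. Because the quoted values are rounded bounds ("in excess of", "nearly", "more than"), the results are rough estimates, not measured figures.

```python
# Figures as quoted in the article (rounded bounds, not exact measurements).
TOTAL_NODES = 1566       # dual-G5 Xserves in MACH5
MYRINET_NODES = 1536     # nodes connected with Myrinet-2000
CPUS_PER_NODE = 2        # "dual-G5"
MEMORY_TB = 4.5          # "more than 4.5 Terabytes" (lower bound)
HPL_TFLOPS = 16.0        # "in excess of 16 Teraflops" (lower bound)
HPL_EFFICIENCY = 0.66    # "nearly 66% of peak" (upper bound)


def total_processors(nodes: int, cpus_per_node: int) -> int:
    """Processor count for a cluster of identical multi-CPU nodes."""
    return nodes * cpus_per_node


def implied_peak_tflops(measured_tflops: float, efficiency: float) -> float:
    """Theoretical peak implied by a measured result and its fraction of peak."""
    return measured_tflops / efficiency


def memory_per_node_gb(memory_tb: float, nodes: int) -> float:
    """Average memory per node, using decimal units (1 TB = 1000 GB)."""
    return memory_tb * 1000 / nodes


if __name__ == "__main__":
    cpus = total_processors(TOTAL_NODES, CPUS_PER_NODE)
    peak = implied_peak_tflops(HPL_TFLOPS, HPL_EFFICIENCY)
    # Assumption: HPL ran on the Myrinet-connected nodes only.
    hpl_cpus = total_processors(MYRINET_NODES, CPUS_PER_NODE)
    per_cpu_gflops = peak * 1000 / hpl_cpus

    print(f"Total processors:       {cpus}")
    print(f"Implied peak:           {peak:.1f} Tflops")
    print(f"Implied peak per CPU:   {per_cpu_gflops:.1f} Gflops")
    print(f"Memory per node (avg.): {memory_per_node_gb(MEMORY_TB, TOTAL_NODES):.2f} GB")
```

The processor count reproduces the article's 3132 exactly. The peak and per-CPU values depend on the rounded inputs and on the assumption about which nodes took part in the benchmark.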