Processor Triathletes Outrun Sprinters, Says Sun Design Center Chief
SAN JOSE, CA -- The rapid market growth of networked client-server computing, interacting with the physical, electrical and design productivity constraints faced by 50-million-plus transistor class microprocessors, is creating fundamental shifts in processor design philosophies, according to Dave Tuttle, senior director of the Sun Microsystems, Inc. (Nasdaq: SUNW) Austin, Texas Processor Design Center. In a keynote address titled ``Triathletes vs. Sprinters: Processor Design Directions for the New Millennium'' at the 2001 Microprocessor Forum, held here, Tuttle said personal computer marketing-driven imperatives to achieve GHz clock speeds at any cost will become increasingly less relevant to real-world computing due to changes in system requirements and application workloads. Furthermore, fundamental limitations imposed by transistor power consumption, the slower speeds of memory ICs and data buses, and the complexity of designing devices with the electronic equivalent of millions of ``moving parts'' will make ``sprinter'' processors harder to design and less worthwhile in terms of return on engineering investment.
``There's a big shift going on in processor design. The simplifying assumptions of the 1990's are running into the steep ends of trade-off curves in the areas of power consumption, system latencies, and the sheer complexity of 50-, 100-, and 200-plus million transistor microprocessor designs. At the same time, contemporary computing workloads are becoming increasingly able to exploit multiprocessing multithreaded networked systems. While 'Sprinter' processors that sacrifice everything to achieve the highest clock speed may have been one of the successful recipes during the last decade, 'Triathlete' processors that reconcile system-focused performance, optimum power density and fast time-to-market now have the upper hand.''
Four ``Meta Trends'' Impact Processor Design
Tuttle began his keynote address to a high-powered gathering of the semiconductor industry's technical and business elite by acknowledging the influence of long-time processor technology analyst Linley Gwennap's 1993 article in Microprocessor Report contrasting high-MHz ``Speed Demons'' with ``Brainiacs,'' which sought high levels of instruction-level parallelism. Tuttle's discussion of the current Sprinter-Triathlete competition updates this analysis to examine four Meta Trends that will rule processor design in the early 21st century.
The first Meta Trend focuses on processor power consumption and density. The last decade has seen substantial decreases in supply voltages along with great increases in transistor density. However, CMOS (complementary metal oxide semiconductor) technology will not experience the same rate of reductions in voltage over the next decade. This fundamental trend in physics is becoming an increasingly large engineering challenge for well-balanced designs, but it is even worse for Sprinter CPUs. Unbalanced processors not only require more expensive cooling at the chip, system and data center levels, but the aggregate effect of millions of computers with poor power/performance balance may be creating the equivalent of the 10-miles-per-gallon gas hog cars of the early 1970's. Under these circumstances, power consumption changes from a minor engineering consideration to a fundamental limit on silicon performance. On a macro level, computers being designed around such processors could be unnecessarily wasting the world's electricity supplies.
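The link between supply voltage, clock speed and power can be sketched with the textbook approximation for CMOS switching power, P = a * C * V^2 * f. This formula and the scaling factors below are illustrative assumptions, not figures from Tuttle's address; the function name relative_power is hypothetical.

```python
def relative_power(voltage_scale, frequency_scale, capacitance_scale=1.0):
    """Relative change in switching power under P = a * C * V^2 * f.

    All arguments are ratios of new value to old value. The activity
    factor 'a' is assumed unchanged, so it cancels out of the ratio.
    """
    return capacitance_scale * voltage_scale ** 2 * frequency_scale


# Doubling the clock while also halving the supply voltage (illustrative
# of the voltage reductions the 1990's enjoyed): power falls.
with_voltage_scaling = relative_power(voltage_scale=0.5, frequency_scale=2.0)

# Doubling the clock with the supply voltage held flat: power doubles.
without_voltage_scaling = relative_power(voltage_scale=1.0, frequency_scale=2.0)

print(with_voltage_scaling, without_voltage_scaling)
```

Under this simplified model, once voltage stops falling, every gain in clock frequency is paid for directly in power.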
The second Meta Trend involves the leveling off of the CMOS processing speed performance curve as process geometries and chip feature sizes shrink. CMOS is today's prevailing on-chip process technology and cannot be replaced without massive investment in as yet hypothetical post-CMOS chip manufacturing technology. To cite one example of unavoidable technical barriers lurking in CMOS technology, a number of the material layers deposited on chips for purposes such as insulation are approaching a few atoms thick. They simply can't get much thinner than that. Furthermore, at extremely small geometries, previously trivial performance inhibitors such as interfaces between silicon transistors and metallic interconnects can cause proportionately larger signal transmission delays or require greater design margins to operate reliably.
The third Meta Trend involves the margins of safety required when setting the frequency at which a chip will operate -- what technologists call state management overhead. Since it is impossible for millions of transistors to turn on or off at exactly the same time to synchronize with a system clock, designers must build in a margin to allow for delays, settling time, electrical noise, and surges experienced by transistors when moving from one state (on or off) to another. As clock speed rises, the proportion of an on-off cycle required to maintain this safety margin grows. As the industry has increased clock speeds over the past decade, the time lost to this margin has grown and can now reach 10 to 20 percent or more of a chip's cycle time.
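A rough way to see why the margin's share grows with clock speed is to treat it as a fixed amount of time per cycle. The sketch below makes that simplifying assumption; the 100-picosecond margin and the clock speeds are hypothetical values chosen for illustration, not data from the address.

```python
def cycle_time_ps(clock_mhz):
    """Length of one clock cycle in picoseconds."""
    return 1_000_000.0 / clock_mhz


def overhead_fraction(clock_mhz, fixed_overhead_ps):
    """Share of each cycle consumed by a fixed safety margin."""
    return fixed_overhead_ps / cycle_time_ps(clock_mhz)


# Hypothetical fixed margin of 100 ps at several clock speeds.
for clock_mhz in (500, 1000, 2000):
    share = overhead_fraction(clock_mhz, fixed_overhead_ps=100.0)
    print(f"{clock_mhz} MHz: {share:.0%} of the cycle is margin")
```

With these assumed numbers the same margin consumes 5 percent of the cycle at 500 MHz and 20 percent at 2 GHz, leaving proportionally less time for useful work.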
In the fourth Meta Trend, processor clock speed gains over the last two decades have outstripped the speed of system memory, input/output (I/O) channels, and communications bus architectures. Since system performance is only as good as the most limiting system bottleneck, when over-designed processors meet under-designed data paths, disappointment ensues. Anybody connecting a GHz-class PC to the Internet through a 56K modem will understand this problem.
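The bottleneck argument can be expressed as a simple model in which data passes through a chain of stages and the slowest stage sets the pace. This is a simplification that ignores caching and overlap, and apart from the 56K modem every rate below is an invented, illustrative figure.

```python
def system_throughput(stage_rates):
    """Throughput of a serial chain is capped by its slowest stage."""
    return min(stage_rates.values())


def bottleneck(stage_rates):
    """Name of the stage that limits the whole chain."""
    return min(stage_rates, key=stage_rates.get)


# Hypothetical data rates in bits per second for one path through a PC.
rates = {
    "processor": 8_000_000_000,   # assumed, for illustration only
    "memory bus": 1_000_000_000,  # assumed, for illustration only
    "modem": 56_000,              # the 56K modem from the example above
}

print(bottleneck(rates), system_throughput(rates))
```

In this model, raising the processor's rate changes nothing until the slower stages are improved.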
Beyond these physical and electronic Meta Trends, the effort -- measured in investments in engineering time, design tools and capital equipment -- required to productize microprocessors is growing faster than potential performance gains. Instead of managing complexity, designers find complexity managing them, and unless design complexity is carefully controlled, the time between new generations of products grows longer rather than shorter.
Mastering the Meta Trends
Tuttle said that while the relentless pursuit of clock speed was a technically challenging engineering feat during the 1990's, it did not ensure market success. Market and technical factors are converging to favor balanced, triathlete processor designs. Tuttle defines a triathlete processor as one that tunes clock speed in the interest of optimizing power density, overall fit with system requirements and data bandwidth, and expeditious design and productization processes.
The changing market for computing technology bodes well for triathlete processors. Networked client/server computing has emerged as the high growth technology segment for this decade. Applications and computing workloads have followed suit, emphasizing needs to reliably process many tasks in parallel.
System requirements for these machines include a balanced portfolio of processor features to achieve high, multithreaded processing bandwidth; instant access to vast quantities of memory; linear scalability to enable efficient hundreds-of-processor systems; and rock-solid reliability, availability and serviceability. Rather than budgeting transistors simply to achieve high clock speeds, the multidimensional system requirements of networked client/server systems call for balanced budgeting of processor transistors to achieve multi-axis system performance goals -- by definition an event best suited to triathletes rather than sprinters.
According to Tuttle: ``Every advance in process geometry puts millions more transistors at the disposal of designers. But transistors aren't free. They generate costs in the forms of heat, the need to interconnect them with other transistors, and the arduous design work required to make them do something useful and perform flawlessly over their lifetime. It's a big mistake to chase after the rainbow of maximum clock speed while leaving system factors and design productivity behind. Going forward, both designers and processors will have to become triathletes to deliver the best possible performances in the areas of processing density, overall fit with system requirements and catching make-or-break time-to-market windows.''