Leakage Aware Design for Next Generation's SoCs
Roberto Zafalon, Director, EU R&D Projects
DATE 09 workshop, April 24th, 2009
Designing for Embedded Parallel Computing Platforms: Architectures Session

Outline
Market Application rush
Basics of CMOS Leakage Power consumption
Why bother with low power systems?
  Cost of heat removal: packaging and reliability
  Increased market share of mobile electronics
  Limitations of battery technology
Technology Scaling, Trends & Roadmap
Leakage Aware design strategies
Memory architectures
Conclusion
30 Years of Electronics Industry
CAGR:
  Semiconductor Capex: 17%
  Semiconductor Market: 15%
  Electronic Systems: 8%
  WW GDP: 3.4%

Market Application rush
[Figure: applications by year of introduction (2005 to 2015) against compute efficiency markers at 5 GOPS/W, 100 GOPS/W and 1 TOPS/W. Labelled applications include H264 encoding and decoding, mobile base-band, 802.11n, UWB, Gbit radio, A/V streaming, 3D gaming, 3D TV, gesture, image, sign, expression and emotion recognition, collision avoidance and autonomous driving.]
CMOS Roadmap: 3 main showstoppers
Pat Gelsinger, CTO Intel Corp.
Quote from DAC 04 Keynote: "Power is the only limiter!!"
CMOS Roadmap: 3 main showstoppers:
  1. Subthreshold Leakage Current (Ioff)
  2. Huge Process Variation Spread
  3. Interconnect Performance and Signal Integrity

A further quote, to start with
Roberto Zafalon, Low Power System Design mngr, STMicroelectronics
CLEAN-IP General Project Manager
Quote from CLEAN Press Release published by EETIMES on Jan 2006:
"Semiconductor industry urges to overcome the technology shortcomings for 65nm and below, and in particular, process variability and unreliability, as well as leakage currents. Industry needs to decrease power consumption of nanoelectronic devices, increase design productivity and thus make the raised SoC's complexity manageable."
Why bother with low power systems?
Practical market issue:
  Increasing market share of mobile, asking for longer cruising life
  Limitations of battery technology
Economic issue:
  Reducing packaging costs and achieving energy savings
Technology issue:
  Enabling the realization of high-density chips (heat poses severe constraints to reliability)

Electronic Technology Today: CMOS Convergence
CMOS technology dominates in modern ICs.
[Chart: dominant process technology by product class, 1960s to 2000s]
  Watch chip, calculator: PMOS -> CMOS
  SRAM: NMOS -> CMOS
  Microprocessor: NMOS -> CMOS
  FLASH: NMOS -> CMOS
  DRAM: PMOS -> NMOS -> CMOS
  Server/Mainframe: Bipolar ECL -> BiCMOS -> CMOS
CMOS at core of chip making still for many years
The theoretical limit for transistor gate length on silicon is around 1.5nm.
  Today's 65nm CMOS process has a gate length of 42nm, i.e. 28X larger than the theoretical limit!
  In 32nm, the gate length is 21nm, i.e. 14X above the limit.
The gate delay determines the fundamental speed of the logic. The theoretical limit is 0.04ps.
  Today's 65nm logic NAND2 is ~1ps, i.e. 24X slower!
Transistor density, i.e. the number of devices which can be squeezed into a chip, reaches its limit around 1.8 billion Tx per cm².
  Today's 65nm CMOS device is 7.5X larger! (i.e. 750 Kgate/mm² = 2.4M Tx/mm² = 240M Tx/cm²)
Performance, as measured by clock speed, fell off Moore's Law during the last decade, as designs moved to multi-processor computing architectures.
Source: ITRS, STM, IFX

Basics of CMOS Power Consumption
Power consumption of a CMOS gate: $P = P_{SW} + P_{SC} + P_{Lk}$
where:
  $P_{SW}$ = Switching (or dynamic) power.
  $P_{SC}$ = Short-circuit power.
  $P_{Lk}$ = Leakage (or stand-by) power.
In older technologies (0.25um and above), $P_{Lk}$ was marginal w.r.t. switching power:
  Switching power minimization was the primary objective.
In deep sub-micron processes, $P_{Lk}$ becomes critical:
  Leakage accounts for around 5-10% of the power budget at 180nm; this grows to 20-25% at 130nm and to 35-60% at 65nm.
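The three-term power breakdown above can be sketched numerically. This is a minimal illustration, not part of the original slides: the switching term uses the usual textbook model (activity x capacitance x Vdd^2 x frequency), leakage is taken as Vdd times a leakage current, and every numeric value is a hypothetical placeholder chosen only so that the leakage share lands near the 35% quoted for 65nm.

```python
def switching_power(activity, c_switched_f, vdd_v, freq_hz):
    """Dynamic power using the textbook model alpha * C * Vdd^2 * f (assumed here)."""
    return activity * c_switched_f * vdd_v ** 2 * freq_hz


def leakage_power(vdd_v, i_leak_a):
    """Static power modelled as Vdd times the total leakage current (assumed here)."""
    return vdd_v * i_leak_a


def power_breakdown(p_sw, p_sc, p_lk):
    """Total power P = P_SW + P_SC + P_Lk and the fraction due to leakage."""
    total = p_sw + p_sc + p_lk
    return {"total_w": total, "leakage_share": p_lk / total}


# Hypothetical block: all values below are illustrative, not measured data.
p_sw = switching_power(activity=0.1, c_switched_f=1e-9, vdd_v=1.0, freq_hz=500e6)
p_sc = 0.1 * p_sw  # short-circuit power assumed to be 10% of switching power
p_lk = leakage_power(vdd_v=1.0, i_leak_a=0.03)

breakdown = power_breakdown(p_sw, p_sc, p_lk)
print(f"total = {breakdown['total_w']:.3f} W, "
      f"leakage share = {breakdown['leakage_share']:.0%}")
```

Changing only `i_leak_a` shows how quickly the leakage share moves once the dynamic terms are fixed.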
Leakage Currents in Bulk CMOS
[Figure: bulk MOSFET cross-section (source, gate, drain, bulk) annotated with the leakage current components listed below]
  I_sub: Subthreshold current.
  I_gs, I_gb, I_gd: Gate oxide tunneling.
  I_jbs, I_jbd: Junction reverse current.
  I_GIDL, I_GISL: Gate induced drain and source leakage.
  I_ii: Impact ionization current.
Dominant leakage by device generation:
  Long Channel (L > 1 um): very small leakage
  Short Channel (L > 180nm, Tox > 30 Å): subthreshold leakage
  Very Short Channel (L > 90nm, Tox > 20 Å): subthreshold + gate leakage
  Nano-scaled (L < 90nm, Tox < 20 Å): subthreshold + gate + junction leakage

Technology Scaling
Smaller geometries
  Higher device density: smaller gate capacitance, yet many more gates/chip -> higher switched capacitance -> higher switching power.
  Higher clock frequencies: higher switching power.
Lower supply voltages: lower switching power, but also lower speed
  Lower threshold voltages -> exponential leakage
Consequence: Power density increases as technology scales!
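The "lower threshold voltages, exponential leakage" step can be made concrete with the standard subthreshold-swing model, in which leakage changes by one decade for every S millivolts of Vth. The sketch below is an illustration under that assumption; the 100 mV/decade swing and the Vth values are hypothetical and do not come from the slides.

```python
def subthreshold_leakage(i0, vth_mv, swing_mv_per_decade=100.0):
    """Off-current model: I_sub = I0 * 10^(-Vth / S).

    i0 is the extrapolated current at Vth = 0 (any unit); S is the
    subthreshold swing in mV/decade (100 mV/decade is an assumed value).
    """
    return i0 * 10.0 ** (-vth_mv / swing_mv_per_decade)


def leakage_increase(vth_old_mv, vth_new_mv, swing_mv_per_decade=100.0):
    """Factor by which leakage grows when Vth moves from old to new."""
    new = subthreshold_leakage(1.0, vth_new_mv, swing_mv_per_decade)
    old = subthreshold_leakage(1.0, vth_old_mv, swing_mv_per_decade)
    return new / old


# Hypothetical example: lowering Vth from 400 mV to 300 mV.
print(f"leakage grows by about {leakage_increase(400.0, 300.0):.1f}x")
```

Under this model a linear reduction of Vth multiplies leakage by a constant factor per step, which is why a few scaling steps compound so quickly.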
ITRS Roadmap 2007 vs Moore's law
[Figure: ITRS 2007 roadmap compared with Moore's law]

Squeezing costs of computing cores
ARM 9 core area by technology node:
  180 nm: 11.8 mm2
  130 nm: 5.2 mm2
  90 nm: 2.6 mm2
  65 nm: 1.4 mm2
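The ARM 9 areas above can be compared against a purely geometric shrink, where area would scale with the square of the node ratio. The sketch below does that arithmetic; the "ideal" column is an assumption introduced here for comparison and is not a figure from the slides.

```python
# ARM 9 core area per node, as listed on the slide (node in nm -> area in mm2).
areas_mm2 = {180: 11.8, 130: 5.2, 90: 2.6, 65: 1.4}


def shrink_report(areas):
    """Return (old_node, new_node, actual_ratio, ideal_ratio) per transition.

    ideal_ratio assumes area scales with (new_node / old_node)^2, which is
    a simplification: real cores also change libraries and layout rules.
    """
    nodes = sorted(areas, reverse=True)
    rows = []
    for old, new in zip(nodes, nodes[1:]):
        actual = areas[new] / areas[old]
        ideal = (new / old) ** 2
        rows.append((old, new, actual, ideal))
    return rows


for old, new, actual, ideal in shrink_report(areas_mm2):
    print(f"{old} nm -> {new} nm: actual x{actual:.2f}, geometric x{ideal:.2f}")

overall = areas_mm2[65] / areas_mm2[180]
print(f"180 nm -> 65 nm overall: x{overall:.3f}")
```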
VDD (no more) scaling is increasing the «power crisis»
[Figure: evolution of VDD (LSTP), in volts (0 to 5), versus technology node and year of production (ITRS)]
  Node (nm) / year: 700 / 1989, 500 / 1992, 350 / 1995, 250 / 1998, 180 / 2000, 120 / 2002, 90 / 2004, 65 / 2007, 45 / 2010, 32 / 2015
  5V plateau
  Regular decrease in 10 years: from 5V to 1.2V (x 0.7 per node)
  1.2V plateau, then 1.1V
  1V plateau?

Power Trend for microprocessors
Power density in Intel's microprocessors:
[Figure: power density (Watts/cm2, log scale from 1 to 1000) versus process generation (1.5μ down to 0.05μ) for i386, i486, Pentium, PentiumPro, Pentium II, Pentium III and P4 (P4 @ 1.4GHz, 75W), with reference levels for a hot plate, a nuclear reactor, a rocket nozzle and the Sun's surface]
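The "x 0.7 per node" rule quoted for VDD can be checked with a couple of lines of arithmetic. This sketch only uses the two endpoint voltages and the scaling factor from the slide; which nodes the transitions correspond to is not assumed here.

```python
import math


def vdd_after(v_start, factor, transitions):
    """Supply voltage after a number of node transitions at a fixed factor."""
    return v_start * factor ** transitions


def transitions_needed(v_start, v_end, factor):
    """Number of fixed-factor steps needed to go from v_start to v_end."""
    return math.log(v_end / v_start) / math.log(factor)


steps = transitions_needed(5.0, 1.2, 0.7)
print(f"5 V -> 1.2 V at x0.7 per node takes about {steps:.2f} transitions")
for n in range(5):
    print(f"after {n} transitions: {vdd_after(5.0, 0.7, n):.2f} V")
```

The result is very close to four transitions, which is consistent with the regular decrease described between the 5V and 1.2V plateaus.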
CMOS Logic Tech Overview
Source: STMicroelectronics

Gate total power
[Figure: total power per gate (nW, 0 to 4000) versus duty cycle (1%, 5%, 10%, 20%, 50%, 90%) at F = 500 MHz, for the 65LP/G, 45GS, 32LP and 32GP processes]
Source: STMicroelectronics
90/65/45nm Speed vs Leakage
Source: STMicroelectronics

Ioff/Ion for 32LP, 65LP and 65GP
[Figure: Ioff (pA/um, log scale from 1.E+01 to 1.E+06) versus Ion (uA/um, 200 to 1100) for PMOS and NMOS devices in 65LP 1.2V, 65GP 1.0V and 32LP 1.0V, with HVT, SVT, LVT and SLVT threshold options]
Source: STMicroelectronics
Technology Scaling
Increasing contribution of leakage power:
Example: ASICs [source: STMicroelectronics]
[Figure: power density (Watts/cm2, 0 to 150) at 250nm, 180nm, 130nm, 90nm and 65nm, split into leakage power and dynamic power]
Example: Microprocessors [source: Intel]
  Itanium 2: 180nm, 1.5V, 1.0GHz, 221MTx (core+cache)
  Itanium 3: 130nm, 1.3V, 1.5GHz, 410MTx (core+cache)
[Figure: share (0% to 100%) of leakage power, I/O power and dynamic power for Itanium 2 and Itanium 3]

SoC Requirements for MP platforms (1)
Processing performance is expected to grow more than 200x in the next 15 years.
SoC Requirements for MP platforms (2)
[Figure: number of PEs per chip, processing performance, and ND2's max switching frequency (normalized to 2007)]

MOS Capacity by Dimensions
[Figure: MOS capacity in wafers/week (x1000, 0 to 2400) per quarter from 3Q 05 to 2Q 08, split by feature size band from >=0.7µ down to <0.08µ]
Source: "Semiconductor Industry Association", Statistics Report 2008-Q2
Dynamic vs. Leakage Power
Source: ITRS Roadmap
[Figure: power (W, log scale from 10^-6 to 100) and physical gate length (nm, 0 to 300) versus year (1990 to 2020), showing dynamic power, sub-threshold leakage, gate-oxide leakage, the technology node, the dynamic/leakage cross-over and a possible trajectory for high-k dielectrics]

Semiconductor's Challenge
[Figure labels: Power; Sensors; RF; FPGA; Memory; Graphics; Computing; Wireless; OS, software; Protocols; Communication]
Moore's Law at Work!
Leakage crisis: Is it a technology issue only?
Trends:
  Nominal Vdd is stabilizing around 1V
  MOS Vth scales linearly to keep constant speed
  But leakage grows exponentially with Vth reduction!!
    Sub-threshold current from 100 to 1000 pA/um
    Gate leakage to become larger than sub-threshold
    Total static power from 21E-12 to 60E-12 W/transistor
  SOI has major disadvantages w.r.t. subthreshold reduction!

Leakage Aware design strategy includes
A. Gate/Circuit-level techniques
  Use of multiple Vth
    Dual-Vth design.
    Mixed-Vth (MVT) CMOS design.
    MTCMOS.
  Sleep transistor insertion / voltage islands
  State retention FFs
B. Techniques for memory circuits
  Cell state (stored value) determines exactly which transistors leak
  State-preserving techniques: only suitable choice for non-cache memories (e.g., scratchpad).
  State-destroying techniques: suitable for caches (can invalidate values).
C. Architectural techniques
  Adaptive Body Biasing (ABB).
  Adaptive Voltage Scaling (AVS).
  Vth hopping.
  Multiple VBB
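As an illustration of the dual-Vth idea listed under A, the sketch below swaps a gate to a high-Vth (slower, less leaky) cell whenever its timing slack covers the extra delay. It is a deliberately simplified stand-in, not the method of any specific tool: each gate's slack is treated independently, whereas a real flow would re-run timing analysis after each swap. The `Gate` fields and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    name: str
    slack_ps: float        # timing slack available at this gate
    hvt_penalty_ps: float  # extra delay if the gate is swapped to high-Vth
    lvt_leak_nw: float     # leakage of the low-Vth cell
    hvt_leak_nw: float     # leakage of the high-Vth cell


def assign_dual_vth(gates):
    """Pick HVT where slack covers the delay penalty, LVT elsewhere.

    Returns (assignment dict, total leakage in nW). Slack is treated per
    gate with no path sharing, which is a simplifying assumption.
    """
    assignment = {}
    total_leak_nw = 0.0
    for gate in gates:
        if gate.slack_ps >= gate.hvt_penalty_ps:
            assignment[gate.name] = "HVT"
            total_leak_nw += gate.hvt_leak_nw
        else:
            assignment[gate.name] = "LVT"
            total_leak_nw += gate.lvt_leak_nw
    return assignment, total_leak_nw


# Hypothetical netlist fragment.
gates = [
    Gate("g_critical", slack_ps=0.0, hvt_penalty_ps=20.0, lvt_leak_nw=10.0, hvt_leak_nw=1.0),
    Gate("g_relaxed", slack_ps=50.0, hvt_penalty_ps=20.0, lvt_leak_nw=10.0, hvt_leak_nw=1.0),
    Gate("g_tight", slack_ps=5.0, hvt_penalty_ps=20.0, lvt_leak_nw=10.0, hvt_leak_nw=1.0),
]
assignment, total_leak_nw = assign_dual_vth(gates)
print(assignment, f"total leakage = {total_leak_nw} nW")
```

Only the gate with spare slack moves to high Vth, so the critical path keeps its speed while non-critical logic stops paying the low-Vth leakage cost.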
Memory Driver
[Figure: static power dissipation (mW/cell, 0 to 0.0035) and dynamic power consumption per cell (mW/MHz, 0 to 8.E-07) for HP and LSTP devices, 2005 to 2018]

Low Leakage Memory Approaches
Leakage reduction techniques can be broadly classified in terms of how memory state is managed:
  State-preserving techniques:
    Memory cell value is preserved when in low-leakage state.
    Suitable choice for non-cache memories (e.g., scratch-pad).
  State-destroying techniques:
    Memory cell value is NOT preserved when in low-leakage state.
    Suitable only for caches (can invalidate values).
Tradeoff between:
  Residual leakage paid to preserve the state.
  Restoring the lost state from higher levels of the memory hierarchy.
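The tradeoff between residual leakage and state restoration can be expressed as a break-even idle time. The sketch below is a simplified energy model introduced here for illustration: a state-preserving mode pays a residual leakage power for the whole idle period, while a state-destroying mode pays a (lower) off-state power plus a one-time energy to refetch the lost data. All parameter names and values are hypothetical.

```python
def state_preserving_energy_j(p_residual_w, idle_s):
    """Energy spent holding the state at reduced leakage for idle_s seconds."""
    return p_residual_w * idle_s


def state_destroying_energy_j(p_off_w, idle_s, e_restore_j):
    """Energy spent when powered off, plus the cost of restoring the state."""
    return p_off_w * idle_s + e_restore_j


def break_even_idle_s(p_residual_w, p_off_w, e_restore_j):
    """Idle time beyond which destroying the state costs less energy."""
    if p_residual_w <= p_off_w:
        raise ValueError("residual power must exceed off-state power")
    return e_restore_j / (p_residual_w - p_off_w)


def cheaper_policy(p_residual_w, p_off_w, e_restore_j, idle_s):
    """Name the lower-energy option for a given idle interval."""
    keep = state_preserving_energy_j(p_residual_w, idle_s)
    drop = state_destroying_energy_j(p_off_w, idle_s, e_restore_j)
    return "state-preserving" if keep <= drop else "state-destroying"


# Hypothetical memory block: 2 uW residual leakage, ideal power gating,
# 1 nJ to refetch the block from the next memory level.
t_star = break_even_idle_s(p_residual_w=2e-6, p_off_w=0.0, e_restore_j=1e-9)
print(f"break-even idle time: {t_star * 1e6:.0f} us")
print(cheaper_policy(2e-6, 0.0, 1e-9, idle_s=1e-4))
print(cheaper_policy(2e-6, 0.0, 1e-9, idle_s=1e-2))
```

Short idle periods favour keeping the state; long ones favour dropping it, provided the memory is a cache that can legitimately invalidate its contents.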
Low Leakage Memory Approaches (cont.)
Circuit-level techniques:
  Modify internal structure of SRAM cells.
    Transistor size, P/N ratio, Vth, body bias.
    Additional transistors.
    Precharge policy tuning
  May possibly require specialized process (e.g., different Tox, Halo doping, multiple Vth).
Architectural techniques:
  Use system level information to determine conditions to drive portions of memory into low-leakage state.
  Portions of memory: bit lines, blocks, regions, etc.

Spatio-Temporal-Value Cache Partitioned Architecture (Outcome of CLEAN):
[Figure: cache split into N partitions; partition i has its own CM i, Tag i and Data i arrays, an Address i input and a Sleep i control]
SoC Design Grand Challenges (source: ITRS 2007)
MANAGEMENT OF OVERALL POWER
  Due to Moore's law, power management is the primary issue across most application segments.
  Needs to be addressed across multiple levels, especially system, design, and process technology.
MANAGEMENT OF LEAKAGE POWER
  Leakage currents increase by 10x per tech node.
  From system design requirements & improvements in CAD design tools, down to leakage and performance requirements for new architectures.

Subthreshold Leakage vs. Temperature
[Figure: Ioff (nA/μm, log scale from 1 to 10,000) versus temperature (30 to 110 C) for the 0.25 μm, 0.18 μm, 0.13 μm and 0.10 μm nodes]
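One reason subthreshold leakage rises with temperature is that the subthreshold swing is proportional to the thermal voltage kT/q. The sketch below computes that swing and the resulting leakage ratio between two temperatures. It is a first-order illustration only: the ideality factor and Vth are hypothetical, and the model ignores the additional temperature dependence of Vth and of the current prefactor, so it understates what the measured curves would show.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def thermal_voltage_v(temp_c):
    """Thermal voltage kT/q in volts at a temperature given in Celsius."""
    return K_BOLTZMANN * (temp_c + 273.15) / Q_ELECTRON


def subthreshold_swing_mv(temp_c, ideality=1.5):
    """Swing S = n * (kT/q) * ln(10), in mV/decade (n is an assumed value)."""
    return ideality * thermal_voltage_v(temp_c) * math.log(10) * 1000.0


def leakage_ratio(temp_hot_c, temp_cold_c, vth_mv=300.0, ideality=1.5):
    """Leakage at temp_hot over leakage at temp_cold, with Vth held fixed."""
    decades_cold = vth_mv / subthreshold_swing_mv(temp_cold_c, ideality)
    decades_hot = vth_mv / subthreshold_swing_mv(temp_hot_c, ideality)
    return 10.0 ** (decades_cold - decades_hot)


print(f"swing at 30 C:  {subthreshold_swing_mv(30.0):.1f} mV/decade")
print(f"swing at 110 C: {subthreshold_swing_mv(110.0):.1f} mV/decade")
print(f"leakage ratio 110 C vs 30 C: {leakage_ratio(110.0, 30.0):.1f}x")
```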
Thermal map of a Multi Processor SoC
[Figure: chip floorplan and steady state temperature map]
Some hot spots in steady state:
  Silicon is a good thermal conductor (only 4x worse than Cu) and temperature gradients are likely to occur on large dies
  Lower power density than on a high performance CPU (lower frequency and less complex HW)

Thermal Management Challenge
[Figure: BGA normalized cost (0 to 6) vs. thermal enhancement for organic (baseline), metal and ceramic packages, showing min and max cost. Source: STM Corporate Packaging]
BGA package, rough figures (cost-performance to high-performance):
  Max power density = 50 to 60 W/cm2
  Cost per pin = 0.25 to 1.1 /pin (~90 pins/cm2)
  Max pincount = 500 to 2500+
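The BGA power density limits quoted above lend themselves to a quick feasibility check. The sketch below computes average power density for a die and compares it with a package limit; the die areas are hypothetical, and an average figure says nothing about local hot spots.

```python
def power_density_w_per_cm2(power_w, die_area_mm2):
    """Average power density over the die (100 mm2 = 1 cm2)."""
    return power_w / (die_area_mm2 / 100.0)


def fits_package(power_w, die_area_mm2, limit_w_per_cm2=50.0):
    """True if the average density stays within the package limit.

    The default limit is the lower bound of the 50 to 60 W/cm2 range
    quoted for BGA packages; hot spots are not modelled.
    """
    return power_density_w_per_cm2(power_w, die_area_mm2) <= limit_w_per_cm2


# Hypothetical dies: a 30 W SoC and a 75 W CPU, both on 100 mm2.
for label, power_w in (("SoC", 30.0), ("CPU", 75.0)):
    density = power_density_w_per_cm2(power_w, 100.0)
    verdict = "within" if fits_package(power_w, 100.0) else "above"
    print(f"{label}: {density:.0f} W/cm2, {verdict} the 50 W/cm2 limit")
```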
Increased share of Mobile Phone Subscribers
Cellular Phones: GSM+CDMA
  The fastest growing communication technology of all time.
  The billionth subscriber was connected in Q1 2002.
[Figure: millions of subscribers (0 to 4000), 1998 to Q2-08]

Mobile Phones Regional Split at Q2-2008
  3665 M subscribers as of Q2-2008
  Mobile Broadband Network (HSPA) subscribers have reached 50 M, up from 11 M in 2007 (i.e. 4 M/month growth rate).
[Figure: GSM Regional Statistics Q2-2008, split over Asia Pac, Europe West, Europe East, Americas, Africa, Middle East and USA/CND; slice labels 194, 323, 412, 413, 280, 492 and 1547]
Cellular Phone's standby current
[Figure: cellular phone standby current]

Nomadik: ST's example of Mobile Multi-Media driver
[Figure labels: Audio, Video]
Nomadik: a flagship design for ultra low power!

...and not Only Mobile!
  20% of electrical energy consumed in Amsterdam is used for Telecom
  In the US, Internet is responsible for 9% of the electrical energy consumed nation-wide
    This grows to 13% with all computer applications
  Transferring 2 MBytes of data through the internet consumes the energy of 1 pound of coal (1 pound = 0.453 kg)
Source: 2000 CO2 conference, Amsterdam, NL
Complexity goes non-linear
[Figure: sources of growing complexity and cost: finance (SOXX chart, -50% to +50%), markets, integration, litho & DFM, SW complexity, IC design verification]
LINEAR APPROACH TO non linear PROBLEMS!

Conclusion
Semiconductor market is still CMOS dominated:
  Switching and leakage power.
  Leakage will become dominant for technology nodes below 65nm.
Leakage power optimization must be addressed from both technology and design points of view.
Many circuit-level techniques have been investigated recently:
  Not yet fully supported by commercial EDA tools.
Higher-level approaches are still in their infancy:
  Results are promising.
The electronics industry calls for a REVOLUTION!
Industry's Needs
  Ultra low power systems
  Ultra low power cognitive radio
  Energy scavenging systems
  Micro-Nano systems
  System On Wafer
  System In Package