CMPEN 411 VLSI Digital Circuits, Spring 2012
Lecture 22: Memory, ROM
[Adapted from Rabaey's Digital Integrated Circuits, Second Edition, © 2003 J. Rabaey, A. Chandrakasan, B. Nikolic]
Sp12 CMPEN 411 L22 S.1
Memory Definitions
  Size: Kbytes, Mbytes, Gbytes, Tbytes
  Speed:
    Read access: delay between the read request and the data becoming available
    Write access: delay between the write request and the writing of the data into the memory
    (Read or write) cycle: minimum time required between successive reads or writes
  [Figure: read and write timing diagrams showing the read cycle, read access, write cycle, write setup, write access, data valid, and data written.]
A Typical Memory Hierarchy
  By taking advantage of the principle of locality, we can present the user with as much memory as is available in the cheapest technology at the speed offered by the fastest technology.
  [Figure: memory hierarchy. On-chip components (control, datapath, RegFile, ITLB, DTLB, instruction cache, data cache, eDRAM), then second-level cache (SRAM), main memory (DRAM), and secondary memory (disk).]
  Speed (ns), fastest to slowest level: 0.1's, 1's, 10's, 100's, 1,000's
  Size (bytes), smallest to largest level: 100's, K's, 10K's, M's, T's
  Cost: highest to lowest
More Memory Definitions
  Function: functionality, nature of the storage mechanism
    static and dynamic; volatile and nonvolatile (NV); read only (ROM)
  Access pattern: random, serial, content addressable
  Classification:
    Read-Write Memories (RWM), random access: SRAM (cache, register file); DRAM (main memory)
    Read-Write Memories (RWM), non-random access: FIFO, LIFO; shift register; CAM
    NVRWM: EPROM, EEPROM, FLASH
    ROM: mask-programmed ROM; electrically programmed PROM
  Input-output architecture: number of data input and output ports (multiported memories)
  Application: embedded, secondary, tertiary
Random Access Read Write Memories
  SRAM: Static Random Access Memory
    data is stored as long as supply is applied
    large cells (6 FETs/cell), so fewer bits/chip
    fast, so used where speed is important (e.g., caches)
    differential outputs (BL and !BL)
    uses sense amps for performance
    compatible with CMOS technology
  DRAM: Dynamic Random Access Memory
    periodic refresh required (every 1 to 4 ms) to compensate for the charge loss caused by leakage
    small cells (1 to 3 FETs/cell), so more bits/chip
    slower, so used for main memories
    single-ended output (BL only)
    needs sense amps for correct operation
    not typically compatible with CMOS technology
Evolution in DRAM Chip Capacity
  [Figure: Kbit capacity per chip (log scale) versus year, 1980 to 2010, showing 4X growth every 3 years.]
  Capacity points (Kbit): 64, 256, 1,000, 4,000, 16,000, 64,000, 256,000, 1,000,000, 4,000,000, 16,000,000, 64,000,000
  Process generations marked on the curve: 1.6-2.4 µm, 1.0-1.2 µm, 0.7-0.8 µm, 0.5-0.6 µm, 0.35-0.4 µm, 0.18-0.25 µm, 0.13 µm, 0.1 µm, 0.07 µm
  Reference capacities marked for comparison: page, book, encyclopedia, 2 hrs CD audio, 30 sec HDTV, human memory, human DNA
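The 4X-every-3-years trend can be checked against the capacity points on the plot. The sketch below assumes the 64 Kbit point lines up with 1980, the first tick on the year axis; the function name `dram_capacity_kbit` and its parameters are hypothetical, chosen only for illustration.

```python
def dram_capacity_kbit(year, base_year=1980, base_kbit=64):
    """Project DRAM capacity per chip, assuming 4X growth every 3 years.

    base_year and base_kbit are assumptions read off the first plot point.
    """
    generations = (year - base_year) / 3  # one generation = 3 years
    return base_kbit * 4 ** generations


# Print the projection for a few of the years on the axis.
for year in (1980, 1983, 1986, 1995, 2010):
    print(year, round(dram_capacity_kbit(year)), "Kbit")
```

The plot labels are rounded (1,024 Kbit appears as 1,000), so the projected 2010 value of 67,108,864 Kbit corresponds to the 64,000,000 label.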
Memory Timing: Approaches
  DRAM timing: multiplexed addressing
  SRAM timing: self-timed
6-Transistor SRAM Storage Cell
  [Figure: 6T SRAM cell with transistors M1 to M6, word line WL, storage nodes Q and !Q, and bit lines BL and !BL.]
  The next lecture covers how the cell works in detail.
1D Memory Architecture
  [Figure: two arrays of N words (Word 0 to Word N-1), each M bits wide, with a shared input/output. In one, N select signals S_0 to S_N-1 drive the words directly; in the other, a decoder generates the select signals from address bits A_0 to A_K-1.]
  N words require N select signals.
  A decoder reduces the number of inputs to K = log2 N.
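The decoder relation K = log2 N can be illustrated with a small behavioral model. This is a minimal sketch that assumes N is a power of two; the function names `address_bits` and `decode` are hypothetical.

```python
def address_bits(n_words):
    """Return K = log2(N) for a memory of N words (N must be a power of two)."""
    if n_words < 1 or n_words & (n_words - 1):
        raise ValueError("n_words must be a power of two")
    return n_words.bit_length() - 1


def decode(address, n_words):
    """Return the N one-hot select signals S_0..S_N-1 for a K-bit address."""
    if not 0 <= address < n_words:
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(n_words)]


print(address_bits(1024))   # address bits needed for 1024 words
print(decode(2, 4))         # select signals for word 2 of 4
```

Without the decoder, a 1024-word memory would need 1024 external select inputs; with it, the same memory needs 10 address inputs.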
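2D Memory Architecture
  [Figure: array of storage (RAM) cells with 2^(K-L) rows and M*2^L columns; word lines (WL) run along the rows and bit lines (BL) along the columns.]
  Row address bits A_L, A_L+1, ..., A_K-1 select one of the 2^(K-L) word lines.
  Column decoder (address bits A_0, A_1, ..., A_L-1): selects the appropriate word from the memory row
  Sense amplifiers: amplify the bit line swing
  Read/write circuits
  Input/output (M bits)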
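The address split in the 2D organization can be sketched as follows, assuming the low L address bits feed the column decoder and the remaining K-L bits select the row, as the address labels on the slide suggest. The function names are hypothetical.

```python
def split_address(address, k, l):
    """Split a K-bit address into (row, column) fields.

    Assumes bits A_0..A_L-1 select the column and A_L..A_K-1 select the row.
    """
    if not 0 <= address < (1 << k):
        raise ValueError("address does not fit in k bits")
    column = address & ((1 << l) - 1)  # low L bits
    row = address >> l                 # remaining K-L bits
    return row, column


def array_shape(k, l, m):
    """Return (rows, columns) of the cell array: 2^(K-L) by M*2^L."""
    return 1 << (k - l), m << l


print(split_address(0b101101, k=6, l=2))
print(array_shape(k=3, l=1, m=2))
```

For example, K = 3, L = 1, and M = 2 give a 4x4 array holding eight 2-bit words.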
3D (or Banked) Memory Architecture
  [Figure: banked memory organization with M-bit input/output.]
  Advantages:
  1. Shorter word and bit lines, so faster access
  2. The block address activates only one block, saving power
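2D 4x4 SRAM Memory Bank
  [Figure: 4x4 SRAM array storing 2-bit words. Address bits A_1 and A_2 select word lines WL[0] to WL[3]; address bit A_0 drives the column decoder. Also shown: read precharge, bit line precharge with precharge enable, bit line pairs BL and !BL (BL_i, BL_i+1), sense amplifiers, write circuitry, and clocking and control.]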
Quartering Gives Shorter WLs and BLs
  [Figure: array split into quadrants, each with its own precharge circuit, read precharge, write circuitry, sense amps, and column decoder; address bits A_0 to A_i-1 and A_i to A_N-1; shared data lines.]
Decreasing Word Line Delay
  Drive the word line from both sides (a driver at each end of the polysilicon word line, connected by a metal word line)
  Use a metal bypass (a metal line running in parallel with the polysilicon word line)
  Use silicides
Read Only Memories (ROMs)
  A memory that can only be read and never altered
  Used for programs for fixed applications that, once developed and debugged, never need to be changed, only read
  Fixing the contents at manufacturing time leads to small and fast implementations.
  [Figure: example ROM cells on a word line (WL) and bit line (BL), two storing BL = 1 and two storing BL = 0.]
MOS OR ROM Cell Array
  [Figure: 4x4 MOS OR ROM array with word lines WL(0) to WL(3), bit lines BL(0) to BL(3), cell transistors tied to V_DD, and a predischarge device on each bit line. The example shows WL(1) driven high with two cell transistors on.]
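Precharged MOS NOR ROM
  [Figure: 4x4 precharged MOS NOR ROM array with word lines WL(0) to WL(3), bit lines BL(0) to BL(3), precharge devices to V_DD controlled by a precharge enable, and cell transistors tied to GND. The example shows WL(1) driven high with two cell transistors on.]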
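A read of the precharged NOR ROM can be modeled behaviorally: every bit line starts at 1 after precharge, and a cell transistor under the selected word line pulls its bit line to GND. The sketch below is an illustration only; the transistor map is hypothetical and is not the data pattern drawn on the slide.

```python
# Hypothetical programming: True means a pull-down transistor is present
# at the crossing of WL(row) and BL(column).
TRANSISTORS = [
    [True,  False, True,  False],
    [False, True,  True,  False],
    [True,  False, False, True],
    [False, False, False, False],
]


def read_nor_rom(transistors, selected_row):
    """Return the bit line values after selecting one word line."""
    bit_lines = [1] * len(transistors[0])       # precharge all bit lines to V_DD
    for column, present in enumerate(transistors[selected_row]):
        if present:                             # transistor turns on when WL is high
            bit_lines[column] = 0               # bit line discharges to GND
    return bit_lines


for row in range(len(TRANSISTORS)):
    print(f"WL({row}) ->", read_nor_rom(TRANSISTORS, row))
```

In this model a transistor at a crossing stores a 0 and its absence stores a 1.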
MOS NOR ROM Layout 1
  Memory is programmed by adding transistors where needed (ACTIVE mask, early in the fab process)
  Cell size of 9.5λ x 7λ
  [Figure: layout with word lines WL(0) to WL(3), shared GND lines, and metal1 on top of diffusion.]
MOS NOR ROM Layout 2
  Memory is programmed by adding contacts where needed (CONTACT mask, one of the last processing steps)
  All transistors are fabricated; the presence of a metal contact creates a 0-cell
  Cell size of 11λ x 7λ
  [Figure: layout with word lines WL(0) to WL(3) and shared GND lines.]
MOS NAND ROM
  [Figure: NAND ROM array with pull-up devices to V_DD on bit lines BL[0] to BL[3] and word lines WL[0] to WL[3].]
  All word lines are high by default, with the exception of the selected row
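The NAND ROM read can also be sketched behaviorally. Each bit line is a series chain: it is pulled low only if every position in the chain conducts. Unselected word lines are high, so their transistors conduct; the selected word line is low, so a transistor at that row breaks the chain and the pull-up keeps the bit line high. The transistor map below is hypothetical, and the model assumes a missing transistor behaves as a short.

```python
# Hypothetical programming: True means a series transistor is present at the
# crossing of WL[row] and BL[column]; False means the position is shorted.
TRANSISTORS = [
    [True,  False, True,  False],
    [False, True,  True,  False],
    [True,  False, False, True],
    [False, False, False, False],
]


def read_nand_rom(transistors, selected_row):
    """Return the bit line values when one word line is pulled low."""
    n_rows = len(transistors)
    # All word lines high by default, except the selected row.
    word_lines = [0 if row == selected_row else 1 for row in range(n_rows)]
    outputs = []
    for column in range(len(transistors[0])):
        # A position conducts if it is shorted or its word line is high.
        chain_conducts = all(
            (not transistors[row][column]) or word_lines[row] == 1
            for row in range(n_rows)
        )
        outputs.append(0 if chain_conducts else 1)  # pull-up wins if chain is open
    return outputs


for row in range(len(TRANSISTORS)):
    print(f"WL[{row}] low ->", read_nand_rom(TRANSISTORS, row))
```

In this model a transistor at a crossing reads as 1 and a shorted position reads as 0, the opposite of the NOR array.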
MOS NAND ROM Layout
  Cell (8λ x 7λ)
  Programming using the metal-1 layer only
  No contact to V_DD or GND necessary; drastically reduced cell size
  Loss in performance compared to NOR ROM
  [Figure: layout showing polysilicon, diffusion, and metal1 on diffusion.]
NAND ROM Layout
  Cell (5λ x 6λ)
  Programming using implants only
  [Figure: layout showing polysilicon, threshold-altering implant, and metal1 on diffusion.]
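The four ROM layouts differ mainly in cell area. The sketch below compares them, assuming the cell dimensions on the layout slides are given in units of λ; the dictionary labels are informal names introduced here, and peripheral circuitry is ignored.

```python
# Cell dimensions (width, height) in lambda, taken from the layout slides.
CELLS = {
    "NOR ROM, ACTIVE-mask programming": (9.5, 7),
    "NOR ROM, CONTACT-mask programming": (11, 7),
    "NAND ROM, metal-1 programming": (8, 7),
    "NAND ROM, implant programming": (5, 6),
}


def cell_area(width, height):
    """Area of one cell in lambda^2."""
    return width * height


reference = cell_area(*CELLS["NOR ROM, ACTIVE-mask programming"])
for name, (width, height) in CELLS.items():
    area = cell_area(width, height)
    print(f"{name}: {area} lambda^2 ({area / reference:.2f}x the first NOR cell)")
```

The implant-programmed NAND cell comes out at 30 λ², less than half the 66.5 λ² of the first NOR cell, which is the density gain that the slides weigh against the NAND ROM's loss in performance.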
Transient Model for 512x512 NOR ROM
  [Figure: model with precharge device, polysilicon word line WL (r_word, c_word per cell), and metal1 bit line BL (C_bit).]
  Word line parasitics (distributed RC model)
    Resistance/cell: 17.5 Ω
    Wire capacitance/cell: 0.049 fF
    Gate capacitance/cell: 0.75 fF
  Bit line parasitics (lumped C model)
    Resistance/cell: 0.275 Ω (which is negligible)
    Wire capacitance/cell: 0.09 fF
    Drain capacitance/cell: 0.8 fF
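The per-cell parasitics can be turned into rough numbers for the 512-cell lines. The sketch below assumes the common approximation that the 50% delay of a distributed RC line is about 0.38 times the total resistance times the total capacitance; that factor and the function names are assumptions not stated on the slide.

```python
N_CELLS = 512  # cells along one word line or one bit line

# Per-cell word line parasitics from the slide.
R_WORD = 17.5                       # ohms
C_WORD = (0.049 + 0.75) * 1e-15     # wire + gate capacitance, farads

# Per-cell bit line capacitance from the slide (resistance is neglected).
C_BIT_CELL = (0.09 + 0.8) * 1e-15   # wire + drain capacitance, farads


def word_line_delay(n=N_CELLS, r=R_WORD, c=C_WORD):
    """Estimate the word line delay in seconds.

    Assumption: distributed RC line, delay ~ 0.38 * R_total * C_total.
    """
    return 0.38 * (n * r) * (n * c)


def bit_line_capacitance(n=N_CELLS, c=C_BIT_CELL):
    """Total lumped bit line capacitance in farads."""
    return n * c


print(f"word line delay: {word_line_delay() * 1e9:.2f} ns")
print(f"bit line capacitance: {bit_line_capacitance() * 1e12:.3f} pF")
```

Under these assumptions the word line delay is roughly 1.4 ns and the bit line presents about 0.46 pF. The bit line delay itself depends on the drive strength of the cell and precharge transistors, which the slide does not give.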
Transient Model for 512x512 MOS NAND ROM
  [Figure: model for NAND ROM with pull-up to V_DD, bit line BL (r_bit, c_bit per cell, load C_L), and word line WL (r_word, c_word per cell).]
  Word line parasitics
    Similar to NOR ROM
  Bit line parasitics
    Resistance of cascaded transistors dominates
    Drain/source and complete gate capacitance
    Resistance/cell: 8.7 kΩ (compared to 0.275 Ω in NOR)
  Speed: NOR t_LH = 1.87 ns; NAND t_LH = 1.2 µs
Next Lecture and Reminders
  Next lecture: SRAM, DRAM, and CAM cores
    Reading assignment: Rabaey, et al., 12.1-12.2.4