CMPEN 411 VLSI Digital Circuits Spring Lecture 24: Peripheral Memory Circuits


CMPEN 411 VLSI Digital Circuits Spring 2012
Lecture 24: Peripheral Memory Circuits
[Adapted from Rabaey's Digital Integrated Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic]
Sp12 CMPEN 411 L24 S.1

Review: Read-Write Memories (RAMs)
Static (SRAM)
- data is stored as long as supply is applied
- large cells (6 FETs/cell), so fewer bits/chip
- fast, so used where speed is important (e.g., caches)
- differential outputs (output BL and !BL)
- use sense amps for performance
- compatible with CMOS technology
Dynamic (DRAM)
- periodic refresh required (every 1 to 4 ms) to compensate for the charge loss caused by leakage
- small cells (1 to 3 FETs/cell), so more bits/chip
- slower, so used for main memories
- single-ended output (output BL only)
- need sense amps for correct operation
- not typically compatible with CMOS technology

Peripheral Memory Circuitry
- Row and column decoders
- Read bit line precharge logic
- Sense amplifiers
- Timing and control
Design concerns
- Speed
- Power consumption
- Area (pitch matching)

Row Decoders
- Collection of 2^M complex logic gates organized in a regular, dense fashion
- (N)AND decoder for 8 address bits
    WL(0) = !A7 & !A6 & !A5 & !A4 & !A3 & !A2 & !A1 & !A0
    WL(255) = A7 & A6 & A5 & A4 & A3 & A2 & A1 & A0
- NOR decoder for 8 address bits
    WL(0) = !(A7 | A6 | A5 | A4 | A3 | A2 | A1 | A0)
    WL(255) = !(!A7 | !A6 | !A5 | !A4 | !A3 | !A2 | !A1 | !A0)
- Goals: pitch matched, fast, low power
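The two decoder forms above describe the same one-hot function (De Morgan). A minimal Python sketch that checks this for every 8-bit address follows; the function names are hypothetical and the model is purely logical, with no timing or transistor-level detail.

```python
N_BITS = 8  # address width used on the slide


def address_to_bits(address, n_bits=N_BITS):
    """Return [A0, A1, ..., A(n-1)] as booleans (index 0 is the LSB)."""
    return [bool((address >> i) & 1) for i in range(n_bits)]


def wl_and_form(addr_bits, row):
    """AND-style decoder: WL(row) is the AND of A_i or !A_i for each bit."""
    return all(
        (a if (row >> i) & 1 else not a) for i, a in enumerate(addr_bits)
    )


def wl_nor_form(addr_bits, row):
    """NOR-style decoder: WL(row) = !(OR of the opposite literals)."""
    return not any(
        ((not a) if (row >> i) & 1 else a) for i, a in enumerate(addr_bits)
    )


def check_all_addresses(n_bits=N_BITS):
    """True if both forms agree and exactly one word line is high per address."""
    for address in range(2 ** n_bits):
        bits = address_to_bits(address, n_bits)
        and_lines = [wl_and_form(bits, r) for r in range(2 ** n_bits)]
        nor_lines = [wl_nor_form(bits, r) for r in range(2 ** n_bits)]
        if and_lines != nor_lines:
            return False
        if sum(and_lines) != 1 or not and_lines[address]:
            return False
    return True
```

Calling `check_all_addresses()` exercises all 256 addresses against all 256 word lines.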

Implementing a Wide NOR Function
- Single stage 8x256 bit decoder (as in Lecture 22)
  - One 8-input NOR gate per row x 256 rows = 256 x (8+8) = 4,096 transistors
  - Pitch match and speed/power issues
- Decompose logic into multiple levels
    !WL(0) = !(!(A7 | A6) & !(A5 | A4) & !(A3 | A2) & !(A1 | A0))
  - First level is the predecoder (for each pair of address bits, form Ai | Ai-1, Ai | !Ai-1, !Ai | Ai-1, and !Ai | !Ai-1)
  - Second level is the word line driver
- Predecoders reduce the number of transistors required
  - Four sets of four 2-bit NOR predecoders = 4 x 4 x (2+2) = 64
  - 256 word line drivers, each a four-input NAND = 256 x (4+4) = 2,048
  - 4,096 vs 2,112 (= 64 + 2,048) = almost a 50% savings
- Number of inputs to the gates driving the WLs is halved, so the propagation delay is reduced by a factor of ~4 (gate delay grows roughly quadratically with fan-in)
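The two-level decomposition can be checked functionally. The sketch below is a hedged Python model, assuming the address bits are grouped in adjacent pairs as on the slide; `predecode` and `word_lines` are hypothetical names. It models the first level as a one-hot group of four signals per bit pair and the second level as an AND of one signal from each group.

```python
def predecode(address, n_bits=8):
    """First level: for each pair of address bits, a one-hot group of 4 signals."""
    groups = []
    for pair in range(n_bits // 2):
        value = (address >> (2 * pair)) & 0b11
        groups.append([value == code for code in range(4)])
    return groups


def word_lines(address, n_bits=8):
    """Second level: each word line combines one predecoder output per pair."""
    groups = predecode(address, n_bits)
    lines = []
    for row in range(2 ** n_bits):
        selected = all(
            groups[pair][(row >> (2 * pair)) & 0b11]
            for pair in range(n_bits // 2)
        )
        lines.append(selected)
    return lines


if __name__ == "__main__":
    lines = word_lines(0b10100101)
    # Exactly one word line is asserted, and it is the addressed one.
    print(sum(lines), lines.index(True) == 0b10100101)
```

Each word line driver sees only four inputs here (one per pair) instead of eight, which is the fan-in halving the slide refers to.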

Hierarchical Decoders
- Multi-stage implementation improves performance
[Figure: NAND decoder using 2-input pre-decoders; address inputs A0 to A3 and their complements, word line outputs WL0, WL1, ...]

Dynamic Decoders
[Figure: 2-input NOR decoder and 2-input NAND decoder, each with precharge devices clocked by φ, inputs A0, A1 and their complements, outputs WL0 to WL3]
- Which one is faster? Smaller? Low power?

Pass Transistor Based Column Decoder
[Figure: a 2-input NOR decoder on A0, A1 drives select lines S0 to S3; pass transistors connect bit line pairs BL0/!BL0 to BL3/!BL3 to data_out/!data_out]
- Read: connect BLs to the Sense Amps (SA)
- Writes: drive one of the BLs low to write a 0 into the cell
- Fast since there is only one transistor in the signal path
- However, there is a large transistor count: (K+1) x 2^K + 2 x 2^K
  For K = 2: 3 x 2^2 (decoder) + 2 x 2^2 (PTs) = 12 + 8 = 20

Tree Based Column Decoder
[Figure: binary tree of pass transistors controlled by A0, !A0, A1, !A1 connecting BL0/!BL0 to BL3/!BL3 to data_out/!data_out]
- Number of transistors reduced to 2 x 2 x (2^K - 1)
  For K = 2: 2 x 2 x (2^2 - 1) = 4 x 3 = 12
- Delay increases quadratically with the number of sections (K), so prohibitive for large decoders
  - can fix with buffers, progressive sizing, combination of tree and pass transistor approaches

Decoder Complexity Comparisons
Consider a memory with 10b address and 8b data

Conf.  Data/Row             Row decoder     Single stage  Two stage  Column decoder  PT     Tree
1D     8b                   10b (10x2^10)   20,480 T      10,320 T   -               -      -
2D     32b (32x256 core)    8b (8x2^8)      4,096 T       2,112 T    2b (2x2^2)      76 T   96 T
2D     64b (64x128 core)    7b (7x2^7)      1,792 T       1,072 T    3b (3x2^3)      160 T  224 T
2D     128b (128x64 core)   6b (6x2^6)      768 T         432 T      4b (4x2^4)      336 T  480 T
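The transistor counts in this comparison can be reproduced from the formulas on the earlier decoder slides. The Python sketch below is one way to do so under stated assumptions: function names are hypothetical, an odd leftover address bit is assumed to feed the word line drivers directly (which matches the 7b entry), and the column decoder's pass or tree transistors are assumed to be replicated once per data bit (8 here) while the PT decoder itself is shared.

```python
def row_single_stage(n_bits):
    # One n-input static CMOS gate per row: n NMOS + n PMOS transistors.
    return (2 ** n_bits) * (n_bits + n_bits)


def row_two_stage(n_bits):
    # Assumption: bits are predecoded in pairs; an odd leftover bit
    # goes straight to the word line drivers.
    pairs, leftover = divmod(n_bits, 2)
    predecoders = pairs * 4 * (2 + 2)        # four 2-input gates per pair
    fan_in = pairs + leftover                # inputs per word line driver
    drivers = (2 ** n_bits) * (fan_in + fan_in)
    return predecoders + drivers


def column_pass_transistor(k_bits, data_bits):
    # Decoder term (K+1) * 2^K from the slide, shared by all data bits,
    # plus 2 pass transistors (BL and !BL) per column per data bit.
    decoder = (k_bits + 1) * 2 ** k_bits
    pass_transistors = 2 * (2 ** k_bits) * data_bits
    return decoder + pass_transistors


def column_tree(k_bits, data_bits):
    # 2 x 2 x (2^K - 1) transistors per data bit, no separate decoder.
    return 2 * 2 * (2 ** k_bits - 1) * data_bits


if __name__ == "__main__":
    DATA_BITS = 8
    ADDRESS_BITS = 10
    for row_bits in (10, 8, 7, 6):
        col_bits = ADDRESS_BITS - row_bits
        row = (row_single_stage(row_bits), row_two_stage(row_bits))
        col = None
        if col_bits:
            col = (column_pass_transistor(col_bits, DATA_BITS),
                   column_tree(col_bits, DATA_BITS))
        print(row_bits, row, col_bits, col)
```

With `data_bits=1` and `k_bits=2` the column functions give 20 and 12, the single-bit figures quoted on the column decoder slides.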

Bit Line Precharge Logic
- First step of a Read cycle is to precharge (PC) the bit lines to V_DD
  - every differential signal in the memory must be equalized to the same voltage level before Read
- Turn off PC and enable the WL
  - the grounded PMOS load limits the bit line swing (speeding up the next precharge cycle)
[Figure: precharge circuit on BL and !BL controlled by !PC, with an equalization transistor between the bit lines]
- equalization transistor: speeds up equalization of the two bit lines by allowing the capacitance and pull-up device of the nondischarged bit line to assist in precharging the discharged line

Sense Amplifiers
- Amplification: resolves data with small bit line swings (in some DRAMs required for proper functionality)
- Delay reduction: compensates for the limited drive capability of the memory cell to accelerate BL transition
    t_p = (C * ΔV) / I_av
  C is large and I_av is small, so make ΔV as small as possible
  [Figure: input -> SA -> output]
- Power reduction: eliminates a large part of the power dissipation due to charging and discharging bit lines
- Signal restoration: for DRAMs, need to drive the bit lines full swing after sensing (read) to do data refresh
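To see why a small bit line swing matters, the delay relation t_p = C * ΔV / I_av (ΔV being the bit line voltage swing) can be evaluated for a full swing and a reduced swing. The numbers in this Python sketch are hypothetical placeholders chosen only for illustration; they are not taken from the lecture.

```python
def bitline_delay(capacitance_f, delta_v, avg_current_a):
    """t_p = C * dV / I_av, returned in seconds."""
    return capacitance_f * delta_v / avg_current_a


# Hypothetical values, for illustration only.
C_BL = 1e-12      # bit line capacitance: 1 pF
I_CELL = 50e-6    # average cell current: 50 uA

full_swing = bitline_delay(C_BL, 2.5, I_CELL)    # 2.5 V swing, no sense amp
small_swing = bitline_delay(C_BL, 0.25, I_CELL)  # 0.25 V swing, sense amp resolves it

if __name__ == "__main__":
    print(f"full swing:  {full_swing * 1e9:.1f} ns")
    print(f"small swing: {small_swing * 1e9:.1f} ns")
```

Because t_p is linear in ΔV, cutting the required swing by 10x cuts the bit line delay by the same factor for fixed C and I_av.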

Classes of Sense Amplifiers
- Differential SA: takes small signal differential inputs (BL and !BL) and amplifies them to a large signal single-ended output
  - common-mode rejection: rejects noise that is equally injected to both inputs
  - Only suitable for SRAMs (with BL and !BL)
  - Types
    - Current mirroring
    - Two-stage
    - Latch based
- Single-ended SA: needed for DRAMs

Differential Sense Amplifier
[Figure: differential sense amplifier; labels: V_DD, transistors M1 to M5, inputs bit and !bit, sense enable SE, internal node y, output Out]
- Directly applicable to SRAMs

Differential Sensing SRAM
[Figure: (a) SRAM sensing scheme: precharge (PC) and equalization (EQ) devices on BL and !BL, SRAM cell i selected by WL_i, differential sense amp enabled by SE producing Output; (b) two-stage differential amplifier with transistors M1 to M5, inputs x and !x, enable SE, outputs y and !y]

Approaches to Memory Timing
DRAM Timing: Multiplexed Addressing
[Figure: the address bus carries the row address (msb's) and then the column address (lsb's), strobed by RAS and CAS]
- RAS-CAS timing
SRAM Timing: Self-Timed
[Figure: address bus carrying the full address]
- Address transition initiates memory operation
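Multiplexed addressing amounts to sending the upper and lower halves of the address over the same pins in two steps. The Python sketch below shows the split; the function name and the bit widths in the example are assumptions for illustration, and real devices add timing constraints between RAS and CAS that are not modeled here.

```python
def split_address(address, row_bits, col_bits):
    """Split a flat address into (row, column) for multiplexed addressing."""
    row = (address >> col_bits) & ((1 << row_bits) - 1)  # msb's, latched with RAS
    col = address & ((1 << col_bits) - 1)                # lsb's, latched with CAS
    return row, col


if __name__ == "__main__":
    # Hypothetical 8-bit address split into 4 row bits and 4 column bits.
    print(split_address(0b10110110, 4, 4))
```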

Reliability and Yield
Memories operate under low signal-to-noise conditions
- word line to bit line coupling can vary substantially over the memory array
  - folded bit line architecture (routing BL and !BL next to each other ensures a closer match between parasitics and bit line capacitances)
- interwire bit line to bit line coupling
  - transposed (or twisted) bit line architecture (turn the noise into a common-mode signal for the SA)
- leakage (in DRAMs) requiring refresh operation
suffer from low yield due to high density and structural defects
- increase yield by using error correction (e.g., parity bits) and redundancy
and are susceptible to soft errors due to alpha particles and cosmic rays

Redundancy in the Memory Structure
[Figure: memory array with a fuse bank, a redundant row, and redundant columns; inputs are the row address and the column address]

Row Redundancy
[Figure: the functional address feeds the normal wordline decoders and is compared (==?) against the fused repair addresses; the comparators drive the redundant wordlines and the enable input of the normal wordline decoders]
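The address-compare step of row redundancy can be sketched in a few lines of Python. This is an illustrative model only: it assumes that a match against a fused repair address enables the corresponding redundant wordline and disables the normal decoder, and the function name and the use of `None` for an unprogrammed fuse entry are hypothetical.

```python
def select_wordline(address, fused_repair_addresses):
    """Return ("redundant", i) on a fuse match, else ("normal", address)."""
    for index, repair_address in enumerate(fused_repair_addresses):
        # None models a spare row whose fuses were never programmed.
        if repair_address is not None and repair_address == address:
            return ("redundant", index)
    # No comparator matched: the normal wordline decoder stays enabled.
    return ("normal", address)


if __name__ == "__main__":
    fuses = [5, None]  # row 5 was found defective and mapped to spare 0
    print(select_wordline(5, fuses))
    print(select_wordline(6, fuses))
```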

Column Redundancy
[Figure: eight normal data columns and one redundant data column; one fuse per column steers the columns to outputs Data 0 to Data 7]

Error-Correcting Codes
- Example: Hamming codes
[Figure: Hamming code parity-check equations; e.g., if B3 flips, the checks read 1, 1, 0 = 3]
- 2^k >= m + k + 1, where m = # data bits, k = # check bits
- For 64 data bits, needs 7 check bits
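The check-bit bound and the syndrome idea can both be tried in a short Python sketch. The first function applies the inequality 2^k >= m + k + 1 directly. The second part assumes the textbook Hamming(7,4) layout P1 P2 B3 P4 B5 B6 B7 with even parity, which is an assumption here because the slide's own equations did not survive; the function names are hypothetical.

```python
def check_bits_needed(m):
    """Smallest k with 2**k >= m + k + 1 for m data bits."""
    k = 0
    while 2 ** k < m + k + 1:
        k += 1
    return k


def hamming74_encode(data):
    """data = [B3, B5, B6, B7]; returns a list indexed 1..7 (index 0 unused)."""
    b3, b5, b6, b7 = data
    word = [0] * 8
    word[3], word[5], word[6], word[7] = b3, b5, b6, b7
    word[1] = b3 ^ b5 ^ b7   # P1 covers positions 1, 3, 5, 7
    word[2] = b3 ^ b6 ^ b7   # P2 covers positions 2, 3, 6, 7
    word[4] = b5 ^ b6 ^ b7   # P4 covers positions 4, 5, 6, 7
    return word


def hamming74_syndrome(word):
    """Return the position (1..7) of a single flipped bit, or 0 if none."""
    s1 = word[1] ^ word[3] ^ word[5] ^ word[7]
    s2 = word[2] ^ word[3] ^ word[6] ^ word[7]
    s4 = word[4] ^ word[5] ^ word[6] ^ word[7]
    return s1 + 2 * s2 + 4 * s4


if __name__ == "__main__":
    print(check_bits_needed(64))
    codeword = hamming74_encode([1, 0, 1, 1])
    codeword[3] ^= 1                      # flip B3
    print(hamming74_syndrome(codeword))   # position of the flipped bit
```

Flipping the bit at the reported position restores the original codeword, which is the single-error correction step.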

Performance and area overhead for ECC

Redundancy and Error Correction

Soft Errors
- Nonrecurrent and nonpermanent errors from
  - alpha particles (from the packaging materials)
  - neutrons from cosmic rays
- As feature size decreases, the charge stored at each node decreases (due to a lower node capacitance and lower V_DD) and thus Q_critical (the charge necessary to cause a bit flip) decreases, leading to an increase in the soft error rate (SER)
[Figure: System FITs (log scale, 1 to 10,000) vs. process technology (0.25, 0.18, 0.13, 0.09, 0.05 µm); from Semico Research Corp.]

MTBF (hours), from Actel
                            0.13 µm   0.09 µm
Ground-based                895       448
Civilian Avionics System    324       162
Military Avionics System    18        9

CELL Processor!
- See class website for web links

Embedded SRAM (4.6 GHz)
- Each SRAM cell is 0.99 µm^2
- Each block has 32 sub-arrays
- Each sub-array has 128 WLs plus 4 redundant lines
- Each block has 2 redundant BLs

Multiplier in CELL

Next Lecture and Reminders
- Next lecture: Power consumption in datapaths and memories
  - Reading assignment: Rabaey, et al., 11.7; 12.5