CS 250: VLSI System Design

1 CS 250: VLSI System Design. Lecture 3: Timing. Professor Jonathan Bachrach; slides by John Lazzaro. TA: Colin Schmidt. www-inst.eecs.berkeley.edu/~cs250/ UC Regents Fall 2013/2014 UCB

2 Everything doesn't happen at once. Timing, the 10,000 ft view: locally synchronous, globally asynchronous. On the same page: minimal set of timing concepts you need for the project. Break. RTL examples: better timing through micro-architecture. Electrical details: just so you know.

3 View from 10,000 Ft (Google I/O, 2012)

4 Moore's Law (chart labels: 2 Thousand, 1 Million, 26 Billion). Synchronous logic on a single clock domain is not practical for a 26 billion transistor design.

5 GALS: Globally Asynchronous, Locally Synchronous. Synchronous modules are typically 50K-1M gates, so that the synchronous logic approach works well without requiring heroics. Examples follow.

6 IBM Power 5 CPU - Dynamically Scheduled. Figure 4: Power5 instruction data flow (BXU = branch execution unit and CRL = condition register logical execution unit). Stars denote FIFOs that create separate synchronous domains. An example of how architecture and circuits work together.

7 Rocket uses GALS for the accelerator interface. Your project interfaces with the RISC-V pipeline and the memory system using FIFOs. Your timing closure is independent of the CPU logic domain.

8 Today: Timing insights for your project. What we're not doing: if this class were EE 241 and your project were an SRAM, you could see through down to the layout. Timing? Use SPICE on this hand-drawn schematic.

9 Technology X: The CS 250 timing challenge. What we are doing: logic synthesis. If your accelerator is too slow, there are two options. Top-down (today): rework the high-level micro-architecture and let Technology X keep its job. Bottom-up: take control away from logic synthesis by using HDL as a textual schematic; also, use command-line tool flags. This is sometimes necessary. Colin is the expert, ask in discussion.

10 A Logic Circuit Primer. "Models should be as simple as possible, but no simpler." Albert Einstein

11 Inverters: A simple transistor model. Inverter: Out = NOT In (In = 0 gives Out = 1; In = 1 gives Out = 0). Circuit: a PMOS between Vdd and Out, an NMOS between Out and ground. pFET: a switch, on if the gate is grounded. nFET: a switch, on if the gate is at Vdd. This model correctly predicts logic output for simple static CMOS circuits. Extensions to model subtler circuit families, or to predict timing, have not worked well.

12 Transistors as water valves (cartoon physics). If electrons are water molecules, transistor strengths (W/L) are pipe diameters, and capacitors are buckets. An "on" p-FET fills up the capacitor with charge (charge: the water level rises from 0 to 1 over time). An "on" n-FET empties the bucket (discharge: the water level falls from 1 to 0 over time). This model is often good enough.

13 What is the bucket? A gate's fan-out. Fan-out: the number of gate inputs driven by a gate's output (shown for an inverter and a NAND gate). Driving other gates slows a gate down. Driving wires slows a gate down. Driving its own parasitics slows a gate down.

14 Fanout

15 A closer look at fan-out. Driving more gates adds delay. A linear model works for reasonable fan-out (plot: delay vs. Cout for Out: Low -> High, 0.5 ns scale, slope = 0.0021 ns/fF). FO4: fanout-of-four delay, the delay time of an inverter driving 4 inverters.
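The linear fan-out model can be tried numerically. The following is a minimal Python sketch, assuming the slide's slope reads 0.0021 ns/fF; the intrinsic delay and the per-inverter input capacitance are hypothetical placeholders, not values from the lecture.

```python
SLOPE_NS_PER_FF = 0.0021   # slope as read from the slide (assumed 0.0021 ns/fF)
INTRINSIC_NS = 0.02        # hypothetical delay of an unloaded gate
C_IN_FF = 2.0              # hypothetical input capacitance of one inverter


def gate_delay_ns(c_out_ff):
    """Linear model: fixed intrinsic delay plus a term proportional to the load."""
    return INTRINSIC_NS + SLOPE_NS_PER_FF * c_out_ff


def fo4_delay_ns():
    """Delay of an inverter driving four identical inverters."""
    return gate_delay_ns(4 * C_IN_FF)


if __name__ == "__main__":
    for fanout in (1, 2, 4, 8):
        print(fanout, round(gate_delay_ns(fanout * C_IN_FF), 4))
```

With these placeholder values, doubling the fan-out adds a fixed increment of delay, which is the straight line the slide describes.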

16 Propagation delay graphs. Cascaded gates, with 1->0 and 0->1 transitions marked along the chain. Plot: inverter transfer function (Vout vs. Vin).

17 Worst-case delay through combinational logic. x = g(a, b, c, d, e, f). If d going 0-to-1 switches x 0-to-1, the delay is T1. If a going 0-to-1 switches x 0-to-1, the delay is T2. T2 might be the worst-case delay path (critical path). It would be surprising if T1 > T2.
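The idea of a worst-case path can be illustrated with a longest-path search over a small gate network. This is a sketch only: the netlist, gate names, and unit delays below are hypothetical and do not reproduce the circuit drawn on the slide.

```python
# Hypothetical netlist: gate output -> (gate delay, list of fan-in signals).
# Signals that are not keys (a, b, c, d) are primary inputs.
GATES = {
    "g1": (1.0, ["a", "b"]),
    "g2": (1.0, ["g1", "c"]),
    "x": (1.0, ["g2", "d"]),
}


def longest_delay(gates, src, dst):
    """Longest delay from signal src to signal dst, or None if no path exists."""
    if dst == src:
        return 0.0
    if dst not in gates:
        return None  # reached a different primary input
    delay, fanin = gates[dst]
    best = None
    for signal in fanin:
        d = longest_delay(gates, src, signal)
        if d is not None and (best is None or d > best):
            best = d
    return None if best is None else best + delay


if __name__ == "__main__":
    print("T1 (d to x):", longest_delay(GATES, "d", "x"))
    print("T2 (a to x):", longest_delay(GATES, "a", "x"))
```

In this toy network the input that passes through more gates has the larger delay, matching the slide's intuition that T2 is the likely critical path.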

18 Why "might"? Wires have delay too. Wires possess distributed resistance and capacitance (nodes v1, v2, v3, v4 along the wire). Looks benign, but the time constant associated with distributed RC is proportional to the square of the length. Signals are typically rebuffered to reduce delay.
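A rough numerical sketch shows why rebuffering helps. It assumes the wire delay is exactly k times the length squared and that each inserted buffer costs a fixed delay; the constants k and t_buf are hypothetical.

```python
def wire_delay(length, k=1.0):
    """Distributed-RC wire delay, modeled as proportional to length squared."""
    return k * length ** 2


def rebuffered_delay(length, segments, k=1.0, t_buf=0.5):
    """Delay when the wire is split into equal segments, each followed by a buffer."""
    segment_length = length / segments
    return segments * (wire_delay(segment_length, k) + t_buf)


if __name__ == "__main__":
    print("unbuffered:", wire_delay(4.0))
    for n in (1, 2, 4, 8):
        print(n, "segments:", rebuffered_delay(4.0, n))
```

Splitting the wire turns the quadratic term into k*L^2/N at the cost of N buffer delays, so there is a best segment count beyond which extra buffers only add delay.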

19 Clocked Logic Circuits

20 From Delay Models to Timing Analysis. Timing analysis: what is the smallest clock period T that produces correct operation? Frequency f and period T: 1 MHz = 1 μs; 10 MHz = 100 ns; 100 MHz = 10 ns; 1 GHz = 1 ns.

21 Timing Analysis and Logic Delay. Register: an array of flip-flops, feeding combinational logic (CL). If our clock period T > worst-case delay through CL, does this ensure correct operation?

22 Flip-flops have internal delays. The value of D is sampled on the positive clock edge. Q outputs the sampled value for the rest of the cycle. Timing diagram (CLK, D, Q) marks t_setup and t_clk-to-q.

23 Flip-flop delays eat into the time budget. Example: ALU combinational logic between registers. Time budget: $T \ge \tau_{clk \to q} + \tau_{CL} + \tau_{setup}$

24 Clock skew also eats into the time budget. CLKd is a delayed copy of CLK (clock skew, delay in distribution). As T -> 0, which circuit fails first? $T \ge T_{CL} + T_{setup} + T_{clk \to q} + \text{worst-case skew}$
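The time-budget inequality can be turned into a small calculator. This is a minimal sketch; the function names are hypothetical and the delay values in the usage lines are made up for illustration.

```python
def min_clock_period(t_clk_to_q, t_cl, t_setup, t_skew=0.0):
    """Smallest period satisfying T >= t_clk_to_q + t_cl + t_setup + skew."""
    return t_clk_to_q + t_cl + t_setup + t_skew


def max_frequency_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


if __name__ == "__main__":
    # Hypothetical delays in nanoseconds.
    period = min_clock_period(t_clk_to_q=0.1, t_cl=0.7, t_setup=0.1, t_skew=0.1)
    print("min period (ns):", round(period, 3))
    print("max frequency (MHz):", round(max_frequency_mhz(period), 1))
```

Any reduction in flip-flop overhead or skew goes directly into the budget available for combinational logic.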

25 Clock tree delays, IBM Power CPU. Figure: delay plotted over chip position (x, y); labels: grid, tuned sector trees, sector buffers, buffer level 1, buffer level 2.

26 Clock tree delays, IBM Power. Figure: delay over chip position (x, y); waveform plot of Volts (V) vs. Time (ps) showing 20 ps skew; multiple-fingered transmission line.

27 Some flip-flops have hold time. Timing diagram (CLK, D, Q): D must stay stable from t_setup before the clock edge until t_hold after it. The example circuit includes an inverter with delay t_inv. What is the intended function of this circuit? Does flip-flop hold time affect operation of this circuit? Under what conditions? For correct operation: t_clk-to-q + t_inv > t_hold.
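The hold condition on this slide is a one-line check. The sketch below assumes all three delays are given in the same unit; the function name and the numbers in the usage lines are hypothetical.

```python
def hold_ok(t_clk_to_q, t_inv, t_hold):
    """True if new data reaches D only after the hold window has closed."""
    return t_clk_to_q + t_inv > t_hold


if __name__ == "__main__":
    # Hypothetical delays in nanoseconds.
    print(hold_ok(t_clk_to_q=0.05, t_inv=0.03, t_hold=0.06))  # path slow enough
    print(hold_ok(t_clk_to_q=0.02, t_inv=0.01, t_hold=0.06))  # path too fast
```

Note that the clock period does not appear in the check: a hold violation cannot be fixed by slowing the clock.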

28 Searching for the processor critical path. Timing analysis: what is the smallest T that produces correct operation? Must consider all connected register pairs. Why might I suspect this one?

29 Combinational paths for the IBM Power 4 CPU. Histogram: late-mode timing checks (thousands) vs. timing slack (ps). The critical path is marked; most wires have hundreds of picoseconds to spare. From "The circuit and physical design of the POWER4 microprocessor", IBM J. Res. and Dev., 46:1, Jan 2002, J. D. Warnock et al.

30 How to retime logic. Circles are combinational logic, labelled with delays. The critical path is 5. We want to improve it without changing circuit semantics. Add a register, move one circle: performance improves by 20% (the critical path drops from 5 to 4). Technology X can often do this. Figure 1: A small graph before retiming. The nodes represent logic delays, with the inputs and outputs passing through mandatory, fixed registers. The critical path is 5. Figure 2: The example in Figure 1 after retiming. The critical path is reduced from 5 to 4. Figures from "Post-Placement C-slow Retiming for the Xilinx Virtex FPGA", Nicholas Weaver, Yury Markovskiy, Yatish Patel, and John Wawrzynek, UC Berkeley, Berkeley, CA.

31 Power 4: Timing Estimation, Closure. Timing estimation: predicting a processor's clock rate early in the project. From "The circuit and physical design of the POWER4 microprocessor", IBM J. Res. and Dev., 46:1, Jan 2002, J. D. Warnock et al.

32 Power 4: Timing Estimation, Closure. Timing closure: meeting (or exceeding!) the timing estimate. From "The circuit and physical design of the POWER4 microprocessor", IBM J. Res. and Dev., 46:1, Jan 2002, J. D. Warnock et al.

33 Floorplanning: essential to meet timing (Intel XScale 80200)

34

35 Break

36 Simple exercises for gaining intuition about timing for your process + EDA tools. Thanks to Bhupesh Dasila, Open-Silicon Bangalore.

37 Synthesize gate chains using hand-specified library cells. Exercises the cell library and the place and route tools. Lets you know how many levels of logic you can use in the best case. Helps you see through Technology X. Plot annotations: chain lengths; weak NANDs; synthesis constrained to 2 ns clock; delay of a chain of 3 inverters with strongest strength (guaranteed not to exceed speed); 40 nm process, 29 ps/gate average. (Bhupesh Dasila)

38 Force P&R to drive a long wire with a known buffer cell. Vary driver strength, wire length, metal layer. Shows the maximum distance apart two gates can be placed and still meet your clock period. The dependence of distributed RC delay on the square of the length is clearly seen! (Bhupesh Dasila)

39 Driving Large Loads. Large fanout nets: clocks, resets, memory bit lines, off-chip. A relatively small driver results in long rise time (and thus large gate delay). Strategy: staged buffers. Optimal trade-off between delay per stage and total number of stages: fanout of 4-6 per stage.
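The staged-buffer strategy can be sketched as a sizing loop. This is an illustrative Python sketch, assuming each stage is simply sized as the previous stage times a fixed fanout; the function name and the capacitance values are hypothetical.

```python
def buffer_chain(c_in, c_load, fanout=4):
    """Input capacitance of each buffer stage, growing by `fanout` per stage.

    The chain stops growing once one more stage could drive c_load with an
    effective fanout no larger than `fanout`.
    """
    sizes = [c_in]
    while sizes[-1] * fanout < c_load:
        sizes.append(sizes[-1] * fanout)
    return sizes


if __name__ == "__main__":
    for f in (4, 6):
        chain = buffer_chain(c_in=1, c_load=1000, fanout=f)
        print("fanout", f, "stages", len(chain), chain)
```

For a load 64 times the first stage's input capacitance, a fanout of 4 gives three stages sized 1, 4, and 16, with the last stage driving the load at a fanout of 4.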

40 Register file: Synthesize, or use SRAM? Diagram: a write port in which sel(ws) (5 bits) drives a demux of write enables (WE) and wd (32 bits) feeds registers R1 through R31 (R0 is the constant 0); two read ports, each a mux selected by sel(rs1) or sel(rs2) (5 bits), producing rd1 and rd2 (32 bits). Speed will depend on how large it lays out.

41 Synthesized, custom, and SRAM-based register files, 40nm. For small register files, logic synthesis is competitive! Not clear if the SRAM data points include area for register control, etc. Plot series: synthesis, SRAMs, register file compiler. Figure 3: Using the raw area data, the physical implementation team can get a more accurate area estimation early in the RTL development stage for floorplanning purposes. This shows an example of this graph for a 1-port, 32-bit-wide SRAM. (Bhupesh Dasila)

42 Techniques

43 Pipelining

44 Starting point: A single-cycle processor. Challenge: speed up the clock while keeping CPI == 1. $\frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$. CPI == 1: this is good. Slow clock: this is bad. Datapath shown: PC (D/Q), +0x4 adder, Instr Mem (Addr, Data), RegFile (rs1, rs2, ws, wd, WE, rd1, rd2), Ext, 32-bit ALU (op), Data Memory (Addr, Din, Dout, WE), MemToReg.
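The performance equation on this slide multiplies three factors. A minimal Python sketch follows; the function name and the instruction counts and clock periods are made-up illustration values, not measurements.

```python
def seconds_per_program(instructions, cpi, seconds_per_cycle):
    """Execution time = instructions x cycles per instruction x seconds per cycle."""
    return instructions * cpi * seconds_per_cycle


if __name__ == "__main__":
    # Hypothetical single-cycle machine: CPI of 1 but a slow 10 ns clock.
    single_cycle = seconds_per_program(1_000_000, 1.0, 10e-9)
    # Hypothetical pipelined machine: faster clock, slightly higher CPI.
    pipelined = seconds_per_program(1_000_000, 1.2, 2.5e-9)
    print(single_cycle, pipelined)
```

The comparison shows why a design can accept a CPI above 1 if the clock period shrinks by a larger factor.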

45 Reminder: How data flows after posedge. Datapath shown: PC (D/Q), +0x4 adder, Instr Mem (Addr, Data), RegFile (rs1, rs2, ws, wd, WE, rd1, rd2), Logic, 32-bit ALU (op).

46 Next posedge: Update state and repeat. Datapath shown: PC (D/Q) and RegFile (5-bit rs1, rs2, ws; 32-bit wd; WE; rd1, rd2).

47 Observation: Logic idle most of cycle. For most of the cycle, the ALU is either waiting for its inputs or holding its output. Ideal: a CPU architecture where each part is always working. (Figure: the single-cycle datapath.)

48 Inspiration: Automobile assembly line. The assembly line moves on a steady clock. Each station does the same task on each car. (Photo labels: the clock, merge station, car body shell, bolting station, car chassis.)

49 Inspiration: Automobile assembly line. Simpler station tasks -> more cars per hour. Simple tasks take less time, so the clock is faster.

50 Inspiration: Automobile assembly line. Line speed is limited by the slowest task. Most efficient if all tasks take the same time to do.

51 Inspiration: Automobile assembly line. Simpler tasks, complex car -> long line! These lines go 24 x 7, and rarely shut down.

52 Lessons from car assembly lines. Faster line movement yields more cars per hour off the line. Faster line movement requires more stages, each doing simpler tasks. To maximize efficiency, all stages should take the same amount of time (if not, workers in fast stages are idle). Filling, flushing, and stalling the assembly line are all bad news.

53 Key Analogy: The instruction is the car. Pipeline stages #1 through #5. Stage #1 is instruction fetch (PC, +0x4 adder, Instr Mem). The instruction moves through a chain of IR registers, and each IR controls the hardware in its stage (stages 2, 3, 4, and 5). This is data-stationary control.

54 Example: Decode & Register Fetch Stage. Stage #1: Instr Fetch. Stage #2: Decode & Reg Fetch. Stage #3. A sample program: ADD R4,R3,R2; OR R7,R6,R5; SUB R10,R9,R8. In the snapshot, the IRs hold SUB R10,R9,R8, OR R7,R6,R5, and ADD R4,R3,R2. Registers are chosen so that the instructions are independent, like cars on the line.

55 Hazards: An instruction is not a car. New sample program: ADD R4,R3,R2; OR R5,R4,R2. While OR R5,R4,R2 is in Decode & Reg Fetch, ADD R4,R3,R2 is still ahead of it in the pipeline and R4 is not written yet. The wrong value of R4 is fetched from the RegFile, and the contract with the programmer is broken! Oops! This is an example of a hazard: we must (1) detect and (2) resolve all hazards to make a CPU that matches the ISA.
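Detecting this kind of hazard amounts to comparing register names between instructions. The sketch below is a simplified Python illustration that assumes three-register instructions written as "OP Rd,Rs1,Rs2"; the helper names are hypothetical and real decode logic compares register fields in hardware.

```python
def parse(instr):
    """Split 'OP Rd,Rs1,Rs2' into (op, destination, (source1, source2))."""
    op, regs = instr.split()
    rd, rs1, rs2 = regs.split(",")
    return op, rd, (rs1, rs2)


def raw_hazard(older, younger):
    """True if the younger instruction reads a register the older one writes."""
    _, older_dest, _ = parse(older)
    _, _, younger_sources = parse(younger)
    return older_dest in younger_sources


if __name__ == "__main__":
    print(raw_hazard("ADD R4,R3,R2", "OR R5,R4,R2"))   # dependent pair
    print(raw_hazard("ADD R4,R3,R2", "OR R7,R6,R5"))   # independent pair
```

The check only detects the hazard; resolving it (by stalling or forwarding) is a separate design decision.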

56 Performance Equation and Hazards. $\frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$. Some ways to cope with hazards: stalling the pipeline makes CPI > 1; added logic to detect and resolve hazards increases the clock period. "Software slows the machine down!" Seymour Cray

57 Superpipelining

58 Superpipelining: Add more stages. $\frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$. Goal: reduce the critical path by adding more pipeline stages. Example: 8-stage ARM XScale, with extra IF, ID, and data cache stages. Difficulties: added penalties for load delays and branch misses. Also, power! Ultimate limiter: as logic delay goes to 0, FF clk-to-q and setup remain.
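The limit on superpipelining can be seen in a short calculation. This sketch assumes the combinational logic divides evenly across stages, which real designs rarely achieve; the function name and delay values are hypothetical.

```python
def pipelined_period(logic_delay, stages, t_clk_to_q, t_setup):
    """Clock period when logic_delay is split evenly over `stages` stages."""
    return logic_delay / stages + t_clk_to_q + t_setup


if __name__ == "__main__":
    # Hypothetical delays in nanoseconds.
    for n in (1, 2, 4, 8, 16, 32):
        print(n, "stages:", pipelined_period(8.0, n, 0.5, 0.5), "ns")
```

The period approaches the flip-flop overhead (clk-to-q plus setup) and never goes below it, so each added stage buys less than the one before.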

59 Note: Some stages now overlap, some instructions take extra stages. 5-stage pipeline: IF, ID+RF, EX, MEM, WB. 8-stage pipeline: IF now takes 2 stages (pipelined I-cache); ID and RF each get a stage; ALU split over 3 stages; MEM takes 2 stages (pipelined D-cache).

60 Superpipelining techniques. Split ALU and decode logic over several pipeline stages. Pipeline memory: use more banks of smaller arrays, add pipeline stages between decoders and muxes. Remove rarely-used forwarding networks that are on the critical path (this creates stalls and affects CPI). Pipeline the wires of frequently used forwarding networks. Also: clocking tricks (example: use posedge and negedge registers).

61 Hardware limits to superpipelining? Plot: CPU clock periods in FO4 delays (FO4: how many fanout-of-4 inverter delays fit in the clock period) for Intel 386, 486, Pentium, Pentium 2, Pentium 3, Pentium 4, Itanium, Alpha, Sparc, SuperSparc, Sparc64, Mips, HP PA, Power PC, AMD K6, AMD K7, and AMD x86-64. Annotations: MIPS; Pentium Pro, 10 stages; Pentium 4, 20 stages; historical limit: about 12 FO4s. Power wall: Intel Core Duo has 14 stages. Thanks to Francois Labonte, Stanford.

62 CPU DB: Recording Microprocessor History. "With this open database, you can mine microprocessor trends over the past 40 years." Andrew Danowitz, Kyle Kelley, James Mao, John P. Stevenson, Mark Horowitz, Stanford University. Plot: FO4 delays per cycle for processor designs. FO4 delay per cycle is roughly proportional to the amount of computation completed per cycle.

63 Multithreading

64 Multithreading of Static Pipelines. Interleave 4 threads, T1-T4, on a non-bypassed 5-stage pipe. T1: LW r1, 0(r2); T2: ADD r7, r1, r4; T3: XORI r5, r4, #12; T4: SW 0(r7), r5; T1: LW r5, 12(r1). One instruction enters per cycle (t0 through t9), each passing through F D X M W. The last instruction in a thread always completes writeback before the next instruction in the same thread reads the regfile. Behaves like 4 CPUs, each run at 1/4 clock. Hardware shown: one PC and one GPR file per thread, a thread-select counter (+1), shared I$, IR, X, Y, D$. Many variants.
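The claim that interleaving removes the need for bypassing can be checked with a small timing model. This Python sketch assumes one instruction issues per cycle in round-robin thread order, that registers are read in the D stage and written in the W stage, and that a write must finish in an earlier cycle than the dependent read; the function names are hypothetical.

```python
STAGES = "FDXMW"  # fetch, decode/register read, execute, memory, writeback


def stage_cycle(issue_index, stage):
    """Cycle in which the instruction issued at cycle issue_index is in `stage`."""
    return issue_index + STAGES.index(stage)


def thread_of(issue_index, n_threads=4):
    """Round-robin thread selection: thread number for a given issue slot."""
    return issue_index % n_threads


def safe_without_bypass(n_threads):
    """True if an instruction's writeback precedes the register read of the
    next instruction from the same thread."""
    return stage_cycle(0, "W") < stage_cycle(n_threads, "D")


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, "threads:", safe_without_bypass(n))
```

Under these assumptions, four threads is the smallest interleave factor for which a 5-stage pipe needs no bypass network or interlocks.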

65 At the logic level. Synchronous logic we want to multithread: the critical path is 5. 2X multi-threading: double each register. Modern synthesis will retime this as shown: the critical path is now 2. Figure 1: A small graph before retiming. The nodes represent logic delays, with the inputs and outputs passing through mandatory, fixed registers. The critical path is 5. Figure 3: The example in Figure 2, 2-slowed. This design now operates on 2 independent data streams. Figure 4: The example in Figure 3 after retiming. The combination of C-slowing and retiming reduced the critical path from 5 to 2. Figures from "Post-Placement C-slow Retiming for the Xilinx Virtex FPGA", Nicholas Weaver, Yury Markovskiy, Yatish Patel, and John Wawrzynek, UC Berkeley, Berkeley, CA.

66 Good fit for GALS. Two input queues (red and green). The mux control logic implements turn-taking. Outputs are placed into two output queues.

67 Electrical Details

68 Flip Flops Revisited

69 Recall: Static RAM cell (6 transistors). Cross-coupled inverters hold x and its complement. (Figure: inverter transfer curves from Gnd to Vdd with the threshold Vth marked, and noise on each node.)

70 Recall: Positive edge-triggered flip-flop. A flip-flop "samples" right before the edge, and then "holds" value. Circuit: a sampling circuit followed by a circuit that holds the value, both controlled by clk and its complement. 16 transistors: makes an SRAM look compact! What do we get for the 10 extra transistors? Clocked logic semantics.

71 Sensing: When clock is low. A flip-flop "samples" right before the edge, and then "holds" value. With clk = 0, the sampling circuit will capture the new value on posedge, while the hold circuit outputs the last value captured.

72 Capture: When clock goes high. A flip-flop "samples" right before the edge, and then "holds" value. With clk = 1, the sampling circuit remembers the value just captured, and the hold circuit outputs the value just captured.

73 Flip-flop delays: clk-to-q? setup? hold? CLK == 0: sense D, but Q outputs the old value (setup). CLK 0->1: capture D, pass the value to Q (hold, clk-to-q).

74 On Tuesday: Power and Energy. (Figure labels: heat sink, heat source.)


Unit 9: Static & Dynamic Scheduling

Unit 9: Static & Dynamic Scheduling CIS 501: Computer Architecture Unit 9: Static & Dynamic Scheduling Slides originally developed by Drew Hilton, Amir Roth and Milo Mar;n at University of Pennsylvania CIS 501: Comp. Arch. Prof. Milo Martin

More information

In-Place Associative Computing:

In-Place Associative Computing: In-Place Associative Computing: A New Concept in Processor Design 1 Page Abstract 3 What s Wrong with Existing Processors? 3 Introducing the Associative Processing Unit 5 The APU Edge 5 Overview of APU

More information

Advanced Topics. Packaging Power Distribution I/O. ECE 261 James Morizio 1

Advanced Topics. Packaging Power Distribution I/O. ECE 261 James Morizio 1 Advanced Topics Packaging Power Distribution I/O ECE 261 James Morizio 1 Package functions Packages Electrical connection of signals and power from chip to board Little delay or distortion Mechanical connection

More information

Lecture 10: Circuit Families

Lecture 10: Circuit Families Lecture 10: Circuit Families Outline Pseudo-nMOS Logic Dynamic Logic Pass Transistor Logic 2 Introduction What makes a circuit fast? I C dv/dt -> t pd (C/I) ΔV low capacitance high current small swing

More information

Chapter 3: Computer Organization Fundamentals. Oregon State University School of Electrical Engineering and Computer Science.

Chapter 3: Computer Organization Fundamentals. Oregon State University School of Electrical Engineering and Computer Science. Chapter 3: Computer Organization Fundamentals Prof. Ben Lee Oregon State University School of Electrical Engineering and Computer Science Chapter Goals Understand the organization of a computer system

More information

RAM-Type Interface for Embedded User Flash Memory

RAM-Type Interface for Embedded User Flash Memory June 2012 Introduction Reference Design RD1126 MachXO2-640/U and higher density devices provide a User Flash Memory (UFM) block, which can be used for a variety of applications including PROM data storage,

More information

Finite Element Based, FPGA-Implemented Electric Machine Model for Hardware-in-the-Loop (HIL) Simulation

Finite Element Based, FPGA-Implemented Electric Machine Model for Hardware-in-the-Loop (HIL) Simulation Finite Element Based, FPGA-Implemented Electric Machine Model for Hardware-in-the-Loop (HIL) Simulation Leveraging Simulation for Hybrid and Electric Powertrain Design in the Automotive, Presentation Agenda

More information

CMPEN 411 VLSI Digital Circuits Spring Lecture 06: Static CMOS Logic

CMPEN 411 VLSI Digital Circuits Spring Lecture 06: Static CMOS Logic MPEN 411 VLSI Digital ircuits Spring 2012 Lecture 06: Static MOS Logic [dapted from Rabaey s Digital Integrated ircuits, Second Edition, 2003 J. Rabaey,. handrakasan,. Nikolic] Sp12 MPEN 411 L06 S.1 Review:

More information

Programmable Comparator Options for the isppac-powr1220at8

Programmable Comparator Options for the isppac-powr1220at8 November 2005 Introduction Application Note AN6069 Lattice s isppac -POWR1220AT8 offers a wide range of features for managing multiple power supplies in a complex system. This application note outlines

More information

CMU Introduction to Computer Architecture, Spring 2013 HW 3 Solutions: Microprogramming Wrap-up and Pipelining

CMU Introduction to Computer Architecture, Spring 2013 HW 3 Solutions: Microprogramming Wrap-up and Pipelining CMU 18-447 Introduction to Computer Architecture, Spring 2013 HW 3 Solutions: Microprogramming Wrap-up and Pipelining Instructor: Prof. Onur Mutlu TAs: Justin Meza, Yoongu Kim, Jason Lin 1 Adding the REP

More information

Wind Turbine Emulation Experiment

Wind Turbine Emulation Experiment Wind Turbine Emulation Experiment Aim: Study of static and dynamic characteristics of wind turbine (WT) by emulating the wind turbine behavior by means of a separately-excited DC motor using LabVIEW and

More information

Announcements. Programming assignment #2 due Monday 9/24. Talk: Architectural Acceleration of Real Time Physics Glenn Reinman, UCLA CS

Announcements. Programming assignment #2 due Monday 9/24. Talk: Architectural Acceleration of Real Time Physics Glenn Reinman, UCLA CS Lipasti, artin, Roth, Shen, Smith, Sohi, Tyson, Vijaykumar GAS STATION Pipelining II Fall 2007 Prof. Thomas Wenisch http://www.eecs.umich.edu/courses/eecs470 Slides developed in part by Profs. Austin,

More information

Introduction to Digital Techniques

Introduction to Digital Techniques to Digital Techniques Dan I. Porat, Ph.D. Stanford Linear Accelerator Center Stanford University, California Arpad Barna, Ph.D. Hewlett-Packard Laboratories Palo Alto, California John Wiley and Sons New

More information

Tomasulo-Style Register Renaming

Tomasulo-Style Register Renaming Tomasulo-Style Register Renaming ldf f0,x(r1) allocate RS#4 map f0 to RS#4 mulf f4,f0, allocate RS#6 ready, copy value f0 not ready, copy tag Map Table f0 f4 RS#4 RS T V1 V2 T1 T2 4 REG[r1] 6 REG[] RS#4

More information

8Mbit to 256MBit HyperMemory SRAM and FIFO. Configurations. Features. Introduction. Applications

8Mbit to 256MBit HyperMemory SRAM and FIFO. Configurations. Features. Introduction. Applications 8Mbit to 256MBit HyperMemory SRAM and FIFO Features Super high-speed Static-Memory Can be configured as a standalone FIFO Supports multiple IO Standards (HSTL, SSTL, LVCMOS/ LVTTL) Access time as low as

More information

CMPEN 411 VLSI Digital Circuits Spring Lecture 15: Dynamic CMOS

CMPEN 411 VLSI Digital Circuits Spring Lecture 15: Dynamic CMOS CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 15: Dynamic CMOS [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp12 CMPEN 411 L15

More information

1 Introduction. 2 Cranking Pulse. Application Note. AN2201/D Rev. 0, 11/2001. Low Battery Cranking Pulse in Automotive Applications

1 Introduction. 2 Cranking Pulse. Application Note. AN2201/D Rev. 0, 11/2001. Low Battery Cranking Pulse in Automotive Applications Application Note Rev. 0, 11/2001 Low Battery Cranking Pulse in Automotive Applications by Axel Bahr Freescale Field Applications Engineering Munich, Germany 1 Introduction 2 Cranking Pulse Electronic modules

More information

Fast Orbit Feedback (FOFB) at Diamond

Fast Orbit Feedback (FOFB) at Diamond Fast Orbit Feedback (FOFB) at Diamond Guenther Rehm, Head of Diagnostics Group 29/06/2007 FOFB at Diamond 1 Ground, Girder and Beam Motion 29/06/2007 FOFB at Diamond 2 Fast Feedback Design Philosophy Low

More information

A Predictive Delay Fault Avoidance Scheme for Coarse Grained Reconfigurable Architecture

A Predictive Delay Fault Avoidance Scheme for Coarse Grained Reconfigurable Architecture A Predictive Fault Avoidance Scheme for Coarse Grained Reconfigurable Architecture Toshihiro Kameda 1 Hiroaki Konoura 1 Dawood Alnajjar 1 Yukio Mitsuyama 2 Masanori Hashimoto 1 Takao Onoye 1 hasimoto@ist.osaka

More information

HYB25D256400/800AT 256-MBit Double Data Rata SDRAM

HYB25D256400/800AT 256-MBit Double Data Rata SDRAM 256-MBit Double Data Rata SDRAM Features CAS Latency and Frequency Maximum Operating Frequency (MHz) CAS Latency DDR266A -7 DDR200-8 2 133 100 2.5 143 125 Double data rate architecture: two data transfers

More information

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Pipeline Hazards See P&H Chapter 4.7 Hakim Weatherspoon CS 341, Spring 213 Computer Science Cornell niversity Goals for Today Data Hazards Revisit Pipelined Processors Data dependencies Problem, detection,

More information

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Pipeline Hazards See P&H Chapter 4.7 Hakim Weatherspoon CS 341, Spring 213 Computer Science Cornell niversity Goals for Today Data Hazards Revisit Pipelined Processors Data dependencies Problem, detection,

More information

Page 1. Goal. Digital Circuits: why they leak, how to counter. Design methodology: consider all design abstraction levels. Outline: bottom-up

Page 1. Goal. Digital Circuits: why they leak, how to counter. Design methodology: consider all design abstraction levels. Outline: bottom-up Digital ircuits: why they leak, how to counter Ingrid Verbauwhede Ingrid.verbauwhede-at-esat.kuleuven.be KU Leuven, OSI cknowledgements: urrent and former Ph.D. students Fundamental understanding of MOS

More information

Nickel Cadmium and Nickel Hydride Battery Charging Applications Using the HT48R062

Nickel Cadmium and Nickel Hydride Battery Charging Applications Using the HT48R062 ickel Cadmium and ickel Hydride Battery Charging Applications Using the HT48R062 ickel Cadmium and ickel Hydride Battery Charging Applications Using the HT48R062 D/: HA0126E Introduction This application

More information

Power distribution techniques for dual-vdd circuits. Sarvesh H Kulkarni and Dennis Sylvester EECS Department, University of Michigan

Power distribution techniques for dual-vdd circuits. Sarvesh H Kulkarni and Dennis Sylvester EECS Department, University of Michigan Power distribution techniques for dual-vdd circuits Sarvesh H Kulkarni and Dennis Sylvester EECS Department, University of Michigan Outline Motivation for multiple supply design Implications of using multiple

More information

IS42S32200L IS45S32200L

IS42S32200L IS45S32200L IS42S32200L IS45S32200L 512K Bits x 32 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM OCTOBER 2012 FEATURES Clock frequency: 200, 166, 143, 133 MHz Fully synchronous; all signals referenced to a positive

More information

IS42S32200C1. 512K Bits x 32 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM

IS42S32200C1. 512K Bits x 32 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM 512K Bits x 32 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM JANUARY 2007 FEATURES Clock frequency: 183, 166, 143 MHz Fully synchronous; all signals referenced to a positive clock edge Internal bank

More information

HYB25D256[400/800/160]B[T/C](L) 256-Mbit Double Data Rate SDRAM, Die Rev. B Data Sheet Jan. 2003, V1.1. Features. Description

HYB25D256[400/800/160]B[T/C](L) 256-Mbit Double Data Rate SDRAM, Die Rev. B Data Sheet Jan. 2003, V1.1. Features. Description Data Sheet Jan. 2003, V1.1 Features CAS Latency and Frequency Maximum Operating Frequency (MHz) CAS Latency DDR200-8 DDR266A -7 DDR266-7F DDR333-6 2 100 133 133 133 2.5 125 143 143 166 Double data rate

More information

COSC 6385 Computer Architecture. - Tomasulos Algorithm

COSC 6385 Computer Architecture. - Tomasulos Algorithm COSC 6385 Computer Architecture - Tomasulos Algorithm Fall 2008 Analyzing a short code-sequence DIV.D F0, F2, F4 ADD.D F6, F0, F8 S.D F6, 0(R1) SUB.D F8, F10, F14 MUL.D F6, F10, F8 1 Analyzing a short

More information

CS 152 Computer Architecture and Engineering. Lecture 14 - Advanced Superscalars

CS 152 Computer Architecture and Engineering. Lecture 14 - Advanced Superscalars CS 152 Comuter Architecture and Engineering Lecture 14 - Advanced Suerscalars Krste Asanovic Electrical Engineering and Comuter Sciences University of California at Berkeley htt://www.eecs.berkeley.edu/~krste

More information

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 02

More information

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017 ECE 550D Fundamentals of Computer Systems and Engineering Fall 2017 Digital Arithmetic Prof. John Board Duke University Slides are derived from work by Profs. Tyler Bletch and Andrew Hilton (Duke) Last

More information

CprE 281: Digital Logic

CprE 281: Digital Logic CprE 28: Digital Logic Instructor: Alexander Stoytchev http://www.ece.iastate.edu/~alexs/classes/ Registers and Counters CprE 28: Digital Logic Iowa State University, Ames, IA Copyright Alexander Stoytchev

More information

128Mb DDR SDRAM. Features. Description. REV 1.1 Oct, 2006

128Mb DDR SDRAM. Features. Description. REV 1.1 Oct, 2006 Features Double data rate architecture: two data transfers per clock cycle Bidirectional data strobe () is transmitted and received with data, to be used in capturing data at the receiver is edge-aligned

More information

DARE+ DARE+ Design Against Radiation Effects (Digital) Cell Libraries. Jupiter Icy Moons Explorer (JUICE) Instruments Workshop 9 November 2011

DARE+ DARE+ Design Against Radiation Effects (Digital) Cell Libraries. Jupiter Icy Moons Explorer (JUICE) Instruments Workshop 9 November 2011 DARE+ Design Against Radiation Effects (Digital) Cell Libraries Jupiter Icy Moons Explorer (JUICE) Instruments Workshop 9 November 2011 Objectives (1/2) Provide a suitable and mixed-signal capable microelectronic

More information

CHAPTER 19 DC Circuits Units

CHAPTER 19 DC Circuits Units CHAPTER 19 DC Circuits Units EMF and Terminal Voltage Resistors in Series and in Parallel Kirchhoff s Rules EMFs in Series and in Parallel; Charging a Battery Circuits Containing Capacitors in Series and

More information

HYB25D256400B[T/C](L) HYB25D256800B[T/C](L) HYB25D256160B[T/C](L)

HYB25D256400B[T/C](L) HYB25D256800B[T/C](L) HYB25D256160B[T/C](L) Data Sheet, Rev. 1.21, Jul. 2004 HYB25D256400B[T/C](L) HYB25D256800B[T/C](L) HYB25D256160B[T/C](L) 256 Mbit Double Data Rate SDRAM DDR SDRAM Memory Products N e v e r s t o p t h i n k i n g. Edition 2004-07

More information

SDRAM AS4SD8M Mb: 8 Meg x 16 SDRAM Synchronous DRAM Memory. PIN ASSIGNMENT (Top View)

SDRAM AS4SD8M Mb: 8 Meg x 16 SDRAM Synchronous DRAM Memory. PIN ASSIGNMENT (Top View) 128 Mb: 8 Meg x 16 SDRAM Synchronous DRAM Memory FEATURES Full Military temp (-55 C to 125 C) processing available Configuration: 8 Meg x 16 (2 Meg x 16 x 4 banks) Fully synchronous; all signals registered

More information

IS42S Meg Bits x 16 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM FEATURES OVERVIEW. PIN CONFIGURATIONS 54-Pin TSOP (Type II)

IS42S Meg Bits x 16 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM FEATURES OVERVIEW. PIN CONFIGURATIONS 54-Pin TSOP (Type II) 1 Meg Bits x 16 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM JANUARY 2008 FEATURES Clock frequency: 166, 143 MHz Fully synchronous; all signals referenced to a positive clock edge Internal bank for

More information

Drowsy Caches Simple Techniques for Reducing Leakage Power Krisztián Flautner Nam Sung Kim Steve Martin David Blaauw Trevor Mudge

Drowsy Caches Simple Techniques for Reducing Leakage Power Krisztián Flautner Nam Sung Kim Steve Martin David Blaauw Trevor Mudge Drowsy Caches Simple Techniques for Reducing Leakage Power Krisztián Flautner Nam Sung Kim Steve Martin David Blaauw Trevor Mudge krisztian.flautner@arm.com kimns@eecs.umich.edu stevenmm@eecs.umich.edu

More information

EEL Project Design Report: Automated Rev Matcher. January 28 th, 2008

EEL Project Design Report: Automated Rev Matcher. January 28 th, 2008 Brad Atherton, masscles@ufl.edu, 352.262.7006 Monique Mennis, moniki@ufl.edu, 305.215.2330 EEL 4914 Project Design Report: Automated Rev Matcher January 28 th, 2008 Project Abstract Our device will minimize

More information

DOUBLE DATA RATE (DDR) SDRAM

DOUBLE DATA RATE (DDR) SDRAM UBLE DATA RATE Features VDD = +2.5V ±.2V, VD = +2.5V ±.2V Bidirectional data strobe transmitted/ received with data, i.e., source-synchronous data capture x6 has two one per byte Internal, pipelined double-data-rate

More information

A48P4616B. 16M X 16 Bit DDR DRAM. Document Title 16M X 16 Bit DDR DRAM. Revision History. AMIC Technology, Corp. Rev. No. History Issue Date Remark

A48P4616B. 16M X 16 Bit DDR DRAM. Document Title 16M X 16 Bit DDR DRAM. Revision History. AMIC Technology, Corp. Rev. No. History Issue Date Remark 16M X 16 Bit DDR DRAM Document Title 16M X 16 Bit DDR DRAM Revision History Rev. No. History Issue Date Remark 1.0 Initial issue January 9, 2014 Final (January, 2014, Version 1.0) AMIC Technology, Corp.

More information

Five Cool Things You Can Do With Powertrain Blockset The MathWorks, Inc. 1

Five Cool Things You Can Do With Powertrain Blockset The MathWorks, Inc. 1 Five Cool Things You Can Do With Powertrain Blockset Mike Sasena, PhD Automotive Product Manager 2017 The MathWorks, Inc. 1 FTP75 Simulation 2 Powertrain Blockset Value Proposition Perform fuel economy

More information

Timing is everything with internal combustion engines By: Bernie Thompson

Timing is everything with internal combustion engines By: Bernie Thompson Timing is everything with internal combustion engines By: Bernie Thompson As one goes through life, it is said that timing is everything. In the case of the internal combustion engine, this could not be

More information

EEC 216 Lecture #10: Power Sources. Rajeevan Amirtharajah University of California, Davis

EEC 216 Lecture #10: Power Sources. Rajeevan Amirtharajah University of California, Davis EEC 216 Lecture #10: Power Sources Rajeevan Amirtharajah University of California, Davis Announcements Outline Review: Adiabatic Charging and Energy Recovery Lecture 9: Dynamic Energy Recovery Logic Lecture

More information

CS 6354: Tomasulo. 21 September 2016

CS 6354: Tomasulo. 21 September 2016 1 CS 6354: Tomasulo 21 September 2016 To read more 1 This day s paper: Tomasulo, An Efficient Algorithm for Exploiting Multiple Arithmetic Units Supplementary readings: Hennessy and Patterson, Computer

More information

The seal of the century web tension control

The seal of the century web tension control TENSIONING GEARING CAMMING Three techniques that can improve your automated packaging equipment performance What are 3 core motion techniques that can improve performance? Web Tension Control Proportional

More information

SYNCHRONOUS DRAM. 128Mb: x32 SDRAM. MT48LC4M32B2-1 Meg x 32 x 4 banks

SYNCHRONOUS DRAM. 128Mb: x32 SDRAM. MT48LC4M32B2-1 Meg x 32 x 4 banks SYNCHRONOUS DRAM 128Mb: x32 MT48LC4M32B2-1 Meg x 32 x 4 banks For the latest data sheet, please refer to the Micron Web site: www.micron.com/sdramds FEATURES PC100 functionality Fully synchronous; all

More information

Roehrig Engineering, Inc.

Roehrig Engineering, Inc. Roehrig Engineering, Inc. Home Contact Us Roehrig News New Products Products Software Downloads Technical Info Forums What Is a Shock Dynamometer? by Paul Haney, Sept. 9, 2004 Racers are beginning to realize

More information

Storage and Memory Hierarchy CS165

Storage and Memory Hierarchy CS165 Storage and Memory Hierarchy CS165 What is the memory hierarchy? L1

More information

To read more. CS 6354: Tomasulo. Intel Skylake. Scheduling. How can we reorder instructions? Without changing the answer.

To read more. CS 6354: Tomasulo. Intel Skylake. Scheduling. How can we reorder instructions? Without changing the answer. To read more CS 6354: Tomasulo 21 September 2016 This day s paper: Tomasulo, An Efficient Algorithm for Exploiting Multiple Arithmetic Units Supplementary readings: Hennessy and Patterson, Computer Architecture:

More information

A new perspective. The Kalmar RTG range.

A new perspective. The Kalmar RTG range. A new perspective. The range. All new s enable you to take advantage of automating some or all of your processes; from remote control to fully automated moves. Securing the future of your business. s extensive

More information

Sequential Circuit Background. Young Won Lim 11/6/15

Sequential Circuit Background. Young Won Lim 11/6/15 Sequential Circuit /6/5 Copyright (c) 2 25 Young W. Lim. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free ocumentation License, Version.2 or any later

More information

AN-1166 Lithium Polymer Battery Charger using GreenPAK State Machine

AN-1166 Lithium Polymer Battery Charger using GreenPAK State Machine AN-1166 Lithium Polymer Battery Charger using GreenPAK State Machine This note describes the design of a complete charging circuit. A single cell Lithium Polymer (LiPol) battery is charged in two stages:

More information

Design-Technology Co-Optimization for 5nm Node and Beyond

Design-Technology Co-Optimization for 5nm Node and Beyond Design-Technology Co-Optimization for 5 Node and Beyond Semicon West 26 Victor Moroz July 2, 26 Why Scaling? When What scales? When does it end? 965 999 2 Moore s Law (Fairchild): Double transistor density

More information

Basic Electricity. Mike Koch Lead Mentor Muncie Delaware Robotics Team 1720 PhyXTGears. and Electronics. for FRC

Basic Electricity. Mike Koch Lead Mentor Muncie Delaware Robotics Team 1720 PhyXTGears. and Electronics. for FRC Basic Electricity and Electronics for FRC Mike Koch Lead Mentor Muncie Delaware Robotics Team 1720 PhyXTGears The Quick Tour The Analog World Basic Electricity The Digital World Digital Logic The Rest

More information

Composite Layout CS/ECE 5710/6710. N-type from the top. N-type Transistor. Polysilicon Mask. Diffusion Mask

Composite Layout CS/ECE 5710/6710. N-type from the top. N-type Transistor. Polysilicon Mask. Diffusion Mask Composite Layout CS/ECE 5710/6710 Introduction to Layout Inverter Layout Example Layout Design Rules Drawing the mask layers that will be used by the fabrication folks to make the devices Very different

More information

Field Programmable Gate Arrays a Case Study

Field Programmable Gate Arrays a Case Study Designing an Application for Field Programmable Gate Arrays a Case Study Bernd Däne www.tu-ilmenau.de/ra Bernd.Daene@tu-ilmenau.de de Technische Universität Ilmenau Topics 1. Introduction and Goals 2.

More information