Warped-Compression: Enabling Power Efficient GPUs through Register Compression

1 Warped-Compression: Enabling Power Efficient GPUs through Register Compression
Sangpil Lee, Keunsoo Kim, Won Woo Ro (Yonsei University*)
Gunjae Koo, Hyeran Jeon, Murali Annavaram (USC)
(*Work done while visiting USC)

2 Short Summary
Target: register file on GPUs
Problem: energy consumption of the register file
Solution: data compression on the register file
Results: 25% reduction in register file energy consumption

3 Motivation: Register Power Consumption
GPUs need large register files to maximize TLP.
The register file contributes a significant portion of the total GPU chip power.
Register file size has been growing:
  Tesla (G80/G92): 512 KB
  Tesla (GT200): 1920 KB
  Fermi (GF110): 2048 KB
  Kepler (GK110): 3840 KB
  Maxwell (GM200): 6144 KB
[Figure: estimated GeForce GTX480 (Fermi) component power consumption*]
*Leng et al., GPUWattch: Enabling Energy Optimizations in GPGPUs

4 Motivation: GPU Register Characteristics
Warp: a bundle of 32 threads.
Operand of a warp: a bundle of 32 thread registers. This bundle of registers is treated as a single instruction operand in GPUs.
Example warp instruction: add.u32 %r0, %r1, %r6; (dst = %r0, src1 = %r1, src2 = %r6)
[Figure: threads T0 to T31 each hold their own r0, r1, and r6; one warp operand is 32-bit registers x 32 (128 bytes)]

5 Baseline Register File
Multi-banked register file*:
  4 KB per bank, 32 banks
  128-bit wide, single read/write port
  Provides 4 thread operands per bank
  Access 8 banks for collecting a warp operand
[Figure: bank arbiter, 4 KB banks (128-bit wide) Bank 0 to Bank 7, operand collector buffer (32-bit x 32)]
*Gebhart et al., Energy-efficient Mechanisms for Managing Thread Context in Throughput Processors
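The bank geometry on this slide can be checked with a few lines of arithmetic. The following is a minimal Python sketch; the constant and function names are illustrative and do not come from the slides.

```python
# Register-file geometry as stated on the slide (names are illustrative).
BANK_WIDTH_BITS = 128   # width of one bank's single read/write port
THREAD_REG_BITS = 32    # one thread register
WARP_SIZE = 32          # threads per warp


def operands_per_bank_access() -> int:
    """Thread registers delivered by one bank access."""
    return BANK_WIDTH_BITS // THREAD_REG_BITS


def banks_per_warp_operand() -> int:
    """Banks that must be accessed to collect one full warp operand."""
    return WARP_SIZE // operands_per_bank_access()


def warp_operand_bytes() -> int:
    """Size of one uncompressed warp operand in bytes."""
    return WARP_SIZE * THREAD_REG_BITS // 8
```

With these values, one bank access yields 4 thread operands, a warp operand spans 8 banks, and its uncompressed size is 128 bytes, matching the figures on slides 4 and 5.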

6 Register File Access Energy
Accessing warp operand registers activates multiple banks: bank access energy + wire energy.
  4 KB SRAM access energy (1): 7 pJ
  128-bit wire energy (2): 9.6 pJ/mm, over a 1 mm wire
  Access energy per warp operand: (7 pJ + 9.6 pJ/mm x 1 mm) x 8 = 132.8 pJ
Register file access is power hungry! How can we reduce register file access energy?
[Figure: bank arbiter, Bank 0 to Bank 7, operand collector buffer (32-bit x 32)]
(1) CACTI (1.0 V, 45 nm)
(2) Gebhart et al., Energy-efficient Mechanisms for Managing Thread Context in Throughput Processors (1.0 V, 40 nm)
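The 132.8 pJ per-operand figure can be reproduced from the two energy terms, assuming (as the 1 mm label on the slide suggests) that each bank access drives a 1 mm wire. This is a minimal Python sketch; the names are illustrative, and the `banks` parameter is an added convenience for exploring how the cost scales when fewer banks are accessed.

```python
SRAM_ACCESS_PJ = 7.0     # 4 KB bank access energy (CACTI, 1.0 V, 45 nm)
WIRE_PJ_PER_MM = 9.6     # 128-bit wire energy per mm
WIRE_LENGTH_MM = 1.0     # assumed wire length per bank access
BANKS_PER_OPERAND = 8    # banks accessed for an uncompressed warp operand


def warp_operand_energy_pj(banks: int = BANKS_PER_OPERAND) -> float:
    """Energy to access `banks` banks: (bank energy + wire energy) per bank."""
    per_bank = SRAM_ACCESS_PJ + WIRE_PJ_PER_MM * WIRE_LENGTH_MM
    return per_bank * banks
```

Because the cost is linear in the number of banks, halving the banks accessed halves this part of the energy. Compressor and decompressor energy is not modeled here.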

7 Opportunity: Similarity of Register Values
Value similarity is frequently observed within a warp operand:
  Constant value: all thread registers in a warp have the same value.
  Index values: all thread registers have incremental values.
  Low dynamic range: the values of all thread registers are bounded in a limited range (slide example: dynamic range 46, min = 127, max = 173).

8 Source of Value Similarity: pathfinder*
Legend: Index Values, Constant Values, Low Dynamic Range

__global__ void pathfinder_kernel(int iteration, ...) {
    ...
    int tx = threadIdx.x;
    int bx = blockIdx.x;
    int small_block_cols = BLOCKSIZE-iteration*HALO*2;
    int blkx = small_block_cols*bx-border;
    int xidx = blkx+tx;
    ...
    for (int i=0; i<iteration ; i++){
        computed = false;
        if( IN_RANGE(tx, i+1, BLOCKSIZE-i-2) && isvalid){
            computed = true;
            int left = prev[w];
            int up = prev[tx];
            int right = prev[e];
            int shortest = MIN(left, up);
            shortest = MIN(shortest, right);
            int index = cols*(startstep+i)+xidx;
            result[tx] = shortest + wall[index];
        }
    }
}

Value ranges annotated on the slide: Thread Index (0 ~ 1023), Thread Block Index (0 ~ 65535), Application Input Data (0 ~ 9)
*from Rodinia Benchmark Suite

9 Arithmetic Distance Distribution
How much is this opportunity? On average, 70% of thread registers are not random.
  Zero: neighboring registers have the same value
  128 bin: neighboring registers differ by at most 128
  32K bin: neighboring registers differ by at most 32K
[Figure: distribution of Zero, 128 bin, 32K bin, and Random]
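The binning can be illustrated with a short Python sketch. The slides do not spell out the exact measurement, so this assumes the distance is the absolute difference between successive thread registers and that the bin names give the thresholds (128 and 32K = 32768). The function names are hypothetical.

```python
from typing import Dict, List


def distance_bin(a: int, b: int) -> str:
    """Classify the arithmetic distance between two neighboring registers."""
    d = abs(a - b)
    if d == 0:
        return "zero"
    if d <= 128:            # assumed threshold for the 128 bin
        return "128"
    if d <= 32 * 1024:      # assumed threshold for the 32K bin
        return "32K"
    return "random"


def bin_histogram(regs: List[int]) -> Dict[str, int]:
    """Count neighbor pairs of one warp operand that fall into each bin."""
    counts = {"zero": 0, "128": 0, "32K": 0, "random": 0}
    for prev, cur in zip(regs, regs[1:]):
        counts[distance_bin(prev, cur)] += 1
    return counts
```

Under these assumptions, a constant operand lands entirely in the zero bin, and an operand holding thread indices 0 to 31 lands entirely in the 128 bin.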

10 Exploiting Value Similarity for Register Compression

11 Register Compression
[Figure: on writeback, the warp operand (32-bit x 32) passes through the compressor; the 50% compressed data goes through the bank arbiter into Bank 0 to Bank 3, leaving Bank 4 to Bank 7 untouched. On a read, the compressed data is decompressed back into a warp operand (32-bit x 32).]
Only 50% of the register file and wires are active.

12 But Is It Practical?
Energy consumption: compression and decompression consume extra energy.
Register file access latency: compression and decompression increase register file access latency.
Requirements for register compression:
  Low-energy compression
  Low-latency compression
  High compression ratio

13 Low Latency/Energy Compression
Base-Delta-Immediate (BΔI) compression:
  Optimized for zero and similar value compression
  Uses a base and deltas to represent the original values
Example:
  Original data: a warp operand (32 thread registers) holding 100,000,000 ... 100,000,031, 4 bytes each, 128 bytes in total
  BΔI compressed representation (Base 4, Delta 1): base value 100,000,000 (4 bytes) plus delta values up to 31 (1 byte each), 35 bytes in total
  Register file: 3 banks used, 5 banks unused
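The slide's example can be traced with a small Python sketch of base-plus-delta compression. This is an illustration under assumptions, not the authors' hardware design: the first thread register is used as the base, deltas are signed and relative to that base, 32-bit wraparound and metadata bits are ignored, and a bank row is 16 bytes (128 bits). All names are hypothetical.

```python
import math
from typing import List, Optional, Tuple

BASE_BYTES = 4    # 4-byte base, as on the slide
BANK_BYTES = 16   # one 128-bit bank row


def compress(regs: List[int], delta_bytes: int = 1) -> Optional[Tuple[int, List[int]]]:
    """Return (base, deltas) if every delta fits in `delta_bytes` (>= 1), else None."""
    base = regs[0]
    deltas = [r - base for r in regs[1:]]
    limit = 1 << (8 * delta_bytes - 1)   # signed range is [-limit, limit)
    if all(-limit <= d < limit for d in deltas):
        return base, deltas
    return None


def decompress(base: int, deltas: List[int]) -> List[int]:
    """Rebuild the warp operand by adding each delta back to the base."""
    return [base] + [base + d for d in deltas]


def compressed_bytes(n_regs: int, delta_bytes: int = 1) -> int:
    """Base plus one delta for every register except the base itself."""
    return BASE_BYTES + (n_regs - 1) * delta_bytes


def banks_needed(size_bytes: int) -> int:
    """Number of 16-byte bank rows touched by `size_bytes` of data."""
    return math.ceil(size_bytes / BANK_BYTES)
```

For the operand 100,000,000 ... 100,000,031 this gives 4 + 31 = 35 bytes, which fits in 3 banks, in line with the slide.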

14 BΔI Compression Parameters
BΔI can use various base and delta sizes (base: 2, 4, 8 bytes / delta: 0, 1, 2 bytes).
  Various base and delta sizes can improve the compression ratio,
  but they also increase the complexity of compression/decompression.
Use a single base size and various delta sizes:
  Most registers can be compressed by using a 4-byte base (Base 4).
  Various delta sizes improve the compression ratio.
  We use a 4-byte base and 0/1/2-byte deltas.
[Figure: ratio of BΔI base/delta types used, and compression ratio for Delta 0 only, Delta 1 only, Delta 2 only, and Delta 0/1/2 with a 4-byte base, including AVG]
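One way to picture the "single base, various delta" choice is to select the smallest delta width that fits a given warp operand. The following Python sketch assumes a 4-byte base taken from the first thread register and signed deltas. The selection policy and the function names are illustrative and are not taken from the slides.

```python
from typing import List, Optional

BASE_BYTES = 4


def pick_delta_bytes(regs: List[int]) -> Optional[int]:
    """Smallest delta width (0, 1, or 2 bytes) that fits, or None if incompressible."""
    base = regs[0]
    deltas = [r - base for r in regs[1:]]
    if all(d == 0 for d in deltas):
        return 0                          # constant operand: base only
    for size in (1, 2):
        limit = 1 << (8 * size - 1)       # signed range is [-limit, limit)
        if all(-limit <= d < limit for d in deltas):
            return size
    return None


def stored_bytes(regs: List[int]) -> int:
    """Bytes stored for one warp operand under the chosen delta width."""
    size = pick_delta_bytes(regs)
    if size is None:
        return 4 * len(regs)              # stored uncompressed
    return BASE_BYTES + size * (len(regs) - 1)
```

For 32 registers this yields 4, 35, or 66 bytes for 0-, 1-, and 2-byte deltas respectively, against 128 bytes uncompressed.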

15 Warped-Compression Architecture
Compressor: inserted in front of the register file banks.
Decompressor: inserted in front of the operand collectors.
Bank arbiter tracks:
  which registers are compressed
  what compression parameters are used
[Figure: warp scheduler, issue, bank arbiter with compression range indicator vector, compressor unit array, register banks 0 to 31, interconnect, decompressor unit array, operand collectors, SIMD EXE units]

16 Dealing with Branch Divergence
Branch divergence: destination registers in a warp are partially updated using the active mask.
If the destination register is compressed, it cannot be updated using the active mask.
[Figure: for if (threadid % 2), the true path executes add r0, r1, r6 and the false path executes sub r0, r1, r6, each under its own active mask over threads T0 to T7; the execution results have to be merged into a compressed destination register r0]

17 Simplifying Branch Divergence Handling
Compression ratio in divergent regions is low: thread registers in a diverged warp can have different values according to their execution path.
Simple solution: disable compression in divergent regions.
But what if a destination register is already compressed? Use dummy MOV instructions.
[Figure: compression ratio in the non-divergent region, the divergent region, and overall; some entries are N/A]

18 Handling Branch Divergence (1)
Turn off register compression: the compression unit is disabled when the active mask contains any zero values.
Decompress the destination operand register: the bank arbiter injects a dummy MOV instruction into the execution pipeline when a destination register is compressed. This dummy MOV instruction has the same source and destination register.
Steps in the figure (example: add r0, r1, r6):
  1. Register access request to read input operands (r1, r6)
  2. Divergence check
  3. Destination register (r0) check
  4. If the destination register is compressed, suspend the original request and inject a dummy MOV instruction (mov r0, r0)
  5. Read and decompress

19 Handling Branch Divergence (2)
Update the register file: the dummy MOV instruction writes the uncompressed register value. At this point, the destination register in the register file is uncompressed (the compressor is disabled).
Resume the suspended request: the bank arbiter processes the suspended access request to the destination register as a conventional register access.
Steps in the figure:
  6. Write back the uncompressed destination register value
  7. Bank arbiter grants the register write for the uncompressed register value
  8. Bank arbiter restarts the suspended register access request
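The divergence-handling flow on these two slides can be sketched as a tiny behavioral model. This Python sketch is a simplification under assumptions, not the hardware design: a register is modeled as a list of per-thread values plus a "compressed" flag that the bank arbiter is assumed to track, and a non-divergent result is assumed to be compressible. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RegEntry:
    values: List[int]   # logical per-thread values of one warp register
    compressed: bool    # flag assumed to be tracked by the bank arbiter


def write_back(entry: RegEntry, results: List[int], active_mask: List[bool]) -> List[str]:
    """Apply one warp writeback and return the actions taken, in order."""
    actions = []
    if all(active_mask):
        # Non-divergent: every thread writes, so the compressor stays enabled.
        entry.values = list(results)
        entry.compressed = True   # simplification: assume the result compresses
        actions.append("compressed_write")
        return actions
    # Divergent: the active mask contains a zero, so the compressor is disabled.
    if entry.compressed:
        # Dummy MOV (same source and destination): read, decompress,
        # and write the register back uncompressed before the real write.
        entry.compressed = False
        actions.append("dummy_mov")
    # Resume the suspended request as a conventional masked write.
    entry.values = [new if on else old
                    for old, new, on in zip(entry.values, results, active_mask)]
    actions.append("masked_write")
    return actions
```

In this model a divergent write to a compressed register produces a dummy MOV followed by the masked write, and the register stays uncompressed until a later non-divergent write.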

20 Register File Energy Saving
Average register file energy consumption: reduced by 25%.
  Dynamic energy consumption: reduced by register compression
  Leakage energy consumption: reduced by bank-level power gating of unused banks
  Extra energy consumption of the compressor/decompressor: insignificant
[Figure: register file energy broken down into RF leakage, RF dynamic, compressor, and decompressor, including AVG]

21 Impact on Performance
Performance degradation: negligible.
  2-cycle compression + 1-cycle decompression latency = 0.1% performance loss
  Dummy MOV instructions account for less than 2% of the total instruction count
[Figure: execution time, Baseline vs. Warped-Compression]

22 Conclusion
Register files are power hungry, but register file data exhibits strong value similarity.
Use BΔI compression to exploit value similarity and compress register data.
  Compression is effective: reduces the size of a warp operand to 60%
  Compression is energy efficient: saves 25% of total register file energy consumption
  Compression has negligible performance impact: 0.1% degradation

23 Backup Slides

24 Evaluation Environment
Simulation parameters:
  Parameter                    Value
  Clock Frequency              1.4 GHz
  SMs / GPU                    15
  Warp Schedulers / SM         2
  Warp Scheduling Policy       GTO
  SIMT Lane Width              32
  Max # of Warps / SM          48
  Max # of Threads / SM        1536
  Register File Size           128 KB
  Max Registers / SM           32,768
  # of Register Banks          32
  Bit Width / Bank             128 bits
  # of Entries / Bank          256
  # of Compressors             2
  # of Decompressors           4
  Compression Latency          2 cycles
  Decompression Latency        1 cycle
  Bank Wakeup Latency          10 cycles

Energy parameters:
  Parameter                                 Value
  Operating Voltage                         1.0 V
  Wire Capacitance (45 nm)                  300 fF/mm
  Wire Energy (128-bit)                     9.6 pJ/mm
  Access Energy / Bank (45 nm)              7 pJ
  Leakage Power / Bank (45 nm)              5.8 mW
  Compression Unit Energy / Activation      23 pJ
  Compression Unit Leakage Power            0.12 mW
  Decompression Unit Energy / Activation    21 pJ
  Decompression Unit Leakage Power          0.08 mW

Benchmarks: GPGPU-Sim, Rodinia benchmark suite, Parboil benchmark suite
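Several of the simulation parameters are related by simple arithmetic, which makes a quick consistency check possible. This Python sketch assumes 4-byte (32-bit) registers, as stated elsewhere in the slides; the names are illustrative.

```python
BANKS = 32
ENTRIES_PER_BANK = 256
BANK_WIDTH_BITS = 128
MAX_REGISTERS_PER_SM = 32768
MAX_THREADS_PER_SM = 1536
SIMT_LANE_WIDTH = 32
REGISTER_BYTES = 4          # assumed: 32-bit thread registers


def rf_bytes_from_banks() -> int:
    """Register file capacity derived from the bank organization."""
    return BANKS * ENTRIES_PER_BANK * BANK_WIDTH_BITS // 8


def rf_bytes_from_registers() -> int:
    """Register file capacity derived from the register count."""
    return MAX_REGISTERS_PER_SM * REGISTER_BYTES


def max_warps_per_sm() -> int:
    """Warps per SM implied by the thread limit and the lane width."""
    return MAX_THREADS_PER_SM // SIMT_LANE_WIDTH
```

Both capacity calculations give 131,072 bytes (128 KB), and 1536 threads at 32 threads per warp give the listed 48 warps per SM.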

25 Compression & Decompression Unit
Simplifying BΔI:
  GPU register: 32-bit
  Only use a 4-byte base and 0/1/2-byte deltas for compressing register values
  Only need 32-bit adders/subtractors and bit comparators
[Figure: Compressor: 128-byte original data -> 32-bit subtractors against the 4-byte base (Δ0 ... Δ30) -> sign-extension comparators -> compressible? If yes, pack data and output compressed data; if no, output original data. Decompressor: 4-byte base and deltas -> 32-bit adders -> 128-byte original data.]

26 Arithmetic Distance Distribution
How much is this opportunity? On average, 79% of thread registers are not random.
  Zero: neighboring registers have the same value
  128 bin: neighboring registers differ by at most 128
  32K bin: neighboring registers differ by at most 32K
[Figure: per-benchmark distribution of Zero, 128 bin, 32K bin, and Random (bars labeled Nondiv; some entries N/A) for LIB, AES, BFS, CP, LPS, STO, backp, hots, path, srad, dwt2d, cutcp, mriq, sad, sgemm, spmv, stencil, and Avg]

27 Register Compression
Compressed register data reduces the number of register file accesses.
[Figure: 50% compressed data passes through the bank arbiter into Bank 0 to Bank 3; Bank 4 to Bank 7 do not need to be accessed; decompression restores the warp operand (32-bit x 32)]
Only 50% of the register file and wires are active.

28 BΔI Compression Parameters
BΔI can use various base and delta sizes (base: 2, 4, 8 bytes / delta: 0, 1, 2 bytes).
  Various base and delta sizes can improve the compression ratio,
  but they increase the complexity of compression/decompression.
Use a fixed base size and various delta sizes:
  Most registers can be compressed by using a 4-byte base (Base 4).
  GPU register granularity is 32-bit, so 2-byte or 8-byte bases are not needed.
  Various delta sizes improve the compression ratio.
  We use a 4-byte base and 0/1/2-byte deltas.
[Figure: ratio of BΔI base/delta types used, and compression ratio for Delta 0 only, Delta 1 only, Delta 2 only, and Delta 0/1/2 with a 4-byte base, including AVG]

29 Handling Branch Divergence
Compression ratio in divergent regions is low.
Solution: disable compression and decompress the register before access.
  A dummy MOV instruction (with the same source and destination) is used to decompress the register when the destination register is compressed.
Flowchart elements on the slide: Writeback; Active Mask Has 0; Disable Compressor; Destination Register is Compressed?; Inject Dummy MOV; Target Register Writeback Suspended; Decompress Destination Register; Resume Register Write; Complete Writeback
[Figure: compression ratio in the non-divergent region, the divergent region, and overall; some entries are N/A]

30 Register File Energy Saving
Average register file energy consumption: reduced by 25%.
  Dynamic energy consumption: reduced by register compression
  Leakage energy consumption: reduced by bank-level power gating of unused banks
  Extra energy consumption of the compressor/decompressor: insignificant
[Figure: per-benchmark register file energy broken down into RF leakage, RF dynamic, compressor, and decompressor for LIB, AES, BFS, CP, LPS, STO, back, hot, path, srad, dwt2d, cutcp, mriq, sad, sgemm, spmv, stencil, and AVG]


More information

Design and Analysis of 32 Bit Regular and Improved Square Root Carry Select Adder

Design and Analysis of 32 Bit Regular and Improved Square Root Carry Select Adder 76 Design and Analysis of 32 Bit Regular and Improved Square Root Carry Select Adder Anju Bala 1, Sunita Rani 2 1 Department of Electronics and Communication Engineering, Punjabi University, Patiala, India

More information

WESTERN INTERCONNECTION TRANSMISSION TECHNOLGOY FORUM

WESTERN INTERCONNECTION TRANSMISSION TECHNOLGOY FORUM 1 1 The Latest in the MIT Future of Studies Recognizing the growing importance of energy issues and MIT s role as an honest broker, MIT faculty have undertaken a series of in-depth multidisciplinary studies.

More information

IS42S32200C1. 512K Bits x 32 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM

IS42S32200C1. 512K Bits x 32 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM 512K Bits x 32 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM JANUARY 2007 FEATURES Clock frequency: 183, 166, 143 MHz Fully synchronous; all signals referenced to a positive clock edge Internal bank

More information

Chapter 3: Computer Organization Fundamentals. Oregon State University School of Electrical Engineering and Computer Science.

Chapter 3: Computer Organization Fundamentals. Oregon State University School of Electrical Engineering and Computer Science. Chapter 3: Computer Organization Fundamentals Prof. Ben Lee Oregon State University School of Electrical Engineering and Computer Science Chapter Goals Understand the organization of a computer system

More information

Power Integrity Guidelines Samtec MPT/MPS Series Connectors Measurement and Simulation Data

Power Integrity Guidelines Samtec MPT/MPS Series Connectors Measurement and Simulation Data Power Integrity Guidelines Samtec MPT/MPS Series Connectors Measurement and Simulation Data Scott McMorrow, Director of Engineering Page 1 Modeled Section MPS Board MPT Board Power Via Power Via Power

More information

CSCI 510: Computer Architecture Written Assignment 2 Solutions

CSCI 510: Computer Architecture Written Assignment 2 Solutions CSCI 510: Computer Architecture Written Assignment 2 Solutions The following code does compution over two vectors. Consider different execution scenarios and provide the average number of cycles per iterion

More information

Pipelined MIPS Datapath with Control Signals

Pipelined MIPS Datapath with Control Signals uction ess uction Rs [:26] (Opcode[5:]) [5:] ranch luor. Decoder Pipelined MIPS path with Signals luor Raddr at Five instruction sequence to be processed by pipeline: op [:26] rs [25:2] rt [2:6] rd [5:]

More information

Using Advanced Limit Line Features

Using Advanced Limit Line Features Application Note Using Advanced Limit Line Features MS2717B, MS2718B, MS2719B, MS2723B, MS2724B, MS2034A, MS2036A, and MT8222A Economy Microwave Spectrum Analyzer, Spectrum Master, and BTS Master The limit

More information

PV inverters in a High PV Penetration scenario Challenges and opportunities for smart technologies

PV inverters in a High PV Penetration scenario Challenges and opportunities for smart technologies PV inverters in a High PV Penetration scenario Challenges and opportunities for smart technologies Roland Bründlinger Operating Agent IEA-PVPS Task 14 UFTP & IEA-PVPS Workshop, Istanbul, Turkey 16th February

More information

Online Learning and Optimization for Smart Power Grid

Online Learning and Optimization for Smart Power Grid 1 2016 IEEE PES General Meeting Panel on Domain-Specific Big Data Analytics Tools in Power Systems Online Learning and Optimization for Smart Power Grid Seung-Jun Kim Department of Computer Sci. and Electrical

More information

Algebraic Integer Encoding and Applications in Discrete Cosine Transform

Algebraic Integer Encoding and Applications in Discrete Cosine Transform RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR Algebraic Integer Encoding and Applications in Discrete Cosine Transform Minyi Fu Supervisors: Dr. G. A. Jullien Dr. M. Ahmadi Department

More information

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Pipeline Hazards See P&H Chapter 4.7 Hakim Weatherspoon CS 341, Spring 213 Computer Science Cornell niversity Goals for Today Data Hazards Revisit Pipelined Processors Data dependencies Problem, detection,

More information

F²MC-8FX FAMILY MB95330 SERIES DC INVERTER CONTROL F2MC- 8L/8FX SOFTUNE C LIBRARY 120 HALL SENSOR/SENSORLESS 8-BIT MICROCONTROLLER APPLICATION NOTE

F²MC-8FX FAMILY MB95330 SERIES DC INVERTER CONTROL F2MC- 8L/8FX SOFTUNE C LIBRARY 120 HALL SENSOR/SENSORLESS 8-BIT MICROCONTROLLER APPLICATION NOTE Fujitsu Semiconductor (Shanghai) Co., Ltd. Application Note MCU-AN-500067-E-14 F²MC-8FX FAMILY 8-BIT MICROCONTROLLER MB95330 SERIES 120 HALL SENSOR/SENSORLESS DC INVERTER CONTROL F2MC- 8L/8FX SOFTUNE C

More information

CMPEN 411 VLSI Digital Circuits Spring Lecture 24: Peripheral Memory Circuits

CMPEN 411 VLSI Digital Circuits Spring Lecture 24: Peripheral Memory Circuits CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 24: Peripheral Memory Circuits [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp12

More information

Field Programmable Gate Arrays a Case Study

Field Programmable Gate Arrays a Case Study Designing an Application for Field Programmable Gate Arrays a Case Study Bernd Däne www.tu-ilmenau.de/ra Bernd.Daene@tu-ilmenau.de de Technische Universität Ilmenau Topics 1. Introduction and Goals 2.

More information

Collective Traffic Prediction with Partially Observed Traffic History using Location-Based Social Media

Collective Traffic Prediction with Partially Observed Traffic History using Location-Based Social Media Collective Traffic Prediction with Partially Observed Traffic History using Location-Based Social Media Xinyue Liu, Xiangnan Kong, Yanhua Li Worcester Polytechnic Institute February 22, 2017 1 / 34 About

More information

- - DQ0 NC DQ1 DQ0 DQ2 - NC DQ1 DQ3 NC - NC

- - DQ0 NC DQ1 DQ0 DQ2 - NC DQ1 DQ3 NC - NC SYNCHRONOUS DRAM 64Mb: x4, x8, x16 MT48LC16M4A2 4 Meg x 4 x 4 banks MT48LC8M8A2 2 Meg x 8 x 4 banks MT48LC4M16A2 1 Meg x 16 x 4 banks For the latest data sheet, please refer to the Micron Web site: www.micron.com/mti/msp/html/datasheet.html

More information

Electric Power Research Institute, USA 2 ABB, USA

Electric Power Research Institute, USA 2 ABB, USA 21, rue d Artois, F-75008 PARIS CIGRE US National Committee http : //www.cigre.org 2016 Grid of the Future Symposium Congestion Reduction Benefits of New Power Flow Control Technologies used for Electricity

More information

CHECK AND CALIBRATION PROCEDURES FOR FATIGUE TEST BENCHES OF WHEEL

CHECK AND CALIBRATION PROCEDURES FOR FATIGUE TEST BENCHES OF WHEEL STANDARDS October 2017 CHECK AND CALIBRATION PROCEDURES FOR FATIGUE TEST BENCHES OF WHEEL E S 3.29 Page 1/13 PROCÉDURES DE CONTRÔLE ET CALIBRAGE DE FATIGUE BANCS D'ESSAIS DE ROUE PRÜFUNG UND KALIBRIERUNG

More information

M2 Instruction Set Architecture

M2 Instruction Set Architecture M2 Instruction Set Architecture Module Outline Addressing modes. Instruction classes. MIPS-I ISA. High level languages, Assembly languages and object code. Translating and starting a program. Subroutine

More information

Layout Design and Implementation of Adiabatic based Low Power CPAL Ripple Carry Adder

Layout Design and Implementation of Adiabatic based Low Power CPAL Ripple Carry Adder Layout Design and Implementation of Adiabatic based Low Power CPAL Ripple Carry Adder Ms. Bhumika Narang TCE Department CMR Institute of Technology, Bangalore er.bhumika23@gmail.com Abstract this paper

More information

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017 ECE 550D Fundamentals of Computer Systems and Engineering Fall 2017 Digital Arithmetic Prof. John Board Duke University Slides are derived from work by Profs. Tyler Bletch and Andrew Hilton (Duke) Last

More information

INTERCONNECTION POSSIBILITIES FOR THE WORKING VOLUMES OF THE ALTERNATING HYDRAULIC MOTORS

INTERCONNECTION POSSIBILITIES FOR THE WORKING VOLUMES OF THE ALTERNATING HYDRAULIC MOTORS Scientific Bulletin of the Politehnica University of Timisoara Transactions on Mechanics Special issue The 6 th International Conference on Hydraulic Machinery and Hydrodynamics Timisoara, Romania, October

More information

ARC-H: Adaptive replacement cache management for heterogeneous storage devices

ARC-H: Adaptive replacement cache management for heterogeneous storage devices Journal of Systems Architecture 58 (2012) ARC-H: Adaptive replacement cache management for heterogeneous storage devices Young-Jin Kim, Division of Electrical and Computer Engineering, Ajou University,

More information

DS1643/DS1643P Nonvolatile Timekeeping RAM

DS1643/DS1643P Nonvolatile Timekeeping RAM Nonvolatile Timekeeping RAM www.dalsemi.com FEATURES Integrated NV SRAM, real time clock, crystal, power-fail control circuit and lithium energy source Clock registers are accessed identically to the static

More information

TS1SSG S (TS16MSS64V6G)

TS1SSG S (TS16MSS64V6G) Description The TS1SSG10005-7S (TS16MSS64V6G) is a 16M bit x 64 Synchronous Dynamic RAM high-density memory module. The TS1SSG10005-7S (TS16MSS64V6G) consists of 4 piece of CMOS 16Mx16bits Synchronous

More information

Exploiting Clock Skew Scheduling for FPGA

Exploiting Clock Skew Scheduling for FPGA Exploiting Clock Skew Scheduling for FPGA Sungmin Bae, Prasanth Mangalagiri, N. Vijaykrishnan Email {sbae, mangalag, vijay}@cse.psu.edu CSE Department, Pennsylvania State University, University Park, PA

More information

mith College Computer Science CSC231 Assembly Fall 2017 Week #4 Dominique Thiébaut

mith College Computer Science CSC231 Assembly Fall 2017 Week #4 Dominique Thiébaut mith College Computer Science CSC231 Assembly Fall 2017 Week #4 Dominique Thiébaut dthiebaut@smith.edu How are Integers Stored in Memory? 120 11F 11E 11D 11C 11B 11A 119 118 117 116 115 114 113 112 111

More information

Fault-tolerant Control System for EMB Equipped In-wheel Motor Vehicle

Fault-tolerant Control System for EMB Equipped In-wheel Motor Vehicle EVS8 KINTEX, Korea, May 3-6, 15 Fault-tolerant Control System for EMB Equipped In-wheel Motor Vehicle Seungki Kim 1, Kyungsik Shin 1, Kunsoo Huh 1 Department of Automotive Engineering, Hanyang University,

More information

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Pipeline Hazards See P&H Chapter 4.7 Hakim Weatherspoon CS 341, Spring 213 Computer Science Cornell niversity Goals for Today Data Hazards Revisit Pipelined Processors Data dependencies Problem, detection,

More information

Test Infrastructure Design for Core-Based System-on-Chip Under Cycle-Accurate Thermal Constraints

Test Infrastructure Design for Core-Based System-on-Chip Under Cycle-Accurate Thermal Constraints Test Infrastructure Design for Core-Based System-on-Chip Under Cycle-Accurate Thermal Constraints Thomas Edison Yu, Tomokazu Yoneda, Krishnendu Chakrabarty and Hideo Fujiwara Nara Institute of Science

More information

Steady-State Power System Security Analysis with PowerWorld Simulator

Steady-State Power System Security Analysis with PowerWorld Simulator Steady-State Power System Security Analysis with PowerWorld Simulator using PowerWorld Simulator 2001 South First Street Champaign, Illinois 61820 +1 (217) 384.6330 support@powerworld.com http://www.powerworld.com

More information

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 02

More information

M464S1724CT1 SDRAM SODIMM 16Mx64 SDRAM SODIMM based on 8Mx16,4Banks,4K Refresh,3.3V Synchronous DRAMs with SPD. Pin. Pin. Back. Front DQ53 DQ54 DQ55

M464S1724CT1 SDRAM SODIMM 16Mx64 SDRAM SODIMM based on 8Mx16,4Banks,4K Refresh,3.3V Synchronous DRAMs with SPD. Pin. Pin. Back. Front DQ53 DQ54 DQ55 M464S1724CT1 SDRAM SODIMM 16Mx64 SDRAM SODIMM based on 8Mx16,4Banks,4K Refresh,3.3V Synchronous DRAMs with SPD GENERAL DESCRIPTION The Samsung M464S1724CT1 is a 16M bit x 64 Synchronous Dynamic RAM high

More information

CMU Introduction to Computer Architecture, Spring 2013 HW 3 Solutions: Microprogramming Wrap-up and Pipelining

CMU Introduction to Computer Architecture, Spring 2013 HW 3 Solutions: Microprogramming Wrap-up and Pipelining CMU 18-447 Introduction to Computer Architecture, Spring 2013 HW 3 Solutions: Microprogramming Wrap-up and Pipelining Instructor: Prof. Onur Mutlu TAs: Justin Meza, Yoongu Kim, Jason Lin 1 Adding the REP

More information

IS42S Meg Bits x 16 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM FEATURES OVERVIEW. PIN CONFIGURATIONS 54-Pin TSOP (Type II)

IS42S Meg Bits x 16 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM FEATURES OVERVIEW. PIN CONFIGURATIONS 54-Pin TSOP (Type II) 1 Meg Bits x 16 Bits x 4 Banks (64-MBIT) SYNCHRONOUS DYNAMIC RAM JANUARY 2008 FEATURES Clock frequency: 166, 143 MHz Fully synchronous; all signals referenced to a positive clock edge Internal bank for

More information

Model-Based Design and Hardware-in-the-Loop Simulation for Clean Vehicles Bo Chen, Ph.D.

Model-Based Design and Hardware-in-the-Loop Simulation for Clean Vehicles Bo Chen, Ph.D. Model-Based Design and Hardware-in-the-Loop Simulation for Clean Vehicles Bo Chen, Ph.D. Dave House Associate Professor of Mechanical Engineering and Electrical Engineering Department of Mechanical Engineering

More information

Newly Developed High Power 2-in-1 IGBT Module

Newly Developed High Power 2-in-1 IGBT Module Newly Developed High Power 2-in-1 IGBT Module Takuya Yamamoto Shinichi Yoshiwatari ABSTRACT Aiming for applications to new energy sectors, such as wind power and solar power generation, which are continuing

More information

Intelligent Energy Management System Simulator for PHEVs at a Municipal Parking Deck in a Smart Grid Environment

Intelligent Energy Management System Simulator for PHEVs at a Municipal Parking Deck in a Smart Grid Environment Intelligent Energy Management System Simulator for PHEVs at a Municipal Parking Deck in a Smart Grid Environment Preetika Kulshrestha, Student Member, IEEE, Lei Wang, Student Member, IEEE, Mo-Yuen Chow,

More information

Techniques, October , Boston, USA. Personal use of this material is permitted. However, permission to

Techniques, October , Boston, USA. Personal use of this material is permitted. However, permission to Copyright 1996 IEEE. Published in the Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques, October 21-23 1996, Boston, USA. Personal use of this material is permitted.

More information

100GE PCS Modeling. Oded Trainin, Hadas Yeger, Mark Gustlin. IEEE HSSG September 2007

100GE PCS Modeling. Oded Trainin, Hadas Yeger, Mark Gustlin. IEEE HSSG September 2007 100GE PCS Modeling Oded Trainin, Hadas Yeger, Mark Gustlin IEEE HSSG September 2007 How Random is the PCS Data? The Proposed 100G PCS has the concept of virtual lanes A 100G stream is scrambled and then

More information

Biologically-inspired reactive collision avoidance

Biologically-inspired reactive collision avoidance Biologically-inspired reactive collision avoidance S. D. Ross 1,2, J. E. Marsden 2, S. C. Shadden 2 and V. Sarohia 3 1 Aerospace and Mechanical Engineering, University of Southern California, RRB 217,

More information

PERMAS Users' Conference on April 12-13, 2018, Stuttgart

PERMAS Users' Conference on April 12-13, 2018, Stuttgart Topology optimization to maximize the dynamic input stiffness of front axle coach structure N. Kuppuswamy, P. J. Eberle, G. Steinmetz, A. Schünemann, B. Zickler INTES GmbH April 12-13, 2018 PERMAS Users'

More information

FUEL CORRECTIONS: 13 July 2015

FUEL CORRECTIONS: 13 July 2015 Groups/STANDARD MAPPING/FUEL CORRECTIONS Injection Angle Control Method: END_ANGLE Injection Angle Rate of Change (deg/cylinder): 719.75 Base Cal Select Enable: DISABLED (see below) MULTIPLIERS/THROTTLE

More information