Decoupling Loads for Nano-Instruction Set Computers
1 Decoupling Loads for Nano-Instruction Set Computers
  Ziqiang (Patrick) Huang, Andrew Hilton, Benjamin Lee
  Duke University
  {ziqiang.huang, andrew.hilton,
  ISCA-43, June 21,
2 In-order vs Out-of-order
                IO     OoO
  Performance   Low    High
  Power         Low    High
  Area          Low    High
  Complexity    Low    High
3 What makes OoO perform better?
  Scheduling: hardware support for better schedules (ROB, register renamer, Ld/St queue)
  Dynamism: react to dynamic events (cache misses, branch mispredictions)
  Share of the OoO advantage: Scheduling 88%, Dynamism 12% [McFarlin13]
4 ISA Evolution: CISC -> RISC
  CISC ISAs are Great!
  Example instructions: poly, insque, remque
5 ISA Evolution: CISC -> RISC
  High RISC, High Reward!
  Example instructions: ld, add, bne
6 ISA Evolution: CISC -> RISC -> NISC
  High RISC, High Reward!
  bne: specify target, compute condition, transfer control
    Prior work: Prepare-to-Branch, Branch Delay Slots, Branch Vanguard
  ld: compute address, access memory, write register
    Decoupled Load (Our Work)
7 Outline
  Motivation, Decoupled Loads, Microarchitecture, Compiler, Evaluation, Conclusion
8 Decoupled Load in a Nutshell
  Conventional load: ld rd, I(rA)
    Compute address, access memory, write register, order
  Decoupled load:
    ld.d$ lta, I(rA)   compute address, access memory (lta is a load tag); scheduled early
    ld.wb              write register, order; stays at the original position
10 Key Rule
  A decoupled load behaves just as if the load happens at ld.wb.
  Stores, exceptions, coherence: just follow this rule!
11 Compiler Challenge: May-Alias Stores
    A: ld r1, 0(r2)
    B: mul r3, r1, r4
    C: st r3, 0(r5)
    D: ld r6, 4(r2)
    E: mul r3, r6, r4
  What if r5 = r2 + 4?
  Compiler can't schedule loads above may-alias stores.
  OoO addresses the problem with a load/store queue.
13 Decoupled Load to the Rescue
  Before:
    A: ld r1, 0(r2)
    B: mul r3, r1, r4
    C: st r3, 0(r5)
    D: ld r6, 4(r2)
    E: mul r3, r6, r4
  After:
    A: ld r1, 0(r2)
    D.0: ld.d$ lt0, 4(r2)
    B: mul r3, r1, r4
    C: st r3, 0(r5)
    D.1: ld.wb r6, lt0
    E: mul r3, r6, r4
  Alias? Not a problem. Remember our rule?
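The aliasing argument on this slide can be checked with a small functional model. The sketch below is only an illustration, not the authors' implementation: the dict-based memory, the register file, and the function names are all assumptions made here. It runs the original order, a naive schedule that hoists the plain load D above the store C, and the decoupled schedule (D.0 early, D.1 in place), where the store updates the tagged value when the addresses match.

```python
# Illustrative model only: memory maps byte address -> value,
# registers map name -> value. All names here are hypothetical.

def run_original(mem, regs):
    """Program order A, B, C, D, E."""
    mem, r = dict(mem), dict(regs)
    r["r1"] = mem[r["r2"] + 0]        # A: ld  r1, 0(r2)
    r["r3"] = r["r1"] * r["r4"]       # B: mul r3, r1, r4
    mem[r["r5"] + 0] = r["r3"]        # C: st  r3, 0(r5)
    r["r6"] = mem[r["r2"] + 4]        # D: ld  r6, 4(r2)
    r["r3"] = r["r6"] * r["r4"]       # E: mul r3, r6, r4
    return r["r3"]

def run_naive_hoist(mem, regs):
    """Plain load D hoisted above store C: wrong if C aliases D."""
    mem, r = dict(mem), dict(regs)
    r["r1"] = mem[r["r2"] + 0]        # A
    r["r6"] = mem[r["r2"] + 4]        # D (hoisted, reads the old value)
    r["r3"] = r["r1"] * r["r4"]       # B
    mem[r["r5"] + 0] = r["r3"]        # C
    r["r3"] = r["r6"] * r["r4"]       # E
    return r["r3"]

def run_decoupled(mem, regs):
    """D split into D.0 (ld.d$) and D.1 (ld.wb)."""
    mem, r = dict(mem), dict(regs)
    r["r1"] = mem[r["r2"] + 0]        # A
    lt0_addr = r["r2"] + 4            # D.0: ld.d$ lt0, 4(r2)
    lt0_value = mem[lt0_addr]         #      early cache access
    r["r3"] = r["r1"] * r["r4"]       # B
    store_addr = r["r5"] + 0          # C: st r3, 0(r5)
    mem[store_addr] = r["r3"]
    if store_addr == lt0_addr:        # store matches the load tag:
        lt0_value = r["r3"]           # keep the tagged value up to date
    r["r6"] = lt0_value               # D.1: ld.wb r6, lt0
    r["r3"] = r["r6"] * r["r4"]       # E
    return r["r3"]

memory = {100: 3, 104: 5}
aliasing = {"r2": 100, "r4": 2, "r5": 104}      # r5 = r2 + 4
non_aliasing = {"r2": 100, "r4": 2, "r5": 200}
```

With `aliasing`, the original and decoupled versions agree while the naive hoist reads the stale value; with `non_aliasing`, all three agree.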
14 Compiler Challenge: Branches
    A: ld r1, 0(r2)
    B: mul r3, r1, r4
    C: st r3, 0(r5)
    D: add r6, r6, 1
    E: bne r6, r7,..
    F: ld r8, 4(r2)
    G: mul r3, r8, r4
  What if? Exception!
  Compiler can't schedule loads above branches.
  OoO addresses the problem with a reorder buffer.
15 Decoupled Load to the Rescue
  Before:
    A: ld r1, 0(r2)
    B: mul r3, r1, r4
    C: st r3, 0(r5)
    D: add r6, r6, 1
    E: bne r6, r7,..
    F: ld r8, 4(r2)
    G: mul r3, r8, r4
  After:
    A: ld r1, 0(r2)
    F.0: ld.d$ lt0, 4(r2)
    B: mul r3, r1, r4
    C: st r3, 0(r5)
    D: add r6, r6, 1
    E: bne r6, r7,..
    F.1: ld.wb r8, lt0
    G: mul r3, r8, r4
  Remember our rule?
  Exception? Not a problem. Not this way?
16 Wait, Isn't This...?
  Advanced loads in Itanium? No, Itanium is SPECULATIVE.
    Require fix-up code (software recovery is nasty!)
    Require a large number of registers
  Prefetching? No, prefetching is ORTHOGONAL.
    Prefetching only helps when data is not in L1
18 Decoupled Loads: Working Example
  Pipeline latches: F/D, D/E, E/M1, M1/M2, M2/M3, M3/W
  Structures: I$, RF, ALU, DTLB, SB, D$
  Instruction sequence:
    ld.d$ lt0, 4(r2)
    st r3, 4(r4)
    ld.wb r1, lt0
  Load Tag Table columns: LT#, Status, Exn, VA, PA, Value
  Slide 18: entry 0: Empty.
  Slide 19: ld.d$ in flight with 3FCA; entry 0: Empty.
  Slide 20: ld.d$ in flight with 8BCA; entry 0: Pending, 3FCA.
  Slide 21: ld.d$ in flight with 1; "Exception?"; entry 0: Pending, 1, 3FCA, 8BCA.
  Slide 22: ld.d$ in flight; entry 0: Complete, 3FCA, 8BCA, 1.
  Slide 23: st in flight with 8BCA; entry 0: Complete, 3FCA, 8BCA, 1. Match!
  Slide 24: st in flight with 2; entry 0: Complete, 3FCA, 8BCA, 1. Match!
  Slide 25: st in flight; entry 0: Complete, 3FCA, 8BCA, 2.
  Slides 26-27: ld.wb in flight; entry 0: Complete, 3FCA, 8BCA, 2.
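The walkthrough above can be approximated in software. The following is a rough sketch under several assumptions: the class and method names (`LoadTagTable`, `ld_dcache`, `store`, `ld_wb`, `LoadFault`) are invented here, the pipeline timing and the store buffer are not modeled, and address translation is reduced to a dict lookup. It keeps only the behavior the slides show: ld.d$ fills a tag entry and records (rather than raises) a fault, a store whose physical address matches updates the entry's value, and ld.wb delivers the value or the deferred exception.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class LoadFault(Exception):
    """Hypothetical exception type for a fault deferred until ld.wb."""


@dataclass
class LoadTagEntry:
    status: str = "Empty"          # Empty -> Pending -> Complete
    exn: bool = False              # fault recorded by ld.d$, raised at ld.wb
    va: Optional[int] = None
    pa: Optional[int] = None
    value: Optional[int] = None


class LoadTagTable:
    def __init__(self, num_tags: int, memory: Dict[int, int],
                 page_map: Dict[int, int]) -> None:
        self.entries = [LoadTagEntry() for _ in range(num_tags)]
        self.memory = memory       # physical address -> value
        self.page_map = page_map   # VA -> PA; a missing VA models a fault

    def ld_dcache(self, tag: int, va: int) -> None:
        """ld.d$: translate, access memory, fill the entry. Never raises."""
        entry = LoadTagEntry(status="Pending", va=va)
        pa = self.page_map.get(va)
        if pa is None:
            entry.exn = True       # remember the fault for ld.wb
        else:
            entry.pa = pa
            entry.value = self.memory.get(pa, 0)
        entry.status = "Complete"
        self.entries[tag] = entry

    def store(self, va: int, value: int) -> None:
        """Store: write memory and update any entry with a matching PA."""
        pa = self.page_map[va]
        self.memory[pa] = value
        for entry in self.entries:
            if entry.status == "Complete" and entry.pa == pa:
                entry.value = value

    def ld_wb(self, tag: int) -> int:
        """ld.wb: free the entry, then deliver the value or the fault."""
        entry = self.entries[tag]
        self.entries[tag] = LoadTagEntry()
        if entry.exn:
            raise LoadFault(f"deferred fault for VA {entry.va:#x}")
        return entry.value


# Values taken from the slides: VA 3FCA maps to PA 8BCA, which holds 1.
ltt = LoadTagTable(num_tags=2, memory={0x8BCA: 1}, page_map={0x3FCA: 0x8BCA})
ltt.ld_dcache(0, 0x3FCA)           # ld.d$ lt0, 4(r2)
value_before_store = ltt.entries[0].value
ltt.store(0x3FCA, 2)               # st r3, 4(r4), assumed to alias
result = ltt.ld_wb(0)              # ld.wb r1, lt0
```

Here `result` is the stored value 2 rather than the stale 1, which is what the key rule requires: the load behaves as if it happened at ld.wb.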
29 List Scheduling
    ld r1, 0(r2)
    mul r3, r1, r4
    st r3, 0(r5)
    ld r6, 4(r2)
    mul r3, r6, r4
  Dependence chain (edge latency in cycles): ld -4-> mul -4-> st -1-> ld -4-> mul
30 List Scheduling
  Cycle 0: ld
  (bubble cycles)
  Cycle 4: mul
  (bubble cycles)
  Cycle 8: st
  Cycle 9: ld
  Cycle 13: mul
31 List Scheduling with Decoupled Loads
    ld r1, 0(r2)
    mul r3, r1, r4
    st r3, 0(r5)
    ld r6, 4(r2)
    mul r3, r6, r4
  [Dependence graph: the second ld is split into ld.d$ and ld.wb nodes; edge latencies are not fully recoverable]
32 List Scheduling with Decoupled Loads
  Cycle 0: ld
  Cycle 1: ld.d$
  Cycle 4: mul
  Cycle 8: st
  Cycle 9: ld.wb
  Cycle 10: mul
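The two schedules above can be reproduced with a small list scheduler. This is a minimal sketch, not the LLVM-based scheduler used in the evaluation: it assumes a single-issue machine, critical-path priority, and a dependence graph given as latency-labeled edges. The 4-cycle ld and mul latencies and the 1-cycle st-to-load ordering edge come from the slides; the 3-cycle ld.d$ to ld.wb edge and the 1-cycle ld.wb to mul edge are assumptions chosen to be consistent with the cycle numbers shown.

```python
def list_schedule(nodes, edges):
    """Single-issue list scheduling.

    nodes: instruction names in program order (must be a topological order).
    edges: {(producer, consumer): latency_in_cycles}.
    Returns {name: issue_cycle}.
    """
    succs = {n: [] for n in nodes}
    preds = {n: [] for n in nodes}
    for (src, dst), latency in edges.items():
        succs[src].append((dst, latency))
        preds[dst].append((src, latency))

    # Priority: longest latency-weighted path from the node to any sink.
    priority = {}
    for n in reversed(nodes):
        priority[n] = max((lat + priority[d] for d, lat in succs[n]), default=0)

    issue = {}
    cycle = 0
    while len(issue) < len(nodes):
        ready = [
            n for n in nodes
            if n not in issue
            and all(p in issue and issue[p] + lat <= cycle for p, lat in preds[n])
        ]
        if ready:
            # Issue the ready instruction on the longest remaining path.
            issue[max(ready, key=lambda n: priority[n])] = cycle
        cycle += 1
    return issue


# Conventional code: the second load (D) must stay below the store (C).
baseline = list_schedule(
    ["A", "B", "C", "D", "E"],
    {("A", "B"): 4, ("B", "C"): 4, ("C", "D"): 1, ("D", "E"): 4},
)

# Decoupled code: D.0 (ld.d$) is free to move up; only D.1 (ld.wb)
# stays ordered after the store.
decoupled = list_schedule(
    ["A", "B", "C", "D.0", "D.1", "E"],
    {("A", "B"): 4, ("B", "C"): 4, ("C", "D.1"): 1,
     ("D.0", "D.1"): 3, ("D.1", "E"): 1},
)
```

Under these assumptions the final mul issues at cycle 13 in `baseline` and at cycle 10 in `decoupled`, because the cache access latency of the second load overlaps with the first ld/mul/st chain.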
33 Hoisting Above Branches
  [Control-flow graph figure: blocks A, B, C with edge probabilities 100%, 99%, and 1%. Instructions shown: ld r4, 0(r7); ld.d$ lt0, 4(r6); blt r3, r5, B; ld.wb r5, lt0. The ld.d$ is shown in more than one position across the variants. Labels: Correctness, Performance.]
36 Evaluation: Methodology
  ISA: OpenRISC
  Compiler: LLVM 3.5
  Benchmarks: SPEC2006, 8 INT
  Cycle-level simulation
37 Results: Speedup & Instruction Mix
  [Chart: speedup (%) and instruction mix (%); load categories: Decoupled, Conventional, Redundant]
  Average speedup is 8.4%.
  Speedup correlates with decoupled loads.
38 Hoisting above Stores versus Branches
  [Chart: speedup (%); series: Stores, Branches]
  Scheduler must handle both stores and branches.
39 Are We Simply Prefetching?
  [Chart: speedup (%); series: No prefetching + Decoupled Loads, Perfect L1 + Decoupled Loads]
  Majority of benefits come from hiding hit latency.
42 Towards Promised OoO Performance
  Decoupled loads separate access from writeback and ordering
    Improve static scheduling for IO performance
    Bring IO a step closer to OoO
    Require modest microarchitectural and system support
  Related directions: scheduling algorithms, decoupled loads, iCFP [Hilton09], Branch Vanguard [McFarlin15]
43 Thank you! Questions?
More informationEXTENDING PRT CAPABILITIES
EXTENDING PRT CAPABILITIES Prof. Ingmar J. Andreasson* * Director, KTH Centre for Traffic Research and LogistikCentrum AB. Teknikringen 72, SE-100 44 Stockholm Sweden, Ph +46 705 877724; ingmar@logistikcentrum.se
More informationOperations Research & Advanced Analytics 2015 INFORMS Conference on Business Analytics & Operations Research
Simulation Approach for Aircraft Spare Engines & Engine Parts Planning Operations Research & Advanced Analytics 2015 INFORMS Conference on Business Analytics & Operations Research 1 Outline Background
More informationECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017
ECE 550D Fundamentals of Computer Systems and Engineering Fall 2017 Digital Arithmetic Prof. John Board Duke University Slides are derived from work by Profs. Tyler Bletch and Andrew Hilton (Duke) Last
More informationHawai'i Island Planning and Operations MEASURES TO IMPROVE RELIABILITY WITH HIGH DER
1 Hawai'i Island Planning and Operations MEASURES TO IMPROVE RELIABILITY WITH HIGH DER Lisa Dangelmaier Hawaii Electric Light lisa.dangelmaier@hawaiielectriclight.com Hawai'i Electric Light System Overview
More informationAPR Performance APR004 Wing Profile CFD Analysis NOTES AND IMAGES
APR Performance APR004 Wing Profile CFD Analysis NOTES AND IMAGES Andrew Brilliant FXMD Aerodynamics Japan Office Document number: JP. AMB.11.6.17.002 Last revision: JP. AMB.11.6.24.003 Purpose This document
More informationHow Much Power Does your Server Consume? Estimating Wall Socket Power Using RAPL Measurements
How Much Power Does your Server Consume? Estimating Wall Socket Power Using RAPL Measurements Kashif Nizam Khan Zhonghong Ou, Mikael Hirki, Jukka K. Nurminen, Tapio Niemi 1 Motivation The Large Hadron
More informationDesign and evaluate vehicle architectures to reach the best trade-off between performance, range and comfort. Unrestricted.
Design and evaluate vehicle architectures to reach the best trade-off between performance, range and comfort. Unrestricted. Introduction Presenter Thomas Desbarats Business Development Simcenter System
More information(FPGA) based design for minimizing petrol spill from the pipe lines during sabotage
IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 05, Issue 01 (January. 2015), V3 PP 26-30 www.iosrjen.org (FPGA) based design for minimizing petrol spill from the pipe
More informationTHERMAL MANAGEMENT SYNERGY THROUGH INTEGRATION PETE BRAZAS
THERMAL MANAGEMENT SYNERGY THROUGH INTEGRATION PETE BRAZAS 1 Propulsion System Trends Evolution of the TMM A Closer Look at Electrification System Integration Approach Outlook Powertrain Technology Roadmap
More informationJon Konings Former CEM Coordinator
Jon Konings Former CEM Coordinator Not covering every detail of these QA topics. There is such a wide variation in the configuration of hardware out there, and I can t cover everything, so I will address
More informationAnnouncements. Programming assignment #2 due Monday 9/24. Talk: Architectural Acceleration of Real Time Physics Glenn Reinman, UCLA CS
Lipasti, artin, Roth, Shen, Smith, Sohi, Tyson, Vijaykumar GAS STATION Pipelining II Fall 2007 Prof. Thomas Wenisch http://www.eecs.umich.edu/courses/eecs470 Slides developed in part by Profs. Austin,
More information