Systems I. Pipelining I. Topics Pipelining principles Pipeline overheads Pipeline registers and stages



Overview
What's wrong with the sequential (SEQ) Y86?
  It's slow!
  Each piece of hardware is used only a small fraction of the time
  We would like to find a way to get more performance with only a little more hardware
General Principles of Pipelining
  Goal
  Difficulties
Creating a Pipelined Y86 Processor
  Rearranging SEQ
  Inserting pipeline registers
  Problems with data and control hazards

Real-world Pipelines: Car Washes
[Figure: three car-wash layouts labeled Sequential, Parallel, and Pipelined]
Idea
  Divide process into independent stages
  Move objects through stages in sequence
  At any given time, multiple objects are being processed

Laundry Example
Ann, Brian, Cathy, and Dave (A, B, C, D) each have one load of clothes to wash, dry, and fold
  Washer takes 30 minutes
  Dryer takes 30 minutes
  Folder takes 30 minutes
  Stasher takes 30 minutes to put clothes into drawers
Slide courtesy of D. Patterson

Sequential Laundry
[Figure: timeline from 6 PM to 2 AM; tasks A, B, C, D run one after another in task order, each made of four 30-minute steps]
Sequential laundry takes 8 hours for 4 loads
If they learned pipelining, how long would laundry take?
Slide courtesy of D. Patterson

Pipelined Laundry: Start ASAP
[Figure: timeline starting at 6 PM; tasks A, B, C, D overlap in task order, filling seven 30-minute slots]
Pipelined laundry takes 3.5 hours for 4 loads!
Slide courtesy of D. Patterson
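The two laundry totals can be checked with a short calculation. This is a minimal sketch that assumes every stage takes the same time and that a new load starts as soon as the first stage is free; the function names are made up for illustration.

```python
def sequential_minutes(loads, stages, stage_minutes):
    # Each load runs through every stage before the next load starts.
    return loads * stages * stage_minutes


def pipelined_minutes(loads, stages, stage_minutes):
    # The first load needs `stages` slots; each later load finishes one slot after the previous one.
    return (stages + loads - 1) * stage_minutes


# Four loads, four 30-minute stages (wash, dry, fold, stash).
print(sequential_minutes(4, 4, 30) / 60)  # 8.0 hours
print(pipelined_minutes(4, 4, 30) / 60)   # 3.5 hours
```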

Pipelining Lessons
[Figure: pipelined laundry timeline from 6 PM to 9 PM; tasks A, B, C, D overlap in task order across seven 30-minute slots]
  Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload
  Multiple tasks operate simultaneously using different resources
  Potential speedup = number of pipe stages
  Pipeline rate is limited by the slowest pipeline stage
  Unbalanced lengths of pipe stages reduce speedup
  Time to fill the pipeline and time to drain it reduce speedup
  Stall for dependences
Slide courtesy of D. Patterson

Computational Example
[Figure: combinational logic (300 ps) followed by a register (20 ps), driven by a clock. Delay = 320 ps, Throughput = 3.12 GOPS]
System
  Computation requires a total of 300 picoseconds
  Additional 20 picoseconds to save the result in the register
  Must have a clock cycle of at least 320 ps

3-Way Pipelined Version
[Figure: Comb. logic A, B, and C (100 ps each), each followed by a register (20 ps), driven by a common clock. Delay = 360 ps, Throughput = 8.33 GOPS]
System
  Divide combinational logic into 3 blocks of 100 ps each
  Can begin a new operation as soon as the previous one passes through stage A
    Begin a new operation every 120 ps
  Overall latency increases
    360 ps from start to finish

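Pipeline Diagrams
Unpipelined
  [Figure: OP1, OP2, OP3 laid end to end along the time axis]
  Cannot start a new operation until the previous one completes
3-Way Pipelined
  [Figure: OP1, OP2, OP3 each pass through stages A, B, C, with each operation starting one stage later than the previous one]
  Up to 3 operations in process simultaneously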
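A pipeline diagram like the 3-way one can be generated as text. The sketch below assumes one operation enters the pipeline per cycle and never stalls; `pipeline_diagram` is a hypothetical helper, not part of the lecture material.

```python
def pipeline_diagram(num_ops, stages="ABC"):
    """Return one text row per operation; columns are clock cycles."""
    rows = []
    for op in range(num_ops):
        # Operation `op` enters the first stage in cycle `op`; "." marks a cycle with no activity.
        cells = ["."] * op + list(stages) + ["."] * (num_ops - 1 - op)
        rows.append(f"OP{op + 1} " + " ".join(cells))
    return rows


for row in pipeline_diagram(3):
    print(row)
# OP1 A B C . .
# OP2 . A B C .
# OP3 . . A B C
```

Reading down a column shows which stages are busy in that cycle; in the third cycle all three stages are in use at once.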

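Operating a Pipeline
[Figure: the 3-way pipeline (Comb. logic A, B, C at 100 ps each, each followed by a 20 ps register) shown at times 239, 241, 300, and 359 ps, next to a pipeline diagram of OP1, OP2, OP3 passing through stages A, B, C; time axis marked 0, 120, 240, 360, 480, 640]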

Limitations: Nonuniform Delays
[Figure: Comb. logic A (50 ps), B (150 ps), and C (100 ps), each followed by a register (20 ps), with a pipeline diagram of OP1, OP2, OP3. Delay = 510 ps, Throughput = 5.88 GOPS]
  Throughput is limited by the slowest stage
  Other stages sit idle for much of the time
  Challenging to partition the system into balanced stages
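The delay and throughput figures on the last few slides follow from one rule: the clock period is set by the slowest stage plus the register load time. A minimal sketch, assuming a 20 ps register after every stage and one operation issued per cycle (`pipeline_timing` is an illustrative name):

```python
def pipeline_timing(logic_ps, reg_ps=20):
    """Return (cycle time in ps, latency in ps, throughput in GOPS) for a list of stage logic delays."""
    # The clock must be slow enough for the slowest stage plus its register.
    cycle_ps = max(logic_ps) + reg_ps
    # Every operation spends one full cycle in each stage.
    latency_ps = cycle_ps * len(logic_ps)
    # One operation completes per cycle; 1 / ps = 1000 GOPS.
    throughput_gops = 1000.0 / cycle_ps
    return cycle_ps, latency_ps, throughput_gops


for name, stages in [("unpipelined", [300]),
                     ("3-way balanced", [100, 100, 100]),
                     ("3-way nonuniform", [50, 150, 100])]:
    cycle, latency, gops = pipeline_timing(stages)
    print(f"{name}: cycle {cycle} ps, latency {latency} ps, {gops:.3f} GOPS")
```

The nonuniform split has the same 300 ps of logic as the balanced one, but its 150 ps stage stretches the cycle from 120 ps to 170 ps.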

Limitations: Register Overhead
[Figure: six blocks of Comb. logic (50 ps each), each followed by a register (20 ps), driven by a common clock. Delay = 420 ps, Throughput = 14.29 GOPS]
  As we try to deepen the pipeline, the overhead of loading registers becomes more significant
  Percentage of clock cycle spent loading the register:
    1-stage pipeline: 6.25%
    3-stage pipeline: 16.67%
    6-stage pipeline: 28.57%
  High speeds of modern processor designs are obtained through very deep pipelining
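The three percentages can be reproduced directly. This sketch assumes the 300 ps of logic is split evenly across the stages and that each register costs 20 ps; the function name is made up for illustration.

```python
def register_overhead_percent(stages, total_logic_ps=300, reg_ps=20):
    # With a balanced split, each stage holds an equal share of the logic delay.
    stage_logic_ps = total_logic_ps / stages
    # Share of each clock cycle spent loading the register rather than computing.
    return 100.0 * reg_ps / (stage_logic_ps + reg_ps)


for n in (1, 3, 6):
    print(f"{n}-stage pipeline: {register_overhead_percent(n):.2f}%")
```

The register cost per cycle is fixed, so every further split of the logic makes it a larger share of the clock period.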

Revisiting the Performance Equation
$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$
Instruction Count
  No change
Clock Cycle Time
  Improves by a factor of almost N for an N-deep pipeline
  Not quite a factor of N due to pipeline overheads
Cycles Per Instruction
  In an ideal world, CPI would stay the same
  An individual instruction takes N cycles
  But we have N instructions in flight at a time
  So average $\text{CPI}_{\text{pipe}} = \text{CPI}_{\text{no\_pipe}} \times N / N$
Thus performance can improve by up to a factor of N
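Plugging the earlier 300 ps / 20 ps example into the performance equation shows why the gain is "almost" N. The sketch below assumes an ideal CPI of 1, a balanced split of the logic, and no stalls or fill and drain time; the function names and the instruction count are illustrative assumptions.

```python
def cpu_time_ps(instructions, cpi, cycle_ps):
    # CPU time = instructions x cycles per instruction x time per cycle.
    return instructions * cpi * cycle_ps


def speedup(n_stages, total_logic_ps=300, reg_ps=20, instructions=1_000_000):
    # Unpipelined baseline: one long cycle per instruction.
    base = cpu_time_ps(instructions, 1.0, total_logic_ps + reg_ps)
    # Pipelined: same instruction count and CPI, shorter cycle (logic share plus register load).
    piped = cpu_time_ps(instructions, 1.0, total_logic_ps / n_stages + reg_ps)
    return base / piped


for n in (3, 6):
    print(f"{n} stages: speedup {speedup(n):.2f} (ideal {n})")
```

Under these assumptions the 3-stage version gains about 2.67x and the 6-stage version about 4.57x; the shortfall from N is the register overhead.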

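Data Dependencies
[Figure: combinational logic whose result is stored in a clocked register and fed back to its input; OP1, OP2, OP3 execute one after another along the time axis]
System
  Each operation depends on the result from the preceding one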

Data Hazards
[Figure: 3-way pipeline (Comb. logic A, B, C) with a feedback path, and a pipeline diagram in which OP1, OP2, OP3, OP4 overlap in stages A, B, C]
  Result does not feed back around in time for the next operation
  Pipelining has changed the behavior of the system

Data Dependencies in Processors
  irmovl $50, %eax
  addl %eax, %ebx
  mrmovl 100(%ebx), %edx
Result from one instruction is used as an operand for another
  Read-after-write (RAW) dependency
Very common in actual programs
Must make sure our pipeline handles these properly
  Get correct results
  Minimize performance impact
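The RAW dependencies in the three-instruction example can be found mechanically by tracking the most recent writer of each register. This is a minimal sketch: the tuple encoding of each instruction (text, registers read, register written) is an assumption made for illustration, and condition codes are ignored.

```python
# (instruction text, registers read, register written); hand-encoded for this example.
program = [
    ("irmovl $50, %eax",       set(),            "%eax"),
    ("addl %eax, %ebx",        {"%eax", "%ebx"}, "%ebx"),
    ("mrmovl 100(%ebx), %edx", {"%ebx"},         "%edx"),
]


def raw_dependencies(program):
    """Return (producer index, consumer index, register) for each read-after-write dependency."""
    deps = []
    last_writer = {}  # register name -> index of the most recent instruction that wrote it
    for i, (_text, reads, writes) in enumerate(program):
        for reg in sorted(reads):
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        if writes is not None:
            last_writer[writes] = i
    return deps


for producer, consumer, reg in raw_dependencies(program):
    print(f"instruction {consumer + 1} reads {reg} written by instruction {producer + 1}")
```

Here each instruction depends on the one immediately before it, which is the case where a pipeline has the least time to deliver the result.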

SEQ Hardware
[Figure: SEQ hardware, bottom to top: PC; Fetch (Instruction memory, PC increment; icode, ifun, rA, rB, valC, valP); Decode (Register file with ports A, B, M, E; srcA, srcB, dstE, dstM; valA, valB); Execute (ALU with inputs ALU A, ALU B, ALU fun.; CC, Bch, valE); Memory (Mem. control with read/write, Data memory, Addr, Data, data out, valM); Write back; New PC (newPC)]
Stages occur in sequence
One operation in process at a time
One stage for each logical pipeline operation
  Fetch (get next instruction from memory)
  Decode (figure out what the instruction does and get values from the regfile)
  Execute (compute)
  Memory (access data memory if necessary)
  Write back (write any instruction result to the regfile)

SEQ+ Hardware
[Figure: SEQ+ hardware, bottom to top: PC stage (pIcode, pBch, pValM, pValC, pValP); Fetch (Instruction memory, PC increment; icode, ifun, rA, rB, valC, valP); Decode (Register file with ports A, B, M, E; srcA, srcB, dstE, dstM; valA, valB); Execute (ALU with inputs ALU A, ALU B, ALU fun.; CC, Bch, valE); Memory (Mem. control with read/write, Data memory, Addr, Data, data out, valM); Write back]
Still a sequential implementation
  Reorder the PC stage to put it at the beginning
PC Stage
  Task is to select the PC for the current instruction
  Based on results computed by the previous instruction
Processor State
  PC is no longer stored in a register
  But the PC can be determined from other stored information

Adding Pipeline Registers
[Figure: two block diagrams side by side. Left: SEQ+ stages PC, Fetch, Decode, Execute, Memory, Write back with signals icode, ifun, rA, rB, valC, valP, srcA, srcB, dstA, dstB, valA, valB, aluA, aluB, valE, Bch, CC, Addr, Data, valM, and pState. Right: the same stages with pipeline registers F, D, E, M, W inserted between them; labeled signals include predPC, f_PC, d_srcA, d_srcB, M_icode, M_Bch, M_valA, W_icode, W_valE, W_valM, W_dstE, W_dstM]

Pipeline Stages
Fetch
  Select current PC
  Read instruction
  Compute incremented PC
Decode
  Read program registers
Execute
  Operate ALU
Memory
  Read or write data memory
Write Back
  Update register file
[Figure: pipelined hardware with stages Fetch, Decode, Execute, Memory, Write back separated by pipeline registers F, D, E, M, W; labeled signals include predPC, f_PC, icode, ifun, rA, rB, valC, valP, d_srcA, d_srcB, valA, valB, aluA, aluB, valE, Bch, CC, Addr, Data, valM, M_icode, M_Bch, M_valA, W_icode, W_valE, W_valM, W_dstE, W_dstM]

Summary
Today
  Pipelining principles (assembly line)
  Overheads due to imperfect pipelining
  Breaking instruction execution into a sequence of stages
Next Time
  Pipelining hardware: registers and feedback paths
  Difficulties with pipelines: hazards
  Methods of mitigating hazards