Chapter 2 ( ) - Revisit Reorder Buffer - Exception handling and speculation in hardware - VLIW and EPIC - Multiple-issue processors (parallelism in HW)


1 Computer Architecture: A Quantitative Approach, Fifth Edition
Chapter 2 ( )
- Revisit Reorder Buffer
- Exception handling and (speculation in hardware)
- VLIW and EPIC (speculation in SW, parallelism in SW)
- Multiple-issue processors (parallelism in HW)
Copyright 2012, Elsevier Inc. All rights reserved.

2 Review
- What are the three types of hazards?
- Which hazards does register renaming remove?
- What are the advantages of hardware-based scheduling (OoO)? Disadvantages?
- What are the advantages of SW-based scheduling? Disadvantages?

3 Multiple Issue and Static Scheduling
To achieve CPI < 1, need to complete multiple instructions per clock.
Solutions:
- Statically scheduled superscalar processors
- VLIW (very long instruction word) processors (done in SW)
- Dynamically scheduled superscalar processors (done in HW)

4 Speculation
- Control speculation: move instructions across a branch boundary
- Data speculation: execute loads/stores OoO

5 Speculation
How to design an OoO processor that:
- Uses register renaming to remove WAW and WAR dependencies
- Can handle instruction exceptions
- Can execute across branch boundaries
- Can reorder load/store instructions

6 L11-6 Dataflow Execution
[Figure: reorder buffer with entries t1 ... tn; fields per entry: Ins#, use, exec, op, p1, src1, p2, src2; ptr2 = next to deallocate, ptr1 = next available]
An instruction slot is a candidate for execution when:
- It holds a valid instruction ("use" bit is set)
- It has not already started execution ("exec" bit is clear)
- Both operands are available (p1 and p2 are set)
October 19, 2011
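The three conditions on this slide can be read as a single predicate over the per-slot bits. The following is a minimal sketch, not the lecture's implementation; the class name `Slot`, the field names, and the four example slots are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    # Field names are illustrative; they mirror the "use", "exec", p1, p2 bits.
    use_bit: bool = False   # slot holds a valid instruction
    exec_bit: bool = False  # instruction has already started execution
    p1: bool = False        # first operand is available
    p2: bool = False        # second operand is available


def is_candidate(slot: Slot) -> bool:
    """True when the slot satisfies all three conditions listed on the slide."""
    return slot.use_bit and not slot.exec_bit and slot.p1 and slot.p2


# Hypothetical buffer contents, chosen to exercise each condition.
slots = [
    Slot(use_bit=True, exec_bit=False, p1=True, p2=True),   # ready to issue
    Slot(use_bit=True, exec_bit=True, p1=True, p2=True),    # already executing
    Slot(use_bit=True, exec_bit=False, p1=True, p2=False),  # waiting on operand 2
    Slot(),                                                 # empty slot
]
ready = [i for i, s in enumerate(slots) if is_candidate(s)]
```

Only the first slot qualifies here; real issue logic would additionally pick among several ready slots, which this sketch does not model.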

7 L11-7 Renaming & Out-of-order Issue: An Example
[Figure: renaming table (F1 ... F8, each holding data such as v1, v4 or a tag t1 ... t5) and reorder buffer (Ins#, use, exec, op, p1, src1, p2, src2; entries t1 ... t5); cell contents not recoverable from the transcription]
1 LD F2, 34(R2)
2 LD F4, 45(R3)
3 MULTD F6, F4, F2
4 SUBD F8, F2, F2
5 DIVD F4, F2, F8
6 ADDD F10, F6, F4
When are names in sources replaced by data? Whenever an FU produces data.
When can a name be reused? Whenever an instruction completes.

8 L11-8 Data-Driven Execution
[Figure: renaming table & reg file, reorder buffer (Ins#, use, exec, op, p1, src1, p2, src2; entries t1 ... tn), Load Unit, FUs, Store Unit; results broadcast as <t, result>]
- Replacing the tag by its value is an expensive operation
- Instruction template (i.e., tag t) is allocated by the Decode stage, which also stores the tag in the reg file
- When an instruction completes, its tag is deallocated

9 L11-9 Simplifying Allocation/Deallocation
[Figure: reorder buffer with entries t1 ... tn; fields per entry: Ins#, use, exec, op, p1, src1, p2, src2; ptr2 = next to deallocate, ptr1 = next available]
Instruction buffer is managed circularly:
- "exec" bit is set when instruction begins execution
- When an instruction completes its "use" bit is marked free
- ptr2 is incremented only if the "use" bit is marked free
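The circular management described on this slide can be sketched as a small ring buffer. This is an illustrative model under assumptions, not the lecture's design: the class name `CircularBuffer`, the `count` field used to tell a full buffer from an empty one, and the buffer size of 4 are all hypothetical.

```python
class CircularBuffer:
    """Ring of instruction slots with ptr1 (next available) and ptr2 (next to deallocate)."""

    def __init__(self, size):
        self.size = size
        self.use = [False] * size       # "use" bit per slot
        self.exec_bit = [False] * size  # "exec" bit per slot
        self.ptr1 = 0                   # next available slot
        self.ptr2 = 0                   # next slot to deallocate
        self.count = 0                  # slots between ptr2 and ptr1 (assumed helper)

    def allocate(self):
        """Claim the slot at ptr1; return its tag, or None when the buffer is full."""
        if self.count == self.size:
            return None
        tag = self.ptr1
        self.use[tag] = True
        self.exec_bit[tag] = False
        self.ptr1 = (self.ptr1 + 1) % self.size
        self.count += 1
        return tag

    def begin_execution(self, tag):
        self.exec_bit[tag] = True

    def complete(self, tag):
        """Mark the slot free, then advance ptr2 past slots whose use bit is free."""
        self.use[tag] = False
        while self.count > 0 and not self.use[self.ptr2]:
            self.ptr2 = (self.ptr2 + 1) % self.size
            self.count -= 1


buf = CircularBuffer(4)
t0, t1, t2 = buf.allocate(), buf.allocate(), buf.allocate()
buf.complete(t1)                 # out-of-order completion: ptr2 cannot move yet
ptr2_after_t1 = buf.ptr2
buf.complete(t0)                 # oldest slot is free: ptr2 skips t0 and t1
ptr2_after_t0 = buf.ptr2
```

The point of the sketch is the last two steps: a slot that finishes early is marked free immediately, but the space is only reclaimed once every older slot is free as well.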

10 L11-10 Effectiveness?
Renaming and out-of-order execution was first implemented in 1969 in the IBM 360/91 but did not show up in the subsequent models until the mid-Nineties. Why?
1. Effective on a very small class of programs
2. Did not address the memory latency problem, which turned out to be a much bigger issue than FU latency
3. Made exceptions imprecise
One more problem needed to be solved: control transfers.
More on this in the next lecture.

11 L11-11 Precise Exceptions
Exceptions are relatively unlikely events that need special processing, but where adding explicit control flow instructions is not desired, e.g., divide by 0, page fault.
Exceptions can be viewed as an implicit conditional subroutine call that is inserted between two instructions.
Therefore, it must appear as if the exception is taken between two instructions (say I_i and I_i+1):
- the effect of all instructions up to and including I_i is complete
- no effect of any instruction after I_i has taken place
The handler either aborts the program or restarts it at I_i+1.

12 L11-12 Effect on Exceptions: Out-of-order Completion
I1 DIVD f6, f6, f4
I2 LD f2, 45(r3)
I3 MULTD f0, f2, f4
I4 DIVD f8, f6, f2
I5 SUBD f10, f0, f6
I6 ADDD f6, f8, f2
[Figure: out-of-order completion sequence for I1 ... I6; ordering not recoverable from the transcription]
Consider exceptions: restore f2, restore f10.
Precise exceptions are difficult to implement at high speed: we want to start execution of later instructions before exception checks have finished on earlier instructions.

13 L11-13 Exceptions
Exceptions create a dependence on the value of the next PC.
Options for handling this dependence:
- Stall: No
- Bypass: No
- Find something else to do: No
- Change the architecture: Sometimes (Alpha, Multiflow)
- Speculate! Most common approach!
How can we handle rollback on mis-speculation? Delay state update until commit on speculated instructions.
Note: earlier exceptions must override later ones.

14 L11-14 Phases of Instruction Execution
[Figure: PC -> I-cache -> Fetch Buffer -> Issue Buffer -> Func. Units -> Result Buffer -> Arch. State]
- Fetch: Instruction bits retrieved from cache.
- Decode: Instructions placed in appropriate issue (aka "dispatch") stage buffer.
- Execute: Instructions and operands sent to execution units. When execution completes, all results and exception flags are available.
- Commit: Instruction irrevocably updates architectural state (aka "graduation" or "completion").

15 L11-15 Exception Handling (In-Order Five-Stage Pipeline)
[Figure: five-stage pipeline (PC, Inst. Mem, Decode, Execute, Data Mem, Writeback) carrying exception flags and the PC through the D, E, and M stages; exception sources: PC address exceptions, illegal opcode, overflow, data address exceptions, asynchronous interrupts; Cause and EPC registers; Select Handler PC; kill signals for the F, D, and E stages and for writeback; commit point at the M stage]
- Hold exception flags in pipeline until commit point (M stage)
- If exception at commit: update Cause/EPC registers, kill all stages, fetch at handler PC
- Inject external interrupts at commit point

16 L11-16 In-Order Commit for Precise Exceptions
[Figure: in-order Fetch and Decode, out-of-order Execute from the Reorder Buffer, in-order Commit; on an exception, kill the earlier stages and inject the handler PC]
- Instructions fetched and decoded into instruction reorder buffer in-order
- Execution is out-of-order (=> out-of-order completion)
- Commit (write-back to architectural state, i.e., regfile & memory) is in-order
Temporary storage needed to hold results before commit (shadow registers and store buffers).

17 L11-17 Extensions for Precise Exceptions
[Figure: reorder buffer; fields per entry: Inst#, use, exec, op, p1, src1, p2, src2, pd, dest, data, cause; ptr2 = next to commit, ptr1 = next available]
- add <pd, dest, data, cause> fields in the instruction template
- commit instructions to reg file and memory in program order => buffers can be maintained circularly
- on exception, clear reorder buffer by resetting ptr1 = ptr2 (stores must wait for commit before updating memory)
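In-order commit with a flush on exception can be sketched in a few lines. This is a simplified model under stated assumptions: the reorder buffer is a `deque` instead of a pointer-managed ring, entries are plain dictionaries with hypothetical keys (`dest`, `data`, `done`, `cause`), memory and stores are not modeled, and the example values are invented.

```python
from collections import deque


def commit_in_order(rob, regfile):
    """Commit finished instructions from the head of the reorder buffer.

    Stops at the first unfinished instruction. If the head carries an
    exception cause, the whole buffer is cleared (the analogue of resetting
    ptr1 = ptr2) and the cause is returned; younger results never reach regfile.
    """
    while rob and rob[0]["done"]:
        head = rob[0]
        if head["cause"] is not None:
            cause = head["cause"]
            rob.clear()
            return cause
        regfile[head["dest"]] = head["data"]
        rob.popleft()
    return None


# Hypothetical buffer contents: the second instruction raised an exception,
# and the third finished early (out of order) with a result waiting in the buffer.
rob = deque([
    {"dest": "f2", "data": 1.0, "done": True, "cause": None},
    {"dest": "f8", "data": None, "done": True, "cause": "divide by zero"},
    {"dest": "f10", "data": 3.0, "done": True, "cause": None},
])
regfile = {"f2": 0.0, "f8": 0.0, "f10": 0.0}
cause = commit_in_order(rob, regfile)
```

After the call, only f2 has been updated: the exception is taken exactly between the first and second instruction, and the already-computed f10 result is discarded with the rest of the buffer.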

18 L11-18 Renaming Table
[Figure: rename table (entries r1, r2, ... each holding a tag t and a valid bit v), register file, reorder buffer (Ins#, use, exec, op, p1, src1, p2, src2, pd, dest, data; entries t1 ... tn), Load Unit, FUs, Store Unit, Commit; results broadcast as <t, result>]
Renaming table is a cache to speed up register name lookup. It needs to be cleared after each exception taken.
When else are valid bits cleared? Control transfers.

19 L11-19 Physical Register Files
Reorder buffers are space inefficient: a data value may be stored in multiple places in the reorder buffer.
Idea: keep all data values in a physical register file.
- Tag represents the name of the data value and the name of the physical register that holds it
- Reorder buffer contains only tags
Thus, 64-bit data values may be replaced by 8-bit tags for a 256 element physical register file.
More on this in later lectures.

20 L13-20 Recovering ROB/Renaming Table
[Figure: rename table (tag and valid bit per register) with rename snapshots, register file, reorder buffer (Ins#, use, exec, op, p1, src1, p2, src2, pd, dest, data; entries t1 ... tn) with ptr2 = next to commit, a rollback value for next available, and ptr1 = next available; Load Unit, FUs, Store Unit, Commit; results broadcast as <t, result>]
Take snapshot of register rename table at each predicted branch, recover earlier snapshot if branch mispredicted.
October 26, 2011

21 L13-21 Map Table Recovery - Snapshots
Speculative value management of microarchitectural state
[Figure: register map table (R0 -> T20, R1 -> T73 or T08, R2 -> T45, R3 -> T128, ..., R30 -> T54, R31 -> T88) with a valid bit per entry, shown next to two snapshot copies of the map; exact per-copy contents not recoverable from the transcription]
What kind of value management is this? Greedy!!
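Snapshot-based recovery of the map table can be sketched as follows. This is an illustrative model, not the lecture's hardware: the class name `RenameTable`, the branch identifiers `b1` and `b2`, and the tag `T99` are hypothetical, while R0/T20, R1/T73/T08, and R2/T45 are taken from the figure.

```python
class RenameTable:
    """Map table that saves a full copy at every predicted branch."""

    def __init__(self, mapping):
        self.map = dict(mapping)
        self.snapshots = []  # (branch_id, saved map), oldest first

    def rename(self, arch_reg, new_tag):
        self.map[arch_reg] = new_tag

    def take_snapshot(self, branch_id):
        self.snapshots.append((branch_id, dict(self.map)))

    def resolve(self, branch_id, mispredicted):
        idx = next(i for i, (b, _) in enumerate(self.snapshots) if b == branch_id)
        if mispredicted:
            # Restore the saved map and drop this snapshot plus all younger ones.
            self.map = self.snapshots[idx][1]
            del self.snapshots[idx:]
        else:
            # Correct prediction: the snapshot is simply no longer needed.
            del self.snapshots[idx]


table = RenameTable({"R0": "T20", "R1": "T73", "R2": "T45"})
table.take_snapshot("b1")   # first predicted branch
table.rename("R1", "T08")   # speculative rename after b1
table.take_snapshot("b2")   # second, younger predicted branch
table.rename("R2", "T99")   # speculative rename after b2
table.resolve("b1", mispredicted=True)
```

Resolving `b1` as mispredicted restores R1 to T73 and R2 to T45 in one step, and the snapshot for the younger branch `b2` is discarded because that branch was itself on the wrong path.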

22 L13-22 O-o-O Execution with ROB
[Figure: rename table (R1 -> ti, R2 -> tj, ...; tag and valid bit per entry), register file, reorder buffer (Ins#, use, exec, op, p1, src1, p2, src2, pd, dest, data; entries t1 ... tn) with next-to-commit and next-available pointers and example add and ld entries, Load Unit, FUs, Store Unit, Commit; results broadcast as <t, result>]
Basic Operation:
- Enter op and tag or data (if known) for each source
- Replace tag with data as it becomes available
- Issue instruction when all sources are available
- Save dest data when operation finishes
- Commit saved dest data when instruction commits

23 L13-23 Lifetime of Physical Registers
- Physical regfile holds committed and speculative values
- Physical registers decoupled from ROB entries (no data in ROB)
Before rename -> after rename:
a) ld r1, (r3) -> ld P1, (Px)
b) add r3, r1, #4 -> add P2, P1, #4
c) sub r1, r3, r9 -> sub P3, P2, Py
d) add r3, r1, r7 -> add P4, P3, Pz
e) ld r6, (r1) -> ld P5, (P3)
f) add r8, r6, r3 -> add P6, P5, P4
g) st r8, (r1) -> st P6, (P3)
h) ld r3, (r11) -> ld P7, (Pw)
When can we reuse a physical register? When next write of same architectural register commits.

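24 L13-24 Physical Register Management
[Figure: rename table (R0 ... R7), physical register file (P0 ... Pn), free list, and ROB with fields use, ex, op, p1, PR1, p2, PR2, Rd, LPRd, PRd]
Initial state (mapping inferred from the ROB entries on the following slides):
Rename table: R1 -> P8, R3 -> P7, R6 -> P5, R7 -> P6
Physical regs: P5 = <R6>, P6 = <R7>, P7 = <R3>, P8 = <R1>
Free list: P0, P1, P3, P2, P4
Instructions to rename:
ld r1, 0(r3)
add r3, r1, #4
sub r6, r7, r6
add r3, r3, r6
ld r6, 0(r1)
(LPRd requires third read port on Rename Table for each instruction)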

25 L13-25 Physical Register Management
After renaming ld r1, 0(r3): R1 -> P0 (P0 taken from the free list, previous mapping P8 saved as LPRd).
ROB entries (use, ex, op, PR1, PR2, Rd, LPRd, PRd):
  x, -, ld, P7, -, r1, P8, P0

26 L13-26 Physical Register Management
After renaming add r3, r1, #4: R3 -> P1 (previous mapping P7 saved as LPRd).
ROB entries (use, ex, op, PR1, PR2, Rd, LPRd, PRd):
  x, -, ld, P7, -, r1, P8, P0
  x, -, add, P0, -, r3, P7, P1

27 L13-27 Physical Register Management
After renaming sub r6, r7, r6: R6 -> P3 (previous mapping P5 saved as LPRd).
ROB entries (use, ex, op, PR1, PR2, Rd, LPRd, PRd):
  x, -, ld, P7, -, r1, P8, P0
  x, -, add, P0, -, r3, P7, P1
  x, -, sub, P6, P5, r6, P5, P3

28 L13-28 Physical Register Management
After renaming add r3, r3, r6: R3 -> P2 (previous mapping P1 saved as LPRd).
ROB entries (use, ex, op, PR1, PR2, Rd, LPRd, PRd):
  x, -, ld, P7, -, r1, P8, P0
  x, -, add, P0, -, r3, P7, P1
  x, -, sub, P6, P5, r6, P5, P3
  x, -, add, P1, P3, r3, P1, P2

29 L13-29 Physical Register Management
After renaming ld r6, 0(r1): R6 -> P4 (previous mapping P3 saved as LPRd). All five free-list entries are now allocated.
ROB entries (use, ex, op, PR1, PR2, Rd, LPRd, PRd):
  x, -, ld, P7, -, r1, P8, P0
  x, -, add, P0, -, r3, P7, P1
  x, -, sub, P6, P5, r6, P5, P3
  x, -, add, P1, P3, r3, P1, P2
  x, -, ld, P0, -, r6, P3, P4

30 L13-30 Physical Register Management: Execute & Commit
The first ld executes and commits: P0 now holds <R1>, and its LPRd (P8) is returned to the free list.
ROB entries (use, ex, op, PR1, PR2, Rd, LPRd, PRd):
  x, x, ld, P7, -, r1, P8, P0
  x, -, add, P0, -, r3, P7, P1
  x, -, sub, P6, P5, r6, P5, P3
  x, -, add, P1, P3, r3, P1, P2
  x, -, ld, P0, -, r6, P3, P4
Free list: P8 added

31 L13-31 Physical Register Management: Execute & Commit
The first add executes and commits: P1 now holds <R3>, and its LPRd (P7) is returned to the free list.
ROB entries (use, ex, op, PR1, PR2, Rd, LPRd, PRd):
  x, x, ld, P7, -, r1, P8, P0
  x, x, add, P0, -, r3, P7, P1
  x, -, sub, P6, P5, r6, P5, P3
  x, -, add, P1, P3, r3, P1, P2
  x, -, ld, P0, -, r6, P3, P4
Free list: P8, P7 added
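The renaming sequence on the preceding slides can be replayed with a rename table, a free list, and a ROB that records LPRd for each instruction. This is a minimal sketch under assumptions: the function names `rename` and `commit` and the dictionary layout are invented for illustration, execution and the p1/p2 presence bits are not modeled, and the initial mapping is the one implied by the slides' ROB entries.

```python
from collections import deque

# Initial state as implied by the slides.
rename_table = {"r1": "P8", "r3": "P7", "r6": "P5", "r7": "P6"}
free_list = deque(["P0", "P1", "P3", "P2", "P4"])
rob = deque()


def rename(op, rd, srcs):
    """Rename one instruction: translate sources, then allocate a new PRd."""
    prs = [rename_table[s] for s in srcs]  # read sources before overwriting rd
    lprd = rename_table[rd]                # previous mapping of rd (the LPRd field)
    prd = free_list.popleft()              # new physical destination register
    rename_table[rd] = prd
    entry = {"op": op, "PRs": prs, "Rd": rd, "LPRd": lprd, "PRd": prd}
    rob.append(entry)
    return entry


def commit():
    """Commit the oldest instruction; its LPRd can now be reused."""
    entry = rob.popleft()
    free_list.append(entry["LPRd"])
    return entry


renamed = [
    rename("ld", "r1", ["r3"]),         # ld  r1, 0(r3)
    rename("add", "r3", ["r1"]),        # add r3, r1, #4
    rename("sub", "r6", ["r7", "r6"]),  # sub r6, r7, r6
    rename("add", "r3", ["r3", "r6"]),  # add r3, r3, r6
    rename("ld", "r6", ["r1"]),         # ld  r6, 0(r1)
]
commit()  # first ld commits: P8 returns to the free list
commit()  # first add commits: P7 returns to the free list
```

Freeing LPRd at commit, and not earlier, implements the rule stated on the "Lifetime of Physical Registers" slide: a physical register is reusable once the next write of the same architectural register commits.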

32 L13-32 Unified Physical Register File (MIPS R10K, Alpha 21264, Pentium 4)
[Figure: rename table (r1 -> ti, r2 -> tj) with snapshots for mispredict recovery, register file (t1 ... tn), Load Unit, FUs, Store Unit; results broadcast as <t, result>; ROB not shown]
- One regfile for both committed and speculative values (no data in ROB)
- During decode, instruction result allocated new physical register, source regs translated to physical regs through rename table
- Instruction reads data from regfile at start of execute (not in decode)
- Write-back updates reg. busy bits on instructions in ROB (assoc. search)
- Snapshots of rename table taken at every branch to recover mispredicts
- On exception, renaming undone in reverse order of issue (MIPS R10000)

33 L13-33 Speculative & Out-of-Order Execution
[Figure: in-order PC, Fetch, Decode & Rename feeding the Reorder Buffer; out-of-order Execute (Branch Unit, ALU, MEM with Store Buffer and D$) using the Physical Reg. File; in-order Commit; Branch Prediction at fetch; Branch Resolution sends kill signals to the earlier stages and updates the predictors]

34 L13-34 Reorder Buffer Holds Active Instruction Window
[Figure: the instruction window at cycle t and at cycle t + 1, ordered from older to newer instructions, with the Commit, Execute, and Fetch positions marked; exact positions not recoverable from the transcription]
... (Older instructions)
ld r1, (r3)
add r3, r1, r2
sub r6, r7, r9
add r3, r3, r6
ld r6, (r1)
add r6, r6, r3
st r6, (r1)
ld r6, (r1)
... (Newer instructions)

35 L11-35 Branch Penalty
[Figure: pipeline from PC through I-cache, Fetch Buffer, Issue Buffer, Func. Units, and Result Buffer to Arch. State, with stages Fetch, Decode, Execute, Commit; the next fetch is started at the PC, the branch is executed at the functional units]
How many instructions need to be killed on a misprediction?
Modern processors may have > 10 pipeline stages between next PC calculation and branch resolution!

36 Getting CPI below 1
CPI >= 1 if issue only 1 instruction every clock cycle.
Multiple-issue processors come in 3 flavors:
1. Statically scheduled superscalar processors
  - In-order execution
  - Varying number of instructions issued (compiler)
2. Dynamically scheduled superscalar processors
  - Out-of-order execution
  - Varying number of instructions issued (CPU)
3. VLIW (very long instruction word) processors
  - In-order execution
  - Fixed number of instructions issued

37 VLIW: Very Long Instruction Word (1/2)
- Each VLIW has explicit coding for multiple operations
  - Several instructions combined into packets
  - Possibly with parallelism indicated
- Tradeoff instruction space for simple decoding
  - Room for many operations
  - Independent operations => execute in parallel
  - E.g., 2 integer operations, 2 FP ops, 2 memory refs, 1 branch

38 VLIW: Very Long Instruction Word (2/2)
- Assume 2 load/store, 2 FP, 1 int/branch
  - VLIW with 0-5 operations. Why 0?
- Important to avoid empty instruction slots
  - Loop unrolling
  - Local scheduling
  - Global scheduling (scheduling across branches)
- Difficult to find all dependencies in advance
  - Solution 1: block on memory accesses
  - Solution 2: CPU detects some dependencies

39 Recall: Unrolled Loop that Minimizes Stalls for Scalar
Source code: for (i = 1000; i > 0; i = i - 1) x[i] = x[i] + s;
Register mapping: s -> F2, i -> R1
Loop: L.D F0,0(R1)
L.D F6,-8(R1)
L.D F10,-16(R1)
L.D F14,-24(R1)
ADD.D F4,F0,F2
ADD.D F8,F6,F2
ADD.D F12,F10,F2
ADD.D F16,F14,F2
S.D F4,0(R1)
S.D F8,-8(R1)
DADDUI R1,R1,#-32
S.D F12,16(R1) ;16 - 32 = -16
S.D F16,8(R1) ;8 - 32 = -24
BNE R1,R2,Loop

40 Loop Unrolling in VLIW
Clock | Memory reference 1 | Memory reference 2 | FP operation 1 | FP op. 2 | Int. op/branch
1 | L.D F0,0(R1) | L.D F6,-8(R1) | | |
2 | L.D F10,-16(R1) | L.D F14,-24(R1) | | |
3 | L.D F18,-32(R1) | L.D F22,-40(R1) | ADD.D F4,F0,F2 | ADD.D F8,F6,F2 |
4 | L.D F26,-48(R1) | | ADD.D F12,F10,F2 | ADD.D F16,F14,F2 |
5 | | | ADD.D F20,F18,F2 | ADD.D F24,F22,F2 |
6 | S.D 0(R1),F4 | S.D -8(R1),F8 | ADD.D F28,F26,F2 | |
7 | S.D -16(R1),F12 | S.D -24(R1),F16 | | |
8 | S.D -32(R1),F20 | S.D -40(R1),F24 | | | DSUBUI R1,R1,#56
9 | S.D 8(R1),F28 | | | | BNEZ R1,LOOP
- Unrolled 7 iterations to avoid delays (7 iterations x 8 bytes = 56, so the final store uses offset 8 - 56 = -48)
- 7 results in 9 clocks, or 1.3 clocks per iteration (1.8X)
- Average: 2.5 ops per clock, 50% efficiency
- Note: Need more registers in VLIW (15 vs. 6 in SS)
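The figures quoted for this schedule can be rechecked from slot occupancy alone. The sketch below encodes each clock as five 0/1 flags, one per issue slot (memory 1, memory 2, FP 1, FP 2, integer/branch); the variable names are assumptions made for illustration.

```python
# One row per clock; 1 means the slot holds an operation in the schedule.
# Columns: mem ref 1, mem ref 2, FP op 1, FP op 2, int op / branch.
schedule = [
    (1, 1, 0, 0, 0),  # clock 1: two loads
    (1, 1, 0, 0, 0),  # clock 2: two loads
    (1, 1, 1, 1, 0),  # clock 3: two loads, two adds
    (1, 0, 1, 1, 0),  # clock 4: one load, two adds
    (0, 0, 1, 1, 0),  # clock 5: two adds
    (1, 1, 1, 0, 0),  # clock 6: two stores, one add
    (1, 1, 0, 0, 0),  # clock 7: two stores
    (1, 1, 0, 0, 1),  # clock 8: two stores, loop-counter update
    (1, 0, 0, 0, 1),  # clock 9: one store, branch
]
iterations = 7

clocks = len(schedule)
total_ops = sum(sum(row) for row in schedule)
total_slots = clocks * len(schedule[0])

clocks_per_iteration = clocks / iterations  # 9 / 7
ops_per_clock = total_ops / clocks          # 23 / 9
efficiency = total_ops / total_slots        # 23 / 45
```

The 23 operations are 7 loads, 7 adds, 7 stores, one counter update, and one branch. The exact ratios are 9/7 (about 1.29), 23/9 (about 2.56), and 23/45 (about 51%), which the slide rounds to 1.3, 2.5, and 50%.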

41 Problems with 1st Generation VLIW
- Increase in code size
  - Loop unrolling
  - Partially empty VLIW
- Operated in lock-step; no hazard detection HW
  - A stall in any functional unit pipeline causes entire processor to stall, since all functional units must be kept synchronized
  - Compiler might predict function units, but caches hard to predict
  - Modern VLIWs are "interlocked" (identify dependences between bundles and stall)
- Binary code compatibility
  - Strict VLIW => different numbers of functional units and unit latencies require different versions of the code

42 VLIW Tradeoffs
Advantages
- Simpler hardware because the HW does not have to identify independent instructions.
Disadvantages
- Relies on smart compiler
- Code incompatibility between generations
- There are limits to what the compiler can do (can't move loads above branches, can't move loads above stores)
Common uses
- Embedded market where hardware simplicity is important, applications exhibit plenty of ILP, and binary compatibility is a non-issue.

43 IA-64 and EPIC
- 64-bit instruction set architecture
  - Not a CPU, but an architecture
  - Itanium and Itanium 2 are CPUs based on IA-64
  - Made by Intel and Hewlett-Packard (Itanium 2 and 3 designed in Colorado)
- Uses EPIC: Explicitly Parallel Instruction Computing
- Departure from the x86 architecture
- Meant to achieve out-of-order performance with in-order HW + compiler smarts
  - Stop bits to help with code density
  - Support for control speculation (moving loads above branches)
  - Support for data speculation (moving loads above stores)
Details in Appendix G.6

44 Control Speculation
Can the compiler schedule an independent load above a branch?
Bne R1, R2, TARGET
Ld R3, R4(0)
What are the problems?
EPIC provides speculative loads:
Ld.s R3, R4(0)
Bne R1, R2, TARGET
Check R4(0)

45 Data Speculation
Can the compiler schedule an independent load above a store?
St R5, R6(0)
Ld R3, R4(0)
What are the problems?
EPIC provides advanced loads and an ALAT (Advanced Load Address Table):
Ld.a R3, R4(0) ; creates entry in ALAT
St R5, R6(0) ; looks up ALAT, if match, jump to fixup code
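The advanced-load mechanism can be sketched as a small table keyed by destination register. This follows the slide's simplified description (the store looks up the table and a match triggers fixup code) and is not a model of the real IA-64 hardware; the class name `ALAT`, its methods, and the addresses are hypothetical.

```python
class ALAT:
    """Toy Advanced Load Address Table: destination register -> load address."""

    def __init__(self):
        self.entries = {}

    def advanced_load(self, dest_reg, address):
        """Ld.a: record that dest_reg was loaded early from this address."""
        self.entries[dest_reg] = address

    def store(self, address):
        """St: return registers whose early load the store conflicts with.

        Matching entries are removed; a non-empty result means the fixup
        code must redo those loads.
        """
        conflicts = [reg for reg, addr in self.entries.items() if addr == address]
        for reg in conflicts:
            del self.entries[reg]
        return conflicts


alat = ALAT()
alat.advanced_load("R3", 0x1000)   # Ld.a R3, R4(0) with R4 = 0x1000 (assumed value)
no_conflict = alat.store(0x2000)   # store to a different address: load stays valid
conflict = alat.store(0x1000)      # store to the same address: R3 must be reloaded
```

The sketch compares whole addresses exactly; a real table would also have to account for access sizes and partial overlaps, which are left out here.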

46 EPIC Conclusions
Goal of EPIC was to maintain advantages of VLIW, but achieve performance of out-of-order.
Results:
- Complicated bundling rules save some space, but make the hardware more complicated
- Add special hardware and instructions for scheduling loads above stores and branches (new complicated hardware)
- Add special hardware to remove branch penalties (predication)
- End result is a machine as complicated as an out-of-order, but now also requiring a super-sophisticated compiler.

47 Multiple Issue
[Slide body not included in the transcription]

48 Dynamic Scheduling, Multiple Issue, and Speculation
Modern microarchitectures: dynamic scheduling + multiple issue + speculation
Two approaches:
- Assign reservation stations and update pipeline control table in half clock cycles
  - Only supports 2 instructions/clock
- Design logic to handle any possible dependencies between the instructions
- Hybrid approaches
Issue logic can become bottleneck

49 Overview of Design
[Figure not included in the transcription]

50 Multiple Issue
- Limit the number of instructions of a given class that can be issued in a "bundle"
  - I.e., one FP, one integer, one load, one store
- Examine all the dependencies among the instructions in the bundle
- If dependencies exist in bundle, encode them in reservation stations
- Also need multiple completion/commit

51 Example
Loop: LD R2,0(R1) ;R2=array element
DADDIU R2,R2,#1 ;increment R2
SD R2,0(R1) ;store result
DADDIU R1,R1,#8 ;increment pointer
BNE R2,R3,LOOP ;branch if not last element

52 Example (No Speculation)
[Figure not included in the transcription]

53 Example
[Figure not included in the transcription]

54 Branch-Target Buffer
(Section: Advanced Techniques for Instruction Delivery and Speculation)
Need high instruction bandwidth!
- Branch-target buffers
  - Next PC prediction buffer, indexed by current PC
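A next-PC prediction buffer indexed by the current PC can be sketched as a direct-mapped table. The organization below is an assumption chosen for brevity (direct-mapped, full PC stored as the tag, 4-byte instructions, entries inserted only for taken branches); the class name and all addresses are hypothetical.

```python
class BranchTargetBuffer:
    """Direct-mapped next-PC predictor; each entry is (branch PC, target PC)."""

    def __init__(self, num_entries, inst_bytes=4):
        self.num_entries = num_entries
        self.inst_bytes = inst_bytes
        self.table = [None] * num_entries

    def _index(self, pc):
        return (pc // self.inst_bytes) % self.num_entries

    def predict_next_pc(self, pc):
        """Return the stored target on a hit, otherwise the fall-through PC."""
        entry = self.table[self._index(pc)]
        if entry is not None and entry[0] == pc:
            return entry[1]
        return pc + self.inst_bytes

    def update(self, pc, taken, target):
        """After the branch resolves: remember taken branches, forget not-taken ones."""
        i = self._index(pc)
        if taken:
            self.table[i] = (pc, target)
        elif self.table[i] is not None and self.table[i][0] == pc:
            self.table[i] = None


btb = BranchTargetBuffer(num_entries=16)
miss = btb.predict_next_pc(0x100)    # no entry yet: predict fall-through
btb.update(0x100, taken=True, target=0x40)
hit = btb.predict_next_pc(0x100)     # entry present: predict the stored target
alias = btb.predict_next_pc(0x140)   # same index, different PC: tag mismatch
```

Because the lookup needs only the current PC, the prediction is available during fetch, before the instruction has been decoded as a branch.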

55 Branch Folding
Optimization:
- Larger branch-target buffer
- Add target instruction into buffer to deal with longer decoding time required by larger buffer
  - "Branch folding"

56 Return Address Predictor
- Most unconditional branches come from function returns
- The same procedure can be called from multiple sites
  - Causes the buffer to potentially forget about the return address from previous calls
- Create return address buffer organized as a stack
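A return address buffer organized as a stack can be sketched as follows. The fixed depth, the drop-oldest overflow policy, the class name, and the addresses are assumptions made for illustration.

```python
class ReturnAddressStack:
    """Small stack of predicted return addresses, pushed on calls, popped on returns."""

    def __init__(self, depth):
        self.depth = depth
        self.stack = []

    def on_call(self, return_address):
        """Push the address of the instruction after the call."""
        if len(self.stack) == self.depth:
            self.stack.pop(0)  # overflow: discard the oldest entry (assumed policy)
        self.stack.append(return_address)

    def predict_return(self):
        """Pop the most recent return address, or None when the stack is empty."""
        return self.stack.pop() if self.stack else None


ras = ReturnAddressStack(depth=4)

# The same procedure called from two different sites.
ras.on_call(0x104)
first = ras.predict_return()
ras.on_call(0x208)
second = ras.predict_return()

# Nested calls unwind in last-in, first-out order.
ras.on_call(0x10)
ras.on_call(0x20)
nested = [ras.predict_return(), ras.predict_return(), ras.predict_return()]
```

Each return is predicted from its own call site, which is exactly the case where a single branch-target-buffer entry for the return instruction would hold a stale target.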

57 Integrated Instruction Fetch Unit
Design monolithic unit that performs:
- Branch prediction
- Instruction prefetch
  - Fetch ahead
- Instruction memory access and buffering
  - Deal with crossing cache lines

58 How Much?
How much to speculate
- Mis-speculation degrades performance and power relative to no speculation
  - May cause additional misses (cache, TLB)
- Prevent speculative code from causing higher costing misses (e.g., L2)
Speculating through multiple branches
- Complicates speculation recovery
- No processor can resolve multiple branches per cycle

59 Review
- What is control speculation? What is data speculation?
- What are the advantages of a superscalar vs a VLIW?
- What are the disadvantages of a superscalar vs a VLIW?
- When is a VLIW appropriate?
- When is a superscalar appropriate?


6.823 Computer System Architecture Prerequisite Self-Assessment Test Assigned Feb. 6, 2019 Due Feb 11, 2019 6.823 Computer System Architecture Prerequisite Self-Assessment Test Assigned Feb. 6, 2019 Due Feb 11, 2019 http://csg.csail.mit.edu/6.823/ This self-assessment test is intended to help you determine your

More information

Chapter 3: Computer Organization Fundamentals. Oregon State University School of Electrical Engineering and Computer Science.

Chapter 3: Computer Organization Fundamentals. Oregon State University School of Electrical Engineering and Computer Science. Chapter 3: Computer Organization Fundamentals Prof. Ben Lee Oregon State University School of Electrical Engineering and Computer Science Chapter Goals Understand the organization of a computer system

More information

ENGN1640: Design of Computing Systems Topic 05: Pipeline Processor Design

ENGN1640: Design of Computing Systems Topic 05: Pipeline Processor Design ENGN64: Design of Computing Systems Topic 5: Pipeline Processor Design Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University

More information

Improving Performance: Pipelining!

Improving Performance: Pipelining! Iproving Perforance: Pipelining! Meory General registers Meory ID EXE MEM WB Instruction Fetch (includes PC increent) ID Instruction Decode + fetching values fro general purpose registers EXE EXEcute arithetic/logic

More information

CIS 662: Sample midterm w solutions

CIS 662: Sample midterm w solutions CIS 662: Sample midterm w solutions 1. (40 points) A processor has the following stages in its pipeline: IF ID ALU1 MEM1 MEM2 ALU2 WB. ALU1 stage is used for effective address calculation for loads, stores

More information

CMU Introduction to Computer Architecture, Spring 2013 HW 3 Solutions: Microprogramming Wrap-up and Pipelining

CMU Introduction to Computer Architecture, Spring 2013 HW 3 Solutions: Microprogramming Wrap-up and Pipelining CMU 18-447 Introduction to Computer Architecture, Spring 2013 HW 3 Solutions: Microprogramming Wrap-up and Pipelining Instructor: Prof. Onur Mutlu TAs: Justin Meza, Yoongu Kim, Jason Lin 1 Adding the REP

More information

CS 6354: Tomasulo. 21 September 2016

CS 6354: Tomasulo. 21 September 2016 1 CS 6354: Tomasulo 21 September 2016 To read more 1 This day s paper: Tomasulo, An Efficient Algorithm for Exploiting Multiple Arithmetic Units Supplementary readings: Hennessy and Patterson, Computer

More information

To read more. CS 6354: Tomasulo. Intel Skylake. Scheduling. How can we reorder instructions? Without changing the answer.

To read more. CS 6354: Tomasulo. Intel Skylake. Scheduling. How can we reorder instructions? Without changing the answer. To read more CS 6354: Tomasulo 21 September 2016 This day s paper: Tomasulo, An Efficient Algorithm for Exploiting Multiple Arithmetic Units Supplementary readings: Hennessy and Patterson, Computer Architecture:

More information

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Pipeline Hazards See P&H Chapter 4.7 Hakim Weatherspoon CS 341, Spring 213 Computer Science Cornell niversity Goals for Today Data Hazards Revisit Pipelined Processors Data dependencies Problem, detection,

More information

Optimality of Tomasulo s Algorithm Luna, Dong Gang, Zhao

Optimality of Tomasulo s Algorithm Luna, Dong Gang, Zhao Optimality of Tomasulo s Algorithm Luna, Dong Gang, Zhao Feb 28th, 2002 Our Questions about Tomasulo Questions about Tomasulo s Algorithm Is it optimal (can always produce the wisest instruction execution

More information

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Pipeline Hazards. See P&H Chapter 4.7. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Pipeline Hazards See P&H Chapter 4.7 Hakim Weatherspoon CS 341, Spring 213 Computer Science Cornell niversity Goals for Today Data Hazards Revisit Pipelined Processors Data dependencies Problem, detection,

More information

Problem Solving Using Algebraic Models. Algebraic model A mathematical statement that represents a real-life problem

Problem Solving Using Algebraic Models. Algebraic model A mathematical statement that represents a real-life problem 1.5 Problem Solving Using Algebraic s Goals Use a general roblem solving lan. Use other roblem solving strategies. Your Notes VOCABULARY model A word equation reresenting a real-life roblem Algebraic model

More information

Pipelined MIPS Datapath with Control Signals

Pipelined MIPS Datapath with Control Signals uction ess uction Rs [:26] (Opcode[5:]) [5:] ranch luor. Decoder Pipelined MIPS path with Signals luor Raddr at Five instruction sequence to be processed by pipeline: op [:26] rs [25:2] rt [2:6] rd [5:]

More information

MaxxForce TM 11 High Pressure and Oil. Engine Systems

MaxxForce TM 11 High Pressure and Oil. Engine Systems 2004 2006 2007 MaxxForce TM 11 High Pressure and Oil System MaxxForce Diagnostics 13 Engine Systems A N AV I S TA R C O M PA N Y Study Study Guide Guide TMT-120717 Study Guide MaxxForce High TM Pressure

More information

Direct-Mapped Cache Terminology. Caching Terminology. TIO Dan s great cache mnemonic. UCB CS61C : Machine Structures

Direct-Mapped Cache Terminology. Caching Terminology. TIO Dan s great cache mnemonic. UCB CS61C : Machine Structures Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 31 Caches II 2008-04-12 HP has begun testing research prototypes of a novel non-volatile memory element, the

More information

COMPARATIVE ANALYSIS OF PMAC MOTORS FOR EV AND HEV APPLICATIONS

COMPARATIVE ANALYSIS OF PMAC MOTORS FOR EV AND HEV APPLICATIONS COMPARATVE ANALYSS OF PMAC MOTORS FOR EV AND HEV APPLCATONS Velev B. nstitute of Electrochemistry and Energy Systems, Bulgaria Abstract: n current work is made comarison between the differences in structure

More information

UTILIZING WAVE ROTOR TECHNOLOGY TO ENHANCE THE TURBO COMPRESSION IN POWER AND REFRIGERATION CYCLES

UTILIZING WAVE ROTOR TECHNOLOGY TO ENHANCE THE TURBO COMPRESSION IN POWER AND REFRIGERATION CYCLES Proceedings of IMECE 3 3 ASME International Mechanical Engineering Congress & Exosition Washington, D.C., November -, 3 IMECE3- UTILIZING WAVE ROTOR TECHNOLOGY TO ENHANCE THE TURBO COMPRESSION IN POWER

More information

A New Decentralized Algorithm for Optimal Load Shifting via Electric Vehicles

A New Decentralized Algorithm for Optimal Load Shifting via Electric Vehicles Proceedings of the 36th Chinese Control Conference July 26-28, 2017, Dalian, China A New Decentralized Algorithm for Otimal Load Shifting via Electric Vehicles Hao Xing 1, Zhiyun Lin 1, Minyue Fu 2 1.

More information

GRUNDFOS DATA BOOKLET. Hydro Grundfos Hydro 1000 booster sets with 1-4 CR pumps 50 Hz

GRUNDFOS DATA BOOKLET. Hydro Grundfos Hydro 1000 booster sets with 1-4 CR pumps 50 Hz GRUNDFOS DATA BOOKLET ydro Grundfos ydro booster sets with 1-4 CR ums z Contents Product data Performance range 3 ydro 4 Tye key 4 Oerating conditions 4 Other versions on request 4 Function 5 Grundfos

More information

GRUNDFOS DATA BOOKLET. Hydro 1000 G - X. Grundfos Hydro 1000 G - X booster sets with 1-4 CR pumps 50 Hz

GRUNDFOS DATA BOOKLET. Hydro 1000 G - X. Grundfos Hydro 1000 G - X booster sets with 1-4 CR pumps 50 Hz GRUNDFOS DATA BOOKLET ydro 0 G - X Grundfos ydro 0 G - X booster sets with 1-4 CR ums z Contents Product data Performance range Page 3 ydro 0 G - X Page 4 Tye key Page 4 Oerating conditions Page 4 Other

More information

UC Berkeley CS61C : Machine Structures

UC Berkeley CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C : Machine Structures Lecture 20 Synchronous Digital Systems Blu-ray vs HD-DVD war over? As you know, there are two different, competing formats for the next

More information

CS 152 Computer Architecture and Engineering

CS 152 Computer Architecture and Engineering CS 152 Computer Architecture and Engineering Lecture 23 Synchronization 2006-11-16 John Lazzaro (www.cs.berkeley.edu/~lazzaro) TAs: Udam Saini and Jue Sun www-inst.eecs.berkeley.edu/~cs152/ 1 Last Time:

More information

9SX SX PX PX SX EBM 180V 9PX EBM 180V. Installation and user manual ENGLISH. Copyright 2012 EATON All rights reserved.

9SX SX PX PX SX EBM 180V 9PX EBM 180V. Installation and user manual ENGLISH. Copyright 2012 EATON All rights reserved. ENGLISH 9SX 5000 9SX 6000 9PX 5000 9PX 6000 9SX EBM 180V 9PX EBM 180V Installation and user manual Coyright 2012 EATON All rights reserved. Service and suort: Call your local service reresentative Page

More information

CS 250! VLSI System Design

CS 250! VLSI System Design CS 250! VLSI System Design Lecture 3 Timing 2014-9-4! Professor Jonathan Bachrach! slides by John Lazzaro TA: Colin Schmidt www-insteecsberkeleyedu/~cs250/ UC Regents Fall 2013/1014 UCB everything doesn

More information

Mine Ventilation Solutions. Quiet. Efficient. Durable. a company of

Mine Ventilation Solutions. Quiet. Efficient. Durable. a company of Mine Ventilation Solutions Quiet. Efficient. Durable. a comany of TLT-Turbo - A cororate success story. Innovation for more than 140 years. As a ioneer and innovator in the fan and blower market with a

More information

Chapter 10 And, Finally... The Stack

Chapter 10 And, Finally... The Stack Chapter 10 And, Finally... The Stack Stacks: An Abstract Data Type A LIFO (last-in first-out) storage structure. The first thing you put in is the last thing you take out. The last thing you put in is

More information

In-Place Associative Computing:

In-Place Associative Computing: In-Place Associative Computing: A New Concept in Processor Design 1 Page Abstract 3 What s Wrong with Existing Processors? 3 Introducing the Associative Processing Unit 5 The APU Edge 5 Overview of APU

More information

ABSTRACT. Keywords: flat electrodynamic tether, grazing impact, ballistic limit 1. INTRODUCTION

ABSTRACT. Keywords: flat electrodynamic tether, grazing impact, ballistic limit 1. INTRODUCTION SURVVIBILITY TO HYPERVELOCITY IMPCTS OF ELECTRODYNMIC TPE TETHERS FOR DEORBITING SPCECRFT IN LEO. Francesconi*, C. Giacomuzzo*, F. Branz*, E.C. Lorenzini* *University of Padova CISS G. Colombo, Padova,

More information

M2 Instruction Set Architecture

M2 Instruction Set Architecture M2 Instruction Set Architecture Module Outline Addressing modes. Instruction classes. MIPS-I ISA. High level languages, Assembly languages and object code. Translating and starting a program. Subroutine

More information

Programming Languages (CS 550)

Programming Languages (CS 550) Programming Languages (CS 550) Mini Language Compiler Jeremy R. Johnson 1 Introduction Objective: To illustrate how to map Mini Language instructions to RAL instructions. To do this in a systematic way

More information

Lecture 31 Caches II TIO Dan s great cache mnemonic. Issues with Direct-Mapped

Lecture 31 Caches II TIO Dan s great cache mnemonic. Issues with Direct-Mapped CS61C L31 Caches II (1) inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C : Machine Structures Lecture 31 Caches II 26-11-13 Lecturer SOE Dan Garcia www.cs.berkeley.edu/~ddgarcia GPUs >> CPUs? Many are using

More information

Artificial Neural Network Based Modeling of Injection Pressure in Diesel Engines

Artificial Neural Network Based Modeling of Injection Pressure in Diesel Engines Artificial Neural Network Based Modeling of Injection Pressure in Diesel Engines MALI AKCAYOL, CAN CINAR, HIBRAHIM BULBUL, ALI KILICARSALAN 4 Deartment of Comuter Engineering, Gazi University, Maltee,

More information

Supersonic (Engine) Inlets

Supersonic (Engine) Inlets School of Aerosace Engineering Suersonic (Engine) Inlets For air-breathing engines on suersonic vehicles, usually want to slow flow down to subsonic seeds inside engine need diffuser (M>1 M

More information

EECS 583 Class 9 Classic Optimization

EECS 583 Class 9 Classic Optimization EECS 583 Class 9 Classic Optimization University of Michigan September 28, 2016 Generalizing Dataflow Analysis Transfer function» How information is changed by something (BB)» OUT = GEN + (IN KILL) /*

More information

Special conveyor chains

Special conveyor chains 56 Secial conveyor chains iwis offers an extensive rogram of secial chains for various industrial alications and requirements. While the late chain is being used wherever smooth and reliable conveying

More information

Flywheel Energy Storage Systems for Rail

Flywheel Energy Storage Systems for Rail Imerial College London Deartment of Mechanical Engineering Flywheel Energy Storage Systems for ail Matthew ead November 2010 Thesis submitted for the Diloma of the Imerial College (DIC), PhD degree of

More information

Simplex roller chains - European design... 9 according to standards čsn , din 8187 and Iso 606

Simplex roller chains - European design... 9 according to standards čsn , din 8187 and Iso 606 AND BUSH CHAINS Simlex roller chains - Euroean design... 9 according to standards čsn 02 3311, din 8187 and Iso 606 Dulex roller chains - Euroean design... 10 according to standards čsn 02 3311, din 8187

More information

CMPEN 411 VLSI Digital Circuits Spring Lecture 24: Peripheral Memory Circuits

CMPEN 411 VLSI Digital Circuits Spring Lecture 24: Peripheral Memory Circuits CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 24: Peripheral Memory Circuits [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp12

More information

Storage and Memory Hierarchy CS165

Storage and Memory Hierarchy CS165 Storage and Memory Hierarchy CS165 What is the memory hierarchy? L1

More information

MECHANICAL ENGINEERING

MECHANICAL ENGINEERING Serial : 0 PT_ME_A+B_IC Engine_6098 CLASS TEST Delhi Noida Bhoal Hyderabad Jaiur Lucknow Indore Pune Bhubaneswar Kolkata Patna Web: E-mail: info@madeeasy.in Ph: 0-56 MECHANICAL ENGINEERING I.C. Engine

More information

DL8FM-TW INSTALLATION INSTRUCTIONS

DL8FM-TW INSTALLATION INSTRUCTIONS DL8FM-TW INSTALLATION INSTRUCTIONS INDEX: WIRING INSTRUCTIONS... G 2-4 DOOR LOCKS/UNLOCKS... G 4-8 HI-JACK ROGRAMMING... G 9 LED STATUS INDICATOR... G 9 SHOCK SENSOR... G 9 NEG(-) HORN HONK OUTUT... G

More information

ON THE SAFETY OF HYDRATE REMEDIATION BY ONE-SIDED DEPRESSURIZATION

ON THE SAFETY OF HYDRATE REMEDIATION BY ONE-SIDED DEPRESSURIZATION Proceedings of the 7th International Conference on Gas Hydrates (ICGH 2011), Edinburgh, Scotland, United Kingdom, July 17-21, 2011. ON THE SAFETY OF HYDRATE REMEDIATION BY ONE-SIDED DEPRESSURIZATION Ricardo

More information

Series 32D Electronic Pressure Switch for Hydraulic and High Pressure Applications. Series 18D Hydraulic. Pressure Switch

Series 32D Electronic Pressure Switch for Hydraulic and High Pressure Applications. Series 18D Hydraulic. Pressure Switch Contents Pressure Switch Overview.......................2 Series 18D Pneumatic Pressure Switches...........6 Series 18D Hydraulic Pressure Switches...........10 Series 31D Electronic Pressure Switches

More information

THERMODYNAMICS AND ENGINE CYCLES

THERMODYNAMICS AND ENGINE CYCLES CHAPTER 4 THERMODYNAMICS AND ENGINE CYCLES 4.1 Introduction In this chater, a brief engine history is resented to trace some of the thermodynamic ideas that are used in modern engines. The ideal gas law

More information

WHATEVER THE DISTANCE. LINEA &TOUREA High-Comfort Bus Driver Seats.

WHATEVER THE DISTANCE. LINEA &TOUREA High-Comfort Bus Driver Seats. HOME ON THE ROAD. WHATEVER THE DISTANCE. LINEA &TOUREA High-Comfort Bus Driver Seats. SITTING RETTY WITH A HEALTHY OSTURE. Sitting Comfort with roofen Ergonomics. Otimum seat ergonomics decisively imrove

More information

CMPEN 411 VLSI Digital Circuits Spring Lecture 20: Multiplier Design

CMPEN 411 VLSI Digital Circuits Spring Lecture 20: Multiplier Design CMPEN 411 VLSI Digital Circuits Spring 2011 Lecture 20: Multiplier Design [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp11 CMPEN 411

More information

128Mb Synchronous DRAM. Features High Performance: Description. REV 1.0 May, 2001 NT5SV32M4CT NT5SV16M8CT NT5SV8M16CT

128Mb Synchronous DRAM. Features High Performance: Description. REV 1.0 May, 2001 NT5SV32M4CT NT5SV16M8CT NT5SV8M16CT Features High Performance: f Clock Frequency -7K 3 CL=2-75B, CL=3-8B, CL=2 Single Pulsed RAS Interface Fully Synchronous to Positive Clock Edge Four Banks controlled by BS0/BS1 (Bank Select) Units 133

More information

GRUNDFOS DATA BOOKLET. CHV booster. Hydro Pack, Hydro Dome 50 Hz

GRUNDFOS DATA BOOKLET. CHV booster. Hydro Pack, Hydro Dome 50 Hz GRUNDFOS DATA BOOKLET CV booster ydro Pack, ydro Dome 5 z Contents ydro Pack General data Performance range Page 3 Alications Page 4 General descrition Page 4 Oerating conditions Page 4 Tye key Page 4

More information

[EN-037] Airborne Conflict Modeling and Resolution for UAS Insertion in Civil Non-Segregated Airspace

[EN-037] Airborne Conflict Modeling and Resolution for UAS Insertion in Civil Non-Segregated Airspace ENRI Int. Worksho on ATM/CNS. Tokyo, Jaan. (EIWAC 2010) [EN-037] Airborne Conflict Modeling and Resolution for UAS Insertion in Civil Non-Segregated Airsace (EIWAC 2010) C.A. Persiani*, S. Bagassi** *Deartment

More information

Investing in. Mattei Compressors. Compressed Air Industry Automotive Transit. Good morning. How is business at Mattei Compressors?

Investing in. Mattei Compressors. Compressed Air Industry Automotive Transit. Good morning. How is business at Mattei Compressors? New Flyer Bus Audit Mattei Rotary Vane O c t o b e r 2 0 0 8 1 0 / 0 8 Comressed Air Industry Automotive Transit Investing in Mattei Comressors Comressed Air Best Practices soke with Jay Hedges (General

More information

Warped-Compression: Enabling Power Efficient GPUs through Register Compression

Warped-Compression: Enabling Power Efficient GPUs through Register Compression WarpedCompression: Enabling Power Efficient GPUs through Register Compression Sangpil Lee, Keunsoo Kim, Won Woo Ro (Yonsei University*) Gunjae Koo, Hyeran Jeon, Murali Annavaram (USC) (*Work done while

More information

CprE 281: Digital Logic

CprE 281: Digital Logic CprE 28: Digital Logic Instructor: Alexander Stoytchev http://www.ece.iastate.edu/~alexs/classes/ Registers and Counters CprE 28: Digital Logic Iowa State University, Ames, IA Copyright Alexander Stoytchev

More information

The Control System for the Production of Biodiesel

The Control System for the Production of Biodiesel INTERNATINAL JURNAL F CIRCUITS, SYSTEMS AND SIGNAL PRCESSING The Control System for the Production of Biodiesel Stanislav Plšek, Vladimír Vašek Abstract This article describes the control unit for the

More information

Fixing the Hyperdrive: Maximizing Rendering Performance on NVIDIA GPUs

Fixing the Hyperdrive: Maximizing Rendering Performance on NVIDIA GPUs Fixing the Hyperdrive: Maximizing Rendering Performance on NVIDIA GPUs Louis Bavoil, Principal Engineer Booth #223 - South Hall www.nvidia.com/gdc Full-Screen Pixel Shader SM TEX L2 DRAM CROP SM = Streaming

More information

Advantages of using a Switched Reluctance Generator (SRG) for wind energy applications

Advantages of using a Switched Reluctance Generator (SRG) for wind energy applications Advantages of using a Switched Reluctance Generator (SRG) for wind energy alications Eleonora Darie, Costin Ceisca, Emanuel Darie Abstract Wind energy found to be one of the most useful solutions to hel

More information

ScienceDirect. Highly flexible hot gas generation system for turbocharger testing

ScienceDirect. Highly flexible hot gas generation system for turbocharger testing Available online at www.sciencedirect.com ScienceDirect Energy Procedia 45 ( 2014 ) 1116 1125 68 th Conference of the Italian hermal Machines Engineering Association, AI2013 Highly flexible hot gas generation

More information

machine design, Vol.3(2011) No.3, ISSN pp

machine design, Vol.3(2011) No.3, ISSN pp machine design, Vol.3(2011) No.3, ISSN 1821-1259. 189-194 Preliminary note ESTIMATION OF THE DRIVING FORCE AND DRAG FORCE OF THE POWERTRAIN SYSTEM WITH THE USE OF A UNIVERSAL PORTABLE DEVICE IN ROAD TEST

More information

24 Vdc ELECTRONIC CONTROL UNIT FOR SWING GATES USE INSTRUCTIONS - INSTALLATION INSTRUCTIONS

24 Vdc ELECTRONIC CONTROL UNIT FOR SWING GATES USE INSTRUCTIONS - INSTALLATION INSTRUCTIONS 24 Vdc ELECTRONIC CONTROL UNIT FOR SWING GTES USE INSTRUCTIONS INSTLLTION INSTRUCTIONS 1. GENERL CHRCTERISTICS This 24 Vdc control unit for swing gates offers high erformance and a wide range of adjustments:

More information

Lecture Secure, Trusted and Trustworthy Computing Trusted Execution Environments Intel SGX

Lecture Secure, Trusted and Trustworthy Computing Trusted Execution Environments Intel SGX 1 Lecture Secure, and Trustworthy Computing Execution Environments Intel Prof. Dr.-Ing. Ahmad-Reza Sadeghi System Security Lab Technische Universität Darmstadt (CASED) Germany Winter Term 2015/2016 Intel

More information

Energy Efficient Content-Addressable Memory

Energy Efficient Content-Addressable Memory Energy Efficient Content-Addressable Memory Advanced Seminar Computer Engineering Institute of Computer Engineering Heidelberg University Fabian Finkeldey 26.01.2016 Fabian Finkeldey, Energy Efficient

More information

GRUNDFOS DATA BOOKLET. Hydro MPC. Booster systems with 2 to 6 pumps 50 Hz

GRUNDFOS DATA BOOKLET. Hydro MPC. Booster systems with 2 to 6 pumps 50 Hz GRUNDFOS DATA BOOKLET ydro MPC Booster systems with 2 to 6 ums 50 z Contents Introduction Benefits 3 Product data Performance range 5 Product range 6 Tye key 7 Oerating conditions 7 Construction Pum 8

More information

Research Note PRACTICAL IMPLEMENTATION OF MULTI-MOTOR DRIVES FOR WIDE SPAN GANTRY CRANES *

Research Note PRACTICAL IMPLEMENTATION OF MULTI-MOTOR DRIVES FOR WIDE SPAN GANTRY CRANES * ranian Journal of Science & Technology, Transaction B: Engineering, Vol. 34, No. B6, 649-654 Printed in The slamic Reublic of ran, 2010 Shiraz University Research Note PRACTCAL MPLEMENTATON OF MULT-MOTOR

More information

2View how. 1View how to. 3View how to. 4Customer. 5, 7.5, 10 T Dump Kit Owner s Manual

2View how. 1View how to. 3View how to. 4Customer. 5, 7.5, 10 T Dump Kit Owner s Manual 5, 7.5, 10 T Dum Kit Owner s Manual Log on to www.youtube.com/iercearrowinc or use your QR code reader to view the following videos of a 5T installation: 1View how to begin your installation. 2View how

More information

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 02

More information

GRUNDFOS DATA BOOKLET. Hydro MPC. Booster systems with 2 to 6 pumps 50 Hz

GRUNDFOS DATA BOOKLET. Hydro MPC. Booster systems with 2 to 6 pumps 50 Hz GRUNDFOS DATA BOOKLET ydro MPC Booster systems with 2 to 6 ums 50 z Contents Introduction Benefits 3 Product data Performance range 5 Product range 6 Tye key 7 Oerating conditions 7 Construction Pum 8

More information

TALFM-TW INSTALLATION INSTRUCTIONS

TALFM-TW INSTALLATION INSTRUCTIONS TALFM-TW INSTALLATION INSTRUCTIONS INDEX: WIRING INSTRUCTIONS... G 2-4 DOOR LOCKS/UNLOCKS... G 4-7 HI-JACK ROGRAMMING... G 8 LED STATUS INDICATOR... G 9 SHOCK SENSOR... G 9 NEG(-) HORN HONK OUTUT... G

More information

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017 ECE 550D Fundamentals of Computer Systems and Engineering Fall 2017 Digital Arithmetic Prof. John Board Duke University Slides are derived from work by Profs. Tyler Bletch and Andrew Hilton (Duke) Last

More information

VECTOR INVERTER -INSTRUCTION MANUAL- ADDITIONAL OPEN COLLECTOR OUTPUT / PLG PULSE DIVISION OUTPUT FR-V5AY

VECTOR INVERTER -INSTRUCTION MANUAL- ADDITIONAL OPEN COLLECTOR OUTPUT / PLG PULSE DIVISION OUTPUT FR-V5AY VECTOR INVERTER -INSTRUCTION MANUAL- ADDITIONAL OPEN COLLECTOR OUTPUT / PLG PULSE DIVISION OUTPUT FR-V5AY Thank you for choosing the Mitsubishi vector inverter option unit. This instruction manual gives

More information

Power Electronics 11. Thyristors. Electronic Science. Module -11. Thyristors

Power Electronics 11. Thyristors. Electronic Science. Module -11. Thyristors 1 Module -11 Thyristors 1. Introduction 2. Classification of Thyristors 3. Unidirectional Thyristors with Turn-On Caability 3.1. Phase-controlled thyristors (or SCRs) 3.2. Fast switching thyristors (or

More information

BN SERIES BATTERY NOTCHING. BN Series Battery Notching. For lithium-ion battery production

BN SERIES BATTERY NOTCHING. BN Series Battery Notching. For lithium-ion battery production BN SERIES BATTERY NOTCHING BN Series Battery Notching For lithium-ion battery roduction 2 BN SERIES BATTERY NOTCHING MANZ AG 3 BN SERIES BATTERY NOTCHING MANZ AG GERMAN ENGINEERING INTERNATIONALLY STAGED

More information

The Discussion of this exercise covers the following points:

The Discussion of this exercise covers the following points: Exercise 3-2 Hydraulic Brakes EXERCISE OBJECTIVE When you have completed this exercise, you will be familiar with the hydraulic circuits of the yaw and the rotor brakes. You will control brakes by changing

More information

VHDL (and verilog) allow complex hardware to be described in either single-segment style to two-segment style

VHDL (and verilog) allow complex hardware to be described in either single-segment style to two-segment style FFs and Registers In this lecture, we show how the process block is used to create FFs and registers Flip-flops (FFs) and registers are both derived using our standard data types, std_logic, std_logic_vector,

More information