ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017


ECE 550D: Fundamentals of Computer Systems and Engineering, Fall 2017. Digital Arithmetic. Prof. John Board, Duke University. Slides are derived from work by Profs. Tyler Bletsch and Andrew Hilton (Duke).

Last Time in ECE 550. Who can remind us what we talked about last time?
Numbers: one-hot, binary, hex.
Digital logic: sum of products, encoders, decoders.
Binary numbers and math: overflow.

Designing a 1-bit adder, or half adder.
What Boolean function describes the low bit? XOR.
What Boolean function describes the high bit? AND.
0 + 0 = 00
0 + 1 = 01
1 + 0 = 01
1 + 1 = 10

Designing a 1-bit adder (full adder). Remember how we did binary addition: add the two bits. Do we have a carry-in for this bit? Do we have to carry out to the next bit?
  01101100   (carry out of each column)
  01101101
+ 00101100
----------
  10011001

Designing a 1-bit adder (full adder). So we'll need to add three bits (including the carry-in). The two-bit output is the carry-out and the sum.
a + b + Cin
0 + 0 + 0 = 00
0 + 0 + 1 = 01
0 + 1 + 0 = 01
0 + 1 + 1 = 10
1 + 0 + 0 = 01
1 + 0 + 1 = 10
1 + 1 + 0 = 10
1 + 1 + 1 = 11

A 1-bit Full Adder. Using just 2-input gates, exploiting the associativity of XOR.
[Figure: gate-level full adder with inputs A, B, Cin and outputs Sum, Cout]
a  b  Cin | Sum  Cout
0  0  0   |  0    0
0  0  1   |  1    0
0  1  0   |  1    0
0  1  1   |  0    1
1  0  0   |  1    0
1  0  1   |  0    1
1  1  0   |  0    1
1  1  1   |  1    1
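The truth table above can be checked with a short Python sketch. The function name `full_adder` and the particular gate arrangement (two XORs, two ANDs, one OR) are assumptions made for illustration; the slide's figure may wire the gates differently.

```python
def full_adder(a, b, cin):
    """1-bit full adder from 2-input gates; returns (cout, sum)."""
    p = a ^ b                    # first XOR: a ^ b
    s = p ^ cin                  # second XOR: (a ^ b) ^ cin
    cout = (a & b) | (p & cin)   # carry if both inputs are 1, or exactly one is 1 and cin is 1
    return cout, s

# Print every row of the truth table: a b cin -> cout sum
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            cout, s = full_adder(a, b, cin)
            print(a, b, cin, "->", cout, s)
```

Because XOR is associative, `(a ^ b) ^ cin` gives the same sum bit as `a ^ (b ^ cin)`, which is what lets the 3-input sum be built from 2-input gates.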

Ripple Carry.
[Figure: four full adders chained by their carries; inputs a3 b3, a2 b2, a1 b1, a0 b0; outputs S3, S2, S1, S0 and Cout]
Full adder = add 1 bit. Can chain together to add many bits. Upside: simple. Downside? Slow. Let's see why.

Full adder delay.
[Figure: gate-level full adder with inputs A, B, Cin and outputs Sum, Cout]
Cout depends on Cin. There are 2 gate delays through a single full adder cell for the carry.

Ripple Carry.
[Figure: four full adders chained by their carries; inputs a3 b3, a2 b2, a1 b1, a0 b0; outputs S3, S2, S1, S0 and Cout]
Carries form a chain: the CO of bit N is the CI of bit N+1. For a few bits (e.g., 4), no big deal. For realistic numbers of bits (e.g., 32, 64), slow. For 64 bits in the worst case, how many gate delays? At 2 gate delays per bit, 2 x 64 = 128. N.B.: the variability itself is problematic!
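A minimal sketch of the carry chain in Python, assuming a `full_adder` helper like the cell described earlier (the names `full_adder` and `ripple_carry_add` are illustrative, not from the slides). The loop is deliberately sequential: each iteration needs the carry produced by the previous one, which is exactly why the hardware version is slow.

```python
def full_adder(a, b, cin):
    """1-bit full adder; returns (cout, sum)."""
    p = a ^ b
    return (a & b) | (p & cin), p ^ cin

def ripple_carry_add(a, b, width=4, cin=0):
    """Add two unsigned ints one bit at a time; returns (sum, cout)."""
    total, carry = 0, cin
    for i in range(width):
        # The carry out of bit i becomes the carry in of bit i + 1.
        carry, s = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry

# The 8-bit example from the earlier slide: 01101101 + 00101100
total, cout = ripple_carry_add(0b01101101, 0b00101100, width=8)
print(f"{total:08b} cout={cout}")
```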

Adding. Adding is important. We want to fit an add in a single clock cycle (more on clocking soon). Why? Add is ubiquitous. Ripple carry is slow. Maybe we can do better? But it seems like Cin always depends on the previous Cout, and Cout always depends on the current Cin.

Hardware != Software. If this were software, we'd be out of luck. But hardware is different. Parallelism: can do many things at once. Speculation: can guess.

Carry Select.
[Figure: three 16-bit RC adders (high half with CI = 1, high half with CI = 0, low half with CI = 0) and a 16-bit 2:1 mux; inputs A[31:16], B[31:16], A[15:0], B[15:0]; outputs Sum[31:16], Sum[15:0]]
Do three things at once (32 gate delays): add the low 16 bits, add the high 16 bits assuming CI = 0, and add the high 16 bits assuming CI = 1. Then pick the correct assumption for the high bits (2-3 gate delays). Cuts the time roughly in half.
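The selection step can be sketched in Python as follows. Here `add16` is a hypothetical stand-in for a 16-bit ripple-carry adder (it uses Python's `+` rather than modeling gates), and `carry_select_add32` is an illustrative name. In hardware the three adds run at the same time; the sequential code only shows the data flow.

```python
MASK16 = 0xFFFF

def add16(a, b, cin):
    """Stand-in for a 16-bit ripple-carry adder; returns (sum, cout)."""
    total = a + b + cin
    return total & MASK16, total >> 16

def carry_select_add32(a, b):
    """32-bit add from three 16-bit adds and a 2:1 mux; returns (sum, cout)."""
    a_lo, b_lo = a & MASK16, b & MASK16
    a_hi, b_hi = (a >> 16) & MASK16, (b >> 16) & MASK16

    lo_sum, lo_cout = add16(a_lo, b_lo, 0)   # low half
    hi_if_0 = add16(a_hi, b_hi, 0)           # high half, guessing CI = 0
    hi_if_1 = add16(a_hi, b_hi, 1)           # high half, guessing CI = 1

    # The 2:1 mux: the real carry out of the low half picks the right guess.
    hi_sum, cout = hi_if_1 if lo_cout else hi_if_0
    return (hi_sum << 16) | lo_sum, cout

print(hex(carry_select_add32(0x0000FFFF, 0x00000001)[0]))
```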

Carry Select.
[Figure: same structure as before, with the three 16-bit RC adders replaced by 16-bit CS adders]
Could apply the same idea again: replace the 16-bit RC adders with 16-bit CS adders (each built out of three 8-bit RC adders). This reduces the delay for a 16-bit add from 32 to 18. Total 32-bit adder delay = 20. So just go nuts with this, right?
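The delay figures quoted here follow from a simple recurrence, sketched below. The model is an assumption for illustration: 2 gate delays per ripple-carry bit, 2 gate delays per mux, and each carry-select level halving the width. The function name `adder_delay` is hypothetical, and the model ignores wire delay.

```python
def adder_delay(bits, levels, delays_per_bit=2, mux_delay=2):
    """Estimated worst-case gate delays for a `bits`-wide adder
    built with `levels` levels of carry select."""
    if levels == 0:
        # Plain ripple carry: the carry passes through every bit.
        return bits * delays_per_bit
    # One level of carry select: a half-width add, then one mux.
    return adder_delay(bits // 2, levels - 1, delays_per_bit, mux_delay) + mux_delay

for levels in range(3):
    print(levels, "levels:", adder_delay(32, levels), "gate delays")
```

Under these assumptions the 32-bit delay goes 64, 34, 20 as levels are added, so each extra level saves less than the one before.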

Tradeoffs. Tradeoffs in doing this:
Power and area (~= number of gates) roughly double with every level of carry select we use.
Less return on the increase each time: we keep adding more mux delays.
Wire delays increase with area. They are not easy to count in slides, but they will eat into real performance.
Fancier adders exist: carry-lookahead, conditional sum adder, carry-skip adder, carry-complete adder, etc.

Recall: Subtraction. 2's complement makes subtraction easy. Remember: A - B = A + (-B). And: -B = ~B + 1, where ~ means flip the bits (NOT). So we just flip the bits of B and start with CI = 1. Fortunate for us: makes circuits easy.
  0110101                1   (CI)
- 1010010    ->    0110101
                 + 0101101

32-bit Adder/subtractor.
[Figure: 32-bit adder with inputs A and B, where a 32-way 2:1 mux controlled by Add/Sub feeds the B input and Add/Sub also drives Cin; outputs Sum, Cout, Ovf]
Inputs: A, B, Add/Sub (0 = Add, 1 = Sub). Outputs: Sum, Cout, Ovf (Overflow).
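A behavioral sketch of this block in Python, using the identity A - B = A + ~B + 1. The function name `add_sub32` is illustrative. The overflow rule used here (both adder inputs have the same sign, but the result's sign differs) is the usual two's-complement rule and is an assumption; the slide does not say how Ovf is computed.

```python
MASK32 = 0xFFFFFFFF

def add_sub32(a, b, sub):
    """sub=0 computes a + b; sub=1 computes a - b.
    Returns (sum, cout, ovf) for 32-bit two's complement."""
    a &= MASK32
    b &= MASK32
    b_in = (b ^ MASK32) if sub else b   # 32-way 2:1 mux: B or ~B
    total = a + b_in + sub              # Add/Sub also drives Cin
    result = total & MASK32
    cout = total >> 32
    sign_a, sign_b, sign_r = a >> 31, b_in >> 31, result >> 31
    ovf = int(sign_a == sign_b and sign_r != sign_a)
    return result, cout, ovf

print(add_sub32(5, 3, 1))            # 5 - 3
print(add_sub32(0x7FFFFFFF, 1, 0))   # largest positive int + 1 overflows
```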

32-bit Adder/subtractor. By the way: with a fast adder, that thing has about 3,000 transistors. Aren't you glad we have abstraction?

Arithmetic Logic Unit (ALU). ALUs do a variety of math/logic: add, subtract, bit-wise operations (AND, OR, XOR, NOT), and shift (left or right). They take two inputs (A, B) plus an operation (add, shift, ...). They do a variety of operations in parallel, then mux based on the op.

Bit-wise operations: SHIFT.
Left shift (<<): moves bits left, bringing in 0s at the right; excess bits fall off.
10010001 << 2 = 01000100
x << k corresponds to x * 2^k.
Logical (or unsigned) right shift (>>): moves bits right, bringing in 0s at the left; excess bits fall off.
10010001 >> 3 = 00010010
x >> k corresponds to (approximately) x / 2^k for unsigned x.
Arithmetic (or signed) right shift (>>): moves bits right, bringing in copies of the sign bit at the left.
10010001 >> 3 = 11110010
00010001 >> 3 = 00000010
x >> k corresponds to (approximately) x / 2^k for signed x.
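The three shifts can be tried out on 8-bit values with the Python sketch below. The helper names `shl8`, `lsr8`, and `asr8` are illustrative. Python integers are unbounded, so the code masks to 8 bits explicitly; it also relies on Python's `>>` being an arithmetic shift for negative integers.

```python
MASK8 = 0xFF

def shl8(x, k):
    """Left shift: 0s come in at the right, excess bits fall off."""
    return (x << k) & MASK8

def lsr8(x, k):
    """Logical right shift: 0s come in at the left."""
    return (x & MASK8) >> k

def asr8(x, k):
    """Arithmetic right shift: copies of the sign bit come in at the left."""
    x &= MASK8
    signed = x - 256 if x & 0x80 else x   # reinterpret as signed 8-bit
    return (signed >> k) & MASK8          # Python's >> sign-extends negatives

for name, value in [("shl8(10010001, 2)", shl8(0b10010001, 2)),
                    ("lsr8(10010001, 3)", lsr8(0b10010001, 3)),
                    ("asr8(10010001, 3)", asr8(0b10010001, 3)),
                    ("asr8(00010001, 3)", asr8(0b00010001, 3))]:
    print(name, "=", f"{value:08b}")
```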

Shift: Implementation? Suppose an 8-bit number b7 b6 b5 b4 b3 b2 b1 b0 is shifted left by a 3-bit number s2 s1 s0.
Option 1: Truth table? 11 inputs, 8 outputs, 2048 rows. Not appealing, but you can do it. The truth table gives this sum-of-products expression for output bit 0, with one product term for each of the 128 combinations of b1 through b7:
(b0 & !b1 & !b2 & !b3 & !b4 & !b5 & !b6 & !b7 & !s0 & !s1 & !s2) |
(b0 & b1 & !b2 & !b3 & !b4 & !b5 & !b6 & !b7 & !s0 & !s1 & !s2) |
(b0 & !b1 & b2 & !b3 & !b4 & !b5 & !b6 & !b7 & !s0 & !s1 & !s2) |
...
(b0 & b1 & b2 & b3 & b4 & b5 & b6 & b7 & !s0 & !s1 & !s2)
All 128 terms share b0 & !s0 & !s1 & !s2, which is what the expression reduces to.

Let's simplify: 1-bit left shifter. Simpler problem: an 8-bit number shifted left by a 1-bit number (the shift amount selects each mux).
[Figure: one column of 2:1 muxes with inputs b7 ... b0 and a constant 0, producing out7 ... out0]

Let's simplify: 2-bit left shifter. Simpler problem: an 8-bit number shifted by a 2-bit number (0, 1, 2, or 3 places).
[Figure: two columns of 2:1 muxes with inputs b7 ... b0 and a constant 0, producing out7 ... out0]

Now left shifted by a 3-bit number. Full problem: an 8-bit number shifted by a 3-bit number (0-7 bit shift).
[Figure: three columns of 2:1 muxes with inputs b7 ... b0 and a constant 0, producing out7 ... out0]

Now shifted by a 3-bit number. Shifter in action: shift by 000 (all muxes have S = 0).
[Figure: the three-column mux shifter with inputs b7 ... b0 and outputs out7 ... out0]

Now shifted by a 3-bit number. Shifter in action: shift by 010. From left to right: S = 0, 1, 0 (the reverse of the shift amount).
[Figure: the three-column mux shifter with inputs b7 ... b0 and outputs out7 ... out0]

Now shifted by a 3-bit number. Shifter in action: shift by 011. From left to right: S = 1, 1, 0 (the reverse of the shift amount).
[Figure: the three-column mux shifter with inputs b7 ... b0 and outputs out7 ... out0]
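The mux-based shifter can be modeled in a few lines of Python. This is a sketch under assumptions: the leftmost column is taken to shift by 1 (selected by s0), the middle by 2 (s1), and the rightmost by 4 (s2), which matches the "reverse of shift amount" ordering on the slides. The names `mux2` and `shift_left8` are illustrative, and each `mux2` call stands for a whole column of eight 1-bit muxes.

```python
def mux2(sel, in0, in1):
    """2:1 mux: passes in0 when sel is 0, in1 when sel is 1."""
    return in1 if sel else in0

def shift_left8(b, s):
    """Shift the 8-bit value b left by the 3-bit amount s using three mux columns."""
    s0, s1, s2 = s & 1, (s >> 1) & 1, (s >> 2) & 1
    x = b & 0xFF
    x = mux2(s0, x, (x << 1) & 0xFF)   # column 1: shift by 0 or 1
    x = mux2(s1, x, (x << 2) & 0xFF)   # column 2: shift by 0 or 2
    x = mux2(s2, x, (x << 4) & 0xFF)   # column 3: shift by 0 or 4
    return x

print(f"{shift_left8(0b10010001, 0b011):08b}")
```

Three columns of eight muxes replace the 2048-row truth table, because any shift amount from 0 to 7 is a sum of the optional shifts 1, 2, and 4.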

What About Non-integer Numbers? There are infinitely many real numbers between any two integers. Many important numbers are not integers: pi = 3.14159265358979..., 1/2 = 0.5. How could we represent these sorts of numbers?
Fixed point (embedded systems).
Rational (represent numerator and denominator separately; awkward).
Floating point (IEEE single precision).

Floating Point. Think about scientific notation for a second. For example: 6.02 x 10^23. It is a real number, but comprised of ints:
6: only 1 digit here in canonical form.
02: any number here.
10: always 10 (the base we work in).
23: can be positive or negative.
Canonical: we could write 60.2 x 10^22 or 602 x 10^21, but we pick 6.02 x 10^23 as the standard, or canonical, form. Can represent really large and really small numbers. Can we do something like this in binary?

Floating Point. How about: +/- X.YYYYYY x 2^(+/-N)?
Big numbers: large positive N.
Small numbers (<< 1): large negative N.
Numbers near 1: N near 0.
This is "floating point": the most common way.

IEEE single precision floating point. A specific format called IEEE single precision: +/- 1.YYYYY x 2^(N-127). This is float in Java, C, C++, ...
Assume X is always 1 (saves a bit).
1 sign bit (0 = +, 1 = -).
8-bit biased exponent (we store exponent + 127 rather than the exponent, for good but complex reasons).
Implicit 1 before the binary point: since in canonical form the mantissa always begins 1.xx, why store the 1?
23-bit mantissa fraction (YYYYY).

Binary fractions. 1.YYYY has a binary point: like a decimal point, but in binary. After a decimal point, you have tenths, hundredths, thousandths. So after a binary point you have halves, quarters, eighths.

Floating point example. Binary fraction example: 101.101 = 4 + 1 + 1/2 + 1/8 = 5.625.
For floating point, it needs normalization: 1.01101 x 2^2.
Sign is +, which = 0.
Exponent = 127 + 2 = 129 = 1000 0001.
Mantissa = 1.011 0100 0000 0000 0000 0000.
Bit 31 (sign) | Bits 30-23 (exponent) | Bits 22-0 (fraction)
0             | 1000 0001             | 011 0100 0000 0000 0000 0000
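The same encoding can be reproduced with Python's standard `struct` module, which packs a value as an IEEE single when given the `f` format. The helper name `float_fields` is illustrative; the field widths (1, 8, 23) are the ones stated for single precision.

```python
import struct

def float_fields(x):
    """Return (bits, sign, exponent, fraction) of x encoded as an IEEE single."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # bit 31
    exponent = (bits >> 23) & 0xFF    # bits 30-23, still biased by 127
    fraction = bits & 0x7FFFFF        # bits 22-0, implicit leading 1 not stored
    return bits, sign, exponent, fraction

bits, sign, exponent, fraction = float_fields(5.625)
print(f"{bits:#010x} sign={sign} exponent={exponent} fraction={fraction:023b}")
```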

Floating Point Representation. Example: what floating-point number is 0xC1580000?

Answer. What floating-point number is 0xC1580000?
X = 1100 0001 0101 1000 0000 0000 0000 0000
Bit 31 (s) | Bits 30-23 (E) | Bits 22-0 (F)
1          | 1000 0010      | 101 1000 0000 0000 0000 0000
Sign = 1, which is negative.
Exponent = (128 + 2) - 127 = 3.
Mantissa = 1.1011.
-1.1011 x 2^3 = -1101.1 = -13.5
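The hand decoding above can be written out as a small Python sketch. The function name `decode_single` is illustrative, and it only handles normalized numbers (exponent field neither all 0s nor all 1s). The result is cross-checked against the standard `struct` module.

```python
import struct

def decode_single(bits):
    """Decode a normalized IEEE single-precision bit pattern by hand."""
    sign = -1.0 if bits >> 31 else 1.0
    exponent = ((bits >> 23) & 0xFF) - 127        # remove the bias
    mantissa = 1.0 + (bits & 0x7FFFFF) / 2**23    # restore the implicit 1
    return sign * mantissa * 2.0**exponent

value = decode_single(0xC1580000)
reference = struct.unpack(">f", struct.pack(">I", 0xC1580000))[0]
print(value, reference)
```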

Trick question. How do you represent 0.0? Why is this a trick question? 0.0 = 000000000, but don't we need a 1.XXXXX representation?
An exponent field of 0 is denormalized and treated as a special case: implicit 0. instead of 1. in the mantissa.
This allows all-zero bits (0000...0000) to be 0 (related to why we use a biased code for the exponent!).
It helps with very small numbers near 0.
It results in +0 and -0 in FP (but they compare equal).

Other weird FP numbers. Exponent = 1111 1111 is also special, by decree of the standard.
All-0 mantissa: +/- infinity. 1/0 = +infinity, -1/0 = -infinity.
Non-zero mantissa: Not a Number (NaN). sqrt(-42) = NaN, 0/0 = NaN.
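The special cases for exponent fields of all 0s and all 1s can be summarized in a short Python sketch. The helper names `bits_to_float` and `classify` are illustrative; the bit patterns used in the example are standard single-precision encodings.

```python
import math
import struct

def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE single."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def classify(bits):
    """Name the kind of value a single-precision bit pattern encodes."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 0xFF:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"

for bits in (0x00000000, 0x80000000, 0x00000001, 0x7F800000, 0x7FC00000):
    print(f"{bits:#010x}", classify(bits), bits_to_float(bits))

# +0 and -0 have different bit patterns but compare equal.
print(bits_to_float(0x00000000) == bits_to_float(0x80000000))
```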

Floating Point Representation. Double precision floating point: a 64-bit representation with a 1-bit sign, an 11-bit (biased) exponent, and a 52-bit fraction (with implicit 1). This is double in Java, C, C++, ...
S (1 bit) | Exp (11 bits) | Mantissa (52 bits)

Danger: floats cannot hold all ints! Many programmers think floats can represent all ints. NOT true. Doubles can represent all 32-bit ints (but not all 64-bit ints).
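A quick way to see this is to round-trip integers through each format. In the Python sketch below, `to_float32` is an illustrative helper that rounds a value to single precision via the standard `struct` module; Python's own `float` is a double. The specific test values (2^24 + 1 and 2^53 + 1) are chosen because they sit just past the 24 and 53 significant bits that the two formats can hold.

```python
import struct

def to_float32(x):
    """Round x to the nearest IEEE single and return it as a Python float."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# A single has 24 significant bits (23 stored + implicit 1).
print(to_float32(float(2**24)) == 2**24)          # exactly representable
print(to_float32(float(2**24 + 1)) == 2**24 + 1)  # not representable

# A double has 53 significant bits: enough for any 32-bit int ...
print(float(2**31 - 1) == 2**31 - 1)
# ... but not for every 64-bit int.
print(float(2**53 + 1) == 2**53 + 1)
```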

Wrap Up. Implementation of math: addition/subtraction, shifting. Floating point numbers: IEEE representation, denormalized numbers. Next time: storage, clocking.