Statistics and Quantitative Analysis U4320, Segment 8. Prof. Sharyn O'Halloran


I. Introduction
A. Overview
1. Ways to describe, summarize, and display data.
2. Summary statements: mean, standard deviation, variance.
3. Distributions: Central Limit Theorem.

I. Introduction (cont.)
A. Overview (cont.)
4. Test hypotheses.
5. Differences of means.
B. What's to come?
1. Analyze the relationship between two or more variables with a specific technique called regression analysis.

I. Introduction (cont.)
B. What's to come? (cont.)
2. This tool allows us to predict the impact of one variable on another. For example, what is the expected impact of a SIPA degree on income?

II. Causal Models
Causal models explain how changes in one variable affect changes in another variable.
Incinerator ----------?----------> Bad Public Health
Regression analysis gives us a way to analyze precisely the cause-and-effect relationships between variables: both the direction of the effect and its magnitude.

II. Causal Models (cont.)
A. Variables
Let us start off with a few basic definitions.
1. Dependent variable: the factor that we want to explain.
2. Independent variable: the factor that we believe causes or influences the dependent variable.
Independent Variable -------> Dependent Variable
Cause ----------------------> Effect

II. Causal Models (cont.)
B. Voting Example
Let us say that we have a vote in the House of Representatives on health care, and we want to know whether party affiliation influenced individual members' voting decisions.
1. The raw data look like this:
                       Vote (dependent)
Party (independent)    YES     NO    Total
DEM                    220     65     285
REP                     30    120     150
Total                  250    185     435

II. Causal Models (cont.)
B. Voting Example (cont.)
2. Percentages look like this:
          YES       NO      Total
DEM      50.6%    14.9%     65.5%
REP       6.9%    27.6%     34.5%
Total    57.5%    42.5%    100%
3. Does party affect voting behavior?
Given that the legislator is a Democrat, what is the chance of voting for the health care proposal?

II. Causal Models (cont.)
B. Voting Example (cont.)
3. Does party affect voting behavior? (cont.)
What is the probability of being a Democrat?
What is the probability of being a Democrat and voting yes?
[Table: Vote (dependent variable) by Party (independent variable), with rows DEM and REP and columns YES and NO; cells left blank]
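The three probabilities asked about in this example can be read straight off the raw counts. The sketch below is only an illustration in Python (the lecture itself does not use Python, and the variable names are made up here); it divides the relevant cell or row total by the 435 members, and obtains the conditional probability as the joint probability divided by the probability of being a Democrat.

```python
# Counts from the House vote table in the example (rows: party, columns: vote).
counts = {
    "DEM": {"YES": 220, "NO": 65},
    "REP": {"YES": 30, "NO": 120},
}

total = sum(sum(row.values()) for row in counts.values())  # all 435 members
dem_total = sum(counts["DEM"].values())                     # 285 Democrats
rep_total = sum(counts["REP"].values())                     # 150 Republicans

p_dem = dem_total / total                        # P(Democrat)
p_dem_and_yes = counts["DEM"]["YES"] / total     # P(Democrat and YES)
p_yes_given_dem = p_dem_and_yes / p_dem          # P(YES | Democrat)
p_yes_given_rep = counts["REP"]["YES"] / rep_total  # P(YES | Republican)

print(f"P(DEM)         = {p_dem:.3f}")
print(f"P(DEM and YES) = {p_dem_and_yes:.3f}")
print(f"P(YES | DEM)   = {p_yes_given_dem:.3f}")
print(f"P(YES | REP)   = {p_yes_given_rep:.3f}")
```

If party had no influence on the vote, the two conditional probabilities would be roughly equal; here a Democrat votes YES about 77% of the time and a Republican 20% of the time.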

II. Causal Models (cont.)
B. Voting Example (cont.)
4. Causal Model
This is the simplest way to state a causal model:
A -------------> B
Party ---------> Vote
5. Interpretation
If party influences vote, then as we move from Republicans to Democrats we should see a move from a NO vote to a YES vote.

II. Causal Models (cont.)
C. Summary
1. Regression analysis helps us to explain the impact of one variable on another. We will be able to answer questions such as: what is the relative importance of race in explaining one's income? Or what is the influence of economic conditions on the level of trade barriers?

II. Causal Models (cont.)
C. Summary (cont.)
2. Univariate Model
For now, we will focus on the univariate case, that is, the causal relation between two variables. We will relax this assumption and look at the relation among multiple variables in a couple of weeks.

III. Fitted Line
Although regression analysis can be very complicated, the heart of it is actually very simple. It centers on the notion of fitting a line through the data.
1. Example
Suppose we have a study of how wheat yield depends on fertilizer, and we observe this relation:
X Fertilizer (lb/acre)    Y Yield (bu/acre)
100                       40
200                       50
300                       50
400                       70
500                       65
600                       65
700                       80

III. Fitted Line (cont.)
1. Example (cont.)
The observed relation between Fertilizer and Yield can then be plotted as follows:
[Figure: scatter plot of Yield (40 to 80) against Fertilizer (100 to 700)]

III. Fitted Line (cont.)
2. What line best approximates the relation between these observations?
a) Highest and lowest value
[Figure: plot of Yield against Fertilizer with a line drawn through the lowest and highest values]

III. Fitted Line (cont.)
2. What line best approximates the relation between these observations? (cont.)
b) Median value
[Figure: plot of Yield against Fertilizer with a line drawn at the median]

III. Fitted Line (cont.)
3. Predicted Values
a) Example 1: The line that is fitted to the data gives the predicted value of Y for any given level of X.

III. Fitted Line (cont.)
3. Predicted Values (cont.)
a) Example 1 (cont.): If X is 400 and all we knew was the fitted line, then we would expect the yield to be around 65.
[Figure: fitted line drawn on the plot of Yield against Fertilizer]

III. Fitted Line (cont.)
3. Predicted Values (cont.)
b) Example 2: Many times we have a lot of data, and fitting the line by eye becomes rather difficult.

III. Fitted Line (cont.)
3. Predicted Values (cont.)
b) Example 2 (cont.): For example, if our plotted data looked like this:
[Figure: scatter plot of Yield against Fertilizer]

IV. OLS: Ordinary Least Squares
We want a methodology that allows us to draw the line that best fits the data.
A. The Least Squares Criterion
What we want to do is fit a line whose equation is of the form:
$\hat{Y} = a + bX$
This is just the algebraic representation of a line.

IV. OLS: Ordinary Least Squares (cont.)
A. The Least Squares Criterion (cont.)
1. Intercept: a represents the intercept of the line, that is, the point at which the line crosses the Y axis.
2. Slope of the line: b represents the slope of the line.
[Figure: fitted line on the plot of Yield against Fertilizer, marking the intercept a and the slope as the change in Y over the change in X]

IV. OLS: Ordinary Least Squares (cont.)
A. The Least Squares Criterion (cont.)
2. Slope of the line (cont.): Remember, the slope is just the change in Y divided by the change in X (rise over run).
3. Minimizing the Sum of Squares
a) Problem: How do we select a and b so that we minimize the pattern of vertical Y deviations (prediction errors)? We want to minimize the deviation:
$d = Y - \hat{Y}$

IV. OLS: Ordinary Least Squares (cont.)
A. The Least Squares Criterion (cont.)
3. Minimizing the Sum of Squares (cont.)
b) There are several ways in which we can do this.
1. Sum of deviations: First, we could minimize the sum of d, that is, find the line that gives the lowest sum of all the d's. The problem, of course, is that some d's are positive and others negative, so when we add them up they cancel each other. In effect, we would be picking a line so that the d's add up to zero, and many different lines do that.

IV. OLS: Ordinary Least Squares (cont.)
A. The Least Squares Criterion (cont.)
3. Minimizing the Sum of Squares (cont.)
b) There are several ways in which we can do this. (cont.)
2. Absolute values: minimize $\sum |d| = \sum |Y - \hat{Y}|$
3. Sum of squared deviations: minimize $\sum d^2 = \sum (Y - \hat{Y})^2$
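To see why the plain sum of deviations is a poor criterion, the three candidates can be compared on the fertilizer data. This is a rough Python sketch (the helper names are illustrative, not from the lecture); it scores two lines: a flat line at the mean yield, and a sloped line with intercept 36.4 and slope 0.059.

```python
# Fertilizer (lb/acre) and yield (bu/acre) from the example data.
X = [100, 200, 300, 400, 500, 600, 700]
Y = [40, 50, 50, 70, 65, 65, 80]

def deviations(a, b):
    """Vertical deviations d = Y - (a + bX) for a candidate line."""
    return [y - (a + b * x) for x, y in zip(X, Y)]

def criteria(a, b):
    """The three candidate criteria for the line with intercept a, slope b."""
    d = deviations(a, b)
    return {
        "sum_d": sum(d),
        "sum_abs_d": sum(abs(v) for v in d),
        "sum_sq_d": sum(v * v for v in d),
    }

mean_y = sum(Y) / len(Y)
flat = criteria(mean_y, 0.0)     # horizontal line through the mean of Y
sloped = criteria(36.4, 0.059)   # sloped line through the point of means

for name, result in (("flat", flat), ("sloped", sloped)):
    print(name, {k: round(v, 2) for k, v in result.items()})
```

Both lines give a sum of deviations of essentially zero, so that criterion cannot tell them apart, while the absolute and squared criteria both favor the sloped line.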

IV. OLS: Ordinary Least Squares (cont.)
B. OLS Formulas
1. Fitted Line
The line that we want to fit to the data is:
$\hat{Y} = a + bX$
This is simply what we call the OLS line. Remember: we are concerned with how to calculate the slope of the line, b, and the intercept of the line, a.

IV. OLS: Ordinary Least Squares (cont.)
B. OLS Formulas (cont.)
2. OLS Slope
The OLS slope can be calculated from the formula:
$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$

IV. OLS: Ordinary Least Squares (cont.)
B. OLS Formulas (cont.)
2. OLS Slope (cont.)
In the book they use the abbreviations $x = X - \bar{X}$ and $y = Y - \bar{Y}$, so that:
$b = \frac{\sum xy}{\sum x^2}$

IV. OLS: Ordinary Least Squares (cont.)
B. OLS Formulas (cont.)
3. Intercept
Now that we have the slope b, it is easy to calculate a:
$a = \bar{Y} - b\bar{X}$
Note: when b = 0, the intercept is just the mean of the dependent variable.
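For readers who want to see where the slope and intercept formulas come from, here is a brief derivation sketch. It is not part of the original slides; it uses the standard calculus argument of setting the derivatives of the sum of squared deviations to zero, with $x = X - \bar{X}$ and $y = Y - \bar{Y}$ as above. The last line also uses $\sum (y - bx) = 0$, which holds because $\sum x = \sum y = 0$.

```latex
\begin{aligned}
S(a,b) &= \sum (Y - a - bX)^2 \\
\frac{\partial S}{\partial a} &= -2\sum (Y - a - bX) = 0
  \;\Longrightarrow\; a = \bar{Y} - b\bar{X} \\
Y - a - bX &= (Y - \bar{Y}) - b\,(X - \bar{X}) = y - bx \\
\frac{\partial S}{\partial b} &= -2\sum X\,(y - bx) = 0
  \;\Longrightarrow\; \sum x\,(y - bx) = 0
  \;\Longrightarrow\; b = \frac{\sum xy}{\sum x^2}
\end{aligned}
```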

IV. OLS: Ordinary Least Squares (cont.)
C. Example 1: Fertilizer and Yield
The data, their deviation form ($x = X - \bar{X}$, $y = Y - \bar{Y}$), and the products:
  X     Y       x      y       xy        x^2
100    40    -300    -20     6000     90,000
200    50    -200    -10     2000     40,000
300    50    -100    -10     1000     10,000
400    70       0     10        0          0
500    65     100      5      500     10,000
600    65     200      5     1000     40,000
700    80     300     20     6000     90,000
$\bar{X} = 400$, $\bar{Y} = 60$, $\sum x = 0$, $\sum y = 0$, $\sum xy = 16{,}500$, $\sum x^2 = 280{,}000$

IV. OLS: Ordinary Least Squares (cont.)
C. Example 1: Fertilizer and Yield (cont.)
So to calculate the slope we solve:
$b = \frac{\sum xy}{\sum x^2} = \frac{16{,}500}{280{,}000} = 0.059$
We can then use the slope b to calculate the intercept.

IV. OLS: Ordinary Least Squares (cont.)
C. Example 1: Fertilizer and Yield (cont.)
Remember:
$\hat{Y} = a + bX$
$a = \bar{Y} - b\bar{X}$
$a = 60 - 0.059(400) = 36.4$
Plugging these estimated values into our fitted line equation, we get:
$\hat{Y} = 36.4 + 0.059X$

IV. OLS: Ordinary Least Squares (cont.)
C. Example 1: Fertilizer and Yield (cont.)
What is the predicted number of bushels produced with 400 lbs of fertilizer?
$\hat{Y} = 36.4 + 0.059(400) = 60$
What if we add 700 lbs of fertilizer? What would be the expected yield?
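The fertilizer calculation can be reproduced in a few lines. The following is a minimal Python sketch (the function names `ols_fit` and `predict` are illustrative choices, not from the lecture) that applies the deviation-form slope and intercept formulas and then evaluates the fitted line at 400 and 700 lbs. Because it keeps the slope unrounded, the results can differ slightly in the last digit from the hand calculation with b = 0.059.

```python
# Fertilizer (lb/acre) and yield (bu/acre) from Example 1.
X = [100, 200, 300, 400, 500, 600, 700]
Y = [40, 50, 50, 70, 65, 65, 80]

def ols_fit(xs, ys):
    """Return (a, b) for the least-squares line Y-hat = a + bX."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Deviation-form sums: sum of x*y and sum of x^2.
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    b = sum_xy / sum_xx
    a = y_bar - b * x_bar
    return a, b

a, b = ols_fit(X, Y)

def predict(x):
    """Predicted yield for x lbs of fertilizer."""
    return a + b * x

print(f"slope b     = {b:.3f}")
print(f"intercept a = {a:.1f}")
print(f"predicted yield at 400 lbs = {predict(400):.1f}")
print(f"predicted yield at 700 lbs = {predict(700):.1f}")
```

At 700 lbs the fitted line gives roughly 77.7 bushels, a little below the 80 actually observed at that level.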

IV. OLS: Ordinary Least Squares (cont.)
D. Interpretation of b and a
1. Slope b
The slope is the change in Y that accompanies a unit change in X. It tells us the predicted effect on the dependent variable of a one-unit change in the independent variable.

IV. OLS: Ordinary Least Squares (cont.)
D. Interpretation of b and a (cont.)
1. Slope b (cont.)
The slope then tells us two things:
i) The directional effect of the independent variable on the dependent variable. There was a positive relation between fertilizer and yield.

IV. OLS: Ordinary Least Squares (cont.)
D. Interpretation of b and a (cont.)
1. Slope b (cont.)
ii) The magnitude of the effect on the dependent variable. For each additional pound of fertilizer, we expect an increase in yield of 0.059 bushels.

IV. OLS: Ordinary Least Squares (cont.)
D. Interpretation of b and a (cont.)
2. The Intercept
The intercept tells us what we would expect if no fertilizer is added: a yield of 36.4 bushels.
$\hat{Y} = 36.4 + 0.059(0) = 36.4$
So independent of the fertilizer, you can expect 36.4 bushels. Put differently, 36.4 bushels is the baseline yield we expect with no fertilizer, before any effect of fertilizer is added.

IV. OLS: Ordinary Least Squares (cont.)
E. Example II: Radioactive Exposure
1. Causal Model
We want to know whether exposure to radioactive waste is linked to cancer.
Radioactive Waste --------------> Cancer

IV. OLS: Ordinary Least Squares (cont.)
E. Example II: Radioactive Exposure (cont.)
2. Data
X = index of radioactive exposure, Y = deaths per 10,000, $x = X - \bar{X}$, $y = Y - \bar{Y}$
   X      Y       x      y     xy      x^2
 8.3    210     3.7     50    185    13.69
 6.4    180     1.8     20     36     3.24
 3.4    130    -1.2    -30     36     1.44
 3.8    170    -0.8     10     -8     0.64
 2.6    130    -2.0    -30     60     4
11.6    210     7.0     50    350    49
 1.2    120    -3.4    -40    136    11.56
 2.5    150    -2.1    -10     21     4.41
 1.6    140    -3.0    -20     60     9
$\bar{X} = 4.6$, $\bar{Y} = 160$, $\sum x = 0$, $\sum y = 0$, $\sum xy = 876$, $\sum x^2 = 97.0$

IV. OLS: Ordinary Least Squares (cont.)
E. Example II: Radioactive Exposure (cont.)
3. Graph
[Figure: scatter plot of deaths per 10,000 (vertical axis, 100 to 200) against the index of radioactive exposure (horizontal axis, 1 to 12)]

IV. OLS: Ordinary Least Squares (cont.)
E. Example II: Radioactive Exposure (cont.)
4. Calculate the regression line for predicting Y from X
i) Slope
$b = \frac{\sum xy}{\sum x^2} = \frac{876}{97.0} = 9.03$
How do we interpret the slope coefficient? For each unit of radioactive exposure, the cancer mortality rate rises by 9.03 deaths per 10,000 individuals.

IV. OLS: Ordinary Least Squares (cont.)
E. Example II: Radioactive Exposure (cont.)
ii) Calculate the intercept
$a = \bar{Y} - b\bar{X}$
$a = 160 - 9.03(4.6) = 118.5$
Plugging these estimated values into our fitted line equation, we get:
$\hat{Y} = 118.5 + 9.03X$

IV. OLS: Ordinary Least Squares (cont.)
E. Example II: Radioactive Exposure (cont.)
5. Predictions
Let's calculate the mortality rate if X were 5.0:
$\hat{Y} = 118.5 + 9.03(5.0) = 163.6$
How about if X were 0?
$\hat{Y} = 118.5 + 9.03(0) = 118.5$
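The deviation-form table for the radioactive exposure data can be rebuilt column by column. This Python sketch (variable names are illustrative) recomputes the deviations, the two sums, the slope, and the intercept, and then the two predictions. It carries the unrounded slope through, so the intercept comes out near 118.45 rather than the 118.5 obtained by hand from the rounded slope 9.03.

```python
# Index of radioactive exposure (X) and deaths per 10,000 (Y) from Example II.
exposure = [8.3, 6.4, 3.4, 3.8, 2.6, 11.6, 1.2, 2.5, 1.6]
deaths = [210, 180, 130, 170, 130, 210, 120, 150, 140]

n = len(exposure)
x_bar = sum(exposure) / n
y_bar = sum(deaths) / n

# Deviation form: x = X - mean(X), y = Y - mean(Y).
x_dev = [x - x_bar for x in exposure]
y_dev = [y - y_bar for y in deaths]

sum_xy = sum(x * y for x, y in zip(x_dev, y_dev))
sum_xx = sum(x * x for x in x_dev)

b = sum_xy / sum_xx      # slope
a = y_bar - b * x_bar    # intercept

def predict(x):
    """Predicted deaths per 10,000 at exposure index x."""
    return a + b * x

print(f"mean X = {x_bar:.1f}, mean Y = {y_bar:.0f}")
print(f"sum xy = {sum_xy:.0f}, sum x^2 = {sum_xx:.2f}")
print(f"b = {b:.2f}, a = {a:.2f}")
print(f"prediction at X = 5.0: {predict(5.0):.1f}")
print(f"prediction at X = 0:   {predict(0):.1f}")
```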

IV. OLS: Ordinary Least Squares (cont.)
E. Example II: Radioactive Exposure (cont.)
How can we interpret this result? Even with no radioactive exposure, the mortality rate would be 118.5 deaths per 10,000.
[Figure: fitted line $\hat{Y} = 118.5 + 9.03X$ drawn on the plot of deaths per 10,000 against the index of radioactive exposure]

III. Advantages of OLS
A. Easy
1. The least squares method gives relatively easy, or at least computable, formulas for calculating a and b:
$\hat{Y} = a + bX$
$b = \frac{\sum xy}{\sum x^2}$
$a = \bar{Y} - b\bar{X}$

III. Advantages of OLS (cont.)
B. OLS is similar to many concepts we have already used.
1. We are minimizing the sum of the squared deviations. In effect, this is very similar to how we find the variance.
2. Also, we saw above that when b = 0, $\hat{Y} = a$, or $\hat{Y} = \bar{Y}$. The interpretation is that the best prediction we can make of Y is just the sample mean. This is the case when the two variables are independent.
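The link between least squares and the sample mean can be checked numerically. As a small illustration in Python (using the yield values from the fertilizer example; the candidate list is arbitrary), the sketch below predicts Y with a single constant c and shows that the sum of squared deviations is smallest when c is the sample mean.

```python
# Yield values (bu/acre) from the fertilizer example.
Y = [40, 50, 50, 70, 65, 65, 80]

def sum_sq(c):
    """Sum of squared deviations of Y from a constant prediction c."""
    return sum((y - c) ** 2 for y in Y)

mean_y = sum(Y) / len(Y)

# A handful of arbitrary candidate constants around the mean.
candidates = [50, 55, 60, 65, 70]
best = min(candidates, key=sum_sq)

for c in candidates:
    print(f"c = {c}: sum of squared deviations = {sum_sq(c)}")
print(f"sample mean = {mean_y}, best candidate = {best}")
```

Dividing the minimized sum by n - 1 gives the sample variance, which is the sense in which the least squares criterion resembles the variance calculation.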

III. Advantages of OLS (cont.)
C. Extension of the Sample Mean
Since OLS is just an extension of the sample mean, it shares many of the same properties, such as efficiency and unbiasedness.
D. Weighted Least Squares
We might want to weight some observations more heavily than others.

V. Homework Example
In the homework assignment, you are asked to select two interval/ratio level variables and calculate the fitted line that minimizes the sum of the squared deviations (the regression line).
A. Choose 2 Variables
What effect does the number of years of education have on how frequently one reads the newspaper? The independent variable is education, and the dependent variable is newspaper reading.

V. Homework Example (cont.)
B. Coding the Variables
First, I made a new variable called PAPER. Recode all the missing data values to a single value. Remove missing values from the data set. Then do the same for education.

V. Homework Example (cont.)
C. Getting the number of valid observations
Next, see how many valid observations are left by using the Summarize command under the Data menu.

V. Homework Example (cont.)
D. Sampling five observations
1. We randomly sample 5 observations from the 1019 valid ones.
2. As before, use the Select command under the Data menu to get 5 random observations.
3. Then go to the Statistics menu and use the Summarize > List command to get the entries for the variables of interest.
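The menu steps above are specific to the statistics package used in the course. Purely as an illustration of the same idea, drawing 5 cases at random without replacement from 1019 valid ones, here is a sketch in Python; the case numbering from 1 to 1019 and the seed value are assumptions made for this example, not part of the assignment.

```python
import random

N_VALID = 1019     # number of valid observations reported in the example
SAMPLE_SIZE = 5

# A fixed seed (arbitrary value) makes the draw reproducible.
rng = random.Random(4320)

# Hypothetical case numbers 1..1019; sample without replacement.
case_ids = range(1, N_VALID + 1)
sample_ids = sorted(rng.sample(case_ids, SAMPLE_SIZE))

print("sampled case numbers:", sample_ids)
```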

V. Homework Example (cont.)
E. Calculate the OLS Line
Finally, you will have to compute the fitted line for these data, with X = SMARTS, Y = PAPER, $x = X - \bar{X}$, and $y = Y - \bar{Y}$:
 X    Y      x       y      xy      x^2
15    1    1.6    -0.4   -0.64     2.56
 8    2   -5.4     0.6   -3.24    29.16
15    1    1.6    -0.4   -0.64     2.56
13    2   -0.4     0.6   -0.24     0.16
16    1    2.6    -0.4   -1.04     6.76
$\bar{X} = 13.4$, $\bar{Y} = 1.4$, $\sum x = 0$, $\sum y = 0$, $\sum xy = -5.8$, $\sum x^2 = 41.2$

V. Homework Example (cont.)
E. Calculate the OLS Line (cont.)
1. Calculate the slope:
$b = \frac{\sum xy}{\sum x^2} = \frac{-5.8}{41.2} = -0.14$
2. Calculate the intercept:
$a = \bar{Y} - b\bar{X}$
$a = 1.4 - (-0.14)(13.4) = 1.4 + 1.876 = 3.276$
3. Calculate the OLS line:
$\hat{Y} = a + bX$
$\hat{Y} = 3.3 - 0.14X$

V. Homework Example (cont.)
E. Calculate the OLS Line (cont.)
4. Plot
[Figure: the fitted line $\hat{Y} = 3.3 - 0.14X$, with intercept 3.3, plotted for X from 5 to 20]

V. Homework Example (cont.)
E. Calculate the OLS Line (cont.)
5. Interpretation
A person with no education would read 3.3 newspapers a day.

V. Homework Example (cont.)
E. Calculate the OLS Line (cont.)
5. Interpretation (cont.)
Our results further tell us that each additional year of education reduces the number of newspapers a person reads by 0.14. So for every year of education you read 0.14 fewer newspapers; this is a change of 0.14 units in Y, not a 14 percent change.

V. Homework Example (cont.)
E. Calculate the OLS Line (cont.)
5. Interpretation (cont.)
This example suggests some of the problems with drawing inferences about the underlying population from small samples.
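One way to get a feel for how fragile a five-observation fit is: refit the line after dropping each observation in turn and watch the slope move. The Python sketch below does this for the five sampled cases. It is an illustration added here, not part of the assignment, and the leave-one-out check is a swapped-in diagnostic rather than anything the lecture prescribes.

```python
# Five sampled observations from the homework example.
X = [15, 8, 15, 13, 16]   # SMARTS
Y = [1, 2, 1, 2, 1]       # PAPER

def ols_fit(xs, ys):
    """Return (a, b) for the least-squares line Y-hat = a + bX."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    b = sum_xy / sum_xx
    return y_bar - b * x_bar, b

a_all, b_all = ols_fit(X, Y)
print(f"all 5 observations: a = {a_all:.2f}, b = {b_all:.3f}")

# Refit with each observation left out in turn.
loo_slopes = []
for i in range(len(X)):
    xs = X[:i] + X[i + 1:]
    ys = Y[:i] + Y[i + 1:]
    a_i, b_i = ols_fit(xs, ys)
    loo_slopes.append(b_i)
    print(f"without (X={X[i]}, Y={Y[i]}): a = {a_i:.2f}, b = {b_i:.3f}")
```

Dropping the single case with X = 8 changes the slope from about -0.14 to about -0.37, which shows how much one observation can drive the result when only five are sampled.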