Introduction to R (7): Central Limit Theorem

Central Limit Theorem

The central limit theorem says that if $x \sim D(\mu, \sigma)$, where $D$ is a probability density or mass function (regardless of the form of the distribution), then when the sample size $n$ is large enough, approximately

$$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right).$$

When the population distribution is itself Normal, we can assume $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$ for any sample size.

In this handout I use simulation to study the sampling distribution of the sample mean for two populations: a right-skewed population and a Normally distributed one.

Figure 1 shows a clearly right-skewed population with $\mu = 2.029$ and $\sigma = 1.382$. I take 1000 random samples of size 2 from this population and compute the 1000 sample averages associated with those samples. I then repeat this process for samples of sizes 3, 6, 10, 20, and 100, each time keeping track of the mean and the distribution of the 1000 sample averages. I also plot histograms and QQ-plots for each scenario. It turns out that as the sample size increases, the distribution of the 1000 sample averages converges to normality (Figures 2 and 3). Also, the standard deviations of those sampling distributions get closer to $\sigma/\sqrt{n}$ (Table 1).

Next, I sample from a Normal population with $\mu = 99.602$ and $\sigma = 10.2211$ (Figure 4) and repeat the same procedure for the sample averages. It turns out that the result associated with the central limit theorem holds regardless of sample size (Figures 5 and 6, and Table 2).

Sample Size                   2          3          6         10         20        100
Mean                     2.0945   2.061667      2.016     2.0301     2.0186    2.02654
Standard Deviation    0.9860487    0.78204  0.5660369   0.453773  0.3010642  0.1300328

Table 1. Results for population 1. The mean and standard deviation of the sample mean for different sample sizes.

Sample Size                   2          3          6         10         20        100
Mean                   99.95793   99.30017    99.5832    99.5639     99.653   99.57605
Standard Deviation     7.199265    5.83941   4.132356   3.290563   2.240824  0.9427503

Table 2. Results for population 2. The mean and standard deviation of the sample mean for different sample sizes.
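The claim that the standard deviations approach σ/√n can be checked directly against the numbers in Table 1. The following is a minimal sketch of that comparison, assuming the Table 1 values and the population standard deviation 1.382777 printed in the handout's R output; the variable names are illustrative and not part of the original code.

```r
# Sample sizes and observed SDs of the 1000 sample means (copied from Table 1)
n        <- c(2, 3, 6, 10, 20, 100)
observed <- c(0.9860487, 0.78204, 0.5660369, 0.453773, 0.3010642, 0.1300328)

# Population SD as printed by sd(test1) in the handout
sigma <- 1.382777

# Theoretical SD of the sample mean: sigma / sqrt(n)
theory <- sigma / sqrt(n)

# Side-by-side comparison of observed and theoretical values
comparison <- data.frame(
  n          = n,
  observed   = observed,
  theory     = round(theory, 4),
  difference = round(observed - theory, 4)
)
print(comparison)
```

Because `sample()` draws without replacement from a finite population of 1000 values, the observed value at n = 100 can be expected to sit slightly below σ/√n; multiplying the theoretical value by `sqrt(1 - n / 1000)` applies the usual finite population correction.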

Figure 1. Case one: Population distribution (histogram of test1). The distribution is skewed to the right.

Figure 2. Sampling distributions of sample means for sample sizes 2, 3, 6, 10, 20, 100 (histograms of mean.size2, mean.size3, mean.size6, mean.size10, mean.size20, mean.size100).

Figure 3. QQ-plots for the sample mean distributions for different sample sizes.

Figure 4. Case two: Population distribution (histogram of test2). Normal distribution.

Figure 5. Sampling distributions of sample means for sample sizes 2, 3, 6, 10, 20, 100 (histograms of mean.size2.norm, mean.size3.norm, mean.size6.norm, mean.size10.norm, mean.size20.norm, mean.size100.norm).

Figure 6. QQ-plots for the sample mean distributions for different sample sizes.
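QQ-plots such as those in Figures 3 and 6 are often easier to judge with a reference line: points close to the line suggest approximate normality. The sketch below is only an illustration under stated assumptions. The seed, the stand-in population `pop`, and the choice of sample size 20 are not taken from the handout.

```r
set.seed(1)                        # assumed seed, only for reproducibility
pop <- rpois(1000, 2)              # stand-in for the right-skewed population

# 1000 sample means from samples of size 20 (drawn without replacement)
means <- replicate(1000, mean(sample(pop, 20)))

# QQ-plot of the sample means with a reference line through the quartiles
qq <- qqnorm(means, main = "Sample means, n = 20")
qqline(means)
```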

R Code for Simulation

(a) Population 1: Right-Skewed Distribution

We can simulate from a Poisson distribution:

    > test1 <- rpois(1000, 2)   # 1000 draws from a Poisson distribution with mean 2
    > hist(test1)
    > mean(test1)
    [1] 2.029
    > sd(test1)
    [1] 1.382777

(b) Population 1: Obtaining 1000 Samples of Size 2, 3, 6, 10, 20, 100

Here is the case for size 2. The others are similar.

    > test <- matrix(nrow = 1000, ncol = 2)   # one row per sample
    > for (i in 1:1000) { test[i, ] <- sample(test1, 2) }
    > mean.size2 <- apply(test, 1, mean)      # row means are the sample averages
    > mean(mean.size2)
    [1] 2.0945
    > sd(mean.size2)
    [1] 0.9860487

(c) Population 2: Obtaining 1000 Samples of Size 2, 3, 6, 10, 20, 100

Again, only the case for size 2 is included. Here test2 holds the Normal population shown in Figure 4.

    > test <- matrix(nrow = 1000, ncol = 2)
    > for (i in 1:1000) { test[i, ] <- sample(test2, 2) }
    > mean.size2.norm <- apply(test, 1, mean)
    > mean(mean.size2.norm)
    [1] 99.95793
    > sd(mean.size2.norm)
    [1] 7.199265

(d) Population 1: Plotting Histograms and QQ-plots

    par(mfrow = c(3, 2))   # 3 x 2 grid of panels
    hist(mean.size2)
    hist(mean.size3)
    hist(mean.size6)
    hist(mean.size10)
    hist(mean.size20)
    hist(mean.size100)

    par(mfrow = c(3, 2))
    qqnorm(mean.size2)
    qqnorm(mean.size3)
    qqnorm(mean.size6)
    qqnorm(mean.size10)
    qqnorm(mean.size20)
    qqnorm(mean.size100)

(e) Population 2: Plotting Histograms and QQ-plots

    par(mfrow = c(3, 2))
    hist(mean.size2.norm)
    hist(mean.size3.norm)
    hist(mean.size6.norm)
    hist(mean.size10.norm)
    hist(mean.size20.norm)
    hist(mean.size100.norm)

    par(mfrow = c(3, 2))
    qqnorm(mean.size2.norm)
    qqnorm(mean.size3.norm)
    qqnorm(mean.size6.norm)
    qqnorm(mean.size10.norm)
    qqnorm(mean.size20.norm)
    qqnorm(mean.size100.norm)
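The handout shows only the size-2 case and does not show how `test2` was created. The sketch below is one possible way to run all six sample sizes for both populations in a single pass. It is not the handout's own code: the seed, the helper names `sample_means` and `summarise_sizes`, and in particular the call `rnorm(1000, mean = 100, sd = 10)` used for `test2` are assumptions chosen to roughly match the reported population mean and standard deviation.

```r
set.seed(123)                        # assumed seed; the handout does not set one

# Populations: test1 as in part (a); test2 is an assumed stand-in
test1 <- rpois(1000, 2)
test2 <- rnorm(1000, mean = 100, sd = 10)   # assumption, not from the handout

sizes <- c(2, 3, 6, 10, 20, 100)

# Draw `reps` samples of size n (without replacement, the default of sample())
# and return the vector of sample means
sample_means <- function(population, n, reps = 1000) {
  replicate(reps, mean(sample(population, n)))
}

# One summary row per sample size: observed mean and SD of the sample means,
# plus the theoretical SD sigma / sqrt(n)
summarise_sizes <- function(population, sizes) {
  t(sapply(sizes, function(n) {
    m <- sample_means(population, n)
    c(size = n, mean = mean(m), sd = sd(m),
      theory = sd(population) / sqrt(n))
  }))
}

res1 <- summarise_sizes(test1, sizes)
res2 <- summarise_sizes(test2, sizes)
print(round(res1, 4))
print(round(res2, 4))
```

Since no seed is set in the original, the numbers produced here will not reproduce Tables 1 and 2 exactly, but the `sd` column should shrink with the sample size and stay close to the `theory` column.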