Using Statistics To Make Inferences 6. Wilcoxon Matched Pairs Signed Ranks Test. Wilcoxon Rank Sum Test/ Mann-Whitney Test

Using Statistics To Make Inferences 6

Summary
Non-parametric tests
  Wilcoxon Signed Ranks Test
  Wilcoxon Matched Pairs Signed Ranks Test
  Wilcoxon Rank Sum Test / Mann-Whitney Test

Goals
  Perform and interpret the Wilcoxon Signed Ranks Test
  Perform and interpret the Wilcoxon Matched Pairs Signed Ranks Test
  Perform and interpret the Wilcoxon Rank Sum Test / Mann-Whitney Test
  If appropriate, employ a normal approximation
  Know when each test is appropriate

Practical
  Perform a series of Mann-Whitney tests and compare the results to those obtained from t-tests.

Mike Cox 6.1 Version

Non-parametric Tests

A single sample test: Wilcoxon Signed Ranks Test

Procedure
1. Take the difference between each observation and the median η.
2. Rank the absolute differences from 1 to n, allowing for ties (1 smallest, n largest).
3. Sum the rank values for those observations above η; let this be $W_+$.
4. Sum the rank values for those observations below η; let this be $W_-$.
   Note: If two or more differences are equal (tied), they are assigned the average of the ranks. If a difference is zero, it is omitted.
5. Use the smaller of $W_+$ and $W_-$ as the test statistic, called $W$ or $W_{calc}$.

Critical values of the test statistic are given in tables for various significance levels.

For n greater than 8 a normal approximation may be employed where

$$z = \frac{W + \frac{1}{2} - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$$

In this case the continuity correction is added since a lower tail is being considered.
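Steps 1 to 5 of the procedure can be sketched in Python as below. This is a minimal illustration rather than a reference implementation; the function names `average_ranks` and `signed_rank_sums` are assumptions introduced here, not part of the handout.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def signed_rank_sums(observations, eta):
    """Return (W_plus, W_minus) for observations against a hypothesised median eta."""
    diffs = [x - eta for x in observations if x != eta]  # zero differences are omitted
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


# Tiny made-up data set: differences from eta = 2 are -1, +1, +4 (the value 2 is dropped).
w_plus, w_minus = signed_rank_sums([1, 3, 6, 2], 2)
w_calc = min(w_plus, w_minus)
print(w_plus, w_minus, w_calc)
```

Exact equality is used to detect ties, so data measured on a decimal scale are best converted to integers (for example milliseconds) before calling the function.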

Example

It has been established that an individual's median reaction time is 0.250 seconds. Twelve trials are conducted after the individual has consumed alcohol. The measured times are

0.235  0.252  0.312  0.264  0.323  0.241  0.284  0.306  0.248  0.284  0.298  0.320

Test whether the data are consistent with the median value.

The first step is to subtract the median from every value.

raw data   difference   absolute difference
0.235      -0.015       0.015
0.252       0.002       0.002
0.312       0.062       0.062
0.264       0.014       0.014
0.323       0.073       0.073
0.241      -0.009       0.009
0.284       0.034       0.034
0.306       0.056       0.056
0.248      -0.002       0.002
0.284       0.034       0.034
0.298       0.048       0.048
0.320       0.070       0.070

Now rank on the absolute differences

difference   absolute difference   rank   true rank
-0.002       0.002                  1      1.5
 0.002       0.002                  2      1.5
-0.009       0.009                  3      3
 0.014       0.014                  4      4
-0.015       0.015                  5      5
 0.034       0.034                  6      6.5
 0.034       0.034                  7      6.5
 0.048       0.048                  8      8
 0.056       0.056                  9      9
 0.062       0.062                 10     10
 0.070       0.070                 11     11
 0.073       0.073                 12     12

Now separate the contributions for positive (negative) differences

difference   absolute difference   true rank   total
-0.015       0.015                  5
-0.009       0.009                  3
-0.002       0.002                  1.5         9.5
 0.002       0.002                  1.5
 0.014       0.014                  4
 0.034       0.034                  6.5
 0.034       0.034                  6.5
 0.048       0.048                  8
 0.056       0.056                  9
 0.062       0.062                 10
 0.070       0.070                 11
 0.073       0.073                 12          68.5

$W_+ = 68.5$, $W_- = 9.5$

$W_{calc} = \min(W_+, W_-) = 9.5$, $W_{crit}(0.05) = 14$, $W_{crit}(0.01) = 7$

      Two Tail Probability
n     0.10   0.05   0.02   0.01
12    17     14     10     7

Since $W_{calc} = 9.5$ is below $W_{crit}(0.05) = 14$, the result is significant at the 5% level, so the null hypothesis can be rejected. The median is apparently not consistent with 0.250 seconds.
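The totals for this example can be checked with a short script. It is a sketch under stated assumptions: the twelve times are taken as 0.235, 0.252, 0.312, 0.264, 0.323, 0.241, 0.284, 0.306, 0.248, 0.284, 0.298 and 0.320 seconds with a median of 0.250 seconds, all expressed in integer milliseconds so that tied absolute differences compare exactly. The helper name `signed_rank_sums` is illustrative only.

```python
def signed_rank_sums(diffs):
    """Return (W_plus, W_minus) using average ranks for tied absolute differences."""
    diffs = [d for d in diffs if d != 0]          # zero differences are omitted
    abs_sorted = sorted(abs(d) for d in diffs)

    def rank(a):
        first = abs_sorted.index(a) + 1           # first position occupied by this value
        count = abs_sorted.count(a)               # number of tied values
        return first + (count - 1) / 2            # average of the tied positions

    w_plus = sum(rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(rank(abs(d)) for d in diffs if d < 0)
    return w_plus, w_minus


# Reaction times in milliseconds (assumed reading of the data) and the median.
times_ms = [235, 252, 312, 264, 323, 241, 284, 306, 248, 284, 298, 320]
median_ms = 250

w_plus, w_minus = signed_rank_sums([t - median_ms for t in times_ms])
w_calc = min(w_plus, w_minus)
print(w_plus, w_minus, w_calc)   # 68.5 9.5 9.5

# Compare with the tabulated two-tail critical values for n = 12.
print("significant at 5%:", w_calc <= 14)
print("significant at 1%:", w_calc <= 7)
```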

Normal approximation

Employing the normal approximation,

$$z = \frac{W + \frac{1}{2} - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} = \frac{9.5 + \frac{1}{2} - \frac{12(12+1)}{4}}{\sqrt{\frac{12(12+1)(2 \times 12+1)}{24}}} = -2.27$$

Using normal tables the p value is $p = 2 \times 0.0115 \approx 0.02$, remarkably close to the exact value reported by software.

Z      0.00   -0.01  -0.02  -0.03  -0.04  -0.05  -0.06  -0.07  -0.08  -0.09
-2.2   0.014  0.014  0.013  0.013  0.013  0.012  0.012  0.012  0.011  0.011
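The arithmetic of the normal approximation can be reproduced with the standard library alone. The sketch below assumes W = 9.5 and n = 12 from the example, and obtains the normal tail area from `math.erf` instead of printed tables; the function names are illustrative.

```python
import math


def signed_rank_z(w, n):
    """z for the signed ranks statistic, with the +1/2 continuity correction (lower tail)."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w + 0.5 - mean) / sd


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


z = signed_rank_z(9.5, 12)
p_two_tail = 2 * normal_cdf(z)
print(round(z, 2))            # -2.27
print(round(p_two_tail, 2))   # 0.02
```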

Wilcoxon Matched Pairs Signed Ranks Test

Example

Certain mental tasks are performed before and after exercise. The scores were recorded.

Subject    1   2   3   4   5   6   7   8   9  10
Exercise  46  38  62  54  42  37  55  52  41  39
Relaxed   53  46  60  58  49  34  65  53  47  43

Is there any evidence of a significant difference in the levels of performance under the two conditions?

This is still effectively a single sample, since we seek a change.

Here Difference = Relaxed - Exercise.

Subject   Exercise   Relaxed   Difference   Absolute Difference
 1        46         53          7           7
 2        38         46          8           8
 3        62         60         -2           2
 4        54         58          4           4
 5        42         49          7           7
 6        37         34         -3           3
 7        55         65         10          10
 8        52         53          1           1
 9        41         47          6           6
10        39         43          4           4

Now rank the absolute differences

Subject   Difference   Absolute Difference   Rank   True Rank
 8          1           1                     1      1
 3         -2           2                     2      2
 6         -3           3                     3      3
 4          4           4                     4      4.5
10          4           4                     5      4.5
 9          6           6                     6      6
 1          7           7                     7      7.5
 5          7           7                     8      7.5
 2          8           8                     9      9
 7         10          10                    10     10

Now separate the contributions for positive (negative) differences

Subject   Difference   Absolute Difference   True Rank   Total
 6         -3           3                     3
 3         -2           2                     2           5
 8          1           1                     1
 4          4           4                     4.5
10          4           4                     4.5
 9          6           6                     6
 1          7           7                     7.5
 5          7           7                     7.5
 2          8           8                     9
 7         10          10                    10          50

$W_+ = 50$, $W_- = 5$

$W_{calc} = \min(W_+, W_-) = 5$, $W_{crit}(0.05) = 8$, $W_{crit}(0.01) = 3$

      Two Tail Probability
n     0.10   0.05   0.02   0.01
10    11     8      5      3

Since $W_{calc} = 5$ is below $W_{crit}(0.05) = 8$, the result is significant at the 5% level, so the null hypothesis can be rejected. The scores differ.
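The matched pairs calculation is the single sample procedure applied to the paired differences, which the following sketch illustrates. It assumes the scores are Exercise 46, 38, 62, 54, 42, 37, 55, 52, 41, 39 and Relaxed 53, 46, 60, 58, 49, 34, 65, 53, 47, 43, with differences taken as Relaxed minus Exercise. The helper name is hypothetical.

```python
def signed_rank_sums(diffs):
    """Return (W_plus, W_minus) using average ranks for tied absolute differences."""
    diffs = [d for d in diffs if d != 0]          # zero differences are omitted
    abs_sorted = sorted(abs(d) for d in diffs)

    def rank(a):
        first = abs_sorted.index(a) + 1
        count = abs_sorted.count(a)
        return first + (count - 1) / 2            # average rank over the tied block

    w_plus = sum(rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(rank(abs(d)) for d in diffs if d < 0)
    return w_plus, w_minus


exercise = [46, 38, 62, 54, 42, 37, 55, 52, 41, 39]
relaxed = [53, 46, 60, 58, 49, 34, 65, 53, 47, 43]

# Paired differences, one per subject.
differences = [r - e for e, r in zip(exercise, relaxed)]
w_plus, w_minus = signed_rank_sums(differences)
w_calc = min(w_plus, w_minus)
print(differences)               # [7, 8, -2, 4, 7, -3, 10, 1, 6, 4]
print(w_plus, w_minus, w_calc)   # 50.0 5.0 5.0
print("significant at 5%:", w_calc <= 8)
```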

A two sample test: Wilcoxon Rank Sum Test / Mann-Whitney Test

1. Combine the observations from the two samples (sizes $n_1$ and $n_2$).
2. Rank the sorted data from 1 to $(n_1 + n_2)$.
3. Calculate $R_1$ as the sum of the ranks of the first sample and $R_2$ for the second.
4. Form

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1, \qquad U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$$

(the mid-point is $\frac{1}{2} n_1 n_2$ and $U_1 + U_2 = n_1 n_2$, so only one need be calculated)

$U_{calc} = \min(U_1, U_2)$ is the Mann-Whitney test statistic.

For $n_1$ and $n_2$ greater than 8 a normal approximation may be employed where

$$z = \frac{U + \frac{1}{2} - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$

In this case the continuity correction is added since a lower tail is being considered.
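Steps 1 to 4 can be sketched as a small Python function. The name `mann_whitney_u` and the toy samples are assumptions made for illustration; tied observations receive the average of the ranks they occupy, as in the signed ranks procedure.

```python
def mann_whitney_u(sample1, sample2):
    """Return (U1, U2) computed from the rank sums of two independent samples."""
    combined = sorted(sample1 + sample2)          # steps 1 and 2: pool and sort

    def rank(v):
        first = combined.index(v) + 1             # first position of this value
        count = combined.count(v)                 # number of tied observations
        return first + (count - 1) / 2            # average rank for ties

    n1, n2 = len(sample1), len(sample2)
    r1 = sum(rank(v) for v in sample1)            # step 3: rank sums
    r2 = sum(rank(v) for v in sample2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1         # step 4
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2


# Made-up samples: every value in the first sample is below every value in the second.
u1, u2 = mann_whitney_u([1, 2], [3, 4, 5])
print(u1, u2, min(u1, u2))   # 6.0 0.0 0.0
```

Because $U_1 + U_2 = n_1 n_2$, a quick sanity check on any result is that the two values add up to the product of the sample sizes.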

Wilcoxon Rank Sum Test

A study of patients suffering from Parkinson's disease was conducted. An operation was performed on 8 of them; while it improved their general condition, it might adversely affect their speech. In the data a higher value indicates a greater difficulty in speaking.

Operated  2.6  2.0  1.7  2.7  2.5  2.6  2.5  3.0
Others    1.2  1.8  1.9  2.3  1.3  3.0  2.2  1.3  1.5  1.6  1.3  1.5  2.7  2.0

Speech   Source     Rank   True Rank   Others   Operated
1.2      Others      1      1            1
1.3      Others      2      3            3
1.3      Others      3      3            3
1.3      Others      4      3            3
1.5      Others      5      5.5          5.5
1.5      Others      6      5.5          5.5
1.6      Others      7      7            7
1.7      Operated    8      8                     8
1.8      Others      9      9            9
1.9      Others     10     10           10
2.0      Operated   11     11.5                  11.5
2.0      Others     12     11.5         11.5
2.2      Others     13     13           13
2.3      Others     14     14           14
2.5      Operated   15     15.5                  15.5
2.5      Operated   16     15.5                  15.5
2.6      Operated   17     17.5                  17.5
2.6      Operated   18     17.5                  17.5
2.7      Operated   19     19.5                  19.5
2.7      Others     20     19.5         19.5
3.0      Operated   21     21.5                  21.5
3.0      Others     22     21.5         21.5
Total                                   126.5    126.5

For the operated group, $R_1 = 126.5$, $n_1 = 8$, $n_2 = 14$:

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = 8 \times 14 + \frac{8 \times 9}{2} - 126.5 = 21.5$$

For the others, $R_2 = 126.5$, $n_2 = 14$, $n_1 = 8$:

$$U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2 = 8 \times 14 + \frac{14 \times 15}{2} - 126.5 = 90.5$$

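With $R_1 = 126.5$ and $R_2 = 126.5$,

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = 8 \times 14 + \frac{8 \times 9}{2} - 126.5 = 21.5$$

$$U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2 = 8 \times 14 + \frac{14 \times 15}{2} - 126.5 = 90.5$$

$U_1 = 21.5$, $U_2 = 90.5$ (the mid-point is $\frac{1}{2} n_1 n_2 = 56$, so only one need be calculated)

$U_{calc} = \min(U_1, U_2) = 21.5$

For $n_1 = 8$, $n_2 = 14$, the critical value from the tables for p = 0.05 is 26. Since 21.5 is below 26, the result is significant at the 5% level; the two samples appear to differ.

n2         9  10  11  12  13  14  15  16  17  18  19  20
n1 = 8    15  17  19  22  24  26  29  31  34  36  38  41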
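As a cross-check that does not rely on ranks, $U_1$ as defined by the formula equals the number of (operated, other) pairs in which the operated value is the smaller one, counting each tie as one half; this follows from writing $R_1$ as $n_1(n_1+1)/2$ plus the number of pairs in which the operated value is larger. The sketch below assumes the data are Operated 2.6, 2.0, 1.7, 2.7, 2.5, 2.6, 2.5, 3.0 and Others 1.2, 1.8, 1.9, 2.3, 1.3, 3.0, 2.2, 1.3, 1.5, 1.6, 1.3, 1.5, 2.7, 2.0; the function name is illustrative.

```python
def u_by_pair_counting(sample1, sample2):
    """Count pairs where the sample1 value is smaller; ties contribute one half."""
    total = 0.0
    for x in sample1:
        for y in sample2:
            if x < y:
                total += 1.0
            elif x == y:
                total += 0.5
    return total


operated = [2.6, 2.0, 1.7, 2.7, 2.5, 2.6, 2.5, 3.0]
others = [1.2, 1.8, 1.9, 2.3, 1.3, 3.0, 2.2, 1.3, 1.5, 1.6, 1.3, 1.5, 2.7, 2.0]

u1 = u_by_pair_counting(operated, others)
u2 = u_by_pair_counting(others, operated)
u_calc = min(u1, u2)
print(u1, u2, u_calc)   # 21.5 90.5 21.5

# Tabulated two-tail 5% critical value for n1 = 8, n2 = 14.
print("significant at 5%:", u_calc <= 26)
```

Both routes give the same statistic, so the pair count is a convenient way to catch slips in the ranking table.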

Normal approximation

Employing the normal approximation,

$$z = \frac{U + \frac{1}{2} - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} = \frac{21.5 + \frac{1}{2} - \frac{8 \times 14}{2}}{\sqrt{\frac{8 \times 14 \times (8 + 14 + 1)}{12}}} = -2.32$$

Using normal tables the p value is $p = 2 \times 0.010 = 0.02$, remarkably close to the exact value reported by software.

Z      0.00   -0.01  -0.02  -0.03  -0.04  -0.05  -0.06  -0.07  -0.08  -0.09
-2.3   0.011  0.010  0.010  0.010  0.010  0.009  0.009  0.009  0.009  0.008
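The same check can be made for the Mann-Whitney statistic. This sketch assumes U = 21.5 with $n_1 = 8$ and $n_2 = 14$ from the example and uses `math.erf` for the normal tail area; the function names are illustrative.

```python
import math


def mann_whitney_z(u, n1, n2):
    """z for the Mann-Whitney U statistic, with the +1/2 continuity correction (lower tail)."""
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u + 0.5 - mean) / sd


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


z = mann_whitney_z(21.5, 8, 14)
p_two_tail = 2 * normal_cdf(z)
print(round(z, 2))            # -2.32
print(round(p_two_tail, 2))   # 0.02
```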

Parametric vs Non-Parametric Tests

Parametric Tests
  They are robust with respect to violations of their assumptions.
  They are more powerful: more likely to detect an effect when one is present.
  They are more versatile: there are tests for every experimental design.

Non-Parametric Tests
  They make fewer assumptions.
  They are ideal for ordinal data, which is common in Psychology, whereas parametric tests require interval or ratio data.

Read Howitt and Cramer pages 154-164, 167-173
Read Russo pages 168-175
Read Davis and Smith pages 448-459

Critical Values For Wilcoxon's Signed-Rank Test

The body of the table contains the critical values for Wilcoxon's signed-rank test. Always enter the table with W+, the sum of the ranks of the positive deviations. If a critical value is missing (shown as -), the hypothesis cannot be rejected for this combination of n and α.

One-tail   0.05  0.025   0.01  0.005  |  One-tail   0.05  0.025   0.01  0.005
Two-tail   0.10   0.05   0.02   0.01  |  Two-tail   0.10   0.05   0.02   0.01
n                                     |  n
5             1      -      -      -  |  28          130    117    102     92
6             2      1      -      -  |  29          141    127    111    100
7             4      2      0      -  |  30          152    137    120    109
8             6      4      2      0  |  31          163    148    130    118
9             8      6      3      2  |  32          175    159    141    128
10           11      8      5      3  |  33          188    171    151    138
11           14     11      7      5  |  34          201    183    162    149
12           17     14     10      7  |  35          214    195    174    160
13           21     17     13     10  |  36          228    208    186    171
14           26     21     16     13  |  37          242    222    198    183
15           30     25     20     16  |  38          256    235    211    195
16           36     30     24     19  |  39          271    250    224    208
17           41     35     28     23  |  40          287    264    238    221
18           47     40     33     28  |  41          303    279    252    234
19           54     46     38     32  |  42          319    295    267    248
20           60     52     43     37  |  43          336    311    281    262
21           68     59     49     43  |  44          353    327    297    277
22           75     66     56     49  |  45          371    344    313    292
23           83     73     62     55  |  46          389    361    329    307
24           92     81     69     61  |  47          408    379    345    323
25          101     90     77     68  |  48          427    397    362    339
26          110     98     85     76  |  49          446    415    380    356
27          120    107     93     84  |  50          466    434    398    373

Critical Values Of U In The Mann-Whitney Test

Critical Values of U at α = 0.025 with direction predicted or at α = 0.05 with direction not predicted. Rows are n1 (2 to 20); columns are n2 (5 to 20).

 n1   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
  2   0   0   0   0   0   0   0   1   1   1   1   1   2   2   2   2
  3   0   1   1   2   2   3   3   4   4   5   5   6   6   7   7   8
  4   1   2   3   4   4   5   6   7   8   9  10  11  11  12  13  13
  5   2   3   5   6   7   8   9  11  12  13  14  15  17  18  19  20
  6   3   5   6   8  10  11  13  14  16  17  19  21  22  24  25  27
  7   5   6   8  10  12  14  16  18  20  22  24  26  28  30  32  34
  8   6   8  10  13  15  17  19  22  24  26  29  31  34  36  38  41
  9   7  10  12  15  17  20  23  26  28  31  34  37  39  42  45  48
 10   8  11  14  17  20  23  26  29  33  36  39  42  45  48  52  55
 11   9  13  16  19  23  26  30  33  37  40  44  47  51  55  58  62
 12  11  14  18  22  26  29  33  37  41  45  49  53  57  61  65  69
 13  12  16  20  24  28  33  37  41  45  50  54  59  63  67  72  76
 14  13  17  22  26  31  36  40  45  50  55  59  64  67  74  78  83
 15  14  19  24  29  34  39  44  49  54  59  64  70  75  80  85  90
 16  15  21  26  31  37  42  47  53  59  64  70  75  81  86  92  98
 17  17  22  28  34  39  45  51  57  63  67  75  81  87  93  99 105
 18  18  24  30  36  42  48  55  61  67  74  80  86  93  99 106 112
 19  19  25  32  38  45  52  58  65  72  78  85  92  99 106 113 119
 20  20  27  34  41  48  55  62  69  76  83  90  98 105 112 119 127

Critical Values Of U In The Mann-Whitney Test

Critical Values of U at α = 0.05 with direction predicted or at α = 0.10 with direction not predicted. Rows are n1 (2 to 20); columns are n2 (5 to 20).

 n1   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
  2   0   0   0   0   1   1   1   2   2   2   3   3   3   4   4   4
  3   1   2   2   3   4   4   5   5   6   7   7   8   9   9  10  11
  4   2   3   4   5   6   7   8   9  10  11  12  14  15  16  17  18
  5   4   5   6   8   9  11  12  13  15  16  18  19  20  22  23  25
  6   5   7   8  10  12  14  16  17  19  21  23  25  26  28  30  32
  7   6   8  11  13  15  17  19  21  24  26  28  30  33  35  37  39
  8   8  10  13  15  18  20  23  26  28  31  33  36  39  41  44  47
  9   9  12  15  18  21  24  27  30  33  36  39  42  45  48  51  54
 10  11  14  17  20  24  27  31  34  37  41  44  48  51  55  58  62
 11  12  16  19  23  27  31  34  38  42  46  50  54  57  61  65  69
 12  13  17  21  26  30  34  38  42  47  51  55  60  64  68  72  77
 13  15  19  24  28  33  37  42  47  51  56  61  65  70  75  80  84
 14  16  21  26  31  36  41  46  51  56  61  66  71  77  82  87  92
 15  18  23  28  33  39  44  50  55  61  66  72  77  83  88  94 100
 16  19  25  30  36  42  48  54  60  65  71  77  83  89  95 101 107
 17  20  26  33  39  45  51  57  64  70  77  83  89  96 102 109 115
 18  22  28  35  41  48  55  61  68  75  82  88  95 102 109 116 123
 19  23  30  37  44  51  58  65  72  80  87  94 101 109 116 123 130
 20  25  32  39  47  54  62  69  77  84  92 100 107 115 123 130 138

Critical Values Of U In The Mann-Whitney Test

Critical Values of U at α = 0.005 with direction predicted or at α = 0.01 with direction not predicted. Rows are n1 (2 to 20); columns are n2 (5 to 20).

 n1   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
  2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
  3   0   0   0   0   0   0   0   1   1   1   2   2   2   2   3   3
  4   0   0   0   1   1   2   2   3   3   4   5   5   6   6   7   8
  5   0   1   1   2   3   4   5   6   7   7   8   9  10  11  12  13
  6   1   2   3   4   5   6   7   9  10  11  12  13  15  16  17  18
  7   1   3   4   6   7   9  10  12  13  15  16  18  19  21  22  24
  8   2   4   6   7   9  11  13  15  17  18  20  22  24  26  28  30
  9   3   5   7   9  11  13  16  18  20  22  24  27  29  31  33  36
 10   4   6   9  11  13  16  18  21  24  26  29  31  34  37  39  42
 11   5   7  10  13  16  18  21  24  27  30  33  36  39  42  45  46
 12   6   9  12  15  18  21  24  27  31  34  37  41  44  47  51  54
 13   7  10  13  17  20  24  27  31  34  38  42  45  49  53  56  60
 14   7  11  15  18  22  26  30  34  38  42  46  50  54  58  63  67
 15   8  12  16  20  24  29  33  37  42  46  51  55  60  64  69  73
 16   9  13  18  22  27  31  36  41  45  50  55  60  65  70  74  79
 17  10  15  19  24  29  34  39  44  49  54  60  65  70  75  81  86
 18  11  16  21  26  31  37  42  47  53  58  64  70  75  81  87  92
 19  12  17  22  28  33  39  45  51  56  63  69  74  81  87  93  99
 20  13  18  24  30  36  42  46  54  60  67  73  79  86  92  99 100