Appendices for: Statistical Power in Analyzing Interaction Effects: Questioning the Advantage of PLS with Product Indicators


Dale Goodhue
Terry College of Business, MIS Department
University of Georgia
Athens, GA

William Lewis
College of Administration and Business
Louisiana Tech University
P. O. Box
Ruston, LA

Ronald L. Thompson
Babcock Graduate School of Management
Wake Forest University
Winston-Salem, NC

Abstract of Main Paper: A significant amount of IS research involves hypothesizing and testing for interaction effects. Chin, Marcolin and Newsted (2003) completed an extensive experiment using Monte Carlo simulation that compared two different techniques for detecting and estimating such interaction effects: Partial Least Squares (PLS) with a product indicator approach versus multiple regression with summated indicators. By varying the number of indicators for each construct and the sample size, they concluded that PLS using product indicators was better (at providing higher and presumably more accurate path estimates) than multiple regression using summated indicators. Although we view the Chin et al. (2003) study as an important step in using Monte Carlo analysis to investigate such issues, we believe their results give a misleading picture of the efficacy of the product indicator approach with PLS. By expanding the scope of the investigation to include statistical power, and by replicating and then extending their work, we reach a different conclusion -- that although PLS with the product indicator approach provides higher point estimates of interaction paths, it also produces wider confidence intervals, and thus provides less statistical power than multiple regression. This disadvantage increases with the number of indicators and (up to a point) with sample size. We explore the possibility that these surprising results can be explained by capitalization on chance. Regardless of the explanation, our analysis leads us to recommend that, if sample size or statistical significance is a concern, regression or PLS with product of the sums should be used instead of PLS with product indicators for testing interaction effects.

APPENDICES

Table of Contents

Appendix A: A SAS program to generate data for 500 samples of 100 questionnaires each
Appendix B: Determinants of statistical power
Appendix C: Real versus integer data
Appendix D: Capitalization on chance
Appendix E: Comparing Bootstrapping with 100 and 500 resamples
Appendix F: PLS with normal theory significance testing
Appendix G: Normality and Kurtosis
Appendix H: Results using significance level of 0.05

Appendix A: A SAS Program to Generate Data for 500 Samples of 100 Questionnaires Each (Two Indicators for Each Main Effect Construct, Four Indicators for the Interaction)

    Retain Seed Seed Seed Seed Seed Seed Seed Seed Seed ;
    Do I = 1 to 50000;
       Call Rannor(Seed1, KSI1);
       Call Rannor(Seed2, X1Err);
       Call Rannor(Seed3, X2Err);
       X1 = round(.70*ksi *X1Err) ;
       X2 = round(.70*ksi *X2Err) ;
       Call Rannor(Seed4, KSI2);
       Call Rannor(Seed5, Z1Err);
       Call Rannor(Seed6, Z2Err);
       Z1 = round(.70*ksi *Z1Err) ;
       Z2 = round(.70*ksi *Z2Err) ;
       Call Rannor(Seed7,Eta1Err);
       Call Rannor(Seed8,Y1Err);
       Call Rannor(Seed9,Y2Err);
       ETA1 =.30*KSI1 +.50*KSI2 +.30*KSI1*KSI *Eta1Err;
       Y1 = round(.70*eta *Y1Err) ;
       Y2 = round(.70*eta *Y2Err) ;
       I1 = X1*Z1;
       I2 = X1*Z2;
       I3 = X2*Z1;
       I4 = X2*Z2;
       Put Y1 4.0 Y2 4.0 X1 4.0 X2 4.0 Z1 4.0 Z2 4.0 I1 4.0 I2 4.0 I3 4.0 I4 4.0;
    end;
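For readers without SAS, the same data-generating logic can be sketched in Python. This is an illustrative reconstruction, not the authors' program: the .714 error weight is taken from the description of x1 in Appendix B, the scaling of the structural error is an assumption (chosen so that ETA1 has unit variance), the seed value is hypothetical, and a single random stream replaces the nine separate SAS seeds.

```python
import math
import random

LOADING = 0.70                          # indicator loading used in Appendix A
ERR_SD = math.sqrt(1 - LOADING ** 2)    # about .714, so each indicator has variance 1
# Assumption: the structural error is scaled so that ETA1 has unit variance.
# Var(.30*KSI1 + .50*KSI2 + .30*KSI1*KSI2) = .09 + .25 + .09 = .43
ETA_ERR_SD = math.sqrt(1 - 0.43)


def generate_questionnaire(rng):
    """Return one simulated questionnaire as a dict of integer indicator scores."""
    ksi1 = rng.gauss(0, 1)
    ksi2 = rng.gauss(0, 1)
    eta1 = (0.30 * ksi1 + 0.50 * ksi2 + 0.30 * ksi1 * ksi2
            + ETA_ERR_SD * rng.gauss(0, 1))

    def indicator(construct):
        # Loading times construct score plus independent error, rounded to an integer.
        return round(LOADING * construct + ERR_SD * rng.gauss(0, 1))

    row = {
        "X1": indicator(ksi1), "X2": indicator(ksi1),
        "Z1": indicator(ksi2), "Z2": indicator(ksi2),
        "Y1": indicator(eta1), "Y2": indicator(eta1),
    }
    # Product indicators for the interaction construct.
    row["I1"] = row["X1"] * row["Z1"]
    row["I2"] = row["X1"] * row["Z2"]
    row["I3"] = row["X2"] * row["Z1"]
    row["I4"] = row["X2"] * row["Z2"]
    return row


def generate_samples(n_samples=500, n_per_sample=100, seed=12345):
    """Return n_samples lists of n_per_sample questionnaires (seed is hypothetical)."""
    rng = random.Random(seed)
    return [[generate_questionnaire(rng) for _ in range(n_per_sample)]
            for _ in range(n_samples)]
```

Calling `generate_samples()` with the defaults yields 500 samples of 100 questionnaires, matching the 50,000 iterations of the loop above.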

Appendix B. Determinants of Statistical Power

When the underlying relationships among the constructs and indicators are explicitly known (as in data generated for a Monte Carlo analysis using Figure 1 and Appendix A), we have a unique opportunity to see how measurement error, indicator reliability, number of indicators, construct reliability, path strength, effect size and statistical power are all related. This appendix walks the reader through the logic and calculations needed, using several basic statistics equations and tables from Cohen (1988).

Measurement error is a general term referring to error in the score of an indicator of a construct. Reliability (as measured by alpha; Cronbach, 1951) is more precisely defined as the true score variance of a measure divided by its total variance (Carmines & Zeller 1979, page 31). Because we specifically designed the model in Figure 1 this way, we know that each indicator and each of the X, Z and Y constructs is normally distributed, with mean 0 and variance 1.0. In Figure 1 and Appendix A, for example, x1 is generated as .7*X + .714*RANNOR, where X and RANNOR are independent, normal (0,1) distributions. Therefore the true score variance of x1 is (.7)^2 times the variance of X, or .49 (Larsen and Marx, 1981, Theorem 3.12). The error variance is (.714)^2 times the variance of RANNOR, or .51. So total variance = 1, and the reliability of x1 is .49 / 1.0 = .49. The same calculations hold for each of the indicators in Figure 1, so the reliability of each indicator is .49. (If we changed all the indicator loadings to .6, adjusting the error variance accordingly, the reliability of each indicator would be .36, and so on.)

We can also calculate the reliability of the two-indicator measure of X by writing its equation and using a standard equation for the variance of a sum of variables (footnote 1). Accordingly, if each of two indicators has a .7 loading, the true score variance of X is .49 and the total variance is .745, because averaging two indicators halves the error variance (.51 / 2 = .255).
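The variance arithmetic above extends directly to a reliability figure for any number of indicators. The following is a minimal sketch, assuming (as in Figure 1) that every indicator has the same loading, unit total variance, and independent errors, and that the construct score is the simple average of its indicators. The function name is ours, not the paper's.

```python
def composite_reliability(loading, n_indicators):
    """Reliability of the average of n equally loaded, unit-variance indicators."""
    true_var = loading ** 2                        # true score variance, e.g. .49
    error_var = (1 - loading ** 2) / n_indicators  # averaging shrinks error variance
    return true_var / (true_var + error_var)


for k in (1, 2, 4, 6):
    print(k, round(composite_reliability(0.70, k), 2))
```

With a loading of .70 this gives .49 for a single indicator and .66 for two indicators (.49 / .745). Four and six indicators give .79 and .85, the main-effect reliabilities that head the columns of Table B-1.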
The reliability of the two-indicator measure for X (and for Z and Y) is therefore .49 / .745 = .66. If we moved to 4 indicators to measure each construct, the reliability of each construct would be .79, and so on. So the reliability of each construct is a function of the relative size of the true score and error loadings in each indicator, and of the number of indicators.

Next, though the math is more extensive (footnote 2) if not more difficult, we can calculate the covariances for X and Y, for Z and Y, and for I and Y. Since X, Z, and X*Z are all independent (see footnote 5), the partial R^2 (or incremental variance explained) for each is simply its covariance with Y, squared. Finally, the effect size of I is equal to the square of the partial R of I, divided by the total unexplained variance, or:

$$ f^2 = \frac{\Delta R^2_I}{1 - \sum_{j \in \{X, Z, I\}} \Delta R^2_j} $$ (Cohen 1988, page 410, Equation 9.2.3)

Note that effect size is not equivalent to the value of b3, nor are b3 and effect size related in any straightforward way (Carte and Russell 2003, p. 482-3). The effect size of I is increased by greater reliability of the constructs X, Z, and Y (which can be brought about by more reliable indicators, or by more indicators, or both), by a stronger b3 (the true interaction path), and by more total explained variance (from all three independent variables).

Finally, Cohen (1988) gives tables (i.e. Tables and 9.3.2, pages ) that allow us to predict the power of any given independent variable in a regression if we know its effect size, the sample size n and the number of other independent variables. Table B-1 below shows the expected power for Figure 1, at six different sample sizes and three different numbers of indicators. Since power is a proportion (number of significant t statistics divided by number of datasets analyzed), we can determine a 95% confidence interval around our estimate for each condition by using a standard equation for confidence intervals around a proportion.

Of course our advance calculations apply only to regression, since PLS will modify the indicator loadings to maximize the variance explained, and this changes all the calculations unpredictably. However, these power calculations serve as a way to check our Monte Carlo results for regression, and provide a target value to compare with the PLS results.

Footnote 1: $\mathrm{Var}(X_1/2 + X_2/2) = \sum_{i=1}^{n} \mathrm{Var}(X_i/2) + 2 \sum_{j<k} \mathrm{Cov}(X_j/2, X_k/2)$ (Larsen and Marx 1981, equation 10.3).

Footnote 2: We use $\mathrm{Cov}(X,Y) = E(XY) - \mu_X \mu_Y$, and $E(XY) = E(X)E(Y)$ if X and Y are independent (Larsen & Marx 1981, Theorems 3.10 and 10.1).

Table B-1. Estimated Power for the Figure 1 Model for Regression. Values in square brackets show the 95% confidence interval.

Rows: Sample Size
Column headings:
2 indicators: Main Effect Reliab. = .66; Interaction Reliab. = .44; Effect Size =
4 indicators: Main Effect Reliab. = .79; Interaction Reliab. = .62; Effect Size =
6 indicators: Main Effect Reliab. = .85; Interaction Reliab. = .72; Effect Size =

Cell values as extracted: OOR* 50 OOR* OOR* OOR* 17 [14-20] 24 [20-28] [10-23] 41 [37-46] 56 [52-61] [28-36] 63 [59-67] 79 [76-83] [41-49] 79 [75-82] 92 [89-94] [88-93] OOR* OOR*

*OOR = Out of Range in Cohen's table, i.e. power is predicted to be either less than 10 or
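The bracketed intervals in Table B-1 can be approximated with the usual normal-approximation interval for a proportion. This sketch assumes that this is the "standard equation" meant above and that each condition uses 500 datasets (as in Appendix A); both are our assumptions, and the function name is illustrative.

```python
import math


def power_confidence_interval(power, n_datasets, z=1.96):
    """Normal-approximation confidence interval for a proportion such as power."""
    se = math.sqrt(power * (1 - power) / n_datasets)
    return power - z * se, power + z * se


low, high = power_confidence_interval(0.17, 500)
print(round(low * 100), round(high * 100))
```

For an estimated power of .17 this gives roughly 14 to 20 percent, and for .24 roughly 20 to 28 percent, in line with the "17 [14-20]" and "24 [20-28]" entries that survive in the table.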

APPENDIX C: REAL VERSUS INTEGER DATA

To test the impact on the pattern of results of moving from input data consisting of real numbers specified to four decimal places (as used by Chin et al. 2003) to integer data (as used by most IS researchers), we rounded the X, Z, and Y indicator values from the A1 through A7 datasets to integers, and repeated the analysis. Table C-1 shows the results. Obviously there is some loss of information as we go from real to integer data -- power for both regression and PLS-PI is reduced by about 3 to 5 percentage points -- but the pattern of results is unchanged. PLS-PI still has an advantage over regression in terms of higher point estimates for beta; regression still has an advantage over PLS-PI in terms of power. Widely different indicator loadings (for example A4, A5, or A7) reduce regression's advantage, but do not eliminate it. So the pattern we found in Figure 2 and Table 1 will, in fact, be faced by researchers using questionnaires with integer data as well.

Table C-1 -- Integer Values (Indicator Values Rounded to Integers), Chin et al.'s A1-A7 Data

Columns: DataSet; Main Effect Construct Indicator Loadings; Reliability (main effect constructs); Power (PLS, Reg, Regression Advantage); Beta Values (PLS, Reg, PLS Advantage)

A1: 2@.8; 2@.7; 2@
A2: 3@.8; 3@
A3: 3@.8; 3@
A4: 2@.8; 2@.6; 2@
A5: 3@.8; 3@
A6: 3@.7; 3@
A7: 2@.7; 2@.6; 2@

APPENDIX D: EXPLORING THE POSSIBILITY THAT PLS-PI CAPITALIZES ON CHANCE

We said at the beginning of the Discussion section of the main paper that our results in Tables 2 and 3 seem to suggest the possibility that PLS-PI capitalizes on chance. In this appendix we examine this contention in more detail. We first describe the PLS estimation process and show where in that process capitalization on chance could occur. With this as background, we use a hypothetical example of how PLS-PI and regression would respond to a single extreme outlier data point causing an interaction indicator to be highly correlated with Y. We then move away from the idea of single outlier data points and argue that random variation in general could have the same effect. We then support these contentions with a detailed look at one specific sample where PLS-PI generated a large beta estimate that was not statistically significant. We conclude that all of our ad hoc analysis gives results that are consistent with PLS-PI capitalizing on chance, especially when there are many indicators for given constructs.

The PLS Estimation Process. The algorithm for PLS estimation, as described by Chin (1998), involves 3 stages. Stage One is an iterative estimation process to determine the set of indicator weights for all constructs, computed in such a way as to maximize the amount of variance explained. Once the final indicator weights are determined, Stages Two and Three are simple noniterative applications of OLS regression for obtaining loadings, path coefficients, and mean scores and location parameters for the LV [latent variable] and observed variables (Chin 1998, p. 302). To be complete we might say that there is a Stage Four to determine the statistical significance of the parameter estimates, which typically involves the use of bootstrapping or jackknifing.

The key to understanding how PLS might capitalize on chance is to look closely at how Stage One operates. Let's look only at the interaction indicators and the link between the interaction (X*Z) and Y (from Figure 1). The following steps are performed (Barclay et al., 1995):

1. PLS starts by assuming equal weights for all indicators (the loadings are set to 1.0). An initial estimate for Y is calculated by summing the values for y1 and y2.
2. To estimate the weights for the interaction terms (x1*z1, etc.), a regression is completed with the initial estimate of Y as the dependent variable, and the interaction indicator values (x1*z1, etc.) as the independent variables.
3. These weights are then used in a linear combination of x1*z1 through x2*z2 to give an initial estimate for X*Z (the interaction term construct).
4. The loadings for y1 and y2 are then estimated by a pair of simple regressions of y1 and y2 on X*Z.
5. The next step uses the estimated loadings, transformed into weights, to form a linear combination of y1 and y2 as a new estimate for the value of Y.

This process continues until the change in the stop criterion (e.g., the average of the R^2 values of all endogenous constructs) between consecutive iterations is extremely small.

Put differently, at each Stage One cycle PLS calculates new construct scores for the interaction construct and for Y using their existing weights. It then seeks to increase the R^2 of Y by adjusting the interaction indicator weights. It does this by looking through the immediate construct (X*Z) to see the next proximal construct (Y). In other words, for the interaction construct indicators, it ignores the interaction construct score just calculated, and looks instead at the construct score for Y, the proximal construct. In effect it runs a regression analysis with the construct score for Y as the dependent variable and the many interaction product indicators as the independent variables. The betas for the indicators in this regression provide the basis for adjusting the indicator weights for the next round. In effect, PLS selects those indicators with the highest correlation with Y and gives them the highest weights.

PLS-PI and an Extreme Outlier Point. How would this process react to a single extreme outlier data point for one of the interaction indicators? Consider our underlying true model in Figure 1 with two indicators for each construct and all indicator loadings equal to .7. Further, suppose that there was an extreme outlier point that created an unusually high correlation between one of the interaction indicators (say x1*z2) and the Y construct. In this case the Stage One process described above would tend to give a higher than normal weight to that x1*z2 indicator, in order to maximize the R^2. (We note that maximizing R^2 is what both PLS and regression try to do, but by being able to adjust the indicator weights as well as the beta weights, PLS has more ability to capitalize on chance than regression.) If the x1*z2 indicator is weighted more heavily in the score for X*Z, then X*Z would be more than normally correlated with Y, which would result in a higher than normal estimate for the path from X*Z to Y.

What would regression do in response to this extreme outlier point? Because our regression analysis approach typically uses equal loadings for all indicators, the effect of the outlier data point would be dampened by the fact that the x1*z2 indicator would be averaged in with three other indicators that did not have such a high correlation with Y. The result is that for regression there would be a smaller increase in the path from X*Z to Y, and a smaller increase in R^2.

So far this explanation is consistent with the results from both the Chin et al. analysis and our analysis: we saw higher average interaction paths (and, though not shown here, higher average R^2) in PLS than in regression. But how would the statistical significance be affected in the two different techniques? Regression would calculate the standard error of the interaction coefficient using the assumption that all indicators had equal weightings, and would include all data points, including the outlier, in the single calculation.

To test for statistical significance in PLS, on the other hand, the recommended technique is to employ bootstrapping. The bootstrapping approach is relatively conservative with respect to statistical significance caused by outlier data points, as explained below. Using sampling with replacement, bootstrapping creates a number of resamples (say 400) of the original data points, each with the same n as the original sample, but not necessarily the same data points. (Some data points may be omitted; others may appear more than once in any given resample.) For each of these 400 resamples, there is a complete PLS analysis (Stages One through Three), and a new estimate of the interaction beta is made for each. Since some of the 400 resamples would have the outlier point included and some would not, the values of the 400 beta estimates would tend to jump around: they would be higher when the outlier was included and lower when it was not. Bootstrapping uses these beta estimates from the 400 bootstrapping resamples to determine the standard error of the original beta estimate, so this larger variation in the beta estimates due to the outlier data point would result in the bootstrapping program generating a larger standard error for the interaction beta. This larger standard error would tend to offset the larger interaction beta value, perhaps even leading to smaller t-statistics for the PLS interaction beta than for the regression interaction beta. This could make it more difficult to achieve statistical significance, even with a higher interaction beta estimate.
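The resampling logic described above can be illustrated with a small stand-alone sketch. It is not the PLS algorithm: as a deliberately crude stand-in for data-driven weighting, each indicator is weighted by its sample correlation with Y, and this is compared with equal weights (the summated approach). All function names are hypothetical, and the point is only to show how a bootstrap standard error is obtained by re-estimating everything, weights included, on each resample.

```python
import math
import random


def correlation(a, b):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def composite_path(indicators, y, chance_weighted):
    """Correlation between y and a composite of the indicator columns.

    chance_weighted=False: equal weights (summated scale).
    chance_weighted=True: weight = the indicator's sample correlation with y,
    a simplified stand-in for data-driven weighting (NOT the PLS algorithm).
    """
    if chance_weighted:
        weights = [correlation(col, y) for col in indicators]
    else:
        weights = [1.0] * len(indicators)
    composite = [sum(w * col[i] for w, col in zip(weights, indicators))
                 for i in range(len(y))]
    return correlation(composite, y)


def bootstrap_se(indicators, y, chance_weighted, n_resamples, rng):
    """Standard deviation of the estimate across resamples drawn with replacement."""
    n = len(y)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]   # same n, sampled with replacement
        cols = [[col[i] for i in idx] for col in indicators]
        estimates.append(composite_path(cols, [y[i] for i in idx], chance_weighted))
    mean = sum(estimates) / n_resamples
    return math.sqrt(sum((e - mean) ** 2 for e in estimates) / (n_resamples - 1))
```

One design point is worth noting. With correlation-based weights the composite's sample correlation with Y can never be negative, even when the indicators are pure noise, because every weight is tuned to the same sample it is then evaluated on. An equal-weight composite has no such built-in advantage.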

PLS-PI With the Usual Random Variation But Without Extreme Outliers

Although above we talked about the effect of a single extreme outlier data point, the same general effect might come about from the normal random variation present in virtually all questionnaire data, even without extreme outliers. In fact it is virtually certain that when there are multiple interaction indicators, some will have higher correlations with Y than others. PLS would give larger weights to each indicator that had a higher correlation with Y, and lower weights to those with smaller correlations with Y. The more indicators there are, the greater the possibility for variation in correlation with Y, and the more scope PLS has to "capitalize on chance" by assigning higher weights to those interaction indicators highly correlated with Y. Thus the impact from any single indicator would not need to be large, if it was added to the impact from other indicators as well.

If PLS-PI does sometimes capitalize on chance, we would expect that we could identify samples in our own analysis that had capitalized on chance by looking for large differences between the regression and PLS-PI estimates of the interaction beta. If it were true that these samples also did not have statistically significant interaction terms for PLS, it would be consistent with our conjecture above. To test this conjecture, we identified the five samples (of the 500 in the six-indicator, n = 100 cell) with the largest differences between the PLS and regression interaction estimates. In all five cases, the PLS runs have a much larger absolute value for the interaction estimate. However, none of the five samples has a statistically significant interaction term, neither for PLS nor for regression. Table D-1 shows the results for the five samples.

Table D-1. Examples of Large Path Estimates in PLS Not Translating to Statistical Significance
The five samples from the 6-indicator, n = 100 cell with the largest absolute difference between interaction point estimates for PLS-PI and regression.
Columns: PLS-PI (Interaction Path Estimate, R^2, t statistic); Regression (Interaction Path Estimate, R^2, t statistic)
Rows: Example 1 (#87); Example 2 (#422); Example 3 (#471); Example 4 (#477); Example 5 (#497)

An In-Depth Look at One Particular Sample

To investigate this further, in Table D-2 we look at sample #497 (from the bottom of Table D-1) in more detail. (We completed a similar analysis with sample #87, and obtained similar results.) We focus not on the estimate of the interaction path from X*Z to Y, but on the weights PLS assigns to the 36 product indicators to come up with a value for I, the interaction construct. According to our explanation above, if sample #497 has some interaction indicators with especially high correlations with Y, we should see PLS assigning a higher weight to these indicators. We certainly see that in Table D-2. The six indicators with the highest absolute value correlation with Y (shown in bold in column 2 of Table D-2) have the six highest absolute value weights assigned by PLS (shown in bold in column 3 of Table D-2). Note also that three of the highest absolute value correlations are positive, and three are negative. We know that in the underlying model of Figure 1, the true relationship between Y and the interaction term (or any of its indicators) is positive. Of course there is also random error variance, which is why we see quite a scattering between negative and positive correlations with Y in Table D-2.

What is interesting is that PLS has responded to this chance scattering of correlations by assigning higher positive weights to indicators with high positive correlations with Y (for example I8), and assigning high negative weights to indicators with higher negative correlations (for example I6). This is evidence that PLS is capitalizing on large negative and positive chance correlations. PLS is doing this to maximize the correlation between the resulting interaction construct (the weighted average of all 36 product indicators) and Y. But as we can see, the result is a strange mixture of positive and negative weights for the 36 product indicators, even though the underlying model would suggest all positive (or all negative) weights.

If the higher correlations between product indicator scores and Y are caused by a small collection of data points, there are a couple of things we might expect to see in the bootstrap results. First, we might expect a large difference (column 5) between the original PLS estimate for the weights based on the full sample (column 3) and the average of the bootstrap resample estimates for the weights (column 4), since the bootstrap resamples would sometimes include these chance data points, and sometimes not. In fact we do see that -- the six highest absolute value correlations in column 2 are also the six highest differences (column 5) between the overall PLS-determined weight and the bootstrap average weight. Second, we would expect that the bootstrap would produce a larger standard error for the weights (column 6) when an indicator had these chance data points (i.e. those causing a high positive or negative correlation with Y). In fact, the six highest absolute value correlations in column 2 are also the six highest standard errors as determined by bootstrapping (column 6). Finally, of the six highest absolute value correlations, only two have a t statistic higher than . All of this is consistent with our conjecture that PLS capitalizes on random high positive or negative correlations with Y in determining its beta values, but that during significance testing, bootstrapping compensates for that by penalizing beta values caused by only a few data points.

Extreme Outliers or Random Chance?

Are the characteristics shown in Table D-2 really due to chance variations and not to recognizable outliers in the data? If the latter, perhaps the outliers could be removed prior to PLS analysis. Looking at frequency distributions for the X, Z and Y indicators, we saw mostly values between -2 and +2, with a lesser amount between -3 and +3, and only one possible outlier, a Z value of 4. Removing that data point had no material impact on the PLS results -- beta was still high (.367) and not statistically significant. Though most of the values for interaction indicators were between -4 and +4, we found a collection of data points with values of +6 or -6, which might be construed to be outliers (footnote 3). Removing these left 95 data points, and PLS again returned a high interaction beta (.407), again not significant. There were 10 more data points with values of +4 or -4. While it is not clear that any researcher would be comfortable removing as outliers 15 out of 100 questionnaires on this evidence (Carte and Russell 2003, p. 489 caution against this), we did remove those questionnaires and performed a PLS analysis on the remaining 85 questionnaires. The result was still a large beta value for the interaction term (.298, within the 10% accuracy hurdle), but still no statistical significance.

We conclude that this phenomenon cannot be blamed on identifiable outlier data points, but is due to capitalization on chance in a more general sense. The more interaction indicators there are, the more opportunity there will be for PLS to capitalize on chance, and the more bootstrapping will penalize the findings in terms of loss of power.

Footnote 3: We note that it would be possible to run a regression predicting Y with all of the X and Z predictors, and then use standard regression diagnostics, such as hat values, to identify multivariate outliers (Bollen, 1989; Bollen and Arminger, 1991). We would like to thank an anonymous reviewer for this suggestion.

Table D-2. Sample #497 Statistics on PLS-PI Product Indicators (Six Indicators, N = 100)
Higher + or - indicator correlations with Y lead to increased PLS indicator weights, increased differences from bootstrap means, and increased bootstrap standard errors.
Columns:
1. Interaction Indicators
2. Correlation with Y (from SAS run); bold if abs(corr) >= .13
3. Weights, Full Sample (from PLS); bold if abs(wt) > .195
4. Mean Wt. of Subsamples (from PLS bootstrap)
5. Difference (Wt - Mean) (calculated); bold if abs(diff) > .14
6. Standard Error of Wt. (from PLS bootstrap); bold if Std Err > .105
7. T-Statistic (from PLS bootstrap); bold if t > 1.98
Rows: the 36 interaction (product) indicators

Appendix E: Comparing Bootstrapping With 100 and 500 Resamples

To address the issue of whether a move to 500 resamples (rather than 100) would affect our results, we selected two cells (N=20 and N=50, with 4 indicators) and reran the analysis using 500 resamples. Recall that in each cell we have 500 samples of N=20 (or N=50). Using 500 resamples for each sample results in 250,000 resamples in total in each cell, all drawn from the same underlying population. We limited the number of cells for this analysis because, given the technique we were using, the computational burden of 500 resamples made it very challenging. To illustrate the difficulty involved, one of the researchers spent approximately 16 hours completing the analyses for the N=20 cell alone.

The results were as we expected. At two decimal places, the path estimates and power computations were identical for 100 and 500 resamples. At three decimal places, very minor differences were observed in power, as shown in Table E1 below.

Table E1: Comparing 100 and 500 Bootstrapping Resamples
Columns: 100 Resamples; 500 Resamples
Rows: N=20, path estimate for I; N=20, power for I (p<.01); N=50, path estimate for I; N=50, power for I (p<.01)

Appendix F: PLS with Normal Theory Significance Testing

A concern was raised that comparing statistical significance tests based on normal theory testing with regression, versus bootstrapping with PLS, is like comparing apples and oranges -- the two are not equivalent. To address this, we conducted some additional analyses in which we compared regression with normal theory testing to PLS with normal theory testing. To create PLS with normal theory testing, we used indicator loadings generated by PLS to calculate construct scores, and then ran regression with normal theory testing on these construct scores. These results were compared with the straight regression (equal indicator loadings) significance results.

To describe the process in more detail: first, we modified the model to have the same effect sizes as previously for X and Z, but no effect for I (the interaction term). In this manner, we could test the efficacy of both approaches in terms of both Type I and Type II errors. We used four items to measure each of the constructs X and Z, with sixteen indicators for the interaction term, I. We used sample sizes of 50, 100, and 150, and we generated 500 datasets for each of the sample sizes. We ran the PLS analysis without bootstrapping for all 500 datasets in each sample size condition, then stripped off the indicator weights for each construct and used those and the raw data to determine construct scores for each "questionnaire". These construct scores were then fed into a regression analysis which estimated the betas and t statistics for each of the 500 datasets. The proportion of t-statistics that are significant (i.e. the power) for each effect size and N is displayed in Table F-1 below (labeled PLS-NTT, for PLS using Normal Theory Testing), alongside the results for regression (using normal theory testing) and PLS with bootstrapping (labeled PLS-B).

Two things are worthy of note in the results. First, for the medium effect size path (X to Y), the power of PLS with regression significance testing (labeled PLS-NTT in the table) dominates the other approaches at N=50 and N=100. At a sample size of 150, this advantage seems to have disappeared and the power of PLS-NTT is generally similar to that of the other techniques. On the face of it, this is evidence that PLS with regression significance testing is a more efficacious technique (has more power) than the other techniques at small sample sizes. But see below.

Second, PLS-NTT also finds far more significant betas for the interaction term (for which there is no actual effect). The other techniques both find between 5% and 7% of these false positives, within the 95% confidence interval of .03 to .07 around the allowable rate of .05. PLS-NTT finds between 31% and 36% of these false positives for all three sample sizes. This is strong evidence that PLS-NTT is capitalizing on chance.

Our interpretation (developed more fully in Appendix D) is the following. PLS has more "levers" available to it to capitalize on chance than regression. Regression can only vary the beta coefficients, while PLS can vary both the beta coefficients and the indicator weights. This gives PLS a stronger ability to capitalize on any chance high correlations between a particular indicator and the dependent construct. Especially with small sample sizes, these chance high correlations often come about through one or a few outlier data points. Bootstrapping, because of the way it determines the standard error for significance testing, will react to such outliers with a larger standard error, since in the resamples sometimes the outlier data point will be included and sometimes it will not. However, PLS-NTT allows the PLS algorithm to capitalize on chance, and does not correct for this using bootstrapping. The result is an unacceptably high percentage of false positives with PLS-NTT.
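The .03 to .07 band quoted above is consistent with a normal-approximation interval around the nominal false positive rate, assuming (our reading) that it is based on the 500 datasets generated for each sample size:

```latex
\[
0.05 \pm 1.96\sqrt{\frac{0.05 \times 0.95}{500}}
  = 0.05 \pm 1.96 \times 0.00975
  \approx 0.05 \pm 0.019
  \;\Rightarrow\; [0.03,\ 0.07]
\]
```

False positive rates of 31% to 36% lie far outside this band, whereas rates of 5% to 7% fall within it.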

This suggests that the approach of using PLS to determine indicator weightings and then using those weightings and indicator scores as input to a regression analysis is not appropriate, at least without considerable further investigation. Further, no published work that we are aware of has advocated this approach. In addition, we note that Goodhue, Lewis and Thompson (2006) obtained similar results when they conducted analyses to test the impact of moving from the use of bootstrapping for PLS to the use of normal theory testing with PLS.

Table F-1: Power at Each Effect Size and Sample Size (Proportion of Statistically Significant Betas), Including PLS Analysis Followed by Regression Significance Testing

X to Y (true beta = .30): n= ; MR; PLS-B*; PLS-NTT**
Z to Y (true beta = .50): n= ; MR; PLS-B; PLS-NTT
Interaction (true beta = .00): n= ; MR; PLS-B; PLS-NTT

For medium effect size, PLS-NTT dominates the other techniques at small sample sizes, but is about the same at a sample size of 150. For strong effect size, the proportions that are statistically significant are roughly equal for all three techniques at all three sample sizes. When there is no actual effect, PLS-NTT finds false positives over 30% of the time, with N=50, 100 or 150. The other techniques (MR and PLS with bootstrapping) are in line with expectations.

* PLS-B: using bootstrapping (100 resamples) to assess statistical significance.
** PLS-NTT: using normal theory testing (employing weights from PLS to compute weighted scores for constructs, and then using regression in SAS to compute estimates for path coefficients and to calculate t-statistic values).

Appendix G. Normality and Kurtosis

Although the data that we (and Chin et al. 2003) generated for the main effect and dependent constructs (X, Z and Y) were normally distributed (N(0,1) by design), the interaction data, computed by multiplying two normally distributed values together, had zero skew but fairly high kurtosis. This is always the result of multiplying two N(0,1) values together, so kurtosis will generally be present in this type of interaction construct.

To see whether this had any impact on the PLS-PI and regression results, we selected three cells (N=20, 50 and 100 for the 4-indicator data) that we believed would be representative of the type of data normally used by IS researchers. We then transformed the interaction data (each of the PLS product indicators and the single regression interaction value) by taking the square root, which reduced the kurtosis to within the normally accepted range. We re-ran PLS and regression with the transformed data and compared the statistical power results to our original results (see Table G-1). At most, the power changed by .02 (e.g., at N= with the original data the power for regression was .50, and with the transformed data it was .48). These differences were negligible, suggesting that the non-normality of the interaction term did not affect the pattern of our results.

Table G-1: Power (Proportion statistically significant at p < .01) for 4 Indicators
Columns: Sample Size; Regression, Original Data; Regression, Transformed Data; PLS-PI, Original Data; PLS-PI, Transformed Data
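The shape of a product term is easy to check numerically. The following is a minimal sketch, assuming the two factors are independent N(0,1) draws (an assumption made here for simplicity). In that case the product has theoretical skewness 0 and excess kurtosis 6, since its fourth moment is E[X^4]E[Z^4] = 3 x 3 = 9 and its variance is 1. The function names, seed, and number of draws are illustrative.

```python
import random
from statistics import fmean

def skew_and_excess_kurtosis(values):
    """Moment-based sample skewness and excess kurtosis."""
    mean = fmean(values)
    dev = [v - mean for v in values]
    m2 = fmean([d ** 2 for d in dev])
    m3 = fmean([d ** 3 for d in dev])
    m4 = fmean([d ** 4 for d in dev])
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def product_term_shape(n_draws=200_000, seed=7):
    """Skewness and excess kurtosis of the product of two independent N(0,1) draws."""
    rng = random.Random(seed)
    products = [rng.gauss(0.0, 1.0) * rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    return skew_and_excess_kurtosis(products)

skew, excess_kurtosis = product_term_shape()
print(f"skewness: {skew:.2f}   excess kurtosis: {excess_kurtosis:.2f}")
```

The sample estimates fluctuate around the theoretical values; the kurtosis estimate in particular is noisy because the product distribution has heavy tails.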

Appendix H: Results using Significance Level of 0.05

Comparing the PLS-PI results in Table 3a (using α = 0.01) to those in Table H-1 below (using the more common α = 0.05 level), it is clear that relaxing the significance level does not change the pattern of results, although it does reduce the sample sizes needed to achieve power of .80 across the various conditions. The same is true for regression (Table H-2). In addition, the pattern of regression having an advantage over PLS-PI for most conditions (except at very large sample sizes) holds at α = 0.05 as well (Table H-3).

Table H-1: PLS-PI, Power at p < .05. Equal Loadings at .70; Bold = Power > .80
Columns: Sample Size; Number of Indicators for Main Effect Constructs (2 i, 4 i, 6 i, 8 i, 10 i, 12 i)

Table H-2: Regression, Power at p < .05. Equal Loadings at .70; Bold = Power > .80
Columns: Sample Size; Number of Indicators for Main Effect Constructs (2 i, 4 i, 6 i, 8 i, 10 i, 12 i)

Table H-3: Regression Advantage Over PLS-PI At p < .05
Columns: Sample Size; Number of Indicators for Main Effect Constructs (2 i, 4 i, 6 i, 8 i, 10 i, 12 i); Row Avg. Final row: Col. Avg
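Why a looser significance level lowers the sample size needed for power of .80 can be seen in a textbook approximation. The sketch below uses the Fisher z approximation for testing a single correlation; it is a generic illustration, not the Monte Carlo procedure behind Tables H-1 to H-3, and the correlation of .30 and the name `n_for_power` are hypothetical choices.

```python
from math import atanh, ceil
from statistics import NormalDist

def n_for_power(rho, alpha, power=0.80):
    """Approximate sample size needed to detect a correlation rho with a
    two-sided test at level alpha, using the Fisher z approximation."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = std_normal.inv_cdf(power)
    return ceil(((z_alpha + z_power) / atanh(rho)) ** 2 + 3)

# rho = .30 is an illustrative effect, not a value from the tables
n_at_01 = n_for_power(rho=0.30, alpha=0.01)
n_at_05 = n_for_power(rho=0.30, alpha=0.05)
print(f"n needed at alpha=.01: {n_at_01}   at alpha=.05: {n_at_05}")
```

Only the critical value changes between the two calls, so the relative ordering of techniques is unaffected while the required sample size drops.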

References For the Online Appendices

Barclay, D., Higgins, C. and Thompson, R. The Partial Least Squares (PLS) Approach to Causal Modeling: Personal Computer Adoption and Use as an Illustration, Technology Studies, 2(2), 1995, pp

Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.

Bollen, K. A., & Arminger, G. 1991. Observational residuals in factor analysis and structural equation models. Sociological Methodology

Carmines, E.G. and Zeller, R.A. Reliability and Validity Assessment, Sage Publications, Beverly Hills, CA.

Carte, T. and C. Russell. In Pursuit of Moderation: Nine Common Problems and Their Solutions. MIS Quarterly 27(3)

Chin, W.W. The Partial Least Squares Approach to Structural Equation Modeling, in G.A. Marcoulides (Ed.) Modern Methods for Business Research, London

Chin, W.W., Marcolin, B. and P. Newsted. 2003. A Partial Least Squares Latent Variable Modeling Approach for Measuring Interaction Effects: Results from a Monte Carlo Simulation Study and an Electronic-Mail Emotion/Adoption Study. Information Systems Research 14(2)

Cronbach, L.J. Coefficient Alpha and the Internal Structure of Tests. Psychometrika

Cohen, J. Statistical Power Analysis for the Behavioral Sciences, L. Erlbaum Associates, Hillsdale, NJ.

Goodhue, D., Lewis, W. and Thompson, R. 2006. Small Sample Size and Statistical Power in MIS Research. Proceedings of the 39th Hawaii International Conference on System Sciences, (CD), R. Sprague (Ed.) IEEE Computer Society Press, Los Alamitos, CA, (January 4-7)

Larsen, R.J. and Marx, M.L. An Introduction to Mathematical Statistics and Its Applications. Prentice-Hall, Inc. Englewood Cliffs, NJ.


More information

Descriptive Statistics

Descriptive Statistics Chapter 2 Descriptive Statistics 2-1 Overview 2-2 Summarizing Data 2-3 Pictures of Data 2-4 Measures of Central Tendency 2-5 Measures of Variation 2-6 Measures of Position 2-7 Exploratory Data Analysis

More information

Important Formulas. Discrete Probability Distributions. Probability and Counting Rules. The Normal Distribution. Confidence Intervals and Sample Size

Important Formulas. Discrete Probability Distributions. Probability and Counting Rules. The Normal Distribution. Confidence Intervals and Sample Size blu38582_if_1-8.qxd 9/27/10 9:19 PM Page 1 Important Formulas Chapter 3 Data Description Mean for individual data: Mean for grouped data: Standard deviation for a sample: X2 s X n 1 or Standard deviation

More information

Predicting Solutions to the Optimal Power Flow Problem

Predicting Solutions to the Optimal Power Flow Problem Thomas Navidi Suvrat Bhooshan Aditya Garg Abstract Predicting Solutions to the Optimal Power Flow Problem This paper discusses an implementation of gradient boosting regression to predict the output of

More information

The Degrees of Freedom of Partial Least Squares Regression

The Degrees of Freedom of Partial Least Squares Regression The Degrees of Freedom of Partial Least Squares Regression Dr. Nicole Krämer TU München 5th ESSEC-SUPELEC Research Workshop May 20, 2011 My talk is about...... the statistical analysis of Partial Least

More information

Busy Ant Maths and the Scottish Curriculum for Excellence Foundation Level - Primary 1

Busy Ant Maths and the Scottish Curriculum for Excellence Foundation Level - Primary 1 Busy Ant Maths and the Scottish Curriculum for Excellence Foundation Level - Primary 1 Number, money and measure Estimation and rounding Number and number processes Fractions, decimal fractions and percentages

More information

Optimization of Three-stage Electromagnetic Coil Launcher

Optimization of Three-stage Electromagnetic Coil Launcher Sensors & Transducers 2014 by IFSA Publishing, S. L. http://www.sensorsportal.com Optimization of Three-stage Electromagnetic Coil Launcher 1 Yujiao Zhang, 1 Weinan Qin, 2 Junpeng Liao, 3 Jiangjun Ruan,

More information

PLS score-loading correspondence and a bi-orthogonal factorization

PLS score-loading correspondence and a bi-orthogonal factorization PLS score-loading correspondence and a bi-orthogonal factorization Rolf Ergon elemark University College P.O.Box, N-9 Porsgrunn, Norway e-mail: rolf.ergon@hit.no telephone: ++ 7 7 telefax: ++ 7 7 Published

More information

The Value of Travel-Time: Estimates of the Hourly Value of Time for Vehicles in Oregon 2007

The Value of Travel-Time: Estimates of the Hourly Value of Time for Vehicles in Oregon 2007 The Value of Travel-Time: Estimates of the Hourly Value of Time for Vehicles in Oregon 2007 Oregon Department of Transportation Long Range Planning Unit June 2008 For questions contact: Denise Whitney

More information

Low Speed Rear End Crash Analysis

Low Speed Rear End Crash Analysis Low Speed Rear End Crash Analysis MARC1 Use in Test Data Analysis and Crash Reconstruction Rudy Limpert, Ph.D. Short Paper PCB2 2015 www.pcbrakeinc.com e mail: prosourc@xmission.com 1 1.0. Introduction

More information

Components of Hydronic Systems

Components of Hydronic Systems Valve and Actuator Manual 977 Hydronic System Basics Section Engineering Bulletin H111 Issue Date 0789 Components of Hydronic Systems The performance of a hydronic system depends upon many factors. Because

More information

ESTIMATING THE LIVES SAVED BY SAFETY BELTS AND AIR BAGS

ESTIMATING THE LIVES SAVED BY SAFETY BELTS AND AIR BAGS ESTIMATING THE LIVES SAVED BY SAFETY BELTS AND AIR BAGS Donna Glassbrenner National Center for Statistics and Analysis National Highway Traffic Safety Administration Washington DC 20590 Paper No. 500 ABSTRACT

More information

EMaSM. Principles Of Sensors & transducers

EMaSM. Principles Of Sensors & transducers EMaSM Principles Of Sensors & transducers Introduction: At the heart of measurement of common physical parameters such as force and pressure are sensors and transducers. These devices respond to the parameters

More information

Improving Analog Product knowledge using Principal Components Variable Clustering in JMP on test data.

Improving Analog Product knowledge using Principal Components Variable Clustering in JMP on test data. Improving Analog Product knowledge using Principal Components Variable Clustering in JMP on test data. Yves Chandon, Master BlackBelt at Freescale Semiconductor F e b 2 7. 2015 TM External Use We Touch

More information

The following output is from the Minitab general linear model analysis procedure.

The following output is from the Minitab general linear model analysis procedure. Chapter 13. Supplemental Text Material 13-1. The Staggered, Nested Design In Section 13-1.4 we introduced the staggered, nested design as a useful way to prevent the number of degrees of freedom from building

More information

Analyzing Crash Risk Using Automatic Traffic Recorder Speed Data

Analyzing Crash Risk Using Automatic Traffic Recorder Speed Data Analyzing Crash Risk Using Automatic Traffic Recorder Speed Data Thomas B. Stout Center for Transportation Research and Education Iowa State University 2901 S. Loop Drive Ames, IA 50010 stouttom@iastate.edu

More information

Open Discussion Topic: Potential Pitfalls in the Use of Coefficient of Variation as a Measure of Trial Validity

Open Discussion Topic: Potential Pitfalls in the Use of Coefficient of Variation as a Measure of Trial Validity Open Discussion Topic: Potential Pitfalls in the Use of Coefficient of Variation as a Measure of Trial Validity Calvin Trostle, Ph.D. Extension Agronomy Texas A&M AgriLife, Lubbock (806) 746-6101, ctrostle@ag.tamu.edu

More information

PREDICTION OF FUEL CONSUMPTION

PREDICTION OF FUEL CONSUMPTION PREDICTION OF FUEL CONSUMPTION OF AGRICULTURAL TRACTORS S. C. Kim, K. U. Kim, D. C. Kim ABSTRACT. A mathematical model was developed to predict fuel consumption of agricultural tractors using their official

More information

Assessing Feeder Hosting Capacity for Distributed Generation Integration

Assessing Feeder Hosting Capacity for Distributed Generation Integration 21, rue d Artois, F-75008 PARIS CIGRE US National Committee http : //www.cigre.org 2015 Grid of the Future Symposium Assessing Feeder Hosting Capacity for Distributed Generation Integration D. APOSTOLOPOULOU*,

More information

Large Electric Motor Reliability: What Did the Studies Really Say? Howard W Penrose, Ph.D., CMRP President, MotorDoc LLC

Large Electric Motor Reliability: What Did the Studies Really Say? Howard W Penrose, Ph.D., CMRP President, MotorDoc LLC Large Electric Motor Reliability: What Did the Studies Really Say? Howard W Penrose, Ph.D., CMRP President, MotorDoc LLC One of the most frequently quoted studies related to electric motor reliability

More information