Model Combination in Multiclass Classification

1 Model Combination in Multiclass Classification Sam Reid Advisors: Mike Mozer, Greg Grudic Department of Computer Science University of Colorado at Boulder USA April 5, 2010 Sam Reid Model Combination in Multiclass Classification 1/ 76

2 Multiclass Classification
From examples, make multiclass predictions on unseen data.
Applications include:
- Heartbeat arrhythmia monitoring
- Protein structure classification
- Handwritten digit recognition
- Part-of-speech tagging
- Vehicle identification
- Many others
Our approach: model combination.

3 Multiclass Classification: Example
Heartbeat Arrhythmia Monitoring Data Set (truncated)
Columns: age (yrs), gender, height (cm), weight (kg), BPM, QRS duration (ms), 274 other wave characteristics, class.
age  gender  class
75   m       Supraventricular Pre.
56   f       Sinus bradycardy
54   m       Right bundle block
55   m       normal
75   m       ? ... Ventricular Pre.
13   m       Left ventricule hyper.
40   f       normal
49   f       normal
44   m       normal
50   f       Right bundle block
     m       ?
45   f       ?

4 Model Combination
Combine multiclass classifiers (e.g. KNN, Decision Trees, Random Forests):
- Voting
- Averaging
- Linear
- Nonlinear
Combine binary classifiers (e.g. SVM, AdaBoost) to solve multiclass:
- One vs. All
- Pairwise Classification
- Error Correcting Output Coding

5 Outline Regularization in Linear Combinations of Multiclass Classifiers Model Discussion Our Method Sam Reid Model Combination in Multiclass Classification 5/ 76

7 Model Classifier Combination Goal: optimize predictions on test data Maintain diversity without sacrificing accuracy Train many classifiers with different algorithms/hyperparameters Combine with a linear combination function Ting & Witten, 1999 Seewald, 2002 Caruana et al., 2004 Sam Reid Model Combination in Multiclass Classification 7/ 76

8 Model Linear StackingC 1/2
Stacked generalization: predictions on validation data are the meta-training data.
Linear StackingC, class-conscious stacked generalization:
$$\hat{p}_j(\vec{x}) = \sum_{i=1}^{L} w_{ij}\, y_{ij}(\vec{x})$$
- $\hat{p}_j(\vec{x})$ is the predicted probability for class $c_j$
- $w_{ij}$ is the weight corresponding to classifier $y_i$ and class $c_j$
- $y_{ij}(\vec{x})$ is the $i$th classifier's output on class $c_j$
- $L$ is the number of base classifiers
Training set = classifier predictions on unseen data + labels.
Determine weights using linear regression.
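The StackingC combination rule can be illustrated with a minimal sketch. The function name `stackingc_predict` and the example numbers are hypothetical; the weights are assumed to have been fit already (for instance by the linear regression mentioned on the slide), and only the per-class weighted sum is shown.

```python
def stackingc_predict(weights, outputs):
    """Combine base classifier outputs with one weight per (classifier, class).

    weights[i][j] plays the role of w_ij and outputs[i][j] the role of
    y_ij(x) for classifier i and class j. Returns the combined per-class
    scores and the index of the highest-scoring class.
    """
    n_classes = len(outputs[0])
    p_hat = [
        sum(w_row[j] * y_row[j] for w_row, y_row in zip(weights, outputs))
        for j in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda j: p_hat[j])
    return p_hat, predicted


# Hypothetical example: 2 base classifiers, 3 classes.
weights = [[0.5, 1.0, 0.0], [0.5, 0.0, 1.0]]
outputs = [[0.2, 0.5, 0.3], [0.6, 0.1, 0.3]]
p_hat, predicted = stackingc_predict(weights, outputs)
```

Because each class has its own weight vector, a classifier can receive a large weight for one class and zero weight for another.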

9 Model Linear StackingC 2/2
[Figure: Linear StackingC diagram showing the input x, base classifier outputs y_1(x) and y_2(x), per-class combiners y_A, y_B, y_C, and the final prediction ŷ.]

11 Model Problems
Caruana et al., 2004: "Stacking [linear] performs poorly because regression overfits dramatically when there are 2000 highly correlated input models and only 1k points in the validation set."
How can we scale up stacking to a large number of classifiers?
Our hypothesis: a regularized linear combiner will
- reduce variance and prevent overfitting on indicator subproblems
- increase accuracy on the multiclass problem
Penalty terms in our studies:
- Ridge Regression: $L = \|y - X\beta\|^2 + \lambda \|\beta\|^2$
- Lasso Regression: $L = \|y - X\beta\|^2 + \lambda \|\beta\|_1$
- Elastic Net Regression: $L = \|y - X\beta\|^2 + (1 - \alpha) \|\beta\|^2 + \alpha \|\beta\|_1$
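The three penalized losses can be written out directly, which makes the difference between the penalties concrete. This is a sketch under the assumption that the formulas read as squared error plus penalty exactly as listed on the slide (the elastic net line is implemented without a separate λ, as it appears there); the helper names and example data are hypothetical.

```python
def squared_error(y, X, beta):
    """Sum of squared residuals ||y - X beta||^2 for list-based y, X, beta."""
    residuals = [
        yi - sum(xij * bj for xij, bj in zip(row, beta))
        for yi, row in zip(y, X)
    ]
    return sum(r * r for r in residuals)


def ridge_loss(y, X, beta, lam):
    # Squared L2 penalty shrinks all weights but keeps them dense.
    return squared_error(y, X, beta) + lam * sum(b * b for b in beta)


def lasso_loss(y, X, beta, lam):
    # L1 penalty encourages sparse weights.
    return squared_error(y, X, beta) + lam * sum(abs(b) for b in beta)


def elastic_net_loss(y, X, beta, alpha):
    # Mixes the two penalties; alpha = 1 is pure L1, alpha = 0 is pure L2.
    l2 = sum(b * b for b in beta)
    l1 = sum(abs(b) for b in beta)
    return squared_error(y, X, beta) + (1 - alpha) * l2 + alpha * l1


# Hypothetical example data.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
beta = [1.0, -2.0]
ridge = ridge_loss(y, X, beta, lam=0.5)
lasso = lasso_loss(y, X, beta, lam=0.5)
enet = elastic_net_loss(y, X, beta, alpha=0.5)
```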

12 Model Thesis Statement - Part I In linear combinations of multiclass classifiers, regularization significantly improves performance. Sam Reid Model Combination in Multiclass Classification 11/ 76

13 Model Multiclass Classification Data Sets
Table columns: Dataset, Attributes (numeric), Instances, Classes.
Data sets: balance-scale, glass, letter, mfeat-morphological, optdigits, sat-image, segment, vehicle, waveform, yeast.

14 Model Algorithms
About 1000 base classifiers for each problem:
1. Neural Network
2. Support Vector Machine (C-SVM from LibSVM)
3. K-Nearest Neighbor
4. Decision Stump
5. Decision Tree
6. AdaBoost.M1
7. Bagging classifier
8. Random Forest (Weka)
9. Random Forest (R)

15 Model Results: Average Accuracy
[Figure: average accuracy (%) for sg-linear, sg-lasso, vote, average, select-best, and sg-ridge.]

16 Model Statistical Analysis Ridge outperforms unregularized at p Validates hypothesis: regularization improves accuracy Ridge outperforms lasso at p Dense better than sparse Voting and averaging all models not competitive Sam Reid Model Combination in Multiclass Classification 15/ 76

17 Model Multiclass Accuracy Binary Accuracy 1/3
[Figure: RMSE vs. ridge parameter.]
Root mean squared error for the first (class-1) indicator subproblem in sat-image, over 10 folds of Dietterich's 5x2 CV.

18 Model Multiclass Accuracy Binary Accuracy 2/3
[Figure: accuracy vs. ridge parameter.]
Multiclass classification accuracy as a function of the regularization hyperparameter $\lambda_{ridge}$.

19 Model Multiclass Accuracy Binary Accuracy 3/3
[Figure: accuracy vs. RMSE on subproblem 1.]
Accuracy vs. RMSE on the first (class-1) indicator subproblem.
Multiclass Accuracy Binary Accuracy

20 Model Ridge More Effective than Lasso
[Figure: accuracy vs. penalty for alpha=0.95, alpha=0.5, alpha=0.05, and select-best.]
Overall accuracy on sat-image with various parameters for elastic-net.

21 Model Focus on Subproblems
- Choose from classifiers and predictions
- Allow classifiers to focus on subproblems
- Example: benefit from a classifier that predicts well-calibrated probabilities for class A but has B and C backwards
- This advantage is possible in multiclass classification but not binary classification, since $\sum_{i=1}^{k} p_i(\vec{x}) = 1$

22 Model Sparse Linear Combinations
[Figure: three panels of coefficient profiles vs. log lambda, one panel per class.]
Coefficient profiles for the first three subproblems in StackingC for the sat-image dataset with elastic net regression at $\alpha = 0.95$.

23 Model Selected Classifiers
Table columns: Classifier, red, cotton, grey, damp, veg, v.damp, total.
Table rows: adaboost, ann, ann, ann, ann, knn.
Weights (%) for the sat-image problem in elastic net StackingC with $\alpha = 0.95$ for the 6 models with highest total weights.

24 Model Conclusions & Future Work
Conclusions
- Regularization is essential in linear combinations of multiclass classifiers
- Dense combiners outperform sparse combiners
- One-weight-per-output (instead of one-weight-per-classifier) allows classifiers to specialize in subproblems
Future Work
- Bayesian treatment, Gaussian/Laplacian priors over weights
- Constrain coefficients to be positive
This work published as: Regularized Linear Models in Stacked Generalization, Sam Reid and Greg Grudic, Multiple Classifier Systems, 2009, Springer LNCS.

25 Discussion Outline Regularization in Linear Combinations of Multiclass Classifiers Model Discussion Our Method Sam Reid Model Combination in Multiclass Classification 24/ 76

26 Discussion Reducing Multiclass to Binary
- Some classifiers are designed for binary problems (e.g. SVM, AdaBoost)
- Transform the multiclass problem into a set of binary problems
- Combine the binary predictions to predict the multiclass label
Examples: A vs. B,C in one-vs-all; A vs. C in all-pairs.
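How the two reductions turn one multiclass label vector into binary subproblems can be sketched as follows. The function names and the toy labels are hypothetical, and the convention of treating the first class of each pair as the positive class is an assumption made for illustration.

```python
from itertools import combinations


def one_vs_all_problems(labels):
    """One binary label vector per class: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def all_pairs_problems(labels):
    """One binary problem per class pair, using only instances of those two classes.

    Each problem is a list of (instance_index, binary_label) tuples, where the
    first class of the pair is labeled 1 and the second is labeled 0.
    """
    classes = sorted(set(labels))
    problems = {}
    for a, b in combinations(classes, 2):
        problems[(a, b)] = [
            (idx, 1 if y == a else 0)
            for idx, y in enumerate(labels)
            if y in (a, b)
        ]
    return problems


# Hypothetical toy labels with three classes.
labels = ["A", "B", "C", "A", "C"]
ova = one_vs_all_problems(labels)
pairs = all_pairs_problems(labels)
```

With k classes this yields k one-vs-all problems over all instances and k(k-1)/2 all-pairs problems, each over a subset of the instances.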

30 Discussion Model Selection in Reducing Multiclass to Binary
No Model Selection
- Dietterich and Bakiri, 1995
- Allwein et al., 2000
Shared Hyperparameters
- Rifkin uses greedy 1-D hill climbing, with OVA + LBD, Rifkin & Klautau, 2004
- Model selection in LibSVM, Chang & Lin, 2001
Optimize Subproblems Independently
- Homogeneous, Friedman, 1996
- Heterogeneous, Szepannek et al.
Optimize the Joint Distribution
- Evolutionary search, de Souza et al., 2006; Lebrun et al., 2007

31 Discussion Shared Hyperparameters vs Independent Optimization
Shared Hyperparameters
- Optimizes to the target multiclass metric
- Increases bias and reduces variance for model selection
Independent Optimization
- Accommodates subproblems with different structure
- Improved subproblem performance leads to improved performance

32 Discussion Thesis Statement - Part II When solving a multiclass problem with a set of binary classifiers, it is more effective to constrain subproblems to use the same hyperparameters than to optimize each independently. Sam Reid Model Combination in Multiclass Classification 28/ 76

33 Discussion Multiclass Classification Data Sets 1/2
Table columns: dataset, classes, numeric, train, test, sampled-from.
Data sets: anneal, arrhythmia, authorship, autos, cars, collins, dj, ecoli, eucalyptus, halloffame.

34 Discussion Multiclass Classification Data Sets 2/2
Table columns: dataset, classes, numeric, train, test, sampled-from.
Data sets: hypothyroid, letter, mfeat-morphological, optdigits, page-blocks, segment, synthetic-control, vehicle, vowel, waveform.

35 Discussion Methods Reductions: {one-vs-all, all-pairs} {hamming, squared} Model selection: {shared, independent} Base classifier: LibSVM with 2-phase grid search Sam Reid Model Combination in Multiclass Classification 31/ 76

36 Discussion Shared vs Independent: Test Set Accuracy
[Figure: average accuracy (%) of shared vs. independent model selection for one-vs-all, all-pairs, one-vs-all-hamming, and all-pairs-squared, each annotated with a p-value.]

37 Discussion Subproblems are Similar - Vehicle, one-vs-all
[Figure: vehicle, one-vs-all; accuracy vs. log2(g) for subproblems 0 to 3.]
Independent model selection curves for one-vs-all on vehicle.

38 Discussion Subproblems are Similar - Vehicle, all-pairs
[Figure: vehicle, all-pairs; accuracy vs. log2(g) for subproblems 0 to 5.]
Independent model selection curves for all-pairs on vehicle.

39 Discussion Subproblems are Similar - Examples
[Figure: six panels of accuracy vs. log2(g) per subproblem, for cars, page-blocks, and letter under one-vs-all and all-pairs.]

40 Discussion Subproblems are Similar - Aggregate Results 1/3
Define:
- $\gamma_s$ = optimal shared hyperparameter
- $\gamma_i$ = optimal independent hyperparameter
Compute the accuracy difference $d = \bar{a}(\gamma_i) - \bar{a}(\gamma_s)$, where $\bar{a}$ indicates an average over subproblems.
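The accuracy difference d can be computed from a grid of per-subproblem accuracies. This sketch assumes that d is the average subproblem accuracy at each subproblem's own best hyperparameter minus the average at the shared choice; the function name, the grid layout, and the numbers are hypothetical, and the shared index is passed in because the slides select it on the multiclass metric.

```python
def average_subproblem_loss(acc, shared_index):
    """Average accuracy lost by forcing all subproblems to share one hyperparameter.

    acc[s][g] is the accuracy of subproblem s at hyperparameter grid index g.
    shared_index is the grid index chosen for all subproblems.
    """
    n = len(acc)
    independent = sum(max(row) for row in acc) / n       # each subproblem at its own optimum
    shared = sum(row[shared_index] for row in acc) / n   # all subproblems at the shared choice
    return independent - shared


# Hypothetical accuracies: 2 subproblems, 3 candidate hyperparameter values.
acc = [
    [0.80, 0.90, 0.85],
    [0.70, 0.88, 0.92],
]
d = average_subproblem_loss(acc, shared_index=1)
```

By construction d is never negative, and a small d means the subproblems have similar optima.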

41 Discussion Subproblems are Similar - Aggregate Results 2/3
[Figure: Average Subproblem Loss at Selected Optimum; average accuracy loss (%) per data set for one-vs-all.]
For each dataset $i$, $d_i < 0.80\%$.
Average $d = 0.30\%$.

42 Discussion Subproblems are Similar - Aggregate Results 3/3
[Figure: Average Subproblem Loss at Selected Optimum; average accuracy loss (%) per data set for all-pairs.]
Largest values: 36.6% (letter), 29.4% (dj).
Average $d = 4.24\%$.

43 Discussion Differing Subproblems Favor Independent Construct a synthetic problem with different shapes of decision boundaries Requires different hyperparameters Requires independent optimization First, a control experiment with only linear decision boundaries Sam Reid Model Combination in Multiclass Classification 39/ 76

44 Discussion Differing Subproblems Favor Independent - Linear Synthetic Data 1/2
[Figure: Linear Decision Boundaries with Varying Noise; axes x and y; classes Class_0, Class_1, Class_2.]

45 Discussion Differing Subproblems Favor Independent - Linear Synthetic Data 2/2
Accuracy (%) results for linear decision boundaries. Standard error over 10 random samplings is indicated in parentheses.
reduction            shared       independent
one-vs-all           66.7 (1.3)   66.1 (1.3)
one-vs-all-hamming   58.2 (2.5)   58.1 (1.9)
all-pairs            67.6 (1.3)   66.5 (1.9)

46 Discussion Differing Subproblems Favor Independent - Mixed Synthetic Data 1/2
[Figure: Linear and Nonlinear Decision Boundaries; axes x and y; classes A, B, C.]

47 Discussion Differing Subproblems Favor Independent - Mixed Synthetic Data 2/2
Accuracy (%) results for mixed linear and nonlinear decision boundaries. Standard error over 10 random samplings is indicated in parentheses.
reduction            shared       independent
one-vs-all           82.4 (0.6)   83.5 (0.9)
one-vs-all-hamming   78.5 (1.3)   79.5 (1.3)
all-pairs            82.4 (1.3)   84.2 (0.9)

48 Discussion Multiclass Accuracy Binary Accuracy + Noise
[Figure: arrhythmia under one-vs-all and all-pairs; panels show accuracy vs. log2(g) and multiclass accuracy (%) vs. average binary accuracy (%). Series: one-vs-all-shared, one-vs-all-sharedsub, one-vs-all-shared-oracle, all-pairs-shared, all-pairs-sharedsub, all-pairs-shared-oracle.]

49 Discussion Multiclass Accuracy Binary Accuracy + Noise: Anneal
[Figure: anneal under one-vs-all and all-pairs; panels show accuracy vs. log2(g) and multiclass accuracy (%) vs. average binary accuracy (%). Series: one-vs-all-shared, one-vs-all-sharedsub, one-vs-all-shared-oracle, all-pairs-shared, all-pairs-sharedsub, all-pairs-shared-oracle.]

50 Discussion Multiclass Accuracy Binary Accuracy + Noise: Aggregate
[Figure: R-squared value for one-vs-all, per data set.]
Average R-Squared Value: One-vs-all=0.791, All-pairs=0.910

51 Discussion Multiclass Metric Non-Essential
Hypothesis: the advantage of shared is due to selection on the target multiclass metric.
To test, implement shared-sub:
- Constrains models to be shared
- But selects based on average binary accuracy
Results comparing shared vs shared-sub:
- one-vs-all: p 0.65
- all-pairs: p 0.10
- ova-hamming: p 0.57
No statistically significant differences.
Conclusion: sharing hyperparameters is valuable whether you use the average binary or the multiclass metric.

52 Discussion Oracle Selection favors Shared To rule out sampling problems, use an oracle to select the optimal model Use oracle for both shared and independent one-vs-all all-pairs one-vs-all-hamming all-pairs-squared accuracy shared indep indep indep P-values from the Wilcoxon signed-ranks test are indicated by the winning strategy. For one-vs-all, shared still beats independent Independent wins for all-pairs and one-vs-all-hamming No difference for all-pairs-squared Sam Reid Model Combination in Multiclass Classification 48/ 76

53 Discussion Supplementary Result: Comparing Methods Average ranks of the 7 algorithms under study (omitted ova-ham-indep); algorithms not statistically significantly different from the top-scoring algorithm are connected to it with a vertical line. Sam Reid Model Combination in Multiclass Classification 49/ 76

54 Discussion Conclusions Shared hyperparameters often better than independent optimization Subproblems often similar, especially in one-vs-all If there are different decision boundary shapes, use independent Future Work Multiclass metrics with no binary analog in independent optimization? (e.g. multiclass cost matrix) Relationship to regret transform, Langford & Beygelzimer, 2005 Sam Reid Model Combination in Multiclass Classification 50/ 76

55 Our Method Outline Regularization in Linear Combinations of Multiclass Classifiers Model Discussion Our Method Sam Reid Model Combination in Multiclass Classification 51/ 76

56 Our Method Pairwise Classification
Assuming a classification problem with $k \geq 3$ classes:
- $k(k-1)/2$ subproblems, one for each pair of classes
- Estimate $\hat{\mu}_{ij}(\vec{x}) \approx \mu_{ij}(\vec{x}) = P(y = c_i \mid y = c_i \text{ or } c_j, \vec{x})$
- Note that $\mu_{ij} = \frac{p_i}{p_i + p_j}$
- Combine: $p = \{p_1, p_2, \ldots, p_k\} = f(\hat{\mu}_{ij}(\vec{x}))$
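The relation between class probabilities and pairwise probabilities can be checked numerically. This is a small sketch with a hypothetical function name and example vector; it computes the true pairwise quantities from known class probabilities, whereas in practice only estimates from trained pairwise classifiers are available.

```python
def pairwise_probabilities(p):
    """Return mu[i][j] = p_i / (p_i + p_j) for i != j, and None on the diagonal."""
    k = len(p)
    return [
        [p[i] / (p[i] + p[j]) if i != j else None for j in range(k)]
        for i in range(k)
    ]


# Hypothetical class probabilities for k = 3 classes.
p = [0.5, 0.3, 0.2]
mu = pairwise_probabilities(p)
n_pairs = len(p) * (len(p) - 1) // 2  # k(k-1)/2 pairwise subproblems
```

Note that mu[i][j] + mu[j][i] = 1 for every pair, so only k(k-1)/2 of the off-diagonal entries are independent.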

57 Our Method Pairwise Classification Subproblem Example Illustration of an A-C decision boundary in a 2D, 3-class example of pairwise classification. Sam Reid Model Combination in Multiclass Classification 53/ 76

60 Our Method Pairwise Classification Methods
Voted pairwise classification (VPC): Friedman, 1996
- $\hat{y}(\vec{x}) = \arg\max_i \sum_{j: j \neq i} 1(\hat{\mu}_{ij}(\vec{x}) > \hat{\mu}_{ji}(\vec{x}))$
- Equivalent to Bayes optimal prediction if $\hat{\mu}_{ij}(\vec{x}) = \mu_{ij}(\vec{x})$
Hastie & Tibshirani (HT), 1996
- Iteratively update $p = \{p_1, p_2, \ldots, p_k\}$
- Minimize the KL divergence between $\mu$ and $\hat{\mu}$: $l(p) = \sum_{i \neq j} n_{ij}\, \hat{\mu}_{ij} \log \frac{\hat{\mu}_{ij}}{\mu_{ij}}$
- Converges to the minimum of the KL divergence
Wu, Lin, Weng (WLW), 2004
- $\mu_{ij} = \frac{p_i}{p_i + p_j} \Rightarrow \frac{\mu_{ij}}{\mu_{ji}} = \frac{p_i}{p_j}$
- Approximation: $\min_p \sum_{i=1}^{k} \sum_{j \neq i} (\hat{\mu}_{ji}\, p_i - \hat{\mu}_{ij}\, p_j)^2$ s.t. $\sum_{i=1}^{k} p_i = 1$, $p_i \geq 0$
- Guaranteed convergence
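Of the three combiners, the voting rule is the simplest to write down. The sketch below assumes the pairwise estimates are stored in a k-by-k matrix with ignored diagonal entries; the function name and the example matrix are hypothetical, and ties are broken in favor of the lowest class index, which the slides do not specify.

```python
def voted_pairwise_classification(mu_hat):
    """Predict the class that wins the most pairwise comparisons.

    mu_hat[i][j] is the estimated probability of class i given that the
    true class is i or j. Diagonal entries are ignored.
    Returns (predicted_class_index, votes_per_class).
    """
    k = len(mu_hat)
    votes = [
        sum(1 for j in range(k) if j != i and mu_hat[i][j] > mu_hat[j][i])
        for i in range(k)
    ]
    predicted = max(range(k), key=lambda i: votes[i])  # first maximum wins ties
    return predicted, votes


# Hypothetical pairwise estimates for 3 classes.
mu_hat = [
    [0.0, 0.7, 0.4],
    [0.3, 0.0, 0.2],
    [0.6, 0.8, 0.0],
]
predicted, votes = voted_pairwise_classification(mu_hat)
```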

61 Our Method Pairwise Classification
Pros (Fürnkranz, 2002)
- Smaller subproblems
- Simpler subproblems
- Improved accuracy (disputed by Rifkin & Klautau, 2004)
Cons
- Larger number of subproblems than one-vs-all
- Each pairwise classifier is trained on only two of the classes but makes predictions for instances from any class (Hastie & Tibshirani, 1996; Cutzu, 2003), e.g. a classifier trained on $c_A$ and $c_B$ may have unpredictable behavior for instances with $y(\vec{x}) = c_C$

62 Our Method Thesis Statement - Part III When solving a multiclass problem with a set of pairwise binary classifiers, incorporation of the probability of membership in each pair improves performance. Sam Reid Model Combination in Multiclass Classification 56/ 76

63 Our Method : Derivation 1/2
Theorem of Total Probability:
$$p(b \mid \vec{x}) = \sum_{i=1}^{N} p(b \mid a_i, \vec{x})\, p(a_i \mid \vec{x}) \quad (1)$$
Assumes $a_1, \ldots, a_N$ are mutually exclusive and exhaustive, so $\sum_{i=1}^{N} p(a_i \mid \vec{x}) = 1$.
Let
- $b = c_i$
- $N = 2$
- $a_1 = c_i \cup c_j$
- $a_2 = L \setminus (c_i \cup c_j)$, for $L = \{c_1, \ldots, c_k\}$
Then
$$p(c_i \mid L, \vec{x}) = p(c_i \mid c_i \cup c_j, \vec{x})\, p(c_i \cup c_j \mid L, \vec{x}) + p(c_i \mid L \setminus (c_i \cup c_j), \vec{x})\, p(L \setminus (c_i \cup c_j) \mid L, \vec{x})$$

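64 Our Method : Derivation 2/2
But
$$p(c_i \mid L \setminus (c_i \cup c_j), \vec{x}) = 0 \quad (2)$$
so
$$p(c_i \mid L, \vec{x}) = p(c_i \mid c_i \cup c_j, \vec{x})\, p(c_i \cup c_j \mid L, \vec{x}) \quad (3)$$
Average over all $j \neq i$:
$$\hat{p}(c_i \mid L, \vec{x}) = \frac{1}{k-1} \sum_{j \neq i} \hat{p}(c_i \mid c_i \cup c_j, \vec{x})\, \hat{p}(c_i \cup c_j \mid L, \vec{x}) \quad (4)$$
Normalize so that $\sum_i \hat{p}(c_i \mid L, \vec{x}) = 1$.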
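Equation (4) and the normalization step translate into a few lines of code. This is a sketch rather than the author's implementation: the function name and matrix layout are hypothetical, and the inputs are assumed to come from already-trained pairwise and pair-vs-rest classifiers.

```python
def ppc_combine(pair, weight):
    """Combine pairwise and pair-vs-rest estimates into class probabilities.

    pair[i][j]   estimates p(c_i | c_i or c_j, x)
    weight[i][j] estimates p(c_i or c_j | L, x), symmetric in i and j
    Diagonal entries are ignored. The result is normalized to sum to 1.
    """
    k = len(pair)
    raw = [
        sum(pair[i][j] * weight[i][j] for j in range(k) if j != i) / (k - 1)
        for i in range(k)
    ]
    total = sum(raw)
    return [r / total for r in raw]


# Sanity check with exact inputs built from hypothetical true probabilities:
# the combination should then recover those probabilities.
p_true = [0.5, 0.3, 0.2]
k = len(p_true)
pair = [[p_true[i] / (p_true[i] + p_true[j]) if i != j else 0.0 for j in range(k)]
        for i in range(k)]
weight = [[p_true[i] + p_true[j] if i != j else 0.0 for j in range(k)]
          for i in range(k)]
p_hat = ppc_combine(pair, weight)
```

With exact inputs each product in the sum equals p_i, so the average over j returns p_i and the normalization leaves it unchanged.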

65 Our Method Comparison to Other Pairwise Classification Methods
PPC
- Solves for each term $p_i(\vec{x})$ independently
- Models $p_i + p_j = p(i \text{ or } j \mid L, \vec{x})$ directly
- Conceptually simpler
- Easier to implement
- Theoretically well motivated
Hastie-Tibshirani (HT) method
- Approximates $p_i = \sum_{j \neq i} \frac{2}{k(k-1)} \mu_{ij}$ (Wu et al., 2004)
- Equivalent to our method with the assumption $p_i + p_j = 2/k$

66 Our Method Computational Complexity
Computational complexity of one-vs-all (OVA), pairwise coupling (PC) and probabilistic pairwise classification (PPC):
                               OVA          PC                PPC
subproblems                    k            k(k-1)/2          k(k-1)
instances per subproblem       N            2N/k              N (half) + 2N/k (other half)
computational complexity/SVM   $O(kN^3)$    $O(k^{-1}N^3)$    $O(k^2N^3)$

69 Our Method
Base Classifiers
- Decision Tree (J48)
- K-Nearest Neighbor (KNN)
- Random Forests (RF-100)
- Support Vector Machines (SVM-121)
Multiclass Classification Methods
- Multi (for J48, KNN, RF-100)
- Voted Pairwise Classification (VPC)
- Hastie-Tibshirani (HT)
- Wu, Lin, Weng (WLW)
- Probabilistic Pairwise Classification (PPC)
Metrics
- Accuracy
- Brier: $1 - b(\vec{x}) = 1 - \frac{1}{d} \sum_j (t_j(\vec{x}) - \hat{p}_j(\vec{x}))^2$, where $t_j(\vec{x}) = 1(y(\vec{x}) = c_j)$
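The Brier metric can be sketched for a single instance as follows. The formula on the slide is partly garbled, so this reading is an assumption: the score is taken to be one minus the squared error between the one-hot target and the predicted probabilities, divided by a normalizer d. The function name is hypothetical, and defaulting d to the number of classes is also an assumption, so d is exposed as a parameter.

```python
def rectified_brier(p_hat, true_index, d=None):
    """One minus the normalized squared error between prediction and one-hot target.

    p_hat is the list of predicted class probabilities and true_index is the
    index of the true class. d is the normalizer; it defaults to the number
    of classes, which is an assumption rather than something the slides state.
    """
    if d is None:
        d = len(p_hat)
    squared_error = sum(
        ((1.0 if j == true_index else 0.0) - pj) ** 2
        for j, pj in enumerate(p_hat)
    )
    return 1.0 - squared_error / d


# Hypothetical prediction for a 3-class instance whose true class is index 0.
score = rectified_brier([0.7, 0.2, 0.1], true_index=0)
perfect = rectified_brier([1.0, 0.0, 0.0], true_index=0)
```

Under this reading higher is better, and a perfectly confident correct prediction scores 1.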

70 Our Method Average Accuracy
[Figure: accuracy (%) of multiclass, vpc, ht, wlw, and ppc for j48, knn, rf100, and svm121.]
Accuracy averaged over all 20 data sets.

71 Our Method Average Brier Score
[Figure: rectified Brier score (%) of multiclass, vpc, ht, wlw, and ppc for j48, knn, rf100, and svm121.]
Rectified Brier score averaged over all 20 data sets.

72 Our Method Average Ranks Sam Reid Model Combination in Multiclass Classification 64/ 76

73 Our Method Varying Base Classifier Accuracy
[Figure: Accuracy vs. Number of Trees, Averaged over 20 Data Sets; accuracy (%) vs. log_10(number of trees) for multi, vpc, ht, wlw, and ppc.]
Accuracy vs number of trees in random forest.

74 Our Method Learning Curves
[Figure: Learning Curves; accuracy (%) vs. number of data points for multi, vpc, ht, wlw, and ppc.]
Accuracy vs sample size for 10 largest data sets.

75 Our Method Duplicate Decision Boundaries Favors MULTI Hypothesis: Direct multiclass method will outperform PPC when decision boundaries are shared Construct a synthetic data set meant to favor multi-j48 Decision boundaries are shared Sam Reid Model Combination in Multiclass Classification 67/ 76

76 y Our Method Duplicate Decision Boundaries: Noiseless Synthetic Data 1.6 Noiseless Synthetic Data Set x A B C D multi-j48 ppc-j (0.08) 98.7 (0.10) Sam Reid Model Combination in Multiclass Classification 68/ 76

77 y Our Method Duplicate Decision Boundaries: Noisy Synthetic Data 2.00 Noisy Synthetic Data Set x A B C D multi-j48 ppc-j (0.34) 86.0 (0.31) Sam Reid Model Combination in Multiclass Classification 69/ 76

78 Our Method PPC More Accurate at Large Number of Classes 1/2
[Figure: Accuracy relative to random forest vs. # classes; relative accuracy (%) vs. number of classes for vpc, ht, wlw, and ppc.]
Method accuracy relative to RF-100.

79 Our Method PPC More Accurate at Large Number of Classes 2/2
[Figure: Accuracy for Discretized Regression Data Sets; relative accuracy (%) vs. number of classes for housing, autompg, meta, pbc, quake, sensory, strike, cholesterol, cleveland, and their average.]
PPC relative to RF-100 for discretized regression data sets.

80 Our Method Terms in PPC estimate equally important
Hypothesis: both terms in the PPC estimate are equally important.
$$\hat{p}(c_i \mid L, \vec{x}) = \frac{1}{k-1} \sum_{j \neq i} \hat{p}(c_i \mid c_i \cup c_j, \vec{x})\, \hat{p}(c_i \cup c_j \mid L, \vec{x})$$
- Pairwise term: $\hat{p}(c_i \mid c_i \cup c_j, \vec{x})$
- Weight (pair-vs-rest) term: $\hat{p}(c_i \cup c_j \mid L, \vec{x})$
Use J48 decision trees, 100 replications, 20 data sets.
Adjusted p-values under various degradations:
hypothesis               p Holm
both vs. none            2.25E-10
no-pair vs. none         6.87E-05
no-weight vs. none       7.49E-04
both vs. no-weight
both vs. no-pair
no-weight vs. no-pair

81 Our Method PPC Summary & Conclusions
Introduced a new pairwise classification algorithm, PPC
- Based on the Theorem of Total Probability
- Explicitly models $p(c_i \cup c_j \mid L, \vec{x})$
Outperforms or ties related methods
- For several base classifiers, metrics, data sets
- Some data sets benefit from direct multiclass methods
- PPC works well at large # classes
Future Work
- Faster but less accurate pair-vs-rest classifier?
- Independent vs. shared in PPC?

82 Our Method Thesis Statement Multiclass classification problems can be productively solved by combining multiple classifiers. Specifically: In linear combinations of multiclass classifiers, regularization significantly improves performance. When solving a multiclass problem with a set of binary classifiers, it is more effective to constrain subproblems to use the same hyperparameters than to optimize each independently. When solving a multiclass problem with a set of pairwise binary classifiers, incorporation of the probability of membership in each pair improves performance. Sam Reid Model Combination in Multiclass Classification 74/ 76

83 Our Method Acknowledgments PhET Interactive Simulations NSF Grants SBE Science of Learning Center (Garrison Cottrell, PI) BCS BCS SBE Mike Mozer, Greg Grudic Dissertation Support Group/CAPS Turing Institute UCI Repository Sam Reid Model Combination in Multiclass Classification 75/ 76

84 Our Method Questions? Questions? Sam Reid Model Combination in Multiclass Classification 76/ 76


More information

Project Summary Fuzzy Logic Control of Electric Motors and Motor Drives: Feasibility Study

Project Summary Fuzzy Logic Control of Electric Motors and Motor Drives: Feasibility Study EPA United States Air and Energy Engineering Environmental Protection Research Laboratory Agency Research Triangle Park, NC 277 Research and Development EPA/600/SR-95/75 April 996 Project Summary Fuzzy

More information

Example #1: One-Way Independent Groups Design. An example based on a study by Forster, Liberman and Friedman (2004) from the

Example #1: One-Way Independent Groups Design. An example based on a study by Forster, Liberman and Friedman (2004) from the Example #1: One-Way Independent Groups Design An example based on a study by Forster, Liberman and Friedman (2004) from the Journal of Personality and Social Psychology illustrates the SAS/IML program

More information

The Stochastic Energy Deployment Systems (SEDS) Model

The Stochastic Energy Deployment Systems (SEDS) Model The Stochastic Energy Deployment Systems (SEDS) Model Michael Leifman US Department of Energy, Office of Energy Efficiency and Renewable Energy Walter Short and Tom Ferguson National Renewable Energy Laboratory

More information

Optimal Policy for Plug-In Hybrid Electric Vehicles Adoption IAEE 2014

Optimal Policy for Plug-In Hybrid Electric Vehicles Adoption IAEE 2014 Optimal Policy for Plug-In Hybrid Electric Vehicles Adoption IAEE 2014 June 17, 2014 OUTLINE Problem Statement Methodology Results Conclusion & Future Work Motivation Consumers adoption of energy-efficient

More information

Cost-Efficiency by Arash Method in DEA

Cost-Efficiency by Arash Method in DEA Applied Mathematical Sciences, Vol. 6, 2012, no. 104, 5179-5184 Cost-Efficiency by Arash Method in DEA Dariush Khezrimotlagh*, Zahra Mohsenpour and Shaharuddin Salleh Department of Mathematics, Faculty

More information

Workshop on Frame Theory and Sparse Representation for Complex Data June 1, 2017

Workshop on Frame Theory and Sparse Representation for Complex Data June 1, 2017 Workshop on Frame Theory and Sparse Representation for Complex Data June 1, 2017 Xiaoming Huo Georgia Institute of Technology School of industrial and systems engineering I. Statistical Dependence II.

More information

Data envelopment analysis with missing values: an approach using neural network

Data envelopment analysis with missing values: an approach using neural network IJCSNS International Journal of Computer Science and Network Security, VOL.17 No.2, February 2017 29 Data envelopment analysis with missing values: an approach using neural network B. Dalvand, F. Hosseinzadeh

More information

On Using Storage and Genset for Mitigating Power Grid Failures

On Using Storage and Genset for Mitigating Power Grid Failures 1 / 27 On Using Storage and Genset for Mitigating Power Grid Failures Sahil Singla ISS4E lab University of Waterloo Collaborators: S. Keshav, Y. Ghiassi-Farrokhfal 1 / 27 Outline Introduction Background

More information

Inventory Routing for Bike Sharing Systems

Inventory Routing for Bike Sharing Systems Inventory Routing for Bike Sharing Systems mobil.tum 2016 Transforming Urban Mobility Technische Universität München, June 6-7, 2016 Jan Brinkmann, Marlin W. Ulmer, Dirk C. Mattfeld Agenda Motivation Problem

More information

Collective Traffic Prediction with Partially Observed Traffic History using Location-Based Social Media

Collective Traffic Prediction with Partially Observed Traffic History using Location-Based Social Media Collective Traffic Prediction with Partially Observed Traffic History using Location-Based Social Media Xinyue Liu, Xiangnan Kong, Yanhua Li Worcester Polytechnic Institute February 22, 2017 1 / 34 About

More information

What do autonomous vehicles mean to traffic congestion and crash? Network traffic flow modeling and simulation for autonomous vehicles

What do autonomous vehicles mean to traffic congestion and crash? Network traffic flow modeling and simulation for autonomous vehicles What do autonomous vehicles mean to traffic congestion and crash? Network traffic flow modeling and simulation for autonomous vehicles FINAL RESEARCH REPORT Sean Qian (PI), Shuguan Yang (RA) Contract No.

More information

Data Mining Approach for Quality Prediction and Improvement of Injection Molding Process

Data Mining Approach for Quality Prediction and Improvement of Injection Molding Process Data Mining Approach for Quality Prediction and Improvement of Injection Molding Process Dr. E.V.Ramana Professor, Department of Mechanical Engineering VNR Vignana Jyothi Institute of Engineering &Technology,

More information

Enhancing a Vehicle Re-Identification Methodology based on WIM Data to Minimize the Need for Ground Truth Data

Enhancing a Vehicle Re-Identification Methodology based on WIM Data to Minimize the Need for Ground Truth Data Enhancing a Vehicle Re-Identification Methodology based on WIM Data to Minimize the Need for Ground Truth Data Andrew P. Nichols, PhD, PE Director of ITS, Rahall Transportation Institute Associate Professor,

More information

Integrating remote sensing and ground monitoring data to improve estimation of PM 2.5 concentrations for chronic health studies

Integrating remote sensing and ground monitoring data to improve estimation of PM 2.5 concentrations for chronic health studies Integrating remote sensing and ground monitoring data to improve estimation of PM 2.5 concentrations for chronic health studies Chris Paciorek and Yang Liu Departments of Biostatistics and Environmental

More information

United Power Flow Algorithm for Transmission-Distribution joint system with Distributed Generations

United Power Flow Algorithm for Transmission-Distribution joint system with Distributed Generations rd International Conference on Mechatronics and Industrial Informatics (ICMII 20) United Power Flow Algorithm for Transmission-Distribution joint system with Distributed Generations Yirong Su, a, Xingyue

More information

TRINITY COLLEGE DUBLIN THE UNIVERSITY OF DUBLIN. Faculty of Engineering, Mathematics and Science. School of Computer Science and Statistics

TRINITY COLLEGE DUBLIN THE UNIVERSITY OF DUBLIN. Faculty of Engineering, Mathematics and Science. School of Computer Science and Statistics ST7003-1 TRINITY COLLEGE DUBLIN THE UNIVERSITY OF DUBLIN Faculty of Engineering, Mathematics and Science School of Computer Science and Statistics Postgraduate Certificate in Statistics Hilary Term 2015

More information

Robust alternatives to best linear unbiased prediction of complex traits

Robust alternatives to best linear unbiased prediction of complex traits Robust alternatives to best linear unbiased prediction of complex traits WHY BEST LINEAR UNBIASED PREDICTION EASY TO EXPLAIN FLEXIBLE AMENDABLE WELL UNDERSTOOD FEASIBLE UNPRETENTIOUS NORMALITY IS IMPLICIT

More information

Rule-based Integration of Multiple Neural Networks Evolved Based on Cellular Automata

Rule-based Integration of Multiple Neural Networks Evolved Based on Cellular Automata 1 Robotics Rule-based Integration of Multiple Neural Networks Evolved Based on Cellular Automata 2 Motivation Construction of mobile robot controller Evolving neural networks using genetic algorithm (Floreano,

More information

Approach for determining WLTPbased targets for the EU CO 2 Regulation for Light Duty Vehicles

Approach for determining WLTPbased targets for the EU CO 2 Regulation for Light Duty Vehicles Approach for determining WLTPbased targets for the EU CO 2 Regulation for Light Duty Vehicles Brussels, 17 May 2013 richard.smokers@tno.nl norbert.ligterink@tno.nl alessandro.marotta@jrc.ec.europa.eu Summary

More information

Calibration. DOE & Statistical Modeling

Calibration. DOE & Statistical Modeling ETAS Webinar - ASCMO Calibration. DOE & Statistical Modeling Injection Consumption Ignition Torque AFR HC EGR P-rail NOx Inlet-cam Outlet-cam 1 1 Soot T-exhaust Roughness What is Design of Experiments?

More information

Technical Papers supporting SAP 2009

Technical Papers supporting SAP 2009 Technical Papers supporting SAP 29 A meta-analysis of boiler test efficiencies to compare independent and manufacturers results Reference no. STP9/B5 Date last amended 25 March 29 Date originated 6 October

More information

Investigating the Concordance Relationship Between the HSA Cut Scores and the PARCC Cut Scores Using the 2016 PARCC Test Data

Investigating the Concordance Relationship Between the HSA Cut Scores and the PARCC Cut Scores Using the 2016 PARCC Test Data Investigating the Concordance Relationship Between the HSA Cut Scores and the PARCC Cut Scores Using the 2016 PARCC Test Data A Research Report Submitted to the Maryland State Department of Education (MSDE)

More information

The MathWorks Crossover to Model-Based Design

The MathWorks Crossover to Model-Based Design The MathWorks Crossover to Model-Based Design The Ohio State University Kerem Koprubasi, Ph.D. Candidate Mechanical Engineering The 2008 Challenge X Competition Benefits of MathWorks Tools Model-based

More information

Accelerated Testing of Advanced Battery Technologies in PHEV Applications

Accelerated Testing of Advanced Battery Technologies in PHEV Applications Page 0171 Accelerated Testing of Advanced Battery Technologies in PHEV Applications Loïc Gaillac* EPRI and DaimlerChrysler developed a Plug-in Hybrid Electric Vehicle (PHEV) using the Sprinter Van to reduce

More information

Modeling and Optimization of a Linear Electromagnetic Piston Pump

Modeling and Optimization of a Linear Electromagnetic Piston Pump Fluid Power Innovation & Research Conference Minneapolis, MN October 10 12, 2016 ing and Optimization of a Linear Electromagnetic Piston Pump Paul Hogan, MS Student Mechanical Engineering, University of

More information

Using Statistics To Make Inferences 6. Wilcoxon Matched Pairs Signed Ranks Test. Wilcoxon Rank Sum Test/ Mann-Whitney Test

Using Statistics To Make Inferences 6. Wilcoxon Matched Pairs Signed Ranks Test. Wilcoxon Rank Sum Test/ Mann-Whitney Test Using Statistics To Make Inferences 6 Summary Non-parametric tests Wilcoxon Signed Ranks Test Wilcoxon Matched Pairs Signed Ranks Test Wilcoxon Rank Sum Test/ Mann-Whitney Test Goals Perform and interpret

More information

Real-time Bus Tracking using CrowdSourcing

Real-time Bus Tracking using CrowdSourcing Real-time Bus Tracking using CrowdSourcing R & D Project Report Submitted in partial fulfillment of the requirements for the degree of Master of Technology by Deepali Mittal 153050016 under the guidance

More information

Simulated Annealing Algorithm for Customer-Centric Location Routing Problem

Simulated Annealing Algorithm for Customer-Centric Location Routing Problem Simulated Annealing Algorithm for Customer-Centric Location Routing Problem May 22, 2018 Eugene Sohn Advisor: Mohammad Moshref-Javadi, PhD 1 Agenda Why this research? What is this research? Methodology

More information

MODELING SUSPENSION DAMPER MODULES USING LS-DYNA

MODELING SUSPENSION DAMPER MODULES USING LS-DYNA MODELING SUSPENSION DAMPER MODULES USING LS-DYNA Jason J. Tao Delphi Automotive Systems Energy & Chassis Systems Division 435 Cincinnati Street Dayton, OH 4548 Telephone: (937) 455-6298 E-mail: Jason.J.Tao@Delphiauto.com

More information

The Session.. Rosaria Silipo Phil Winters KNIME KNIME.com AG. All Right Reserved.

The Session.. Rosaria Silipo Phil Winters KNIME KNIME.com AG. All Right Reserved. The Session.. Rosaria Silipo Phil Winters KNIME 2016 KNIME.com AG. All Right Reserved. Past KNIME Summits: Merging Techniques, Data and MUSIC! 2016 KNIME.com AG. All Rights Reserved. 2 Analytics, Machine

More information

What s new. Bernd Wiswedel KNIME.com AG. All Rights Reserved.

What s new. Bernd Wiswedel KNIME.com AG. All Rights Reserved. What s new Bernd Wiswedel 2016 KNIME.com AG. All Rights Reserved. What s new 2+1 feature releases last year: 2.12, (3.0), 3.1 (only KNIME Analytics Platform + Server) Changes documented online 2016 KNIME.com

More information

PREDICTION OF FUEL CONSUMPTION

PREDICTION OF FUEL CONSUMPTION PREDICTION OF FUEL CONSUMPTION OF AGRICULTURAL TRACTORS S. C. Kim, K. U. Kim, D. C. Kim ABSTRACT. A mathematical model was developed to predict fuel consumption of agricultural tractors using their official

More information

Optimal Decentralized Protocol for Electrical Vehicle Charging. Presented by: Ran Zhang Supervisor: Prof. Sherman(Xuemin) Shen, Prof.

Optimal Decentralized Protocol for Electrical Vehicle Charging. Presented by: Ran Zhang Supervisor: Prof. Sherman(Xuemin) Shen, Prof. Optimal Decentralized Protocol for Electrical Vehicle Charging Presented by: Ran Zhang Supervisor: Prof. Sherman(Xuemin) Shen, Prof. Liang-liang Xie Main Reference Lingwen Gan, Ufuk Topcu, and Steven Low,

More information

Improving CERs building

Improving CERs building Improving CERs building Getting Rid of the R² tyranny Pierre Foussier pmf@3f fr.com ISPA. San Diego. June 2010 1 Why abandon the OLS? The ordinary least squares (OLS) aims to build a CER by minimizing

More information

Getting Started with Correlated Component Regression (CCR) in XLSTAT-CCR

Getting Started with Correlated Component Regression (CCR) in XLSTAT-CCR Tutorial 1 Getting Started with Correlated Component Regression (CCR) in XLSTAT-CCR Dataset for running Correlated Component Regression This tutorial 1 is based on data provided by Michel Tenenhaus and

More information

Linking the Virginia SOL Assessments to NWEA MAP Growth Tests *

Linking the Virginia SOL Assessments to NWEA MAP Growth Tests * Linking the Virginia SOL Assessments to NWEA MAP Growth Tests * *As of June 2017 Measures of Academic Progress (MAP ) is known as MAP Growth. March 2016 Introduction Northwest Evaluation Association (NWEA

More information

Online Learning and Optimization for Smart Power Grid

Online Learning and Optimization for Smart Power Grid 1 2016 IEEE PES General Meeting Panel on Domain-Specific Big Data Analytics Tools in Power Systems Online Learning and Optimization for Smart Power Grid Seung-Jun Kim Department of Computer Sci. and Electrical

More information

Vehicle Scrappage and Gasoline Policy. Online Appendix. Alternative First Stage and Reduced Form Specifications

Vehicle Scrappage and Gasoline Policy. Online Appendix. Alternative First Stage and Reduced Form Specifications Vehicle Scrappage and Gasoline Policy By Mark R. Jacobsen and Arthur A. van Benthem Online Appendix Appendix A Alternative First Stage and Reduced Form Specifications Reduced Form Using MPG Quartiles The

More information

Operations Research & Advanced Analytics 2015 INFORMS Conference on Business Analytics & Operations Research

Operations Research & Advanced Analytics 2015 INFORMS Conference on Business Analytics & Operations Research Simulation Approach for Aircraft Spare Engines & Engine Parts Planning Operations Research & Advanced Analytics 2015 INFORMS Conference on Business Analytics & Operations Research 1 Outline Background

More information

MIT ICAT M I T I n t e r n a t i o n a l C e n t e r f o r A i r T r a n s p o r t a t i o n

MIT ICAT M I T I n t e r n a t i o n a l C e n t e r f o r A i r T r a n s p o r t a t i o n M I T I n t e r n a t i o n a l C e n t e r f o r A i r T r a n s p o r t a t i o n Standard Flow Abstractions as Mechanisms for Reducing ATC Complexity Jonathan Histon May 11, 2004 Introduction Research

More information

Automatic Optimization of Wayfinding Design Supplementary Material

Automatic Optimization of Wayfinding Design Supplementary Material TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL.??, NO.??,???? 1 Automatic Optimization of Wayfinding Design Supplementary Material 1 ADDITIONAL EXAMPLES We use our approach to generate wayfinding

More information

Leveraging AI for Self-Driving Cars at GM. Efrat Rosenman, Ph.D. Head of Cognitive Driving Group General Motors Advanced Technical Center, Israel

Leveraging AI for Self-Driving Cars at GM. Efrat Rosenman, Ph.D. Head of Cognitive Driving Group General Motors Advanced Technical Center, Israel Leveraging AI for Self-Driving Cars at GM Efrat Rosenman, Ph.D. Head of Cognitive Driving Group General Motors Advanced Technical Center, Israel Agenda The vision From ADAS (Advance Driving Assistance

More information

CITY OF EDMONTON COMMERCIAL VEHICLE MODEL UPDATE USING A ROADSIDE TRUCK SURVEY

CITY OF EDMONTON COMMERCIAL VEHICLE MODEL UPDATE USING A ROADSIDE TRUCK SURVEY CITY OF EDMONTON COMMERCIAL VEHICLE MODEL UPDATE USING A ROADSIDE TRUCK SURVEY Matthew J. Roorda, University of Toronto Nico Malfara, University of Toronto Introduction The movement of goods and services

More information

ME scope Application Note 25 Choosing Response DOFs for a Modal Test

ME scope Application Note 25 Choosing Response DOFs for a Modal Test ME scope Application Note 25 Choosing Response DOFs for a Modal Test The steps in this Application Note can be duplicated using any ME'scope Package that includes the VES-3600 Advanced Signal Processing

More information

How to: Test & Evaluate Motors in Your Application

How to: Test & Evaluate Motors in Your Application How to: Test & Evaluate Motors in Your Application Table of Contents 1 INTRODUCTION... 1 2 UNDERSTANDING THE APPLICATION INPUT... 1 2.1 Input Power... 2 2.2 Load & Speed... 3 2.2.1 Starting Torque... 3

More information

Linking the New York State NYSTP Assessments to NWEA MAP Growth Tests *

Linking the New York State NYSTP Assessments to NWEA MAP Growth Tests * Linking the New York State NYSTP Assessments to NWEA MAP Growth Tests * *As of June 2017 Measures of Academic Progress (MAP ) is known as MAP Growth. March 2016 Introduction Northwest Evaluation Association

More information

Optimization of Three-stage Electromagnetic Coil Launcher

Optimization of Three-stage Electromagnetic Coil Launcher Sensors & Transducers 2014 by IFSA Publishing, S. L. http://www.sensorsportal.com Optimization of Three-stage Electromagnetic Coil Launcher 1 Yujiao Zhang, 1 Weinan Qin, 2 Junpeng Liao, 3 Jiangjun Ruan,

More information

Implementing Dynamic Retail Electricity Prices

Implementing Dynamic Retail Electricity Prices Implementing Dynamic Retail Electricity Prices Quantify the Benefits of Demand-Side Energy Management Controllers Jingjie Xiao, Andrew L. Liu School of Industrial Engineering, Purdue University West Lafayette,

More information

Sharif University of Technology. Graduate School of Management and Economics. Econometrics I. Fall Seyed Mahdi Barakchian

Sharif University of Technology. Graduate School of Management and Economics. Econometrics I. Fall Seyed Mahdi Barakchian Sharif University of Technology Graduate School of Management and Economics Econometrics I Fall 2010 Seyed Mahdi Barakchian Textbook: Wooldridge, J., Introductory Econometrics: A Modern Approach, South

More information

PROACTIVE PRODUCT SERVICING

PROACTIVE PRODUCT SERVICING 1 PROACTIVE PRODUCT SERVICING Necip Doganaksoy, GE Global Research Gerry Hahn, GE Global Research, Retired Bill Meeker, Iowa State University 2009 QUALITY & PRODUCTIVITY RESEARCH CONFERENCE 2 STATISTICALLY

More information

Queuing Models to Analyze Electric Vehicle Usage Patterns

Queuing Models to Analyze Electric Vehicle Usage Patterns Queuing Models to Analyze Electric Vehicle Usage Patterns Ken Lau Data Scientist Alberta Gaming and Liquor Commission About Me Completed Master s in Statistics at University of British Columbia (2015)

More information

The State of Charge Estimation of Power Lithium Battery Based on RBF Neural Network Optimized by Particle Swarm Optimization

The State of Charge Estimation of Power Lithium Battery Based on RBF Neural Network Optimized by Particle Swarm Optimization Journal of Applied Science and Engineering, Vol. 20, No. 4, pp. 483 490 (2017) DOI: 10.6180/jase.2017.20.4.10 The State of Charge Estimation of Power Lithium Battery Based on RBF Neural Network Optimized

More information

Linking the Georgia Milestones Assessments to NWEA MAP Growth Tests *

Linking the Georgia Milestones Assessments to NWEA MAP Growth Tests * Linking the Georgia Milestones Assessments to NWEA MAP Growth Tests * *As of June 2017 Measures of Academic Progress (MAP ) is known as MAP Growth. February 2016 Introduction Northwest Evaluation Association

More information

Supporting Information

Supporting Information 1 Supporting Information 2 3 4 5 6 7 8 9 10 11 12 Daily estimation of ground-level PM 2.5 concentrations over Beijing using 3 km resolution MODIS AOD Yuanyu Xie 1, Yuxuan Wang* 1,2,3, Kai Zhang 4, Wenhao

More information

Modeling Strategies for Design and Control of Charging Stations

Modeling Strategies for Design and Control of Charging Stations Modeling Strategies for Design and Control of Charging Stations George Michailidis U of Michigan www.stat.lsa.umich.edu/ gmichail NSF Workshop, 11/15/2013 Michailidis EVs and Charging Stations NSF Workshop,

More information

Integrated Operations Knut Hovda UiO, May 20th 2011 ABB Industry Examples Calculations and engineering software. ABB Group June 17, 2011 Slide 1

Integrated Operations Knut Hovda UiO, May 20th 2011 ABB Industry Examples Calculations and engineering software. ABB Group June 17, 2011 Slide 1 Integrated Operations Knut Hovda UiO, May 20th 2011 ABB Industry Examples Calculations and engineering software ABB Group June 17, 2011 Slide 1 Contents About the speaker Introduction to ABB Oil, Gas &

More information

Linking the North Carolina EOG Assessments to NWEA MAP Growth Tests *

Linking the North Carolina EOG Assessments to NWEA MAP Growth Tests * Linking the North Carolina EOG Assessments to NWEA MAP Growth Tests * *As of June 2017 Measures of Academic Progress (MAP ) is known as MAP Growth. March 2016 Introduction Northwest Evaluation Association

More information

Linking the Kansas KAP Assessments to NWEA MAP Growth Tests *

Linking the Kansas KAP Assessments to NWEA MAP Growth Tests * Linking the Kansas KAP Assessments to NWEA MAP Growth Tests * *As of June 2017 Measures of Academic Progress (MAP ) is known as MAP Growth. February 2016 Introduction Northwest Evaluation Association (NWEA

More information

Level of Service Classification for Urban Heterogeneous Traffic: A Case Study of Kanapur Metropolis

Level of Service Classification for Urban Heterogeneous Traffic: A Case Study of Kanapur Metropolis Level of Service Classification for Urban Heterogeneous Traffic: A Case Study of Kanapur Metropolis B.R. MARWAH Professor, Department of Civil Engineering, I.I.T. Kanpur BHUVANESH SINGH Professional Research

More information

Prediction Model of Driving Behavior Based on Traffic Conditions and Driver Types

Prediction Model of Driving Behavior Based on Traffic Conditions and Driver Types Proceedings of the 12th International IEEE Conference on Intelligent Transportation Systems, St. Louis, MO, USA, October 3-7, 29 WeAT4.2 Prediction Model of Driving Behavior Based on Traffic Conditions

More information

Online Appendix for Subways, Strikes, and Slowdowns: The Impacts of Public Transit on Traffic Congestion

Online Appendix for Subways, Strikes, and Slowdowns: The Impacts of Public Transit on Traffic Congestion Online Appendix for Subways, Strikes, and Slowdowns: The Impacts of Public Transit on Traffic Congestion ByMICHAELL.ANDERSON AI. Mathematical Appendix Distance to nearest bus line: Suppose that bus lines

More information

Linking the Alaska AMP Assessments to NWEA MAP Tests

Linking the Alaska AMP Assessments to NWEA MAP Tests Linking the Alaska AMP Assessments to NWEA MAP Tests February 2016 Introduction Northwest Evaluation Association (NWEA ) is committed to providing partners with useful tools to help make inferences from

More information

Rolling resistance as a part of total resistance plays a

Rolling resistance as a part of total resistance plays a Rolling resistance plays a critical role in fuel consumption of mining haul trucks A. Soofastaei, L. Adair, S.M. Aminossadati, M.S. Kizil and P. Knights Mining3, The University of Queensland Australia.

More information

A Viewpoint on the Decoding of the Quadratic Residue Code of Length 89

A Viewpoint on the Decoding of the Quadratic Residue Code of Length 89 International Journal of Networks and Communications 2012, 2(1): 11-16 DOI: 10.5923/j.ijnc.20120201.02 A Viewpoint on the Decoding of the Quadratic Residue Code of Length 89 Hung-Peng Lee Department of

More information

Does V50 Depend on Armor Mass?

Does V50 Depend on Armor Mass? REPORT DOCUMENTATION PAGE Form Approved OMB No. 0704-088 Public reporting burden for this collection of information is estimated to average hour per response, including the time for reviewing instructions,

More information

LOCAL VERSUS CENTRALIZED CHARGING STRATEGIES FOR ELECTRIC VEHICLES IN LOW VOLTAGE DISTRIBUTION SYSTEMS

LOCAL VERSUS CENTRALIZED CHARGING STRATEGIES FOR ELECTRIC VEHICLES IN LOW VOLTAGE DISTRIBUTION SYSTEMS LOCAL VERSUS CENTRALIZED CHARGING STRATEGIES FOR ELECTRIC VEHICLES IN LOW VOLTAGE DISTRIBUTION SYSTEMS Presented by: Amit Kumar Tamang, PhD Student Smart Grid Research Group-BBCR aktamang@uwaterloo.ca

More information

MARINE FOUR-STROKE DIESEL ENGINE CRANKSHAFT MAIN BEARING OIL FILM LUBRICATION CHARACTERISTIC ANALYSIS

MARINE FOUR-STROKE DIESEL ENGINE CRANKSHAFT MAIN BEARING OIL FILM LUBRICATION CHARACTERISTIC ANALYSIS POLISH MARITIME RESEARCH Special Issue 2018 S2 (98) 2018 Vol. 25; pp. 30-34 10.2478/pomr-2018-0070 MARINE FOUR-STROKE DIESEL ENGINE CRANKSHAFT MAIN BEARING OIL FILM LUBRICATION CHARACTERISTIC ANALYSIS

More information

Building Fast and Accurate Powertrain Models for System and Control Development

Building Fast and Accurate Powertrain Models for System and Control Development Building Fast and Accurate Powertrain Models for System and Control Development Prasanna Deshpande 2015 The MathWorks, Inc. 1 Challenges for the Powertrain Engineering Teams How to design and test vehicle

More information

SOME ISSUES OF THE CRITICAL RATIO DISPATCH RULE IN SEMICONDUCTOR MANUFACTURING. Oliver Rose

SOME ISSUES OF THE CRITICAL RATIO DISPATCH RULE IN SEMICONDUCTOR MANUFACTURING. Oliver Rose Proceedings of the 22 Winter Simulation Conference E. Yücesan, C.-H. Chen, J. L. Snowdon, and J. M. Charnes, eds. SOME ISSUES OF THE CRITICAL RATIO DISPATCH RULE IN SEMICONDUCTOR MANUFACTURING Oliver Rose

More information

Oil Palm Ripeness Detector (OPRID) and Non-Destructive Thermal Method of Palm Oil Quality Estimation

Oil Palm Ripeness Detector (OPRID) and Non-Destructive Thermal Method of Palm Oil Quality Estimation Oil Palm Ripeness Detector (OPRID) and Non-Destructive Thermal Method of Palm Oil Quality Estimation Abdul Rashid Mohamed Shariff, Shahrzad Zolfagharnassab, Alhadi Aiad H. Ben Dayaf, Goh Jia Quan, Adel

More information

Effect of Sample Size and Method of Sampling Pig Weights on the Accuracy of Estimating the Mean Weight of the Population 1

Effect of Sample Size and Method of Sampling Pig Weights on the Accuracy of Estimating the Mean Weight of the Population 1 Effect of Sample Size and Method of Sampling Pig Weights on the Accuracy of Estimating the Mean Weight of the Population C. B. Paulk, G. L. Highland 2, M. D. Tokach, J. L. Nelssen, S. S. Dritz 3, R. D.

More information

Meeting product specifications

Meeting product specifications Optimisation of a diesel hydrotreating unit A model based on operating data is used to meet sulphur product specifications at lower DHT reactor temperatures with longer catalyst life Jose Bird Valero Energy

More information

INTRODUCTION. I.1 - Historical review.

INTRODUCTION. I.1 - Historical review. INTRODUCTION. I.1 - Historical review. The history of electrical motors goes back as far as 1820, when Hans Christian Oersted discovered the magnetic effect of an electric current. One year later, Michael

More information

Deliverables. Genetic Algorithms- Basics. Characteristics of GAs. Switch Board Example. Genetic Operators. Schemata

Deliverables. Genetic Algorithms- Basics. Characteristics of GAs. Switch Board Example. Genetic Operators. Schemata Genetic Algorithms Deliverables Genetic Algorithms- Basics Characteristics of GAs Switch Board Example Genetic Operators Schemata 6/12/2012 1:31 PM copyright @ gdeepak.com 2 Genetic Algorithms-Basics Search

More information

Risk-Based Collision Avoidance in Semi-Autonomous Vehicles

Risk-Based Collision Avoidance in Semi-Autonomous Vehicles Independent Work Report Spring, 2016 Risk-Based Collision Avoidance in Semi-Autonomous Vehicles Christopher Hay 17 Adviser: Thomas Funkhouser Abstract Although there have been a number of advances in active

More information

Topic 5 Lecture 3 Estimating Policy Effects via the Simple Linear. Regression Model (SLRM) and the Ordinary Least Squares (OLS) Method

Topic 5 Lecture 3 Estimating Policy Effects via the Simple Linear. Regression Model (SLRM) and the Ordinary Least Squares (OLS) Method Econometrics for Health Policy, Health Economics, and Outcomes Research Topic 5 Lecture 3 Estimating Policy Effects via the Simple Linear Regression Model (SLRM) and the Ordinary Least Squares (OLS) Method

More information

Adaptive diversification metaheuristic for the FSMVRPTW

Adaptive diversification metaheuristic for the FSMVRPTW Overview Adaptive diversification metaheuristic for the FSMVRPTW Olli Bräysy, University of Jyväskylä Pekka Hotokka, University of Jyväskylä Yuichi Nagata, Advanced Institute of Science and Technology

More information
