Statistical Learning Examples


Statistical Learning Examples
Genevera I. Allen
Statistics 640: Statistical Learning
August 26, 2013
(Stat 640) Lecture 1 August 26, 2013 1 / 19

Example: Microarrays
Measures gene expression.
High-dimensional: Often tens of thousands of genes. Only a couple hundred subjects.
Goals: Find genes indicative of a certain disease. Predict survival outcomes based on gene expression. Estimate regulatory pathways.
[Figure: expression matrix with axes labeled "arrays" and "genes".]

Example: Face Recognition
Goal: Recognize facial expression.
Goal: Recognize faces from a database.

Example: Handwritten Digits

Example: Computer Vision
Object Recognition (LeCun et al., 2004).

Example: Netflix Movie Rating Data
Rows: Movies. Columns: Customers. Measurement: Movie ratings (scale of 1-5).

                    Anne  Ben  Charlie  Doug  Eve  ...
Star Wars            2     5      4      4     3   ...
Harry Potter         3     4      5      3     ?   ...
Pretty Woman         4     ?      2      ?     5   ...
Titanic              5     ?      2      1     3   ...
Lord of the Rings    ?     5      5      4     4   ...
...                 ...   ...    ...    ...   ...  ...

Netflix Prize
Challenge: Predict ratings for un-rated movies with a 10% improvement over Cinematch.
Training Set: Ratings from about 480,000 customers on about 18,000 movies. Around 98.7% missing ratings!
$1,000,000 prize!
Contest: October 2006 - August 2009.
Winners: Team led by Robert Bell and Yehuda Koren.
Methods: Variations on the SVD and k-nearest neighbors (Bell & Koren, 2008).
Fields: Recommender systems & Collaborative filtering.
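As a rough illustration of the SVD idea named on this slide (not the winning team's actual method), the toy ratings table from the Netflix example can be completed by repeatedly fitting a low-rank SVD and refilling only the missing entries. This is a minimal sketch: the function name svd_impute, the choice of rank 2, the iteration limits, and the clipping to the 1-5 scale are all assumptions made for illustration.

```python
import numpy as np

# Toy ratings from the Netflix example slide; np.nan marks a "?" (unrated) entry.
# Columns: Anne, Ben, Charlie, Doug, Eve.
ratings = np.array([
    [2.0, 5.0, 4.0, 4.0, 3.0],        # Star Wars
    [3.0, 4.0, 5.0, 3.0, np.nan],     # Harry Potter
    [4.0, np.nan, 2.0, np.nan, 5.0],  # Pretty Woman
    [5.0, np.nan, 2.0, 1.0, 3.0],     # Titanic
    [np.nan, 5.0, 5.0, 4.0, 4.0],     # Lord of the Rings
])


def svd_impute(X, rank=2, n_iter=200, tol=1e-6):
    """Fill missing entries by iterating a rank-`rank` SVD approximation.

    Hypothetical helper for illustration: observed ratings are never
    changed; only the missing cells are updated at each iteration.
    """
    observed = ~np.isnan(X)
    # Start by filling missing cells with the mean of the observed ratings.
    filled = np.where(observed, X, np.nanmean(X))
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Best rank-`rank` approximation of the current filled matrix.
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Assumption: keep predictions on the 1-5 rating scale.
        low_rank = np.clip(low_rank, 1.0, 5.0)
        updated = np.where(observed, X, low_rank)
        converged = np.linalg.norm(updated - filled) < tol
        filled = updated
        if converged:
            break
    return filled


observed_mask = ~np.isnan(ratings)
missing_fraction = np.isnan(ratings).mean()  # share of "?" cells in the toy table
completed = svd_impute(ratings)
print(np.round(completed, 2))
```

In the toy table 5 of the 25 cells are missing, whereas the real training set was around 98.7% missing, which is why practical methods fit the factorization only to observed entries rather than forming a dense matrix as this sketch does.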

My Research

Modern Multivariate Analysis

Example: NMR Spectroscopy
Goal: Find chemical signatures to classify neural cells.

Example: EEG Data
Goal: Find major brain activation patterns.

Example: fMRI Data
Goal: Find major patterns & areas of brain activation.

Markov Networks: Models and Inference

Example: Networks for Count Data
Goal: Find associations between genes measured via RNA-sequencing.
[Figure: network whose nodes are labeled with microRNA names (e.g., hsa-mir-21, hsa-let-7b).]

Example: Integrated Genomic Networks
Goal: Find links between different types of genomic biomarkers.

Example: Functional Brain Networks
Goal: Find functional relationships between remote brain regions.

Example: Inference for Functional Brain Networks
How do functional connections differ between two populations of subjects?

What will we cover in this class?
Linear Regression & Penalized Regression.
Linear Classification & Penalized Classification.
Linear Discriminant Analysis.
Support Vector Machines.
Kernel Methods.
Model Selection & Assessment (cross-validation, stability selection).
Matrix Factorizations: PCA, Sparse PCA, ICA, NMF.
Matrix Completion.
Markov Networks (undirected graphical models).
Clustering: K-means, Hierarchical & Spectral.
Boosting.
Ensemble Methods.
Decision Trees & Random Forests.

What is not covered in this class?
Wavelets, Basis Transformations, Kernel Smoothing.
Splines & Functional Data Methods.
Mixture Models.
Soft Clustering.
SOM, Isomap, Laplacian embedding.
Semi-Supervised methods.
Manifold Learning.
Generalized Additive Models & MARS.
Neural Networks.
Boltzmann machines.
Bayesian Networks.
Variational Inference.
Markov Decision Processes.
Hidden Markov Models.