Optimization of Chromatogram Alignment Using A Class Separability Criterion

Gopal Yalla
Department of Mathematics and Computer Science
Department of Chemistry
College of the Holy Cross
April 28, 2015
Gopal Yalla (Holy Cross), Analysis of Biofuels, April 28, 2015

Outline
1. Introduction to Chromatography
2. Theory and Techniques
3. Experimental Data
4. Data Preprocessing
5. Results
6. Extended Results
7. Acknowledgements

Gas Chromatography
- The gas chromatograph (GC) is the main instrument used for separating the components of a mixture.
- Two phases: mobile phase and stationary phase.

Mass Spectrometry
- The mass spectrometer (MS) identifies the amount and type of chemicals present in a sample.
- Components are ionized and separated according to mass.
- The mass spectrum is a definite pattern of the number of ions present at each mass level.

Chromatograms
- GC + MS produces chromatograms.
- The x-axis displays retention time in the GC column.
- The y-axis displays molecular abundance in the sample.

Chromatographic Data Analysis
Peak Area Extraction
→ Judgement of the number and type of chemical components must be made by the user.
→ Straightforward, but time consuming.
→ Sacrifices interesting trends.
→ Difficult with complex data...

Peak Area Extraction (Cont.)
[figure]

Alignment Issue
- When dealing with multiple samples, fluctuations in peak height and peak location occur.
- Without peak location alignment, trends determined by chemometric methods will be skewed or meaningless.

Alignment Techniques
- Correlation Optimized Warping (COW): Given two parameters, segment size (m) and max warp (t), a chromatogram P is aligned to a target chromatogram T.
- Dynamic Programming: Solves combinatorial optimization problems. COW uses two matrices, F and U, of size (S + 1) × (L + 1).

COW Algorithm
- Correlation Optimized Warping (COW): Given two parameters, segment size (m) and max warp (t), a chromatogram P is aligned to a target chromatogram T.
- The choice of target chromatogram is based on the similarity index,
  $$SI_j = \prod_{n=1}^{N} r(\mathbf{x}_j, \mathbf{x}_n),$$
  where $r(\cdot,\cdot)$ represents Pearson's correlation coefficient.
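
The target selection above can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's code: the function names `similarity_index` and `choose_target` and the toy Gaussian peaks are assumptions made for the example. The product is taken exactly as written on the slide, and the toy data keep all correlations positive so the sign of the product is not an issue.

```python
import numpy as np

def similarity_index(X):
    """SI_j = prod_n r(x_j, x_n) for every row (chromatogram) j of X."""
    R = np.corrcoef(X)            # N x N Pearson correlations between rows
    return np.prod(R, axis=1)     # product over n for each candidate j

def choose_target(X):
    """Index of the chromatogram with the largest similarity index."""
    return int(np.argmax(similarity_index(X)))

# Hypothetical toy data: one Gaussian peak at slightly different retention times.
t = np.linspace(0.0, 1.0, 200)

def peak(center, width=0.02):
    return np.exp(-((t - center) ** 2) / (2.0 * width ** 2))

X = np.vstack([peak(0.50), peak(0.51), peak(0.52), peak(0.54)])
target = choose_target(X)   # row whose peak sits closest to all the others
```

The chromatogram chosen this way is the one that correlates best with the rest of the set, so the other chromatograms need the least warping to match it.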

COW Algorithm (Cont.)
- What is the optimal choice of the COW parameters, segment size (m) and max warp (t)?

Nomenclature and Terminology
- $a$ = scalar, $\mathbf{a}$ = column vector, $\mathbf{A}$ = data matrix
- Row index $n$ corresponds to the sample chromatogram
- Column index $m$ corresponds to retention time
- $M$: total retention times
- $N$: total chromatograms
- $N_k$: total chromatograms in the $k$th class
- $K$: total classes
- $\mathbf{x}^{(Q)}_{kn}$ is the $n$th chromatogram in the $k$th class processed with correction method $Q$.

Alignment Metrics: Warping Effect
Warping Effect = Simplicity + Peak Factor
- Simplicity (in [0, 1]): how close the data is to a rank 1 matrix,
  $$\text{simplicity} = \sum_{r=1}^{R}\left(\mathrm{svd}\left(\mathbf{X}\Big/\sqrt{\sum_{k=1}^{K}\sum_{n=1}^{N_k}\sum_{m=1}^{M} x_{knm}^2}\right)\right)^{4}$$
- Peak Factor (in [0, 1]): how much the shape and peak area of the chromatograms have been changed by warping,
  $$\text{peak factor} = \frac{1}{N}\sum_{k=1}^{K}\sum_{n=1}^{N_k}\left(1 - \min(c_{kn}, 1)^2\right),$$
  where
  $$c_{kn} = \left|\frac{\|\mathbf{x}^{(COW)}_{kn}\| - \|\mathbf{x}_{kn}\|}{\|\mathbf{x}_{kn}\|}\right|$$
  represents a relative error between the aligned and unaligned chromatogram.
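
As a rough sketch of how the two terms could be computed, the snippet below follows the formulas above with NumPy. The function names are hypothetical, and two details are assumptions not stated on the slide: `svd(...)` is read as "the singular values of", and simplicity is evaluated on the aligned data matrix.

```python
import numpy as np

def simplicity(X):
    """Sum of fourth powers of the singular values of X scaled to unit Frobenius norm."""
    s = np.linalg.svd(X / np.linalg.norm(X), compute_uv=False)
    return float(np.sum(s ** 4))

def peak_factor(X_raw, X_aligned):
    """Mean of 1 - min(c, 1)^2, with c the relative change in each row's Euclidean norm."""
    n_raw = np.linalg.norm(X_raw, axis=1)
    n_aligned = np.linalg.norm(X_aligned, axis=1)
    c = np.abs((n_aligned - n_raw) / n_raw)
    return float(np.mean(1.0 - np.minimum(c, 1.0) ** 2))

def warping_effect(X_raw, X_aligned):
    """Simplicity of the aligned data plus the peak factor; lies in [0, 2]."""
    return simplicity(X_aligned) + peak_factor(X_raw, X_aligned)

# Hypothetical check: a rank-1 matrix (every row a scaled copy of one profile).
profile = np.array([0.0, 1.0, 4.0, 1.0, 0.0])
X_rank1 = np.outer([1.0, 2.0, 3.0], profile)
```

After scaling, the squared singular values sum to 1, so the sum of their fourth powers reaches 1 only when a single singular value carries all the variation, which is the rank 1 case the slide describes.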

Alignment Metric: Hotelling Trace Criterion
- The Hotelling Trace Criterion (HTC) incorporates both within-class and between-class variation in the data set.

Hotelling Trace Criterion
- Define the sample mean vector and sample covariance matrix for the $k$th class as:
  $$\bar{\mathbf{x}}_k = \frac{1}{N_k}\sum_{n=1}^{N_k}\mathbf{x}_{kn}, \qquad \mathbf{S}_k = \frac{1}{N_k - 1}\sum_{n=1}^{N_k}(\mathbf{x}_{kn} - \bar{\mathbf{x}}_k)(\mathbf{x}_{kn} - \bar{\mathbf{x}}_k)^t.$$
- Let $P_k = N_k/N$ be the probability of occurrence of class $k$. The grand mean vector is given by:
  $$\bar{\mathbf{x}} = \sum_{k=1}^{K} P_k\,\bar{\mathbf{x}}_k.$$
- The within-class scatter matrix and between-class scatter matrix are defined as:
  $$\mathbf{S}_{wc} = \sum_{k=1}^{K} P_k\,\mathbf{S}_k, \qquad \mathbf{S}_{bc} = \sum_{k=1}^{K} P_k\,(\bar{\mathbf{x}}_k - \bar{\mathbf{x}})(\bar{\mathbf{x}}_k - \bar{\mathbf{x}})^t.$$

Hotelling Trace Criterion (Cont.)
- The HTC is defined as:
  $$J = \mathrm{tr}\left(\mathbf{S}_{wc}^{-1}\,\mathbf{S}_{bc}\right)$$
- When $K = 2$, HTC reduces to the Mahalanobis distance:
  $$J = (\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)^t\,\mathbf{S}^{-1}\,(\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)$$
- When $K = 2$ and $M = 1$, HTC reduces to the square of a t-statistic, $J \propto t^2$, with a scale factor that depends on $N$.
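
The definitions of the class means, scatter matrices, and $J$ translate fairly directly into NumPy. The sketch below is an illustration under assumptions: the function name `hotelling_trace` and the label-array interface are hypothetical, every class is assumed to have at least two chromatograms, and the within-class scatter matrix is assumed to be invertible (which in practice requires few variables relative to samples).

```python
import numpy as np

def hotelling_trace(X, labels):
    """J = tr(S_wc^{-1} S_bc) for rows of X (N x M) grouped by class labels."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    N, M = X.shape
    grand_mean = X.mean(axis=0)          # equals sum_k P_k * class mean
    S_wc = np.zeros((M, M))
    S_bc = np.zeros((M, M))
    for k in np.unique(labels):
        Xk = X[labels == k]
        N_k = Xk.shape[0]
        P_k = N_k / N                    # probability of occurrence of class k
        mean_k = Xk.mean(axis=0)
        centered = Xk - mean_k
        S_k = centered.T @ centered / (N_k - 1)   # sample covariance of class k
        d = (mean_k - grand_mean)[:, None]
        S_wc += P_k * S_k
        S_bc += P_k * (d @ d.T)
    # solve(S_wc, S_bc) computes S_wc^{-1} S_bc without forming the inverse
    return float(np.trace(np.linalg.solve(S_wc, S_bc)))

# Hypothetical one-variable example with two balanced classes.
X_demo = np.array([[0.0], [2.0], [4.0], [6.0]])
labels_demo = np.array([0, 0, 1, 1])
J_demo = hotelling_trace(X_demo, labels_demo)
```

In this balanced toy example the class variances are both 2 and the means differ by 4, so $S_{wc} = 2$, $S_{bc} = 4$ and $J = 2$. The unequal-variance t-statistic for the same data has $t^2 = 8$, so here $J = t^2/N$ with $N = 4$, which illustrates the $K = 2$, $M = 1$ case for equal class sizes.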

Experimental Data
5 classes of biodiesel:
- Soy (6 different samples)
- Canola (3 different samples)
- Tallow (3 different samples)
- Waste Grease (2 different samples)
- Hybrid (1 sample)
Each sample was tested in 3 different runs, for 45 total chromatograms.
Chemical structure: FAMEs (fatty acid methyl esters), with variable length of carbon chain and number of double bonds.
[Figures: reaction process; sample chromatogram]

Data Preprocessing: Timeline
1. Baseline Correction
2. COW Alignment
3. Normalization & Mean Centering
4. Principal Component Transformation
5. Computed Metrics

Baseline Problem
Need to correct for the non-linear increase in baseline caused by:
- Gradual increase in oven temperature
- Column bleeding

Baseline Correction
Use asymmetric least squares smoothing to determine the baseline vector $\mathbf{b}'$ that minimizes
$$f(\mathbf{b}') = \|\mathbf{w}^t(\mathbf{b}' - \mathbf{x}_{kn})\|^2 + \lambda\,\|\mathbf{D}\mathbf{b}'\|^2$$
- $\mathbf{w}$ is a vector of weights
- $\lambda$ is a relaxation parameter
- $\mathbf{D}$ is a second difference matrix
- $\|\cdot\|$ is the Euclidean norm

Baseline Correction: Finding Peaks
- Let $\mathbf{x}_{kn} = \mathbf{s} + \mathbf{b} + \boldsymbol{\epsilon}$, where $\mathbf{s}$ is the true peak height, $\mathbf{b}$ is the true smooth baseline, and $\boldsymbol{\epsilon}$ is normal random error with small deviation $\sigma$.
- Let $m_i$ be the median of the points in $\mathbf{x}_{kn}$ over an appropriate window centered at time index $i$, and $\mathbf{m}$ the vector of these medians.
  → $\mathbf{m} \approx \mathbf{b}$
  → $\mathbf{x}_{kn} \approx \mathbf{b} + \boldsymbol{\epsilon}$
  → $\sigma \approx 1.4826\,\mathrm{median}(|\mathbf{x}_{kn} - \mathbf{m}|)$
- The weights are then
  $$w_i = \begin{cases} 0 & \text{if } x_{kni} > m_i \pm 2\sigma \\ 1 & \text{if } x_{kni} \le m_i \pm 2\sigma \end{cases}$$
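
The weighting and smoothing steps can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the function names, window size, and value of the relaxation parameter are hypothetical; the threshold is taken as one-sided (`x > m + 2*sigma`); and the weights are applied elementwise, so the function minimizes `sum(w * (b - x)**2) + lam * ||D b||^2`, the usual weighted penalized least squares form.

```python
import numpy as np

def rolling_median(x, half_window):
    """Median of x over a window centered at each index (truncated at the edges)."""
    n = len(x)
    return np.array([np.median(x[max(0, i - half_window): i + half_window + 1])
                     for i in range(n)])

def peak_weights(x, half_window=25, n_sigma=2.0):
    """Weight 0 where x rises above the local median by more than n_sigma * sigma, else 1."""
    m = rolling_median(x, half_window)
    sigma = 1.4826 * np.median(np.abs(x - m))   # robust estimate of the noise deviation
    return np.where(x > m + n_sigma * sigma, 0.0, 1.0)

def weighted_baseline(x, w, lam=1e5):
    """Solve (W + lam * D^t D) b = W x, the normal equations of the weighted problem."""
    n = len(x)
    D = np.diff(np.eye(n), n=2, axis=0)         # (n-2) x n second difference matrix
    return np.linalg.solve(np.diag(w) + lam * (D.T @ D), w * x)

# Hypothetical noise-free toy chromatogram: linear drift plus one Gaussian peak.
# Without noise the sigma estimate is 0, so the threshold reduces to the local median.
i = np.arange(300)
line = 0.01 * i
x = line + 10.0 * np.exp(-((i - 150) ** 2) / 50.0)
w = peak_weights(x)
b = weighted_baseline(x, w)
corrected = x - b
```

A dense solve is used for brevity; for chromatograms with thousands of retention times a sparse banded solver would be the more practical choice.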

Baseline Correction: Results
Using $\mathbf{b}'$ to estimate $\mathbf{b}$ gives
$$\mathbf{x}^{(BC)}_{kn} = \mathbf{x}_{kn} - \mathbf{b}' \approx \mathbf{s} + \boldsymbol{\epsilon}$$

Normalization and Mean Centering
- Each chromatogram $\mathbf{x}^{(BC,COW)}_{kn}$ should be normalized to account for variations in injection volume:
  $$\mathbf{x}^{(BC,COW,NORM)}_{kn} = \frac{\bar{A}}{A_{kn}}\,\mathbf{x}^{(BC,COW)}_{kn},$$
  where $A_{kn}$ represents the total area of each chromatogram, and $\bar{A}$ is the average total area of all chromatograms.
- Each chromatogram should be mean centered to the origin:
  $$\mathbf{x}^{(BC,COW,NORM,MC)}_{kn} = \mathbf{x}^{(BC,COW,NORM)}_{kn} - \bar{\mathbf{x}}^{(BC,COW,NORM)},$$
  where $\bar{\mathbf{x}}^{(BC,COW,NORM)}$ is the sample mean chromatogram.
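
Both steps are short array operations. In the sketch below the function names are hypothetical, and the total area $A_{kn}$ is approximated by the sum of intensities, which assumes evenly spaced retention times.

```python
import numpy as np

def normalize_area(X):
    """Scale each row so its total area equals the average total area of all rows."""
    areas = X.sum(axis=1)                       # A_kn for every chromatogram
    return X * (areas.mean() / areas)[:, None]  # multiply row kn by A_bar / A_kn

def mean_center(X):
    """Subtract the sample mean chromatogram from every row."""
    return X - X.mean(axis=0)

# Hypothetical data: three chromatograms with different injection volumes.
X_demo = np.array([[1.0, 2.0, 3.0],
                   [2.0, 4.0, 6.0],
                   [1.0, 1.0, 1.0]])
X_norm = normalize_area(X_demo)
X_mc = mean_center(X_norm)
```

After normalization the first two rows become identical, since they differ only by a scale factor, which is the injection-volume effect the step is meant to remove.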

Principal Component Analysis
- HTC was evaluated on the principal component transformed data.
- Let $\mathbf{S}$ represent the sample covariance matrix of the entire set of preprocessed data, with eigenvalue decomposition:
  $$\mathbf{S} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^t$$
- Then $\mathbf{y}_{kn}$, the vector of PCs, is computed via the transformation
  $$\mathbf{y}_{kn} = \mathbf{U}^t\,\mathbf{x}^{(BC,COW,NORM,MC)}_{kn}$$
- Eigenvalues correspond to how much variation is explained in each PC.
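
A minimal sketch of this transformation with NumPy is given below, assuming the rows of the input are already mean centered. The function name `pca_scores` and the random demo data are hypothetical. The eigenvalues are sorted in decreasing order so that the first columns of the scores are the leading PCs.

```python
import numpy as np

def pca_scores(X_centered):
    """Return PC scores Y, eigenvalues (descending), and eigenvectors U of the covariance."""
    N = X_centered.shape[0]
    S = X_centered.T @ X_centered / (N - 1)   # sample covariance matrix
    eigvals, U = np.linalg.eigh(S)            # eigh: S is symmetric
    order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
    eigvals, U = eigvals[order], U[:, order]
    Y = X_centered @ U                        # row n holds (U^t x_n)^t
    return Y, eigvals, U

# Hypothetical demo data: 20 samples, 5 variables, mean centered.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 5))
X_demo = X_demo - X_demo.mean(axis=0)
Y, eigvals, U = pca_scores(X_demo)
explained = eigvals / eigvals.sum()           # fraction of variation per PC
```

Forming the full covariance matrix is fine for a small example; with many more retention times than chromatograms, an SVD of the centered data matrix gives the same scores at lower cost.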

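HTC Evaluated on PCs
- Let $\mathbf{z}_{kn} = (y_{kn1}, y_{kn2}, \ldots, y_{knL})^t$ denote the $L \times 1$ vector corresponding to the first $L$ PCs of $\mathbf{y}_{kn}$. The sample mean vector and sample covariance matrix for the $k$th class are given respectively by
  $$\bar{\mathbf{z}}_k = \frac{1}{N_k}\sum_{n=1}^{N_k}\mathbf{z}_{kn}, \qquad \mathbf{S}_k = \frac{1}{N_k - 1}\sum_{n=1}^{N_k}(\mathbf{z}_{kn} - \bar{\mathbf{z}}_k)(\mathbf{z}_{kn} - \bar{\mathbf{z}}_k)^t.$$
- The grand mean vector is given by
  $$\bar{\mathbf{z}} = \sum_{k=1}^{K} P_k\,\bar{\mathbf{z}}_k.$$
- The within-class scatter matrix and between-class scatter matrix are defined as:
  $$\mathbf{S}_{wc} = \sum_{k=1}^{K} P_k\,\mathbf{S}_k, \qquad \mathbf{S}_{bc} = \sum_{k=1}^{K} P_k\,(\bar{\mathbf{z}}_k - \bar{\mathbf{z}})(\bar{\mathbf{z}}_k - \bar{\mathbf{z}})^t.$$
- HTC is given by
  $$J = \mathrm{tr}\left(\mathbf{S}_{wc}^{-1}\,\mathbf{S}_{bc}\right)$$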

Computed Metrics
Density plots for warping effect and HTC: [figures]

Results: PC Score Plots
Score plots for PC1 vs. PC2, PC1 vs. PC3, and PC2 vs. PC3 compare the alignment that maximizes the warping effect with the alignments that maximize HTC. Parameters are given as (segment length, max warp):
- Max warping effect: (26, 15)
- Max HTC (1 PC): (64, 3)
- Max HTC (2 PCs): (55, 8)
- Max HTC (3 PCs): (70, 6)
Each plot marks soy, canola, tallow, waste grease, and hybrid samples with distinct symbols.
[figures]

Summary of Results
Based on our data, HTC leads to better alignment than warping effect.
→ Greater Euclidean distance between class means.
  Ratios for Segment Length/Max Warp (55,8) to (26,15):

  Class          Soy    Canola  Tallow  Waste Grease
  Soy            0      -       -       -
  Canola         1.18   0       -       -
  Tallow         1.13   1.09    0       -
  Waste Grease   1.22   1.16    1.12    0

→ Smaller within-class variation.
  Ratios for Segment Length/Max Warp (55,8) to (26,15):

  Class          1st Major Axis   2nd Major Axis
  Soy            0.94             0.92
  Canola         1.06             0.80
  Tallow         0.86             1.30
  Waste Grease   0.68             0.68

→ Clear parametric distinction.

Project Milestone!
1. Published work in Journal of Chemometrics:
   Soares, Edward J., Yalla, Gopal R., O'Connor, John B., Walsh, Kevin A., and Hupp, Amber M. (2015), Hotelling trace criterion as a figure of merit for the optimization of chromatogram alignment, J. Chemometrics, 29, pages 200-212.

More Complex Data: Biodiesel-Diesel Blends
210 chromatograms with three different attributes:
→ Feedstock: Pure Diesel, Soy, Canola, IRE Tallow, Texas Tallow, Waste Grease
→ Diesel Type: Flynn, Hess, Shell, Sunoco
→ Blend Ratio: B2, B5, B10, B20

Diesel Results
Before alignment: [figure]

Diesel Results
After alignment and optimization: [figure]

Classification
B10 Biodiesel Samples (columns: Shell, Sunoco)
- Texas Tallow: 12, 5( ), 5(*)
- IRE Tallow: 12, 5( ), 5(*)

Broader Impact
1. Determine the chemical components that contribute the most to the energy content of fuel
   → Create synthetic biomaterial with energy content?
2. Forensic / Environmental Concerns
   → Determine the origins and consequences of an oil spill

Future Work
1. Algorithmic Development
   → COW has very long computation time.
   → No parametric pattern
2. Larger Sample Size for HTC Results

Acknowledgements
Thank you for listening!
- Professor Amber Hupp
- Professor Kevin Walsh
- Colette Houssan
- Mike Comiskey
- Department of Mathematics & Computer Science
- Department of Chemistry

Acknowledgements
- Journal of Chemometrics
- University Syringe Program Grant from Hamilton Company (AMH)
- Robert L. Ardizzone Fund for Junior Faculty Excellence (AMH)
- College of the Holy Cross
- National Institute of Standards and Technology (NIST, Gaithersburg, MD)
- Western Dubuque Biodiesel
- Iowa Renewable Energy
- ADM Company, Keystone Biofuels, TMT Biofuels, Texas Green Manufacturing

Thank you Professor Soares! Couldn't have done it without you, Sauce!