From Developing Credit Risk Models Using SAS Enterprise Miner and SAS/STAT. Full book available for purchase here.


About this Book
About the Author
Acknowledgments

Chapter 1 Introduction
  1.1 Book Overview
  1.2 Overview of Credit Risk Modeling
  1.3 Regulatory Environment
    1.3.1 Minimum Capital Requirements
    1.3.2 Expected Loss
    1.3.3 Unexpected Loss
    1.3.4 Risk Weighted Assets
  1.4 SAS Software Utilized
  1.5 Chapter Summary
  1.6 References and Further Reading

Chapter 2 Sampling and Data Pre-Processing
  2.1 Introduction
  2.2 Sampling and Variable Selection
    2.2.1 Sampling
    2.2.2 Variable Selection
  2.3 Missing Values and Outlier Treatment
    2.3.1 Missing Values
    2.3.2 Outlier Detection
  2.4 Data Segmentation
    2.4.1 Decision Trees for Segmentation
    2.4.2 K-Means Clustering
  2.5 Chapter Summary
  2.6 References and Further Reading

Chapter 3 Development of a Probability of Default (PD) Model
  3.1 Overview of Probability of Default
    3.1.1 PD Models for Retail Credit
    3.1.2 PD Models for Corporate Credit
    3.1.3 PD Calibration
  3.2 Classification Techniques for PD
    3.2.1 Logistic Regression
    3.2.2 Linear and Quadratic Discriminant Analysis
    3.2.3 Neural Networks
    3.2.4 Decision Trees
    3.2.5 Memory Based Reasoning
    3.2.6 Random Forests
    3.2.7 Gradient Boosting
  3.3 Model Development (Application Scorecards)
    3.3.1 Motivation for Application Scorecards
    3.3.2 Developing a PD Model for Application Scoring
  3.4 Model Development (Behavioral Scoring)
    3.4.1 Motivation for Behavioral Scorecards
    3.4.2 Developing a PD Model for Behavioral Scoring
  3.5 PD Model Reporting
    3.5.1 Overview
    3.5.2 Variable Worth Statistics
    3.5.3 Scorecard Strength
    3.5.4 Model Performance Measures
    3.5.5 Tuning the Model
  3.6 Model Deployment
    3.6.1 Creating a Model Package
    3.6.2 Registering a Model Package
  3.7 Chapter Summary
  3.8 References and Further Reading

Chapter 4 Development of a Loss Given Default (LGD) Model
  4.1 Overview of Loss Given Default
    4.1.1 LGD Models for Retail Credit
    4.1.2 LGD Models for Corporate Credit
    4.1.3 Economic Variables for LGD Estimation
    4.1.4 Estimating Downturn LGD
  4.2 Regression Techniques for LGD
    4.2.1 Ordinary Least Squares Linear Regression
    4.2.2 Ordinary Least Squares with Beta Transformation
    4.2.3 Beta Regression
    4.2.4 Ordinary Least Squares with Box-Cox Transformation
    4.2.5 Regression Trees
    4.2.6 Artificial Neural Networks
    4.2.7 Linear Regression and Non-linear Regression
    4.2.8 Logistic Regression and Non-linear Regression
  4.3 Performance Metrics for LGD
    4.3.1 Root Mean Squared Error
    4.3.2 Mean Absolute Error
    4.3.3 Area Under the Receiver Operating Curve
    4.3.4 Area Over the Regression Error Characteristic Curves
    4.3.5 R-square
    4.3.6 Pearson's Correlation Coefficient
    4.3.7 Spearman's Correlation Coefficient
    4.3.8 Kendall's Correlation Coefficient
  4.4 Model Development
    4.4.1 Motivation for LGD models
    4.4.2 Developing an LGD Model
  4.5 Case Study: Benchmarking Regression Algorithms for LGD
    4.5.1 Data Set Characteristics
    4.5.2 Experimental Set-Up
    4.5.3 Results and Discussion
  4.6 Chapter Summary
  4.7 References and Further Reading

Chapter 5 Development of an Exposure at Default (EAD) Model
  5.1 Overview of Exposure at Default
  5.2 Time Horizons for CCF
  5.3 Data Preparation
  5.4 CCF Distribution Transformations
  5.5 Model Development
    5.5.1 Input Selection
    5.5.2 Model Methodology
    5.5.3 Performance Metrics
  5.6 Model Validation and Reporting
    5.6.1 Model Validation
    5.6.2 Reports
  5.7 Chapter Summary
  5.8 References and Further Reading

Chapter 6 Stress Testing
  6.1 Overview of Stress Testing
  6.2 Purpose of Stress Testing
  6.3 Stress Testing Methods
    6.3.1 Sensitivity Testing
    6.3.2 Scenario Testing
  6.4 Regulatory Stress Testing
  6.5 Chapter Summary
  6.6 References and Further Reading

Chapter 7 Producing Model Reports
  7.1 Surfacing Regulatory Reports
  7.2 Model Validation
    7.2.1 Model Performance
    7.2.2 Model Stability
    7.2.3 Model Calibration
  7.3 SAS Model Manager Examples
    7.3.1 Create a PD Report
    7.3.2 Create a LGD Report
  7.4 Chapter Summary

Tutorial A Getting Started with SAS Enterprise Miner
  A.1 Starting SAS Enterprise Miner
  A.2 Assigning a Library Location
  A.3 Defining a New Data Set

Tutorial B Developing an Application Scorecard Model in SAS Enterprise Miner
  B.1 Overview
    B.1.1 Step 1: Import the XML Diagram
    B.1.2 Step 2: Define the Data Source
    B.1.3 Step 3: Visualize the Data
    B.1.4 Step 4: Partition the Data
    B.1.5 Step 5: Perform Screening and Grouping with Interactive Grouping
    B.1.6 Step 6: Create a Scorecard and Fit a Logistic Regression Model
    B.1.7 Step 7: Create a Rejected Data Source
    B.1.8 Step 8: Perform Reject Inference and Create an Augmented Data Set
    B.1.9 Step 9: Partition the Augmented Data Set into Training, Test and Validation Samples
    B.1.10 Step 10: Perform Univariate Characteristic Screening and Grouping on the Augmented Data Set
    B.1.11 Step 11: Fit a Logistic Regression Model and Score the Augmented Data Set
  B.2 Tutorial Summary

Appendix A Data Used in This Book
  A.1 Data Used in This Book
    Chapter 3: Known Good Bad Data
    Chapter 3: Rejected Candidates Data
    Chapter 4: LGD Data
    Chapter 5: Exposure at Default Data

Index

From Developing Credit Risk Models Using SAS Enterprise Miner and SAS/STAT: Theory and Application, by Iain Brown. Copyright 2014, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED.