Interactive Text Mining of Service Calls to Improve Customer Support
Michael Schuh & Ron Zhang
Advanced Product Engineering, Oshkosh Corporation
Outline
Oshkosh Corporation Classification: Restricted
- Company Intro
- Problem Scope
- Data
- Process
- Text Mining & Results
Introducing Oshkosh Corporation
- World's leading manufacturer of specialty vehicles
- Our vehicles move people and materials at work
  - Most protect people or property
  - Many lift people or property
  - All do so safely and efficiently
- Headquarters in Oshkosh, Wisconsin
- Manufacturing in 8 countries, with service in 16 additional nations
- Sales in more than 130 countries
- Four industry-leading business segments: Access Equipment, Defense, Fire & Emergency, Commercial
Oshkosh Moves the World at Work
- DEFENSE
- ACCESS EQUIPMENT
- FIRE & EMERGENCY
- COMMERCIAL
Problem Scope
- JLG service center technicians receive calls from the field
  - Expert technicians answer questions, troubleshoot faults, etc.
  - They log all call info, including the problem descriptions and solutions
- Motivation: Can we streamline this process and enhance the technical knowledge base?
  - Focus on unstructured text fields: Subject & Description
- 3 Goals:
  - More efficient troubleshooting service procedures
  - Reducing initial service calls
  - Guiding engineering enhancements
Data Overview
- 3 months of service record data (10 years coming next)
  - Sept 1 to Dec 1, 2016
  - 12,265 total records
  - 12 attributes (dimensions/columns) per record
- Attributes:
  - Record Info: ID*, Case Owner, Date/Time
  - Vehicle Info: Model, Serial Number, Account Owner
  - Problem Info: Engine Faults, Machine Faults, Component, Component Detail, Component Type, Subject, Description
The Process
- From raw data to actionable results
  - Exploratory data analysis
  - Time-intensive pre-processing
  - Text mining approaches
- Highlights:
  - Text fields
  - Product model aggregations
  - Fault code parsing
- JMP utilities: Column Recoding, Record Subsetting, Table Joining, Column Equations, Graph Builder, and more
- JMP Text Explorer: Tokenization, Clustering, Record Selections
- Pipeline: Raw Data Records -> Pre-process -> Text Mining -> Actions & Insights
Description field
- 95% unique values
- 98% are singular records
- JMP: Recoding needs
- Clear need for text mining
  - Bag of Words (parsing tokens)
  - Latent Class Analysis (clustering records)
Unexpected Insights: Case Owners
- Records with blank Subject and Description
  - All other fields except Model and SN are blank too, so these records are useless
  - JMP: quick select all rows with same (blank) values
- Almost 80% come from two case owners: Douglas Campbell and Jesse Eyler
- It is imperative that case owners are diligent and meticulous with data entry, especially for automated analysis.
Common problem: long-tailed attributes
- 282 unique Model values, but top 35 models account for 76% of records
- Roll up models to product categories for broader coverage
- Chart annotation: 25 count, 75%
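The "top N values cover X% of records" figure on this slide can be reproduced with a simple frequency count. The following is a minimal sketch, not the JMP workflow used in the talk; the function name `top_k_coverage` and the sample values are hypothetical.

```python
from collections import Counter

def top_k_coverage(values, k):
    """Return the fraction of records covered by the k most frequent values."""
    counts = Counter(values)
    if not counts:
        return 0.0
    top = counts.most_common(k)
    return sum(n for _, n in top) / sum(counts.values())

# Hypothetical model values, not taken from the service data.
models = ["A"] * 6 + ["B"] * 3 + ["C", "D", "E"]
print(top_k_coverage(models, 2))  # 9 of 12 records -> 0.75
```

Running the same calculation per product category instead of per model is one way to check how much a roll-up improves coverage.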
Model Aggregation
- JMP: Table Joins on model field (and some data massaging through recoding model values)
- Record distributions over model categories:
  - Almost 92% of records are in the top 3 categories
  - 9 of top 10 models are Boom Lifts
Fault Code Parsing
- Numeric fault codes drive problems and troubleshooting procedures
- Three fields:
  - Engine
  - Machine
  - Subject (text)
- Strip to codes only (regex)
- Logically combine into single column
  - Machine -> Engine -> Subject
Fault Code Parsing (2)
- Combine all three together
  - Recode: trim whitespace, group similar (without edits)
  - Standardize values (. : )
- 752 unique values (long-tail again)
- Significant increase in record coverage
  - From 20% to almost 33% of records linked to a specific fault code
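The strip-and-combine step described on these two slides can be sketched in Python. This is an illustration under stated assumptions, not the JMP recoding that was actually used: the code format (digit groups optionally separated by "." or ":"), the reading of "Machine -> Engine -> Subject" as a priority order, and the function names `extract_code` and `combined_fault_code` are all assumptions.

```python
import re

# Assumption: a fault code is a run of digits, optionally split by '.' or ':'
# (e.g. "6.6.2" or "66:2"); the real formats in the service data may differ.
CODE_RE = re.compile(r"\d+(?:[.:]\d+)*")

def extract_code(text):
    """Return the first fault code in text with separators removed, or None."""
    if not text:
        return None
    match = CODE_RE.search(text)
    if match is None:
        return None
    return re.sub(r"[.:]", "", match.group())

def combined_fault_code(machine, engine, subject):
    """Take the first available code in the order Machine -> Engine -> Subject."""
    for field in (machine, engine, subject):
        code = extract_code(field)
        if code is not None:
            return code
    return None

# Hypothetical record: both fault fields are blank, the code is only in the Subject.
print(combined_fault_code("", None, "DTC 66:2 canbus failure"))  # 662
```

A pattern this loose will also pick up other numbers in free text, such as model numbers, so the Subject field would likely need a stricter expression in practice.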
Text Mining Defined
- Parsing unstructured (free-form) fields of text data to find frequent words and phrases for enhanced meaning/inference of records
  - Term (or token): single word (or number)
  - Stop words: common words excluded ("a", "the", etc.)
  - Stemming: removing word endings to combine occurrences (jump, jumping, jumped => "jump-")
  - Phrase: collection of terms
  - Document: each record (single text field)
  - Corpus: all documents (records)
- Bag of Words: order of terms does not matter
  - Often just as effective as advanced NLP (natural language processing)
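The definitions above fit together as a small pipeline: tokenize, drop stop words, stem, then count. The sketch below is a deliberately crude illustration; the stop-word list and the suffix-stripping stemmer are assumptions made for the example and are much simpler than what JMP Text Explorer or a Porter-style stemmer does.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "an", "the", "to", "of", "and", "is", "for"}  # tiny illustrative list
SUFFIXES = ("ing", "ed", "s")  # crude stemmer; real stemmers handle many more cases

def stem(term):
    """Strip one common suffix, keeping at least three characters of the stem."""
    for suffix in SUFFIXES:
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term

def bag_of_words(document):
    """Return term counts for one document; the order of terms is discarded."""
    tokens = re.findall(r"[a-z0-9]+", document.lower())
    return Counter(stem(t) for t in tokens if t not in STOP_WORDS)

print(bag_of_words("Jump, jumping, jumped"))  # Counter({'jump': 3})
```

Applying `bag_of_words` to every record and summing the counters gives corpus-level term frequencies.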
Text Mining - Desc
- Top 50 terms (0.79%): 38% of tokens and in 92% of records!
  - This is good, and bad (lacking specificity of check, replace, etc.)
- Phrases also frequent (~37% of records)
  - Very frequent inquiry/response phrases (top 2)

Terms (C. Prob and C. Rows accumulate down the ranking)
Rank  Item     Count  C. Prob  C. Rows  Row %
1     check    5514   0.0394   4400     0.3587
3     wire     1758   0.0667   6187     0.5044
9     advised  1414   0.1339   8372     0.6826
18    control  1111   0.2139   9738     0.7940
27    found    829    0.2760   10607    0.8648
49    custom   574    0.3840   11316    0.9226

Phrases
Rank  Item            Count  C. Prob  C. Rows  Row %
1     suggested tech  387    0.0129   386      0.0315
3     ground module   363    0.0378   1056     0.0861
9     tech check      223    0.0860   1921     0.1566
18    oil pressure    156    0.1387   2855     0.2328
27    service manual  110    0.1776   3599     0.2934
49    need to check   80     0.2450   4550     0.3710
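The C. Rows and Row % columns appear to track how many records contain at least one of the terms ranked so far; for example, 4400 of 12,265 records is 0.3587. Under that reading, which is an assumption about how the table was built, the columns can be computed as in this sketch. The function name and sample documents are hypothetical.

```python
def cumulative_row_coverage(docs, ranked_terms):
    """For each ranked term, return (term, rows_covered, fraction_of_rows).

    docs is a list of token sets, one per record. rows_covered counts the
    records that contain at least one term ranked at or above the current one.
    """
    if not docs:
        return []
    covered = set()
    results = []
    for term in ranked_terms:
        for index, tokens in enumerate(docs):
            if term in tokens:
                covered.add(index)
        results.append((term, len(covered), len(covered) / len(docs)))
    return results

# Hypothetical tokenized records, not taken from the service data.
docs = [{"check", "wire"}, {"check"}, {"wire", "found"}, {"control"}]
for row in cumulative_row_coverage(docs, ["check", "wire", "control"]):
    print(row)
```

Because a record is counted once no matter how many top terms it contains, the row coverage grows more slowly than the raw term counts.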
Text Mining and Drill Downs
- Mining the global corpus can be challenging
- Key idea: apply more domain knowledge first
- Different ways to slice/dice the data
  - Model Categories, Models, Components, Parts, etc.
  - Fault Codes (explicitly defined problems/solutions)
- Goal: insight into common problems/solutions
  - Peel away and best explain well-grouped subsets of records
  - A significant number of records each time will add up
Latent Class Analysis (LCA)
- Groups records into clusters based on terms (categorical variables)
  - An unobserved latent variable groups records into levels (latent classes)
  - Requires interpretation of results to best define each cluster/class
- Bayesian Information Criterion (BIC)
  - Measure of fit, smaller is better (minimizing -LogLikelihood)
- Number of clusters is required upfront
  - Use BIC and interpretation of results to tune this parameter
- Cluster mixture probability is the percentage of records assigned to each cluster
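The trade-off that BIC captures when comparing cluster counts can be written out directly. The sketch below uses the standard definition, BIC = 2 * (-LogLikelihood) + k * ln(n), and the usual parameter count for a latent class model with binary term indicators. Whether JMP counts parameters in exactly this way is an assumption, and the function names are hypothetical.

```python
import math

def lca_param_count(n_classes, n_binary_terms):
    """Free parameters: class mixing weights plus one term probability per class."""
    return (n_classes - 1) + n_classes * n_binary_terms

def bic(neg_log_likelihood, n_params, n_records):
    """Standard BIC; smaller is better when comparing models fit to the same data."""
    return 2.0 * neg_log_likelihood + n_params * math.log(n_records)

# Hypothetical comparison: more clusters must reduce -LogLikelihood enough
# to pay for the extra parameters, otherwise BIC goes up.
three = bic(neg_log_likelihood=900.0, n_params=lca_param_count(3, 20), n_records=45)
five = bic(neg_log_likelihood=880.0, n_params=lca_param_count(5, 20), n_records=45)
print(three < five)
```

This is why a 3-cluster model can score better than a 5-cluster model on a small subset of records, even though the larger model fits the data more closely.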
Fault Codes Distribution
- 2,430 records (62.6%) covered by top 90 faults (at least 10 records each)
- Most are singular fault codes per record (but not all)
- Many are primarily found in one model category
- Chart annotation: 50% of top fault records
Fault 662
- Fault dictionary: CANBUS FAILURE PLATFORM MODULE
- 93 total records: 45 Boom, 38 Scissor
Fault 662 (2)
- Most single-fault Subject fields are uninformative
  - Just declaring the known fault
  - Often stating the exact dictionary text
- Further text mining is unhelpful
Fault 662 (3)
- The Description field text can provide significant inference
  - Some terms are vague and non-discriminative ("check")
    - Found 96 times over 60 of 93 records
- Model categories can add additional insights
- Just using top phrases:
  - First, check control cables
  - Then:
    - Booms: Module/power issues
    - Scissors: Controls/connection issues
- Figure panels: Booms, Scissors
Fault 662 (4)
Booms Latent Class Analysis (LCA)
- BIC: 5C = 5,464; 3C = 4,671; 3C (no "check") = 4,616
- C1 (60%): Check cannon plug connection and wire
  - Record excerpts: the cable shorted to ground; Pin #111 cannon plug power; Broken wire in cable; Advised the tech to check the control cable for open circuits and shorts.; Gave tech several check to check for voltage
- C2 (33%): Check voltages
  - Record excerpts: Having him check the can connections going to J7 connector at ground module. Checked ground module at J7-1, reading 0v. Checked platform module at J7-3, reading 0v. Checked platform E-stop reading 0v. Checked ground key switch reading 12v. Advised replacing control cable. Functions will not work in ground mode. Has 12v at J7-14.
- C3 (7%): Possible platform module issues (only 3 total records)
Fault 662 (5)
Scissors Latent Class Analysis (LCA)
- Cluster mixture probabilities: 67%, 19%, 14%
- BIC: 5C = 3,200; 3C = 2,731; 3C (no "check") = 2,700
- C1: Cable, connections, harness
  - Record excerpts: Advised him how to test control cable; Checking the cable out and the connections at the plug by the charger; Found a mast harness issue
- C2: Wiring pin connections
  - Record excerpts: While checking the 8 pin connector the tech move the harness and the fault cleared. Repair harness. Checked voltage at pins 1,2,3 with 4 as ground on the 8 pin connector. move the coiley cable and then can regain platform funcitons. Recommended replacing the cable from platform box to the deck.
- C3: More advanced checks
  - Record excerpts: Checked for power and ground at the 8 pin connector. Red,orang, blue all have 24v using black as ground. Checked resistance between yellow and green, open. Checked E-F at the cannon plug 120ohms. Checked connector under the deck, yellow wire has an open circuit.
Summarized Example: Fault 0010
- 10% of the service calls are likely related to the ground module
- The wiring, harness, and pin voltages need to be checked
Conclusions
- Problem description text provided inferences towards frequent issues and most probable solutions
  - Importantly, in order of likelihood
- Subject text increased coverage
  - Critically dependent on human text entry
- A significant portion of calls are inquiry-related
  - Could be reduced with better information to customers
- Many common problems exist across many models and types of machines
  - Better focus engineering improvement time and efforts on the most impactful issues
Questions?
Thank you!
Michael A. Schuh
mschuh@oshkoshcorp.com