Salford Analytics and Data Mining Conference



A More Transparent Interpretation of Health Club Surveys

Recorded Session

Satisfaction surveys are traditionally interpreted by building linear or logistic regression models. Overcoming data problems such as multicollinearity and missing data requires significant data cleansing and feature creation. Even after these steps are complete, the final results must still be interpreted. Logistic regression and even linear regression models are notoriously difficult to interpret beyond indicating general trends: they do not tell us why an individual is predicted to have a particular level of satisfaction. Building decision trees with CART, on the other hand, is a more effective approach because it handles correlated variables and missing data automatically and provides a transparent interpretation of the behavior of individuals.
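The transparency claim can be illustrated with a minimal sketch of a CART-style classification tree. This is not the Salford CART software and not the model from the session: it is a plain-Python toy that picks binary splits by Gini impurity and omits pruning and surrogate splits (the mechanism CART uses for missing values). The feature names (`staff_rating`, `cleanliness_rating`), the labels, and the nine survey rows are invented purely for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return (weighted_gini, feature, threshold) for the best binary split, or None."""
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, features, depth=0, max_depth=2):
    """Grow a small tree; leaves predict the majority class."""
    majority = Counter(labels).most_common(1)[0][0]
    split = best_split(rows, labels, features) if depth < max_depth else None
    if gini(labels) == 0.0 or split is None:
        return {"leaf": majority}
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    grow = lambda part: build_tree([r for r, _ in part], [y for _, y in part],
                                   features, depth + 1, max_depth)
    return {"feature": f, "threshold": t, "left": grow(left), "right": grow(right)}

def explain(tree, row):
    """Return (prediction, rules followed) for one respondent."""
    rules, node = [], tree
    while "leaf" not in node:
        f, t = node["feature"], node["threshold"]
        if row[f] <= t:
            rules.append(f"{f} <= {t}")
            node = node["left"]
        else:
            rules.append(f"{f} > {t}")
            node = node["right"]
    return node["leaf"], rules

# Hypothetical survey rows (ratings on a 1-5 scale); not real YMCA data.
data = [(5, 4, "satisfied"), (4, 5, "satisfied"), (4, 2, "not_satisfied"),
        (2, 4, "not_satisfied"), (1, 5, "not_satisfied"), (5, 1, "not_satisfied"),
        (5, 5, "satisfied"), (2, 2, "not_satisfied"), (2, 5, "not_satisfied")]
rows = [{"staff_rating": s, "cleanliness_rating": c} for s, c, _ in data]
labels = [y for _, _, y in data]

tree = build_tree(rows, labels, ["staff_rating", "cleanliness_rating"])
prediction, rules = explain(tree, {"staff_rating": 5, "cleanliness_rating": 2})
print(prediction, "because", " and ".join(rules))
```

Unlike a regression coefficient, the output of `explain` is the specific chain of conditions that led to one respondent's prediction, which is the kind of individual-level reading the abstract attributes to trees.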

In this application, several thousand surveys from the YMCA were analyzed to understand which members were most satisfied with the club, most likely to renew their membership, and most likely to recommend the club to a friend. The analysis determined which YMCA branch attributes were most associated with each of these target variables.