A common problem in insurance fraud analysis is that a clear dependent variable is often not available in the analysis data. Historically this has been addressed in several ways, such as using fraud surrogates or performing outlier analysis. The literature also contains many examples that employ unsupervised learning approaches such as cluster analysis. We will explore two more recent alternatives to classical clustering and compare them to the classical approach: Random Forests and PRIDIT. The PRIDIT technique originated as a psychological scaling technique; Brockett and Derrig adapted the approach to fraud analysis. Random Forests, a popular technique for building ensembles of trees, can also be used for unsupervised learning.
In addition to demonstrating these approaches to fraud analysis on an automobile bodily injury claims research database, we will also illustrate their application on publicly available datasets, such as a dataset of California personal auto underwriting experience.