Contribution Details
Type | Master's Thesis |
Scope | Discipline-based scholarship |
Title | Comparative analysis of Machine Learning methods for the estimation of Probability of Default |
Organization Unit | |
Authors |
|
Supervisors |
|
Language |
|
Institution | University of Zurich |
Faculty | Faculty of Business, Economics and Informatics |
Number of Pages | 68 |
Date | 2021 |
Abstract Text | Machine Learning (ML) is gaining prominence in financial risk management application studies by providing improved modelling flexibility compared to the current state-of-the-art parametric approaches. Under the supervised learning framework, various classifiers may contribute to a more accurate estimation of risk parameters in Internal Rating-Based models developed by financial institutions. The main objective of this thesis is to construct and compare various classification models used in credit scoring applications and estimation of Probability of Default (PD). In particular, this study compares the performance of Random Forest (RF), k-Nearest Neighbors (k-NN), XGBoost, and AdaBoost on a real-world credit scoring portfolio made available by Credit Suisse. The portfolio considered in the analysis ranges from 2000 to 2014 and includes all counterparties in Credit Suisse’s corporate portfolio, consisting of Swiss corporate small and medium-sized enterprises (SMEs) and large enterprises (LEs). Common issues in credit scoring portfolios, such as the low default problem and feature selection, are addressed in the analysis by employing oversampling techniques and hybrid feature selection procedures. Two sets of models are constructed, one for SMEs only and one for SMEs and LEs jointly, and compared using the AUROC and Brier Score performance metrics. The performance of these models is also compared to logistic regression, which is the industry benchmark model for such applications. This study confirms the literature findings that ML models outperform traditional approaches (e.g., logistic regression) and supports the superior performance of these models on the Swiss corporate portfolio specifically. Among the ML models, the best performing model in terms of AUROC is the RF, while the boosting models provide the most accurate probability predictions. k-NN performs worse than the rest of the ML models, but still outperforms logistic regression.
Finally, the effect of model averaging on model performance is assessed and compared to the performance of the single models. Averaging the three best ML models results in increased performance and reduced model risk. The results suggest that ML techniques prove to be important aids in credit risk modelling and should be considered as serious competitors of classical approaches. |
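The abstract compares models by AUROC and Brier Score and reports that averaging the best models improves performance. The following is a minimal illustrative sketch of those two metrics and of simple probability averaging. It is not taken from the thesis: the function names (`brier_score`, `auroc`, `average_models`), the equal-weight averaging scheme, and the toy labels and probabilities are all assumptions made for illustration.

```python
from typing import List, Sequence


def brier_score(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Mean squared difference between predicted PD and the 0/1 default flag."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def auroc(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Share of (defaulter, non-defaulter) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one default and one non-default")
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_models(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Equal-weight average of predicted PDs across models (an assumed scheme)."""
    n_models = len(predictions)
    return [sum(per_obs) / n_models for per_obs in zip(*predictions)]


# Toy data, invented for illustration only (1 = default, 0 = no default).
y_true = [0, 0, 0, 0, 1, 1]
model_a = [0.10, 0.20, 0.60, 0.30, 0.70, 0.40]  # hypothetical model A PDs
model_b = [0.20, 0.50, 0.10, 0.30, 0.40, 0.90]  # hypothetical model B PDs
averaged = average_models([model_a, model_b])

for name, preds in [("A", model_a), ("B", model_b), ("average", averaged)]:
    print(name, "AUROC:", auroc(y_true, preds), "Brier:", brier_score(y_true, preds))
```

A higher AUROC indicates better ranking of defaulters above non-defaulters, while a lower Brier Score indicates more accurate probability predictions, which is why the two metrics can favour different models.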