Journal of Risk Model Validation

Risk.net

Machine learning prediction of loss given default in government-sponsored enterprise residential mortgages

Zilong Liu and Hongyan Liang

  • This research evaluates the performance of various predictive models for residential mortgage loss given default (LGD), identifying CatBoost and the Light Gradient-Boosting Machine Regressor as the most accurate.
  • Variables such as the mark-to-market loan-to-value ratio, unpaid balance at default and the role of mortgage servicers are critical in shaping LGD outcomes and for effective risk management.
  • The study reveals variations in LGD among fintech firms, banks and nonbanks, illustrating the impact of technological advancements and operational strategies on risk assessments in the mortgage market.
  • Our findings establish a benchmark for the validation of LGD models, demonstrating the robustness and accuracy necessary for reliable model validation in the mortgage sector.

In residential mortgage lending, accurate estimation of loss given default (LGD) is crucial for effective risk management and model validation. This study investigates LGD using machine learning techniques on a data set of conforming loans from government-sponsored enterprises in the United States between January 2012 and March 2018. The research has three objectives: to evaluate the predictive capabilities of various machine learning models for LGD that incorporate a comprehensive set of risk factors; to compare LGD across lender types; and to identify the dominant variables influencing LGD outcomes. Ensemble tree-based methods (specifically the CatBoost, Gradient-Boosting Regressor and Light Gradient-Boosting Machine Regressor algorithms) demonstrated superior accuracy, which was further enhanced through hyperparameter tuning. These models highlighted the importance of both loan-specific details and broader economic indicators in LGD estimation. The study found that fintech firms presented LGD patterns distinct from those of banks and nonbanks, suggesting differing risk assessment approaches. Key variables impacting LGD predictions included the mark-to-market loan-to-value ratio, the unpaid balance at default, the liquidation timeline and the mortgage servicer, as well as regional economic factors. This paper contributes to methodological advancement in LGD prediction at the origination stage, which is crucial for proactively managing loan portfolios, and in risk model validation, offering valuable insights for lenders and policy makers refining risk frameworks within the evolving mortgage market.
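To make the modelling idea concrete, the following is a minimal sketch of gradient boosting for a continuous LGD target, compared against a mean-only benchmark on held-out loans. It is not the authors' implementation and does not use CatBoost or LightGBM: it is a deliberately small, standard-library version using decision stumps and squared-error loss. The data are synthetic, and the feature ranges, the assumed LGD relationship, the function names and the hyperparameters (`n_rounds`, `learning_rate`) are all illustrative assumptions rather than values from the paper.

```python
import random

random.seed(0)

def make_loans(n):
    """Generate hypothetical (features, lgd) pairs. NOT real GSE data."""
    rows, targets = [], []
    for _ in range(n):
        mtm_ltv = random.uniform(0.6, 1.4)   # mark-to-market loan-to-value
        upb = random.uniform(50.0, 400.0)    # unpaid balance at default ($k)
        # Assumed relationship: LGD rises once MTM LTV exceeds 0.8.
        lgd = 0.10 + 0.5 * max(mtm_ltv - 0.8, 0.0) + random.gauss(0.0, 0.03)
        rows.append((mtm_ltv, upb))
        targets.append(min(max(lgd, 0.0), 1.0))  # keep LGD in [0, 1]
    return rows, targets

def fit_stump(rows, residuals):
    """Find the one-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        for thr in values[:-1]:  # split is "<= thr" vs "> thr"
            left = [r for row, r in zip(rows, residuals) if row[j] <= thr]
            right = [r for row, r in zip(rows, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]

def predict_stump(stump, row):
    j, thr, lmean, rmean = stump
    return lmean if row[j] <= thr else rmean

def fit_boosted(rows, targets, n_rounds=30, learning_rate=0.3):
    """Squared-loss boosting: each stump is fitted to current residuals."""
    base = sum(targets) / len(targets)
    preds = [base] * len(targets)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(targets, preds)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, rows)]
    return base, learning_rate, stumps

def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)

def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

train_rows, train_y = make_loans(120)
test_rows, test_y = make_loans(60)
model = fit_boosted(train_rows, train_y)

# Benchmark: predict the training-set mean LGD for every held-out loan.
baseline_mse = mse(test_y, [model[0]] * len(test_y))
boosted_mse = mse(test_y, [predict(model, row) for row in test_rows])
print(f"baseline MSE: {baseline_mse:.5f}  boosted MSE: {boosted_mse:.5f}")
```

Comparing a candidate model with a naive benchmark on data it has not seen is one simple form of the validation exercise the abstract describes. In practice the libraries named above would replace the hand-written booster, and real loan-level data would replace the synthetic sample.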
