Journal of Risk Model Validation
ISSN 1753-9579 (print) | 1753-9587 (online)
Editor-in-chief: Steve Satchell
Smoothing algorithms by constrained maximum likelihood: methodologies and implementations for Comprehensive Capital Analysis and Review stress testing and International Financial Reporting Standard 9 expected credit loss estimation
Need to know
Smoothing algorithms for monotonic rating-level PD and rating migration probability are proposed. The approaches can be characterized as follows:
- They are based on constrained maximum likelihood, with a fair risk scale for the estimates determined fully by constrained maximum likelihood, leading to fair and more robust credit loss estimation.
- Default correlation is accounted for by using the asset correlation under the Merton model.
- The quality of the smoothed estimates is assessed by the likelihood ratio test and by the impact on credit loss of the change in risk scale for the estimates.
- These approaches generally outperform the interpolation method and regression models, and are easy to implement using, for example, SAS PROC NLMIXED.
Abstract
In the process of loan pricing, stress testing, capital allocation, modeling of probability of default (PD) term structure and International Financial Reporting Standard 9 expected credit loss estimation, it is widely expected that higher risk grades carry higher default risks, and that an entity is more likely to migrate to a closer nondefault rating than a more distant nondefault rating. In practice, sample estimates for the rating-level default rate or rating migration probability do not always respect this monotonicity rule, and hence the need for smoothing approaches arises. Regression and interpolation techniques are widely used for this purpose. A common issue with these, however, is that the risk scale for the estimates is not fully justified, leading to a possible bias in credit loss estimates. In this paper, we propose smoothing algorithms for rating-level PD and rating migration probability. The smoothed estimates obtained by these approaches are optimal in the sense of constrained maximum likelihood, with a fair risk scale determined by constrained maximum likelihood, leading to more robust credit loss estimation. The proposed algorithms can be easily implemented by a modeler using, for example, the SAS procedure PROC NLMIXED. The approaches proposed in this paper will provide an effective and useful smoothing tool for practitioners in the field of risk modeling.
1 Introduction
Given a risk-rated portfolio with ratings $R_1, R_2, \ldots, R_k$, we assume that rating $R_1$ is the best quality rating and $R_k$ is the worst, ie, the default rating. It is widely expected that higher risk ratings carry higher default risk, and that an entity is more likely to be downgraded or upgraded to a closer nondefault rating than a more distant nondefault rating. The following constraints are therefore required:

$$p_1 \le p_2 \le \cdots \le p_{k-1}, \tag{1.1}$$
$$p_{ij} \le p_{i,j+1}, \quad 1 \le j < i, \tag{1.2}$$
$$p_{ij} \ge p_{i,j+1}, \quad i \le j \le k-2, \tag{1.3}$$

where $p_i$, $1 \le i \le k-1$, denotes the probability of default (PD) for rating $R_i$, and $p_{ij}$, $1 \le i, j \le k-1$, is the migration probability from a nondefault initial rating $R_i$ to a nondefault rating $R_j$.
Estimates that satisfy the above monotonicity constraints are called smoothed estimates. Smoothed estimates are widely expected for rating-level PD and rating migration probability in the process of loan pricing, capital allocation, Comprehensive Capital Analysis and Review (CCAR) stress testing (Board of Governors of the Federal Reserve System 2016), modeling of PD term structure and International Financial Reporting Standard 9 expected credit loss (ECL) estimation (Ankarath et al 2010).
In practice, sample estimates for rating-level PD and rating migration probability do not always respect these monotonicity rules. This calls for smoothing approaches. Regression and interpolation methods have been widely used for this purpose. A common issue with these approaches is that the risk scale for the estimates is not fully justified, leading to possibly biased credit loss estimates.
In this paper, we propose smoothing algorithms based on constrained maximum likelihood (CML). These CML-smoothed estimates are optimal in the sense of constrained maximum likelihood, with a fair risk scale determined by constrained maximum likelihood, leading to a fair and more justified loss estimation. As shown by the empirical examples for rating-level PD in Section 2.3, the CML approach is more robust than the logistic and log-linear models, with quality being measured based on the resulting likelihood ratio, the predicted portfolio level PD and the impacted ECL.
This paper is organized as follows. In Section 2, we propose smoothing algorithms for smoothed rating-level PD, for the cases with and without default correlation. A smoothing algorithm for multinomial probability is proposed in Section 3. Empirical examples are given accordingly in Sections 2 and 3, and in Section 2 we benchmark the CML approach for rating-level PD with a logistic model proposed by Tasche (2013) and a log-linear model proposed by van der Burgt (2008). Section 4 concludes.
2 Smoothing rating-level probability of default
2.1 The proposed smoothing algorithm for rating-level PD assuming no default correlation
Cross-section or within-section default correlation may arise from commonly shared risk factors. In this case, we assume that the sample is observed at a point in time given the commonly shared risk factors, and that defaults occur independently conditional on these factors.
Let $d_i$ and $n_i$ be the observed default and nondefault frequencies, respectively, for a nondefault risk rating $R_i$. Let $p_i$ denote the PD for an entity with a nondefault initial rating $R_i$. With no default correlation, we can assume that the default frequency follows a binomial distribution. Then the sample loglikelihood is given by

$$LL = \sum_{i=1}^{k-1} \bigl[\, d_i \log p_i + n_i \log (1 - p_i) \,\bigr] \tag{2.1}$$

up to a summand given by the logarithms of the related binomial coefficients, which are independent of $p_1, \ldots, p_{k-1}$. By taking the derivative of (2.1) with respect to $p_i$ and setting it to zero, we have

$$\frac{d_i}{p_i} - \frac{n_i}{1 - p_i} = 0 \quad \Longrightarrow \quad p_i = \frac{d_i}{d_i + n_i}.$$

Therefore, the unconstrained maximum likelihood estimate for $p_i$ is just the sample default rate $d_i/(d_i + n_i)$.
We propose the following smoothing algorithm for the case when no default correlation is assumed.
Algorithm 2.1 (Smoothing rating-level PD assuming no default correlation).

- (a) For a given constant $c \ge 0$, maximize the loglikelihood (2.1) subject to the constraints
  $$\log p_{i+1} - \log p_i \ge c, \quad 1 \le i \le k-2,$$
  together with $0 < p_1$ and $p_{k-1} \le 1$. With $c = 0$ this reduces to the monotonicity constraint (1.1); a larger $c$ enforces a minimum log-scale separation between consecutive rating-level PDs.
- (b) The constrained optimization can be implemented by using, for example, SAS PROC NLMIXED (SAS Institute 2009).
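The optimization in Algorithm 2.1 is low dimensional and easy to reproduce outside SAS. Below is a minimal Python sketch (the paper itself uses SAS PROC NLMIXED; the function name and the use of scipy here are illustrative assumptions). The constraints $\log p_{i+1} - \log p_i \ge c$ are enforced by reparameterization, so an unconstrained optimizer suffices.

```python
import numpy as np
from scipy.optimize import minimize

def smooth_pd_cml(d, m, c=0.0):
    # d[i]: default count for rating i; m[i]: total record count for rating i.
    # Maximizes the binomial loglikelihood (2.1), where the nondefault
    # frequency is m[i] - d[i], subject to log p[i+1] - log p[i] >= c,
    # enforced by the reparameterization
    #   log p[0] = t[0],  log p[i] = log p[i-1] + c + exp(t[i]).
    d, m = np.asarray(d, float), np.asarray(m, float)

    def pd_from(t):
        return np.exp(np.cumsum(np.concatenate(([t[0]], c + np.exp(t[1:])))))

    def negloglik(t):
        p = np.clip(pd_from(t), 1e-12, 1.0 - 1e-12)
        return -np.sum(d * np.log(p) + (m - d) * np.log(1.0 - p))

    # start from the (floored) sample default rates, spaced feasibly
    r = np.maximum(d, 0.5) / m
    t0 = np.concatenate(([np.log(r[0])],
                         np.log(np.maximum(np.diff(np.log(r)) - c, 0.1))))
    res = minimize(negloglik, t0, method="Nelder-Mead",
                   options={"maxiter": 50000, "xatol": 1e-10, "fatol": 1e-10})
    return pd_from(res.x)

# Table 1 sample with c = 0: ratings 2 and 3, nonmonotonic in the sample,
# are pooled to a common smoothed PD (cf the CML row of Table 2).
smoothed = smooth_pd_cml([1, 11, 22, 124, 62, 170],
                         [5529, 11566, 29765, 52875, 4846, 4318])
```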
2.2 The proposed smoothing algorithms for rating-level PD assuming default correlation
Default correlation can be modeled by the asymptotic single risk factor (ASRF) model using asset correlation. Under the ASRF model framework, the risk for an entity is governed by a latent random variable $z$, called the firm's normalized asset value, which splits into the following two parts (Miu and Ozdemir 2009):

$$z = \sqrt{\rho}\, s + \sqrt{1 - \rho}\, \varepsilon, \tag{2.4}$$

where $s$ denotes the common systematic risk and $\varepsilon$ is the idiosyncratic risk independent of $s$. The quantity $\rho$ is called the asset correlation. It is assumed that there exist threshold values $b_1 \le b_2 \le \cdots \le b_{k-1}$ (ie, the default points) such that an entity with an initial risk rating $R_i$ will default when $z$ falls below the threshold value $b_i$. The long-run PD for rating $R_i$ is then given by $p_i = \Phi(b_i)$, where $\Phi$ denotes the standard normal cumulative distribution function (CDF).
Let $p_i(s)$ denote the PD for an entity with an initial risk rating $R_i$ given the systematic risk $s$. It is shown in Yang (2017) that

$$p_i(s) = \Phi\!\left(\frac{b_i - \sqrt{\rho}\, s}{\sqrt{1 - \rho}}\right), \tag{2.5}$$

where the formula follows because, conditional on $s$, the asset value $z$ in (2.4) is normal with mean $\sqrt{\rho}\, s$ and variance $1 - \rho$.
Let $n_{it}$ and $d_{it}$ denote, respectively, the number of entities and the number of defaults at time $t$ for rating $R_i$, $1 \le t \le T$. Given the latent factor $s$, we propose the following smoothing algorithm for correlated rating-level long-run PDs by using (2.5).
Algorithm 2.2 (Smoothing correlated rating-level long-run PDs given the latent systematic risk factor).

- (a) Parameterize $p_i(s)$ for a nondefault rating $R_i$ by (2.5), with the default points treated as the free parameters:
  $$b_i = \Phi^{-1}(p_i), \quad 1 \le i \le k-1, \tag{2.6}$$
  where, for a given constant $c \ge 0$, the following constraints are satisfied:
  $$b_i + c \le b_{i+1}, \quad 1 \le i \le k-2. \tag{2.7}$$
- (b) Maximize the loglikelihood of observing the default frequencies $\{d_{it}\}$ over the default points subject to (2.7), treating the systematic factor $s_t$ at each time $t$ as an independent standard normal random effect. Optimization with a random effect can be implemented by using, for example, SAS PROC NLMIXED (SAS Institute 2009).
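For step (b), the marginal likelihood can also be written down directly: given $s_t$, defaults across ratings are independent binomials, and $s_t$ is integrated out against the standard normal density. A minimal Python sketch using Gauss–Hermite quadrature follows (an illustrative alternative to PROC NLMIXED's adaptive quadrature; the asset correlation `rho` is taken as given, and the ordering constraints (2.7) are assumed to be handled by reparameterization, as in the previous sketch).

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss  # probabilists' Hermite
from scipy.stats import binom, norm

def negloglik_correlated(b, d, m, rho, n_quad=32):
    # b: default points b_1..b_{k-1}; d, m: T x (k-1) arrays of quarterly
    # default and record counts; rho: asset correlation.
    d, m = np.asarray(d, float), np.asarray(m, float)
    x, w = hermegauss(n_quad)        # nodes/weights for f(x) exp(-x^2/2) dx
    w = w / np.sqrt(2.0 * np.pi)     # rescale to an expectation under N(0,1)
    # p_i(s) per (2.5): one row per quadrature node, one column per rating
    p = norm.cdf((np.asarray(b, float)[None, :] - np.sqrt(rho) * x[:, None])
                 / np.sqrt(1.0 - rho))
    ll = 0.0
    for t in range(d.shape[0]):
        # conditional independence across ratings given s_t
        log_f = binom.logpmf(d[t][None, :], m[t][None, :], p).sum(axis=1)
        shift = log_f.max()          # log-sum-exp for numerical stability
        ll += shift + np.log(np.dot(w, np.exp(log_f - shift)))
    return -ll
```

Minimizing this over the default points (and, if desired, jointly over `rho`) gives the CML-smoothed long-run PDs $\hat{p}_i = \Phi(b_i)$.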
When some key risk factors $x_1, x_2, \ldots, x_m$, common to all ratings, are observed, we assume the following decomposition for the systematic risk factor $s$:

$$s = \lambda\, \tilde{x} + \sqrt{1 - \lambda^2}\, e, \qquad \tilde{x} = \frac{x - \mu}{\sigma},$$

where the common index $x = a_1 x_1 + a_2 x_2 + \cdots + a_m x_m$ is a linear combination of the variables $x_1, x_2, \ldots, x_m$, with $\mu$ and $\sigma$ being the mean and standard deviation of $x$.

Let $p_i(x)$ denote the PD given a scenario $x$. Assume that $e$ is standard normal and independent of $x$. Then we have (Yang 2017, Theorem 2.2)

$$p_i(x) = \Phi\!\left(\frac{b_i - \lambda \sqrt{\rho}\, \tilde{x}}{\sqrt{1 - \lambda^2 \rho}}\right) \tag{2.9}$$

for some $0 \le \lambda \le 1$. (This follows because, conditional on $\tilde{x}$, the normalized asset value $z$ in (2.4) is normal with mean $\sqrt{\rho}\, \lambda \tilde{x}$ and variance $1 - \lambda^2 \rho$.)
Let $x_t$ denote the value of $x$ at time $t$ for $1 \le t \le T$. Given $\{x_t\}$, we propose the following smoothing algorithm for correlated rating-level long-run PDs and rating-level point-in-time PDs by using (2.9).
Algorithm 2.3 (Smoothing correlated rating-level PDs given the common index $x$).

- (a) Parameterize $p_i(x)$ for a nondefault rating $R_i$ by (2.9), with the default points treated as the free parameters as in (2.6):
  $$b_i = \Phi^{-1}(p_i), \quad 1 \le i \le k-1, \tag{2.10}$$
  where, for a given constant $c \ge 0$, the following constraints are satisfied:
  $$b_i + c \le b_{i+1}, \quad 1 \le i \le k-2. \tag{2.11}$$
- (b) Maximize the loglikelihood of observing the default frequencies $\{d_{it}\}$ given the observed index values $\{x_t\}$, over the default points and $\lambda$ subject to (2.11). Since $x$ is observed, no random effect is required, and the optimization can again be implemented by using, for example, SAS PROC NLMIXED.
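Once the default points and $\lambda$ have been estimated, (2.9) produces a point-in-time PD for any scenario value of the common index, which is what the PD term structure and stress testing applications consume. A small sketch under the reconstruction of (2.9) above (`mu`, `sigma`, `rho` and `lam` as defined in the text; the function name is illustrative):

```python
import numpy as np
from scipy.stats import norm

def pit_pd(p_long_run, x, mu, sigma, rho, lam):
    # point-in-time PDs per (2.9) for a scenario value x of the common index;
    # lam is the correlation between s and the standardized index
    b = norm.ppf(np.asarray(p_long_run))   # default points b_i = Phi^{-1}(p_i)
    x_std = (x - mu) / sigma               # standardized index
    return norm.cdf((b - lam * np.sqrt(rho) * x_std)
                    / np.sqrt(1.0 - rho * lam ** 2))
```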
2.3 Empirical examples: smoothing of rating-level PDs
Example 1: smoothing rating-level long-run PDs assuming no default correlation
Table 1 shows the record count and default rate (DF rate) for a synthetically created sample with six nondefault risk ratings.
Algorithm 2.1 will be benchmarked by the following methods.
Table 1: Sample default counts (DF), record counts and default rates by risk rating.

| | 1 | 2 | 3 | 4 | 5 | 6 | Portfolio level |
|---|---|---|---|---|---|---|---|
| DF | 1 | 11 | 22 | 124 | 62 | 170 | 391 |
| Count | 5 529 | 11 566 | 29 765 | 52 875 | 4 846 | 4 318 | 108 899 |
| DF rate (%) | 0.0173 | 0.0993 | 0.0739 | 0.2352 | 1.2833 | 3.9442 | 0.3594 |
- LGL1: with this approach, the PD for rating $R_i$ is estimated by $\hat{p}_i = \exp(a + bi)$, where $i$ denotes the index for rating $R_i$, ie, $i = 1$ for rating $R_1$. Parameters $a$ and $b$ are estimated by a linear regression of the form below, using the logarithm of the sample default rate $r_i$ for each rating:
  $$\log r_i = a + bi + \varepsilon_i.$$
  A common issue with this approach is the unjustified uniform risk scale (in the log space) for all ratings. In addition, this approach generally causes the portfolio-level PD to be underestimated, due to the convexity of the exponential function (its second derivative is positive): the regression fits the average of the log default rates, and, by Jensen's inequality, the exponential of an average is no greater than the average of the exponentials.
- LGL2: like method LGL1, the rating-level PD is estimated by $\hat{p}_i = \exp(a + bi)$. However, parameters $a$ and $b$ are estimated by maximizing the loglikelihood given in (2.1). With this approach, the bias for the portfolio PD can generally be avoided, though the issue with the unjustified uniform risk scale remains.
- EXP-CDF: this method was proposed by van der Burgt (2008). With this approach, the rating-level PD is estimated by $\hat{p}_i = \exp(a + b \hat{F}_i)$, where $\hat{F}_i$ denotes, for rating $R_i$, the adjusted sample cumulative distribution,
  $$\hat{F}_i = \frac{n_1 + n_2 + \cdots + n_{i-1} + \tfrac{1}{2} n_i}{n_1 + n_2 + \cdots + n_{k-1}}, \tag{2.13}$$
  where $n_j$ denotes the record count for rating $R_j$. Instead of estimating the parameters via a cap ratio (van der Burgt 2008), we estimate them by maximizing the loglikelihood given in (2.1).
- LGST-INVCDF: this method follows the logistic model of Tasche (2013). The rating-level PD is estimated by $\hat{p}_i = 1/(1 + \exp(-a - b\,\Phi^{-1}(\hat{F}_i)))$, ie, the logistic function applied to the normal inverse of the adjusted sample cumulative distribution in (2.13), with parameters $a$ and $b$ again estimated by maximizing the loglikelihood given in (2.1). A sketch of this shared maximum likelihood fitting step is given after this list.
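The three likelihood-fitted benchmarks (LGL2, EXP-CDF and LGST-INVCDF) share the same fitting step: choose a per-rating covariate $u_i$, then maximize (2.1) over the two parameters. A minimal Python sketch, assuming the midpoint form of the adjusted CDF in (2.13) (function names are illustrative):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import norm

def adjusted_cdf(m):
    # adjusted sample cumulative distribution per (2.13), midpoint convention
    m = np.asarray(m, float)
    return (np.cumsum(m) - 0.5 * m) / m.sum()

def fit_two_param(d, m, u, link="log"):
    # maximize the binomial loglikelihood (2.1) for p_i = g(a + b * u_i):
    #   LGL2:         u = rating index 1..k-1,       g = exp
    #   EXP-CDF:      u = adjusted_cdf(m),           g = exp
    #   LGST-INVCDF:  u = norm.ppf(adjusted_cdf(m)), g = logistic
    d, m, u = (np.asarray(v, float) for v in (d, m, u))
    g = np.exp if link == "log" else expit

    def negloglik(ab):
        p = np.clip(g(ab[0] + ab[1] * u), 1e-12, 1.0 - 1e-12)
        return -np.sum(d * np.log(p) + (m - d) * np.log(1.0 - p))

    return minimize(negloglik, x0=np.array([-6.0, 1.0]),
                    method="Nelder-Mead").x
```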
Estimation quality is measured by the following.
- $p$-value: this is the $p$-value calculated from the likelihood ratio chi-squared test, with degrees of freedom equal to the number of restrictions. A higher $p$-value indicates a better model (see the sketch after this list).
- ECL ratio: this is the ratio of the expected credit loss based on the smoothed rating-level PDs to that based on the realized rating-level PDs, given the exposure at default (EAD) and loss given default (LGD) parameters for each rating. A significantly lower ECL ratio indicates a possible underestimation of the credit loss.
- PD ratio: the ratio of the portfolio-level PD aggregated from the smoothed rating-level PDs to the portfolio-level PD aggregated from the realized rating-level PDs. A value significantly below 100% indicates a possible underestimation of the PD at portfolio level.
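A sketch of how the first two measures can be computed (helper names are illustrative; `ead` and `lgd` are the per-rating exposure at default and loss given default vectors):

```python
import numpy as np
from scipy.stats import chi2

def lr_pvalue(ll_unconstrained, ll_constrained, n_restrictions):
    # likelihood ratio chi-squared test of the smoothed fit against the
    # unconstrained MLE, with df equal to the number of restrictions
    stat = 2.0 * (ll_unconstrained - ll_constrained)
    return chi2.sf(stat, df=n_restrictions)

def ecl_ratio(pd_smoothed, pd_realized, ead, lgd):
    # ECL under smoothed PDs relative to ECL under realized PDs,
    # holding the per-rating EAD and LGD parameters fixed
    ecl = lambda p: float(np.sum(np.asarray(ead) * np.asarray(lgd)
                                 * np.asarray(p)))
    return ecl(pd_smoothed) / ecl(pd_realized)
```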
Table 2 shows the results for Algorithm 2.1 (labeled "CML") when $c = 0$, along with the benchmarks, where the smoothed rating-level PDs are listed in columns P1–P6. Table 3 shows the sensitivity of the CML estimates to the choice of $c$.
Table 2: Smoothed rating-level PDs (%) and portfolio-level quality measures, Algorithm 2.1 ($c = 0$) versus benchmarks.

| Method | P1 | P2 | P3 | P4 | P5 | P6 | $p$-value (%) | ECL ratio (%) | PD ratio (%) |
|---|---|---|---|---|---|---|---|---|---|
| CML | 0.0173 | 0.0810 | 0.0810 | 0.2352 | 1.2833 | 3.9442 | 95.92 | 99.91 | 100.00 |
| LGL1 | 0.0165 | 0.0416 | 0.1053 | 0.2663 | 0.6732 | 1.7022 | 0.00 | 46.09 | 72.57 |
| LGL2 | 0.0032 | 0.1468 | 0.2901 | 0.4333 | 0.5763 | 0.7191 | 0.00 | 27.58 | 100.07 |
| EXP-CDF | 0.0061 | 0.0086 | 0.0294 | 0.3431 | 1.9081 | 2.5057 | 0.00 | 72.92 | 100.21 |
| LGST-INVCDF | 0.0104 | 0.0188 | 0.0585 | 0.2795 | 1.5457 | 3.4388 | 0.00 | 90.46 | 100.00 |
Table 3: CML-smoothed rating-level PDs (%) by Algorithm 2.1 for different values of $c$.

| $c$ | P1 | P2 | P3 | P4 | P5 | P6 | $p$-value (%) | ECL ratio (%) | PD ratio (%) |
|---|---|---|---|---|---|---|---|---|---|
| 0.0 | 0.0173 | 0.0810 | 0.0810 | 0.2352 | 1.2833 | 3.9442 | 95.92 | 99.91 | 100.00 |
| 0.1 | 0.0173 | 0.0753 | 0.0832 | 0.2352 | 1.2833 | 3.9442 | 89.06 | 99.88 | 100.00 |
| 0.5 | 0.0173 | 0.0552 | 0.0910 | 0.2352 | 1.2833 | 3.9442 | 36.63 | 99.79 | 100.00 |
| 1.0 | 0.0120 | 0.0327 | 0.0890 | 0.2419 | 1.2833 | 3.9442 | 2.54 | 99.63 | 100.00 |
These results show that Algorithm 2.1 significantly outperforms the benchmarks by $p$-value, impacted ECL and aggregated portfolio-level PD. The first log-linear model (LGL1) significantly underestimates the portfolio-level PD. All of the log-linear-type models (LGL1, LGL2 and EXP-CDF) underestimate the ECL significantly.
Example 2: smoothing rating-level long-run PDs in the presence of default correlation
Table 4: Sample statistics: long-run average realized PD and overall distribution by risk rating.

| | 1 | 2 | 3 | 4 | 5 | 6 | Portfolio level |
|---|---|---|---|---|---|---|---|
| Long-run AVG PD (%) | 0.0215 | 0.1027 | 0.0764 | 0.2731 | 1.1986 | 3.8563 | 0.3818 |
| Overall distribution (%) | 5.07 | 10.61 | 27.47 | 48.32 | 4.52 | 4.01 | 100.00 |
The synthetically created sample contains the quarterly default count by rating for a portfolio with six nondefault ratings between 2005 Q1 and 2014 Q4. The (rating-level or portfolio-level) point-in-time default rate is calculated for each quarter and then averaged over the quarters in the sample window to obtain the estimate of the long-run average realized PD (labeled "AVG PD"). The sample distribution by rating (labeled "overall distribution") is calculated by pooling all quarters. Table 4 displays the sample statistics; note the heavy size concentration at rating $R_4$.
Table 5: Smoothed correlated rating-level long-run PDs (%) by Algorithm 2.2.

| $c$ | P1 | P2 | P3 | P4 | P5 | P6 | AIC | Portfolio long-run AVG PD (%) | PD ratio (%) |
|---|---|---|---|---|---|---|---|---|---|
| 0.0 (no correl) | 0.0179 | 0.0836 | 0.0836 | 0.2371 | 1.3076 | 4.0372 | 694.02 | 0.3710 | 97.17 |
| 0.0 (correl) | 0.0183 | 0.0828 | 0.0828 | 0.2545 | 1.1951 | 3.9340 | 594.62 | 0.3843 | 100.66 |
| 0.1 (correl) | 0.0183 | 0.0483 | 0.0966 | 0.2541 | 1.1942 | 3.9318 | 600.79 | 0.3842 | 100.64 |
| 0.2 (correl) | 0.0035 | 0.0176 | 0.0754 | 0.2775 | 1.1859 | 3.9237 | 617.96 | 0.3842 | 100.64 |
| 0.3 (correl) | 0.0010 | 0.0086 | 0.0560 | 0.2905 | 1.1961 | 3.9342 | 637.25 | 0.3845 | 100.71 |
Table 5 shows the smoothed correlated rating-level long-run PD for all six nondefault ratings obtained by using Algorithm 2.2.
Estimation quality is measured by the following.
- AIC: the Akaike information criterion. A lower AIC indicates a better model.
- PD ratio: the ratio of the long-run average predicted portfolio-level PD (labeled "AVG PD") to the long-run average realized portfolio-level PD. A value significantly below 100% indicates a possible underestimation of the PD at portfolio level.
The first row in Table 5 shows results for the case when no default correlation is assumed (labeled "no correl") and $c$ is chosen to be 0, while the second row shows those for the case when default correlation is assumed (labeled "correl"), again with $c = 0$.
The results show that the estimated long-run portfolio-level PD in the first row, where no default correlation is assumed, is lower than that in the second row, where default correlation is assumed. This suggests that the long-run rating-level PD may be underestimated when default correlation is ignored. The higher AIC value in the first row likewise implies that the assumption of no default correlation may not be appropriate.
Note that, when applying Algorithm 2.2 to the sample used in Example 1 while assuming no default correlation, we obtained exactly the same estimates as in Example 1.
3 Smoothing algorithms for multinomial probability
3.1 Unconstrained maximum likelihood estimates for multinomial probability
For $n$ independent trials, where each trial results in exactly one of $k$ fixed outcomes, the probability of observing frequencies $n_1, n_2, \ldots, n_k$, with frequency $n_i$ for the $i$th ordinal outcome, is

$$f(n_1, n_2, \ldots, n_k) = \frac{n!}{n_1!\, n_2! \cdots n_k!}\, p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}, \tag{3.1}$$

where $p_i$ is the probability of observing the $i$th ordinal outcome in a single trial, and

$$n_1 + n_2 + \cdots + n_k = n, \qquad p_1 + p_2 + \cdots + p_k = 1.$$

The loglikelihood is

$$LL = n_1 \log p_1 + n_2 \log p_2 + \cdots + n_k \log p_k \tag{3.2}$$

up to a constant given by the logarithm of the multinomial coefficient, which is independent of the parameters $p_1, \ldots, p_k$. By using the relation $p_k = 1 - p_1 - \cdots - p_{k-1}$ and setting to zero the derivative of (3.2) with respect to $p_i$, $1 \le i \le k-1$, we have

$$\frac{n_i}{p_i} = \frac{n_k}{p_k}.$$

Since this holds for each $i$ and for the fixed $p_k$, we conclude that the vector $(p_1, p_2, \ldots, p_k)$ is in proportion to $(n_1, n_2, \ldots, n_k)$. Thus, the maximum likelihood estimate for $p_i$ is the sample estimate

$$\hat{p}_i = \frac{n_i}{n}. \tag{3.3}$$
3.2 The proposed smoothing algorithm for multinomial probability
We next propose a smoothing algorithm for multinomial probability under the following constraint: for given constants $c_i \ge 0$,

$$(1 + c_i)\, p_i \le p_{i+1}, \quad 1 \le i \le k-1. \tag{3.4}$$

Algorithm 3.1 (Smoothing multinomial probability). Maximize the loglikelihood (3.2) subject to the constraint (3.4) and $p_1 + p_2 + \cdots + p_k = 1$. The optimization can be implemented by using, for example, SAS PROC NLMIXED; a sketch implementation is given below.

In the case when $c_1 = c_2 = \cdots = c_{k-1} = c$, let $1 + c = \min_{1 \le i \le k-1} \{p_{i+1}/p_i\}$. Then $1 + c$ is the maximum lower bound for all the ratios $p_{i+1}/p_i$.
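The following is a minimal Python sketch of Algorithm 3.1 under the constraint form (3.4) reconstructed above (an off-the-shelf constrained optimizer stands in for PROC NLMIXED; with all $c_i = 0$ it reduces to plain monotonic smoothing):

```python
import numpy as np
from scipy.optimize import minimize

def smooth_multinomial(counts, c=None):
    # maximize the multinomial loglikelihood (3.2) subject to
    # (1 + c_i) p_i <= p_{i+1} and sum(p) = 1
    counts = np.asarray(counts, float)
    k = len(counts)
    c = np.zeros(k - 1) if c is None else np.asarray(c, float)

    def negloglik(p):
        return -np.sum(counts * np.log(np.clip(p, 1e-12, 1.0)))

    cons = [{"type": "eq", "fun": lambda p: p.sum() - 1.0}]
    cons += [{"type": "ineq",
              "fun": lambda p, i=i: p[i + 1] - (1.0 + c[i]) * p[i]}
             for i in range(k - 1)]
    res = minimize(negloglik, np.full(k, 1.0 / k), method="SLSQP",
                   bounds=[(1e-12, 1.0)] * k, constraints=cons)
    return res.x
```

For a migration row, the algorithm is applied with the ordering running toward the diagonal on each side, so that both (3.8) and (3.9) of Section 3.3 are enforced.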
3.3 An empirical example: smoothing transition probability matrix
Table 6: Long-run transition probability matrix (a) before and (b) after smoothing. Bold entries in part (a) violate the monotonicity constraints.

(a) Transition probability before smoothing

| Initial rating | p1 | p2 | p3 | p4 | p5 | p6 | p7 |
|---|---|---|---|---|---|---|---|
| 1 | 0.97162 | 0.01835 | **0.00312** | **0.00554** | 0.00104 | 0.00017 | 0.00017 |
| 2 | 0.00621 | 0.94528 | 0.03071 | 0.01284 | **0.00215** | **0.00257** | 0.00025 |
| 3 | 0.00071 | 0.01028 | 0.93803 | 0.04089 | 0.00659 | 0.00277 | 0.00074 |
| 4 | 0.00024 | 0.00069 | 0.01260 | 0.96726 | 0.01261 | 0.00543 | 0.00118 |
| 5 | 0.00039 | 0.00118 | 0.00790 | 0.07996 | 0.82725 | 0.07048 | 0.01283 |
| 6 | 0.00022 | 0.00133 | 0.00266 | **0.04498** | **0.01197** | 0.89940 | 0.03944 |

(b) Transition probability after smoothing

| Initial rating | p1 | p2 | p3 | p4 | p5 | p6 | p7 |
|---|---|---|---|---|---|---|---|
| 1 | 0.97162 | 0.01835 | 0.00433 | 0.00433 | 0.00104 | 0.00017 | 0.00017 |
| 2 | 0.00621 | 0.94528 | 0.03071 | 0.01284 | 0.00236 | 0.00236 | 0.00025 |
| 3 | 0.00071 | 0.01028 | 0.93803 | 0.04089 | 0.00659 | 0.00277 | 0.00074 |
| 4 | 0.00024 | 0.00069 | 0.01260 | 0.96726 | 0.01261 | 0.00543 | 0.00118 |
| 5 | 0.00039 | 0.00118 | 0.00790 | 0.07996 | 0.82725 | 0.07048 | 0.01283 |
| 6 | 0.00022 | 0.00133 | 0.00266 | 0.02847 | 0.02847 | 0.89940 | 0.03944 |
Rating migration matrix models (Miu and Ozdemir 2009; Yang and Du 2016) are widely used for International Financial Reporting Standard 9 ECL estimation and CCAR stress testing. Given a nondefault risk rating $R_i$, let $n_{ij}$ be the observed long-run transition frequency from $R_i$ to $R_j$ at the end of the horizon, and let $n_i = n_{i1} + n_{i2} + \cdots + n_{ik}$. Let $p_{ij}$ be the long-run transition probability from $R_i$ to $R_j$. By (3.3), the maximum likelihood estimate for observing the long-run transition frequencies for a fixed $R_i$ is

$$\hat{p}_{ij} = \frac{n_{ij}}{n_i}. \tag{3.7}$$

It is widely expected that higher risk grades carry greater default risk, and that an entity is more likely to be downgraded or upgraded to a closer nondefault rating than a more distant nondefault rating. The following constraints are thus required:

$$p_{ij} \le p_{i,j+1}, \quad 1 \le j < i, \tag{3.8}$$
$$p_{ij} \ge p_{i,j+1}, \quad i \le j \le k-2, \tag{3.9}$$
$$p_{1k} \le p_{2k} \le \cdots \le p_{k-1,k}. \tag{3.10}$$
The constraint (3.10) is for rating-level PD, which was discussed in Section 2.
Smoothing the long-run migration matrix involves the following steps.

- (a) For each nondefault initial rating $R_i$, apply Algorithm 3.1 to the nondefault migration probabilities $(p_{i1}, \ldots, p_{i,k-1})$, smoothing the upgrade side and the downgrade side of the diagonal so that (3.8) and (3.9) are satisfied.
- (b) Find the CML-smoothed estimates by using Algorithm 2.1 for the rating-level default rate. Keep these CML default rate estimates unchanged and rescale, for each nondefault rating $R_i$, the nondefault migration probabilities so that the entire row sums to 1 (see the sketch after this list).
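A sketch of step (b), where the CML default rate `pd_cml` comes from Algorithm 2.1 and `p_nondefault` holds the smoothed nondefault migration probabilities for the row (function name illustrative):

```python
import numpy as np

def rescale_row(p_nondefault, pd_cml):
    # keep the CML default rate fixed and rescale the smoothed nondefault
    # migration probabilities so that the full row sums to 1
    p = np.asarray(p_nondefault, float)
    return np.append(p * (1.0 - pd_cml) / p.sum(), pd_cml)
```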
Table 6 shows empirical results from using Algorithms 2.1 and 3.1 to smooth the long-run migration matrix, where for Algorithm 3.1 all the constants $c_i$ are set to zero.
The sample used here is also synthetically created. It consists of the historical quarterly rating transition frequencies for a commercial portfolio from 2005 Q1 to 2015 Q4. There are seven risk ratings, with $R_1$ being the best quality rating and $R_7$ being the default rating.
Part (a) of Table 6 shows the sample estimates for the long-run transition probabilities before smoothing, while part (b) shows the CML-smoothed estimates. There are three rows, highlighted in bold in part (a), where the sample estimates violate (3.8) or (3.9) (while (3.10) is satisfied). The rating-level sample default rates (the column labeled "p7") do not require smoothing.
As shown in the table, the CML-smoothed estimates are the simple average of the relevant nonmonotonic sample estimates. (For the structure of CML-smoothed estimates for multinomial probabilities, we show theoretically in a separate paper that the CML-smoothed estimate for an ordinal class is either the sample estimate or the simple average of the sample estimates for some consecutive ordinal classes including the named class.)
4 Conclusions
Regression and interpolation approaches are widely used for smoothing rating transition probability and rating-level probability of default. A common issue with these methods is that the risk scale for the estimates does not have a strong mathematical basis, leading to possible bias in credit loss estimation. In this paper, we propose smoothing algorithms based on constrained maximum likelihood for rating-level PD and for rating migration probability. These smoothed estimates are optimal in the sense of constrained maximum likelihood, with a fair risk scale determined by constrained maximum likelihood, leading to fair and more justified credit loss estimation. The algorithms can be implemented by a modeler using, for example, the SAS procedure PROC NLMIXED.
Declaration of interest
The author reports no conflicts of interest. The author alone is responsible for the content and writing of the paper. The views expressed in this paper are not necessarily those of the Royal Bank of Canada or any of its affiliates.
Acknowledgements
The author thanks both referees for suggesting extended discussion to cover both the case when default correlation is assumed and the likelihood ratio test for the constrained maximum likelihood estimates. Special thanks to Carlos Lopez for his consistent input, insights and support for this research. Thanks also go to Clovis Sukam and Biao Wu for their critical reading of this manuscript, and Zunwei Du, Wallace Law, Glenn Fei, Kaijie Cui, Jacky Bai and Guangzhi Zhao for many valuable conversations.
References
- Ankarath, N., Ghosh, T. P., Mehta, K. J., and Alkafaji, Y. A. (2010). Understanding IFRS Fundamentals. Wiley.
- Board of Governors of the Federal Reserve System (2016). Comprehensive Capital Analysis and Review 2016: summary instructions. Report, January, Board of Governors of the Federal Reserve System, Washington, DC.
- Miu, P., and Ozdemir, B. (2009). Stress testing probability of default and rating migration rate with respect to Basel II requirements. The Journal of Risk Model Validation 3(4), 3–38 (https://doi.org/10.21314/JRMV.2009.048).
- SAS Institute (2009). SAS 9.2 user’s guide: the NLMIXED procedure. SAS Institute Inc., Cary, NC.
- Tasche, D. (2013). The art of probability-of-default curve calibration. The Journal of Credit Risk 9(4), 63–103 (https://doi.org/10.21314/JCR.2013.169).
- van der Burgt, M. J. (2008). Calibrating low-default portfolios, using the cumulative accuracy profile. The Journal of Risk Model Validation 1(4), 17–33 (https://doi.org/10.21314/JRMV.2008.016).
- Yang, B. H. (2017). Point-in-time probability of default term structure models for multiperiod scenario loss projection. The Journal of Risk Model Validation 11(1), 73–94 (https://doi.org/10.21314/JRMV.2017.164).
- Yang, B. H., and Du, Z. (2016). Rating-transition-probability models and Comprehensive Capital Analysis and Review stress testing. The Journal of Risk Model Validation 10(3), 1–19 (https://doi.org/10.21314/JRMV.2016.155).