Welcome to my website...
Here you will find my views on actuarial issues of the day, along with background and articles from my work. 

Informational Center

                     Risk Analysis, Management and Modeling
                   for the Insurance and Financial Industries

        Gary Venter

      President of the Gary Venter Company
        Actuary in Residence at Columbia University

              Associate Editor of the
     Casualty Actuarial Society’s publication, Variance

                     On the editorial boards of the North American Actuarial Journal of the
                    Society of Actuaries and the International Actuarial Association's Astin Bulletin


                  MAAA, Member of the American Academy of Actuaries
          FCAS, Fellow of the Casualty Actuarial Society
      ASA, Associate of the Society of Actuaries
CERA, Chartered Enterprise Risk Analyst





Asset Modeling Issues

ERM models use so-called economic scenario generators to simulate possible asset price movements. Often third-party ESGs are used and relied on uncritically, under the assumption that the modelers must know what they are doing. However, most of the asset models in the actuarial literature are quite a bit behind the state of the art in asset risk modeling. More advanced models and model-testing methodology are discussed in "Advances in Modeling of Financial Series," from the 2010 ERM Symposium, but even these are restricted to models that can be simulated fairly easily, which is perhaps too limiting. Paths to models that are a bit more complex but that can handle issues like stochastic volatility are also discussed.


Much of the asset-modeling literature is directed towards pricing derivatives, but modeling for risk management is a bit different. A discussion of methodology for testing the set of scenarios generated by an ESG from a risk-management perspective can be found in "Testing Distributions of Stochastically Generated Yield Curves," from the May 2004 ASTIN Bulletin, a winner of the Bob Alting von Geusau Memorial Prize.

Underwriting Cycle

Profitability cycles have been a feature of the insurance industry for as long as anybody can remember. Numerous academic and actuarial papers have tried to account for this, using concepts such as information lags, rational expectations, regulatory conflicts, capital cycles, loss shocks, investment opportunities, cycles in jurisprudence, etc. The accumulating evidence suggests that there are numerous drivers, and the prime determinants can change from cycle to cycle.


Personal lines have managed to gain better control of the cycle than commercial lines. Proposed explanations include a potentially oligopolistic market structure, shorter policy terms, which shorten the information lags, and more direct management control of pricing. Improving information systems in the commercial lines have the potential for better monitoring of underwriter pricing adjustments, which could help match the progress made in the personal lines.

One of the more difficult issues to model in the commercial cycles is the degree of price elasticity and corresponding brand loyalty, as well as how much having a strong balance sheet can attract new business when a tight market arises. The role of an acquisition strategy tied to the cycle is also worth exploring. The Manage the Cycle Series at GC Capital Ideas provides a magazine-level discussion of some of the cycle issues.

View Slide Presentation on Cycles

Robust Estimation

George Box's adage that all models are wrong but some are useful (to which Giorgio Armani heartily concurs) is at odds with much of statistical theory. When evaluating a fitting methodology, standard practice is to assume that the sample is being drawn from the model being fit. If so, MLE looks quite good as an estimation methodology. However, if the data is being drawn from a more complex process, and the model is just a simplified representation, there are several ways that MLE can fall apart. It is not at all robust to contamination of the data by unusual cases. Also, sometimes a small number of cases are a strain for the model to fit, and end up having undue influence on the parameters.


Robust estimation offers approaches to these problems. Some alternative estimators are nearly as efficient as MLE with well-behaved data but are not as distorted by outliers. For instance, when fitting distributions, doing MLE on every subset of 4 observations and then taking the median of those estimates is a reasonably good robust estimator, although more computationally intensive.
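As a quick sketch of the subset-median idea (the exponential distribution and all numbers here are chosen just for illustration, and random subsets stand in for "every subset"): for the exponential the MLE of the mean on a subset is simply the subset average, so the robust estimator reduces to the median of many 4-point averages.

```python
import random
import statistics

def subset_median_mle(data, subset_size=4, n_subsets=2000, seed=0):
    """Median of MLEs fit to many small random subsets of the data.
    For the exponential distribution the MLE of the mean on a subset
    is just the subset average, so this reduces to the median of
    many 4-point averages."""
    rng = random.Random(seed)
    estimates = [statistics.mean(rng.sample(data, subset_size))
                 for _ in range(n_subsets)]
    return statistics.median(estimates)

rng = random.Random(42)
clean = [rng.expovariate(1.0) for _ in range(200)]   # true mean 1.0
contaminated = clean + [50.0, 60.0]                  # two wild outliers

mle_contaminated = statistics.mean(contaminated)     # plain MLE: pulled well up
robust_contaminated = subset_median_mle(contaminated)
```

On the contaminated data the plain MLE is dragged far above the true mean of 1, while the subset-median estimate stays close to it, at the cost of some bias and much more computation.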

Another part of robust estimation is measuring the impact of the individual observations on the parameters. If a few observations are having a very large impact, alternative models can be tried to find some without that problem. This is the approach taken in "Robustifying Reserving,"  from the fall 2008 CAS Forum, which applies this aspect of robust estimation to building loss reserving models. It appears that models that are better overall can often be found by trying to avoid large impacts of individual cells in a triangle.

  Liquidity Risk

In its extreme form, liquidity risk concerns assets that become inaccessible because the market for them has dried up. A step in that direction is when the value that can be realized is temporarily very low due to the market being limited. It is just as critical to manage liquidity risk as asset, liability, and credit risk, because if values are not realizable when cash is needed, financial distress can occur.



Several studies have found that there is a liquidity premium in the prices of many securities, such as corporate bonds and, to a lesser degree, credit default swaps and even some Treasury notes. Ignoring the liquidity risk can make it appear that arbitrage is possible, when in fact the positions can be quite risky.

There are a number of measures for quantifying the liquidity of various securities and these can be accumulated to estimate the liquidity position of the firm or even of the economy.

For a discussion of quantification and management of liquidity risk, see "Modeling and Managing Liquidity Risk" from the SOA essay series and the presentation, Liquidity Risk.

  Value of Risk Transfer

Standard finance theory, based on modern finance of the 1950s, holds that it is not worth paying anything for a widely held corporation to transfer risk, because it is more efficient for shareholders to spread the risk through diversification of their holdings. One problem with this theory is that it depends on the assumption that corporations can always lend and borrow at the same rate. However, when firms are in financial distress, it can become very expensive to raise new capital. This makes it worthwhile for firms to reduce the chance of going into financial distress by buying risk transfer.


This is a special case of a more general objection: there are instances where reducing risk can increase expected earnings. For insurance companies, carrying too much risk can turn off potential customers. Building up financial strength is costly but in many cases can increase earnings or at least prevent a decrease.

Attach study note.


Loss Modeling

Modeling losses for ERM is fairly advanced, but there are a few difficult technical issues where improvements to most models can be made. These include modeling of parameter risk, modeling association among risk sources, and capturing the multi-year risks of liability runoff.

Parameter Risk

For high-frequency lines of business, the collective risk model implemented in most ERM models can end up showing little risk for companies with large books of business. Basically, the relative variance of aggregate losses decreases as one over the expected number of claims.

This can end up showing too little risk for these lines, as there are loss elements that do not diversify the way the collective risk model assumes. For instance if the price is wrong, you do not make that up by adding volume. Uncertain inflation is one of the sources of parameter risk that does not diversify with volume. Such risks can actually be the largest source of uncertainty for large companies, particularly if the catastrophe risk has been reinsured. Many internal models understate the parameter risk component.
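To see the effect numerically, here is a sketch of the coefficient of variation of aggregate losses under a Poisson collective risk model with constant severity, with and without an independent mean-1 multiplier (standing in for uncertain inflation) of coefficient of variation c; the formula CV^2 = (1 + c^2)(1 + 1/n) - 1, roughly 1/n + c^2, and the 5% value are illustrative assumptions.

```python
import math

def cv_aggregate(n_claims, param_cv=0.0):
    """Coefficient of variation of aggregate losses for Poisson
    frequency with expected count n_claims and constant severity,
    optionally scaled by an independent mean-1 multiplier (e.g.
    uncertain inflation) with coefficient of variation param_cv:
    CV^2 = (1 + c^2)(1 + 1/n) - 1, roughly 1/n + c^2."""
    c2 = param_cv ** 2
    return math.sqrt((1 + c2) * (1 + 1 / n_claims) - 1)

# process risk alone shrinks without limit as volume grows,
# but a 5% parameter CV puts a floor under the total risk:
cv_process_only = cv_aggregate(1_000_000)               # ~0.001
cv_with_param = cv_aggregate(1_000_000, param_cv=0.05)  # ~0.05
```

At a million expected claims the process risk is negligible, so essentially all the remaining risk is the non-diversifying parameter component.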

View Slide Presentation on Parameter Risk


Modeling Association

This is sometimes called correlation or dependency, but both those terms are misleading. People tend to think of correlation in terms of correlation coefficients, which are very limited ways to express association among risks. Dependency seems to imply that one risk is driving another, which might not be the case at all, as in the old adage "correlation does not imply causation".

Whether associating lines of business, asset risks, credit risks, or whatever, besides the strength of the association, another critical factor is where in the probability distribution the association takes place. Some lines might be largely independent of each other but could both be hit by the same catastrophic events, for instance. Bonds might be largely uncorrelated except in hard economic times, etc. In both these cases the association is strongest in the tails of the distributions. Failure to model the potential for high correspondence in extreme events can severely understate the tail of the distribution of the sum of the risks, and has done so in practice.

Scalar measures like tail association coefficients are an attempt to quantify where the correlation is strong, but they are still single numbers, which can only give a limited picture of the linkages. Descriptive functions that show properties of the relationships across the whole probability spectrum provide more powerful analysis.
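A minimal sketch of one such descriptive function is the empirical right-tail concentration R(z) = P(U > z, V > z)/(1 - z) computed on rank-transformed data (the function name and the uniform examples below are illustrative, not from any of the papers cited):

```python
import random

def right_tail_concentration(u, v, z):
    """Empirical right-tail concentration R(z) = P(U > z, V > z) / (1 - z)
    for samples already transformed to uniform [0, 1] ranks.  As z -> 1
    this tends to the upper tail dependence coefficient, but plotted
    across all z it is a descriptive function, not a single number."""
    n = len(u)
    joint = sum(1 for a, b in zip(u, v) if a > z and b > z) / n
    return joint / (1 - z)

rng = random.Random(1)
n = 20_000
u = [rng.random() for _ in range(n)]
v_indep = [rng.random() for _ in range(n)]   # no association: R(z) ~ 1 - z
v_comono = u[:]                              # perfect association: R(z) ~ 1
```

Plotting R(z) for z near 1 for fitted copulas against the empirical version gives a visual check on how well each copula captures the tail linkage.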

"Tails of Copulas" from the 2002 PCAS, introduces copulas and gives examples of a number of bivariate copulas and illustrates how descriptive functions can help decide among the copulas. This paper won the Dorweiler prize.

"Quantifying Correlated Reinsurance Exposures with Copulas", a Ferguson prize paper from the 2003 CAS Spring Forum, discusses the t-copula, which is a multivariate copula that provides complete flexibility for the correlation matrix as well as allowing for control over the strength of association in the tails. Multivariate correlation coefficients and descriptive functions are introduced and examples given.

However the t-copula strongly links the tail association with the correlation coefficient. This is not always appropriate with real data, so more choices for multivariate copulas are needed. A step in that direction is provided in "Multivariate Copulas for Financial Modeling" from Variance 1:1. A more flexible version of the t-copula is introduced and some other multivariate copulas are reviewed. Still, more work on finding useful multivariate copulas is called for.
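A hedged sketch of the standard t-copula construction (correlated normals divided by a shared chi-square draw, then rank-transformed); the shared divisor occasionally inflates both coordinates at once, which is the mechanism behind the tail dependence discussed above. The sample size, rho and nu values are arbitrary.

```python
import math
import random

def t_copula_pairs(n, rho, nu, seed=0):
    """Sample n pairs from a bivariate t-copula: correlated standard
    normals are divided by a shared sqrt(chi-square/nu) factor and the
    results rank-transformed to [0, 1]."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        g1, g2 = rng.gauss(0, 1), rng.gauss(0, 1)
        z1 = g1
        z2 = rho * g1 + math.sqrt(1 - rho * rho) * g2
        w = sum(rng.gauss(0, 1) ** 2 for _ in range(nu)) / nu  # chi-square/nu
        s = math.sqrt(w)
        xs.append(z1 / s)
        ys.append(z2 / s)

    def to_ranks(vals):
        order = sorted(range(len(vals)), key=vals.__getitem__)
        ranks = [0.0] * len(vals)
        for i, j in enumerate(order):
            ranks[j] = (i + 0.5) / len(vals)
        return ranks

    return to_ranks(xs), to_ranks(ys)

u, v = t_copula_pairs(20_000, rho=0.5, nu=4)
z = 0.95
tail_conc = sum(1 for a, b in zip(u, v) if a > z and b > z) / len(u) / (1 - z)
# well above the ~0.05 an independent pair would show at z = 0.95
```

The rank transform means no t distribution function is needed; only the copula (the joint ranks) is retained.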

View Slide Presentation on Modeling Association

Liability Runoff Risk

There are multiple issues surrounding stochastic modeling of reserves, which is a huge topic with many papers in the literature. One continuing problem in practice is failure to test the assumptions of the models against the data. For a detailed discussion of methodology for testing the popular chain-ladder method of age-to-age development factors, see "Testing Assumptions of Age-to-Age Factors," PCAS 1998.


Probably the biggest weakness of using age-to-age factors is that they miss calendar-year specific effects, such as years with high inflation, or years with a slowdown in claims processing, etc. Regression methodology applied to the loss reserving problem is capable of modeling such effects. However actuaries tend to have had little practical experience in building regression models. Typical modeling issues like choice of explanatory variables, eliminating insignificant parameters and other parameter-reduction techniques make quite a difference in the goodness of fit and the standard errors of the model. The paper "Refining Reserve Runoff Ranges," from the 2007 Spring Forum, illustrates how these regression methods can produce more parsimonious reserve models that reduce runoff ranges by reducing parameter uncertainty.

An emerging topic in reserving is simultaneous estimates of outstanding losses using both paid and incurred development triangles. Formalized methods have been designed for this problem. The paper "Distribution and Value of Reserves Using Paid and Incurred Triangles," from the 2008 Fall Forum, illustrates the application of regression model-building to this problem, as well as carrying the results into estimation of runoff ranges.

Even with good models of the underlying reserving process, the risk of future inflation on the runoff is not readily captured by the parameters fit. Another model of the impact of future inflation risk usually has to be superimposed on the runoff. However care must be taken to understand the inflation impacts that are already in the model, which are often not explicit, in order to avoid double counting. See "Stochastic Trend Models in Casualty and Life Insurance," from the 2009 ERM Symposium for a discussion of some of the methodology involved.

Ratemaking Stories

  Credibility Theory

A few decades ago, Stuart Klugman told me that a stat colleague of his had pointed out that all of least-squares credibility theory follows immediately from the standard stat method of weighting different estimates of the same thing inversely to their variances - the means of two samples, for instance. This is done because it minimizes the estimation variance. I eventually expanded on this comment in "Credibility Theory for Dummies," in the 2003 Winter Forum.
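The inverse-variance weighting can be written directly; with var_ind the variance of the individual estimate and var_grand that of the grand mean, the weight on the individual estimate is the credibility factor Z (a sketch with purely illustrative numbers):

```python
def inverse_variance_weight(est_ind, var_ind, est_grand, var_grand):
    """Blend two unbiased estimates of the same quantity with weights
    inverse to their variances, which minimizes the variance of the
    blend.  The weight on the individual estimate is the credibility
    factor Z = var_grand / (var_ind + var_grand)."""
    z = var_grand / (var_ind + var_grand)
    return z * est_ind + (1 - z) * est_grand, z

# a class with sample mean 120 (estimation variance 400) blended
# with an overall mean of 100 (estimation variance 100):
blend, z = inverse_variance_weight(120.0, 400.0, 100.0, 100.0)
# z = 0.2, blend = 104
```

Dividing numerator and denominator of Z by var_grand recovers the familiar n/(n + k) form when the individual variance shrinks inversely with volume.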

+ Expand


However this simplified approach to credibility also simplifies some of the generalizations. For instance, when estimating a vector of correlated variables, like frequency of the four most serious injury types for workers compensation, using the correlations improves the estimates, and a fairly simple formula that looks like credibility but with covariance matrices can be derived directly by minimizing the estimation variance. This theory is worked out in "Structured Credibility in Applications - Hierarchical, Multi-dimensional and Multivariate Models," in ARCH, 1985. Jose Couret and I apply this to excess workers compensation pricing to get excess prices at a class level instead of just a hazard group level in "Using Multi-Dimensional Credibility to Estimate Class Frequency Vectors in Workers Compensation," in ASTIN 2008. There it is shown that selecting risks based on this methodology can produce a preferred risk group.

In the original 1990 CAS textbook "Foundations of Casualty Actuarial Science" I included some generalizations of the standard theory in the credibility chapter. Much of this was already printed in the 1987 Fall Forum.

One generalization was to look at the case where individual risk variance does not decrease inversely with volume. This had been found years earlier by Charles Hewitt, summarized in the maxim that a large risk is not an independent combination of smaller risks. Basically, risk conditions change and there are common influences, producing dependence across an organization and increasing risk. The implication for credibility is that full credibility is never achieved. Related size effects result if large risks display less variability from the average. The text worked out the formulas in the case of a single year of observation, as they got very messy otherwise. Howard Mahler later worked out all the gory details of the multi-year case.

The comparison to empirical Bayes approaches, worked out earlier by Morris and Van Slyke, was also extended. Credibility is regarded as non-parametric but uses the minimization of squared error, which is the parametric result of normal distribution assumptions. Empirical Bayes just assumes normality to begin with. In practice the insistence on non-parametric methods creates a problem for credibility theory when estimating the parameters. As Jose Garrido once aptly described it, the methodology first pretends you know the variances and finds an unbiased estimator of the credibility factor in that case. Then you plug in unbiased estimators of the variances. The problem is that one of those variances is in the denominator, and the reciprocal of an unbiased estimator is not an unbiased estimator of the reciprocal. This yields biased credibility formulas, which empirical Bayes avoids by parametric methods.
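The reciprocal point is easy to demonstrate by simulation (a sketch; the normal distribution, variance of 4 and sample size of 7 are just for illustration): averaging the reciprocal of the unbiased sample variance does not give the reciprocal of the true variance.

```python
import random
import statistics

rng = random.Random(7)
n = 7
true_var = 4.0
recips = []
for _ in range(20_000):
    sample = [rng.gauss(0.0, true_var ** 0.5) for _ in range(n)]
    s2 = statistics.variance(sample)   # unbiased estimator of true_var
    recips.append(1.0 / s2)

avg_recip = statistics.mean(recips)
# 1/s2 is biased upward: for normal samples E[1/s2] equals
# (1/true_var) * (n-1)/(n-3) = 0.375 here, versus 1/true_var = 0.25
```

The bias factor (n-1)/(n-3) for the normal case is the one discussed in the adjustment below.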

The adjustment in the normal case is to multiply 1 - Z by the factor (N-1)/(N-3), where N is the sample size. I argue that for other distributions the correction is likely to be even greater than this, so being non-parametric is not a valid excuse for not doing it. However there are cases where the fact that 1 - Z would be capped to stay in [0,1] reduces the bias and makes the correction unnecessary. This could probably use further research.

Another connection made in that chapter is the relationship of credibility to regression. When you estimate an individual mean as Z times its sample value plus 1 - Z times an overall mean, that is a regression for the next observation. You could redo the regression after the next observations come in, and compare the credibility regression based on structural assumptions to the actual regression after the data is available. In that chapter I do this for an empirical Bayes example with batting averages.

The textbook chapter contained a section on why credibility does not work for heavy-tailed distributions and shows that taking logs first helps a lot in that case. Think of workers comp rates that vary from 10¢ per $100 of payroll for architects to $100 per $100 for window washers on the skyscrapers they design. Credibility weighting with an average rate of $5 is not going to be that useful, but in logs you are minimizing squared relative errors, which would be more meaningful. That portion did not make it into the Forum article, but survives today in the Loss Models textbook. In the 3rd edition, examples 20.29 and 20.30 are based on this work, but work out exactly some distributions that I had only simulated.

A credibility approach to loss reserving is found in my 1989 IME paper "A Three-way Credibility Approach to Loss Reserving." However subscriber access to IME is needed for this. Basically it credibility weights the pegged incurred from ratemaking with both the chain ladder and Bornhuetter-Ferguson estimates. It is a generalization of Ira Robbin's similar paper on claim counts, which I had partially generalized in my review in the 1986 PCAS.

One application of credibility is experience rating. In a 1987 article "Experience Rating - Equity and Predictive Accuracy,"  in NCCI Digest, I argue that it is fair to charge insureds for past experience to the extent it is predictive of future experience. An empirical example is found in "A Comparative Analysis of Most European and Japanese Bonus-malus Systems: Extension," in the 1991 Journal of Risk and Insurance. Basically I find that the auto bonus-malus plans studied actually did not give enough weight to past accident experience.

All of that is from the least-squares approach to credibility. My paper "Classical Partial Credibility with Application to Trend," (Dorweiler Prize winner) from the 1986 PCAS, addresses a number of issues in the limited fluctuation approach, including a probabilistic interpretation of the square-root rule, adjusting credibility for the NP instead of the normal approximation, and application of the paradigm to trend credibility.

  Loss Distributions

In 1983 when I published my PCAS paper on transformed beta and gamma distributions, James McDonald published the same families of distributions in an economics journal, using the catchier moniker GB2 for what I had called the transformed beta. We had many of the same results but he had one key one I did not: the lognormal is a limiting case of all the general families. My terminology made it into "Loss Models," but much of academia knows only of the GB2.


Later on, Rodney Kreps reparameterized these distributions in a way that made the distribution functions more awkward but the interpretation of the parameters more meaningful. Where Loss Models would have the GB2 distribution function F(x) expressed with the incomplete beta function as β(τ, α; 1/u), with u = 1 + (θ/x)^γ, Rodney had it as β(τ/γ, α/γ; 1/u) with the same u. What this does is make the range of k for which the kth moment exists equal to (-τ, α). Thus α now by itself gives the tail strength, and τ tells how many negative moments exist. The number of negative moments turns out to determine the basic shape of the distribution. This is all laid out in my 2003 Winter Forum paper "Effects of Parameters of Transformed Beta Distributions." Besides these effects of α and τ, γ moves around the middle of the distribution and θ is a scaling parameter.

Kreps's parameterization also makes it easy to find the limiting distribution when γ → ∞, which I had missed and McDonald may have too. This limit turns out to be a power curve that meets an inverse power curve (i.e., simple beta meets simple Pareto). It also makes it clear that the simple Pareto is a limiting case of the GB2, but that can also be seen by taking a limit of the inverse transformed gamma.
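A sketch of the density and moments in this parameterization (my reconstruction from the description above; the parameter values in the numeric check are arbitrary). The closed-form kth moment is finite exactly for k in (-τ, α):

```python
import math

def gb2_density_kreps(x, alpha, tau, gam, theta):
    """Transformed beta (GB2) density, reparameterized in the
    Kreps style described above: moments E[X^k] are finite exactly
    for -tau < k < alpha, so alpha alone gives the tail strength
    and tau counts the negative moments."""
    u = (x / theta) ** gam
    beta_const = (math.gamma(tau / gam) * math.gamma(alpha / gam)
                  / math.gamma((alpha + tau) / gam))
    return gam * u ** (tau / gam) / (x * beta_const * (1 + u) ** ((alpha + tau) / gam))

def gb2_moment(k, alpha, tau, gam, theta):
    """Closed-form kth moment, valid for -tau < k < alpha."""
    return (theta ** k * math.gamma((tau + k) / gam) * math.gamma((alpha - k) / gam)
            / (math.gamma(tau / gam) * math.gamma(alpha / gam)))

# numeric sanity check on an arbitrary parameter set: the density
# integrates to ~1 and its numeric mean matches the closed form
alpha, tau, gam, theta = 3.0, 2.0, 1.5, 10.0
xs = [math.exp(-8 + 20 * i / 20_000) for i in range(20_001)]  # log-spaced grid
total = mean_num = 0.0
for a, b in zip(xs, xs[1:]):
    mid = 0.5 * (a + b)
    f = gb2_density_kreps(mid, alpha, tau, gam, theta)
    total += f * (b - a)
    mean_num += mid * f * (b - a)
```

With α = 3 the tail is Pareto-like with index 3, and with τ = 2 two negative moments exist, independent of the other parameters.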

A practical problem in fitting distributions to data is that actuaries typically have data excess of a number of different retentions. We have been taught how to adjust the likelihood function in this case, basically by dividing each observation’s probability by the survival function at its retention. This procedure can be improved upon, however, if we also know the exposures at each retention. Then a joint likelihood for frequency and severity can incorporate the additional information. This is presented in my 2003 Winter Forum paper "MLE for Claims with Several Retentions."  There may be typos in the formulas however.
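The basic adjustment, dividing each observation's density by the survival function at its retention, has a closed form for the single-parameter Pareto, which makes a compact illustration (this is the severity-only version, not the joint frequency-severity likelihood the paper develops; retentions, counts and the true tail index are made up for the demo):

```python
import math
import random

def pareto_alpha_mle(claims):
    """MLE of the tail index alpha for single-parameter Pareto claims
    observed excess of several different retentions.  Each claim's
    likelihood is its density divided by the survival function at its
    own retention, and the combined loglikelihood maximizes in closed
    form: alpha_hat = n / sum(log(x_i / r_i))."""
    return len(claims) / sum(math.log(x / r) for x, r in claims)

# simulate claims excess of three retentions, true alpha = 2
rng = random.Random(3)
claims = []
for retention in (10_000, 25_000, 100_000):
    for _ in range(500):
        u = 1.0 - rng.random()            # uniform in (0, 1]
        x = retention * u ** (-1 / 2.0)   # inverse-CDF sample above retention
        claims.append((x, retention))

alpha_hat = pareto_alpha_mle(claims)
```

Each claim contributes relative to its own retention, so data excess of different retentions combines into one estimate without discarding anything.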

When there is little or no data, actuaries have often used improvised rules for increased limits factors. A classic is Riebesell’s method, which uses geometrically increasing factors. Mack and Fackler found that this is consistent with an almost Pareto severity. The paper "Riebesell Revisited" lays out some of the issues. (Link to - will send)

  Risk Loads in Ratemaking

My first CAS paper was "Profit/Contingency Loadings and Surplus: Ruin and Return Implications," in the 1979 call papers. Based on an idea of Charles Hachemeister, I worked out the capital and profit load that would simultaneously meet a return and a solvency criterion. This was conceptually similar to a later paper of Kreps. In the 1983 PCAS I took a look at finding realistic utility functions for pricing insurance in "Utility with Decreasing Risk Aversion."


There are problems with using utility theory for corporations, however, including the fact that they tend to have diverse and diversified ownership. In 1991 I applied the arbitrage-free pricing approach of using means under transformed probabilities for pricing purposes. The paper "Premium Calculation Implications of Reinsurance without Arbitrage" seemed to arouse some controversy, although it was hardly the first actuarial paper to suggest such an approach.

One of the objections was that the insurance market is incomplete (you can’t short others’ policies, for instance) so you do not have to worry about arbitrage. This is incorrect, however. In a complete market the no-arbitrage condition uniquely determines prices, which it does not in an incomplete market, but vigorous competition in itself is enough to prevent arbitrage from being widely available.

Another objection was that arbitrage-free prices are additive, and so violate the benefits of pooling risk. I would counter, however, that in a competitive situation, the benefits of pooling go mostly to the customers. Imagine, for instance, that a fleet of 100 cars gets less profit load than a portfolio of 100 cars built up individually. Then the insurer can reinsure the portfolio completely for an arbitrage profit. If this were possible, soon the profit loads on the individual policies would be eaten away to get the arbitrage, eventually resulting in no possible arbitrage. However, to the extent that pooling makes the insurer more financially sound, the customers are getting better risk protection, and the insurer should get some of the benefit of pooling for providing a more valuable product.

A third objection, attributed eventually to Thomas Mack, was that when you move probabilities towards the more adverse outcomes to get a higher mean, you are moving it away from the lower losses, and you can thus construct contracts that would get negative risk loads. The buyback of a franchise deductible turns out to be such a contract. Although not likely to be a problem in practice, it is a bit disconcerting to have a methodology that is always faced with some negative risk loads. This problem has now been solved by doing joint transforms of frequency and severity probabilities, discussed below. Basically, mean frequency is increased more than any severity probability is decreased, so no negative loads are possible.

My paper used scale transforms as an example, but that is not a good transform for pricing. Shaun Wang did a number of papers introducing other transforms, such as the proportional hazards transform and the Wang transform. A paper by Møller discussed joint transforms of frequency and severity, such as the minimum martingale transform and the minimum entropy martingale transform. In the 2004 PCAS my discussion paper "Discussion of Distribution-Based Pricing Formulas Are Not Arbitrage-Free," based on a paper of David Ruhm, reviewed the joint transforms as well as discussed a point that Ruhm had helped clarify: you do not get arbitrage-free prices by transforming the distribution of each contract’s results - rather you have to transform the underlying distribution of basic losses, then apply the transformed probabilities to each contract.
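As one concrete example of a probability transform, the Wang transform S*(x) = Φ(Φ⁻¹(S(x)) + λ) can be applied to an empirical sample of equally likely outcomes (a sketch; the two-point loss distribution and λ = 0.5 are illustrative, and this transforms a single contract's distribution rather than the underlying losses as the arbitrage-free framework requires):

```python
from statistics import NormalDist

def wang_transform_price(losses, lam):
    """Mean under Wang-transformed probabilities,
    S*(x) = Phi(Phi^{-1}(S(x)) + lam), applied to an empirical sample
    of equally likely loss outcomes.  lam > 0 shifts probability toward
    the adverse outcomes, so the transformed mean exceeds the plain mean."""
    nd = NormalDist()
    xs = sorted(losses)
    n = len(xs)
    price = 0.0
    for i, x in enumerate(xs):
        # transformed survival just below and just above this order statistic
        s_hi = nd.cdf(nd.inv_cdf(1 - i / n) + lam) if i > 0 else 1.0
        s_lo = nd.cdf(nd.inv_cdf(1 - (i + 1) / n) + lam) if i < n - 1 else 0.0
        price += x * (s_hi - s_lo)
    return price

losses = [100.0] * 90 + [1000.0] * 10        # plain mean 190
loaded = wang_transform_price(losses, 0.5)   # ~295: risk load built in
unloaded = wang_transform_price(losses, 0.0) # lam = 0 recovers the mean
```

Setting λ = 0 leaves the probabilities unchanged, so the transformed mean collapses back to the plain expected loss.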

In the 2004 AFIR Colloquium paper "Market Value of Risk Transfer: Catastrophe Reinsurance Case," the minimum martingale and minimum entropy martingale transforms are compared to actual reinsurance pricing. The minimum entropy transform seems to work a bit better. Still, market price is not the end of the story, as company reservation prices might be different, especially because specific risk is important in insurance. That is why I recommend calibrating the probability transforms to company risk, not market risk, as discussed in "Strategic Planning, Risk Pricing and Firm Value", from the 2009 Astin Colloquium.



  Mortality Risk Modeling

Mortality patterns are evolving so projection of mortality curves is necessary, but risky, for lines of business that involve mortality patterns, including workers compensation. It turns out that modeling mortality risk is a lot like modeling casualty loss reserve risk. The data can be arranged in a triangle where the rows are year of birth, the columns are age at death, and the SW - NE diagonals are year of death, which is exactly analogous to a loss development triangle. However it is more typical to put the data into a rectangle, with the rows the year of death, which makes the NW - SE diagonals the year of birth, which is called "cohort" in the mortality literature. (Technically the cohort is year of death minus age at death. Due to differences in when in the year the person was born and died, the cohorts can actually range over two calendar years. In casualty triangles this vagueness typically applies to the payment lag, as accident year and calendar year are usually precise.)


A standard model for mortality rates is the Lee-Carter (LC) model, which postulates a fixed mortality curve, to which a calendar-year level parameter is applied. The changes in the calendar-year levels are the trends. The levels are modified by an age-at-death factor, which allows the mortality trend to be faster or slower at different ages. However it appears that the mortality patterns have been changing in more complex ways than the LC model can account for. In the US this is particularly the case for males. A modification of the LC model includes cohort effects. This is directly analogous to models like Zehnwirth’s for trends in all three directions.
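The Lee-Carter structure log m(x,t) = a_x + b_x k_t can be sketched with a simple one-stage fit (row averages, residual column sums, and per-age regression slopes; a sketch, not the SVD or Poisson-MLE machinery used in practice, and all parameter values below are made up). On data generated exactly from the model it recovers the parameters:

```python
def fit_lee_carter(log_m):
    """One-stage approximation to the Lee-Carter fit: a_x is the row
    average of log mortality, k_t the column sum of residuals, and b_x
    the least-squares slope of each age's residuals on k_t, normalized
    so sum(b) = 1 (the usual identifiability constraint)."""
    ages, years = len(log_m), len(log_m[0])
    a = [sum(row) / years for row in log_m]
    k = [sum(log_m[x][t] - a[x] for x in range(ages)) for t in range(years)]
    kss = sum(kt * kt for kt in k)
    b = [sum((log_m[x][t] - a[x]) * k[t] for t in range(years)) / kss
         for x in range(ages)]
    s = sum(b)
    return a, [bx / s for bx in b], [kt * s for kt in k]

# synthetic check: exact LC data with sum(b) = 1 and sum(k) = 0
a_true = [-4.0, -5.0, -6.0]
b_true = [0.5, 0.3, 0.2]
k_true = [3.0, 1.0, -1.0, -3.0]
log_m = [[a_true[x] + b_true[x] * k_true[t] for t in range(4)] for x in range(3)]
a_fit, b_fit, k_fit = fit_lee_carter(log_m)
```

The changes in the fitted k_t series are the mortality trends, and b_x shows which ages trend faster or slower.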

"Mortality Trend Risk" from the 2010 ERM Symposium, looks at the LC model with and without cohort effects, applied to US male and female mortality patterns. The LC model has problems with the changes in shape of the mortality curve, but adding cohort effects leads to much better fits. Unfortunately however, it appears that the cohort parameters are misleading. Many of the oldest and newest cohorts have only a few observations, affecting points in the NE and SW corners of the rectangle, respectively. There is nothing to constrain the model from using fairly extreme values for those cohort parameters, which are then able to capture the change in shape of the mortality curve, but seem very unlikely to apply to the unobserved ages of death for the cohorts.

The paper also looks at distributions of the residuals of the fit, and finds that the Poisson distribution is not a good fit. More highly dispersed distributions, including various forms of the negative binomial and Sichel distributions, are found to fit much better. Parameter estimation error distributions and correlations are estimated by the inverse of the information matrix from MLE. It is well known that calendar-year trend projects a trend in both the other directions, so the calendar-year, cohort, and mortality-by-age parameters are correlated. For female mortality these correlations are so strong that the individual parameters are virtually meaningless when cohort effects are included, and the majority of the parameters are not statistically significant, even though a sizable increase in the loglikelihood over the LC model is produced. Possibilities for other models that might be able to address these issues are included, but this remains an area where more research is needed to have useful models.

  ERM Model Application Issues

One of the applications of ERM modeling is determining the level of capital to hold. However, as policyholders have a degree of capital sensitivity in their risk-aversion attitudes, competitive pressures also play a role in finding the right level of capital. More and more companies are using their internal capital models to quantify risk levels of business units, such as lines of business, chiefly through allocating risk measures and capital to business units. This can be used, for instance, to analyze risk transfer (e.g., reinsurance ceded) alternatives. Returns on allocated capital are also used to compare the value added of the business units, which leads to setting target returns by unit.


Thus capital allocation broadens the modeling focus from overall risk of the portfolio to a pricing function involving the value of risk. This has implications for the risk measures used and the allocation methodology. Tail risk measures, like VaR and TVaR, provide informative benchmarks for overall capital. For instance, capital that is 3 times VaR99% or 2 times TVaR99.6% might be regarded as strong. In any case, such multiples can be used to benchmark capital strength. On the other hand, tail measures are not all that useful for risk pricing. Any risk taken can end up being painful, and there should be some charge associated with it. This suggests that risk measures used for pricing should take into account any potential variability from expected loss levels.

TVaR is the expected loss conditional on the losses exceeding some threshold. One of its weaknesses is that it is linear in such losses, which does not reflect typical risk aversion. An alternative is risk-adjusted TVaR, which is the conditional excess mean plus a portion of the conditional excess standard deviation. This treats larger losses more adversely, which is more realistic. Taken excess of target profit levels, it can include all loss potential while still weighting bigger losses more.
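A sketch of the two tail measures on an empirical sample, under one simple reading of the definitions above (the threshold convention, the weight c = 0.5 on the conditional standard deviation, and the loss sample are all illustrative choices):

```python
import statistics

def tvar(losses, p):
    """TVaR_p: the average of the worst (1 - p) share of outcomes."""
    xs = sorted(losses)
    tail = xs[round(len(xs) * p):]
    return statistics.mean(tail)

def risk_adjusted_tvar(losses, p, c=0.5):
    """Risk-adjusted TVaR: conditional tail mean plus a portion c of the
    conditional tail standard deviation, so the larger losses in the
    tail are penalized rather than just averaged."""
    xs = sorted(losses)
    tail = xs[round(len(xs) * p):]
    return statistics.mean(tail) + c * statistics.stdev(tail)

losses = [float(i) for i in range(1000)]     # illustrative loss sample
plain = tvar(losses, 0.99)                   # 994.5: mean of the worst 10
adjusted = risk_adjusted_tvar(losses, 0.99)  # higher, reflecting tail spread
```

Two tails with the same conditional mean but different spread get the same TVaR yet different risk-adjusted TVaR, which is the point of the adjustment.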

For very large losses, however, second moment risk measures fail to capture the level of risk aversion seen in market prices. An alternative is distortion measures, which use the mean of risk-adjusted probability measures. These can be tuned to provide higher charges for the most extreme loss levels. They also fit into risk pricing theory. However it turns out that it is not right to tune these measures to market prices, as each company’s own risk situation is important. Both customer risk aversion and the high cost of raising new capital make company-specific risk financially consequential. Thus the risk adjustments have to be tuned to company risk levels and profit targets, not market levels.
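As one concrete example of a distortion measure, here is a proportional-hazards transform g(u) = u**c applied to the empirical survival function. This is a standard choice for illustration, not necessarily the specific measure discussed in the paper; c < 1 shifts probability weight toward the most extreme losses.

```python
import numpy as np

def distorted_mean(losses, c=0.8):
    """Mean under a proportional-hazards distortion g(u) = u**c.

    Sorting losses ascending, the distorted probability mass on
    the i-th order statistic is g(S_i) - g(S_{i+1}), where
    S_i = (n - i)/n is the empirical survival just below it.
    The masses sum to g(1) - g(0) = 1.
    """
    x = np.sort(np.asarray(losses))
    n = len(x)
    surv = (n - np.arange(n + 1)) / n   # S_0 = 1, ..., S_n = 0
    g = surv ** c
    mass = g[:-1] - g[1:]               # distorted masses
    return (x * mass).sum()

# Hypothetical small loss sample
losses = np.array([100.0, 250.0, 400.0, 900.0, 5000.0])
dm = distorted_mean(losses, c=0.8)
```

Since g(u) >= u on [0, 1] when c < 1, the distorted mean is at least the plain mean for nonnegative losses, so the risk charge is the positive loading above expected loss. Setting c = 1 recovers the ordinary mean.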

A more detailed discussion can be found in "Strategic Planning, Risk Pricing and Firm Value," from the 2009 ASTIN Colloquium.

  Beyond GLM

Generalized linear models (GLMs) provide a useful tool for many actuarial applications, but they carry distributional restrictions (the exponential family) that often lead to suboptimal fits. Users of GLM software typically have no way to discern the effects of such constraints. As computer speeds have increased and flexible optimization routines have been released, the calculation advantages of the GLM form have become less important, making it possible to go beyond the exponential family to find distributions that provide better fits. In many cases the parameters of whatever covariates are used to fit the means of the data cells do not change a great deal when this is done, but ranges of outcomes are increasingly important in risk analysis, and getting the right distributional assumptions can make a significant difference in the ranges estimated.


The distributions in the exponential family are characterized by the relationship between the variance and mean of the cells. In some of the well-known distributions, the variance is a multiple of a power of the mean, and that power determines the member of the exponential family. For instance, the power zero indicates a normal distribution, the power 1 is a compound Poisson with constant severity, 2 is a gamma, 3 is an inverse Gaussian, and a power between 1 and 2 is a compound negative binomial with a gamma severity (with a relationship forced between the frequency and severity means). The reason that this can be unduly limiting is that each of these distributions has a particular shape, and there is no way within the exponential family to get the shape of one distribution with the variance-mean relationship of another. In fact for all the distributions noted above, the ratio of the coefficient of skewness to the coefficient of variation is exactly the power of the mean that gives the variance.
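The skewness-to-CV relationship can be checked directly from closed-form moments; the gamma case below (power 2) uses arbitrary illustrative parameter values.

```python
import numpy as np

# For a gamma with shape a and scale s:
#   mean     = a * s
#   variance = a * s**2 = mean**2 / a   (variance ~ mean**2)
#   CV       = 1 / sqrt(a)
#   skewness = 2 / sqrt(a)
# so skewness / CV = 2, matching the exponential-family power
# p = 2 cited in the text. (Similarly, the inverse Gaussian has
# skewness = 3 * CV, matching p = 3, and the normal has
# skewness = 0, matching p = 0.)
a, s = 4.0, 3.0
mean = a * s
var = a * s ** 2
cv = np.sqrt(var) / mean
skew = 2.0 / np.sqrt(a)
power = skew / cv
```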

In the ASTIN Bulletin 2007 paper "Generalized Linear Models beyond the Exponential Family with Loss Reserve Applications", methods are laid out for starting with any distributional form for each cell but altering the relationship of the parameters among the cells in order to get the variance proportional to any power of the mean. The power in fact can be fit as part of the MLE algorithm. Thus for instance each cell may have a lognormal distribution, but the mean and variance may be independent, as they are in the normal distribution. Usually the lognormal variance is proportional to the square of the mean, which implies that it is not in the exponential family, since that mean-variance relationship is taken by the gamma.
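A sketch of such a reparameterization for the lognormal case: given a cell mean mu and a target variance phi * mu**p, the lognormal parameters follow from the usual moment formulas. The function name and the (phi, p) notation here are illustrative, not the paper's.

```python
import numpy as np

def lognormal_params(mu, phi, p):
    """Lognormal (m, s) for a cell with mean mu and variance
    phi * mu**p. Setting p = 2 recovers the usual lognormal
    scaling; other powers step outside the exponential family.

    Uses mean = exp(m + s**2/2) and
         var  = mean**2 * (exp(s**2) - 1).
    """
    var = phi * mu ** p
    s2 = np.log1p(var / mu ** 2)
    m = np.log(mu) - s2 / 2
    return m, np.sqrt(s2)

# Hypothetical cell: mean 1000, variance 50 * mean**1.5
m, s = lognormal_params(mu=1000.0, phi=50.0, p=1.5)
mean_back = np.exp(m + s ** 2 / 2)
var_back = mean_back ** 2 * np.expm1(s ** 2)
```

Fitting the same m, s pair per cell from a shared phi and p across all cells is what ties the variance to the chosen power of the mean while keeping the lognormal shape.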

The fitting must be done by MLE instead of using GLM software, but this is very easily done with modern software. For a novice it may in fact be easier to learn the optimization routines than the GLM programs.
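As an illustration of how little machinery the MLE route requires, here is a minimal fit of a lognormal by numerical optimization. The simulated data and starting values are arbitrary, and scipy's Nelder-Mead is just one of many routines that would serve.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data: lognormal with log-mean 1.0, log-sd 0.5
rng = np.random.default_rng(7)
data = rng.lognormal(mean=1.0, sigma=0.5, size=2000)

def nll(theta):
    """Negative log-likelihood of a lognormal (up to a constant).

    The scale parameter is optimized on the log scale to keep
    it positive without needing bound constraints.
    """
    m, log_s = theta
    s = np.exp(log_s)
    z = (np.log(data) - m) / s
    return np.sum(np.log(data) + np.log(s) + 0.5 * z ** 2)

fit = minimize(nll, x0=[0.0, 0.0], method="Nelder-Mead")
m_hat, s_hat = fit.x[0], np.exp(fit.x[1])
```

The same pattern (write the cell log-likelihood, hand it to a general optimizer) extends directly to the variance-power models above, with the power itself included in the parameter vector.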

Loss reserving is a good application of this approach. Since paid or incurred losses typically follow a compound frequency-severity distribution, it is fairly common to find the variance of a cell to be near the mean raised to a power well below 2. But the skewness is sometimes relatively high, a shape usually associated with a gamma, inverse Gaussian, or lognormal distribution. With this methodology you can combine a distribution like that with whatever mean-variance relationship fits.

A version of this approach for discrete distributions, emphasizing the negative binomial, Poisson-inverse Gaussian, and Sichel distributions, is discussed in Appendix 1 of the paper "Mortality Trend Risk".








Copyright 2010.  All Rights Reserved.  Gary Venter.