ERM Flashcards — Part 3 FINAL

Risk Modeling

JJ · 11 min read · Aug 28, 2017

Final versions: [Pt.1] [Pt.2] [Pt.3] [Pt.4] [Pt.5] [Pt.6]

  • [15] Dynamic financial analysis: Models enterprise risks to output cashflows for projecting balance sheets and profit/loss accounts
  • [15] Financial condition reports: Considers risks and new business expectations to report the current and future solvency positions of a company
  • [15] ALM: Asset liability matching is a method of projecting assets and liabilities within the same model, using consistent assumptions, to assess how well the assets match the liabilities and to understand the probable evolution of cashflows
  • [15] Black swan event (3): 1. the event is a surprise; 2. it has a major impact; 3. it is rationalized in hindsight (relevant data was available but unaccounted for)
  • [15] Possible limitations of data for risk assessment (4): 1. limited in volume; 2. heterogeneous (not applicable); 3. biased positive (survivorship bias); 4. alternative data sources could introduce new issues (lower reliability)
  • [15] Pearson’s rho: Measure of linear dependence between variables, takes values in the range [-1,1]
  • [15] Advantage of linear correlation: Value is unchanged under strictly increasing linear transformations; rho(a+b*X, c+d*Y)=rho(X,Y) for b, d > 0
  • [15] Limitations of linear correlation (5): 1. value changes under non-linear strictly increasing transformations; 2. a reliable measure of dependence only if the joint distribution is elliptical; 3. not defined if var(X) or var(Y) is infinite; 4. variables can be linearly uncorrelated but non-linearly dependent; 5. a joint distribution may not be attainable for given marginal distributions of X and Y and a given rho
  • [15] Advantage of rank correlation over linear correlation: Linear correlation is dependent on both the joint distribution and marginal distributions, rank correlation of a bivariate distribution is independent of marginal distributions
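
A quick sketch of the point above (Python with numpy/scipy assumed; figures fabricated): a strictly increasing but non-linear transformation changes Pearson's rho, while Spearman's rank correlation is unaffected because it depends only on ranks.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Bivariate normal sample with true linear correlation 0.8
cov = [[1.0, 0.8], [0.8, 1.0]]
x, y = rng.multivariate_normal([0, 0], cov, size=10_000).T

# Apply a strictly increasing *non-linear* transformation to y
y_exp = np.exp(y)

print(stats.pearsonr(x, y)[0])       # ~0.80
print(stats.pearsonr(x, y_exp)[0])   # changes: linear correlation is not invariant
print(stats.spearmanr(x, y)[0])      # ~0.79
print(stats.spearmanr(x, y_exp)[0])  # unchanged: ranks are preserved
```
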
  • [15] Deterministic model: Model that uses assumptions that predetermine variable values (no random element)
  • [15] Reasons to run sensitivity analysis (3): 1. develop understanding of risks; 2. provide insight into the dependence of outputs on subjective assumptions; 3. satisfy supervisory authority requirements
  • [15] Key limitation of sensitivity analysis: No probabilities are assigned to options used
  • [15] Meaning of scenario in the context of scenario analysis: Set of model inputs that represents a plausible and internally consistent set of future conditions
  • [15] ERM scenario analysis (4): 1. decide top-down on the scenarios to be modeled; 2. establish the impact on risk factors (model inputs) and run models; 3. take action based on results (mitigation strategies or early warning indicators); 4. review scenarios to ensure relevance
  • [15] Advantages of scenario analysis (4): 1. facilitates evaluation of potential impacts of plausible future events; 2. not restricted to past events; 3. provides useful information to supplement traditional models; 4. facilitates production of action plans
  • [15] Disadvantages of scenario analysis (4): 1. potential complexity as a process; 2. reliance on gathering hypothetical extreme but plausible events; 3. uncertainty of whether scenarios are comprehensive; 4. absence of probability assignments
  • [15] Advantages of stress testing (3): 1. ability to compare impact of same stresses on different companies; 2. explicit examination of extreme events that might otherwise not be considered; 3. can use to assess suitability of response strategies
  • [15] Disadvantages of stress testing (3): 1. subjective which assumptions to stress and degree of stress; 2. assigns no probabilities; 3. only looks at extreme events, needs to be coupled with other methods
  • [15] Business continuity management: A program in most businesses to ensure they can continue to operate in the face of disaster or extreme events (usually in the context of operational risks)
  • [15] Key benefit of stochastic modeling: Provides a probability distribution for model outputs (model is run repeatedly with uncertain inputs)
  • [15] Advantages of historical simulation (4): 1. applicable to many situations; 2. does not require large amounts of past data (sample with replacement); 3. does not require specifying probability distribution for inputs; 4. reflects characteristics of past (real) data
  • [15] Disadvantages of historical simulation (4): 1. need relevant past data; 2. assumes past is indicative of future; 3. does not account for inter-temporal links (auto-correlations); 4. may underestimate uncertainty (what actually happened is a subset of what could have happened)
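
A minimal sketch of historical simulation as described in the two cards above, assuming numpy; the function name and all figures are illustrative only. Note the iid resampling is exactly why inter-temporal links (disadvantage 3) are lost.

```python
import numpy as np

def historical_var(returns, horizon_days=10, level=0.99, n_sims=10_000, seed=1):
    """Bootstrap historical simulation: resample past daily returns
    with replacement to build a horizon loss distribution."""
    rng = np.random.default_rng(seed)
    # Sampling with replacement means a long horizon needs no huge history
    paths = rng.choice(returns, size=(n_sims, horizon_days), replace=True)
    horizon_returns = np.prod(1.0 + paths, axis=1) - 1.0
    # VaR is the loss at the chosen percentile of the simulated distribution
    return -np.percentile(horizon_returns, 100 * (1 - level))

# Illustrative only: fabricated "past" daily returns
past = np.random.default_rng(0).normal(0.0003, 0.01, size=500)
print(f"10-day 99% VaR: {historical_var(past):.2%}")
```
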
  • [15] Advantages of Monte Carlo simulation (5): 1. widely available computer packages can do most of the work; 2. increasing the number of simulations will improve accuracy; 3. possible to simulate interdependence of risks; 4. widely understood technique using relatively simple math; 5. can model complex financial instruments
  • [15] Disadvantages of Monte Carlo simulation (2): 1. random selection of parameter values may lead to simulations that are not representative of all possibilities; 2. large number of simulations can be time consuming
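
A minimal Monte Carlo sketch, assuming numpy; the weights, correlations and return assumptions are fabricated for illustration. It simulates interdependent risks (advantage 3), though it inherits the multivariate normal's thin tails (see the limitations under [16]).

```python
import numpy as np

rng = np.random.default_rng(7)

# Illustrative parameters for two interdependent risks
means = np.array([0.05, 0.02])
vols = np.array([0.20, 0.10])
corr = np.array([[1.0, 0.3],
                 [0.3, 1.0]])
cov = np.outer(vols, vols) * corr

n_sims = 100_000
# Simulate joint annual returns, preserving the dependence between risks
returns = rng.multivariate_normal(means, cov, size=n_sims)

portfolio = returns @ np.array([0.6, 0.4])  # 60/40 weights
var_99 = -np.percentile(portfolio, 1)       # 99% one-year VaR
print(f"99% VaR: {var_99:.2%}")
```
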
  • [15] Pros and cons of factor-based approach: Causal links between variables are explicitly described within the model so this aids in understanding what drives key variables, but the additional effort required may not be justifiable
  • [15] Desirable characteristics of pseudo-random numbers for simulation (4): 1. be replicable; 2. repeat only after a long period; 3. be uniformly distributed over a large number of dimensions; 4. exhibit no serial correlation
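
Seeding is what delivers the first characteristic (replicability); numpy's default generator (PCG64) is generally regarded as satisfying the others. A tiny illustration:

```python
import numpy as np

# Seeding makes the sequence replicable (characteristic 1)
a = np.random.default_rng(seed=123).uniform(size=5)
b = np.random.default_rng(seed=123).uniform(size=5)
assert np.allclose(a, b)  # identical sequences from identical seeds

# Informal check of characteristic 4: lag-1 serial correlation near zero
u = np.random.default_rng(123).uniform(size=100_000)
print(np.corrcoef(u[:-1], u[1:])[0, 1])  # ~0
```
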
  • [16] Univariate discrete distributions (3): 1. binomial; 2. negative binomial; 3. Poisson
  • [16] Univariate continuous distributions with values from -Inf to Inf (4): 1. normal distribution; 2. normal mixture distribution; 3. student’s t-distribution; 4. skewed t-distribution
  • [16] Univariate continuous distributions with only non-negative values (9): 1. lognormal; 2. Wald; 3. chi-squared; 4. gamma and inverse gamma; 5. generalized inverse gamma; 6. exponential; 7. Frechet; 8. Pareto; 9. generalized Pareto
  • [16] Univariate continuous distributions with finite range of positive and/or negative values (2): 1. uniform; 2. triangular
  • [16] Binomial distribution: Bin(n,p) is the number of successes of n independent and identical Bernoulli trials with probability of success p (limit distribution as n approaches Inf is the normal distribution)
  • [16] Type 1 negative binomial distribution: Number of the trial on which the rth success occurs
  • [16] Type 2 negative binomial distribution: Number of failures before rth success occurs
  • [16] Poisson distribution: Number of events that occur in a specified interval of time, assuming events occur one after another in time, in a well-defined manner, and at a constant rate
  • [16] Types of normality tests (2): 1. graphical (Q-Q plots); 2. statistical (Jarque-Bera, Anderson-Darling, Shapiro-Wilk, D’Agostino)
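
A short illustration of the statistical tests, assuming scipy; the fat-tailed t(3) sample should be rejected as normal far more readily than the genuinely normal one.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
samples = {
    "normal": rng.normal(size=1_000),
    "t(3)":   rng.standard_t(df=3, size=1_000),  # fat-tailed, not normal
}

for name, x in samples.items():
    jb_stat, jb_p = stats.jarque_bera(x)  # based on skewness and kurtosis
    sw_stat, sw_p = stats.shapiro(x)
    print(f"{name}: Jarque-Bera p={jb_p:.3f}, Shapiro-Wilk p={sw_p:.3f}")

# Graphical counterpart: stats.probplot(x, dist="norm", plot=ax) draws a Q-Q plot;
# systematic curvature away from the straight line signals non-normality
```
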
  • [16] Key benefit of the normal mean-variance mixture distribution over normal distribution: Introduces randomness into mean and variance
  • [16] Wald distribution: Time taken for a Brownian motion process to reach a given value (has good aggregation properties)
  • [16] Chi-squared distribution: Sum of the squares of gamma independent variables taken from a standard normal distribution (gamma = degrees of freedom)
  • [16] Chi-squared test: Examines if a set of assumed probabilities is correct compared to actual number of observations in each category, where observations can each be associated with one of N categories
  • [16] Exponential distribution: Expected waiting times between events of a Poisson process
  • [16] Features of exponential distribution that limit its ERM application (3): 1. monotonically decreasing nature; 2. single parameter; 3. low probabilities associated with extreme values
  • [16] Features of Pareto distribution (2): 1. monotonically decreasing; 2. tails follow a power law with shape parameter determining the power
  • [16] Features of triangular distribution (3): 1. useful in cases where upper bound, lower bound and most likely values are known; 2. mean is the average of the parameter values; 3. can be positively or negatively skewed
  • [16] Limitations of using multivariate normal distributions for ERM (2): 1. tails are too thin (too little weight given to extreme outcomes, both in the univariate marginal distributions and jointly); 2. strong form of symmetry (elliptical)
  • [16] Significance of shape parameter for multivariate t-distribution: Fatness of tails is determined by number of degrees of freedom selected (smaller gamma leads to fatter tails, as gamma approaches Inf distribution tends to multivariate normal)
  • [16] Multivariate spherical distribution: Marginal distributions are identical, symmetric and uncorrelated with each other
  • [16] Multivariate elliptical distribution: Fixed probabilities can be described by an elliptical relationship between variables, special case where correlation is zero results in a spherical distribution
  • [16] Uses of Pareto distribution: Model variables where the probability of an event falls in proportion to the magnitude of the event raised to a given power
  • [17] Importance of stationarity: Statistical properties (moments and relationship between observations in different periods) stay the same over time, so it means that past data can be used to model the future
  • [17] Weak stationarity of order n: All moments of the process up to the nth are finite and the same for any subset of observations (they do not depend on when the subset is taken)
  • [17] Covariance stationarity: Weakly stationary of order 2 (mean and variance of the process are constant and covariance depends only on the time difference)
  • [17] White noise process: Every element is uncorrelated with any previous observation and the process oscillates randomly around zero
  • [17] Trend stationary: Observations oscillate randomly around a trend line that is a function of time only
  • [17] Influence of alpha1 on an AR(1) process: If |alpha1|<1, process is mean reverting (thus covariance stationary); If alpha1=1, process becomes a random walk; If |alpha1|>1, process becomes unstable
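
A small simulation, assuming numpy, showing the three regimes of alpha1 described above:

```python
import numpy as np

def simulate_ar1(alpha1, n=200, sigma=1.0, seed=0):
    """x_t = alpha1 * x_{t-1} + e_t, with e_t ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = alpha1 * x[t - 1] + rng.normal(0.0, sigma)
    return x

mean_reverting = simulate_ar1(0.5)   # |alpha1| < 1: covariance stationary
random_walk    = simulate_ar1(1.0)   # alpha1 = 1: variance grows with time
explosive      = simulate_ar1(1.05)  # |alpha1| > 1: unstable
print(np.std(mean_reverting[:100]), np.std(mean_reverting[100:]))  # similar
print(abs(explosive[-1]))  # typically very large
```
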
  • [17] AR(p): Each observation is a linear combination of the p previous values plus a random error
  • [17] ACF/PACF for AR(p): ACF tails off and PACF cuts off for lags>p
  • [17] ACF/PACF for MA(q): PACF tails off and ACF cuts off for lags>q
  • [17] ACF/PACF for ARMA(p,q): ACF tails off with kink at lag q and PACF tails off with kink at lag p
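
A sketch of reading these patterns in practice, assuming numpy and statsmodels: for a simulated AR(2) series the PACF should cut off after lag 2 while the ACF tails off.

```python
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

# Simulate a stationary AR(2): x_t = 0.5 x_{t-1} + 0.3 x_{t-2} + e_t
rng = np.random.default_rng(1)
x = np.zeros(2_000)
for t in range(2, len(x)):
    x[t] = 0.5 * x[t - 1] + 0.3 * x[t - 2] + rng.normal()

print(np.round(acf(x, nlags=5), 2))   # tails off gradually
print(np.round(pacf(x, nlags=5), 2))  # significant at lags 1-2, then cuts off
```
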
  • [17] Chow test: Fit model to whole series and also two sub-series (either side of suspected break). Test statistic based on sum of squared residuals has an F-distribution with null hypothesis that there is no structural break
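
A minimal implementation of the Chow test as described above, assuming numpy; `chow_test` and `break_idx` are illustrative names.

```python
import numpy as np

def chow_test(y, X, break_idx):
    """F statistic for a structural break at break_idx, under the null
    of no break. The same k-parameter model is fitted to the whole
    series and to each sub-series."""
    def rss(y_, X_):
        beta, *_ = np.linalg.lstsq(X_, y_, rcond=None)
        resid = y_ - X_ @ beta
        return resid @ resid

    n, k = X.shape
    rss_pooled = rss(y, X)
    rss_split = rss(y[:break_idx], X[:break_idx]) + rss(y[break_idx:], X[break_idx:])
    # Compare against an F(k, n - 2k) critical value; large values reject the null
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))
```
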
  • [17] Heteroskedastic: Variance changes over time
  • [17] ARCH: Autoregressive conditional heteroskedastic models are based on strictly stationary white noise process with zero mean and unit standard deviation, but constructed so that standard deviation varies over time (exhibits conditional heteroskedasticity and volatility clustering)
  • [17] Volatility clustering: Large change in value is often followed by a period of high volatility
  • [17] GARCH: Generalized ARCH, volatility is allowed to depend on previous values of volatility in addition to previous values of the process (periods of high volatility tend to last a long time)
  • [17] Flexible structure using ARMA and GARCH models: Fit ARMA model first and then fit a GARCH to residuals (if residuals are not just white noise)
  • [17] Most common technique for fitting GARCH models: Method of maximum likelihood
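
A minimal fitting sketch assuming the third-party `arch` package (not part of the standard scientific stack); its `arch_model` fits a GARCH(1,1) by maximum likelihood, consistent with the card above.

```python
import numpy as np
from arch import arch_model  # third-party: pip install arch

# Illustrative daily "returns" in percent (arch works best with scaled data)
rng = np.random.default_rng(0)
returns = 100 * rng.normal(0, 0.01, size=1_000)

# GARCH(1,1) with constant mean, fitted by maximum likelihood
model = arch_model(returns, vol="GARCH", p=1, q=1, mean="Constant")
result = model.fit(disp="off")
print(result.params)  # omega, alpha[1], beta[1] plus the mean parameter
```
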
  • [17] Square root of time rule: Annual VaR is approximately square root of 12 times monthly VaR (scaling is inaccurate if data is not iid normally distributed, in this case better to parameterize a stochastic model for a shorter timescale then derive longer timescale statistics from simulations)
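
A one-line worked example of the rule, assuming numpy and a fabricated monthly VaR figure:

```python
import numpy as np

monthly_var = 1_000_000                 # illustrative one-month VaR
annual_var = np.sqrt(12) * monthly_var  # ~3.46m; valid only for iid normal returns
print(f"{annual_var:,.0f}")
```
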
  • [18] Marginal distribution: In the context of joint distribution functions, the individual distribution of each of the factors in isolation
  • [18] Copula: Multivariate cumulative distribution function expressed in terms of the individual marginal cumulative distributions
  • [18] Key benefit of copulas: Each component can potentially be adjusted independently of the others
  • [18] Basic properties of copulas (3): 1. widening the range of values considered for any variable cannot decrease the probability of observing a combination within that range (a copula is non-decreasing in each argument); 2. integrating out all but one variable will result in the marginal distribution of that variable; 3. a valid probability is produced for any valid combination of parameters
  • [18] Sklar’s theorem: If marginal cumulative distributions are continuous, then a copula exists and is unique
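
A sketch of Sklar's theorem used in reverse, assuming numpy/scipy: sample the dependence structure from a Gaussian copula, then impose arbitrary marginals (lognormal and gamma chosen purely for illustration).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
rho = 0.7

# 1. Sample from the copula: correlated normals mapped to uniforms
z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=50_000)
u = stats.norm.cdf(z)  # each column is Uniform(0,1), dependence preserved

# 2. Apply inverse marginal CDFs: any continuous marginals will do
x = stats.lognorm.ppf(u[:, 0], s=0.5)
y = stats.gamma.ppf(u[:, 1], a=2.0)

print(stats.spearmanr(x, y)[0])  # rank correlation survives the transformation
```
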
  • [18] Scarsini’s properties of a good measure of concordance / association (7): 1. completeness of domain; 2. symmetry; 3. coherence; 4. unit range [-1,1]; 5. independence (if X and Y are independent, concordance=0); 6. consistency (if X=-Z then M(X,Y) = -M(Z,Y)); 7. convergence
  • [18] Main types of copulas (3): 1. fundamental; 2. explicit; 3. implicit
  • [18] Fundamental copulas: Represent three basic dependencies that a set of variables can display (independence, perfect positive dependence, perfect negative dependence)
  • [18] Explicit copulas: Have simple closed form expressions, e.g. Archimedean
  • [18] Implicit copulas: Based on well known multivariate distributions but no simple closed form expressions exist, e.g. Gaussian
  • [18] Examples of fundamental copula (3): 1. independence; 2. co-monotonicity; 3. counter-monotonicity
  • [18] Upper and lower tail dependence in the product copula: Variables are independent so no tail dependence (=0)
  • [18] Upper and lower tail dependence in the minimum copula: Perfect positive dependence so tail dependence =1
  • [18] Upper and lower tail dependence in the maximum copula: Perfect negative dependence, so there is no special relationship when both variables are low or both are high (tail dependence = 0); any association occurs only when the variables are at opposite extremes
  • [18] Frechet-Hoeffding bounds: The co-monotonicity and counter-monotonicity copulas represent the extremes of the possible levels of association; they are therefore the upper and lower bounds of all copulas
  • [18] Examples of Archimedean copulas (4): 1. Gumbel; 2. Frank; 3. Clayton; 4. Generalized Clayton
  • [18] Key advantage of Archimedean copulas: Relatively simple to use, because they define a closed form probability distribution and so avoid the need for integration
  • [18] Key disadvantage of Archimedean copulas: The small number of parameters involved means their application to heterogeneous groups of variables is limited
  • [18] Application of Gumbel copula: Only upper tail dependency makes it suitable for modeling situations where associations increase for extreme high values (e.g. losses from a credit portfolio, where losses are recorded as positive values)
  • [18] Applications of Frank copula: No tail dependency and symmetric form, can be used to model joint and last survivor annuities and exchange rate movements
  • [18] Applications of Clayton copula: No upper tail dependency but potential lower tail dependency (when alpha>0) makes it suitable for modeling situations where associations increase for extreme lower values but not extreme higher values (e.g. returns from portfolio of investments)
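
A sketch of sampling from a Clayton copula via the gamma-frailty (Marshall-Olkin) construction, assuming numpy and alpha > 0; it makes the lower-tail-only dependence visible.

```python
import numpy as np

def sample_clayton(alpha, n, d=2, seed=0):
    """Gamma-frailty (Marshall-Olkin) sampler for a Clayton copula, alpha > 0."""
    rng = np.random.default_rng(seed)
    v = rng.gamma(shape=1.0 / alpha, scale=1.0, size=(n, 1))  # shared frailty
    e = rng.exponential(size=(n, d))                          # independent Exp(1)
    return (1.0 + e / v) ** (-1.0 / alpha)  # uniform marginals, Clayton dependence

u = sample_clayton(alpha=3.0, n=50_000)
# Joint extremes: heavy clustering in the lower tail, little in the upper tail
print(np.mean((u < 0.05).all(axis=1)))  # ~0.04, far above 0.05**2 = 0.0025
print(np.mean((u > 0.95).all(axis=1)))  # much smaller: no upper tail dependence
```
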
  • [18] Comprehensive copula: Joint distribution can reflect any dependence from perfect positive dependence to perfect negative dependence
  • [18] Tail dependencies for the bivariate Gaussian copula: If |rho|<1, then copula has zero tail dependencies
  • [18] Disadvantages of Gaussian copula (2): 1. lack of tail dependency; 2. defined by only a single parameter
  • [18] Advantages of Student’s t-copula over Gaussian copula: The two parameters in a Student’s t-copula enable the degree of dependence at the extremes to be controlled independently of the correlation matrix by varying the number of degrees of freedom
  • [18] Tail dependencies of the Student’s t-copula: For finite gamma, the copula has symmetrical upper and lower tail dependencies (smaller gamma means greater association at all four extreme corners); as gamma approaches Inf, the copula tends to the Gaussian copula
  • [19] Ways to fit a distribution or copula to data (2): 1. maximum likelihood; 2. method of moments
  • [19] Ways to fit a model to data (4): 1. OLS; 2. GLS; 3. SVD; 4. PCA
  • [19] Method of moments: Establish parameters empirically by equating sample moments to population moments
  • [19] Advantage of method of moments: More straightforward to use than alternatives
  • [19] Disadvantages of method of moments (2): 1. parameters are not necessarily the most likely ones; 2. parameter values may be outside their acceptable ranges
  • [19] Advantages of MLE (3): 1. only generates parameter values that are within acceptable ranges; 2. bias in parameter estimates reduces as the number of observations increases; 3. distribution of parameter estimates tends toward the normal distribution
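
A sketch contrasting the two fitting approaches on a lognormal sample, assuming numpy/scipy; `scipy.stats.lognorm.fit` performs maximum likelihood, while the method of moments equates the sample mean and variance to their population formulas.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
data = rng.lognormal(mean=0.5, sigma=0.8, size=2_000)

# Method of moments: for the lognormal,
# E[X] = exp(mu + sigma^2/2), Var[X] = (exp(sigma^2) - 1) * exp(2*mu + sigma^2)
m, v = data.mean(), data.var()
mm_sigma = np.sqrt(np.log(1.0 + v / m**2))
mm_mu = np.log(m) - mm_sigma**2 / 2.0

# Maximum likelihood via scipy (floc=0 pins the location parameter at zero)
s, loc, scale = stats.lognorm.fit(data, floc=0)
ml_mu, ml_sigma = np.log(scale), s

print(f"MoM: mu={mm_mu:.3f}, sigma={mm_sigma:.3f}")
print(f"MLE: mu={ml_mu:.3f}, sigma={ml_sigma:.3f}")
```
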
  • [19] OLS process: Minimize sum of squared error terms, there is a closed form solution to this minimization problem
  • [19] OLS assumptions (6): 1. there is a linear relationship between variables; 2. the design matrix has full rank (X'X is invertible; no column is a linear combination of the others); 3. explanatory variables should not be correlated with errors; 4. error terms are not correlated with each other; 5. error terms have constant and finite variance; 6. error terms are normally distributed (necessary for valid significance tests)
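
A minimal sketch of the closed-form OLS solution, assuming numpy and a fabricated single-regressor data set:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
beta_true = np.array([1.0, 2.5])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Closed-form OLS solution: beta = (X'X)^{-1} X'y
# (requires X'X to be invertible, i.e. no perfectly collinear columns)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to [1.0, 2.5]
```
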
  • [19] GLS versus OLS error terms: Error terms can have non-constant variance and be correlated with each other
  • [19] Tests of overall model fit (2): 1. coefficients of determination (R-squared or adjusted R-squared); 2. if error terms are normally distributed, can use F-test with null hypothesis that regression coefficients are 0
  • [19] Tests of individual regression coefficient fits: If error terms are normally distributed, variance of errors is calculated and can use t-test with null hypothesis that the individual regression coefficient is 0
  • [19] Likelihood ratio test: Used to test nested models for whether additional variables result in significantly improved explanatory power
  • [19] Likelihood ratio test versus information criteria: Information criteria are not restricted to nested models, but they can only rank alternative models and do not quantify the statistical significance of differences between models
  • [19] AIC versus BIC (2): 1. lower values indicate better fit for both; 2. BIC penalizes additional variables more heavily, so it tends to result in less complex models being selected
  • [19] Advantage of PCA for modeling: Readily facilitates stochastic projections
  • [19] Disadvantage of PCA for modeling: Model parameters do not necessarily have any intuitive interpretations and so explanatory powers are limited
  • [19] SVD versus PCA (3): 1. both assume linear relationships between variables; 2. PCA requires identification of covariance matrix, SVD does not; 3. key advantage of SVD is that it operates on original data with no requirement to identify independent variables for regression
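
A sketch of the relationship, assuming numpy: eigen-decomposing the covariance matrix (the PCA route) and applying SVD directly to the centered data recover the same principal component variances.

```python
import numpy as np

rng = np.random.default_rng(4)
# Fabricated data: three observed variables driven mainly by one latent factor
factor = rng.normal(size=(1_000, 1))
data = factor @ np.array([[1.0, 0.8, 0.6]]) + 0.1 * rng.normal(size=(1_000, 3))

# PCA route: explicitly form and eigen-decompose the covariance matrix
eigvals = np.linalg.eigh(np.cov(data, rowvar=False))[0]

# SVD route: decompose the centered data matrix directly (no covariance needed)
centered = data - data.mean(axis=0)
s = np.linalg.svd(centered, compute_uv=False)

# Same component variances either way: eigenvalues = s^2 / (n - 1)
print(np.sort(eigvals)[::-1])
print(s**2 / (len(data) - 1))
```
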
  • [19] Graphical diagnostic tests for model selection (4): 1. Q-Q plots; 2. histograms with superimposed fitted density functions; 3. empirical CDFs with superimposed fitted CDFs; 4. auto-correlation functions of time series
