Difference between Model Validation and Model Evaluation?

Yogesh Khurana
Yogesh Khurana’s Blogs
4 min read · Oct 13, 2019
Model Validation vs Model Evaluation

Model Validation

  • Model validation is defined within regulatory guidance as “the set of processes and activities intended to verify that models are performing as expected, in line with their design objectives, and business uses.” It also identifies “potential limitations and assumptions, and assesses their possible impact.”
  • Generally, validation activities are performed by individuals independent of model development or use; models, therefore, should not be validated by their owners. Because models can be highly technical, some institutions may find it difficult to assemble a model risk team with sufficient functional and technical expertise to carry out independent validation. When faced with this obstacle, institutions often outsource the validation task to third parties.
  • In statistics, model validation is the task of confirming that the outputs of a statistical model are acceptable with respect to the real data-generating process. In other words, model validation is the task of confirming that the outputs of a statistical model have enough fidelity to the outputs of the data-generating process that the objectives of the investigation can be achieved.

The Four Elements

Model validation consists of four crucial elements:

1. Conceptual Design

The foundation of any model validation is its conceptual design, which requires a documented coverage assessment demonstrating that the model can meet business and regulatory needs and address the unique risks facing a bank.

The design and capabilities of a model can have a profound effect on a bank’s ability to identify and respond to risks. For example, a poorly designed risk assessment model may result in a bank establishing relationships with clients that present a risk greater than its risk appetite, thus exposing the bank to regulatory scrutiny and reputational damage.

A validation should independently challenge the underlying conceptual design and ensure that documentation is sufficient to support the model’s logic and its ability to achieve the regulatory and business outcomes for which it is designed.

2. System Validation

All technology and automated systems implemented to support models have limitations. An effective validation includes: firstly, evaluating the processes used to integrate the model’s conceptual design and functionality into the organisation’s business setting; and, secondly, examining the processes implemented to execute the model’s overall design. Where gaps or limitations are observed, controls should be evaluated to enable the model to function effectively.

3. Data Validation and Quality Assessment

Data errors or irregularities impair results and might lead to an organisation’s failure to identify and respond to risks. Best practice indicates that institutions should apply risk-based data validation, which enables the reviewer to consider risks unique to the organisation and the model.

To establish a robust framework for data validation, guidance indicates that the accuracy of source data be assessed. This is a vital step because data can be derived from a variety of sources, some of which might lack controls on data integrity, so the data might be incomplete or inaccurate.
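
As an illustrative sketch only (the article prescribes no particular tooling), a reviewer might begin a source-data accuracy assessment with simple completeness and cardinality checks. The pandas usage, the file name, and the 5% threshold below are all assumptions, not requirements from any guidance.

```python
# A minimal sketch of risk-based source-data checks, assuming tabular data
# loaded into a pandas DataFrame. Column names and thresholds are hypothetical.
import pandas as pd

def basic_data_quality_report(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise completeness, cardinality, and declared type per column."""
    return pd.DataFrame({
        "missing_pct": df.isna().mean() * 100,   # completeness
        "n_unique": df.nunique(),                # cardinality
        "dtype": df.dtypes.astype(str),          # declared type
    })

df = pd.read_csv("source_data.csv")              # hypothetical source file
report = basic_data_quality_report(df)
# Flag columns whose share of missing values exceeds an illustrative 5% threshold.
print(report[report["missing_pct"] > 5.0])
```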

4. Process Validation

To verify that a model is operating effectively, it is important to prove that the established processes for the model’s ongoing administration, including governance policies and procedures, support the model’s sustainability. A review of the processes also determines whether the models are producing output that is accurate, managed effectively, and subject to the appropriate controls.

If done effectively, model validation will enable your bank to have every confidence in its models’ accuracy and in their alignment with the bank’s business and regulatory expectations. By failing to validate models, banks increase the risk of regulatory criticism, fines, and penalties.

The complex and resource-intensive nature of validation makes it necessary to dedicate sufficient resources to it. An independent validation team well versed in data management, technology, and relevant financial products or services — for example, credit, capital management, insurance, or financial crime compliance — is vital for success. Where shortfalls in the validation process are identified, timely remedial actions should be taken to close the gaps.

Model Evaluation

  • Model Evaluation is an integral part of the model development process. It helps us find the model that best represents our data and estimate how well the chosen model will work in the future. Evaluating model performance on the data used for training is not acceptable in data science because it easily produces overoptimistic, overfitted models. Two common methods of evaluating models in data science are Hold-Out and Cross-Validation. To avoid overfitting, both methods use a test set (not seen by the model) to evaluate model performance.
  • Hold-Out: In this method, the dataset (usually large) is randomly divided into three subsets (a code sketch of this split follows the list below):
  1. Training set is a subset of the dataset used to build predictive models.
  2. Validation set is a subset of the dataset used to assess the performance of the model built in the training phase. It provides a test platform for fine-tuning the model’s parameters and selecting the best-performing model. Not all modelling algorithms need a validation set.
  3. Test set (or unseen examples) is a subset of the dataset used to assess the likely future performance of a model. If a model fits the training set much better than it fits the test set, overfitting is probably the cause.
  • Cross-Validation: When only a limited amount of data is available, we use k-fold cross-validation to achieve an unbiased estimate of model performance. In k-fold cross-validation, we divide the data into k subsets of equal size. We build the model k times, each time leaving one of the subsets out of training and using it as the test set. If k equals the sample size, this is called “leave-one-out” cross-validation (see the second sketch below).
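
As a concrete illustration of the hold-out method, here is a minimal sketch using scikit-learn. The dataset, the 60/20/20 split proportions, and the logistic-regression model are illustrative choices, not part of this article.

```python
# A minimal sketch of the hold-out method using scikit-learn.
# The 60/20/20 split proportions are illustrative, not prescriptive.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# First carve off 20% as the held-out test set (never seen during development).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)

# Then split the remainder into training and validation sets (60%/20% overall).
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=42)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
print("validation accuracy:", model.score(X_val, y_val))  # for model selection
print("test accuracy:", model.score(X_test, y_test))      # likely future performance
```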

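And a companion sketch of k-fold cross-validation, again with illustrative choices (k = 5, scikit-learn’s KFold and cross_val_score):

```python
# A minimal sketch of k-fold cross-validation (k=5) with scikit-learn.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Each of the k=5 folds serves once as the test set while the rest train the model;
# setting k equal to len(X) would give leave-one-out cross-validation.
cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=5000), X, y, cv=cv)
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```
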
Model evaluation can be divided into two categories:

  • Classification Evaluation
  • Regression Evaluation
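
To make the two categories concrete, the sketch below computes a few common metrics for each. Accuracy and F1 for classification, and MSE and R² for regression, are illustrative picks among many possible metrics.

```python
# A brief sketch of common evaluation metrics for each category
# (illustrative choices; many other metrics exist).
from sklearn.metrics import accuracy_score, f1_score, mean_squared_error, r2_score

# Classification evaluation: compare predicted labels with true labels.
y_true_cls = [0, 1, 1, 0, 1]
y_pred_cls = [0, 1, 0, 0, 1]
print("accuracy:", accuracy_score(y_true_cls, y_pred_cls))
print("F1 score:", f1_score(y_true_cls, y_pred_cls))

# Regression evaluation: compare predicted values with true values.
y_true_reg = [2.5, 0.0, 2.1, 7.8]
y_pred_reg = [3.0, -0.1, 2.0, 8.0]
print("MSE:", mean_squared_error(y_true_reg, y_pred_reg))
print("R^2:", r2_score(y_true_reg, y_pred_reg))
```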
