The Model Development Lifecycle (MDLC): Building and Testing the Model

Oct 3, 2024 | Model Development

This blog series explores the 10 steps of the Model Development Lifecycle so that you understand the components of model development and maintenance as well as where potential sources of model risk can reside. Read the first post here for an introduction to the series and why we are likening this process to the creation of a cookie recipe as a “real world” example.

This post in the MDLC series focuses on model development.

What is Model Development?

Model development refers to the iterative process of finding explanatory variables and then building and testing multiple models until a single model that best meets the desired criteria is created.

Cookie’s Story: Recipe R&D

Cookie scanned her baking table. Before her were the ingredients that had produced a well-textured and flavorful cookie dough. They appeared to work well together and should do a good job of masking each other’s lower quality if ingredient swaps were required.

Now that she had the ingredients, her next step was to create the recipe. This was going to be an iterative process and, after glancing at the kitchen clock, she figured it would be a long day.

While she had made many recipes in the past, this one was going to be different because its purpose was different. She needed to find the right amounts of ingredients to ensure a high-quality chocolate chip cookie even when some of the ingredients might be low quality. She grimaced. That thought still made her uncomfortable, even though she had spent the past few hours telling herself she was only creating a “break glass in case of emergency” recipe.

She mixed, baked, and evaluated the results. Some cookies looked good but tasted like sawdust. Others tasted good but spread too much. Others burnt around the edges but left a gooey center. She kept baking and taking notes.

Late afternoon turned to dusk and dusk to evening. Around 8 PM she pulled a tray of cookies from the oven and felt a quake of excitement. After letting them cool, she tasted a cookie and then knew she had the right recipe. The flavor was exceptional. The appearance—color and spread—spot on. And, most importantly, the amount of ingredients made sense. It wasn’t calling for something goofy like a cup of vanilla and a tablespoon of sugar.

She transferred the ingredients and their amounts to a fresh sheet of paper. It was now time to see how it performed using different quality ingredients. She made a circuit of her kitchen, grabbing Grade A eggs and butter, Grade B vanilla, white sugar with lower purity, and other ingredients.

She baked more cookies using the same recipe but varying the quality of ingredients. With each batch, she tested some and bagged the remainders, which she stored in the freezer. The frozen ones would be useful later when she came up with her comparison metrics.

So far, the recipe did well on the ingredients in her pantry and would do well for the other bakeries. After all, they used the same suppliers. However, a supply chain disruption would force them to go to different suppliers. She needed to know the recipe could support ingredients other than those delivered to her bakeries.

An hour later she was making a batch of cookies with ingredients pulled from a local grocery store. While she was shopping, she had looked for different brands of lower-quality eggs and butter. She also grabbed some off-brand flour, sugars, and chocolate chips to really stress the recipe. She needed confidence that the recipe could handle different food grades and qualities of ingredients it had not been built on.

Just after midnight she pulled what she hoped was the final tray of cookies from the oven. After letting them cool, she sampled one. A smile broke across her face.

“Nailed it!” she exclaimed to an empty kitchen and then allowed herself to do a fist pump.

She sighed, her joy fleeting. She needed to thoroughly document the recipe, what she used, her assumptions, the temperature and the time it took to bake. All that information would be needed for the recipe tester. The night wasn’t over yet.

Relating it to the MDLC

The fourth stage of the MDLC involves building and testing the model.

The goal at this stage is to find a conceptually sound model that best predicts an outcome based on the data provided. This stage involves model fitting, model testing, and model documentation.

  • Model fitting
    • Takes a model structure—e.g., linear regression—and identifies the variables that provide the best predictive power and, typically, make business sense.
  • Model testing
    • Involves running the model on data that wasn’t used during fitting to make sure the predictive power remains at the expected level.
  • Model documentation
    • Brings together all the information—e.g., the model structure, selected variables, test results, data and model limitations and assumptions—so that a person unfamiliar with the model can understand how it was created.
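To make fitting and testing concrete, here is a minimal sketch that assumes a one-variable linear regression and made-up numbers. The data and the helper names `fit_line` and `rmse` are illustrative only and are not taken from any actual model. The line is fitted on one set of rows, and its error is then measured both on those rows and on rows held back from fitting.

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def rmse(model, xs, ys):
    """Root mean squared error of the fitted line on the given rows."""
    intercept, slope = model
    squared = [(intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)]
    return sqrt(sum(squared) / len(squared))

# Hypothetical data: rows used for fitting, and rows held back for testing.
fit_x = [1.0, 2.0, 3.0, 4.0, 5.0]
fit_y = [2.1, 3.9, 6.2, 7.8, 10.1]
holdout_x = [6.0, 7.0, 8.0]
holdout_y = [12.2, 13.8, 16.1]

# Model fitting: estimate the coefficients from the fitting rows only.
model = fit_line(fit_x, fit_y)

# Model testing: compare the error on fitting rows with the error on held-back rows.
fit_rmse = rmse(model, fit_x, fit_y)
holdout_rmse = rmse(model, holdout_x, holdout_y)

print(f"intercept={model[0]:.2f}, slope={model[1]:.2f}")
print(f"RMSE on fitting rows: {fit_rmse:.3f}")
print(f"RMSE on held-back rows: {holdout_rmse:.3f}")
```

If the error on the held-back rows is far worse than on the fitting rows, the model's predictive power has not held at the expected level, and that result belongs in the model documentation alongside the structure and assumptions.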

What data is used to test models?

Historical data is used to test the model, but it is typically different from the data used to build the model (the “in-sample” data). Data not used for model building is classified as “out-of-sample”. It can be further categorized as “out-of-time” if it comes from a historical period that was not used during model development.
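One way to picture these categories is the short sketch below. It assumes a hypothetical set of dated observations, a hypothetical development window (`DEV_START` to `DEV_END`), and a hypothetical set of records used for fitting (`fitting_ids`). None of these values come from a real dataset.

```python
from datetime import date

# Hypothetical observations: (observation date, record id).
observations = [
    (date(2019, 3, 31), "A"),
    (date(2019, 6, 30), "B"),
    (date(2020, 3, 31), "C"),
    (date(2020, 9, 30), "D"),
    (date(2021, 6, 30), "E"),
    (date(2022, 3, 31), "F"),
]

# Assumed development window and the records assumed to be used for fitting.
DEV_START, DEV_END = date(2019, 1, 1), date(2020, 12, 31)
fitting_ids = {"A", "C", "D"}

def classify(obs_date, obs_id):
    """Label a record by how it relates to the data used to build the model."""
    if not (DEV_START <= obs_date <= DEV_END):
        # Outside the development period: out-of-sample and also out-of-time.
        return "out-of-sample, out-of-time"
    if obs_id in fitting_ids:
        return "in-sample"
    # Inside the development period but held back from fitting.
    return "out-of-sample"

labels = {obs_id: classify(obs_date, obs_id) for obs_date, obs_id in observations}
for obs_id, label in labels.items():
    print(obs_id, label)
```

In this sketch, every out-of-time record is also out-of-sample, which matches the idea that out-of-time is a further categorization of out-of-sample data.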

Cookie created a new recipe (i.e., model development). She:

  • Defined the baseline relationships of the ingredients to ensure she produced a cookie that looked pleasing and was palatable (i.e., model fit).
  • Tested the recipe using the ingredients from another location (i.e., out-of-sample testing).
  • Documented the recipe and assumptions (i.e., model documentation).

After you develop a model, you need to make sure it works. Our next post will explore the model validation and deficiency assessment stages of the MDLC and show how Cookie tests her new recipe to ensure it holds up.

Jonathan Leonardelli, FRM, Director of Business Analytics for the FRG, leads the group responsible for business analytics, statistical modeling and machine learning development, documentation, and training. He has more than 20 years’ experience in the area of financial risk.

RELATED:

What is the Model Development Lifecycle, or, What’s Baking at FRG?

The Model Development Lifecycle (MDLC): Defining the Business and Model Objectives

The Model Development Lifecycle (MDLC): Data Assessment